US008912387B2

(12) United States Patent (10) Patent No.: US 8,912,387 B2 Xu et al. (45) Date of Patent: Dec. 16, 2014

(54) GENETIC LOCI ASSOCIATED WITH HEAD SMUT RESISTANCE IN MAIZE

(75) Inventors: Mingliang Xu, Beijing (CN); Bailin Li, Hockessin, DE (US); Kevin Fengler, Wilmington, DE (US); Qing Chao, Beijing (CN); Yongsheng Chen, Beijing (CN); Xianrong Zhao, Beijing (CN); Jing Zhao, Beijing (CN)

(73) Assignee: E. I. du Pont de Nemours and Company, Wilmington, DE (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1160 days.

(21) Appl. No.: 12/545,226

(22) Filed: Aug. 21, 2009

(65) Prior Publication Data
US 2010/0050291 A1 Feb. 25, 2010

Related U.S. Application Data

(60) Provisional application No. 61/090,704, filed on Aug. 21, 2008.

(51) Int. Cl.
A01H 1/04 (2006.01)
A01H 3/00 (2006.01)

(52) U.S. Cl.
USPC 800/267; 800/275

(58) Field of Classification Search
None
See application file for complete search history.

(56) References Cited

PUBLICATIONS

Negrotto et al (Plant Cell Reports 19: 798-803, 2000).*
Xu et al (Plant Disease 83(4): 390-395, 1999).*
Bernardo et al (Agronomie 12:303-306, 1992).*
Wu XL, Pang ZC, Tian LM, Hu JS (1981) On the environmental factors affecting infection and cultural measures of controlling corn head smut (in Chinese, English abstract). Acta Phytophylacica Sinica 8:41-46.
Krüger W (1962) Sphacelotheca reiliana on maize. I. Infection and control studies. South Afr J Agric Sci 5:43-56.
Matyac CA, Kommedahl T (1985a) Factors affecting the development of head smut caused by Sphacelotheca reiliana on corn. Phytopathology 75:577-581.
Frederiksen RA (1977) Head smuts of corn and sorghum. Proc Corn Sorghum Res Conf 32:89-104.
Jin QM, Li JP, Zhang XW, Wang GX, Song SY, Liu YC, Wang LX (2000) Establishment IPM of system of corn diseases and pest insects in the spring cornbelt (in Chinese, English abstract). Journal of Maize Sciences 8:84-88.
Bai JK, Song ZH, Chen J, Liang JY, Liu GZ, Zhao TC, Zhou YL (1994) A review of the pathogenic variation of corn diseases and breeding of resistant cultivars (in Chinese, English abstract). Journal of Maize Sciences 2:67-72.
Xu ML, Melchinger AE, Lübberstedt T (1999) Species-specific detection of the maize pathogens Sporisorium reiliana and Ustilago maydis by dot blot hybridization and PCR-based assays. Plant Dis 83:390-395.
Lu XW, Brewbaker JL (1999) Molecular mapping of QTLs conferring resistance to Sphacelotheca reiliana (Kühn) Clint. Maize Genetics Cooperation Newsletter (MNL) 73:36.
Ma BY, Li YL, Duan SK (1983) Study on the resistance to head smut corn varieties and its inheritance (in Chinese, English abstract). Scientia Agricultura Sinica 4:12-17.
Stromberg EL, Stienstra WC, Kommedahl T, Matyac CA, Windels CE, Geadelmann JL (1984) Smut expression and resistance of corn to Sphacelotheca reiliana in Minnesota. Plant Dis 68:880-884.
Ali A, Baggett JR (1990) Inheritance of resistance to head smut disease in corn. J Am Soc Hortic Sci 115:668-672.
Bernardo R, Bourner M, Olivier JL (1992) Generation means analysis of resistance to head smut in maize. Agronomie 12:303-306.
Shi HL, Jiang YX, Wang ZH, Li XH, Li MS, Zhang (2005) QTL identification of resistance to head smut in maize (in Chinese, English abstract). Acta Agronomica Sinica 31:1449-1454.
Lander and Botstein (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121:185-199.
Lebowitz et al. (1987) Trait-based analyses for the detection of linkage between marker loci and quantitative trait loci in crosses between inbred lines. Theor. Appl. Genet. 73:556-562.
Yongsheng Chen et al., Identification and fine-mapping of a major QTL conferring resistance against head smut in maize, Theor. Appl. Genet., 2008, pp. 1241-1252, vol. 117.
X.H. Li et al., Analysis of QTL for resistance to head smut (Sporisorium reiliana) in maize, Elsevier-Field Crops Research, 2008, pp. 148-155, vol. 106.
T. Lübberstedt et al., QTL mapping of resistance to Sporisorium reiliana in maize, Theor. Appl. Genet., 1999, pp. 593-596, vol. 99.
Wenkai Xiao et al., Mapping of genome-wide resistance gene analogs (RGAs) in maize (Zea mays L.), Theor. Appl. Genet., 2007, pp. 501-508, vol. 115.
International Search Report, 2009.

* cited by examiner

Primary Examiner: Shubo (Joe) Zhou
Assistant Examiner: Keith Robinson

(57) ABSTRACT

Head smut is one of the most devastating diseases in maize, causing severe yield loss worldwide. The present invention describes the fine-mapping of a major QTL conferring resistance to head smut. Markers useful for breeding, and methods for conferring head smut resistance are described. Nucleic acid sequence from the genetic locus conferring head smut resistance is disclosed. Genes encoding proteins conferring head smut resistance are disclosed.

4 Claims, 4 Drawing Sheets

FIG. 1 (Sheet 1 of 4). Sequence alignment for AZM4_140313. Two reads are recoverable from the drawing; the label of the first is illegible and the second is labeled for Huangzhao4:

(label illegible)          ...CTTCCACCGAGAATAGGGCTTTCATTTGTGTTAGCAG...
AZM4_140313-Huangzhao4     ...CTTCCACCGAGAATAGGGTTTTCATTTGTGTTAGCAG...
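The two reads legible in FIG. 1 can be compared position by position to locate the polymorphism that a SNP marker would target. The following is a minimal Python sketch, not the procedure used in the patent: the function name find_snps and the variable names are hypothetical, and the reads are assumed to be already aligned and of equal length, as they appear in the drawing.

```python
# Two aligned reads transcribed from the FIG. 1 drawing.
# Which maize line the first read belongs to is not legible, so neutral names are used.
SEQ_A = "CTTCCACCGAGAATAGGGCTTTCATTTGTGTTAGCAG"
SEQ_B = "CTTCCACCGAGAATAGGGTTTTCATTTGTGTTAGCAG"


def find_snps(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for every mismatch.

    Hypothetical helper: assumes the inputs are pre-aligned and gap-free.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


snps = find_snps(SEQ_A, SEQ_B)
print(snps)  # [(19, 'C', 'T')]
```

On these two reads the only difference is a C/T substitution at position 19 of the excerpt; positions are relative to the excerpt shown, not to the full AZM4_140313 assembly.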

FIG. 2 (Sheet 2 of 4). Genetic map with columns headed "Distance (cM)" and "Markers". Marker labels legible in the drawing include STSrga840810, bnlg1520, bnlg1893, qHSR1, phi427434 and umc2184; the remaining labels and the distances are not recoverable.
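Genetic maps such as the one in FIG. 2 express marker spacing in centimorgans, and the specification defines one cM as the distance between two markers that show a 1% recombination frequency. A minimal Python sketch of that conversion follows. The function names and the progeny counts are hypothetical, chosen only for illustration, and the linear conversion is a reasonable approximation only for closely linked loci. The 7 cM default window mirrors the marker-locus distance recited in the sixteenth embodiment.

```python
def map_distance_cm(recombinants: int, total: int) -> float:
    """Distance in cM, taking 1 cM = 1% recombination frequency.

    Hypothetical helper; `recombinants` is the number of recombinant progeny
    observed among `total` scored progeny for a pair of markers.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinants <= total:
        raise ValueError("recombinants must be between 0 and total")
    return 100.0 * recombinants / total


def within_window(distance_cm: float, window_cm: float = 7.0) -> bool:
    """True if a marker lies within the given cM window of a reference marker."""
    return distance_cm <= window_cm


# Illustrative counts only: 12 recombinants among 400 scored progeny.
d = map_distance_cm(12, 400)
print(d, within_window(d))  # 3.0 True
```

For loosely linked markers a mapping function would normally replace the linear conversion, which is why the sketch is restricted to short intervals.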

FIG. 3 (Sheet 3 of 4) and FIG. 4 (Sheet 4 of 4). Drawing content is not recoverable from the scan.

US 8,912,387 B2 1. 2 GENETIC LOCASSOCATED WITH HEAD (a) at least one nucleotide sequence encoding a polypeptide SMUT RESISTANCE IN MAZE conferring or improving resistance to head Smut Selected from the group consisting of SEQID NOS:27, 32,35, 38, 41, This application claims the benefit of U.S. Provisional 44, 105, 108, 111, 113, and 116; Application No. 61/090.704, filed Aug. 21, 2008, the contents (b) at least one nucleotide sequence capable of encoding a of which are hereby incorporated by reference. polypeptide conferring or enhancing resistance to head Smut selected from the group consisting of SEQID NOS:25, 26.30, FIELD OF THE INVENTION 31, 34, 36, 37, 39, 40, 42,43, 45,104,106, 107,109, 110, 112, 114, 115, and 117; and The present disclosure relates to compositions and meth 10 (c) a complement of the nucleotide sequence of part (a) or ods useful in enhancing resistance to head Smut in maize. (b), wherein the complement and the nucleotide sequence consist of the same number of nucleotides and are 100% BACKGROUND OF THE INVENTION complementary. In a second embodiment, the invention concerns a vector Head Smut is a soil-borne and systemic disease in maize 15 comprising the claimed isolated polynucleotide. (Frederiksen 1977) caused by the host-specific In a third embodiment, the invention concerns a recombi Sphacelotheca reiliana (Kühn) Clint. The teliospores from nant DNA construct comprising the isolated polynucleotide Sori buried in soil are the primary source of infection, and can of the invention operably linked to at least one regulatory Survive three years in Soil without loss of any infection capac Sequence. ity (Wu et al. 1981). The fungus infects seedlings through In a fourth embodiment, the invention concerns a maize roots or coleoptiles during and after seed emergence (Kriger cell comprising the recombinant DNA construct or the iso 1962). In an infection of a susceptible variety the plants lated polynucleotide of the invention. 
continue normal vegetative growth, but some may be stunted In a fifth embodiment, the invention concerns a process for (Matyac and Kommedahl 1985a). At maturity sori replace producing a maize plant comprising transforming a plant cell ears or tassels of the infected plants, resulting in nearly no 25 with the recombinant DNA construct of the invention and maize yield for the plant. The proportion of infected plants in regenerating a plant from the transformed plant cell. an infected field could amount to 80% (Frederiksen 1977). Jin In a sixth embodiment, the invention concerns a maize (2000) reported the incidence of this disease varied from plant comprising the recombinant DNA construct of the 7.0% to 35.0%, some even reaching 62.0%, resulting from the invention. cultivation of susceptible cultivars. In Northern China, head 30 In a seventh embodiment, the invention concerns a maize Smut causes yield loss of up to 0.3 million tons annually (Bai seed comprising the recombinant DNA construct of the et al. 1994). It was reported that maize in Southern Europe, invention. North America, and Asia also seriously suffer from this dis In an eighth embodiment, the invention concerns a process ease (Xu et al. 1999). Considering both economic and eco of conferring or improving resistance to head Smut, compris logical elements, cultivation of resistant varieties is an effec 35 ing transforming a plant with the recombinant DNA construct tive way to control epidemics of head Smut. Breeding for of the invention, thereby conferring or improving resistance multiple resistant genes/QTLS against head Smut into elite to head Smut. maize varieties would be a promising way to improve the In a ninth embodiment, the invention concerns a process of resistance against this disease. 
determining the presence or absence of the polynucleotide of To date, many researches have studied genetic models con 40 the invention in a maize plant, comprising at least one of ferring resistance against head Smut. Mei et al. (1982) (a) isolating nucleic acid molecules from said maize plant reported that resistance against head Smut was controlled by and amplifying sequences homologous to the polynucleotide partially dominant nuclear genes with no difference being of the invention, or found in reciprocal crosses. Ma et al. (1983) reported maize (b) isolating nucleic acid molecules from said maize plants resistance to head Smut was a quantitative trait, affected by 45 and performing a Southern hybridization, or partial resistance genes and their non-allelic interactions. (c) isolating proteins from said maize plant and performing Stromberget al. (1984) discovered that F population showed a western blot using antibodies to the protein, or an intermediate disease incidence between resistant and Sus (d) isolating proteins from said maize plant and performing ceptible parents. Ali and Baggett (1990) reported additive and an ELISA assay using antibodies to the protein, or dominant genetic actions were preponderant under different 50 (e) demonstrating the presence of mRNA sequences treatments. Bernardo et al. (1992) studied genetic effect of derived from the mRNA transcript and unique to the head resistance gene(s) by using generation mean analysis, Sug Smut resistance locus, thereby determining the presence of gesting that additive effect is decisive, while the dominant and the polynucleotide of the invention in said maize plant. epistatic effects are weak. Shietal. (2005) reported that apart In a tenth embodiment, the invention concerns a process of from additive effect, over-dominance also plays a key role in 55 determining the presence or absence of the head Smut resis resistance against head Smut. 
It is obvious that resistance tance locus in a maize plant, comprising at least one of against head Smut in maize may involve in a number of (a) isolating nucleic acid molecules from said maize plant genetic elements and act in a complex way. and amplifying sequences unique to the polynucleotide of the invention, or SUMMARY OF THE INVENTION 60 (b) isolating proteins from said maize plant and performing a western blot using antibodies to the protein, or Compositions and methods for identifying and selecting (c) isolating proteins from said maize plant and performing maize plants with increased resistance to head Smut are pro an ELISA assay using antibodies to the protein, or vided. (d) demonstrating the presence of mRNA sequences In a first embodiment, the invention concerns an isolated 65 derived from the mRNA transcript and unique to the head polynucleotide comprising a polynucleotide selected from Smut resistance locus, thereby determining the presence of the group consisting of the head Smut resistance locus in said maize plant. US 8,912,387 B2 3 4 In an eleventh embodiment, the invention concerns a pro resistance, the method comprising detecting in the germ cess of altering the level of expression of a protein capable of plasm of the maize plant at least one allele of a marker locus conferring resistance to head Smut a maize cell comprising: wherein: (a) transforming a maize cell with the recombinant DNA (a) the marker locus is within 7 cM of SSR148152, construct of the invention and CAPS25082, STS171, SNP661, and STS1944; and (b) growing the transformed maize cell under conditions (b) at least one allele is associated with head Smut resis that are suitable for expression of the recombinant DNA tance. 
construct wherein expression of the recombinant DNA con In a seventeenth embodiment, the invention concerns a struct results in production of altered levels of a protein method of identifying a maize plant that displays head Smut capable of conferring resistance to head Smut in the trans 10 resistance, the method comprising detecting in the germ formed maize cell when compared to levels of expression in plasm of the maize plant at least one allele of a marker locus a wild-type maize plant having resistance to head Smut. wherein: In a twelfth embodiment, the invention concerns a process (a) the marker locus is located within a chromosomal inter of altering the level of expression of a protein capable of 15 val comprising and flanked by umc1736 and umc2184 or conferring resistance to head Smutina maize cell comprising: within a chromosomal interval comprising and flanked by (a) transforming a maize cell with the recombinant DNA SSR148152/SNP661; and construct of the invention; and (b) at least one allele is associated with head Smut resis (b) growing the transformed maize cell under conditions tance. that are suitable for expression of the recombinant DNA In an eighteenth embodiment, the invention concerns a construct wherein expression of the recombinant DNA con method of marker assisted selection comprising: struct results in production of altered levels of a protein (a) obtaining a first maize plant having at least one allele of capable of conferring resistance to head Smut in the trans a marker locus, wherein the marker locus is located within 7 formed maize cell when compared to levels of expression in cM of SSR148152, CAPS25082, STS171, SNP661, and a wild-type maize plant having resistance to head Smut. 
25 STS1944 on a public IBM genetic map and the allele is In a thirteenth embodiment, the invention concerns a pro associated with increased resistance to head Smut: cess of altering the level of expression of a protein capable of (b) crossing said first maize plant to a second maize plant; conferring resistance to head Smut in a maize plant compris (c) evaluating the progeny for at least said allele; and ing: (d) selecting progeny maize plants that possess at least said (a) transforming a maize plant cell with the recombinant 30 allele. DNA construct of the invention; and (b) regenerating a transformed maize plant from the trans In a nineteenth embodiment, the invention concerns a formed maize plant cell; and method of marker assisted selection comprising: (c) growing the transformed maize plant under conditions (a) obtaining a first maize plant having at least one allele of that are suitable for expression of the recombinant DNA 35 a marker locus, wherein the marker locus is located within a construct wherein expression of the recombinant DNA con chromosomal interval comprising and flanked by umc1736 struct results in production of altered levels of a protein andumc2184 and the allele is associated with increased resis capable of conferring resistance to head Smut in the trans tance to head Smut, formed maize plant when compared to levels of expression in (b) crossing said first maize plant to a second maize plant; a wild-type maize plant having resistance to head Smut. 40 (c) evaluating the progeny for at least said allele; and In a fourteenth embodiment, the invention concerns a pro (d) selecting progeny maize plants that possess at least said cess of altering the level of expression of a protein capable of allele. 
In a nineteenth embodiment, the invention concerns a conferring resistance to head Smut in a maize plant compris method of detecting a head Smut resistance locus comprising ing: detecting the presence of at least one marker allele selected (a) transforming a maize plant cell with the recombinant 45 from the group consisting of: MZA6393, 1M2-9, E6765-3, DNA construct of the invention; and 2M4-1, 2M10-5, 2M11-3, 3M1-25, and STS 148-1. (b) regenerating the transformed maize plant from the It is also clear that in any of the aforementioned methods, transformed maize plant cell; and any of the described marker alleles associated head Smut (c) growing the transformed maize plant under conditions resistance may be linked to any second marker allele. Such a that are suitable for expression of the recombinant DNA 50 second marker allele would also be associated with head Smut construct wherein expression of the recombinant DNA con resistance, and would be useful in the ways described above. struct results in production of altered levels of a protein capable of conferring resistance to head Smut in the trans BRIEF DESCRIPTION OF FIGURES AND formed maize plant when compared to levels of expression in SEQUENCE LISTINGS a wild-type maize plant having resistance to head Smut. 55 In a fifteenth embodiment, the invention concerns a method The invention can be more fully understood from the fol of identifying a maize plant that displays head Smut resis lowing detailed description and the accompanying drawings tance, the method comprising detecting in a maize plant a and Sequence Listing which form a part of this application. 
genetic marker locus wherein: The Sequence Listing contains the one letter code for nucle (a) a genetic marker probe comprising all or a portion of the 60 otide sequence characters and the three letter codes for amino genetic marker locus, or complement thereof, hybridizes acids as defined in conformity with the IUPAC-IUBMB stan under stringent conditions to bacm.pk071.12, bac dards described in Nucleic Acids Research 13:3021-3030 m.pk007.18, and bacm2.pk166.h1; and (1985) and in the Biochemical Journal 219 (No. 2): 345-373 (b) said genetic marker locus comprises at least one allele (1984), which are herein incorporated by reference in their that is associated with head Smut resistance. 65 entirety. The symbols and format used for nucleotide and In a sixteenth embodiment, the invention concerns a amino acid sequence data comply with the rules set forth in 37 method of identifying a maize plant that displays head Smut C.F.R.S 1.822. US 8,912,387 B2 5 6 FIG. 1. Development of a SNP marker (SNP140313) for SEQ ID NO:26 is the nucleic acid sequence from Mo17 AZM4 140313 (assembled Zea mays sequence from TIGR) representing the gene coding region for a Xylanase inhibitor and its application in genotyping BC populations. gene contained within the qHRS1 locus. FIG. 2. Genetic-mapping of the newly-developed markers SEQ ID NO:27 is the translation product of SEQ ID in the bin2.09 region. NO:26. FIG. 3. Alignment of the Xylanase inhibitor gene from SEQ ID NO:28 is the nucleic acid sequence from B73 Mo17 and B73. The Mo17 sequence is found in qHSR1, the representing the gene coding region for a Xylanase inhibitor locus that confers head Smut resistance in maize. B73 is a gene contained within the region of the B73 genome that is head Smut sensitive variety of maize. syntenic to the qHRS1 locus. FIG. 4. A comparative drawing of Mo17, B73, and Hua 10 SEQ ID NO:29 is the translation product of SEQ ID ngzhao genomic structure in the qHSR region. B73 and Hua NO:28. 
ngzhao both have deletions in the region when compared to SEQ ID NO:30 is the genomic DNA region from Mo17 Mo17. The markers mentioned in the current invention are encoding the Xylanase inhibitor of SEQ ID NO:26/27 and 3 shown at the top. Six genes of interest are noted, a hydrolase kb upstream of the coding region. gene that is unique to Mo 17: Gene 1, and ankyrin-repeat 15 SEQ ID NO:31 is the nucleic acid sequence from Mo17 protein, is found in all three lines; Gene 2 a cell wall-associ representing the gene coding region for a cell wall associated ated kinase, is found in Mo17 and B73; Gene 3 and Gene 4 are protein kinase gene contained within the qHRS1 locus. related LRR-Xa21-like kinases that are unique to Mo17; and SEQ ID NO:32 is the translation product of SEQ ID Gene 5 is a third LRR-Xa21 D-like kinase wholly or partly NO:31. found in all three lines. Mo17 is 172 kb in length in this SEQ ID NO:33 is the genomic DNA region from Mo17 region, and Huangzhao is 56 kb in length. encoding the cell wall associated protein kinase of SEQ ID The sequence descriptions and Sequence Listing attached NO:31/32 and 2.4 kb upstream of the coding region. hereto comply with the rules governing nucleotide and/or SEQ ID NO:34 is the nucleic acid sequence from Mo17 amino acid sequence disclosures in patent applications as set representing the gene coding region for a HAT family dimer forth in 37 C.F.R. S1.821-1.825. The Sequence Listing con 25 ization protein gene (PCO662117) contained within the tains the one letter code for nucleotide sequence characters qHRS1 locus. and the three letter codes for amino acids as defined in con SEQ ID NO:35 is the translation product of SEQ ID formity with the IUPAC-IUBMB standards described in NO:34. Nucleic Acids Res. 13:3021-3030 (1985) and in the Biochemi SEQ ID NO:36 is the genomic DNA region from Mo17 cal J. 219 (2):345-373 (1984) which are herein incorporated 30 encoding the HAT family dimerization protein gene of SEQ by reference. 
The symbols and format used for nucleotide and ID NO:34/35 and 2.4 kb upstream of the coding region. amino acid sequence data comply with the rules set forth in 37 SEQ ID NO:37 is the nucleic acid sequence from Mo17 C.F.R.S 1.822. representing the gene coding region for a HAT family dimer SEQ ID NO:1 is amplification primer CAPS25082-L. ization protein gene (PCO66 2162/PCO548849/ SEQ ID NO:2 is amplification primer CAPS25082-R. 35 PCO523172) contained within the qHRS1 locus. SEQ ID NO:3 is amplification primer SNP140313-L. SEQ ID NO:38 is the translation product of SEQ ID SEQ ID NO:4 is amplification primer SNP140313-R. NO:37. SEQ ID NO:5 is amplification primer SNP140313-snpL. SEQ ID NO:39 is the genomic DNA region from Mo17 SEQ ID NO:6 is amplification primer SNP140313-snpR. encoding the HAT family dimerization protein gene of SEQ SEQ ID NO:7 is amplification primer SNP661-L. 40 ID NO:37/38 and 2.4 kb upstream of the coding region. SEQ ID NO:8 is amplification primer SNP661-R. SEQ ID NO:40 is the nucleic acid sequence from Mo17 SEQ ID NO:9 is amplification primer SNP661-Smpl. representing the gene coding region for an uncharacterized SEQ ID NO:10 is amplification primer SNP661-snpR. protein gene (PCO648231) contained within the qHRS1 SEQ ID NO:11 is amplification primer STS1944-L. locus. SEQ ID NO:12 is amplification primer STS1944-R. 45 SEQ ID NO:41 is the translation product of SEQ ID SEQ ID NO:13 is amplification primer STS171-L. NO:40. SEQ ID NO:14 is amplification primer STS171-R. SEQ ID NO:42 is the genomic DNA region from Mo17 SEQ ID NO:15 is amplification primer SSR148152-L. encoding the uncharacterized protein gene of SEQID NO:40/ SEQ ID NO:16 is amplification primer SSR148152-R. 41 and 2.4 kb upstream of the coding region. SEQ ID NO:17 is amplification primer STSrga3195-L. 50 SEQ ID NO:43 is the nucleic acid sequence from Mo17 SEQ ID NO:18 is amplification primer STSrga3195-R. 
representing the gene coding region for an uncharacterized SEQ ID NO:19 is amplification primer STSrga840810-L. protein gene (61. 24) contained within the qHRS1 locus. SEQ ID NO:20 is amplification primer STSrga840810-R. SEQ ID NO:44 is the translation product of SEQ ID SEQID NO:21 is amplification primer STSsyn1-L. NO:43. SEQ ID NO:22 is amplification primer STSsyn1-R. 55 SEQ ID NO:45 is the genomic DNA region from Mo17 SEQ ID NO:23 is MZA6393 marker (from encoding the uncharacterized protein gene of SEQID NO:43/ bacm.pk071j12.f) that defines one end of the BAC contig 44 and 2.4 kb upstream of the coding region. covering the qHSR1 locus. The Huangzhao and B73 versions SEQID NO:46 is nucleic acid sequence encoding a single of this marker region are found in SEQ ID NOs:47 and 48 EST sequence from Mo 17 contained within the qHRS1 locus. respectively. 60 SEQ ID NO.47 is MZA6393 marker covering the qHSR1 SEQ ID NO:24 is ST148-1 the marker from the Mo17 locus from Huangzhao. version of ZMMBBc0478L09f that defines one end of the SEQ ID NO:48 is MZA6393 marker covering the qHSR1 BAC contig covering the qHSR1 locus. The Huangzhao ver locus from B73. sion of this marker region can be found in SEQID NOs:49. SEQ ID NO:49 is ST148-1 marker from Huangzhao4. SEQID NO:25 is the BAC contig comprised of overlap 65 SEQ ID NO.47 is MZA6393 marker from Huangzhao4. ping clones bacm.pk071.12, bacm.pk007. 18, and SEQID NO:48 is MZA6393 marker from B73. bacm2.pk166.h1 that cover the qHSR1 locus. SEQ ID NO:49 is STS148-1 marker from Huangzhao4. US 8,912,387 B2 7 8 SEQ ID NO:50 is amplification primer MZA6393L. SEQ ID NO:110 is the nucleic acid sequence from Mo17 SEQ ID NO:51 is amplification primer MZA6393R. representing the gene coding region for LRR-Xa21-like SEQ ID NO:52 is amplification primer 1M2-9L. kinase (Gene 3, FIG. 4) coding region SEQ ID NO:53 is amplification primer 1M2-9R. SEQID NO:111 is the translation product of SEQID NO: SEQ ID NO:54 is 1M2-9 marker from Mo17. 110. 
SEQ ID NO:55 is 1M2-9 marker from Huangzhao4. SEQID NO:112 is the nucleic acid sequence from Mo17 SEQ ID NO:56 is amplification primer E6765-3L. representing the gene coding region for LRR-Xa21-like SEQ ID NO:57 is amplification primer E6765-3R. kinase (Gene 4, FIG. 4) coding region SEQID NO:58 is E.6765-3 marker from Mo17. SEQ ID NO:113 is the translation product of SEQ ID SEQ ID NO:59 is amplification primer 2M4-1 L. 10 NO:112. SEQ ID NO:60 is amplification primer 2M4-1R. SEQ ID NO:114 is the genomic DNA region from Mo17 SEQ ID NO:61 is 2M4-1 marker from Mo17. encoding LRR-Xa21-like kinase (Gene 4, FIG. 4). SEQ ID NO:62 is amplification primer 2M10-5L. SEQ ID NO: 115 is the nucleic acid sequence from Mo17 SEQ ID NO:63 is amplification primer 2M10-5R. representing the gene coding region for LRR-Xa21 D-like SEQ ID NO:64 is 2M10-5 marker from Mo17. 15 kinase (Gene 5, FIG. 4). SEQ ID NO:65 is amplification primer 2M11-3L. SEQ ID NO:116 is the translation product of SEQ ID SEQ ID NO:66 is amplification primer 2M1'-3R. NO: 115. SEQ ID NO:67 is 2M11-3 marker from Mo17. SEQ ID NO:117 is the genomic DNA region from Mo17 SEQ ID NO:68 is amplification primer 3M1-25L. encoding LRR-Xa21 D-like kinase (Gene 5, FIG. 4). SEQ ID NO:69 is amplification primer 3M1-25R. SEQID NO:70 is 3M1-25 marker from Mo17. DETAILED DESCRIPTION SEQ ID NO:71 is 3M1-25 marker from Huangzhao4 SEQ ID NO:72 is amplification primer STS148-1L. The present invention provides allelic compositions in SEQ ID NO:73 is amplification primer STS148-1R. maize and methods for identifying and for selecting maize SEQ ID NO:74 is amplification primer MZA15839-4-L. 25 plants with increased head Smut resistance. Also within the SEQ ID NO:75 is amplification primer MZA15839-4-R. Scope of this invention are allelic compositions and methods SEQID NO:76 is amplification primer MZA18530-16-L. used to identify and to counter-select maize plants that have SEQID NO:77 is amplification primer MZA18530-16-R. 
decreased head Smut resistance. The following definitions are SEQID NO:78 is amplification primer MZA5473-801-L. provided as an aid to understand this invention. SEQID NO:79 is amplification primer MZA5473-801-R. 30 The mapping of the head Smut resistance locus is outlined SEQID NO:80 is amplification primer MZA16870-15-L. in a manuscript “Identification and fine-mapping of a major SEQID NO:81 is amplification primer MZA16870-15-R. QTL conferring resistance against head Smut in maize” by SEQ ID NO:82 is amplification primer MZA4087-19-L. Yongsheng Chen, Qing Chao, Guoqing Tan, Jing Zhao, SEQ ID NO:83 is amplification primer MZA4087-19-R. Meijing Zhang, Qing Ji, and MingliangXu. The manuscript is SEQ ID NO:84 is amplification primer MZA158-30-L. 35 attached as an appendix to the specification. SEQ ID NO:85 is amplification primer MZA158-30-R. The term “allele” refers to one of two or more different SEQID NO:86 is amplification primer MZA15493-15-L. nucleotide sequences that occurat a specific locus. A “favor SEQID NO:87 is amplification primer MZA15493-15-R. able allele' is the allele at a particular locus that confers, or SEQ ID NO:88 is amplification primer MZA9967-11-L. contributes to, an agronomically desirable phenotype, e.g., SEQ ID NO:89 is amplification primer MZA9967-11-R. 40 increased head Smut resistance, or alternatively, is an allele SEQ ID NO:90 is amplification primer MZA1556-23-L. that allows the identification of plants with decreased head SEQ ID NO:91 is amplification primer MZA1556-23-R. Smut resistance that can be removed from a breeding program SEQID NO:92 is amplification primer MZA1556-801-L. or planting ("counterselection'). A favorable allele of a SEQID NO:93 is amplification primer MZA1556-801-R. marker is a marker allele that segregates with the favorable SEQID NO:94 is amplification primer MZA17365-10-L. 45 phenotype, or alternatively, segregates with the unfavorable SEQID NO:95 is amplification primer MZA17365-10-R. 
plant phenotype, therefore providing the benefit of identify SEQID NO:96 is amplification primer MZA17365-801-L. ing plants. A favorable allelic form of a chromosome segment SEQID NO:97 is amplification primer MZA17365-801-R. is a chromosome segment that includes a nucleotide sequence SEQ ID NO:98 is amplification primer MZA14192-8-L. that contributes to Superior agronomic performance at one or SEQ ID NO:99 is amplification primer MZA14192-8-R. 50 more genetic loci physically located on the chromosome seg SEQID NO:100 is amplification primer MZA15554-13-L. ment. Allele frequency” refers to the frequency (proportion SEQID NO:101 is amplification primer MZA15554-13-R. or percentage) at which an allele is presentata locus within an SEQID NO:102 is amplification primer MZA4454-14-L. individual, within a line, or within a population of lines. For SEQID NO:103 is amplification primer MZA4454-14-R. example, for allele 'A', diploid individuals of genotype SEQ ID NO:104 is the nucleic acid sequence from Mo17 55 “AA”, “Aa', or “aa” have allele frequencies of 1.0, 0.5, or 0.0, representing the gene coding region for ankyrin-repeat pro respectively. One can estimate the allele frequency within a tein (Gene 1 FIG. 4). line by averaging the allele frequencies of a sample of indi SEQ ID NO:105 is the translation product of SEQ ID viduals from that line. Similarly, one can calculate the allele NO:104. frequency within a population of lines by averaging the allele SEQ ID NO: 106 is the genomic DNA region from Mo17 60 frequencies of lines that make up the population. For a popu encoding ankyrin repeat protein. lation with a finite number of individuals or lines, an allele SEQ ID NO:107 is the nucleic acid sequence from Mo17 frequency can be expressed as a count of individuals or lines representing the gene coding region for hydrolase. (or any other specified grouping) containing the allele. 
SEQ ID NO:108 is the translation product of SEQ ID An allele is “positively' associated with a trait when it is NO: 107. 65 linked to it and when the presence of the allele is an indicator SEQ ID NO:109 is the genomic DNA region from Mo17 that the desired trait or trait form will occur in a plant com encoding hydrolase. prising the allele. An allele is “negatively associated with a US 8,912,387 B2 9 10 trait when it is linked to it and when the presence of the allele “Genetic recombination frequency’ is the frequency of a is an indicator that a desired trait or trait form will not occur crossing over event (recombination) between two genetic in a plant comprising the allele. loci. Recombination frequency can be observed by following An individual is “homozygous' at a locus if the individual the segregation of markers and/or traits following meiosis. A has only one type of allele at that locus (e.g., a diploid organ genetic recombination frequency can be expressed in centi ism has a copy of the same allele at a locus for each of two morgans (cM), where one cM is the distance between two homologous chromosomes). An organism is "heterozygous' genetic markers that show a 1% recombination frequency at a locus if more than one allele type is present at that locus (i.e., a crossing-over event occurs between those two markers (e.g., a diploid individual with one copy each of two different once in every 100 cell divisions). 10 The term “genotype' is the genetic constitution of an indi alleles). The term “homogeneity' indicates that members of a vidual (or group of individuals) at one or more genetic loci, as group have the same genotype at one or more specific loci. In contrasted with the observable trait (the phenotype). Geno contrast, the term "heterogeneity' is used to indicate that type is defined by the allele(s) of one or more known loci that individuals within the group differingenotype at one or more the individual has inherited from its parents. 
The term genotype can be used to refer to an individual's genetic constitution at a single locus, at multiple loci, or, more generally, the term genotype can be used to refer to an individual's genetic make-up for all the genes in its genome.

"Germplasm" refers to genetic material of or from an individual (e.g., a plant), a group of individuals (e.g., a plant line, variety or family), or a clone derived from a line, variety, species, or culture. The germplasm can be part of an organism or cell, or can be separate from the organism or cell. In general, germplasm provides genetic material with a specific molecular makeup that provides a physical foundation for some or all of the hereditary qualities of an organism or cell culture. As used herein, germplasm includes cells, seed or tissues from which new plants may be grown, or plant parts, such as leaves, stems, pollen, or cells that can be cultured into a whole plant.

A "haplotype" is the genotype of an individual at a plurality of genetic loci, i.e.

specific loci.

As used herein, the terms "chromosome interval" or "chromosome segment" designate a contiguous linear span of genomic DNA that resides in planta on a single chromosome. The genetic elements or genes located on a single chromosome interval are physically linked. The size of a chromosome interval is not particularly limited. In some aspects, the genetic elements located within a single chromosome interval are genetically linked, typically with a genetic recombination distance of, for example, less than or equal to 20 cM, or alternatively, less than or equal to 10 cM. That is, two genetic elements within a single chromosome interval undergo recombination at a frequency of less than or equal to 20% or 10%.

The term "crossed" or "cross" means the fusion of gametes via pollination to produce progeny (e.g., cells, seeds or plants). The term encompasses both sexual crosses (the pollination of one plant by another) and selfing (self-pollination,
e.g., when the pollen and ovule are from the same plant). A "topcross test" is a progeny test derived by crossing each parent with the same tester, usually a homozygous line. The parent being tested can be an open-pollinated variety, a cross, or an inbred line.

A "genetic map" is a description of genetic linkage relationships among loci on one or more chromosomes (or linkage groups) within a given species, generally depicted in a diagrammatic or tabular form. "Genetic mapping" is the process of defining the linkage relationships of loci through the use of genetic markers, populations segregating for the markers, and standard genetic principles of recombination frequency. A "genetic map location" is a location on a genetic map relative to surrounding genetic markers on the same linkage group where a specified marker can be found within a given species.

a combination of alleles. Typically, the genetic loci described by a haplotype are physically and genetically linked, i.e., on the same chromosome segment.

"Hybridization" or "nucleic acid hybridization" refers to the pairing of complementary RNA and DNA strands as well as the pairing of complementary DNA single strands. "Stringency" refers to the conditions with regard to temperature, ionic strength, and the presence of certain organic solvents, such as formamide, under which nucleic acid hybridizations are carried out. Under high stringency conditions (high temperature and low salt), two nucleic acid fragments will pair, or "hybridize", only if there is a high frequency of complementary base sequences between them.

The term "introgression" refers to the transmission of a desired allele of a genetic locus from one genetic background to another. For example, introgression of a desired allele at a
specified locus can be transmitted to at least one progeny via a sexual cross between two parents of the same species, where at least one of the parents has the desired allele in its genome. Alternatively, for example, transmission of an allele can occur by recombination between two donor genomes, e.g., in a fused protoplast, where at least one of the donor protoplasts has the desired allele in its genome. The desired allele can be, e.g., a selected allele of a marker, a QTL, a transgene, or the like. In any case, offspring comprising the desired allele can be repeatedly backcrossed to a line having a desired genetic background and selected for the desired allele, to result in the allele becoming fixed in a selected genetic background.

A "line" or "strain" is a group of individuals of identical parentage that are generally inbred to some degree and that are generally homozygous and homogeneous at most loci (isogenic or near isogenic).

If two different markers have the same genetic map location, the two markers are in such close proximity to each other that recombination occurs between them with such low frequency that it is undetectable.

The order and genetic distances between genetic markers can differ from one genetic map to another. This is because each genetic map is a product of the mapping population, types of markers used, and the polymorphic potential of each marker between different populations. For example, 10 cM on the internally derived genetic map (also referred to herein as "PHB" for Pioneer Hi-Bred) is roughly equivalent to 25-30 cM on the IBM2 2005 neighbors frame public map (a high resolution map available on MaizeGDB). However, information can be correlated from one map to another using a general framework of common markers. One of ordinary skill in the art can use the framework of common markers to identify the
positions of genetic markers and QTLs on each individual genetic map. A comparison of marker positions between the internally derived genetic map and the IBM2 neighbors genetic map can be seen in Table 3.

A "subline" refers to an inbred subset of descendants that are genetically distinct from other similarly inbred subsets descended from the same progenitor.

An "ancestral line" is a parent line used as a source of genes, e.g., for the development of elite lines. An "ancestral population" is a group of ancestors that have contributed the bulk of the genetic variation that was used to develop elite lines. "Descendants" are the progeny of ancestors, and may be separated from their ancestors by many generations of breeding. For example, elite lines are the descendants of their ancestors. A "pedigree structure" defines the relationship between a descendant and each ancestor that gave rise to that descendant. A pedigree structure can span one or more generations, describing relationships between the descendant and its parents, grandparents, great-grandparents, etc.

An "elite line" or "elite strain" is an agronomically superior line that has resulted from many cycles of breeding and selection for superior agronomic performance.

that linkage is 1000 times more likely than no linkage. Lower LOD values, such as 2.0 or 2.5, may be used to detect a greater level of linkage.

"Linked loci" are located in close proximity such that meiotic recombination between homologous chromosome pairs does not occur with high frequency (frequency of equal to or less than 10%) between the two loci, e.g., linked loci co-segregate at least about 90% of the time, e.g., 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.5%, 99.75%, or more of the time. Marker loci are especially useful when they demonstrate a significant probability of co-segregation (linkage) with a desired trait (e.g., increased head smut resistance).
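As a worked illustration of the LOD scale referred to in this passage, and assuming the conventional base-10 logarithm (which the 1000-fold figure for a LOD value of 3 implies), the relationship between a LOD value and the underlying odds ratio can be written as follows. The likelihood symbols are notation introduced here for illustration.

```latex
\mathrm{LOD} = \log_{10}\frac{L(\text{linkage})}{L(\text{no linkage})},
\qquad
\mathrm{LOD} = 3 \;\Rightarrow\; \frac{L(\text{linkage})}{L(\text{no linkage})} = 10^{3} = 1000
```

Under the same assumption, LOD values of 2.0 and 2.5 correspond to odds of 100 to 1 and roughly 316 to 1, respectively.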
Numerous For example, in Some aspects, these markers can be termed elite lines are available and known to those of skill in the art “linked QTL markers’. of maize breeding. An "elite population' is an assortment of 15 Linkage can be expressed as a desired limit or range. For elite individuals or lines that can be used to represent the state example, in some embodiments, any marker is linked (geneti of the art in terms of agronomically Superior genotypes of a cally and physically) to any other marker when the markers given crop species, such as maize. Similarly, an “elite germ are separated by less than 50, 40, 30, 25, 20, or 15 map units plasm' or elite strain of germplasm is an agronomically Supe (or cM). Further linkage can be described by separations of rior germplasm, typically derived from and/or capable of 14, 13, 12, 11, 10,9,8,7,6, 5, 4, 3, 2, 1 map units (or cM). In giving rise to a plant with Superior agronomic performance, Some aspects, it is advantageous to define a bracketed range of Such as an existing or newly developed elite line of maize. linkage, for example, between 10 and 20 cM, between 10 and A “public IBM genetic map’ refers to any of following 30 cM, or between 10 and 40 cM. maps: IBM, IBM2, IBM2 neighbors, IBM2 FPC0507, IBM2 The more closely a marker is linked to a second locus, the 2004 neighbors, IBM22005 neighbors, or IBM22005 neigh 25 better an indicator for the second locus that marker becomes. bors frame. All of the IBM genetic maps are based on a Thus, “closely linked loci' such as a marker locus and a B73xMo17 population in which the progeny from the initial second locus display an inter-locus recombination frequency cross were random-mated for multiple generations prior to of 10% or less, or about 9% or less, or about 8% or less, or constructing recombinant inbred lines for mapping. 
Newer versions reflect the addition of genetic and BAC mapped loci as well as enhanced map refinement due to the incorporation of information obtained from other genetic maps.

In contrast, an "exotic maize strain" or an "exotic maize germplasm" is a strain or germplasm derived from a maize not belonging to an available elite maize line or strain of germplasm. In the context of a cross between two maize plants or strains of germplasm, an exotic germplasm is not closely related by descent to the elite germplasm with which it is crossed. Most commonly, the exotic germplasm is not derived from any known elite line of maize, but rather is selected to introduce novel genetic elements (typically novel alleles) into a breeding program.

As used herein, the term "linkage" is used to describe the degree with which one marker locus is "associated with" another marker locus or some other locus (for example, a head smut resistance locus).

about 7% or less, or about 6% or less, or about 5% or less, or about 4% or less, or about 3% or less, or about 2% or less. In other embodiments, the relevant loci display a recombination frequency of about 1% or less, e.g., about 0.75% or less, or about 0.5% or less, or about 0.25% or less. Two loci that are localized to the same chromosome, and at such a distance that recombination between the two loci occurs at a frequency of less than 10% (e.g., about 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.75%, 0.5%, 0.25%, or less) are also said to be "proximal" to each other. Since one cM is the distance between two genetic markers that show a 1% recombination frequency, any marker is closely linked (genetically and physically) to any other marker that is in close proximity, e.g., at or less than 10 cM distant. Two closely linked markers on the same chromosome can be positioned 9, 8, 7, 6, 5, 4, 3, 2, 1, 0.75, 0.5, 0.25, 0.1, 0.075, 0.05, 0.025, or 0.01 cM or less from each other.
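The thresholds in this passage can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 10% cutoff follows the "closely linked" definition, the 50% cutoff follows the statement elsewhere in this section that co-segregating markers recombine less than 50% of the time, and the conversion of 1% recombination to 1 cM is the simple equivalence the text uses.

```python
def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of scored progeny that are recombinant between two loci."""
    if total <= 0:
        raise ValueError("total progeny must be positive")
    return recombinants / total


def map_distance_cm(rf: float) -> float:
    """Map distance, treating 1% recombination as 1 cM (per the text)."""
    return rf * 100.0


def classify_linkage(rf: float) -> str:
    """Label a locus pair using the 10% and 50% thresholds described."""
    if rf <= 0.10:
        return "closely linked"
    if rf < 0.50:
        return "linked"
    return "unlinked"
```

For example, 7 recombinants among 100 progeny gives a recombination frequency of 0.07, about 7 cM, which falls in the "closely linked" class.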
When referring to the relationship between two genetic elements, such as a genetic element contributing to increased head smut resistance and a proximal marker, "coupling" phase linkage indicates the state where the "favorable" allele at the head smut resistance locus is physically associated on the same chromosome strand as the "favorable" allele of the respective linked marker locus. In coupling phase, both favorable alleles are inherited together by progeny that inherit that chromosome strand. In "repulsion" phase linkage, the "favorable" allele at the locus of interest is physically linked with an "unfavorable" allele at the proximal marker locus, and the two "favorable" alleles are not inherited together (i.e., the two loci are "out of phase" with each other).

The term "linkage disequilibrium" refers to a non-random segregation of genetic loci or traits (or both). In either case,

The linkage relationship between a molecular marker and a phenotype is given as a "probability" or "adjusted probability". The probability value (also known as p-value) is the statistical likelihood that the particular combination of a phenotype and the presence or absence of a particular marker allele is random. Thus, the lower the probability score, the greater the likelihood that a phenotype and a particular marker will co-segregate. In some aspects, the probability score is considered "significant" or "nonsignificant". In some embodiments, a probability score of 0.05 (p=0.05, or a 5% probability) of random assortment is considered a significant indication of co-segregation. However, an acceptable probability can be any probability of less than 50% (p=0.5). For example, a significant probability can be less than 0.25, less than 0.20, less than 0.15, less than 0.1, less than 0.05, less than 0.01, or less than 0.001.
In interval mapping, linkage between two marker loci can be calculated using odds ratios (i.e., the ratio of linkage versus no linkage). This ratio is more conveniently expressed as the logarithm of the ratios and is called a logarithm of odds (LOD) value or LOD score (Risch, Science 255:803-804 (1992)). A LOD value of 3 between two markers indicates

linkage disequilibrium implies that the relevant loci are within sufficient physical proximity along a length of a chromosome so that they segregate together with greater than random (i.e., non-random) frequency (in the case of co-segregating traits, the loci that underlie the traits are in sufficient proximity to each other). Markers that show linkage disequilibrium are considered linked. Linked loci co-segregate more than 50% of the time, e.g., from about 51% to about 100% of the time. In other words, two markers that co-segregate have a recombination frequency of less than 50% (and by definition, are separated by less than 50 cM on the same linkage group.) As used herein, linkage can be between two markers, or alternatively between a marker and a phenotype. A marker locus can be "associated with" (linked to) a trait, e.g., head smut resistance. The degree of linkage of a molecular marker to a phenotypic trait is measured, e.g., as a statistical probability of co-segregation of that molecular marker with the phenotype.

tion of alleles, also referred to as a "haplotype", at informative polymorphic sites of that specific marker locus. In some aspects, marker loci correlating with head smut resistance in maize are provided.

A "marker locus" is a locus that can be used to track the presence of a second linked locus, e.g., a linked locus that encodes or contributes to expression of a phenotypic trait. For example, a marker locus can be used to monitor segregation of alleles at a locus, such as a QTL, that are genetically or physically linked to the marker locus.
"Genetic markers" are nucleic acids that are polymorphic in a population, and the marker alleles can be detected and distinguished by one or more analytic methods, e.g., RFLP, AFLP, isozyme, SNP, SSR, and the like. The term also refers to nucleic acid sequences complementary to the genomic sequences, such as nucleic acids used as probes.

Markers corresponding to genetic polymorphisms between members of a population can be detected by methods well-established in the art. These include, e.g., DNA sequencing, PCR-based sequence specific amplification methods, detection of restriction fragment length polymorphisms (RFLP), detection of isozyme markers, detection of polynucleotide polymorphisms by allele specific hybridization (ASH), detection of amplified variable sequences of the plant genome, detection of self-sustained sequence replication, detection of simple sequence repeats (SSRs), detection of single nucleotide polymorphisms (SNPs), or detection of amplified fragment length polymorphisms (AFLPs).

Linkage disequilibrium is most commonly assessed using the measure r², which is calculated using the formula described by Hill, W. G. and Robertson, A., Theor. Appl. Genet. 38:226-231 (1968). When r²=1, complete LD exists between the two marker loci, meaning that the markers have not been separated by recombination and have the same allele frequency. Values for r² above 1/3 indicate sufficiently strong LD to be useful for mapping (Ardlie et al., Nature Reviews Genetics 3:299-309 (2002)). Hence, alleles are in linkage disequilibrium when r² values between pairwise marker loci are greater than or equal to 0.33, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0.

As used herein, "linkage equilibrium" describes a situation where two markers independently segregate, i.e., sort among progeny randomly. Markers that show linkage equilibrium are considered unlinked (whether or not they lie on the same chromosome).
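The r² measure of linkage disequilibrium referred to in this passage can be sketched in Python as follows. This is a minimal illustration, not the specification's procedure: it assumes phased two-locus haplotypes are available as (allele at locus 1, allele at locus 2) pairs, uses the standard definition r² = D² / (pA(1−pA)pB(1−pB)) with D = pAB − pA·pB, and the function name and allele labels are hypothetical.

```python
def ld_r_squared(haplotypes, allele_a="A", allele_b="B"):
    """r^2 between two biallelic loci from phased haplotype pairs."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Frequency of the reference allele at each locus and of the joint haplotype.
    p_a = sum(1 for a, _ in haplotypes if a == allele_a) / n
    p_b = sum(1 for _, b in haplotypes if b == allele_b) / n
    p_ab = sum(1 for a, b in haplotypes
               if a == allele_a and b == allele_b) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


def in_linkage_disequilibrium(r2: float, threshold: float = 1 / 3) -> bool:
    """Apply the 1/3 cutoff the text cites as useful for mapping."""
    return r2 >= threshold
```

With five "AB" and five "ab" haplotypes the two loci are never separated and r² is 1.0; with the four haplotype classes equally frequent, r² is 0.0.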
A "locus" is a chromosomal region where a gene or marker is located. For example, a "gene locus" is a specific chromosome location in the genome of a species where a specific gene can be found.

"Maize" and "corn" are used interchangeably herein.

The terms "marker", "molecular marker", "marker nucleic acid", and "marker locus" refer to a nucleotide sequence or encoded product thereof (e.g., a protein) used as a point of reference when identifying a linked locus. A marker can be derived from genomic nucleotide sequence or from expressed nucleotide sequences (e.g., from a spliced RNA or a cDNA), or from an encoded polypeptide. The term also refers to nucleic acid sequences complementary to or flanking the marker sequences, such as nucleic acids used as probes or primer pairs capable of amplifying the marker sequence.

Well established methods are also known for the detection of expressed sequence tags (ESTs) and SSR markers derived from EST sequences and randomly amplified polymorphic DNA (RAPD).

"Head smut resistance" refers to the ability of a maize plant to withstand infection by the host-specific fungus Sphacelotheca reiliana (Kühn) Clint. This includes, but is not limited to, reduced sori production, improved plant vigor, improved tassel function, and improved corn yield when compared to maize plants lacking the resistance locus described herein.

The nucleic acids and polypeptides of the embodiments find use in methods for conferring or enhancing fungal resistance to a plant. The source of the resistance can be a naturally occurring genetic resistance locus that is introgressed via
breeding into a sensitive maize population lacking the resistance locus, or alternatively, the genes conferring the resistance can be ectopically expressed as transgenes which confer resistance when expressed in the sensitive population. Accordingly, the compositions and methods disclosed herein are useful in protecting plants from fungal pathogens. "Pathogen resistance," "fungal resistance," and "disease resistance" are intended to mean that the plant avoids the disease symptoms that are the outcome of plant-pathogen interactions. That is, pathogens are prevented from causing plant diseases and the associated disease symptoms, or alternatively, the disease symptoms caused by the pathogen are minimized or lessened, such as, for example, the reduction of stress and associated yield loss. One of skill in the art will appreciate that the compositions and methods disclosed herein can be used with other compositions and methods available in the art for protecting plants from pathogen attack.

Hence, the methods of the embodiments can be utilized to protect plants from disease, particularly those diseases that are caused by plant fungal pathogens. As used herein, "fungal resistance" refers to enhanced resistance or tolerance to a fungal pathogen when compared to that of a wild type plant. Effects may vary from a slight increase in tolerance to the effects of the fungal pathogen (e.g., partial inhibition) to total resistance such that the plant is unaffected by the presence of the fungal pathogen. An increased level of resistance against a particular fungal pathogen or against a wider spectrum of fungal pathogens constitutes "enhanced" or improved fungal resistance. The embodiments of the invention also will enhance or improve fungal plant pathogen resistance, such that the resistance of the plant to a fungal pathogen or pathogens will increase. The term "enhance" refers to improve, increase, amplify, multiply, elevate, raise, and the like.

A "marker probe" is a nucleic acid sequence or molecule that can be used to identify the presence of a marker locus, e.g., a nucleic acid probe that is complementary to a marker locus sequence, through nucleic acid hybridization. Marker probes comprising 30 or more contiguous nucleotides of the marker locus ("all or a portion" of the marker locus sequence) may be used for nucleic acid hybridization. Alternatively, in some aspects, a marker probe refers to a probe of any type that is able to distinguish (i.e., genotype) the particular allele that is present at a marker locus. Nucleic acids are "complementary" when they specifically "hybridize", or pair, in solution, e.g., according to Watson-Crick base pairing rules.

The markers with the designation PHM represent a set of primers that amplify a specific piece of DNA, herein referred to as an "amplicon". The nucleotide sequences of the amplicons from multiple maize lines are compared, and polymorphisms, or variations, are identified. The polymorphisms include single nucleotide polymorphisms (SNPs), simple sequence repeats (SSRs), insertion/deletions (indels), etc.

A "marker allele", alternatively an "allele of a marker locus", can refer to one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population that is polymorphic for the marker locus. Alternatively, marker alleles designated with a number represent the specific combina

the use of codons. A nucleic acid encoding a protein may comprise non-translated sequences (e.g., introns) within translated regions of the nucleic acid or may lack such intervening non-translated sequences (e.g., as in cDNA).

The embodiments of the invention encompass isolated or substantially purified polynucleotide or protein compositions. An "isolated" or "purified" polynucleotide or protein,
or biologically active portion thereof, is substantially or essentially free from components that normally accompany or interact with the polynucleotide or protein as found in its naturally occurring environment. Thus, an isolated or purified polynucleotide or protein is substantially free of other cellular material, or culture medium when produced by recombinant techniques (e.g., PCR amplification), or substantially free of chemical precursors or other chemicals when chemically synthesized. Optimally, an "isolated" polynucleotide is free of sequences (for example, protein encoding sequences) that naturally flank the polynucleotide (i.e., sequences located at the 5' and 3' ends of the polynucleotide) in the genomic DNA of the organism from which the polynucleotide is derived. For example, in various embodiments, the isolated polynucleotide can contain less than about 5 kb, about 4 kb, about 3 kb,

Herein, plants of the invention are described as being resistant to infection by Sphacelotheca reiliana (Kühn) Clint or having enhanced resistance to infection by Sphacelotheca reiliana (Kühn) Clint as a result of the head smut resistance locus of the invention. Accordingly, they typically exhibit increased resistance to the disease when compared to equivalent plants that are susceptible to infection by Sphacelotheca reiliana (Kühn) Clint because they lack the head smut resistance locus.

In particular aspects, methods for conferring or enhancing fungal resistance in a plant comprise introducing into a plant at least one expression cassette, wherein the expression cassette comprises a nucleotide sequence encoding an antifungal polypeptide of the embodiments operably linked to a promoter that drives expression in the plant.
The plant expresses the polypeptide, thereby conferring fungal resistance upon the plant, or improving the plant's inherent level of resistance. In particular embodiments, the gene confers resistance to the fungal pathogen, Sphacelotheca reiliana (Kühn) Clint.

Expression of an antifungal polypeptide of the embodiments may be targeted to specific plant tissues where pathogen resistance is particularly important, such as, for example, the leaves, roots, stalks, or vascular tissues. Such tissue-preferred expression may be accomplished by root-preferred, leaf-preferred, vascular tissue-preferred, stalk-preferred, or seed-preferred promoters.

"Nucleotide sequence", "polynucleotide", "nucleic acid sequence", and "nucleic acid fragment" are used interchangeably and refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. A "nucleotide" is a monomeric

about 2 kb, about 1 kb, about 0.5 kb, or about 0.1 kb of nucleotide sequence that naturally flanks the polynucleotide in genomic DNA of the cell from which the polynucleotide is derived. A protein that is substantially free of cellular material includes preparations of protein having less than about 30%, about 20%, about 10%, about 5%, or about 1% (by dry weight) of contaminating protein. When the protein of the embodiments, or a biologically active portion thereof, is recombinantly produced, optimally culture medium represents less than about 30%, about 20%, about 10%, about 5%, or about 1% (by dry weight) of chemical precursors or non-protein-of-interest chemicals.

Fragments and variants of the disclosed nucleotide sequences and proteins encoded thereby are also encompassed by the embodiments. "Fragment" is intended to mean a portion of the nucleotide sequence or a portion of the amino acid sequence and hence protein encoded thereby.
Fragments of a nucleotide sequence may encode protein fragments that retain the biological activity of the native protein and hence have the ability to confer fungal resistance upon a plant. Alternatively, fragments of a nucleotide sequence that are useful as hybridization probes do not necessarily encode fragment proteins retaining biological activity. Thus, fragments of a nucleotide sequence may range from at least about 15 nucleotides, about 50 nucleotides, about 100 nucleotides, and up to the full-length nucleotide sequence encoding the polypeptides of the embodiments.

A fragment of a nucleotide sequence that encodes a biologically active portion of a polypeptide of the embodiments will encode at least about 15, about 25, about 30, about 40, or about 50 contiguous amino acids, or up to the total number of amino acids present in a full-length polypeptide of the embodiments.

unit from which DNA or RNA polymers are constructed, and consists of a purine or pyrimidine base, a pentose, and a phosphoric acid group. Nucleotides (usually found in their 5'-monophosphate form) are referred to by their single letter designation as follows: "A" for adenylate or deoxyadenylate (for RNA or DNA, respectively), "C" for cytidylate or deoxycytidylate, "G" for guanylate or deoxyguanylate, "U" for uridylate, "T" for deoxythymidylate, "R" for purines (A or G), "Y" for pyrimidines (C or T), "K" for G or T, "H" for A or C or T, "I" for inosine, and "N" for any nucleotide.

The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
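The single-letter nucleotide designations listed in this passage can be captured as a small lookup table, sketched here in Python. Several choices are assumptions made for illustration: the table covers DNA bases only (so "U" and "I" are omitted), "N" is expanded to the four DNA bases, and the names `NUCLEOTIDE_CODES` and `matches` are hypothetical.

```python
# Degenerate codes from the text, restricted to DNA bases (an assumption).
NUCLEOTIDE_CODES = {
    "A": {"A"},
    "C": {"C"},
    "G": {"G"},
    "T": {"T"},
    "R": {"A", "G"},            # purines
    "Y": {"C", "T"},            # pyrimidines
    "K": {"G", "T"},
    "H": {"A", "C", "T"},
    "N": {"A", "C", "G", "T"},  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """True if each base in `sequence` is allowed by the code in `pattern`."""
    if len(pattern) != len(sequence):
        return False
    return all(base in NUCLEOTIDE_CODES[code]
               for code, base in zip(pattern, sequence))
```

For example, the pattern "ARYN" matches "AGCT" but not "ACCT", because "C" is not a purine.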
Polypeptides of the embodiments can be produced either from a nucleic acid disclosed herein, or by the use of standard molecular biology techniques. For example, a truncated protein of the embodiments can be produced by expression of a recombinant nucleic acid of the embodiments in an appropriate host cell, or alternatively by a combination of ex vivo procedures, such as protease digestion and purification.

As used herein, the terms "encoding" or "encoded" when used in the context of a specified nucleic acid mean that the nucleic acid comprises the requisite information to direct translation of the nucleotide sequence into a specified protein. The information by which a protein is encoded is specified by

Fragments of a nucleotide sequence that are useful as hybridization probes or PCR primers generally need not encode a biologically active portion of a protein.

As used herein, "full-length sequence," in reference to a specified polynucleotide, means having the entire nucleic acid sequence of a native sequence. "Native sequence" is intended to mean an endogenous sequence, i.e., a non-engineered sequence found in an organism's genome.

Thus, a fragment of a nucleotide sequence of the embodiments may encode a biologically active portion of a polypeptide, or it may be a fragment that can be used as a hybridization probe or PCR primer using methods disclosed below. A biologically active portion of an antipathogenic polypeptide can be prepared by isolating a portion of one of the nucleotide sequences of the embodiments, expressing the encoded portion of the protein and assessing the ability of the encoded portion of the protein to confer or enhance fungal resistance in a plant.

example, from genetic polymorphism or from human manipulation. Biologically active variants of a native protein of the embodiments will have at least about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%,
about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99% or more sequence identity to the amino acid sequence for the native protein as determined by sequence alignment programs and parameters described elsewhere herein. A biologically active variant of a protein of the embodiments may differ from that protein by as few as about 1-15 amino acid residues, as few as about 1-10, such as about 6-10, as few as about 5, as few as 4, 3, 2, or even 1 amino acid residue.

The proteins of the embodiments may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants and fragments of the antipathogenic proteins can be prepared by mutations in the DNA. Methods for mutagenesis and polynucleotide alterations are well known in the art.

Nucleic acid molecules that are fragments of a nucleotide sequence of the embodiments comprise at least about 15, about 20, about 50, about 75, about 100, or about 150 nucleotides, or up to the number of nucleotides present in a full-length nucleotide sequence disclosed herein.

"Variants" is intended to mean substantially similar sequences. For polynucleotides, a variant comprises a deletion and/or addition of one or more nucleotides at one or more internal sites within the native polynucleotide and/or a substitution of one or more nucleotides at one or more sites in the native polynucleotide. As used herein, a "native" polynucleotide or polypeptide comprises a naturally occurring nucleotide sequence or amino acid sequence, respectively. One of skill in the art will recognize that variants of the nucleic acids of the embodiments will be constructed such that the open reading frame is maintained. For polynucleotides, conservative variants include those sequences that, because of the
See, degeneracy of the genetic code, encode the amino acid for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA sequence of one of the polypeptides of the embodiments. 82:488-492: Kunkel et al. (1987) Methods in Enzymol. 154: Naturally occurring allelic variants such as these can be iden 367-382: U.S. Pat. No. 4,873, 192: Walker and Gaastra, eds. tified with the use of well-known molecular biology tech 25 (1983) Techniques in Molecular Biology (MacMillan Pub niques, as, for example, with polymerase chain reaction lishing Company, New York) and the references cited therein. (PCR) and hybridization techniques as outlined below. Vari Guidance as to appropriate amino acid substitutions that do ant polynucleotides also include synthetically derived poly not affect biological activity of the protein of interest may be nucleotides, such as those generated, for example, by using found in the model of Dayhoffet al. (1978) Atlas of Protein site-directed mutagenesis but which still encode a protein of 30 Sequence and Structure (Natl. Biomed. Res. Found. Wash the embodiments. Generally, variants of a particular poly ington, D.C.), herein incorporated by reference. Conservative nucleotide of the embodiments will have at least about 40%, substitutions, such as exchanging one amino acid with about 45%, about 50%, about 55%, about 60%, about 65%, another having similar properties, may be optimal. about 70%, about 75%, about 80%, about 85%, about 90%, Thus, the genes and polynucleotides of the embodiments about 91%, about 92%, about 93%, about 94%, about 95%, 35 include both naturally occurring sequences as well as mutant about 96%, about 97%, about 98%, about 99% or more forms. Likewise, the proteins of the embodiments encompass sequence identity to that particular polynucleotide as deter both naturally occurring proteins as well as variations and mined by sequence alignment programs and parameters modified forms thereof. 
Such variants will continue to pos described elsewhere herein. sess the desired ability to confer or enhance plant fungal Variants of a particular polynucleotide of the embodiments 40 pathogen resistance. Obviously, the mutations that will be (i.e., the reference polynucleotide) can also be evaluated by made in the DNA encoding the variant must not place the comparison of the percent sequence identity between the sequence out of reading frame and optimally will not create polypeptide encoded by a variant polynucleotide and the complementary regions that could produce secondary mRNA polypeptide encoded by the reference polynucleotide. Thus, structure. See, EP Patent No. 0075444. for example, isolated polynucleotides that encode a polypep 45 The deletions, insertions, and substitutions of the protein tide with a given percent sequence identity to the polypeptide sequences encompassed herein are not expected to produce of SEQ ID NO: 3 are disclosed. Percent sequence identity radical changes in the characteristics of the protein. However, between any two polypeptides can be calculated using when it is difficult to predict the exact effect of the substitu sequence alignment programs and parameters described else tion, deletion, or insertion in advance of doing so, one skilled where herein. Where any given pair of polynucleotides of the 50 in the art will appreciate that the effect will be evaluated by embodiments is evaluated by comparison of the percent screening transgenic plants which have been transformed sequence identity shared by the two polypeptides they with the variant protein to ascertain the effect on the ability of encode, the percent sequence identity between the two the plant to resist fungal pathogenic attack. 
encoded polypeptides is at least about 40%, about 45%, about Variant polynucleotides and proteins also encompass 50%, about 55%, about 60%, about 65%, about 70%, about 55 sequences and proteins derived from mutagenic or recombi 75%, about 80%, about 85%, about 90%, about 91%, about nogenic procedures, including and not limited to procedures 92%, about 93%, about 94%, about 95%, about 96%, about such as DNA shuffling. One of skill in the art could envision 97%, about 98%, about 99% or more sequence identity. modifications that would alter the range of pathogens to “Variant protein is intended to mean a protein derived which the protein responds. With such a procedure, one or from the native protein by deletion or addition of one or more 60 more different protein coding sequences can be manipulated amino acids at one or more internal sites in the native protein to create a new protein possessing the desired properties. In and/or Substitution of one or more amino acids at one or more this manner, libraries of recombinant polynucleotides are sites in the native protein. Variant proteins encompassed by generated from a population of related sequence polynucle the embodiments are biologically active, that is they continue otides comprising sequence regions that have substantial to possess the desired biological activity of the native protein, 65 sequence identity and can be homologously recombined in that is, the ability to confer or enhance plant fungal pathogen vitro or in vivo. For example, using this approach, sequence resistance as described herein. Such variants may result, for motifs encoding a domain of interest may be shuffled between US 8,912,387 B2 19 20 the protein gene of the embodiments and other known protein construction of cDNA and genomic libraries are generally genes to obtain a new gene coding for a protein with an known in the art and are disclosed in Sambrook et al. (1989) improved property of interest, such as increased ability to Supra. 
confer or enhance plant fungal pathogen resistance. Strate For example, an entire polynucleotide disclosed herein, or gies for such DNA shuffling are known in the art. See, for one or more portions thereof, may be used as a probe capable example, Stemmer (1994) Proc. Natl. Acad. Sci. USA of specifically hybridizing to corresponding polynucleotides 91:10747-10751; Stemmer (1994) Nature 370:389-391; and messenger RNAs. To achieve specific hybridization Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et under a variety of conditions, such probes include sequences al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) that are unique and are optimally at least about 10 nucleotides 10 in length, at least about 15 nucleotides in length, or at least Proc. Natl. Acad. Sci. USA 94:45.04-4509: Crameri et al. about 20 nucleotides in length. Such probes may be used to (1998) Nature 391:288-291; and U.S. Pat. Nos. 5,605,793 amplify corresponding polynucleotides from a chosen organ and 5,837,458. ism by PCR. This technique may be used to isolate additional The polynucleotides of the embodiments can be used to coding sequences from a desired organism or as a diagnostic isolate corresponding sequences from other organisms, par 15 assay to determine the presence of coding sequences in an ticularly other plants. In this manner, methods such as PCR, organism. Hybridization techniques include hybridization hybridization, and the like can be used to identify such screening of plated DNA libraries (either plaques or colonies: sequences based on their sequence homology to the see, for example, Sambrook et al. (1989) supra. sequences set forth herein. Sequences isolated based on their Hybridization of Such sequences may be carried out under sequence identity to the entire sequences set forth herein or to stringent conditions. 
By "stringent conditions' or 'stringent variants and fragments thereof are encompassed by the hybridization conditions” is intended conditions under which embodiments. Such sequences include sequences that are a probe will hybridize to its target sequence to a detectably orthologs of the disclosed sequences. “Orthologs' is intended greater degree than to other sequences (e.g., at least 2-fold to mean genes derived from a common ancestral gene and over background). Stringent conditions are sequence-depen which are found in different species as a result of speciation. 25 dent and will be different in different circumstances. By con Genes found in different species are considered orthologs trolling the Stringency of the hybridization and/or washing when their nucleotide sequences and/or their encoded protein conditions, target sequences that are 100% complementary to sequences share at least about 60%, about 70%, about 75%, the probe can be identified (homologous probing). Alterna about 80%, about 85%, about 90%, about 91%, about 92%, tively, Stringency conditions can be adjusted to allow some 30 mismatching in sequences so that lower degrees of similarity about 93%, about 94%, about 95%, about 96%, about 97%, are detected (heterologous probing). Generally, a probe is less about 98%, about 99%, or greater sequence identity. Func than about 1000 nucleotides in length, optimally less than 500 tions of orthologs are often highly conserved among species. nucleotides in length. Thus, isolated polynucleotides that encode for a protein that Typically, stringent conditions will be those in which the confers or enhances fungal plant pathogen resistance and that 35 salt concentration is less than about 1.5 M Naion, typically hybridize under stringent conditions to the sequences dis about 0.01 to 1.0 MNaion concentration (or other salts) at pH closed herein, or to variants or fragments thereof, are encom 7.0 to 8.3 and the temperature is at least about 30°C. 
for short passed by the embodiments. probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for In a PCR approach, oligonucleotide primers can be long probes (e.g., greater than 50 nucleotides). Stringent con designed for use in PCR reactions to amplify corresponding 40 ditions may also be achieved with the addition of destabiliz DNA sequences from cDNA or genomic DNA extracted from ing agents such as formamide. Exemplary low stringency any organism of interest. Methods for designing PCR primers conditions include hybridization with a buffer solution of 30 and PCR cloning are generally known in the art and are to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl disclosed in Sambrook et al. (1989) Molecular Cloning. A sulphate) at 37°C., and a wash in 1x to 2xSSC (20xSSC-3.0 Laboratory Manual (2d ed., Cold Spring Harbor Laboratory 45 M. NaC1/0.3 M trisodium citrate) at 50 to 55° C. Exemplary Press, Plainview, N.Y.). See also Innis et al., eds. (1990) PCR moderate stringency conditions include hybridization in 40 to Protocols. A Guide to Methods and Applications (Academic 45% formamide, 1.0 MNaCl, 1% SDS at 37°C., and a wash Press, New York); Innis and Gelfand, eds. (1995) PCR Strat in 0.5x to 1 xSSC at 55 to 60° C. Exemplary high stringency egies (Academic Press, New York); and Innis and Gelfand, conditions include hybridization in 50% formamide, 1 M eds. (1999) PCR Methods Manual (Academic Press, New 50 NaCl, 1% SDS at 37°C., and a final washin 0.1xSSC at 60 to York). Known methods of PCR include, and are not limited 65° C. for at least 30 minutes. Optionally, wash buffers may to, methods using paired primers, nested primers, single spe comprise about 0.1% to about 1% SDS. Duration of hybrid cific primers, degenerate primers, gene-specific primers, Vec ization is generally less than about 24 hours, usually about 4 tor-specific primers, partially-mismatched primers, and the to about 12 hours. The duration of the wash time will be at like. 
55 least a length of time sufficient to reach equilibrium. In hybridization techniques, all or part of a known poly Specificity is typically the function of post-hybridization nucleotide is used as a probe that selectively hybridizes to washes, the critical factors being the ionic strength and tem other corresponding polynucleotides present in a population perature of the final wash solution. For DNA-DNA hybrids, of cloned genomic DNA fragments or cDNA fragments (i.e., the thermal melting point (T) can be approximated from the genomic or cDNA libraries) from a chosen organism. The 60 equation of Meinkoth and Wahl (1984) Anal. Biochem. 138: hybridization probes may be genomic DNA fragments, 267-284: T-81.5° C.+16.6 (log M)+0.41 (% GC)-0.61 (% cDNA fragments, RNA fragments, or other oligonucleotides, form)-500/L; where M is the molarity of monovalent cations, and may be labeled with a detectable group such as 'P, or any % GC is the percentage of guanosine and cytosine nucleotides other detectable marker. Thus, for example, probes for in the DNA, 96 form is the percentage of formamide in the hybridization can be made by labeling synthetic oligonucle 65 hybridization solution, and L is the length of the hybrid in otides based on the polynucleotides of the embodiments. base pairs. The T is the temperature (under defined ionic Methods for preparation of probes for hybridization and for strength and pH) at which 50% of a complementary target US 8,912,387 B2 21 22 sequence hybridizes to a perfectly matched probe. T, is the global alignment algorithm of Needleman and Wunsch reduced by about 1° C. for each 1% of mismatching; thus, T. (1970) J. Mol. Biol. 48:443-453; the search-for-local align hybridization, and/or wash conditions can be adjusted to ment method of Pearson and Lipman (1988) Proc. Natl. Acad. hybridize to sequences of the desired identity. For example, if Sci. 
85:2444-2448; the algorithm of Karlin and Altschul sequences with >90% identity are sought, the T can be 5 (1990) Proc. Natl. Acad. Sci. USA 872264, modified as in decreased 10° C. Generally, stringent conditions are selected Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA to be about 5° C. lower than the T for the specific sequence 90:5873-5877. and its complement at a defined ionic strength and pH. How Computer implementations of these mathematical algo ever, severely stringent conditions can utilize a hybridization rithms can be utilized for comparison of sequences to deter and/or wash at 1, 2, 3, or 4°C. lower than the T: moderately 10 mine sequence identity. Such implementations include, and stringent conditions can utilize a hybridization and/or wash at are not limited to: CLUSTAL in the PC/Gene program (avail 6, 7, 8, 9, or 10° C. lower than the T. low stringency condi able from Intelligenetics, Mountain View, Calif.); the ALIGN tions can utilize a hybridization and/or wash at 11, 12, 13, 14. program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, 15, or 20° C. lower than the T. Using the equation, hybrid and TFASTA in the GCG Wisconsin Genetics Software Pack ization and wash compositions, and desired T, those of 15 age, Version 10 (available from Accelrys Inc., 9685 Scranton ordinary skill will understand that variations in the stringency Road, San Diego, Calif., USA). Alignments using these pro of hybridization and/or wash solutions are inherently grams can be performed using the default parameters. The described. If the desired degree of mismatching results in a T. CLUSTAL program is well described by Higgins et al. (1988) of less than 45° C. (aqueous solution) or 32° C. (formamide Gene 73:237-244 (1988); Higgins et al. (1989) CABIOS Solution), it is optimal to increase the SSC concentration so 5:151-153; Corpet et al. (1988) Nucleic Acids Res. 16:10881 that a higher temperature can be used. An extensive guide to 90; Huang et al. 
(1992) CABIOS8:155-65; and Pearson et al. the hybridization of nucleic acids is found in Tijssen (1993) (1994) Meth. Mol. Biol. 24:307-331. The ALIGN program is Laboratory Techniques in Biochemistry and Molecular Biol based on the algorithm of Myers and Miller (1988) supra. A ogy—Hybridization with Nucleic Acid Probes, Part I, Chapter PAM120 weight residue table, a gap length penalty of 12, and 2 (Elsevier, New York); and Ausubel et al., eds. (1995) Cur 25 a gap penalty of 4 can be used with the ALIGN program when rent Protocols in Molecular Biology, Chapter 2 (Greene Pub comparing amino acid sequences. The BLAST programs of lishing and Wiley-Interscience, New York). See Sambrook et Altschuletal (1990).J. Mol. Biol. 215:403 are based on the al. (1989) supra. algorithm of Karlin and Altschul (1990) supra. BLAST nucle Various procedures can be used to check for the presence or otide searches can be performed with the BLASTN program, absence of a particular sequence of DNA, RNA, or a protein. 30 score=100, wordlength=12, to obtain nucleotide sequences These include, for example, Southern blots, northern blots, homologous to a nucleotide sequence encoding a protein of western blots, and ELISA analysis. Techniques such as these the embodiments. BLAST protein searches can be performed are well known to those of skill in the art and many references with the BLASTX program, score=50, wordlength=3, to exist which provide detailed protocols. Such references obtain amino acid sequences homologous to a protein or include Sambrook et al. (1989) supra, and Crowther, J. R. 35 polypeptide of the embodiments. To obtain gapped align (2001), The ELISA Guidebook, Humana Press, Totowa, N.J., ments for comparison purposes, Gapped BLAST (in BLAST USA. 2.0) can be utilized as described in Altschul et al. (1997) The following terms are used to describe the sequence Nucleic Acids Res. 25:3389. 
Alternatively, PSI-BLAST (in relationships between two or more polynucleotides or BLAST 2.0) can be used to perform an iterated search that polypeptides: (a) “reference sequence.” (b) “comparison win 40 detects distant relationships between molecules. See Altschul dow, (c) “sequence identity, and, (d) "percentage of et al. (1997) supra. When utilizing BLAST, Gapped BLAST, sequence identity.” PSI-BLAST, the default parameters of the respective pro (a) As used herein, “reference sequence' is a defined grams (e.g., BLASTN for nucleotide sequences, BLASTX sequence used as a basis for sequence comparison. A refer for proteins) can be used. See www.ncbi.nlm.nih.gov. Align ence sequence may be a Subset or the entirety of a specified 45 ment may also be performed manually by inspection. sequence; for example, as a segment of a full-length cDNA or Unless otherwise stated, sequence identity/similarity val gene sequence, or the complete cDNA or gene sequence. ues provided herein refer to the value obtained using GAP (b) As used herein, “comparison window' makes reference Version 10 using the following parameters: % identity and% to a contiguous and specified segment of a polynucleotide similarity for a nucleotide sequence using Gap Weight of 50 sequence, wherein the polynucleotide sequence in the com 50 and Length Weight of 3, and the nwsgapdna.cmp scoring parison window may comprise additions or deletions (i.e., matrix; % identity and % similarity for an amino acid gaps) compared to the reference sequence (which does not sequence using Gap Weight of 8 and Length Weight of 2, and comprise additions or deletions) for optimal alignment of the the BLOSUM62 scoring matrix; or any equivalent program two polynucleotides. Generally, the comparison window is at thereof. 
By “equivalent program' is intended any sequence least about 20 contiguous nucleotides in length, and option 55 comparison program that, for any two sequences in question, ally can be about 30, about 40, about 50, about 100, or longer. generates an alignment having identical nucleotide or amino Those of skill in the art understand that to avoid a high acid residue matches and an identical percent sequence iden similarity to a reference sequence due to inclusion of gaps in tity when compared to the corresponding alignment gener the polynucleotide sequence a gap penalty is typically intro ated by GAP Version 10. duced and is subtracted from the number of matches. 60 GAP uses the algorithm of Needleman and Wunsch (1970) Methods of alignment of sequences for comparison are J. Mol. Biol. 48:443-453, to find the alignment of two com well known in the art. Thus, the determination of percent plete sequences that maximizes the number of matches and sequence identity between any two sequences can be accom minimizes the number of gaps. GAP considers all possible plished using a mathematical algorithm. Non-limiting alignments and gap positions and creates the alignment with examples of Such mathematical algorithms are the algorithm 65 the largest number of matched bases and the fewest gaps. It of Myers and Miller (1988) CABIOS 4:11-17; the local align allows for the provision of a gap creation penalty and a gap ment algorithm of Smith et al. (1981) Adv. Appl. Math. 2:482; extension penalty in units of matched bases. GAP must make US 8,912,387 B2 23 24 a profit of gap creation penalty number of matches for each window of comparison, and multiplying the result by 100 to gap it inserts. If a gap extension penalty greater than Zero is yield the percentage of sequence identity. 
chosen, GAP must, in addition, make a profit for each gap The use of the term “polynucleotide' is not intended to inserted of the length of the gap times the gap extension limit the embodiments to polynucleotides comprising DNA. penalty. Default gap creation penalty values and gap exten Those of ordinary skill in the art will recognize that poly sion penalty values in Version 10 of the GCG Wisconsin nucleotides can comprise ribonucleotides and combinations Genetics Software Package for protein sequences are 8 and 2. of ribonucleotides and deoxyribonucleotides. Such deoxyri respectively. For nucleotide sequences the default gap cre bonucleotides and ribonucleotides include both naturally ation penalty is 50 while the default gap extension penalty is occurring molecules and synthetic analogues. The polynucle 3. The gap creation and gap extension penalties can be 10 otides of the embodiments also encompass all forms of expressed as an integer selected from the group of integers sequences including, and not limited to, single-stranded consisting of from 0 to 200. Thus, for example, the gap forms, double-stranded forms, and the like. creation and gap extension penalties can be 0, 1, 2, 3, 4, 5, 6, Isolated polynucleotides of the embodiments can be incor 7,8,9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65 or greater. porated into recombinant DNA constructs capable of intro GAP presents one member of the family of best align 15 duction into and replication in a host cell. A “vector” may be ments. There may be many members of this family, and no Such a construct that includes a replication system and other member has a better quality. GAP displays four figures sequences that are capable of transcription and translation of of merit for alignments: Quality, Ratio, Identity, and Similar a polypeptide-encoding sequence in a given host cell. A num ity. 
The Quality is the metric maximized in order to align the ber of vectors suitable for stable transfection of plant cells or sequences. Ratio is the quality divided by the number of bases for the establishment of transgenic plants have been described in the shorter segment. Percent Identity is the percent of the in, e.g., Pouwels et al. Cloning Vectors. A Laboratory symbols that actually match. Percent Similarity is the percent Manual, 1985, Supp. 1987; Weissbach and Weissbach, Meth of the symbols that are similar. Symbols that are across from ods for Plant Molecular Biology, Academic Press, 1989; and gaps are ignored. A similarity is scored when the scoring Flevin et al., Plant Molecular Biology Manual. Kluwer Aca matrix value for a pair of symbols is greater than or equal to 25 demic Publishers, 1990. Typically, plant expression vectors 0.50, the similarity threshold. The scoring matrix used in include, for example, one or more cloned plant genes under Version 10 of the GCG Wisconsin Genetics Software Package the transcriptional control of 5' and 3' regulatory sequences is BLOSUM62 (see Henikoff and Henikoff (1989) Proc. and a dominant selectable marker. Such plant expression Natl. Acad. Sci. USA 89:10915). vectors also can contain a promoter regulatory region (e.g., a (c) As used herein, “sequence identity” or “identity” in the 30 regulatory region controlling inducible or constitutive, envi context of two polynucleotides or polypeptide sequences ronmentally- or developmentally-regulated, or cell- or tissue makes reference to the residues in the two sequences that are specific expression), a transcription initiation start site, a ribo the same when aligned for maximum correspondence over a Some binding site, an RNA processing signal, a transcription specified comparison window. When percentage of sequence termination site, and/or a polyadenylation signal. 
identity is used in reference to proteins it is recognized that 35 The terms “recombinant construct.” “expression cassette.” residue positions which are not identical often differ by con “expression construct.” “chimeric construct.” “construct.” servative amino acid substitutions, whereamino acid residues “recombinant DNA construct” and “recombinant DNA frag are substituted for other amino acid residues with similar ment” are used interchangeably herein and are nucleic acid chemical properties (e.g., charge or hydrophobicity) and fragments. A recombinant construct comprises an artificial therefore do not change the functional properties of the mol 40 combination of nucleic acid fragments, including, and not ecule. When sequences differ in conservative substitutions, limited to, regulatory and coding sequences that are not found the percent sequence identity may be adjusted upwards to together in nature. For example, a recombinant DNA con correct for the conservative nature of the substitution. struct may comprise regulatory sequences and coding Sequences that differ by such conservative substitutions are sequences that are derived from different sources, or regula said to have “sequence similarity” or “similarity.” Means for 45 tory sequences and coding sequences derived from the same making this adjustment are well known to those of skill in the Source and arranged in a manner different than that found in art. Typically this involves scoring a conservative Substitution nature. Such construct may be used by itself or may be used in as a partial rather than a full mismatch, thereby increasing the conjunction with a vector. If a vector is used then the choice percentage sequence identity. Thus, for example, where an of vector is dependent upon the method that will be used to identical amino acid is given a score of 1 and a non-conser 50 transform host cells as is well knownto those skilled in theart. 
Vative Substitution is given a score of Zero, a conservative For example, a plasmid vector can be used. The skilled artisan Substitution is given a score between Zero and 1. The scoring is well aware of the genetic elements that must be present on of conservative Substitutions is calculated, e.g., as imple the vector in order to successfully transform, select and mented in the program PC/GENE (Intelligenetics, Mountain propagate host cells comprising any of the isolated nucleic View, Calif.). 55 acid fragments of the embodiments. Screening to obtain lines (d) As used herein, percentage of sequence identity” displaying the desired expression level and pattern of the means the value determined by comparing two optimally polynucleotides or of the Rcgl locus may be accomplished by aligned sequences over a comparison window, wherein the amplification, Southern analysis of DNA, northern analysis portion of the polynucleotide sequence in the comparison of mRNA expression, immunoblotting analysis of protein window may comprise additions or deletions (i.e., gaps) as 60 expression, phenotypic analysis, and the like. compared to the reference sequence (which does not com The term “recombinant DNA construct refers to a DNA prise additions or deletions) for optimal alignment of the two construct assembled from nucleic acid fragments obtained sequences. The percentage is calculated by determining the from different sources. The types and origins of the nucleic number of positions at which the identical nucleic acid base or acid fragments may be very diverse. amino acid residue occurs in both sequences to yield the 65 In some embodiments, expression cassettes comprising a number of matched positions, dividing the number of promoter operably linked to a heterologous nucleotide matched positions by the total number of positions in the sequence of the embodiments are further provided. The US 8,912,387 B2 25 26 expression cassettes of the embodiments find use in generat 15:9627-9639. 
ing transformed plants, plant cells, and microorganisms and in practicing the methods for inducing plant fungal pathogen resistance disclosed herein. The expression cassette will include 5' and 3' regulatory sequences operably linked to a polynucleotide of the embodiments. “Operably linked” is intended to mean a functional linkage between two or more elements. “Regulatory sequences” refer to nucleotides located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which may influence the transcription, RNA processing, stability, or translation of the associated coding sequence. Regulatory sequences may include, and are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences. For example, an operable linkage between a polynucleotide of interest and a regulatory sequence (a promoter, for example) is functional link that allows for expression of the polynucleotide of interest. Operably linked elements may be contiguous or non-contiguous. When used to refer to the joining of two protein coding regions, by operably linked is intended that the coding regions are in the same reading frame. The cassette may additionally contain at least one additional gene to be cotransformed into the organism. Alternatively, the additional gene(s) can be provided on multiple expression cassettes. Such an expression cassette is provided with a plurality of restriction sites and/or recombination sites for insertion of the polynucleotide that encodes an antipathogenic polypeptide to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain selectable marker genes.

The expression cassette will include in the 5'-3' direction of transcription, a transcriptional initiation region (i.e., a promoter), translational initiation region, a polynucleotide of the embodiments, a translational termination region and, optionally, a transcriptional termination region functional in the host organism. The regulatory regions (i.e., promoters, transcriptional regulatory regions, and translational termination regions) and/or the polynucleotide of the embodiments may be native/analogous to the host cell or to each other. Alternatively, the regulatory regions and/or the polynucleotide of the embodiments may be heterologous to the host cell or to each other. As used herein, “heterologous” in reference to a sequence is a sequence that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous polynucleotide is from a species different from the species from which the polynucleotide was derived, or, if from the same/analogous species, one or both are substantially modified from their original form and/or genomic locus, or the promoter is not the native promoter for the operably linked polynucleotide.

The optionally included termination region may be native with the transcriptional initiation region, may be native with the operably linked polynucleotide of interest, may be native with the plant host, or may be derived from another source (i.e., foreign or heterologous) to the promoter, the polynucleotide of interest, the host, or any combination thereof. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acids Res. In particular embodiments, the potato protease inhibitor II gene (PinII) terminator is used. See, for example, Keil et al. (1986) Nucl. Acids Res. 14:5641-5650; and An et al. (1989) Plant Cell 1:115-122, herein incorporated by reference in their entirety.

A number of promoters can be used in the practice of the embodiments, including the native promoter of the polynucleotide sequence of interest. The promoters can be selected based on the desired outcome. A wide range of plant promoters are discussed in the recent review of Potenza et al. (2004) In Vitro Cell Dev Biol Plant 40:1-22, herein incorporated by reference. For example, the nucleic acids can be combined with constitutive, tissue-preferred, pathogen-inducible, or other promoters for expression in plants. Such constitutive promoters include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter (Odell et al. (1985) Nature 313:810-812); rice actin (McElroy et al. (1990) Plant Cell 2:163-171); ubiquitin (Christensen et al. (1989) Plant Mol. Biol. 12:619-632 and Christensen et al. (1992) Plant Mol. Biol. 18:675-689); pEMU (Last et al. (1991) Theor. Appl. Genet. 81:581-588); MAS (Velten et al. (1984) EMBO J. 3:2723-2730); ALS promoter (U.S. Pat. No. 5,659,026), and the like. Other constitutive promoters include, for example, U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; 5,608,142; and 6,177,611.

It may sometimes be beneficial to express the gene from an inducible promoter, particularly from a pathogen-inducible promoter. Such promoters include those from pathogenesis-related proteins (PR proteins), which are induced following infection by a pathogen; e.g., PR proteins, SAR proteins, beta-1,3-glucanase, chitinase, etc. See, for example, Redolfi et al. (1983) Neth. J. Plant Pathol. 89:245-254; Uknes et al. (1992) Plant Cell 4:645-656; and Van Loon (1985) Plant Mol. Virol. 4:111-116. See also WO99/43819, herein incorporated by reference.

Of interest are promoters that result in expression of a protein locally at or near the site of pathogen infection. See, for example, Marineau et al. (1987) Plant Mol. Biol. 9:335-342; Matton et al. (1989) Molecular Plant-Microbe Interactions 2:325-331; Somsisch et al. (1986) Proc. Natl. Acad. Sci. USA 83:2427-2430; Somsisch et al. (1988) Mol. Gen. Genet. 2:93-98; and Yang (1996) Proc. Natl. Acad. Sci. USA 93:14972-14977. See also, Chen et al. (1996) Plant J. 10:955-966; Zhang et al. (1994) Proc. Natl. Acad. Sci. USA 91:2507-2511; Warner et al. (1993) Plant J. 3:191-201; Siebertz et al. (1989) Plant Cell 1:961-968; U.S. Pat. No. 5,750,386 (nematode-inducible); and the references cited therein. Of particular interest is the inducible promoter for the maize PRms gene, whose expression is induced by the pathogen Fusarium moniliforme (see, for example, Cordero et al. (1992) Physiol. Mol. Plant. Path. 41:189-200).

Additionally, as pathogens find entry into plants through wounds or insect damage, a wound-inducible promoter may be used in the constructions of the embodiments. Such wound-inducible promoters include potato proteinase inhibitor (pin II) gene (Ryan (1990) Ann. Rev. Phytopath. 28:425-449; Duan et al. (1996) Nature Biotechnology 14:494-498); wun1 and wun2, U.S. Pat. No. 5,428,148; win1 and win2 (Stanford et al. (1989) Mol. Gen. Genet. 215:200-208); systemin (McGurl et al. (1992) Science 225:1570-1573); WIP1 (Rohmeier et al. (1993) Plant Mol. Biol. 22:783-792; Eckelkamp et al. (1993) FEBS Letters 323:73-76); MPI gene (Cordero et al. (1994) Plant J. 6(2):141-150); and the like, herein incorporated by reference.
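By way of illustration only, and not as part of the disclosed embodiments, the reading-frame requirement for operably linked protein coding regions described earlier in this section can be sketched in Python. The names `fuse_in_frame` and `codons` and the example sequences are hypothetical and are introduced solely for this sketch. The check assumes plain DNA strings without introns and the standard stop codons TAA, TAG and TGA.

```python
# Illustrative sketch only: names and sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive 3-base codons.

    Any trailing partial codon (1 or 2 bases) is dropped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding regions and check the reading frame.

    Returns (fusion, ok). ok is True when the downstream region is read
    in the same frame as the upstream region and no stop codon occurs
    before the final codon of the fusion.
    """
    up = upstream_cds.upper()
    down = downstream_cds.upper()

    # Remove a terminal stop codon from the upstream region so that
    # translation can continue into the downstream region.
    if len(up) % 3 == 0 and len(up) >= 3 and up[-3:] in STOP_CODONS:
        up = up[:-3]

    fusion = up + down

    # The downstream region stays in frame only if the upstream region
    # contributes a whole number of codons.
    in_frame = len(up) % 3 == 0

    # A stop codon anywhere before the last codon truncates the product.
    internal = codons(fusion)[:-1]
    has_internal_stop = any(c in STOP_CODONS for c in internal)

    return fusion, in_frame and not has_internal_stop


if __name__ == "__main__":
    print(fuse_in_frame("ATGGCCTAA", "GGTCATTGA"))
    print(fuse_in_frame("ATGGC", "GGTCATTGA"))
```

In this sketch, joining `ATGGCCTAA` to `GGTCATTGA` passes the check once the terminal stop codon of the first region is removed, whereas an upstream region of five bases shifts the downstream region out of frame and fails it.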
Chemical-regulated promoters can be used to modulate the expression of a gene in a plant through the application of an exogenous chemical regulator. Depending upon the objective, the promoter may be a chemical-inducible promoter, where application of the chemical induces gene expression, or a chemical-repressible promoter, where application of the chemical represses gene expression. Chemical-inducible promoters are known in the art and include, and are not limited to, the maize In2-2 promoter, which is activated by benzenesulfonamide herbicide safeners, the maize GST promoter, which is activated by hydrophobic electrophilic compounds that are used as pre-emergent herbicides, and the tobacco PR-1a promoter, which is activated by salicylic acid. Other chemical-regulated promoters of interest include steroid-responsive promoters (see, for example, the glucocorticoid-inducible promoter in Schena et al. (1991) Proc. Natl. Acad. Sci. USA 88:10421-10425 and McNellis et al. (1998) Plant J. 14(2):247-257) and tetracycline-inducible and tetracycline-repressible promoters (see, for example, Gatz et al. (1991) Mol. Gen. Genet. 227:229-237, and U.S. Pat. Nos. 5,814,618 and 5,789,156), herein incorporated by reference.

Tissue-preferred promoters can be utilized to target enhanced expression of the polypeptides of the embodiments within a particular plant tissue. For example, a tissue-preferred promoter may be used to express a polypeptide in a plant tissue where disease resistance is particularly important, such as, for example, the roots, the stalk or the leaves. Tissue-preferred promoters include Yamamoto et al. (1997) Plant J. 12(2):255-265; Kawamata et al. (1997) Plant Cell Physiol. 38(7):792-803; Hansen et al. (1997) Mol. Gen. Genet. 254(3):337-343; Russell et al. (1997) Transgenic Res. 6(2):157-168; Rinehart et al. (1996) Plant Physiol. 112(3):1331-1341; Van Camp et al. (1996) Plant Physiol. 112(2):525-535; Canevascini et al. (1996) Plant Physiol. 112(2):513-524; Yamamoto et al. (1994) Plant Cell Physiol. 35(5):773-778; Lam (1994) Results Probl. Cell Differ. 20:181-196; Orozco et al. (1993) Plant Mol. Biol. 23(6):1129-1138; Matsuoka et al. (1993) Proc. Natl. Acad. Sci. USA 90(20):9586-9590; and Guevara-Garcia et al. (1993) Plant J. 4(3):495-505. Such promoters can be modified, if necessary, for weak expression.

Vascular tissue-preferred promoters are known in the art and include those promoters that selectively drive protein expression in, for example, xylem and phloem tissue. Vascular tissue-preferred promoters include, and are not limited to, the Prunus serotina prunasin hydrolase gene promoter (see, e.g., International Publication No. WO 03/006651), and also those found in U.S. patent application Ser. No. 10/109,488.

Stalk-preferred promoters may be used to drive expression of a polypeptide of the embodiments. Exemplary stalk-preferred promoters include the maize MS8-15 gene promoter (see, for example, U.S. Pat. No. 5,986,174 and International Publication No. WO98/00533), and those found in Graham et al. (1997) Plant Mol Biol 33(4):729-735.

Leaf-preferred promoters are known in the art. See, for example, Yamamoto et al. (1997) Plant J. 12(2):255-265; Kwon et al. (1994) Plant Physiol. 105:357-67; Yamamoto et al. (1994) Plant Cell Physiol. 35(5):773-778; Gotor et al. (1993) Plant J. 3:509-18; Orozco et al. (1993) Plant Mol. Biol. 23(6):1129-1138; and Matsuoka et al. (1993) Proc. Natl. Acad. Sci. USA 90(20):9586-9590.

Root-preferred promoters are known and can be selected from the many available from the literature or isolated de novo from various compatible species. See, for example, Hire et al. (1992) Plant Mol. Biol. 20(2):207-218 (soybean root-specific glutamine synthetase gene); Keller and Baumgartner (1991) Plant Cell 3(10):1051-1061 (root-specific control element in the GRP 1.8 gene of French bean); Sanger et al. (1990) Plant Mol. Biol. 14(3):433-443 (root-specific promoter of the mannopine synthase (MAS) gene of Agrobacterium tumefaciens); and Miao et al. (1991) Plant Cell 3(1):11-22 (full-length cDNA clone encoding cytosolic glutamine synthetase (GS), which is expressed in roots and root nodules of soybean). See also Bogusz et al. (1990) Plant Cell 2(7):633-641, where two root-specific promoters isolated from hemoglobin genes from the nitrogen-fixing nonlegume Parasponia andersonii and the related non-nitrogen-fixing nonlegume Trema tomentosa are described. The promoters of these genes were linked to a β-glucuronidase reporter gene and introduced into both the nonlegume Nicotiana tabacum and the legume Lotus corniculatus, and in both instances root-specific promoter activity was preserved. Leach and Aoyagi (1991) describe their analysis of the promoters of the highly expressed rolC and rolD root-inducing genes of Agrobacterium rhizogenes (see Plant Science (Limerick) 79(1):69-76). They concluded that enhancer and tissue-preferred DNA determinants are dissociated in those promoters. Teeri et al. (1989) used gene fusion to lacZ to show that the Agrobacterium T-DNA gene encoding octopine synthase is especially active in the epidermis of the root tip and that the TR2 gene is root specific in the intact plant and stimulated by wounding in leaf tissue, an especially desirable combination of characteristics for use with an insecticidal or larvicidal gene (see EMBO J. 8(2):343-350). The TR1' gene, fused to nptII (neomycin phosphotransferase II) showed similar characteristics. Additional root-preferred promoters include the VfENOD-GRP3 gene promoter (Kuster et al. (1995) Plant Mol. Biol. 29(4):759-772); and rolB promoter (Capana et al. (1994) Plant Mol. Biol. 25(4):681-691). See also U.S. Pat. Nos. 5,837,876; 5,750,386; 5,633,363; 5,459,252; 5,401,836; 5,110,732; and 5,023,179.

“Seed-preferred” promoters include both “seed-specific” promoters (those promoters active during seed development such as promoters of seed storage proteins) as well as “seed-germinating” promoters (those promoters active during seed germination). See Thompson et al. (1989) BioEssays 10:108, herein incorporated by reference. Such seed-preferred promoters include, and are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); milps (myo-inositol-1-phosphate synthase) (see WO 00/11177 and U.S. Pat. No. 6,225,529; herein incorporated by reference). Gamma-zein is a preferred endosperm-specific promoter. Glob-1 is a preferred embryo-specific promoter. For dicots, seed-specific promoters include, and are not limited to, bean β-phaseolin, napin, β-conglycinin, soybean lectin, cruciferin, and the like. For monocots, seed-specific promoters include, and are not limited to, maize 15 kDa zein, 22 kDa zein, 27 kDa zein, g-zein, waxy, shrunken 1, shrunken 2, globulin 1, etc. See also WO00/12733, where seed-preferred promoters from end1 and end2 genes are disclosed; herein incorporated by reference.

Additional sequence modifications are known to enhance gene expression in a cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.

Expression cassettes may additionally contain 5' leader sequences. Such leader sequences can act to enhance translation. Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5' noncoding region) (Elroy-Stein et al. (1989) Proc. Natl. Acad. Sci. USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al. (1991) Nature 353:90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al. (1987) Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss, New York), pp. 237-256); and maize chlorotic mottle virus leader (MCMV) (Lommel et al. (1991) Virology 81:382-385). See also, Della-Cioppa et al. (1987) Plant Physiol. 84:965-968. Other methods known to enhance translation can also be utilized, for example, introns, and the like.

In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.

The expression cassette can also comprise a selectable marker gene for the selection of transformed cells. Selectable marker genes are utilized for the selection of transformed cells or tissues. Marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds, such as glufosinate ammonium, bromoxynil, imidazolinones, and 2,4-dichlorophenoxyacetate (2,4-D). Additional selectable markers include phenotypic markers such as β-galactosidase and fluorescent proteins such as green fluorescent protein (GFP) (Su et al. (2004) Biotechnol Bioeng 85:610-9 and Fetter et al. (2004) Plant Cell 16:215-28), cyan fluorescent protein (CYP) (Bolte et al. (2004) J. Cell Science 117:943-54 and Kato et al. (2002) Plant Physiol 129:913-42), and yellow fluorescent protein (PhiYFP™ from Evrogen, see, Bolte et al. (2004) J. Cell Science 117:943-54). For additional selectable markers, see generally, Yarranton (1992) Curr. Opin. Biotech. 3:506-511; Christopherson et al. (1992) Proc. Natl. Acad. Sci. USA 89:6314-6318; Yao et al. (1992) Cell 71:63-72; Reznikoff (1992) Mol. Microbiol. 6:2419-2422; Barkley et al. (1980) in The Operon, pp. 177-220; Hu et al. (1987) Cell 48:555-566; Brown et al. (1987) Cell 49:603-612; Figge et al. (1988) Cell 52:713-722; Deuschle et al. (1989) Proc. Natl. Acad. Sci. USA 86:5400-5404; Fuerst et al. (1989) Proc. Natl. Acad. Sci. USA 86:2549-2553; Deuschle et al. (1990) Science 248:480-483; Gossen (1993) Ph.D. Thesis, University of Heidelberg; Reines et al. (1993) Proc. Natl. Acad. Sci. USA 90:1917-1921; Labow et al. (1990) Mol. Cell. Biol. 10:3343-3356; Zambretti et al. (1992) Proc. Natl. Acad. Sci. USA 89:3952-3956; Baim et al. (1991) Proc. Natl. Acad. Sci. USA 88:5072-5076; Wyborski et al. (1991) Nucleic Acids Res. 19:4647-4653; Hillen and Wissman (1989) Topics Mol. Struc. Biol. 10:143-162; Degenkolb et al. (1991) Antimicrob. Agents Chemother. 35:1591-1595; Kleinschnidt et al. (1988) Biochemistry 27:1094-1104; Bonin (1993) Ph.D. Thesis, University of Heidelberg; Gossen et al. (1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Oliva et al. (1992) Antimicrob. Agents Chemother. 36:913-919; Hlavka et al. (1985) Handbook of Experimental Pharmacology, Vol. 78 (Springer-Verlag, Berlin); Gill et al. (1988) Nature 334:721-724. Such disclosures are herein incorporated by reference.

The above list of selectable marker genes is not meant to be limiting. Any selectable marker gene can be used in the embodiments.

In certain embodiments the nucleic acid sequences of the embodiments can be stacked with any combination of polynucleotide sequences of interest in order to create plants with a desired phenotype. This stacking may be accomplished by a combination of genes within the DNA construct, or by crossing Rcg1 with another line that comprises the combination. For example, the polynucleotides of the embodiments may be stacked with any other polynucleotides of the embodiments, or with other genes. The combinations generated can also include multiple copies of any one of the polynucleotides of interest. The polynucleotides of the embodiments can also be stacked with any other gene or combination of genes to produce plants with a variety of desired trait combinations including and not limited to traits desirable for animal feed such as high oil genes (e.g., U.S. Pat. No. 6,232,529); balanced amino acids (e.g. hordothionins (U.S. Pat. Nos. 5,990,389; 5,885,801; 5,885,802; and 5,703,409); barley high lysine (Williamson et al. (1987) Eur. J. Biochem. 165:99-106; and WO 98/20122); and high methionine proteins (Pedersen et al. (1986) J. Biol. Chem. 261:6279; Kirihara et al. (1988) Gene 71:359; and Musumura et al. (1989) Plant Mol. Biol. 12:123)); increased digestibility (e.g., modified storage proteins (U.S. application Ser. No. 10/053,410, filed Nov. 7, 2001); and thioredoxins (U.S. application Ser. No. 10/005,429, filed Dec. 3, 2001)), the disclosures of which are herein incorporated by reference. The polynucleotides of the embodiments can also be stacked with traits desirable for insect, disease or herbicide resistance (e.g., Bacillus thuringiensis toxic proteins (U.S. Pat. Nos. 5,366,892; 5,747,450; 5,737,514; 5,723,756; 5,593,881; Geiser et al. (1986) Gene 48:109); lectins (Van Damme et al. (1994) Plant Mol. Biol. 24:825); fumonisin detoxification genes (U.S. Pat. No. 5,792,931); avirulence and disease resistance genes (Jones et al. (1994) Science 266:789; Martin et al. (1993) Science 262:1432; Mindrinos et al. (1994) Cell 78:1089); acetolactate synthase (ALS) mutants that lead to herbicide resistance such as the S4 and/or Hra mutations; inhibitors of glutamine synthase such as phosphinothricin or basta (e.g., bar gene); and glyphosate resistance (EPSPS genes, GAT genes such as those disclosed in U.S. Patent Application Publication US2004/0082770, also WO02/36782 and WO03/092360)); and traits desirable for processing or process products such as high oil (e.g., U.S. Pat. No. 6,232,529); modified oils (e.g., fatty acid desaturase genes (U.S. Pat. No. 5,952,544; WO 94/11516)); modified starches (e.g., ADPG pyrophosphorylases (AGPase), starch synthases (SS), starch branching enzymes (SBE) and starch debranching enzymes (SDBE)); and polymers or bioplastics (e.g., U.S. Pat. No. 5,602,321; beta-ketothiolase, polyhydroxybutyrate synthase, and acetoacetyl-CoA reductase (Schubert et al. (1988) J. Bacteriol. 170:5837-5847) facilitate expression of polyhydroxyalkanoates (PHAs)), the disclosures of which are herein incorporated by reference. One could also combine the polynucleotides of the embodiments with polynucleotides providing agronomic traits such as male sterility (e.g., see U.S. Pat. No. 5,583,210), stalk strength, flowering time, or transformation technology traits such as cell cycle regulation or gene targeting (e.g. WO99/61619; WO 00/17364; WO 99/25821), the disclosures of which are herein incorporated by reference.

These stacked combinations can be created by any method including and not limited to cross breeding plants by any conventional or TopCross® (a grain production system) methodology, or genetic transformation. If the traits are stacked by genetically transforming the plants, the polynucleotide sequences of interest can be combined at any time and in any order. For example, a transgenic plant comprising one or more desired traits can be used as the target to introduce further traits by subsequent transformation. The traits can be introduced simultaneously in a co-transformation protocol with the polynucleotides of interest provided by any combination of transformation cassettes. For example, if two sequences will be introduced, the two sequences can be contained in separate transformation cassettes (trans) or contained on the same transformation cassette (cis). Expression of the sequences can be driven by the same promoter or by different promoters. In certain cases, it may be desirable to introduce a transformation cassette that will suppress the expression of the polynucleotide of interest. This may be combined with any combination of other suppression cassettes or overexpression cassettes to generate the desired combination of traits in the plant.

The methods of the embodiments may involve, and are not limited to, introducing a polypeptide or polynucleotide into a plant. “Introducing” is intended to mean presenting to the plant the polynucleotide. In some embodiments, the polynucleotide will be presented in such a manner that the sequence gains access to the interior of a cell of the plant, including its potential insertion into the genome of a plant. The methods of the embodiments do not depend on a particular method for introducing a sequence into a plant, only that the polynucleotide gains access to the interior of at least one cell of the plant. Methods for introducing polynucleotides into plants are known in the art including, and not limited to, stable transformation methods, transient transformation methods, and virus-mediated methods.

“Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” organisms. “Host cell” refers the cell into which transformation of the recombinant DNA construct takes place and may include a yeast cell, a bacterial cell, and a plant cell. Examples of methods of plant transformation include Agrobacterium-mediated transformation (De Blaere et al., 1987, Meth. Enzymol. 143:277) and particle-accelerated or “gene gun” transformation technology (Klein et al., 1987, Nature (London) 327:70-73; U.S. Pat. No. 4,945,050), among others.

“Stable transformation” is intended to mean that the nucleotide construct introduced into a plant integrates into the genome of the plant and is capable of being inherited by the progeny thereof. “Transient transformation” or “transient expression” is intended to mean that a polynucleotide is introduced into the plant and does not integrate into the genome of the plant or a polypeptide is introduced into a plant.

Transformation protocols as well as protocols for introducing polypeptides or polynucleotide sequences into plants may vary depending on the type of plant or plant cell, i.e., monocot or dicot, targeted for transformation. Suitable methods of introducing polypeptides and polynucleotides into plant cells include microinjection (Crossway et al. (1986) Biotechniques 4:320-334), electroporation (Riggs et al. (1986) Proc. Natl. Acad. Sci. USA 83:5602-5606), Agrobacterium-mediated transformation (U.S. Pat. Nos. 5,563,055 and 5,981,840), direct gene transfer (Paszkowski et al. (1984) EMBO J. 3:2717-2722), and ballistic particle acceleration (see, for example, Sanford et al., U.S. Pat. Nos. 4,945,050; 5,879,918; 5,886,244; and 5,932,782; Tomes et al. (1995) in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); McCabe et al. (1988) Biotechnology 6:923-926); and Lec1 transformation (WO 00/28058). Also see, Weissinger et al. (1988) Ann. Rev. Genet. 22:421-477; Sanford et al. (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al. (1988) Plant Physiol. 87:671-674 (soybean); McCabe et al. (1988) Bio/Technology 6:923-926 (soybean); Finer and McMullen (1991) In Vitro Cell Dev. Biol. 27P:175-182 (soybean); Singh et al. (1998) Theor. Appl. Genet. 96:319-324 (soybean); Datta et al. (1990) Biotechnology 8:736-740 (rice); Klein et al. (1988) Proc. Natl. Acad. Sci. USA 85:4305-4309 (maize); Klein et al. (1988) Biotechnology 6:559-563 (maize); U.S. Pat. Nos. 5,240,855; 5,322,783 and 5,324,646; Klein et al. (1988) Plant Physiol. 91:440-444 (maize); Fromm et al. (1990) Biotechnology 8:833-839 (maize); Hooykaas-Van Slogteren et al. (1984) Nature (London) 311:763-764; U.S. Pat. No. 5,736,369 (cereals); Bytebier et al. (1987) Proc. Natl. Acad. Sci. USA 84:5345-5349 (Liliaceae); De Wet et al. (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al. (Longman, New York), pp. 197-209 (pollen); Kaeppler et al. (1990) Plant Cell Reports 9:415-418 and Kaeppler et al. (1992) Theor. Appl. Genet. 84:560-566 (whisker-mediated transformation); D'Halluin et al. (1992) Plant Cell 4:1495-1505 (electroporation); Li et al. (1993) Plant Cell Reports 12:250-255 and Christou and Ford (1995) Annals of Botany 75:407-413 (rice); Osjoda et al. (1996) Nature Biotechnology 14:745-750 (maize via Agrobacterium tumefaciens); all of which are herein incorporated by reference.

Methods are known in the art for the targeted insertion of a polynucleotide at a specific location in the plant genome. In one embodiment, the insertion of the polynucleotide at a desired genomic location is achieved using a site-specific recombination system. See, for example, WO99/25821, WO99/25854, WO99/25840, WO99/25855, and WO99/25853, all of which are herein incorporated by reference. Briefly, the polynucleotide of the embodiments can be contained in transfer cassette flanked by two non-identical recombination sites. The transfer cassette is introduced into a plant have stably incorporated into its genome a target site which is flanked by two non-identical recombination sites that correspond to the sites of the transfer cassette. An appropriate recombinase is provided and the transfer cassette is integrated at the target site. The polynucleotide of interest is thereby integrated at a specific chromosomal position in the plant genome.

The cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting progeny having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved. In this manner, the embodiments provides transformed seed (also referred to as “transgenic seed”) having a nucleotide construct of the embodiments, for example, an expression cassette of the embodiments, stably incorporated into their genome.

As used herein, the term “plant” can be a whole plant, any part thereof, or a cell or tissue culture derived from a plant. Thus, the term “plant” can refer to any of whole plants, plant components or organs (including but not limited to embryos, pollen, ovules, seeds, leaves, flowers, branches, fruit, kernels, ears, cobs, husks, stalks, roots, root tips, anthers, and the like), plant tissues, plant cells, plant protoplasts, plant cell tissue cultures from which maize plant can be regenerated, plant calli, plant clumps, and plant seeds. A plant cell is a cell of a plant, either taken directly from a seed or plant, or derived through culture from a cell taken from a plant. Grain is intended to mean the mature seed produced by commercial growers for purposes other than growing or reproducing the species. Progeny, variants, and mutants of the regenerated plants are also included within the scope of the embodiments, provided that these parts comprise the introduced polynucleotides.

The embodiments of the invention may be used to confer or enhance fungal plant pathogen resistance or protect from fungal pathogen attack in plants, especially corn (Zea mays). It will protect different parts of the plant from attack by pathogens, including and not limited to stalks, ears, leaves, roots and tassels. Other plant species may also be of interest in practicing the embodiments of the invention, including, and not limited to.

The terms “phenotype”, or “phenotypic trait” or “trait” refers to one or more trait of an organism. The phenotype can be observable to the naked eye, or by any other means of evaluation known in the art, e.g., microscopy, biochemical analysis, or an electromechanical assay. In some cases, a phenotype is directly controlled by a single gene or genetic locus, i.e., a “single gene trait”. In other cases, a phenotype is the result of several genes.

A “physical map” of the genome is a map showing the linear order of identifiable landmarks (including genes, markers, etc.) on chromosome DNA. However, in contrast to genetic maps, the distances between landmarks are absolute (for example, measured in base pairs or isolated and overlapping contiguous genetic fragments) and not based on genetic recombination.

In maize, a number of BACs, or bacterial artificial chromosomes, each containing a large insert of maize genomic DNA, have been assembled into contigs (overlapping contiguous genetic fragments, or “contiguous DNA”). A BAC can assemble to a contig based on sequence alignment, if the BAC is sequenced, or via the alignment of its BAC fingerprint to the fingerprints of other BACs in a contig. The assemblies are available to the public using the genome Maize Genome Browser, which is publicly available on the internet.

A “plant” can be a whole plant, any part thereof, or a cell or tissue culture derived from a plant. Thus, the term “plant” can refer to any of whole plants, plant components or organs (e.g., leaves, stems, roots, etc.), plant tissues, seeds, plant cells, and/or progeny of the same. A plant cell is a cell of a plant, taken from a plant, or derived through culture from a cell taken from a plant. Thus, the term “maize plant” includes whole maize plants, maize plant cells, maize plant protoplast, maize plant cell or maize tissue culture from which maize plants can be regenerated, maize plant calli, and maize plant cells that are intact in maize plants or parts of maize plants, such as maize seeds, maize cobs, maize flowers, maize cotyledons, maize leaves, maize stems, maize buds, maize roots, maize root tips and the like.

The term “quantitative trait locus” or “QTL” refers to a region of DNA that is associated with the differential expression of a phenotypic trait in at least one genetic background, e.g., in at least one breeding population. QTLs are closely linked to the gene or genes that underlie the trait in question.

Before describing the present invention in detail, it should be understood that this invention is not limited to particular embodiments. It also should be understood that the terminology used herein is for the purpose of describing particular embodiments, and is not intended to be limiting. As used herein and in the appended claims, terms in the singular and the singular forms “a”, “an” and “the”, for example, include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “plant”, “the plant” or “a plant” also includes a plurality of plants. Depending on the context, use of the term “plant” can also include genetically similar or identical progeny of that plant. The use of the term “a nucleic acid” optionally includes many copies of that nucleic acid molecule.

Methods for identifying maize plants with increased head smut resistance through the genotyping of associated marker loci are provided. Head smut resistance in maize is an agronomically important trait, as head smut infection lowers yield.

It has been recognized for quite some time that specific chromosomal loci (or intervals) can be mapped in an organism's genome that correlate with particular quantitative phenotypes, such as head smut resistance. Such loci are termed quantitative trait loci, or QTL. The plant breeder can advantageously use molecular markers to identify desired individuals by identifying marker alleles that show a statistically significant probability of co-segregation with a desired phenotype, manifested as linkage disequilibrium. By identifying a molecular marker or clusters of molecular markers that co-segregate with a quantitative trait, the breeder is thus identifying a QTL. By identifying and selecting a marker allele (or desired alleles from multiple markers) that associates with the desired phenotype, the plant breeder is able to rapidly select a desired phenotype by selecting for the proper molecular marker allele (a process called marker-assisted selection, or MAS).

A variety of methods well known in the art are available for detecting molecular markers or clusters of molecular markers that co-segregate with a quantitative trait such as head smut resistance. The basic idea underlying all of these methods is the detection of markers, for which alternative genotypes (or alleles) have significantly different average phenotypes. Thus, one makes a comparison among marker loci of the magnitude of difference among alternative genotypes (or alleles) or the level of significance of that difference. Trait genes are inferred to be located nearest the marker(s) that have the greatest associated genotypic difference.

Two such methods used to detect QTLs are: 1) Population-based structured association analysis and 2) Pedigree-based association analysis. In a population-based structured association analysis, lines are obtained from pre-existing populations with multiple founders, e.g. elite breeding lines. Population-based association analyses rely on the decay of linkage disequilibrium (LD) and the idea that in an unstructured population, only correlations between QTL and markers closely linked to the QTL will remain after so many generations of random mating. In reality, most pre-existing populations have population substructure. Thus, the use of a structured association approach helps to control population structure by allocating individuals to populations using data obtained from markers randomly distributed across the genome, thereby minimizing disequilibrium due to population structure within the individual populations (also called subpopulations). The phenotypic values are compared to the genotypes (alleles) at each marker locus for each line in the subpopulation. A significant marker-trait association indicates the close proximity between the marker locus and one or more genetic loci that are involved in the expression of that trait.

In pedigree-based association analyses, LD is generated by creating a population from a small number of founders. For example, in an interval mapping approach (Lander and Botstein, Genetics 121:185-199 (1989)), each of many positions along the genetic map (say at 1 cM intervals) is tested for the likelihood that a QTL is located at that position. The genotype/phenotype data are used to calculate for each test position a LOD score (log of likelihood ratio). When the LOD score exceeds a critical threshold value (herein equal to 2.5), there is significant evidence for the location of a QTL at that position on the genetic map (which will fall between two particular marker loci).

Markers associated with the head smut resistance trait are identified herein, as are marker alleles associated with either increased or decreased head smut resistance. The methods involve detecting the presence of at least one marker allele associated with either the increased or decreased head smut resistance in the germplasm of a maize plant.

marker allele with either the increased or decreased head smut resistance phenotype is due to the original “coupling” linkage phase between the marker allele and the QTL allele in the ancestral maize line from which the QTL allele originated. Eventually, with repeated recombination, crossing over events between the marker and QTL locus can change this orientation. For this reason, the favorable marker allele may change depending on the linkage phase that exists within the resistant parent used to create segregating populations. This does not change the fact that the genetic marker can be used to monitor segregation of the phenotype. It only changes which marker allele is considered favorable in a given segregating population.

A variety of methods well known in the art are available for identifying chromosome intervals. The boundaries of such
chromosome intervals are drawn to encompass markers that A common measure of linkage is the frequency with which will be linked to one or more QTL. In other words, the traits cosegregate. This can be expressed as a percentage of chromosome interval is drawn Such that any marker that lies cosegregation (recombination frequency) or in centiMorgans within that interval (including the terminal markers that (cM). The cM is a unit of measure of genetic recombination define the boundaries of the interval) can be used as markers frequency. One cM is equal to a 1% chance that a trait at one for head Smut resistance. Each interval comprises at least one genetic locus will be separated from a trait at another locus QTL, and furthermore, may indeed comprise more than one due to crossing over in a single generation (meaning the traits QTL. Close proximity of multiple QTL in the same interval segregate together 99% of the time). Because chromosomal may obfuscate the correlation of a particular marker with a distance is approximately proportional to the frequency of 25 particular QTL, as one marker may demonstrate linkage to crossing over events between traits, there is an approximate more than one QTL. Conversely, e.g., if two markers in close physical distance that correlates with recombination fre proximity show co-segregation with the desired phenotypic quency. For example, in maize, 1 cM correlates, on average, trait, it is sometimes unclear if each of those markers identify to about 2,140,000 base pairs (2.14 Mbp). the same QTL or two different QTL. Regardless, knowledge Marker loci are themselves traits and can be assessed 30 of how many QTL are in a particular interval is not necessary according to standard linkage analysis by tracking the marker to make or practice the invention. loci during segregation. 
Thus, one cM is equal to a 1% chance Methods for marker assisted selection (MAS), in which that a marker locus will be separated from another locus, due phenotypes are selected based on marker genotypes, are also to crossing over in a single generation. provided. To perform MAS, a nucleic acid corresponding to Other markers linked to the QTL markers can be used to 35 the marker nucleic acid allele is detected in a biological predict the state of the head Smut resistance in a maize plant. sample from a plant to be selected. This detection can take the This includes any marker within 50 cM of the genetic locus. form of hybridization of a probe nucleic acid to a marker The closer a marker is to a QTL marker, the more effective allele or amplicon thereof, e.g., using allele-specific hybrid and advantageous that marker is as an indicator for the desired ization, Southern analysis, northern analysis, in situ hybrid trait. Closely linked loci display an inter-locus cross-over 40 ization, hybridization of primers followed by PCR amplifica frequency of about 10% or less, preferably about 9% or less, tion of a region of the marker, DNA sequencing of a PCR still more preferably about 8% or less, yet more preferably amplification product, or the like. The procedures used to about 7% or less, still more preferably about 6% or less, yet detect marker alleles are known to one of ordinary skill in the more preferably about 5% or less, still more preferably about art. After the presence (or absence) of a particular marker 4% or less, yet more preferably about 3% or less, and still 45 allele in the biological sample is verified, the plant is selected more preferably about 2% or less. In highly preferred and is crossed to a second plant, preferably a maize plant from embodiments, the relevant loci (e.g., a marker locus and a an elite line. 
The progeny plants produced by the cross can be target locus such as a QTL) display a recombination fre evaluated for that specific marker allele, and only those prog quency of about 1% or less, e.g., about 0.75% or less, more eny plants that have the desired marker allele will be chosen. preferably about 0.5% or less, or yet more preferably about 50 Maize plant breeders desire combinations of desired 0.25% or less. Thus, the loci are about 10 cM, 9 cM, 8 cM, 7 genetic loci. Such as those marker alleles associated with cM, 6 cM, 5 cM, 4 cM, 3 cM, 2 cM, 1 cM, 0.75 cM, 0.5 cM, increased resistance to head Smut, with genes for high yield 0.25 cM, 0.1 cM, 0.075 cM, 0.05 cM, 0.025 cM, or 0.01 cM and other desirable traits to develop improved maize variet or less apart. Put another way, two loci that are localized to the ies. Screening large numbers of samples by non-molecular same chromosome, and at Such a distance that recombination 55 methods (e.g., trait evaluation in maize plants) can be expen between the two loci occurs at a frequency of less than 10% sive, time consuming, and unreliable. Use of the polymorphic (e.g., about 9%, 8%, 7%, 6%. 5%, 4%, 3%, 2%.1%, 0.75%, markers described herein, when genetically-linked to head 0.5%, 0.25%, 0.1%, 0.075%, 0.05%, 0.025%, or 0.01% or Smut resistance loci, provide an effective method for selecting less) are said to be “proximal to each other. varieties with head Smut resistance inbreeding programs. For Although particular marker alleles can show co-segrega 60 example, one advantage of marker-assisted selection over tion with the head Smut resistance phenotype, it is important field evaluations for head Smut resistance is that MAS can be to note that the marker locus is not necessarily part of the QTL done at any time of year, regardless of the growing season. locus responsible for the expression of the head Smut resis Moreover, environmental effects are largely irrelevant to tance phenotype. 
For example, it is not a requirement that the marker-assisted selection. marker polynucleotide sequence be part of a gene that imparts 65 Another use of MAS in plant breeding is to assist the increased head Smut resistance (for example, be part of the recovery of the recurrent parent genotype by backcrossbreed gene open reading frame). The association between a specific ing. Backcross breeding is the process of crossing a progeny US 8,912,387 B2 37 38 back to one of its parents or parent lines. Backcrossing is Example 2 usually done for the purpose of introgressing one or a few loci from a donor parent (e.g., a parent comprising desirable head Artificial Inoculation and Resistant Scoring in the Smut resistance marker loci) into an otherwise desirable Field genetic background from the recurrent parent (e.g., an other wise highyielding maize line). The more cycles of backcross The Sori containing teliospores of S. reliana were collected ing that are done, the greater the genetic contribution of the from the field in the previous growing season and stored in recurrent parent to the resulting introgressed variety. This is cloth bag in a dry and well ventilated environment. Before often necessary, because plants may be otherwise undesir planting, were removed from the Sori, filtered, and able, e.g., due to low yield, low fecundity, or the like. In 10 then mixed with soil at a ratio of 1:1000. The mixture of soil contrast, Strains which are the result of intensive breeding and teliospores were used to cover maize kernels when Sow programs may have excellent yield, fecundity or the like, ing seeds to conduct artificial inoculation. Plants at maturity merely being deficient in one desired trait Such as head Smut stage were scored for the presence/absence of Sorus in either resistance. ear or tassels as an indicator for Susceptibility/resistance. 
One application of MAS is to use the markers to increase 15 the efficiency of an introgression or backcrossing effort DNA Extraction aimed at introducing an increased resistance to head Smut Leaf tissues from one-month-old plants were harvested QTL into a desired (typically high yielding) background. In and ground to a powder in liquid nitrogen. Genomic DNA marker assisted backcrossing of specific markers (and asso was extracted followed the method described by Murray and ciated QTL) from a donor source, e.g., to an elite or exotic Thompson (1980). genetic background, one selects among backcross progeny Genotyping at SSR Markers and Linkage Map Construction for the donor trait and then uses repeated backcrossing to the SSR markers were firstly employed to check their poly elite or exotic line to reconstitute as much of the elite/exotic morphisms between two parents Ji 1037 and Huangzhao4. backgrounds genome as possible. Only those SSR markers that showed unambiguously poly The most preferred QTL markers (or marker alleles) for 25 morphic bands and evenly distributed across ten chromo MAS are those that have the strongest association with the Somes were used to genotype segregating populations. PCR head Smut resistance trait. reactions were performed as follows: denaturation at 94° C. for 2 minutes, followed by 35 cycles of denaturation at 94° C. EXAMPLES for 30 seconds, annealing at 58°C. for 30 seconds, extension 30 at 72°C. for 30 seconds, and with a final extension step at 72° The following examples are offered to illustrate, but not to C. for 10 minutes. The PCR products were subjected to elec limit, the appended claims. It is understood that the examples trophoresis on 6% polyacrylamide gel, followed by sliver and embodiments described herein are for illustrative pur staining for visualization. 
poses only and that persons skilled in the art will recognize A total of 94 BC individuals were randomly selected from various reagents or parameters that can be altered without 35 the BC generation and assayed for their genotypes at the 113 departing from the spirit of the invention or the scope of the polymorphic SSR markers. A PCR band was marked as 2 if appended claims. it is the same as that of the donor parent, and scored as 1 if it is identical to that of the recurrent parent. The ratio of Example 1 homozygotes (1/1) to heterozygotes (1/2) in the BC back 40 cross population was analyzed for its consistency of 1:1 at Plant Materials each SSR marker by x test. The genetic distances between SSR markers were estimated by MAPMAKER/Exp version Two inbred lines, Ji 1037 (donor parent) and 3.0b (Lincoln et al. 1992). By the way, some markers on Huangzhao.4 (recurrent parent), which differ wildly in resis chromosome 2 were genotyped in different scales of popula tance to the host-specific fungus Sphacelotheca reiliana Clint 45 tions, and their genetic positions were adjusted with the inte were used as parental lines to develop all mapping popula gration data in the JoinMap Software. tions in this study. All plant materials tested in the present Data Analysis and QTL/Gene Mapping study were artificially inoculated with S. reiliana Clint. Putative QTLS conferring resistance to head Smut were Ji 1037 shows fully resistant to head Smut and no any sus identified according to design III of Trait-Based Analysis ceptible individual has ever been observed in the field; while, 50 (Lebowitzetal. 1987). Briefly, BC individuals with the resis Huangzhao.4, an elite Chinese inbred line, is highly suscep tance QTL are expected to be more resistant to head Smut than tible to head Smut with ~75% susceptible individuals in the those without the resistance QTL. Consequently, a marker field. 
In 2004, a BC population consisting of 314 individuals allele adjacent to the resistance QTL in coupling would show along with two parents was grown in the experimental farm of higher frequency in the resistant group than that in the Sus the Jilin Academy of Agricultural Sciences, Gongzhulin. 55 ceptible group. A tetrad grids x test (SASR 8.2 version) was Each BC individual was evaluated for its resistance against used to test allele frequencies at all markers between the head Smut. Resistant BC individuals were backcrossed to resistant and Susceptible groups to scan putative QTL across Huangzhao.4 to generate BC families (BC population). whole genome. Thereafter, a number of methods were In 2005, -20 plants from each BC family were grown in a employed to confirm the major QTL region and its effective single plot to evaluate their resistances to head Smut. Recom 60 ness in resistance to head Smut. First, the SSR markers in the binant individuals from BC, population were identified and putative major QTL region were used to genotype all BC backcrossed to Huangzhao.4 to generate BC families or individuals to confirm the presence of the major QTL. Sec self-pollinated to produce BCF families. In 2006, approxi ond, infection percentages of BC individuals were estimated mately 80 individuals from each of the 59 BC and nine based on their BC progenies to confirm the putative major BCF2 families were grown in the experimental farm of the 65 QTL by single-factor analysis of variance. Third, putative Jilin Academy of Agricultural Sciences for investigating their QTL was identified across the ten chromosomes by the com resistances to head Smut. posite interval mapping method (Windows QTL Cartogra US 8,912,387 B2 39 40 pher Version 2.0 software). Finally, the major QTL was fur harbors the resistance gene, the progeny with the donor ther confirmed by estimating its genetic effect in reducing regions would show significantly higher resistant than those disease incidence. 
without the donor regions. By comparing the insert sizes of the resistant and non-resistant donor regions, we could fix Example 3 on an interval where the resistance gene resides on. With an application of the newly-developed high-density markers, we Development of the Region-Specific Markers could definitely define the donor regions harboring the resis tance gene and therefore narrow down the resistance region Sequences available in the major resistance QTL region, into a very short interval. In all comparisons, significant dif including the anchored EST. IDP, RGA, BAC, and BAC-end 10 sequences, were used to develop high-density markers. These ferences were estimated on SASR software using x test. sequences were compared to NCBI and MAGI databases via Example 5 tBLASTn to obtain possible longer sequences. Primer was designed using the PRIMER5.0 software in accordance with Construction of the SSR Linkage Map the following parameters: 20 nucleotides in length, GC con 15 tent of 40% to 60%, no secondary structure, and no consecu tive tracts of a single nucleotide. A total of 700 SSR markers were checked for their poly Primer pairs were used to amplify the corresponding seg morphisms between Ji1037 and Huangzhao.4. Among the ments from both parents. The cycling parameters were set up 347 polymorphic SSR markers, 113 markers evenly distrib the same as those described above except for the annealing uted across ten chromosomes were selected to genotype the temperature that was adjusted according to different primer BC mapping population. Of these 113 markers, 33 (29.2%) pairs. Only those amplicons with the same or bigger than showed distortion segregation at PK0.05 or at p-0.01. Gen predicted were cut down from gel and purified with Gel erally, markers showing genetic distortion had no negative Extraction Kit (Qiagen GmbH, Hilden, Germany). The puri impact on QTL detection. 
Therefore, a linkage map was fied PCR products were then cloned into the vector PGEMR) 25 constructed using all 113 SSR markers. The map was ~1753.4 (Promega, Madison, USA). Normally, three to five positive cM in length with one marker in every 14.6 cMaveragely. clones for each amplicon were selected for sequencing to avoid any contamination or mismatch. The amplicon Example 6 sequence was firstly compared with the original one from which it was derived to make sure the right one was obtained, 30 Mapping Putative QTLs and then comparison was conducted to search for sequence divergence between two parents by using DNAMAN soft According to the Design III of TB analysis (Lebowitz et al. ware. The InDels were amenable for developing sequence 1987), each of the 113 SSR markers was tested for its fre tagged site (STS) markers; while single nuclear polymor quency at 1/2 (heterozygote) and 1/1 (homozygote) in both phism (SNP) can be used to develop either SNP marker or 35 the resistant and Susceptible groups. The significant biases at CAPS marker (cleaved-amplified polymorphic sequence). A frequencies between the resistant and Susceptible groups CAPS marker is developed if the SNP is related to a given were observed for those markers located on the four chromo restriction site. In developing SNP marker, a SNPpicker pro somal regions (bins 1.02/3, 2.08/9, 6.07, and 10.03/4), sug gram of SeqVISTA software was used to see if it was possible gesting the presence of four putative QTLS (Table 1). For to create a specific restriction site by introducing a mismatch 40 instance, the markers on bin 2.09 showed no distortion from base pair into primer to alter a half-site to a full-site for a 1:1 ratios of heterozygote to homozygote in the whole BC specific restriction site, following the method described by population. However, percentages of heterozygote at these Niu and Hu (2004). 
markers significantly differ between the resistant and Suscep The primer pairs were used to amplify the two parents to tible groups with the P values.<0.0001 (Table 1). The result develop high-density markers. For STS markers, polymor 45 strongly indicated the presence of a major QTL (named as phic PCR bands should appear after electrophoresis on aga qHSR1) in this region. Markers on both bin 10.03/4 and bin rose or polyacrylamide gel. For those CAPS and SNP mark 1.02/3 had the P values.<0.01 (Table 1), implying the presence ers, polymorphic bands could be observed on agarose or of putative QTLs with less effects in these two regions. Mark polyacrylamide gel after digestion with certain restriction ers on bin 6.07 also showed skew with the P values<0.05 endonucleases. 50 (Table 1), Suggesting the presence of a possible minor QTL. In addition, only one marker on bin 4.01 orbin 5.03 was found Example 4 to show frequency skew between the resistant and Susceptible groups (Table 1), it was, therefore, difficult to judge whether Fine Mapping or not a QTL was actually present in these two bins. 55 Recombinant individuals from the BC, population were screened out with the SSR markers in the major QTL region. TABLE 1 Due to partial penetrance for head Smut resistance, it would Scanning putative QTL across the whole genome via be at high risk to judge whether or not a BC recombinant a tetrad grids X2 test at the 113 SSR markers carries the resistance gene based on performance of a single 60 Percentage of individual. Hence, we adopted a more robust method to judge heterozygote (% P putative the presence/absence of the resistance gene for a single BC recombinant based on both genotypes and phenotypes of its bins Markers In Rgroup In S group X2 values QTL progeny. 
If there is no resistance gene in the donor region for 1.02 bnlg1614 48.65 71.43 4.93 0.0265 Yes a certain BC recombinant, its progeny with donor regions 65 1.02 bnlg1083 SO.OO 72.73 S.OO O.O2S3 would show no difference with those without donor regions in 1.03 umc1403 44.74 76.36 9.69 O.OO19 resistance to head Smut. On the contrary, if the donor region US 8,912,387 B2 41 42 TABLE 1-continued Of the 158 susceptible individuals, however, only 60 (38%) were heterozygotes/recombinants and as many as 98 (62%) Scanning putative QTL across the whole genome via a tetrad grids X2 test at the 113 SSR markers were homozygotes. These results showed that the donor region in bin 2.09 could significantly enhance maize resis Percentage of heterozygote (% P putative tance to head Smut, strongly supporting the presence of the major QTL in bin2.09. It should be noted that head Smut was bins Markers In Rgroup In S group X2 values QTL very serious in 2004 due to drought during the seedling stage. 2.08 bnlg1141 65.63 36.36 6.95 O.OO84 Yes The susceptible Huangzhao4 had 86% susceptible individu 2.08.09 umc1230 68.57 40.38 6.66 O.0099 10 2.09 bnlg1520 72.22 36.36 1119 O.OOO8 als, compared with ~75% in normal year. 2.09 unc1525 81.08 33.93 19.87 3') SEQ ID NO :

CAPS25 O82 IDP25082 CAPS TaqI L: AAGTCCTTCACGGTCTACCA 1) R: CGGTTAGGACGATGTCAGAA 2 SNP14 O313 AZM4 140313 SNP Hha L. : CAGAGGCATTGAACAGGAAG 3 from TIGR R. CTGCTATTCCACGAAGTGCT 4) snpL: CTCTTCCACCGAGAATAGCG 5) snpR: CTGCTATTCCACGAAGTGCT 6)

SNP661 IDP661 SNP TaqI L. CTTCTGTTCTGTGCCAGGTA 7) R: CAAGAACGTAGCAACTCAGC 8 snpL: ATTGTCCCTGAGATGATTCG (9) snpR: CAAGAACGTAGCAACTCAGC 10

STS1944 IDP1944 STS : CATTGGCAACAGGACAAGTG 11 R: GACATCAGCCTCAACATTGG 12

STS171 IDP171 STS : CCAGAGACTTGCGTGAAGAT 13 R: AACAGACTGGTTGTACGTGC 14

SSR1481.52 BAC clone AC148152 SSR : GTAGGAAGACTGCCGGAGAC 15 R: GACGCTAGAATGACTGAACC 16

STSrga3195 ZMTUCO3- 08.11.3195 STS L. : CTAGAGGTTCAGGCATATGGCG (17) (RGA) R: AGCTCCACAGGAATTCGTTGAG (18) STSrga84 0810 BG840810 (RGA) STS L: GCGTCAGGCAGTTCAACTTC 19 R: TGTTCTTGCACTCGCACTTG 20 STSsyn1 LOC OsO7g07050 STS L: GGCACATGGACGTACAAGAT 21 from rice R: GCACAGAGGAAGCTAGGAGA (22

L: left primer; R: right primer. For SNP markers, a pair of 'L' and 'R' primers was first used to amplify genomic DNA, and then a pair of snpL (mismatch primer) and snpR primers was used to amplify diluted PCR products from the first step to alter a half-site to a full-site for a specific restriction site. Polymorphic bands could be observed after digestion of second-round PCR products with a certain enzyme and electrophoresis on polyacrylamide gel.
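The primer design parameters given in Example 3 (20 nucleotides in length, GC content of 40% to 60%, no consecutive tracts of a single nucleotide) can be checked mechanically against a listed primer. The following is a minimal sketch only: the function names and the `max_run` cutoff are hypothetical choices made for illustration, they are not taken from the PRIMER5.0 software, and secondary structure is not evaluated.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def longest_run(primer: str) -> int:
    """Return the length of the longest tract of one repeated base."""
    seq = primer.upper()
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def meets_criteria(primer: str, length: int = 20, max_run: int = 3) -> bool:
    """Check length, the 40-60% GC window, and single-base tracts.

    max_run is an assumed cutoff; the text only says "no consecutive
    tracts of a single nucleotide" without giving a number.
    """
    return (
        len(primer) == length
        and 0.40 <= gc_content(primer) <= 0.60
        and longest_run(primer) <= max_run
    )


# Left primer of CAPS25082 (SEQ ID NO: 1 in the primer table).
caps25082_left = "AAGTCCTTCACGGTCTACCA"
print(len(caps25082_left), gc_content(caps25082_left), longest_run(caps25082_left))
print(meets_criteria(caps25082_left))
```

Note that the two STSrga3195 primers in the table are 22 nucleotides long, so a strict 20-nucleotide check such as this one would reject them; the `length` argument is left adjustable for that reason.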

Of the nine newly-developed markers, SNP140313 and STSrga3195 were mapped on chr. 1, and STSsyn1 was mapped on chr. 5. The remaining six markers were authentically mapped on bin 2.09 with five markers (SSR148152, CAPS25082, STS171, SNP661, and STS1944) in and one marker (STSrga840810) out of the resistance qHSR1 region. The newly-developed markers would greatly facilitate MAS and fine mapping of the resistance gene (FIG. 2).

value) was observed in percentages of heterozygote between the resistant and susceptible groups for BCs, indicating the absence of qHSR1 in the donor region. Taken together, 11 BC2 recombinants (BC2-64, BC2-50, BC2-65, BC2-27, BC2-19, BC2-46, BC2-66, BC2-60, BC2-43, BC2-37, and BC2-69) were inferred to carry qHSR1 and regarded as the resistant BC2 recombinants; whereas five BC2 recombinants (BC2-67, BC2-68, BC2-49, BC2-25, and BC2-45) were inferred to harbor no qHSR1 and considered to be the susceptible BC2 recombinants (Table 4).

TABLE 4
Parental BC2 recombinants, their genotypes at the qHSR1 region, χ² test in progenies, and deduced BC2 phenotypes

Genotypes at SSR markers for the parental BC2 recombinants
Parental BC2    phi427434/
recombinants    SSR148152    bnlg1893    STS171    SNP661    STS1944    umc2184

BC2-50          1/2          1/2         1/2       1/2       1/2
BC2-65          1/1          1/2         1/2       1/2       1/2
BC2-27          1/1          1/2         1/2       1/2       1/2
BC2-64          1/1          1/2         1/2       1/2       1/2
BC2-67          1/1          1/1         1/1       1/2       1/2
BC2-68          1/1          1/1         1/1       1/2       1/2
BC2-49          1/1          1/1         1/1       1/2       1/2
BC2-25          1/1          1/1         1/1       1/1       1/2
BC2-45          1/1          1/1         1/1       1/1       1/2
BC2-19          1/2          1/2         1/2       1/2       1/2
BC2-46          1/2          1/2         1/2       1/2       1/2
BC2-66          1/1          1/2         1/2       1/2       1/2
BC2-60          1/2          1/2         1/2       1/2       1/2
BC2-43          1/2          1/2         1/2       1/2       1/1
BC2-37          1/2          1/2         1/2       1/1       1/1
BC2-69          1/2          1/2         1/2       1/1       1/1

Parental BC2    χ² test in progenies     Deduced BC2
recombinants    Markers      P values    phenotypes

BC2-50          STS171       0.003       Resistant
                STS1944      0.0002
BC2-65          STS171       0.042       Resistant
                STS1944      0.051
BC2-27          STS171       0.006       Resistant
BC2-64          STS1944      0.022       Resistant
BC2-67          STS1944      0.273       Susceptible
BC2-68          STS1944      0.384       Susceptible
BC2-49          STS1944      0.805       Susceptible
BC2-25          STS1944      0.478       Susceptible
BC2-45          STS1944      0.730       Susceptible
BC2-19          STS171       0.033       Resistant
BC2-46          STS171
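The χ² comparisons reported for the marker scans and for the progeny tests contrast the counts of heterozygotes (1/2) and homozygotes (1/1) between a resistant and a susceptible group. A minimal sketch of such a 2×2 test is given below, assuming a Pearson χ² with one degree of freedom and no continuity correction; the text does not say which variant the SAS analysis used, so this is an assumption. The susceptible-group counts (60 heterozygotes, 98 homozygotes) are those quoted in the text for bin 2.09, whereas the resistant-group counts are hypothetical values chosen only to make the example run.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 df, no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: resistant group (hypothetical counts), susceptible group (from the text).
# Columns: heterozygotes (1/2), homozygotes (1/1).
stat, p = chi_square_2x2(100, 56, 60, 98)
print(f"chi2 = {stat:.2f}, P = {p:.2g}")
```

A marker whose P value falls below the chosen threshold would be read, as in the text, as lying in a donor region that carries the resistance locus.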

Based on the deduced phenotypes, the major resistance QTL region could be narrowed down by comparing the donor regions amongst all BC2 recombinants (Table 4). BC2-50 had a heterogenous genotype in the qHSR1 region and showed high resistance to head smut with the P values <0.01. On the left side, three BC2 recombinants (BC2-64, BC2-65, and BC2-27) with their crossover points upstream of bnlg1893 showed resistance to head smut; while the other five BC2 recombinants with their crossover points downstream of STS171 (BC2-67, BC2-68, and BC2-49) or SNP661 (BC2-25 and BC2-45) displayed susceptibility to head smut. On the right side, all seven BC2 recombinants showed resistance to head smut and they had crossover points downstream of STS1944 (BC2-19, BC2-46, BC2-66, and BC2-60) or SNP661 (BC2-43) or STS171 (BC2-37 and BC2-69). Interestingly, one resistant BC2 recombinant, BC2-66, had the shortest donor region between SSR148152 and umc2184 and this donor region was assumed to cover qHSR1. It could be concluded from the above analysis that the major resistance QTL (qHSR1) was located in an interval of SSR148152/SNP661, which was estimated to be ~2 Mb based on the physical map available at the University of Arizona.

Example 10

Estimation of the Genetic Effect of the Major QTL

Theoretically, 93.75% of the genetic background in the BC progeny was reverted to the recurrent parent Huangzhao4. Due to the low background noise in BC progeny, the genetic effect of qHSR1 could be definitely estimated by comparison of disease incidences between two groups with/without qHSR1 within the same BC family. A total of 1,524 individuals from 24 BC families were checked for the presence/absence of qHSR1 with markers STS171 and STS1944. The disease incidences were estimated for two groups with/without qHSR1 in each BC family. As a consequence, the group without qHSR1 was more susceptible than the group with qHSR1 in each BC family, with an average difference of 28.6% ± 10.8%. In other words, a single resistance qHSR1 could reduce disease incidence by 28.6% ± 10.8% (FIG. 2).

Apart from BC progeny, BCF progeny was also employed to estimate the genetic effect of qHSR1 in the present study. The BC population was first genotyped at two markers bnlg1893 and umc2184, resulting in 73 BC plants with qHSR1 and another 31 BC plants without qHSR1. All these BC plants were self-pollinated to produce corresponding BCF families. As expected, the BCF progeny derived from BC plants with qHSR1 were more resistant than those derived from BC plants without qHSR1. Of the 529 BCF individuals derived from 31 BC plants without qHSR1, 204 (38.7%) were found to be susceptible, whereas 262 (19.3%) of 1,358 BCF individuals derived from 73 BC plants with qHSR1 were susceptible. In the BCF progeny derived from BC plants with qHSR1, segregation occurred at the qHSR1 locus, resulting in one-fourth BCF individuals without qHSR1. These BCF individuals without qHSR1 are expected to have the same disease incidence as that estimated from the 31 BCF families without qHSR1 (38.7%). For the other three-fourths of BCF individuals with qHSR1 (one-fourth homozygotes and a half heterozygotes), we needed to estimate the disease incidence. Based on the above explanations, we could draw an equation as (3/4) × X + (1/4) × 38.7% = 19.3%; here, X represents the infection percentage for those BCF individuals with qHSR1. The X is calculated to be 12.8%. In summary, the qHSR1 locus could reduce disease incidence by 25.9% in the BCF progeny, from 38.7% (individuals without qHSR1) to 12.8% (individuals with qHSR1).

Example 11

Characterization of Genomic Sequence of qHSR1

In order to isolate the gene responsible for the phenotype conferred by the qHSR1 locus, BACs containing the region between the markers MZA6393 (from bacm.pk071.j12.f SEQ ID NO:23) and marker ST148-1 the Mo17 version of ZMMBBc0478L09f (SEQ ID NO:24) were isolated from a BAC library prepared from the resistant Mo17 line. This library was prepared using standard techniques for the preparation of genomic DNA (Zhang et al. (1995) Plant Journal 7:175-184) followed by partial digestion with HindIII and ligation of size selected fragments into a modified form of the commercially available vector pCC1BAC™ (Epicentre, Madison, USA). After transformation into EPI300™ E. coli cells following the vendor's instructions (Epicentre, Madison, USA), 125,184 recombinant clones were arrayed into 326 384-well microtiter dishes. These clones were then gridded onto nylon filters (Hybond N-, Amersham Biosciences, Piscataway, USA). Three overlapping clones (bacm.pk071.j12, bacm.pk007.18, and bacm2.pk166.h1) were identified and characterized.

The library was probed with overlapping oligonucleotide probes (overgo probes; Ross et al. (1999) Screening large insert libraries by hybridization, p. 5.6.1-5.6.52. In A. Boyle, ed. Current Protocols in Human Genetics. Wiley, New York) designed on the basis of sequences found in the BAC sequences. BLAST search analyses were done to screen out repeated sequences and identify unique sequences for probe design. The position and interspacing of the probes along the contig was verified by PCR. For each probe two 24-mer oligos self-complementary over 8 bp were designed. Their annealing resulted in a 40 bp overgo, whose two 16 bp overhangs were filled in. The exact sequences are different as they were to be used as overgo probes rather than just PCR primers. Probes for hybridization were prepared as described (Ross et al. (1999) supra), and the filters prepared by the gridding of the BAC library were hybridized and washed as described by (Ross et al. (1999) supra). Phosphorimager analysis was used for detection of hybridization signals. Thereafter, the membranes were stripped of probes by placing them in a just-boiled solution of 0.1×SSC and 0.1% SDS and allowing them to cool to room temperature in the solution overnight.

BACs that gave a positive signal were isolated from the plates. Restriction mapping, PCR experiments with primers corresponding to the markers previously used and sequences obtained from the ends of each BAC were used to determine the order of the BACs covering the region of interest. Three BACs that spanned the entire region (bacm.pk071.j12, bacm.pk007.18, and bacm2.pk166.h1) were selected for sequencing. These BACs were sequenced using standard shotgun sequencing techniques and the sequences assembled using the Phred/Phrap/Consed software package (Ewing et al. (1998) Genome Research, 8:175-185). The assembled sequence of the BAC clones is shown in SEQ ID NO:25.

After assembly, the sequences thought to be in the region closest to the locus on the basis of the mapping data were annotated, meaning that possible gene-encoding regions and regions representing repetitive elements were deduced. Gene encoding (genic) regions were sought using the fGenesH software package (Softberry, Mount Kisco, N.Y., USA). fGenesH predicted a portion of a protein, that when BLASTed (BLASTX/nr), displayed partial homology at the amino acid level to a portion of a rice protein that was annotated as encoding for a protein that confers disease resistance in rice. The portion of the maize sequence that displayed homology to this protein fell at the end of a contiguous stretch of BAC consensus sequence and appeared to be truncated. In order to obtain the full representation of the gene in the maize BAC, the rice amino acid sequence was used in a tBLASTn analysis against all other consensus sequences from the same maize BAC clone. This resulted in the identification of a consensus sequence representing the 3' end of the maize gene. However, the center portion of the gene was not represented in the sequences so obtained. PCR primers were designed based on the 5' and 3' regions of the putative gene and used in a PCR experiment with DNA from the original maize BAC as a template. The sequence of the resulting PCR product contained sequence bridging the 5' and 3' fragments previously isolated.

Several open reading frames were detected in SEQ ID NO:25 including a xylanase inhibitor gene (SEQ ID NO:26/27), a cell wall associated protein kinase (SEQ ID NO:31/32), two HAT family protein dimerization genes (SEQ ID NO:34/35 and SEQ ID NO:37/38), and two uncharacterized proteins (SEQ ID NO:40/41 and SEQ ID NO:43/44). The xylanase inhibitor gene shows a polymorphic difference when compared to the ortholog found in B73. The Mo17 gene is 97.8% identical, by Clustal V alignment, to the B73 gene, and contains two deletions of 2 and 10 amino acids (see FIG. 3). The genomic DNA region including 2.4 kb upstream of the ORFs from SEQ ID NOs:43/44 is shown in SEQ ID NO:45. The nucleic acid sequence encoding an additional EST fragment from the qHSR region is shown in SEQ ID NO:46.

Any one, any combination, or all, of these genes may confer, or contribute to, head smut resistance at the qHSR1 locus. It is expected that polymorphisms associated with Mo17, which is resistant to head smut, will be diagnostic of sequences that define qHSR1.

Example 12

Backcrossing of the qHSR1 Locus into Susceptible Lines

A qHSR1 locus introgression of inbred lines is made to confirm that the qHSR1 locus could be successfully backcrossed into inbreds, and that hybrids produced with the inbred lines with the qHSR1 locus would have enhanced or conferred head smut resistance.

both for hybrid seed production and as a donor source for further introgression of the qHSR1 gene into other inbred lines.
MO17 is an inbred line with strong resistance to head Smut, Thus, the data shows that inbred progeny converted by but its weak agronomic characteristics make it a poor donor 5 using MO17 as a donor source retain the truncated MO17 parent in the absence of the use of the marker assisted breed chromosomal interval. The inbreds comprising the truncated ing methods described herein. To demonstrate the phenotypic MO17 chromosomal interval are very useful as donor sources value of the qHSR1 locus, the locus is introgressed into 10 themselves, and there is no need to revert to MO17 as a donor elite inbred lines, with an additional 25 inbreds added in the Source. By using marker assisted breeding as described second through to the BC3 stage as follows. The F1 popula 10 herein, the truncated MO17 chromosomal interval can be tion derived from the cross between MO17 and the elite further reduced in size as necessary without concern for los inbred lines are backcrossed once more to the recurrent par ing the linkage between the markers and the qHSR1 gene. ents (the elite inbreds), resulting in a BC1 population. Seed Example 13 lings are planted out, genotyped with markers across the 15 genome, selected (with the qHSR1 locus and minimal MO17 background) and backcrossed again to recurrent inbred lines Use of qHSR1 as a Transgene to Create Resistant to develop a BC2 population. BC2 families are genotyping Corn Plants and selected again for the presence of the MO17 qHSR1 region. Positive plants are backcrossed to recurrent parental The qHSR1 gene can be expressed as a transgene as well, inbreds once more to develop BC3 populations. Seeds from allowing modulation of its expression in different circum these BC3 populations are planted and plants are genotyped. stances. The following examples show how the qHSR1 gene BC3 plants with or without the region of interest are selfed to could be expressed in different ways to combat different make BC3S1 families. 
These families were used for pheno diseases or protect different portions of the plant, or simply to typic comparison (BC3S1 with or without the region of inter 25 move the qHSR1 gene into different corn lines as a transgene, est). as an alternative to the method described in Example 12. In order to observe the performance of the qHSR1 gene in a heterozygous situation Such as would be found in a com Example 13a mercial hybrid, appropriate testcrosses are made. Specifi cally, individual BC3S1 plants homozygous for the qHSR1 30 In this example, the qHSR1 candidate gene (Xylanase gene as well as plants homozygous for the Susceptible allele inhibitor and other annotated genes in the QTL interval, as are used to make testcrosses with selected inbreds. defined in Example 11) is expressed using its own promoter. In the case of both the BC3S1 lines and the hybrids, the In order to transform the complete qHSR1 genes, including expected phenotypic differences indicate significant the promoter and protein encoding regions, DNA fragments improvement for head Smut resistance in lines and hybrids 35 containing the complete coding region and approximately 2 containing the region carrying qHSR1. The data clearly dem kb upstream region are amplified by PCR using the BAC onstrate that using crossing techniques to move the gene of clone as template DNA. To enable cloning using the GATE the embodiments into other lines genetically competent to use WAYR Technology (Invitrogen) Carlsbad, USA), attB sites the gene result in enhanced resistance to head Smut. 40 are incorporated into the PCR primers, and the amplified As a result offine mapping the location of the qHSR1 gene, product is cloned into PDONRTM221 vector by GATEWAYR one may utilize any two flanking markers that are genetically BP recombination reaction. 
The resulting fragment, flanked linked with the qHSR1 gene to select for a small chromo by attL sites, is moved by the GATEWAYR LR recombina somal region with crossovers both north and south of the tion reaction into a binary vector. The construct DNA is then qHSR1 gene. This has the benefit of reducing linkage drag, 45 used for corn transformation as described in Example 14. which can be a confounding factor when trying to introgress a specific gene from non-adapted germplasm, such as MO17. Example 13b into elite germplasm. It is advantageous to have closely linked flanking markers for selection of a gene, and highly advanta In order to express the qHSR1 genes (Xylanase inhibitor geous to have markers within the gene itself. This is an 50 and other annotated genes in the QTL interval, as defined in improvement over the use of a single marker or distant flank Example 11) throughout the plant at a low level, the coding ing markers, since with a single marker or with distant flank region of the genes and their terminators are placed behind the ing markers the linkage associated with qHSR1 may be bro promoters of either a rice actingene (U.S. Pat. Nos. 5,641,876 ken, and by selecting for Such markers one is more likely to and 5,684,239) or the F3.7 gene (U.S. Pat. No. 5,850,018). To inadvertently select for plants without the qHSR1 gene. Since 55 enable cloning using the GATEWAYR Technology (Invitro marker assisted selection is often used instead of phenotypic gen, Carlsbad, USA), attB sites are incorporated into PCR selection once the marker-trait association has been con primers that are used to amplify the qHSR1 genes starting 35 firmed, the unfortunate result of such a mistake would be to bp upstream from its initiation codon. A NotI site is added to select plants that are not resistant to head Smut and to discard the attB1 primer. The amplified qHSR1 product is cloned into plants that are resistant to head Smut. 
In this regard, markers 60 PDONRTM221 vector by GATEWAYR BP recombination within the qHSR1 gene are particularly useful, since they reaction (Invitrogen, Carlsbad, USA). After cloning, the will, by definition, remain linked with resistance to head Smut resulting qHSR1 gene is flanked by attL sites and has a unique as enhanced or conferred by the gene. Further, markers within NotI site at 35 bp upstream the initiation codon. Thereafter, the qHSR1 locus are just as useful for a similar reason. Due to promoter fragments are PCR amplified using primers that their very close proximity to the qHSR1 gene they are highly 65 contain NotI sites. Each promoter is fused to the NotI site of likely to remain linked with the qHSR1 gene. Once intro qHSR1. In the final step, the chimeric gene construct is gressed with the qHSR1 gene, such elite inbreds may be used moved by GATEWAYR LR recombination reaction (Invitro US 8,912,387 B2 51 52 gen, Carlsbad, USA) into the binary vector PHP20622. This Maize is transformed with selected polynucleotide con is used for corn transformation as described in Example 14. structs described in Example 13a and 13c using the method of Zhao (U.S. Pat. No. 5,981,840, and PCT patent publication Example 13c WO98/32326). Briefly, immature embryos were isolated from maize and the embryos contacted with a suspension of In order to express the qHSR1 genes (Xylanase inhibitor Agrobacterium, where the bacteria were capable of transfer and other annotated genes in the QTL interval, as defined in ring the polynucleotide construct to at least one cell of at least Example 11) throughout the plant at a high level, the coding one of the immature embryos (step 1: the infection step). In region of the genes and their terminators are placed behind the this step the immature embryos were immersed in an Agro promoter, 5' untranslated region and an intron of a maize 10 bacterium Suspension for the initiation of inoculation. The ubiquitin gene (Christensen et al. 
(1989) Plant Mol. Biol. embryos were co-cultured for a time with the Agrobacterium 12:619-632: Christensen et al. (1992) Plant Mol. Biol. (step 2: the co-cultivation step). The immature embryos were 18:675-689). To enable cloning using the GATEWAYR Tech cultured on solid medium following the infection step. Fol nology (Invitrogen, Carlsbad, USA), attB sites are incorpo lowing this co-cultivation period an optional “resting step is rated into PCR primers that are used to amplify the qHSR1 15 performed. In this resting step, the embryos were incubated in gene starting at 142 bp upstream of the initiation codon. The the presence of at least one antibiotic known to inhibit the amplified product is cloned into PDONRTM221 (Invitrogen, growth of Agrobacterium without the addition of a selective Carlsbad, USA) using a GATEWAYR BP recombination agent for plant transformants (step 3: resting step). The imma reaction (Invitrogen, Carlsbad, USA). After cloning, the ture embryos were cultured on solid medium with antibiotic, resulting qHSR1 gene is flanked by attl sites. In the final step, but without a selecting agent, for elimination of Agrobacte the qHSR1 clone is moved by GATEWAYR LR recombina rium and for a resting phase for the infected cells. Next, tion reaction (Invitrogen, Carlsbad, USA) into a vector which inoculated embryos were cultured on medium containing a contained the maize ubiquitin promoter, 5' untranslated selective agent, and growing transformed callus is recovered region and first intron of the ubiquitin gene as described by 25 (step 4: the selection step). The callus is then regenerated into Christensen et al. (supra) followed by GATEWAYRATTR1 plants (step 5: the regeneration step), and calli grown on and R2 sites for insertion of the qHSR1 gene, behind the selective medium were cultured on Solid medium to regener ubiquitin expression cassette. The vector also contained a ate the plants. 
marker gene Suitable for corn transformation, so the resulting plasmid, carrying the chimeric gene (maize ubiquitin pro 30 Example 15 moter-ubiquitin 5' untranslated region-ubiquitin intron 1-qHSR1), is suitable for corn transformation as described in Transgenic Plant Evaluation Example 14. Transgenic plants are made as described in Example 14 Example 13d 35 using the constructs described in Examples 13a to 13d, In order to express the qHSR1 genes (Xylanase inhibitor respectively. They are evaluated with protocols described in and other annotated genes in the QTL interval, as defined in Example 9 for improvement in head Smut resistance. Example 11) at a root-preferred, low level of expression, the Example 16 coding region of the genes and their terminators are placed 40 behind a root preferred promoter such as but not limited to, maize NAS2 promoter, the maize Cyclo promoter (US 2006/ Analysis of qHSR1 Gene Distribution Across 0156439, published Jul. 13, 2006), the maize ROOTMET2 Germplasm and Identification of qHSR1 Sequence promoter (WO05063998, published Jul. 14, 2005), the Variants CR1BIO promoter (WO06055487, published May 26, 2006), 45 the CRWAQ81 (WO05035770, published Apr. 21, 2005) and Following the identification, sequencing and fine mapping the maize ZRP2.47 promoter (NCBI accession number: of qHSR1, other lines are screened for the qHSR1 gene. To U38790; GI No. 1063664). The fragment described in determine the presence of the qHSR1 gene in other maize Example 13b containing the qHSR1 coding region flanked by germplasm, gene specific primers combinations are used to attL sites and containing a unique Not site 35bp upstream of 50 amplify genomic DNA from a diverse panel of maize inbred the qHSR1 initiation codon is used to enable cloning using the lines by polymerase chain reaction. Inbred lines with qHSR1 GATEWAYR Technology (Invitrogen, Carlsbad, USA). Pro (MO17 allele) are identified. 
Thus, in addition to using MO17 moter fragment is PCR amplified using primers that contain as the donor Source, other sources containing the qHSR1 gene NotI sites. Each promoter is fused to the NotI site of qHSR1. can also be used as a donor Source. In the final step, the chimeric gene construct is moved by 55 Variants of the qHSR1 gene are also identified and ana GATEWAYR LR recombination reaction (Invitrogen, Carls lyzed for single nucleotide polymorphisms (SNPs). Not all of bad, USA) into the binary vector PHP20622. This is used for the allelic variants of the qHSR1 gene indicated a resistant corn transformation as described in Example 14. phenotype. Inbred lines with distinct haplotypes or alleles are evaluated for their head Smut resistance, and putative resistant Example 14 60 allelic variants are identified. Their efficacy in head Smut resistance is validated in segregating populations (e.g. F2 Agrobacterium-Mediated Transformation of Maize population). The SNPs can be used as markers to precisely and Regeneration of Transgenic Plants identify and track the qHSR1 sequence in a plant breeding program, and to distinguish between resistant and Susceptible The recombinant DNA constructs prepared in Example 65 allelic variants. Further, these SNPs indicate that there are 6a-6d were used to prepare transgenic maize plants as fol variant sequences that show a resistant phenotype and can be lows. used in the methods and products disclosed herein. 
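The incidence estimate of Example 10 (12.8% for individuals carrying qHSR1, a 25.9% reduction) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of that mixture calculation; the function name and the `frac_with` parameter are hypothetical and do not come from the specification.

```python
def incidence_with_qhsr1(p_observed, p_without, frac_with=0.75):
    """Solve frac_with * X + (1 - frac_with) * p_without = p_observed for X.

    p_observed: incidence (%) in progeny of BC plants carrying qHSR1
    p_without:  incidence (%) in progeny of BC plants lacking qHSR1
    frac_with:  expected fraction of selfed progeny carrying qHSR1
                (three-fourths under 1:2:1 segregation)
    """
    return (p_observed - (1.0 - frac_with) * p_without) / frac_with


# Values quoted in Example 10.
x = incidence_with_qhsr1(p_observed=19.3, p_without=38.7)
reduction = 38.7 - x

print(round(x, 1), round(reduction, 1))  # 12.8 25.9
```

The calculation assumes that the one-fourth of progeny lacking qHSR1 have the same incidence as the families derived from BC plants without qHSR1, which is the assumption stated in Example 10.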
Example 17

Further Analysis of qHSR1 Gene Distribution Across Germplasm and Identification of qHSR1 Sequence Variants

The qHSR1 region has been further defined as a 172-kb interval in the resistant parental line Ji1037 and a 56-kb interval in the susceptible parental line Huangzhao4. The size discrepancy is due to a deletion (116 kb) in Huangzhao4 compared with Ji1037. The key recombinants which were used for fine-mapping have been repeatedly investigated for their resistances to head smut in Gongzhuling in Jilin Province and in the winter nursery on Hainan Island, and show consistent resistance to head smut.

Positive Mo17 BAC clones have been selected based on the characterization of the qHSR1 region. In addition, markers in the qHSR1 region were used to screen a Huangzhao4 BAC library. The minimal tiling positive BAC clones were subjected to sequencing to get a broad view of the qHSR1 region. The comparative view among the Mo17, B73, and Huangzhao4 inbred lines is shown in FIG. 4. A total of six additional putative genes have been identified: an ankyrin-repeat protein (SEQ ID NO:104-106, the coding sequence, protein translation, and genomic DNA, respectively) is found in all three inbred lines; a gene coding a wall-associated kinase protein (SEQ ID NOS:31-33) is missing in Huangzhao4; a gene coding hydrolase (SEQ ID NO:107-109) is missing in B73 and Huangzhao4; two of the three Xa21-like kinase proteins (SEQ ID NOs:110-115) are missing in Huangzhao4; and the third Xa21-like kinase protein (SEQ ID NOs:115-117) is present in at least Mo17 and Huangzhao4.

Example 18

Characterization of Candidate Resistance Genes in the qHSR1 Region

Three approaches are being taken to validate the candidate resistance genes: 1) a complementarity test: since both Mo17 and B73 show some resistance to head smut, the three shared genes (ankyrin-repeat protein, wall-associated kinase protein, and Xa21 D kinase) are likely to be candidate genes contributing to the phenotype, and all three genes are subcloned from the positive BAC clones into an expression vector, followed by transformation into susceptible inbred lines; 2) RNAi technique: RNAi vectors are constructed for all six putative genes in the 172-kb region and then are transformed into Mo17 to knock out putative genes one by one, which allows for the identification of those genes involved in resistance to head smut; 3) overexpression of candidate genes in susceptible lines: overexpression constructs with each of the six individual candidate genes linked to strong promoters are constructed and introduced into susceptible lines to determine if any of the individual candidate genes in the qHSR1 region is sufficient to confer resistance to head smut.

Example 19

Development of Markers in the qHSR1 Region Useful for Marker-Assisted Selection

The BAC sequences, especially the coding sequences, were further used to develop high-density markers. In total, eight markers have been developed in the 172-kb region (Ji1037 qHSR1, which is equivalent to Mo17) (Table 5). These markers were used to integrate the resistance qHSR1 into other susceptible inbred lines via marker-assisted selection.

TABLE 5

Markers in the 172 kb interval covering the qHSR1 region

Marker    Marker    Primer     Primer sequence               Primer  PCR product      Product  Marker
position  name                                               SEQ ID  (Ji1037/         SEQ ID   type
                                                             NO:     Huangzhao4)      NO:
0         MZA6393   MZA6393L   5'-GTATTTCTACCAGCGTGGCCT-3'   50      412 bp/325 bp    23/47    codominant
                    MZA6393R   5'-GACAAGCTGCAGATCGAAGA-3'    51
7.27 kb   1M2-9     1M2-9L     5'-TCGTGACGGACCTGTAGTGC-3'    52      618 bp/759 bp    54/55    codominant
                    1M2-9R     5'-TCGCGGTTCAGAAGAACAAC-3'    53
26.4 kb   E6765-3   E6765-3L   5'-CATGTGCCGACCGACCATTC-3'    56      426 bp           58       dominant
                    E6765-3R   5'-GGAGTGCGATGTCTACAGCT-3'    57
99 kb     2M4-1     2M4-1L     5'-CACGTTGTGACTCAAGATCG-3'    59      573 bp           61       dominant
                    2M4-1R     5'-ATCAAGGACCATCAGCACAG-3'    60
141.5 kb  2M10-5    2M10-5L    5'-CCTCCTCTCCATCTGGTCCA-3'    62      589 bp           64       dominant
                    2M10-5R    5'-CGTGTGCTTGGAAGAATCTC-3'    63
148 kb    2M11-3    2M11-3L    5'-TGGACAGACCTTAGCTTGCT-3'    65      563 bp           67       dominant
                    2M11-3R    5'-GTTCGTAAGTGCGTCAATGG-3'    66
163 kb    3M1-25    3M1-25L    5'-GCTAGATAGCTGCTTCTTCC-3'    68      328 bp/468 bp    70/71    codominant
                    3M1-25R    5'-GTACCTACGATTCGGCAGAA-3'    69
172.1 kb  STS148-1  STS148-1L  5'-CTTCCATCGGTACTCCATTC-3'    72      177 bp/132 bp    24/49    codominant
                    STS148-1R  5'-TTCTCCAGGTGTGAGAAATC-3'    73
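The codominant and dominant markers of Table 5 differ in what a PCR result can tell a breeder: a codominant marker yields a distinct product for each parental allele, while a dominant marker only shows whether the Ji1037-type product is present. The following is a minimal sketch of how band sizes might be turned into genotype calls. The product sizes are taken from Table 5; the function name, the returned labels, and the 10 bp sizing tolerance are illustrative assumptions, not part of the described method.

```python
# Expected PCR product sizes (bp) from Table 5: (Ji1037 allele, Huangzhao4 allele).
# None means Table 5 lists no Huangzhao4 product (dominant marker).
TABLE5_PRODUCTS = {
    "MZA6393": (412, 325),
    "1M2-9": (618, 759),
    "E6765-3": (426, None),
    "2M4-1": (573, None),
    "2M10-5": (589, None),
    "2M11-3": (563, None),
    "3M1-25": (328, 468),
    "STS148-1": (177, 132),
}


def call_genotype(marker, bands, tol=10):
    """Classify observed band sizes (bp) for one plant at one marker.

    tol is an assumed gel-sizing tolerance in bp (hypothetical value).
    """
    res_size, sus_size = TABLE5_PRODUCTS[marker]
    has_res = any(abs(b - res_size) <= tol for b in bands)
    if sus_size is None:
        # Dominant marker: absence of a band cannot separate the
        # susceptible allele from a failed reaction.
        return "Ji1037 allele present" if has_res else "no product"
    has_sus = any(abs(b - sus_size) <= tol for b in bands)
    if has_res and has_sus:
        return "heterozygous"
    if has_res:
        return "homozygous Ji1037 allele"
    if has_sus:
        return "homozygous Huangzhao4 allele"
    return "no call"


print(call_genotype("MZA6393", [410, 327]))
print(call_genotype("2M4-1", []))
```

In a backcross program the heterozygous class is the one carried forward, which is why the codominant markers at the two ends of the interval (MZA6393 and STS148-1) are convenient flanking markers.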

The genetic effect of the qHSR1 region in resistance to head smut was tested using eleven BC4 populations. The Mo17 inbred line was crossed to Ji853, 444, 4287, 98107, 99094, Chang7-2, V022, V4, 982, 8903, and 8902. The qHSR1 region was then backcrossed for four generations, using markers such as MZA6393, 2M10-5, STS148-1, STS661 and E148-4 to select the plants with the qHSR1 region. These BC4 populations were phenotyped in the winter nursery in Hainan Island. These BC4 populations contained plants both with and without the qHSR1 region (Tables 6A and 6B). The plants without the qHSR1 region were considered controls to tell the baseline resistance of the different genetic backgrounds. The individual plants within the BC4 populations were scored for resistance to head smut, and the percentage of resistant plants was calculated for the groups both with and without the qHSR1 region. The qHSR1 region conferred an increase of approximately 25% in resistance index. The inbred line 4287 itself has the qHSR1 region and shows resistance to head smut; this is why the integration of the qHSR1 region in the 4287 genetic background has minimal effect on resistance to head smut.

TABLE 6A

The genetic effects of the qHSR1 region in resistance to head smut

Genetic      Size of the population
background   Without qHSR1   With qHSR1   Markers following the R region
Ji853        353             28           MZA6393, 2M10-5, STS148-1
444          118             29           MZA6393, 2M10-5, STS148-1
4287         226             64           MZA6393, STS661
98107        81              27           MZA6393, 2M10-5, STS661
99094        17              46           MZA6393, 2M10-5, STS661
Chang7-2     176             86           MZA6393, 2M10-5, STS148-1
V022         148             91           MZA6393, 2M10-5, STS148-1
V4           69              134          MZA6393, 2M10-5, STS148-1
982          99              83           MZA6393, 2M10-5, STS661
8903         201             143          MZA6393, 2M10-5, E148-4
8902         67              118          MZA6393, 2M10-5, E148-4

TABLE 6B

Percentage of the resistant plants in backcross populations

Genetic
background   Without qHSR1   With qHSR1   Difference   P-value
Ji853        20.54%          52.60%       32.06%       5.27E-09
444          35.37%          59.53%       24.16%       0.0012
4287         84.67%          84.31%       -0.36%
98107        18.22%          42.22%       24.00%       0.0004
99094        0               33.93%       33.93%       0.01253
Chang7-2     12.48%          38.63%       26.15%       7.52E-08
V022         44.41%          71.82%       27.41%       1.71E-06
V4           21.29%          49.97%       28.68%       5.24E-05
982          16.83%          29.91%       13.08%       3.30E-05
8903         23.96%          40.34%       16.38%       2.79E-09
8902         18.41%          38.26%       19.85%       9.19E-06

Example 20

Additional Development of Markers in the qHSR1 Region Useful for Marker-Assisted Selection

Introgression lines for qHSR1 are being created for breeding material and the evaluation of qHSR1 efficacy in Western North America, Mexico, and China. Thirty-five Pioneer inbred lines (CN3K7 is the donor line; GRB1M, HNA9B, HN4CV, HNVS3, HNN4B, HNH9H, HNGFT, GRORA, HFTWK, and GRVNS are non-stiff-stalk lines for China; GROP2HEF3D, HFOSV, HFHHN, HNO5F, HNO88, HNOE1, HN8TO, HNNWJ, and HNW4C are non-stiff-stalk lines for Western North America; EDGJ4, EDW1N, EDVNA, EDVS9, and EDV97 are stiff-stalk lines for China; and 2HC5H, 2H071, 4F1FM, 4F1VJ, 4FJNE, 7T9HV, 1ARMJ, 1AYOM, 1AGFC, and 1A1V3 are stiff-stalk lines for Mexico) were crossed with Mo17 to create the F1. SNP markers, such as MZA15839-4, MZA18530-16, MZA5473-801, MZA16870-15, MZA4087-19, MZA158-30, MZA15493-15, MZA9967-11, MZA1556-23, MZA1556-801, MZA17365-10, MZA17365-801, MZA14192-8, MZA15554-13 and MZA4454-14, are being used to select for the qHSR1 region during subsequent backcrosses. Between 39 and 65 SNP markers on unlinked chromosomal regions were used in the BC1 generation to select against the background.

The lines are being backcrossed to a BC1, BC2, BC3, or BC4 generation, and then selfed. The plants homozygous for the qHSR1 region are identified in the selfed generation, and then crossed to an appropriate Test Cross Inbred, such as EF6WC or EF890 for NSS introgressions. The Test Cross lines are then evaluated for efficacy at the location appropriate for the inbred line, such as Western North America, Mexico, or China. At each location, a sufficient number of reps and population size are used to evaluate the qHSR1 efficacy. The equivalent hybrid without the head smut QTL was also grown for comparison. If high disease pressure is not expected, the experiment will be artificially inoculated with the head smut pathogen to ensure high disease pressure.

Markers that are useful for marker assisted breeding to develop introgression lines are shown in Table 7. Eight of these markers (MZA6393, 1M2-9, E6765-3, 2M4-1, 2M10-5, 2M11-3, 3M1-25, and STS148-1) are located within the qHSR region. The markers in Table 7 that are outside of the qHSR region have been developed to be specific for Mo17, and therefore are linked to the qHSR region. These markers, although exemplary, are not intended to be a complete listing of all useful markers. Many markers that are specific for the qHSR region can be developed. In addition, any marker that is linked or associated with one of these specific markers could be useful in marker assisted selection.

TABLE 7

Markers in the qHSR Region

                                         Genetic    Physical
Marker         Marker Type  Chromosome   Position   Position (bp)*  Mo17 SNP
MZA15839-4     SNP          2            220.22                     T
MZA18530-16    SNP          2            220.34                     G
MZA5473-801    SNP          2            225.11                     G
MZA16870-15    SNP          2            226.92                     G
MZA4087-19     SNP          2            228.58                     C
MZA158-30      SNP          2            228.58                     T
MZA15493-15    SNP          2            230.55                     G
MZA9967-11     SNP          2            231.1                      T
MZA6393        codominant   2            X          0               X
1M2-9          codominant   2            X          7.27            X
E6765-3        dominant     2            X          26.4            X
2M4-1          dominant     2            X          99              X
2M10-5         dominant     2            X          141.5           X
2M11-3         dominant     2            X          148             X
3M1-25         codominant   2            X          163             X
STS148-1       codominant   2            X          172.1           X
MZA1556-23     SNP          2            235.32                     A
MZA1556-801    SNP          2            235.32                     C
MZA17365-10    SNP          2            235.68                     G
MZA17365-801   SNP          2            235.68                     D
MZA14192-8     SNP          2            235.8                      G
MZA15554-13    SNP          2            244.27                     G
MZA4454-14     SNP          2            245.91                     C

Size Forward Primer Reverse Primer (Jil037/Huangzhao.4) arker SEQ ID NO : SEQ ID NO: SEQ ID NO:

ZA15839 - 4 gatgcaatggaagaatt.cgtg talacticagotttggataccala 74 75 ZA1853 O-16 gttt cot catggcac tact ct agtaaagccacacat cittatt c 76 77 ZA5473-801 cc catgatggctacattctg cagaggcttgcgittaacaac 78) 79) ZA16870-15 attt cagogtttgcggtgtc. ataatgaagttgacctaagt cc
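The statement in Example 19 that the qHSR1 region conferred an increase of approximately 25% in resistance index can be checked against the percentages reported in Table 6B. The sketch below recomputes each difference and the mean gain; the variable and function names are hypothetical, and excluding the 4287 background from the mean is an assumption based on the remark that 4287 already carries the qHSR1 region.

```python
# (without qHSR1 %, with qHSR1 %, reported difference %) from Table 6B.
TABLE_6B = {
    "Ji853": (20.54, 52.60, 32.06),
    "444": (35.37, 59.53, 24.16),
    "4287": (84.67, 84.31, -0.36),
    "98107": (18.22, 42.22, 24.00),
    "99094": (0.0, 33.93, 33.93),
    "Chang7-2": (12.48, 38.63, 26.15),
    "V022": (44.41, 71.82, 27.41),
    "V4": (21.29, 49.97, 28.68),
    "982": (16.83, 29.91, 13.08),
    "8903": (23.96, 40.34, 16.38),
    "8902": (18.41, 38.26, 19.85),
}


def mean_gain(table, exclude=()):
    """Mean of (with - without) over the backgrounds not in exclude."""
    gains = [w - wo for name, (wo, w, _d) in table.items() if name not in exclude]
    return sum(gains) / len(gains)


# Each reported difference should equal with minus without (to rounding).
consistent = all(abs((w - wo) - d) < 0.005 for wo, w, d in TABLE_6B.values())

mean_all = mean_gain(TABLE_6B)
mean_without_4287 = mean_gain(TABLE_6B, exclude=("4287",))
print(consistent, round(mean_all, 2), round(mean_without_4287, 2))
```

Under these assumptions the mean gain over the ten backgrounds other than 4287 comes out near 24.6 percentage points, in line with the "approximately 25%" figure.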

US 8,912,387 B2 61 62 - Continued

SEQ ID NO 2 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer CAPS25082-R SEQUENCE: 2 cggittaggac gatgtcagaa

SEQ ID NO 3 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SNP140313-L SEQUENCE: 3 cagaggcatt galacaggaag

SEQ ID NO 4 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SNP140313-R SEQUENCE: 4 ctgctatt co acgaagtgct

SEQ ID NO 5 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SNP140313-snpL

SEQUENCE: 5 citct tccacc gagaatagcg

SEQ ID NO 6 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SNP140313-snpR

SEQUENCE: 6 ctgctatt co acgaagtgct

SEO ID NO 7 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SNP661-L

SEQUENCE: 7 cittctgttct gtgc.caggta

SEQ ID NO 8 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SNP661-R US 8,912,387 B2 63 64 - Continued

<4 OOs, SEQUENCE: 8 Caagaacgta gcaact cagc

<210s, SEQ ID NO 9 &211s LENGTH: 2O &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: primer SNP661-snpL

<4 OOs, SEQUENCE: 9 attgtc.cctg agatgatt.cg

<210s, SEQ ID NO 10 &211s LENGTH: 2O &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: primer SNP661-snpR

<4 OOs, SEQUENCE: 10 Caagaacgta gcaact cagc

<210s, SEQ ID NO 11 &211s LENGTH: 2O &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: primer STS1944-L < 4 OO SEQUENCE: 11 cattggcaac aggacaagtg

<210s, SEQ ID NO 12 &211s LENGTH: 2O &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: primer STS1944-R <4 OOs, SEQUENCE: 12 gacatcagcc tica acattgg

<210s, SEQ ID NO 13 &211s LENGTH: 2O &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: primer STS171-L <4 OOs, SEQUENCE: 13 ccagagacitt gcgtgaagat

<210s, SEQ ID NO 14 &211s LENGTH: 2O &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: primer STS171-R

<4 OOs, SEQUENCE: 14 alacagactgg ttgtacgtgc

<210s, SEQ ID NO 15 &211s LENGTH: 2O US 8,912,387 B2 65 66 - Continued

TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SSR148152-L.

SEQUENCE: 15 gtaggalagac to cqgagac

SEQ ID NO 16 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer SSR148152-R

SEQUENCE: 16 gacgctagaa tactgaacc

SEQ ID NO 17 LENGTH: 22 TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer STSrga3195-L

SEQUENCE: 17

Ctagaggttc aggcatatgg C9 22

SEQ ID NO 18 LENGTH: 22 TYPE DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer STSrga3195-R

SEQUENCE: 18 agct coacag gaatticgttg ag 22

SEQ ID NO 19 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer STSrga84 081O-L

SEQUENCE: 19 gcgt caggca gttcaact tc

SEQ ID NO LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer STSrga84 0810-R

SEQUENCE: tgttcttgca citcgcacttg

SEQ ID NO 21 LENGTH: 2O TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: primer STSsyn1-L

SEQUENCE: 21 US 8,912,387 B2 67 - Continued ggcacatgga cgtacaagat

SEQ I D NO 22 LENGT H: 2O TYPE : DNA ORGANISM: Artificial Sequence FEATU RE: OTHER INFORMATION: primer STSsyn1-R

SEQUENCE: 22 gcacagagga agctaggaga

<210> SEQ ID NO 23
<211> LENGTH: 412
<212> TYPE: DNA
<213> ORGANISM: Zea mays

<4 OOs, SEQUENCE: 23 gtatttctac Cagcgtggcc ttcaccgatgttggacggcc gacggaggca aalaccttgag 6 O cacatgaact gcc ctittaac atc.tt catga gacactg.ccc tittaa.ccgat gttggacggc 12 O cgacggagtic ttgccaaacc titcct gacct aggatagaca gttagacacc acctt cagot 18O ccttittgccc gtc.cat catg agct citctaa citt cagacitt gggaacctic agaccgacgg 24 O ggacgacgat CtcCtc.cggg agcagcc.cac gacgaccggc gccaacgatt tot CCaCata 3OO

Cagcagcatc agcagaacaa aattic cagta cittgcttgga aagct cqtac ctitt cacagc 360 ggcgcatagc titcctggatc cgctic cittga attctitcgat Ctgcagcttg to 412

<210> SEQ ID NO 24
<211> LENGTH: 176
<212> TYPE: DNA
<213> ORGANISM: Zea mays

<400> SEQUENCE: 24

cttccatcgg tactccattc aagcagaaac aaacaggtta caggcataca ttatactgtt      60

cgccaacagt tccctcgggt cgctccattt ctttactgac acgtgaaatt ggcaaacaat     120

ggagaaaaaa actaagtgca ggaaattaat tatactgatt tctcacacct ggagaa         176
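As a consistency check between Table 5 and the sequence listing, the STS148-1 primers can be located on SEQ ID NO:24 in silico. The sketch below assumes the 176-nt sequence as read from the listing with extraction artifacts removed; the helper name `revcomp` and the variable names are illustrative only.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (a/c/g/t only)."""
    comp = {"a": "t", "c": "g", "g": "c", "t": "a"}
    return "".join(comp[base] for base in reversed(seq.lower()))


# SEQ ID NO:24 as read from the listing (assumed transcription), in 10-nt blocks.
SEQ24 = "".join([
    "cttccatcgg", "tactccattc", "aagcagaaac", "aaacaggtta", "caggcataca", "ttatactgtt",
    "cgccaacagt", "tccctcgggt", "cgctccattt", "ctttactgac", "acgtgaaatt", "ggcaaacaat",
    "ggagaaaaaa", "actaagtgca", "ggaaattaat", "tatactgatt", "tctcacacct", "ggagaa",
])

# STS148-1 primers from Table 5 (SEQ ID NOs:72 and 73).
FWD = "CTTCCATCGGTACTCCATTC"
REV = "TTCTCCAGGTGTGAGAAATC"

fwd_start = SEQ24.find(FWD.lower())   # forward primer matches the top strand
rev_start = SEQ24.find(revcomp(REV))  # reverse primer matches the bottom strand
amplicon_length = rev_start + len(REV) - fwd_start

print(len(SEQ24), fwd_start, rev_start, amplicon_length)
```

With this transcription both primers sit at the extreme ends of the sequence, so the predicted amplicon spans the whole entry. The listing gives a length of 176 for SEQ ID NO:24, whereas Table 5 quotes a 177 bp Ji1037 product; the sketch does not resolve that one-base difference.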

<210> SEQ ID NO 25
<211> LENGTH: 270439
<212> TYPE: DNA
<213> ORGANISM: Zea mays
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (25001)..(25020)
<223> OTHER INFORMATION: n is a, c, g, or t
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (66226)..(66245)
<223> OTHER INFORMATION: n is a, c, g, or t
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (81020)..(81039)
<223> OTHER INFORMATION: n is a, c, g, or t
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (100923)..(100942)
<223> OTHER INFORMATION: n is a, c, g, or t
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (105186)..(105205)
<223> OTHER INFORMATION: n is a, c, g, or t
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (112496)..(112515)
<223> OTHER INFORMATION: n is a, c, g, or t

atcaccaaaa aatt catgga gaalacct cqc taagt cqcta gataggctica cqc cogtgtt 282OO gtc.gcaaatc acaact caag gcgact cata gcc ttacatt tttct coctt ggccactitta 28260 tcc.cgg tagc taaaggtoca agtacaaaga gctaaaaatg actaaaatgg togcaaaagtt 2832O ttgtcgttitt agt cacttitt aggacgaaac agaacatgat aattaaggat caacaagcta 2838O agtgcaaatt aaa.gagaatg atgtaatgta acaaggtgtt aatagaattt tdtgct tact 28440 taac agtgca acggcago ca tagga attaa goggat cacat togctgcc ct c titcggaagaa 285OO attgcattga caataatago citcc.gtctgt gtcaatgcat ttgctgccac ttgaacaatt 2856 O atgtgactitt ggat.cttggc act cqttgat atctgtatat aattaattat taattttgttg 28 62O agcattcaaa tagt cacaaa agaga.gtcag togtttgcca ataattitatgaaagcaagga 2868 O gcqatcttitt gagatgtgag aactcaaaag aatccaacac acggttcgga aaattaatct 2874 O taataaacag attcctgctic titttgtttitt tittaaaggca ccc.ttttitta aaaggttgaa 288OO gtcattgcta citttgtc.gct tttgttgcgtt toggatatata cct tcatttit agtggaact a 28.860 gaatticagtic aataaagtaa cccatttacc ttggaattta acattct acc actttittaag 2892O gttcagatat aagtictat ct caaatt catg agatggaaga ttggaaatga titt tatgaat 2898O caatataatt tdttitcgact citcta actta caagacgatt ttcaact cac ttct cqatag 2904 O taaaaatgta gcacataaat atcto cdata ccttgttaat aac agtatac aaatatattt 291OO tacataaaat cqaattagtt aattagtata tacctaatta citgtt attagaatggaattic 29160 aatticcaatig atctaaacgg gacataacaa taatgcdagt gctat cacaa titt cittgtca 2922O acgaatagga agtaga caaa ggaatctgta tigatgtgat gtgtggc.cat acaaatttgt 2928O cgttgataat aattic catga t catatggat ttctgga cag toga catcaga ccaactatot 2934 O ccgittatt at caacga cact agt cogacaa cittgttgc cct ct cacggact ggatggctta 294 OO gcc catttico aaaccact at ttcttgtcaa agtatt caaa gttgtcgt.ct catatgttat 2946 O tattgaagat tctatatagt tat cocqtgt at attaccat acgttgttt catagt ccaca 2952O ctaccggact cacgttctitt gcc.gagtgtc taaaacactic ggctic ggcaa aggccatttit 2958O acacttgaca aatatttitat cqgtaaaggg tttttgtcga gtatatttitt toggatact c 2964 O aacaaagact ttgttcaagtg togaaaaa.ca citcgg taaat
taggaatcqc aaaaaaaaac 297OO cctaaaaaat agcaaaac at tittctaaatt atatgaacaa ct citccaacc act acccaat 2976 O accatacc cc attatccitat catttitt cac tattattittgaataaaattt acatgttitta 2982O tgaatagtga gatt.cgaact togcaacct ct citcgc.gcact acact actac atcaattatg 2988 O t citat attac gttitt cattc ccatatacta taacaac ccg agagtaattt gattatttga 2994 O gacactaaat gaatt cattt gaaaatgtga acaactataa agttgtataa cittittcaaga 3 OOOO tctacaagtt c tattittgat agtttctaca tacgagattig titt acaaaac ttcaattitca 3 OO6 O aattittaaaa citt cacacga aattittaaat gigtaagatga tittaaaataa aaaatttgtc. 3012 O aattacaaag titt tattaca tttcaaga cc tacaacttitt attittggtgg tttitt coatt 3 O18O cgagg tagtt ttaaaaattic aaattittaaa atttaatcat agttittgcat gacaagatga 3 O24 O tittcaaacca aaatattgtc. aactataaag titt catalacc cittcaat acc tacaattitt c 3O3OO atgttggtgg tttitttitcga gqtcqttittcaaaattcaaa ttittaaattt ttaaaattica 3O360 gacgtag titt tagttgacaa gatgact tca aatgaaaaag ttgttcaact a taaactticta 3O42O taacttctica agatctacaa agtttatttic ggtggitttgg to atttgttc atc to acatg 3O48O

aaggitaticta gct citcgtac cacacagatc gtttgaactg. taataattga t caat attitt 993 OO acaagttittg at atcatcta taccalaagct gtagttitt.ca acgcaacct g togcaactitta 9936 O gagttcacaa gttatagaat cqagagacat gtcagt citca actatacatt toaga attcg 99.42O tatttgttitt taatcttctt act at cittgg acgaagaatg catcc tittag cataatatat 9948O tgttgagctic aatgacacct atataagcta cqcggtgaaa caattittaat ttggaat cat 99540 ataattitcag gtctaaaata t t caccgcaa aattatgtaa totagtaaaa aagatatagt 996 OO at catgcc at atatatgatt attctg.cgca gctgtgctga tigg to cittga tittaacctitc 9966 O agagacactt tttittggatt ttgtggaaat cqttaactgagggcct citta acacgacagc 99.720 gtc.catgaat gigcaatacta tacatgttgc atgcc tttgt tagagtaagt at agtaacag 99.78O agtataagaa gtctaaatgc tigtgttggag gacagagaag atgagacaga ggagaat cag 9984 O actatt atga t ct cacaatc gatttagata cdagaacaaa aaaaaaacct gacga.gagag 999 OO acaagttaat catacattaa taataaagag c talactatta tacaagtggt gigttgcaa.ca 9996 O at cactgcag citat cagttg gctatat cat tagcc tittgc tictttgttga ttagcggagg 1 OOO2O ttgcggcctic agittaatgga ttaacaaagg togtagccg gcc caagaca ataatggatt 10008O aattittat ct agtagatgaa attatttct c atttcttgtt ataagaact t c taaaactta 1 OO14 O cacaataa.ca actataattt tact atctat aggaagctat atgtcgatct tatgtttgtc 10O2OO attg tattga tigitat cacat gttitt catcg agctatoaca tdttitt cagg gttctggaag 10O26 O agcattttgt gctggaggtg atgttgttgc tiggtgtc.cag acaataaata atggit acaca 10032O caccittgcat coaaaaagaa totaattt ca gaatttggac aattcagact ttittaattitt 10O38O aatcacgttt atatgaaaaa a catcaat at titatgtc.t.ct agittaggittt attatgaaaa 10044 O tatatt citat aat caa.cata tact tattta gcatcataaa tottccactt titt cataaat 100500 ttgttaaagt tdtttgactt attgg taatt aagagttgca ttcttitt cag gatggaggac 10 O560 atctaactitt agctagoc to taccatttct attgtc.caat tatgattitat ttagt attitt 10062 O caatggctta tat caccitta gttcaaaac cc atgttacata ttatatatag gtctaatgtt 100 68O gttccacatgttttittggct gtgttgttgttg cittatttgat aaattagaag gatggaaatt 100740 ggg.cgctgat
ttctt.ccgag atcaatattt tittaalactac ataattgcaa catgcatcaa 1008 OO acct caggtg acctt catct citt catcata gtgattcaat gtaattattt gottct c togc 10086 O t cattgcatc taaatgatag acgatttaac tacaggitttc tictitc.ttgct ggaattgtca 10092 O tgnnnnnnnn nnnnnnnnnn ninggggtgtt tdtttgggat tataatctac ctagattata 100.98O taatccaata acttittggac taagagittag ttaaaaaatt attggattat ataatctagg 101.04 O tagattataa toccaaacaa acacc cactt aattatggta caaacct titt cqtgcgcttg 101100 atcggtgcca gtagttct to ttct catttg gaatatagag aatcttgcat tattitt catt 10116 O ggaaaaatca ttatattittg atatgccaaa at caatgtta t cittggacac taaacgctaa 101.22 O atgaccactic gcc cac catt tattgtttag at agtttaga taatgactag at agtatgaa 101.28O caccittacat tacagoagta aatatacata tdaattataa ttitttgtata aacttittitta 10134 O agtacagttt aatgcatgaa tatagittata aaagttgata taalacagtaa acaattatac 1014 OO ataaaggcat acacatgg to atatggacat catttaaaac ataag cattt gtgct cott c 101.460 acacaa.gcct aattt cacaa aattaataaa gttcaccaac cacccaaac a titcgtggctt 10152O atcaccitata ttattgttta aacacatgtt ttctgtttac atc tact tag ccatcattta 10158O aacaaattitt ctaaccagtt tdactgttta gacac ctitta gtgtttaatt cagtgac tag 10164. O gacctaaact acacgtgaac acattattta gcqtttaagc acacatggac togt catttag 1017OO acatgtttag gtggacaata accaaaatct gaat cattta ccttittgaat atc catgcag 10176 O gtttittgcaa tdccagaaac atctottggit citttitt cocq atgttggggc ct catattitt 10182O ttgtct cqac titcctgggitt citatggttct ctaat acccc cqatc.ttitat ttagttgaaa 10188 O tgcatgtagc taagt cattg caa.catatat aatatttcaa togttctgcta aacct cq agg 10194 O gatattatgt at cattctitt agttittagtt ttataccatg tdattittatt cittagcaagt 102 OOO gtgatt cittg cacaattgat citatgcaatt acg tatttga t tatggtgat at attaatat 102060 aaagaaatga aaatgacaat ctittcttctt ttcaagttga tittattggitt coattatgag 102120 gtagccatat agttggtggit aatgatagca ttggatat ct tdttaaataa aac attact c 10218O taac attctg.
Caggagagta tittgctictt gttggtgcta gattggatgg tectgaaatg 10224 O cittgcatgtg gtc.tc.gcaac to attttgtc. cct tcaaatg taggttcact agtaact caa 1023 OO tttittaaatgagttgtcaat titt Cttacag titatgatgtt toggtgttata tittatggitta 10236 O ctattattga gaatgttcta totataaaaa toctagocto atago atttg cacatgggct 10242O ttgaagttitt gttct ctago at cattaagt ttatt tatgt gigtatat citg ttgaaatagt 10248O tttgttitt co agagaatgct attgctggaa gaatc cctta aaaaggtgga cacct cqaat 10254 O agttttgttg tatgtag tac tat cqatcaa ttctgtcaac agc catc.ccc aaaacaaaaa 1026 OO agttcc ttaa at aggtaagg gcatttctaa tta actcaaa gacatatgtt tdgttcataa 10266 O tat cactatt ttt cattctt toggtgcacct taggttggala atcatcaa.ca aatgcttitt c 10272O taaaggaaca gttgaagaaa ttatat cotc ticttgtaagt ttgttatatt aattgtaggit 10278O ttctatgggit to actt citta tattatgaaa aataataaat gcatatttgt totgtcagga 10284 O ggaagtggcc ticaaatticag caa.gcaaatg ggctgct cag acaattcaat atctgaaaaa 1029 OO ggcttct cot act agt ctda aaatcacatt gagat.cggta t t c cittagaa accacacccc 102960 ataattgtac tattaatcta cqacatatat ttgtct catt atatgttitt c taa catggag 103 O2O ttcagataag agaagggaga acacaaaccg ttggggagtg Cttgcaacgg gaatatagaa 103 080 tggtttgcca tdtcqtacgt gigtgactitta gtc.gagacitt ttittgaagta attaalacatg 10314 O gacat cacta at actittgct citat actttgttgtcattgt a catcaatgt atgtacctaa 1032OO catccaactic ctitttacagg gatgtagggc tatac tagta gataaagata aaaatccaaa 10326 O ggttcttata citt coatatt tag cacct ct c catcaaaat t cattacgac ttattittatt 10332O tgatataacc attagatgat gigtgttcttt tttggggagc titgcagtgga tigcct coaat 10338O gttggaacaa gtgcatgatg atgcagttga agagtatttic tictagggttg atgttcCaga 10344 O gtgggaagat ttggacctac Ctgtcatgtg ttcaaatgga agaattatgg agt ccaa.gct 1035OO ttgaattaag cctittattga at agtttagt gigaac ct cqg ttgtgct tag aataagaacc 103560 catcc catgt toaaagag to tatgtacact gaatacaatt gataaaataa aattgaatat 103 62O gtggtgtata tacacttata ccatagaagg tact cattgt tattitttittg tagggatagg 103 680 caatat caac aagctagotg gttcc tagct catctggct tatttagctic gtgagccacc 
10374 O atataatatgctaccact at toggittaaatt gaaataataa tatgttatct attggitttca 1038OO gaagtactaa aacaaataat gattgttgaac ct cittgaact gacacaaaaa atggittagca 10386 O caac cagttt aagccactaa cagg tagtgt tatgcactac ttatt coaat aacaaatata 10392 O tagtgttgag gtctat ctitt agc.cgaaggit cotcaaaa.ca ttalactalacc agittatttitt 10398O

tgagttcgct aaacaaatga tigaaaatgtt gcacago agg aaaatgaaaa aaatcaaagg 123 OOO aggcattggit gagcaaggta titcgtgctt gttagcaata atggtttgca tacattt atg 123060 taaataattig gogaaatttgg atc catacca ttaaaagat.c accactittgg atc catacca. 12312 O ttaatat citc acttacatgt gigg to cacat gag to aatga catgtggggit coatggtata 12318O tatictaaagt ttggat ctitt taatggtata gatctaattig titcctaaata atttgtttaa 12324 O titatgataat aatgtgttat aggcaccalag gacaacagat acattatata ggttctgaagt 1233 OO gagagt ct ct agagcataca ccagagctgt tatgaataga tittgaggaat caatgaaata 1233 60 cgcc actgca tacaaaat at taaaggaccc agacggatgt gataatgaat ggat.cgtaca 12342O gcatacaaaa cqgtctaata aaattgttgttggggacaa.cat caattcaaga taacagdaaa 123480 Catagaagtt ggggagtata catgcgagtg caaac agtgg galacatacag gtttgtacgt. 12354 O attatgttgg ttagcatalaa aaatttgcat act agagaat ttgatgattg tag tacaaat 123 6OO taatgtatat ttittgaaata atggttcgtt tt catgttitt caggit ctatt gtgtgttcat 123 660 cittittaagag cct tcatgca tottcaagtt gaaaagatac cittcaaagta tatattgcaa 12372O aggtacactg. tct catcaag aaaagatgtt cogtttgaaa gaattgataa gagcttcagg 12378O gggaaggatg gagttact aa at Catacaga Cagaaaatgt tittaacgaa aacaatgaaa 12384 O gtagttcgcc aggcgtgt at gtcaaaag.ca ggg tatgata aggcgatgga tigtgttggat 123900 gagotcgatgtcgttctaag ccgattggag ccagatattg gatgtaatga gtcaiacagat 12396 O gttagtgata atgaggaaga caagg taata at attgcaga tigttittagtg ttatactatt 124 O2O tgta acataa attatgcata gtaacatgtt attttgtacc aggaagaaga gttgaataaa 124 O8O aataatgctg gcgatgggat ggaagatgac aatacaatta catgt catala taagg tatgt 12414 O aaaagatata tatataagct togcatgaagt attgtacata atgaatatat aaaat caatig 1242OO taataagcaa ataatattitt gtaggatt cq cataacacca ggactggatg tdaacatgcg 12426 O ttaacaataa taacaactgg taaccaggta cat attaaaa taatattgtt tatttgtcta 12432 O aaatgaaaat atatattitat gctgcatgta aatgttgatg tatttgggitt gggctictaaa 12438O ggaggacaac atgaga attt Cacatgaggit ttgataca agttctictt gtcacgtagc 12444.
O acataaacaa atggaacata ttgctgcatc ct cagaa.gct aaaaaggtga gcattgataa 1245OO atgattittac tdattggitat aaatatatgt aaatatgtgt atact aaata gtaatgtcta 12456 O titat tatgca gaggttgaat tittaacgtgg atgttataaa totgagtatg ccggat.cgtg 124 62O Caagaccalaa aggc.cggaca atcaaaaatt Cagaa.gagag ggittatgaga Citaggtgcga 124 68O aaggagagaa aaagaagaat aggagatgcc atttgttgttgg aatagcagat gggcatalaca 12474 O gCagaacatgtctgtctgtg galagagaaca gggcaaggct agcaaaactg. tctaatcgaa 124800 agagaggacg gcc agc.cgga t caagacitaa acaataaaac aactgct coa cagtggaatg 124 860 aaac atcgac to aaaaaaa cattgtattg atgaagaagt ggaaaatgala gaa.gc.cgatg 124920 agcatatgga tittgggcgaa taatttgaag ttgact aaa gagtggtcga tittagtagaa. 12498O acttgtaata t cacaatgac titatgctitta ttttgttgga aattcaatag tittgcatgaa 12504 O agggittatat aatgcgttga tataaaacta taaaatgttg cqtaaatata gacaattaca 1251OO caaattatgt aactgaatat gtacaacaaa ttggcattga acgtttagca atcggaaggit 12516 O ctggacatta atggtatata aaactgtata cacttaacaa tactaat caa atgggagtgc 12522 O atacgt.cggit ttcacactitc caatacaa.ca aac aggatat agcataaaca cittgttacaa 12528O ttagcataat tag.cgtgaaa aatgtataaa agatgaacct agtgtatata aaataaggaa 12534 O gtatat atta tdcataaaac ttgaatattgaaacaatatgataatacaaa taa.gc.caaac 1254OO attggaagica aaatgattaa ccaact caca aactggagat gaaaa.ca.gca tagaatttgt 12546 O taacagataa cataaataga cattctggta acaattatgt ttgacacact tcaaataaaa 12552O actgatgcga agtag catag acattagaca cattgaacat aagtggcaca cqgagtatica 12558O aacacatgat cagaggcc td acatalagtga cittaacaagt aacataaac a gttttgttgga 1.2564 O t cccacaggg aat catgttc agctatocta t ctaagt citt ggctittcaaa cagcgg.cgit c 1257OO t ccttggagc at acttgaaa ggcagcagtt Ctgcaggaag gggggcaac C atgttgttgc 12576 O gatgaaaggit aaggtaatgc agaacaaagg cacgttggtc taatggttca t cctgaaat a 12582O gccataatat caaattaatg gtagaataaa atctgaagga gtaatatatgcaagattaga 12588O agactaattit agtgacatac cqgagtataa aattctgata agt cqccatc atcaaaatcg 12594 O tagtag togga gcaagtttgc gacaaaaaaa ccacaat cat
ttgaccctgg tdt catggitt 126 OOO ggacaattgg gaaggagc cc aatcttgtag ttgccaaact ttggtacagt tact caggt 126060 cgagctt cat gtaaggcaat gctaagt citt ct cattatta atctggacca agg tat citt c 12612 O gttc catgita t catcatttg at cattatgg atttgtttcc atgtag togcc ticcaa.gcaat 12618O gtaccataag gatttgaatc taagatgtct atticgacaac gttcaaaatt gattgcataa 12624 O agggtc.cagt ggctt.cggcg tag catagga accaggat.ct gttaaatata tatgtaaatt 1263OO tatgaatgat ttatactato catatgaaaa attaaaacta ataca acata aaact agtga 12636 O c caaaagaac attctaac at ttgtacaaat gaaactgatt agagctic acc aatttaatct 12642O ggttcagaac at Catctgaa ggaagggttg gttcaagttg ttctttalaga agtgatgttg 12648O tgaagggctg toggatttgaa citgtgttgtt caaactic ct c gat attcaaa acggtctgga 1.2654. O caaagaaaat aaattgatta gtattatata tittatacagt aagtgttgca ataagataca 1266OO aagttgttgga gaaaaaatta ccc caacgtt gacattcagt attagagtgt tdattacaga 12666 O atctggattig tataatacat catcc tigacg aatacagt cq ataaatticct gcatgaaagt 12672O attct caaga catttgttgg gaccaaacga citggacaa.ca toaaggacag accct coaaa 12678O t cct coaaaa ttaatgatag goctoggggaa aaaactittga ttaca attct tctaattata 12684 O cittaattgga atttgaataa ctaatacata catgctic gga tictagttitt c ctdacaaaat 1269 OO gaacttaa.gc aggttgcgag cacagt ctitc acgtgacacg tdgggatgtt cagcgactgc 126960 agctttacag gaactatoct ttgcaaccag gtcagtagtt gcc taatatg tdaatatatg 127 O2O ttagtaaa.ca gttgcataaa tatgtatgtc. ttagtgtata aataggaaaa atatgaaaat 127080 gacaacatgt cittacatctg. t cagtggggt tittatcqaac agaggtgcac cagatacaag 12714 O agitat catgt gtaacagatt tttgtc.ctitc ctd taac caa taaatttatt aatataaacg 1272OO taatgaaa.ca tagggtataa ttaat attag cataatatga tigaaggtgta aatact taca 12726 O Ctattt at at Ctggtgttaa cittaggagca gcaggcttgt Ctgttgttggg agcctgggga 12732O tittaaaaaaa ataaag catg catatatatt aatgttactit gcatacattg taaatgaaat 127380 aataataatt gtgatataag gaagt caaac attagattitt ggagt cittgt caaaaatagg 12744. 
O ggcatc cata actt catctt ttittgcagac gttcc cactg acaatctata atggaaacaa 1275 OO ctgagatttg agtaaacagt aacttgtatg cagaatagaa tatatgcatt gtatatatag 127560 tttgtatgag aaaact taca toat cittata cittittatatg at cagtagga acatggctitt 12762O cctgat catgttgctgttta t caggtgttg gaacagttgg togtggatact ggat catttg 12768 O ttggaggc at tagg tagtggggggittatc tacatttgga acatataata atatagtagt 12774. O tgttactgaa atataactgc aatataaatg tat cqttata aaaaaatggit taalacataca 1278OO catttgcaaa ctacat cagt atact at cag gatgcataaa taaaatacag ttitt caaaaa 127860 atgaaataca citgaaaaaat atgttgctgc titgitatgagt ataaattaat atata atttg 12792 O tatatacaaa actgcatata ataaattatt tdagat.cgta taagtatata ttattitt coa 12798O ggaaataatt cqtgaactitg attagttitta aaaaataaac ataaatacta aaacct tctg. 12804 O gctgttgaatc gcgctg.cgitt gcagotctica gaact tcatc gat catttca ccaaatgttt 1281 OO cagagagttcaatctgctitc gaaacaat catctgatggct cagctgaagg gaat cacagt 12816 O attitat coac ct citttgtca tdagcatcaa acaatgattgaaacattgga cqctgctggg 12822 O agggcaaaca ttgcaacttg ttgccaatca aattcttgat acgtggtaca tatatgttga 12828O aaacggcaga aac aggcc at tctggagcaa tacgitalaca C9tttctgaa cqgctacga a 12834.
O actgtagaaa gtaatataga acagttagca gtaataatat tittaa catct gcatalacagg 1284 OO acaaaaagtt gtt tatgtct aacct ctdca tat cogaatg gttcaccagt tttacggggg 12846 O ggggginnnnn nnnnnnnnnn nnnnnggaat gcacaaatat ataaaattag tacataacaa 12852O ctatttgcat at catatata togcacattaa acaaaaaaat acttaccaga acaatgagtg 12858O agcatc cata tattgttgtt gtgatgtttgttttgttcct tittatgccaa cqtgcggcag 12864 O catcacacaa atc.cgtgtaa act agttgac accagt caat at cagacatt cqggc catgt 1287OO ttgaagt cat gagtacct cattgtttgtaa tdcc.ccaaga agcagaagga aaaagaa.gcc 12876 O gattgaatagaat cagaaaa aaaac atcta attgatagot catcatcgtt gccaa.gcact 12882O atcttgtc.tt gaagcttgac aac atcaaat t cittctttac caa.cattgag at cacgt.ct c 12888O agttittgcag cagcatccac titcaccatac caatcagtaa act coct gcc ticctic cago a 12894 O catggcaaac ccaaaatcag atgaaccota t ctitttgtta t ctitcagttc cittgc.ca.gcc 129 OOO cctggg.cgta tdgtoatgtc atgtggat.cc aattitat coat caaccacct gatgagtgat 12906 O ctgct citcta aggcatcagt togcaaatca aatatgcttgaaaattic caa cctagogaca 12912 O gcatcgc.gct gcc.gatcqct cattatccaa gatgatacaa taacat cata gggaatgcag 12918O cggatatt ca gtttctggaa goataaatga atatgcatta atacacaaag aatatatata 12924 O gttagaacta atgatacaat atttacaaac aagggataat tattacatgc titgitaaataa 1293 OO c tagt ctaca acaataaaaa aagtaaacac catatttgtt gataataata tacataaatt 12936 O taaagtgggg gaaaatacaa acttgattgg acaccaaaat taacacatga ataaa cataa 12942O atatatacat gttacctgtg attittctagg agt cittctgt ttaggagagt tttittttgta 12948 O atgat attag actittggact tcttitt cacg caaggaactt aaggaaaagg tacttittaac 12954 O ttctittgcac tdatttgttctgaaggagaa gotgaggcag atctgaact g acgcttcaac 1296OO gatggagcat C catgaaatc gtcgt.cggac agtgtggatg gag caggagg ttitt Ctttitt 1296.60 aaacgacaag Ctalacgc.ctt t ctg.cgaaaa catggattggggctggctg tdgct catt C 12972O tgct cotgat tittggggaat atcatcttgg tdagctgagc tigctggat.ct td tacgtctg. 
12978 O gaaatgtcaa ttgaaaaaac citcct gactggattcaatgg ttaatcttitt toggattact g 12984 O ggtggg.cgat caacagacitt catgattaaa tactgaatag atctgtagaa aaaagaagtt 1299 OO gtttittaaat acattgattt atgcatgaat atatgccaaa aaaagaacta attitt tact a 12996 O taactatotg aaatgcatac tatataatgc taagtggaat tacataactt aaactacctt 13 OO2O

tgaaaataat tittaattitta acagaaatat gcaaatacaa tittaaatgaa ttaat attta 13248O atttittaa.ca ataatatgca aatacaattit aaatgaatta taatagdaca catagataac 13254. O tagagagatt ttaaattaat tdgcaaccac aatt catagt cqat cataca tdataacaaa 1326 OO tacaatttga t t catacaat ttitt catgaa citaccatttt to cqcatgta tdgctittaag 13266 O ttcttatcaa ttittcttitt c cacacgcttg attct citctg gaccqttaaa gacgaac at c 13272 O tott catact tattgt attc ct catcttgg tot cocacat t ct caactic c aacaatttitt 132780 tgctitt coag aaataactac atgcatctitc. tcatctgctg gatcgagtac atagaac act 13284 O tgtgcaacac attcggcgag aaccoacggg to atctittat atcct actitt ctittaagtct 1329 OO accagtgtgagtctgtagtt atcCactato accc.ccacgg gtccaccitat agt cc ctago 132960 atcgttggat Ctgtgcaaga Cttgaccalag taccCtgtag acgat attgt taggacaag 13302O ccttgctitt c tt catatt co aat caatcga tictdggacaa aaacaaag.ca agctgctact 133 080 ggtttggtaa aaccaggact tdttalactac aaggacaatic ctdtc.ccgcc acactatgct 13314 O gtggtcCaag ttctagaaat cacagacaat ggttgttgaag attgggagat ggattitt CC a 1332OO gttgagggga t catalactitt gcatgaggca attaacgagt tdgtcCtttg gcatcgacgc.
13326 O gacatcaagt ttggtgatga gcc.gaattica acaccaatgc aacaaaaagc tagttcgat c 13332O attic cacacc ccatggtgca ggaagatatg acaccgacct c caaaaaaat cqttgaaaca 133380 atcat atcgc ct Cttgcgaa ggaacticgac gagggggacg tagacaaaat tdttic ct cag 13344 O aatgtcgacg tdaacatalaa aaaggaacca aaggttggca ccaatgttga agggccaact 1335OO atttcaaaga agatt cataa gogaatcgcc acagacgatgtcaagaaacc gtcagaatca 13356 O gtggcacgct acctgcataa attgcagaga gttatgacta atcaatcaaa agcagtat ca 133 620 gCatct catg gag taggcac acggc.cggcc caaatagaca attitcgaaat atgggaagaa 133 680 gatggaatga t t catact cq agaat catta cqt cittaaaa ccaatat ct c gagtatacaag 133740 aaggaggatgttcctic ctaa atttgttgaat gggaggc.cgt t cttaacgac ggtgcaactt 1338OO tctaagttgt cactic.cgga gattagaatg cacgagtggit acatggtggc tagcaacaaa 133860 tacaaact cq aggaatticac atttgttgtg ccagaagatg cattttggag caatgat cat 13392 O ataaatcc td togcggcattt attctittgat gat citctggit cqttgtacca ccgacaaagg 13398O atggaaacga act acttaac cct cittctgc titgitaagtac citctaaact t t catatttgt 13404 O tagttttgta cacgcacaca taaacgagag totaacgatt aac act atta cacgtaggat 1341OO gcaatacatg gatgataaga agaaacaact taaga caggg titt cittgacc cqttgatgat 13416 O atcc caagct cqctacaaag tagttgctic gaggcaagga gaagaataca aagacittgga 13422O cgatgctgaa tittgagaaag cc.gtcaaaca gaatcagaga aagaaaatga aggtaatggC 13428O ggcatacatt ggacgagcca totataatca totacaa.cat gigcaaggact taataatago 13434 O tcc.gcaccac tittaagtaag titat cqacat gig tittaattt togalactataa attatggcaa 1344OO tcqtgatatt alacatgtgtt titcgt.ctato tctatatagt gaccactaca tttgtat cat 13446 O gatctgacca aaggatggta aagttcgtggit Ctt.cgactica Ctgagaatgg aaaaggctac 134520 gtataatgac ttcttgaaga ttittagagaa gtatgattat tattgcacac ttgtactitat 13458O tacaacttaa caaatgtatt cittaatct ca caatgctatt cittatatgca gtgcataccg 13464 O tttittattgg aaagat cittg gcgg.cgaaca tocagaggac aagcc taatt tdt caatat c 1347 OO attatttcta tatggtgaca aacaacct co ggg tactgtc. citgtgcggitt attatgtatg 134760