USOO7834249B2

(12) United States Patent (10) Patent No.: US 7,834.249 B2 Schouten et al. (45) Date of Patent: Nov. 16, 2010

(54) GRG23 EPSP SYNTHASES: COMPOSITIONS 2007. O136840 A1 6, 2007 Peters et al. AND METHODS OF USE (Continued) (75) Inventors: Laura C. Schouten, Pittsboro, NC (US); FOREIGN PATENT DOCUMENTS Cheryl L. Peters, Raleigh, NC (US); Brian Vande Berg, Durham, NC (US) WO WO98,07846 A1 2/1998 (Continued) (73) Assignee: Athenix Corporation, Research Triangle Park, NC (US) OTHER PUBLICATIONS Padgette, S.R., et al., “Site-directed Mutagenesis of a Conserved (*) Notice: Subject to any disclaimer, the term of this Region of the 5-Enolpyruvylshikimate-3- Synthase Active patent is extended or adjusted under 35 Site.” J. Biol. Chem., Nov. 25, 1991, pp. 22364-22369, vol. 266, No. U.S.C. 154(b) by 412 days. 33. (21) Appl. No.: 11/943,801 (Continued) Primary Examiner David H Kruse (22) Filed: Nov. 21, 2007 (74) Attorney, Agent, or Firm Destiny M. Davenport (65) Prior Publication Data (57) ABSTRACT US 2008/O1273.72A1 May 29, 2008 Compositions and methods for conferring resis O O tance or tolerance to bacteria, plants, plant cells, tissues and Related U.S. Application Data seeds are provided. Compositions include polynucleotides (60) Provisional application No. 60/972,502, filed on Sep. encoding herbicide resistance or tolerance polypeptides, vec 14, 2007, provisional application No. 60/872,200, tors comprising those polynucleotides, and host cells com filed on Dec. 1, 2006, provisional application No. prising the vectors. The nucleotide sequences of the invention 60/861,455, filed on Nov. 29, 2006. can be used in DNA constructs or expression cassettes for transformation and expression in organisms, including (51) Int. Cl. and plants. Compositions also include trans CI2N 15/52 (2006.01) formed bacteria, plants, plant cells, tissues, and seeds. In CI2N 5/82 (2006.01) particular, isolated polynucleotides encoding CI2N I/2 (2006.01) resistance or tolerance polypeptides are provided. Addition (52) U.S. Cl...... 800/300; 435/69.1435/252.3; ally, sequences corresponding to the polynucle 435/320.1; 435/419,536/23.2: 800/288: 800/300.1 otides are encompassed. In particular, the present invention (58) Field of Classification Search ...... None provides for isolated polynucleotides containing nucleotide See application file for complete search history. sequences encoding the amino acid sequence shown in SEQ ID NO:9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, or 35, (56) References Cited or the nucleotide sequence set forth in SEQID NO:6, 8, 10. U.S. PATENT DOCUMENTS 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, or 34. 7,674,958 B2 * 3/2010 Peters et al...... 800/300 24 Claims, 3 Drawing Sheets

as s S. 4O

s9 wd s 2 3.O

s 2O

1 O US 7,834.249 B2 Page 2

U.S. PATENT DOCUMENTS Amin, N., et al., “Construction of Stabilized Proteins by Combinato rial Consensus Mutagenesis.” Protein Engineering, Design & Selec 2007,0294785 A1 12, 2007 Heinrichs et al. tion, Nov. 2004, pp. 787-793, vol. 17, No. 11. FOREIGN PATENT DOCUMENTS Lehmann, M. and M. Wyss, “Engineering Proteins for Thermostabil ity: The Use of Sequence Alignments Versus Rational Design and WO WO99.6O132 A1 11, 1999 Directed Evolution.” Current Opinion in Biotechnology, 2001, pp. WO WO 2005/040344 A2 5, 2005 371-375, vol. 12. WO WO 2006/110586 A2 10, 2006 Polizzi, K.M., et al., “Structure-guided Consensus Approach to Cre WO WO 2007/064828 A2 7/2007 ate a More Thermostable Penicillin G Acylase.” Biotechnol. J. May OTHER PUBLICATIONS 5, 2006, pp. 531-536. Rud, I., et al., “A Synthetic Promoter Library for Constitutive Gene Stalker, D.M., et al., “A Single Amino Acid Substitution in the Expression in Lactobacillus plantarum," Microbiology, Apr. 2006, 5-Enolpyruvylshikimate-3-Phosphate Synthase Confers pp. 1011-1019, vol. 152, No. 4. Resistance to the Herbicide Glyphosate.” J. Biol. Chem., Apr. 25. 1985, pp. 4724-4728, vol. 260, No. 8. * cited by examiner U.S. Patent Nov. 16, 2010 Sheet 1 of 3 US 7,834.249 B2

Thermostability of M5 vs. WTGRG23

D

S N e

O 3. {s M5 S GRG23 - Linear (M5) - Linear (GRG23) time (hrs.) at 37 degrees C

FIG. 1 U.S. Patent Nov. 16, 2010 Sheet 2 of 3 US 7,834.249 B2

20,000 -

15,000 -

10,000 -

5,000 -

FIG 2 U.S. Patent Nov. 16, 2010 Sheet 3 of 3 US 7,834.249 B2

s

D O

se O N. Y w

FIG. 3 US 7,834,249 B2 1. 2 GRG2.3 EPSP SYNTHASES: COMPOSITIONS nation between the E. coli and Salmonella typhimurium EPSP AND METHODS OF USE synthase genes, and suggest that mutations at position 42 (T42M) and position 230 (Q230K) are likely responsible for CROSS REFERENCE TO RELATED the observed resistance. Subsequent work (He et al. (2003) APPLICATION 5 Biosci. Biotech. Biochem, 67:1405-1409) shows that the T42M mutation (threonine to methionine) is sufficient to This application claims the benefit of U.S. Provisional improve tolerance of both the E. coli and Salmonella typh Application Ser. Nos. 60/861,455, filed Nov. 29, 2006: imurium . Due to the many advantages herbicide 60/872.200, filed Dec. 1, 2006; and 60/972,502, filed Sep. 14, resistance plants provide, herbicide resistance genes 2007, the contents of which are herein incorporated by refer 10 improved glyphosate resistance activity are desirable. ence in their entirety. An alternate method for mutagenesis is the “permutational mutagenesis' method described in U.S. Patent Application REFERENCE TO SEQUENCE LISTING No. 60/813,095, filed Jun. 13, 2006. SUBMITTED ELECTRONICALLY 15 SUMMARY OF INVENTION The official copy of the sequence listing is Submitted elec tronically via EFS-Web as an ASCII formatted sequence list Compositions and methods for conferring resistance or ing with a file named “33.6850 SequenceListing..txt, created tolerance to are provided. Compositions include EPSP syn on Nov. 19, 2007, and having a size of 151,000 bytes and is thase enzymes that are resistant to glyphosate herbicide, and filed concurrently with the specification. The sequence listing nucleic acid molecules encoding such enzymes, vectors com contained in this ASCII formatted document is part of the prising those nucleic acid molecules, and host cells compris specification and is herein incorporated by reference in its ing the vectors. The compositions include nucleic acid mol entirety. ecules encoding herbicide resistance polypeptides, including those encoding polypeptides comprising SEQID NO:9, 11, FIELD OF THE INVENTION 25 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, or 35, as well as the polynucleotide sequences of SEQID NO:6, 8, 10, 12, 14, 16, This invention relates to plant molecular biology, particu 18, 20, 22, 24, 26, 28, 30, 32, or 34. The coding sequences can larly novel EPSP synthase polypeptides that confer improved be used in DNA constructs or expression cassettes for trans resistance or tolerance to the herbicide glyphosate. formation and expression in organisms, including microor 30 ganisms and plants. Compositions also comprise transformed BACKGROUND OF THE INVENTION bacteria, plants, plant cells, tissues, and seeds that are gly phosate resistant by the introduction of the compositions of N-phosphonomethylglycine, commonly referred to as gly the invention into the genome of the organism. Where the phosate, is an important agronomic chemical. Glyphosate organism is a plant, the introduction of the sequence allows inhibits the enzyme that converts 35 for glyphosate containing to be applied to plants to (PEP) and 3-phosphoshikimic acid (S3P) to 5-enolpyruvyl selectively kill glyphosate sensitive weeds or other untrans 3-phosphoshikimic acid. Inhibition of this enzyme formed plants, but not the transformed organism. The (5-enolpyruvylshikimate-3-phosphate synthase; referred to sequences can additionally be used a marker for selection of herein as “EPSP synthase', or “EPSPS) kills plant cells by plant cells growing under glyphosate conditions. shutting down the shikimate pathway, thereby inhibiting aro 40 Methods for identifying an EPSP synthase enzyme with matic amino acid biosynthesis. glyphosate resistance activity are additionally provided. The Since glyphosate-class herbicides inhibit aromatic amino methods comprise identifying additional EPSP synthase acid biosynthesis, they not only kill plant cells, but are also sequences that are resistant to glyphosate based on the pres toxic to bacterial cells. Glyphosate inhibits many bacterial ence of the domain of the invention. EPSP synthases, and thus is toxic to these bacteria. However, 45 certain bacterial EPSP synthases have a high tolerance to DESCRIPTION OF FIGURES glyphosate. Plant cells resistant to glyphosate toxicity can be produced FIG. 1 demonstrates the enzymatic activity of GRG23 by transforming plant cells to express glyphosate-resistant (acel) (“M5'; SEQ ID NO:10) compared to wild type bacterial EPSP synthases. Notably, the bacterial gene from 50 GRG23 (“WTGRG23”; SEQ ID NO:2) at 37° C. for 0 to 25 tumefaciens strain CP4 has been used to con hours. fer herbicide resistance on plant cells following expression in FIG.2 shows the resistance of GRG23 and several variants plants. A mutated EPSP synthase from Salmonella typhimu to glyphosate, expressed as the inhibition constant, or K, rium strain CT7 confers glyphosate resistance in bacterial FIG. 3 shows the half life of GRG23 and GRG23 variants cells, and confers glyphosate resistance on plant cells (U.S. 55 at 37° C. Pat. Nos. 4,535,060; 4,769,061; and 5,094,945). U.S. Pat. No. 6,040,497 reports mutant maize EPSP syn DETAILED DESCRIPTION OF THE INVENTION thase enzymes having Substitutions of threonine to isoleucine at position 102 and proline to serine at position 106 (the The present inventions now will be described more fully “TIPS mutation). Such alterations confer glyphosate resis 60 hereinafter with reference to the accompanying drawings, in tance upon the maize enzyme. A mutated EPSP synthase from which some, but not all embodiments of the inventions are Salmonella typhimurium Strain CT7 confers glyphosate resis shown. Indeed, these inventions may be embodied in many tance in bacterial cells, and is reported to confer glyphosate different forms and should not be construed as limited to the resistance upon plant cells (U.S. Pat. Nos. 4,535,060; 4,769, embodiments set forth herein; rather, these embodiments are 061; and 5,094,945). He et al. (2001) Biochim et Biophysica 65 provided so that this disclosure will satisfy applicable legal Acta 1568:1-6) have developed EPSP synthases with requirements. Like numbers refer to like elements through increased glyphosate tolerance by mutagenesis and recombi Out. US 7,834,249 B2 3 4 Many modifications and other embodiments of the inven resistance protein’ includes a protein that confers upon a cell tions set forth herein will come to mind to one skilled in the art the ability to tolerate a higher concentration of glyphosate to which these inventions pertain having the benefit of the than cells that do not express the protein, or to tolerate a teachings presented in the foregoing descriptions and the certain concentration of glyphosate for a longer period of associated drawings. Therefore, it is to be understood that the time than cells that do not express the protein. By “tolerate' or inventions are not to be limited to the specific embodiments “tolerance' is intended either to survive, or to carry out essen disclosed and that modifications and other embodiments are tial cellular functions such as protein synthesis and respira intended to be included within the scope of the appended tion in a manner that is not readily discernable from untreated claims. Although specific terms are employed herein, they are cells. used in a generic and descriptive sense only and not for 10 The present invention further contemplates variants and purposes of limitation. fragments of the polynucleotides described herein. A “frag The present invention is drawn to compositions and meth ment of a polynucleotide may encode a biologically active ods for regulating herbicide resistance in organisms, particu portion of a polypeptide, or it may be a fragment that can be larly in plants or plant cells. The methods involve transform used as a hybridization probe or PCR primer using methods ing organisms with nucleotide sequences encoding the 15 disclosed elsewhere herein. Polynucleotides that are frag glyphosate resistance gene of the invention. The nucleotide ments of a polynucleotide comprise at least about 15, 20, 50. sequences of the invention are useful for preparing plants that 75, 100, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, show increased tolerance to the herbicide glyphosate. Thus, 750, 800, 850,900,950, 1000, 1050, 1100, 1150, 1200, 1250, by "glyphosate resistance' or "glyphosate tolerance’ gene of 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, the invention is intended the nucleotide sequence set forth in 1800, 1850, 1900, 1950 contiguous nucleotides, or up to the SEQID NO:6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28.30,32. number of nucleotides present in a full-length polynucleotide or 34, and fragments and variants thereof that encode a gly disclosed herein depending upon the intended use (e.g., an phosate resistance or tolerance polypeptide. Likewise, a "gly EPSP synthase polynucleotide comprising SEQ ID NO:1). phosate resistance' or "glyphosate tolerance' polypeptide of By “contiguous nucleotides is intended nucleotide residues the invention is a polypeptide having the amino acid sequence 25 that are immediately adjacent to one another. set forth in SEQID NO:9, 11, 13, 15, 17, 19, 21,23,25,27,29, Fragments of the polynucleotides of the present invention 31, 33, or 35, and fragments and variants thereof, that confer generally will encode polypeptide fragments that retain the glyphosate resistance or tolerance to a host cell. biological activity of the full-length glyphosate resistance A. Isolated Polynucleotides, and Variants and Fragments protein; i.e., herbicide-resistance activity. By “retains herbi Thereof 30 cide resistance activity” is intended that the fragment will In some embodiments, the present invention comprises have at least about 30%, at least about 50%, at least about isolated, recombinant, or purified polynucleotides. An "iso 70%, at least about 80%, 85%, 90%, 95%, 100%, 110%, lated' or “purified’ polynucleotide or polypeptide, or bio 125%, 150%, 175%, 200%, 250%, at least about 300% or logically active portion thereof, is substantially free of other greater of the herbicide resistance activity of the full-length cellular material, or culture medium when produced by 35 glyphosate resistance protein disclosed herein as SEQ ID recombinant techniques, or Substantially free of chemical NO:2, 3, or 5. Methods for measuring herbicide resistance precursors or other chemicals when chemically synthesized. activity are well known in the art. See, for example, U.S. Pat. By “biologically active' is intended to possess the desired Nos. 4,535,060, and 5,188,642, each of which are herein biological activity of the native polypeptide, that is, retain incorporated by reference in their entirety. herbicide resistance activity. An "isolated polynucleotide 40 A fragment of a polynucleotide that encodes a biologically may be free of sequences (for example, protein encoding active portion of a polypeptide of the invention will encode at sequences) that naturally flank the nucleic acid (i.e., least about 15, 25, 30, 50, 75, 100, 125, 150, 175, 200, 250, sequences located at the 5' and 3' ends of the nucleic acid) in 300, 350, 400 contiguous amino acids, or up to the total the genomic DNA of the organism from which the polynucle number of amino acids present in a full-length polypeptide of otide is derived. For purposes of the invention, “isolated 45 the invention. when used to refer to polynucleotides excludes isolated chro The invention also encompasses variant polynucleotides. mosomes. For example, in various embodiments, the isolated “Variants' of the polynucleotide include those sequences that glyphosate resistance-encoding polynucleotide can contain encode the polypeptides disclosed herein but that differ con less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb servatively because of the degeneracy of the genetic code, as of nucleotide sequence that naturally flanks the polynucle 50 well as those that are sufficiently identical. otide in genomic DNA of the cell from which the polynucle The term “sufficiently identical' is intended a polypeptide otide is derived. or polynucleotide sequence that has at least about 60% or Polynucleotides of the invention include those that encode 65% sequence identity, about 70% or 75% sequence identity, glyphosate-resistant polypeptides comprising SEQID NO:9, about 80% or 85% sequence identity, about 90%, 91%, 92%, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, or 35, as well as 55 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity the polynucleotide sequences of SEQID NO:6, 8, 10, 12, 14. compared to a reference sequence using one of the alignment 16, 18, 20, 22, 24, 26, 28, 30, 32, or 34. programs using standard parameters. One of skill in the art By "glyphosate' is intended any herbicidal form of will recognize that these values can be appropriately adjusted N-phosphonomethylglycine (including any salt thereof) and to determine corresponding identity of polypeptides encoded other forms that result in the production of the glyphosate 60 by two polynucleotides by taking into account codon degen anion in planta. An "herbicide resistance protein' or a protein eracy, amino acid similarity, reading frame positioning, and resulting from expression of an "herbicide resistance-encod the like. ing nucleic acid molecule' includes proteins that confer upon Bacterial genes quite often possess multiple methionine a cell the ability to tolerate a higher concentration of an initiation codons in proximity to the start of the open reading herbicide than cells that do not express the protein, or to 65 frame. Often, translation initiation at one or more of these tolerate a certain concentration of an herbicide for a longer start codons will lead to generation of a functional protein. time than cells that do not express the protein. A glyphosate These start codons can include ATG codons. However, bac US 7,834,249 B2 5 6 teria Such as Bacillus sp. also recognize the codon GTG as a otides encoding polypeptides having the sequence domain of start codon, and proteins that initiate translation at GTG the present invention to generate polypeptides that confer codons contain a methionine at the first amino acid. Further glyphosate resistance. more, it is not often determined a priori which of these codons Using methods such as PCR, hybridization, and the like are used naturally in the bacterium. Thus, it is understood that corresponding herbicide resistance sequences can be identi use of one of the alternate methionine codons may lead to fied by looking for EPSP synthase domains of the present generation of variants that confer herbicide resistance. These invention. See, for example, Sambrook and Russell (2001) herbicide resistance proteins are encompassed in the present Molecular Cloning: A Laboratory Manual (Cold Spring Har invention and may be used in the methods of the present bor Laboratory Press, Cold Spring Harbor, N.Y.) and Innis et invention. 10 al. (1990) PCR Protocols: A Guide to Methods and Applica tions (Academic Press, NY). Naturally occurring allelic variants can be identified with B. Isolated Proteins and Variants and Fragments. Thereof the use of well-known molecular biology techniques, such as Herbicide resistance polypeptides are also encompassed polymerase chain reaction (PCR) and hybridization tech within the present invention. An herbicide resistance niques as outlined below. Variant polynucleotides also 15 polypeptide that is substantially free of cellular material include synthetically derived polynucleotides that have been includes preparations of polypeptides having less than about generated, for example, by using site-directed or other 30%, 20%, 10%, or 5% (by dry weight) of non-herbicide mutagenesis strategies but which still encode the polypeptide resistance polypeptide (also referred to herein as a “contami having the desired biological activity. nating protein'). In the present invention, “herbicide resis The skilled artisan will further appreciate that changes can tance protein’ is intended an EPSP synthase polypeptide be introduced by further mutation of the polynucleotides of having the sequence domain of the invention. Fragments, the invention thereby leading to further changes in the amino biologically active portions, and variants thereof are also acid sequence of the encoded polypeptides, without altering provided, and may be used to practice the methods of the the biological activity of the polypeptides. Thus, variant iso present invention. 25 “Fragments’ or “biologically active portions' include lated polynucleotides can be created by introducing one or polypeptide fragments comprising a portion of an amino acid more additional nucleotide Substitutions, additions, or dele sequence encoding an herbicide resistance protein and that tions into the corresponding polynucleotide encoding the retains herbicide resistance activity. Abiologically active por EPSP synthase domain disclosed herein, such that one or tion of an herbicide resistance protein can be a polypeptide more amino acid substitutions, additions or deletions are 30 that is, for example, 10, 25, 50, 100 or more amino acids in introduced into the encoded polypeptide. Further mutations length. Such biologically active portions can be prepared by can be introduced by standard techniques, such as site-di recombinant techniques and evaluated for herbicide resis rected mutagenesis and PCR-mediated mutagenesis, or gene tance activity. shuffling techniques. Such variant polynucleotides are also By “variants’ is intended proteins or polypeptides having encompassed by the present invention. 35 an amino acid sequence that is at least about 60%. 65%, about Variant polynucleotides can be made by introducing muta 70%, 75%, about 80%, 85%, 90%, 91%, 92%, 93%, 94%, tions randomly along all or part of the coding sequence. Such 95%, 96%, 97%, 98% or 99% identical to an EPSP synthase as by Saturation mutagenesis, and the resultant mutants can be polypeptide having the EPSP synthase domain of the present screened for the ability to confer herbicide resistance activity invention. One of skill in the art will recognize that these to identify mutants that retain activity. 40 values can be appropriately adjusted to determine corre Gene shuffling or sexual PCR procedures (for example, sponding identity of polypeptides encoded by two polynucle Smith (1994) Nature 370:324-325; U.S. Pat. Nos. 5,837,458; otides by taking into account codon degeneracy, amino acid 5,830,721; 5,811,238; and 5,733,731, each of which is herein similarity, reading frame positioning, and the like. incorporated by reference) can be used to further modify or For example, conservative amino acid Substitutions may be enhance polynucleotides and polypeptides having the EPSP 45 made at one or more nonessential amino acid residues. A synthase domain of the present invention (for example, “nonessential amino acid residue is a residue that can be polypeptides that confer glyphosate resistance). Gene shuf altered from the wild-type sequence of a polypeptide without fling involves random fragmentation of several mutant DNAS altering the biological activity, whereas an “essential amino followed by their reassembly by PCR into full length mol acid residue is required for biological activity. A "conserva ecules. Examples of various gene shuffling procedures 50 tive amino acid substitution' is one in which the amino acid include, but are not limited to, assembly following DNase residue is replaced with an amino acid residue having a simi treatment, the staggered extension process (STEP), and ran lar side chain. Families of amino acid residues having similar dom priming in vitro recombination. In the DNase mediated side chains have been defined in the art. These families method, DNA segments isolated from a pool of positive include amino acids with basic side chains (e.g., lysine, argi mutants are cleaved into random fragments with DNaseI and 55 nine, histidine), acidic side chains (e.g., aspartic acid, subjected to multiple rounds of PCR with no added primer. glutamic acid), uncharged polar side chains (e.g., glycine, The lengths of random fragments approach that of the asparagine, glutamine, serine, threonine, , cysteine), uncleaved segment as the PCR cycles proceed, resulting in nonpolar side chains (e.g., alanine, Valine, leucine, isoleu mutations in different clones becoming mixed and accumu cine, proline, , methionine, ), beta lating in Some of the resulting sequences. Multiple cycles of 60 branched side chains (e.g., threonine, Valine, isoleucine) and selection and shuffling have led to the functional enhance aromatic side chains (e.g., tyrosine, phenylalanine, tryp ment of several enzymes (Stemmer (1994) Nature 370:389 tophan, histidine). Amino acid Substitutions may be made in 391; Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747 nonconserved regions that retain function. In general. Such 10751; Crameri et al. (1996) Nat. Biotechnol. 14:315-319; substitutions would not be made for conserved amino acid Zhanget al. (1997) Proc. Natl. Acad. Sci. USA 94.4504-4509: 65 residues, or for amino acid residues residing within a con and Cramerietal. (1997) Nat. Biotechnol. 15:436-438). Such served motif, where such residues are essential for polypep procedures could be performed, for example, on polynucle tide activity. However, one of skill in the art would understand US 7,834,249 B2 7 8 that functional variants may have minor conserved or non (U.S. Pat. No. 5,850,019); the 35S promoter from cauliflower conserved alterations in the conserved residues. mosaic virus (CaMV) (Odell et al. (1985) Nature 313:810 Antibodies to the polypeptides of the present invention, or 812); promoters of Chlorella virus methyltransferase genes to variants or fragments thereof, are also encompassed. Meth (U.S. Pat. No. 5,563.328) and the full-length transcript pro ods for producing antibodies are well known in the art (see, moter from figwort mosaic virus (FMV) (U.S. Pat. No. 5,378, for example, Harlow and Lane (1988) Antibodies: A Labora 619); the promoters from such genes as rice actin (McElroy et tory Manual, Cold Spring Harbor Laboratory, Cold Spring al. (1990) Plant Cell 2:163-171); ubiquitin (Christensen et al. Harbor, N.Y.; U.S. Pat. No. 4,196,265). (1989) Plant Mol. Biol. 12:619-632 and Christensen et al. C. Polynucleotide Constructs (1992) Plant Mol. Biol. 18:675-689); pEMU (Last et al. The polynucleotides encoding the EPSP synthase polypep 10 (1991) Theor. Appl. Genet. 81:581–588); MAS (Velten et al. tides of the present invention may be modified to obtain or (1984) EMBO.J. 3:2723-2730); maize H3 histone (Lepetit et enhance expression in plant cells. The polynucleotides al. (1992) Mol. Gen. Genet. 231:276-285 and Atanassova et encoding the polypeptides identified herein may be provided al. (1992) Plant J. 2(3):291-300); Brassica napus ALS3 (PCT in expression cassettes for expression in the plant of interest. application WO 97/41228); and promoters of various Agro A “plant expression cassette' includes a DNA construct, 15 bacterium genes (see U.S. Pat. Nos. 4,771,002; 5,102,796: including a recombinant DNA construct, that is capable of 5,182,200; and 5,428,147). resulting in the expression of a polynucleotide in a plant cell. Suitable inducible promoters for use in plants include: the The cassette can include in the 5'-3' direction of transcription, promoter from the ACE 1 system which responds to copper a transcriptional initiation region (i.e., promoter, particularly (Mett et al. (1993) PNAS 90:4567-4571): the promoter of the a heterologous promoter) operably-linked to one or more maize In2 gene which responds to benzenesulfonamide her polynucleotides of interest, and/or a translation and transcrip bicide safeners (Hershey et al. (1991) Mol. Gen. Genetics tional termination region (i.e., termination region) functional 227:229-237 and Gatz et al. (1994) Mol. Gen. Genetics 243: in plants. The cassette may additionally contain at least one 32-38); and the promoter of the Tet repressor from Tn 10 (Gatz additional polynucleotide to be introduced into the organism, et al. (1991) Mol. Gen. Genet. 227:229-237). Another induc Such as a selectable marker gene. Alternatively, the additional 25 ible promoter for use in plants is one that responds to an polynucleotide(s) can be provided on multiple expression inducing agent to which plants do not normally respond. An cassettes. Such an expression cassette is provided with a exemplary inducible promoter of this type is the inducible plurality of restriction sites for insertion of the promoter from a steroid hormone gene, the transcriptional polynucleotide(s) to be under the transcriptional regulation of activity of which is induced by a glucocorticosteroid hormone the regulatory regions. 30 (Schena et al. (1991) Proc. Natl. Acad. Sci. USA 88:10421) or “Heterologous' generally refers to the polynucleotide or the recent application of a chimeric transcription activator, polypeptide that is not endogenous to the cellor is not endog XVE, for use in an estrogen receptor-based inducible plant enous to the location in the native genome in which it is expression system activated by estradiol (Zuo et al. (2000) present, and has been added to the cell by infection, transfec Plant J, 24:265-273). Other inducible promoters for use in tion, microinjection, electroporation, microprojection, or the 35 plants are described in EP 332104, PCT WO 93/21334 and like. By “operably linked' is intended a functional linkage PCT WO97/06269 which are herein incorporated by refer between two polynucleotides. For example, when a promoter ence in their entirety. Promoters composed of portions of is operably linked to a DNA sequence, the promoter sequence other promoters and partially or totally synthetic promoters initiates and mediates transcription of the DNA sequence. It is can also be used. See, e.g., Nietal. (1995) Plant J. 7:661-676 recognized that operably linked polynucleotides may or may 40 and PCT WO95/14098 describing such promoters for use in not be contiguous and, where used to reference the joining of plants. two polypeptide coding regions, the polypeptides are The promoter may include, or be modified to include, one expressed in the same reading frame. or more enhancer elements. In some embodiments, the pro The promoter may be any polynucleotide sequence which moter may include a plurality of enhancer elements. Promot shows transcriptional activity in the chosen plant cells, plant 45 ers containing enhancer elements provide for higher levels of parts, or plants. The promoter may be native or analogous, or transcription as compared to promoters that do not include foreign or heterologous, to the plant host and/or to the DNA them. Suitable enhancer elements for use in plants include the sequence of the invention. Where the promoter is “native' or PC1SV enhancer element (U.S. Pat. No. 5,850,019), the “analogous' to the plant host, it is intended that the promoter CaMV 35S enhancer element (U.S. Pat. Nos. 5,106.739 and is found in the native plant into which the promoter is intro 50 5,164.316) and the FMV enhancer element (Maiti et al. duced. Where the promoter is “foreign' or "heterologous' to (1997) Transgenic Res. 6:143-156). See also PCT WO the DNA sequence of the invention, it is intended that the 96723898. promoter is not the native or naturally occurring promoter for Often, such constructs can contain 5' and 3' untranslated the operably linked DNA sequence of the invention. The regions. Such constructs may contain a 'signal sequence” or promoter may be inducible or constitutive. It may be natu 55 “leader sequence' to facilitate co-translational or post-trans rally-occurring, may be composed of portions of various lational transport of the peptide of interest to certain intrac naturally-occurring promoters, or may be partially or totally ellular structures such as the chloroplast (or other plastid), synthetic. Guidance for the design of promoters is provided endoplasmic reticulum, or Golgi apparatus, or to be secreted. by studies of promoter structure, such as that of Harley and For example, the construct can be engineered to contain a Reynolds (1987) Nucleic Acids Res. 15:2343-2361. Also, the 60 signal peptide to facilitate transfer of the peptide to the endo location of the promoter relative to the transcription start may plasmic reticulum. By "signal sequence' is intended a be optimized. See, e.g., Roberts et al. (1979) Proc. Natl. Acad. sequence that is known or Suspected to result in cotransla Sci. USA, 76:760-764. Many suitable promoters for use in tional or post-translational peptide transport across the cell plants are well known in the art. membrane. In eukaryotes, this typically involves secretion For instance, Suitable constitutive promoters for use in 65 into the Golgi apparatus, with Some resulting glycosylation. plants include: the promoters from plant viruses. Such as the By “leader sequence' is intended any sequence that, when peanut chlorotic streak caulimovirus (PC1SV) promoter translated, results in an amino acid sequence Sufficient to US 7,834,249 B2 10 trigger co-translational transport of the peptide chain to a al. (1991) Plant Mol. Biol. Rep. 9:104-126: Clarket al. (1989) Sub-cellular organelle. Thus, this includes leader sequences J. Biol. Chem. 264:17544-17550; Della-Cioppa et al. (1987) targeting transport and/or glycosylation by passage into the Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. endoplasmic reticulum, passage to vacuoles, plastids includ Biophys. Res. Commun. 196:1414-1421; and Shah et al. ing chloroplasts, mitochondria, and the like. It may also be (1986) Science 233:478-481. preferable to engineer the plant expression cassette to contain The polynucleotides of interest to be targeted to the chlo an intron, such that mRNA processing of the intronis required roplast may be optimized for expression in the chloroplast to for expression. account for differences in codon usage between the plant By 3' untranslated region' is intended a polynucleotide nucleus and this organelle. In this manner, the polynucle located downstream of a coding sequence. Polyadenylation 10 otides of interest may be synthesized using chloroplast-pre signal sequences and other sequences encoding regulatory ferred codons. See, for example, U.S. Pat. No. 5,380,831, signals capable of affecting the addition of polyadenylic acid herein incorporated by reference. tracts to the 3' end of the mRNA precursor are 3' untranslated This plant expression cassette can be inserted into a plant regions. By '5' untranslated region' is intended a polynucle transformation vector. By “transformation vector” is intended otide located upstream of a coding sequence. 15 a DNA molecule that allows for the transformation of a cell. Other upstream or downstream untranslated elements Such a molecule may consist of one or more expression include enhancers. Enhancers are polynucleotides that act to cassettes, and may be organized into more than one vector increase the expression of a promoter region. Enhancers are DNA molecule. For example, binary vectors are plant trans well known in the art and include, but are not limited to, the formation vectors that utilize two non-contiguous DNA vec SV40 enhancer region and the 35S enhancer element. tors to encode all requisite cis- and trans-acting functions for The termination region may be native with the transcrip transformation of plant cells (Hellens and Mullineaux (2000) tional initiation region, may be native with the sequence of the Trends in Plant Science 5:446-451). "Vector refers to a poly present invention, or may be derived from another source. nucleotide construct designed for transfer between different Convenient termination regions are available from the Ti host cells. “Expression vector” refers to a vector that has the plasmid of A. tumefaciens, such as the octopine synthase and 25 ability to incorporate, integrate and express heterologous nopaline synthase termination regions. See also Guerineau et DNA sequences or fragments in a foreign cell. al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) The plant transformation vector comprises one or more Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141 DNA vectors for achieving plant transformation. For 149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et example, it is a common practice in the art to utilize plant al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic 30 transformation vectors that comprise more than one contigu Acids Res. 17:7891-7903; and Joshietal. (1987) Nucleic Acid ous DNA segment. These vectors are often referred to in the Res. 15:9627-9639. art as binary vectors. Binary vectors as well as vectors with In one aspect of the invention, synthetic DNA sequences helper plasmids are most often used for Agrobacterium-me are designed for a given polypeptide. Such as the polypeptides diated transformation, where the size and complexity of DNA of the invention. Expression of the open reading frame of the 35 segments needed to achieve efficient transformation is quite synthetic DNA sequence in a cell results in production of the large, and it is advantageous to separate functions onto sepa polypeptide of the invention. Synthetic DNA sequences can rate DNA molecules. Binary vectors typically contain a plas be useful to simply remove unwanted restriction endonu mid vector that contains the cis-acting sequences required for clease sites, to facilitate DNA cloning strategies, to alter or T-DNA transfer (such as left border and right border), a remove any potential codon bias, to alter or improve GC 40 selectable marker that is engineered to be capable of expres content, to remove or alter alternate reading frames, and/or to sion in a plant cell, and a “polynucleotide of interest” (a alter or remove intron/exon splice recognition sites, polyade polynucleotide engineered to be capable of expression in a nylation sites, Shine-Delgarno sequences, unwanted pro plant cell for which generation of transgenic plants is moter elements and the like that may be present in a native desired). Also present on this plasmid vector are sequences DNA sequence. It is also possible that synthetic DNA 45 required for bacterial replication. The cis-acting sequences sequences may be utilized to introduce other improvements to are arranged in a fashion to allow efficient transfer into plant a DNA sequence, such as introduction of an intron sequence, cells and expression therein. For example, the selectable creation of a DNA sequence that in expressed as a protein marker sequence and the sequence of interest are located fusion to organelle targeting sequences, such as chloroplast between the left and right borders. Often a second plasmid transit peptides, apoplast/vacuolar targeting peptides, or pep 50 vector contains the trans-acting factors that mediate T-DNA tide sequences that result in retention of the resulting peptide transfer from Agrobacterium to plant cells. This plasmid in the endoplasmic reticulum. Synthetic genes can also be often contains the virulence functions (Vir genes) that allow synthesized using host cell-preferred codons for improved infection of plant cells by Agrobacterium, and transfer of expression, or may be synthesized using codons at a host DNA by cleavage at border sequences and vir-mediated DNA preferred codon usage frequency. See, for example, Campbell 55 transfer, as is understood in the art (Hellens and Mullineaux and Gowri (1990) Plant Physiol. 92:1-11; U.S. Pat. Nos. (2000) Trends in Plant Science, 5:446-451). Several types of 6,320,100; 6,075,185; 5,380,831; and 5,436,391, U.S. Pub Agrobacterium strains (e.g., LBA4404, GV3101, EHA101, lished Application Nos. 2004.0005600 and 20010003849, and EHA105, etc.) can be used for plant transformation. The Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein second plasmid vector is not necessary for introduction of incorporated by reference. 60 polynucleotides into plants by other methods such as micro In one embodiment, the polynucleotides of interest are projection, microinjection, electroporation, polyethylene targeted to the chloroplast for expression. In this manner, glycol, etc. where the polynucleotide of interest is not directly inserted D. Plant Transformation into the chloroplast, the expression cassette will additionally Methods of the invention involve introducing a nucleotide contain a polynucleotide encoding a transit peptide to direct 65 construct into a plant. By “introducing is intended to present the nucleotide of interest to the chloroplasts. Such transit to the plant the nucleotide construct in Such a manner that the peptides are known in the art. See, for example, Von Heijneet construct gains access to the interior of a cell of the plant. The US 7,834,249 B2 11 12 methods of the invention do not require that a particular example, McCormicket al. (1986) Plant Cell Reports 5:81– method for introducing a nucleotide construct to a plant is 84. These plants may then be grown, and either pollinated used, only that the nucleotide construct gains access to the with the same transformed strain or different strains, and the interior of at least one cell of the plant. Methods for introduc resulting hybrid having constitutive expression of the desired ing nucleotide constructs into plants are known in the art phenotypic characteristic identified. Two or more generations including, but not limited to, stable transformation methods, may be grown to ensure that expression of the desired phe transient transformation methods, and virus-mediated meth notypic characteristic is stably maintained and inherited and ods. then seeds harvested to ensure expression of the desired phe In general, plant transformation methods involve transfer notypic characteristic has been achieved. In this manner, the ring heterologous DNA into target plant cells (e.g. immature 10 present invention provides transformed seed (also referred to or mature embryos, Suspension cultures, undifferentiated cal as “transgenic seed') having a nucleotide construct of the lus, protoplasts, etc.), followed by applying a maximum invention, for example, an expression cassette of the inven threshold level of appropriate selection (depending on the tion, stably incorporated into their genome. selectable marker gene and in this case "glyphosate') to E. Evaluation of Plant Transformation recover the transformed plant cells from a group of untrans 15 Following introduction of heterologous foreign DNA into formed cell mass. Explants are typically transferred to a fresh plant cells, the transformation or integration of the heterolo Supply of the same medium and cultured routinely. Subse gous gene in the plant genome is confirmed by various meth quently, the transformed cells are differentiated into shoots ods such as analysis of nucleic acids, proteins and metabolites after placing on regeneration medium Supplemented with a associated with the integrated gene. maximum threshold level of selecting agent (e.g. 'glypho PCR analysis is a rapid method to screen transformed cells, sate'). The shoots are then transferred to a selective rooting tissue or shoots for the presence of incorporated gene at the medium for recovering rooted shoot or plantlet. The trans earlier stage before transplanting into the Soil (Sambrook and genic plantlet then grow into mature plants and produce fer Russell (2001) Molecular Cloning: A Laboratory Manual tile seeds (e.g. Hieietal. (1994) The Plant Journal 6:271-282; (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, Ishida et al. (1996) Nature Biotechnology 14:745-750). 25 N.Y.)). PCR is carried out using oligonucleotide primers spe Explants are typically transferred to a fresh supply of the cific to the gene of interest or Agrobacterium vector back same medium and cultured routinely. A general description of ground, etc. the techniques and methods for generating transgenic plants Plant transformation may be confirmed by Southern blot are found in Ayres and Park (1994) Critical Reviews in Plant analysis of genomic DNA (Sambrook and Russell (2001) Science 13:219-239 and Bommineni and Jauhar (1997) May 30 supra). In general, total DNA is extracted from the transfor dica 42:107-120. Since the transformed material contains mant, digested with appropriate restriction enzymes, frac many cells; both transformed and non-transformed cells are tionated in anagarose geland transferred to a nitrocellulose or present in any piece of Subjected target callus or tissue or nylon membrane. The membrane or “blot' can then be probed group of cells. The ability to kill non-transformed cells and with, for example, radiolabeled P target DNA fragment to allow transformed cells to proliferate results in transformed 35 confirm the integration of the introduced gene in the plant plant cultures. Often, the ability to remove non-transformed genome according to standard techniques (Sambrook and cells is a limitation to rapid recovery of transformed plant Russell, 2001, supra). cells and Successful generation of transgenic plants. Molecu In Northern analysis, RNA is isolated from specific tissues lar and biochemical methods can be used to confirm the of transformant, fractionated in a formaldehyde agarose gel. presence of the integrated heterologous gene of interest in the 40 and blotted onto a nylon filter according to standard proce genome of transgenic plant. dures that are routinely used in the art (Sambrook and Russell Generation of transgenic plants may be performed by one (2001) supra). Expression of RNA encoded by grg sequences of several methods, including, but not limited to, introduction of the invention is then tested by hybridizing the filter to a of heterologous DNA by Agrobacterium into plant cells radioactive probe derived from a GDC by methods known in (Agrobacterium-mediated transformation), bombardment of 45 the art (Sambrook and Russell (2001) supra) plant cells with heterologous foreign DNA adhered to par Western blot and biochemical assays and the like may be ticles, and various other non-particle direct-mediated meth carried out on the transgenic plants to determine the presence ods (e.g. Hiei et al. (1994) The Plant Journal 6:271-282: of protein encoded by the herbicide resistance gene by stan Ishida et al. (1996) Nature Biotechnology 14:745-750; Ayres dard procedures (Sambrook and Russell (2001) supra) using and Park (1994) Critical Reviews in Plant Science 13:219 50 antibodies that bind to one or more epitopes present on the 239; Bommineni and Jauhar (1997) Maydica 42:107-120) to herbicide resistance protein. transfer DNA. In one aspect of the invention, the grg genes described Methods for transformation of chloroplasts are known in herein are useful as markers to assess transformation of bac the art. See, for example, Svab et al. (1990) Proc. Natl. Acad. terial or plant cells. Sci. USA 87:8526-8530; Svab and Maliga (1993) Proc. Natl. 55 F. Plants and Plant Parts Acad. Sci. USA 90:913-917; Svab and Maliga (1993) EMBO By plant' is intended whole plants, plant organs (e.g., J. 12:601-606. The method relies on particle gun delivery of leaves, stems, roots, etc.), seeds, plant cells, propagules, DNA containing a selectable marker and targeting of the embryos and progeny of the same. Plant cells can be differ DNA to the plastid genome through homologous recombina entiated or undifferentiated (e.g., callus, Suspension culture tion. Additionally, plastid transformation can be accom 60 cells, protoplasts, leaf cells, root cells, phloem cells, pollen). plished by transactivation of a silent plastid-borne transgene The present invention may be used for introduction of poly by tissue-preferred expression of a nuclear-encoded and plas nucleotides into any plant species, including, but not limited tid-directed RNA polymerase. Such a system has been to, monocots and dicots. Examples of plants of interest reported in McBride et al. (1994) Proc. Natl. Acad. Sci. USA include, but are not limited to, corn (maize), Sorghum, wheat, 91:7301-73.05. 65 Sunflower, tomato, crucifers, peppers, potato, cotton, rice, The cells that have been transformed may be grown into Soybean, Sugarbeet, Sugarcane, tobacco, barley, and oilseed plants in accordance with conventional ways. See, for rape, Brassica sp., alfalfa, rye, millet, safflower, peanuts, US 7,834,249 B2 13 14 Sweet potato, cassava, coffee, coconut, pineapple, citrustrees, spread of weeds or other untransformed plants without sig cocoa, tea, banana, avocado, fig, guava, mango, olive, papaya, nificantly affecting the glyphosate-resistant plant or plant cashew, macadamia, almond, oats, vegetables, ornamentals, seed. Such effective concentrations for herbicides of interest and conifers. are generally known in the art. The herbicide may be applied Vegetables include, but are not limited to, tomatoes, let either pre- or post emergence in accordance with usual tech tuce, greenbeans, lima beans, peas, and members of the genus niques for herbicide application to fields comprising plants or Curcunnis Such as cucumber, cantaloupe, and musk melon. plant seeds which have been rendered resistant to the herbi Ornamentals include, but are not limited to, azalea, hydran cide. gea, hibiscus, roses, tulips, daffodils, petunias, carnation, I. Temperature Spectrum poinsettia, and chrysanthemum. Crop plants are also of inter 10 Several studies of glyphosate metabolism in plants have est, including, for example, maize, Sorghum, wheat, Sun been carried out, and reveal that glyphosate is not metabo flower, tomato, crucifers, peppers, potato, cotton, rice, Soy lized by plants or is metabolized very slowly. Glyphosate bean, Sugarbeet, Sugarcane, tobacco, barley, oilseed rape, etc. penetrates the cuticle rapidly, and is translocated throughout This invention is suitable for any member of the monocot plants over a considerable period of time (reviewed in Kear plant family including, but not limited to, maize, rice, barley, 15 ney and Kaufman, Eds (1988) Herbicides, Chemistry, Deg oats, wheat, Sorghum, rye, Sugarcane, pineapple, yams, radation & Mode of Action Marcel Dekker, Inc., New York, onion, banana, coconut, and dates. 3:1-70 and Grossbard and Atkinson, Eds. (1985) The Herbi G. Methods for Increasing Plant Yield cide Glyphosate Butterworths, London, p. 25-34). Thus, it is Methods for increasing plant yield are provided. The meth likely that glyphosate tolerance is necessary over a Sustained ods comprise introducing into a plant or plant cell a poly period of time following glyphosate exposure in agronomi nucleotide comprising a grg sequence disclosed herein. As cally-important plants. Where temperatures frequently defined herein, the "yield of the plant refers to the quality exceed 30° C. during the growing season, it would be advan and/or quantity of biomass produced by the plant. By “bio tageous to employ a glyphosate-tolerance EPSP synthase that mass” is intended any measured plant . An increase in maintains activity at elevated temperatures. biomass production is any improvement in the yield of the 25 In one embodiment of the present invention, the EPSP measured plant product. Increasing plant yield has several synthase exhibits thermal stability at a temperature that is commercial applications. For example, increasing plant leaf higher or lower than ambient environmental temperature. By biomass may increase the yield of leafy vegetables for human “thermal stability” is intended that the enzyme is active at a oranimal consumption. Additionally, increasing leafbiomass higher or lower temperature than ambient environmental tem can be used to increase production of plant-derived pharma 30 perature for a longer period of time than an EPSP synthase ceutical or industrial products. An increase in yield can com that is not thermal stable at that temperature. For example, a prise any statistically significant increase including, but not thermal stable EPSP synthase has enzymatic activity for limited to, at least a 1% increase, at least a 3% increase, at greater than about 1 hour, greater than about 2 hours, about 3. least a 5% increase, at least a 10% increase, at least a 20% about 4, about 5, about 6, about 7, about 8, about 9, about 10, increase, at least a 30%, at least a 50%, at least a 70%, at least 35 about 11, about 12, about 13, about 14, about 15, about 20, a 100% or a greater increase. about 25 hours, or longer, at a temperature that is higher or In specific methods, the plant is treated with an effective lower than ambient environmental temperature. For the pur concentration of an herbicide, where the herbicide applica poses of the present invention, “ambient' environmental tem tion results in enhanced plant yield. By “effective concentra perature is about 30°C. In some embodiments, a higher than tion' is intended the concentration which allows the increased 40 ambient temperature is a temperature at or above about 32° yield in the plant. Such effective concentrations for herbicides C., about 34° C., about 37° C., about 40° C., about 45° C., of interest are generally known in the art. The herbicide may about 50° C., about 55° C., about 60° C., about 65° C., or be applied either pre- or post emergence in accordance with higher. A lower than ambient temperature is a temperature at usual techniques for herbicide application to fields compris or below about 28°C., below about 27°C., about 26°C., ing crops which have been rendered resistant to the herbicide 45 about 25°C., about 23°C., about 20°C., about 18°C., about by heterologous expression of a grg gene of the invention. 15° C., about 10°C., at or below about 5°C., or around 0°C. Methods for conferring herbicide resistance in a plant or Methods to assay for EPSP synthase activity are discussed in plant part are also provided. In such methods, a grg poly further details elsewhere herein. For the purposes of the nucleotide disclosed herein is introduced into the plant, present invention, a thermal stable EPSP synthase is consid wherein expression of the polynucleotide results in glypho 50 ered active when it functions at about 90% to 100%, about sate tolerance or resistance. Plants produced via this method 80% to about 90%, about 70% to about 80%, about 60% to can be treated with an effective concentration of an herbicide about 70% or about 50% to about 60% of the maximum and display an increased tolerance to the herbicide. An "effec activity level observed at the optimum temperature for that tive concentration' of an herbicide in this application is an enzyme. amount Sufficient to slow or stop the growth of plants or plant 55 Thus, provided herein are methods and compositions for parts that are not naturally resistant or rendered resistant to the increasing glyphosate tolerance at temperatures higher than herbicide. ambient environmental temperatures. In one embodiment, H. Methods of Controlling Weeds in a Field the methods comprise introducing into a plant a nucleotide Methods for selectively controlling weeds in a field con sequence encoding the glyphosate tolerance EPSP synthase taining a plant are also provided. In one embodiment, the 60 enzyme set forth in SEQID NO:8, 10, 14, 28, 30, 32, or 34, plant seeds or plants are glyphosate resistant as a result of a and growing the plant at a temperature that is higher than grg polynucleotide disclosed herein being inserted into the ambient environmental temperature. In specific embodi plant seed or plant. In specific methods, the plant is treated ments, the growing temperature is higher than ambient tem with an effective concentration of an herbicide, where the perature for an average of at least about 2 hours per day, at herbicide application results in a selective control of weeds or 65 least about 3 hours per day, at least about 4 hours per day, at other untransformed plants. By “effective concentration' is least about 5, about 6, about 7, about 8, about 9, about 10, intended the concentration which controls the growth or about 11, about 12, about 14, about 16, about 18, about 20, at US 7,834,249 B2 15 16 least about 22 hours per day, or up to about 24 hours a day MgCl, 55 mM glucose, 25 mg/liter L-proline, 10 mg/liter during the growing season of the plant. thiamine HCl, sufficient NaOH to adjust the pH to 7.0, and 15 In another embodiment, the method comprises introducing g/liter agar. The plates were incubated for 36 hours at 37°C. into a plant a nucleotide sequence encoding the glyphosate Individual colonies were picked and arrayed into 384-well tolerant EPSP synthase enzyme set forthin SEQID NO:8, 10, plates. Two 384-well plates were created in this manner. A 14, 28, 30, 32, or 34, contacting the plant with an herbicidally third plate of 384 clones was picked from colonies that grown effective concentration of glyphosate, and growing the plant on plates lacking glyphosate. at a temperature that exceeds ambient environmental tem perature for at least 1 hour, at least about 2 hours, at least Example 3 about 3, at least about 4, or more hours per day for at least 2, 10 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or Isolation and Analysis of Glyphosate Resistant more days after glyphosate is applied to the plant, wherein the GRG23 Variants days in which the temperature exceeds ambient environmen tal temperature occur during the growing season of the plant. BL21*DE3 cells transformed with mutagenized syngrg23 The following examples are offered by way of illustration 15 and/or grg23 variants were identified by growth on glypho and not by way of limitation. sate plates. Extracts of mutagenized Syngrg23 and grgr23 variants were prepared and assayed for improved enzymatic EXPERIMENTAL activity. Colonies identified on glyphosate plates were pinned Example 1 into 96-well blocks containing LB medium and were grown to an O.D. of about 0.6. IPTG was then added (0.5 mM) and Syngrg23 Design and Expression the blocks were incubated overnight at 20° C. to induce pro tein expression. Protein extracts were prepared from the cell pellets using POP culture reagent (Novagen) and Lysonase GRG23 (SEQID NO:2) is an EPSP synthase that possesses (Novagen), and the enzymatic activity in the crude lysates excellent kinetic values for Kim, Ki and Vmax (U.S. Patent 25 Application No. 60/741,166 filed Dec. 1, 2005). A novel gene was measured after heating the extracts for 30 min at 37° C. sequence encoding the GRG23 protein (U.S. Patent Applica Extracts with activity greater than two standard deviations above the mean of a set of extracts containing the appropriate tion No. 60/741,166 filed Dec. 1, 2005) was designed and control protein (for example GRG23) were selected for fur synthesized. The resulting sequence is provided herein as ther analysis. SEQID NO:6, and is herein designated 'syngrg23. The open 30 reading frame was cloned into the expression vector pRSF1b Clones showing increased activity after incubation as (Invitrogen) by methods known in the art. crude extracts were grown in 250 mLLB cultures, and protein The Syngrg23 gene encoding GRG23 was cloned into a expression induced with IPTG. Following induction, the pUC19 vector to create paX748. PCR primers that flanked mutant GRG23 protein was purified from each culture by Syngrg23 in this vector were used to amplify Syngrg23 from 35 affinity chromatography using a cobalt resin (Novagen). The pAX748 using the Mutazyme II system (Stratagene) to intro purified proteins were then tested for enzymatic activity fol duce random mutations into the Syngrg23 coding region. The lowing heating for 0, 2, 4, and approximately 16 hours at 37° template was diluted 1:50 in the error-prone PCR reaction, C. and amplification was carried out for 30 cycles. The resulting PCR product was digested with the restriction enzymes 40 Example 4 BamH I and Sgs I, gel-purified, and ligated into the vector pRSF 1b to create a mutagenized syngrg23 library. Improved GRG23 Variants The mutagenized Syngrg23 libraries were transformed into E. coli strain BL21*DE3 star (Invitrogen). Following trans From a DNA library of mutagenized syngrg23, several formation, individual colonies were plated on 1XM63 45 clones with improved activity at 37°C. were identified. The medium containing 150 mM glyphosate to select for clones DNA sequences of the clones corresponding to these extracts that had retained enzymatic activity and growth tolerance. was determined. Table 1 shows the amino acid changes iden tified in six variants of GRG23 that retained glyphosate resis Example 2 tance: grg23(L3P1.B20) (SEQ ID NO:20) encoding the 50 amino acid sequence GRG23(L3P1.B20) (SEQ ID NO:21), Screening for Glyphosate Resistance on Plates grg23(L3P1.B3) (SEQ ID NO:22) encoding the amino acid sequence grg23(L3P1B3) (SEQ ID NO:23); GRG23 Library ligations were transformed into BL21*DE3 com (L3P1.F18) (SEQ ID NO:24) encoding the amino acid petent E. coli cells (Invitrogen). The transformations were sequence GRG23(L3P1.F18) (SEQ ID NO:25); and, grg23 performed according to the manufacturers instructions with 55 (L3P1.O23) (SEQ ID NO:26) encoding the amino acid the following modifications. After incubation for 1 hour at sequence GRG23(L3P1.O23) (SEQID NO:27). 37°C. in SOC medium, the cells were sedimented by cen trifugation (5 minutes, 1000xg, 4°C.). The cells were washed TABLE 1 with 1 ml M63+, centrifuged again, and the Supernatant decanted. The cells were washed a second time with 1 ml 60 Mutations identified in glyphosate-resistant GRG23 variants M63+ and resuspended in 200 ul M63+. For selection of mutant GRG23 enzymes conferring gly Clone Amino Acid (AA) in GRG23 phosate resistance in E. coli, the cells were plated onto M63+ L3P1B2O V2O6-e-I L3P1B3 D75->H, E217->K agar medium plates containing 150 mM glyphosate, 0.05 mM L3P1F18 T274-e-I IPTG (isopropyl-beta-D-thiogalactopyranoside), and 50 65 L3P1O23 RS-B ug/mlkanamycin. M63+ medium contains 100 mMKHPO, 15 mM (NH)SO 50 uM CaCl, 1 uM FeSO 50 uM US 7,834,249 B2 17 18 The clones were grown in 250 mL LB cultures, and protein expression induced isolated as described above. The purified TABLE 3 proteins were then tested for enzymatic activity following heating for 0, 2, 4, and approximately 16 hours at 37°C. One Kinetics of GRG23(ACE1) vs GRG23 of the clones, termed "M5', was found to retain an increased 5 Km Vmax proportion of its enzymatic activity after prolonged incuba (uM) Ki (uM) (nmol/minjug) tion at 37°C. (Table 2). The DNA sequence of this clones was GRG-23 12.2 13,800 14.77 determined, and the gene is designated herein as grg23(ace1) GRG23(ACE1) 9.7 14,620 13.73 (SEQ ID NO:8). The protein expressed from grg23(ace1) is 10 designated GRG23(ACE1) (SEQ ID NO:9). Example 6 TABLE 2 Identification of grg23 (ace2) Half-life of GRG23(ACE1) vs GRG23 at elevated temperature 15 Half-life at 37° C. Protein (hours) GRG23(ACE1) contains two amino acid changes relative to GRG23. To determine if additional substitutions at these GRG23 7 positions could further improve activity, a DNA library was GRG23(ACE1) 15.5 generated that resulted in clones expressing proteins that were substantially mutated at positions 49 and 276 corre GRG23 (ACE1) contains 2 amino acid substitutions rela sponding to GRG23 (SEQID NO:2). Clones conferring gly tive to wild-type GRG23 protein: A49->T and S276->T. The phosate resistance were selected by growth on glyphosate pRSF 1b vector that contains this gene is designated plates, and grown and assayed for kinetic properties as pAX3801. FIG. 1 shows the relative stability of GRG23 25 described. (ACE1) vs GRG23 at elevated temperatures. Surprisingly, one clone, herein designated grg23(ace2) (SEQID NO:10), encoding the GRG23(ACE2) protein (SEQ Example 5 ID NO:11) was identified as having improved thermostabil 30 ity. The DNA sequence of grg23(ace2) shows that GRG23 Determination of EPSPS Activity of GRG-23 (ACE2) contains a single amino acid change (residue 276 of Variants GRG23 to arginine).

Extracts containing GRG23 variant proteins were assayed Example 7 35 for EPSP synthase activity as described in U.S. Patent Appli Comparison of GRG23 and GRG51, and cation No. 60/741,166 filed Dec. 1, 2005, hereinincorporated Mutagenesis of Differing Residues by reference in its entirety. Assays were carried out in a final volume of 50 ul containing 0.5 mM shikimate-3-phosphate, Two libraries were generated to assess the permutations of 200 uM phosphoenolpyruvate (PEP), 1 U/ml Xanthine oxi 40 amino acid sequences possible from comparison of the amino dase, 2 U/ml nucleoside phosphorylase, 2.25 mMinosine, 1 acid sequences of GRG23 and GRG51. The first library intro U/ml horseradish peroxidase, 2 mM glyphosate, 50 mM duced variation from the GRG51 amino acid sequence into a HEPES/KOH pH 7.0, 100 mM KC1, and AMPLEX(R) Red grg23(ace2) coding region. The second library introduced the (Invitrogen) according to the manufacturers instructions. variation from GRG23(ACE2) amino acid sequence into the Extracts were incubated with all assay components except 45 grg51 coding region. shikimate-3-phosphate for 5 minutes at room temperature, Clones of the resulting libraries were assessed for (1) abil and assays were started by adding shikimate-3-phosphate. ity to confer glyphosate resistance upon on a cell, and (2) EPSP synthase activity was measured using a Spectramax activity after prolonged incubation at 37° C. A total often Gemini XPS fluorescence spectrometer (Molecular Dynam 50 clones was sequenced and analyzed in more detail. One par ics, excitation: 555 mm; emission: 590 nm). ticular clone, herein designated grg51.4 (SEQ ID NO:12), A full determination of kinetic parameters was performed encoding the protein GRG51.4 (SEQ ID NO:13), contains on purified protein as previously described (U.S. Patent several amino acid changes relative to both GRG23(ACE2) Application No. 60/741,166 filed Dec. 1, 2005), adjusting for and GRG51. The amino acid changes present in GRG51.4 the quantity of protein determined by Bradford assay as 55 relative to GRG23 (ACE2) were subsequently introduced into known in the art. For any one glyphosate concentration, EPSP the grg23(ace2) gene, to yield grg23(ace3) (SEQID NO:14), synthase activity was measured as a function of a broad range which encodes the GRG23(ACE3) protein (SEQID NO:15). of PEP concentrations. The data were fit to the Michaelis GRG23(ACE3) exhibits superior activity and thermostability Menten equation using KALEIDAGRAPHR software (Syn 60 relative to GRG23, and GRG23(ACE2). ergy Software) and used to determine the K (Kapparent) of GRG23(ace1) was mutagenized, and clones were tested to the EPSP synthase at that glyphosate concentration. K. identify clones expressing variants with improved thermosta apparent values were determined at no fewer than four gly bility and/or activity. One clone, grg23 (L5P2.J2) (SEQ ID phosate concentrations, and the K, of the EPSPS for glypho NO:16), encoding GRG23(L5P2.J2) (SEQ ID NO:17), was sate was calculated from the plot of Kapparent vs. glypho 65 identified by virtue of its improved kinetic properties. GRG23 sate concentration, using the equation (m1*X/(m2+x); m1 =1; (L5P2.J2) contains three amino acid changes relative to m2=1) as known in the art. GRG23 (ACE1), as shown in the following Table 4. US 7,834,249 B2 19 20 Example 9 TABLE 4 Variants of gr23(ace3) Amino Acid changes in GRG23(L5P2.J2) Amino Acid (AA) in 5 Oligonucleotide mutagenesis was utilized to generate vari GRG23 (LSP2.J2) relative to ants of grg23(ace3) that result in expression of proteins with GRG23(ACE1) modifications to amino acid residues corresponding to posi tions 169 to 174 of SEQID NO: 15. Mutagenesis reactions V101-eF were carried out using the QuickChange R. kit (Stratagene) A21.3->S 10 according to the manufacturers instructions. Plasmid clones D284-eN were transformed into E. coli cell line BL21*DE3, and expression of proteins induced by IPTG as known in the art. The proteins were purified by affinity binding to a nickel Oligonucleotide mutagenesis was used to modify the column (TALON metal affinity resin, Clontech). The native grg23(ace3) coding region to contain each of the amino acid 15 GRG23(ace3) protein was also expressed and purified for use changes identified in GRG23(L5P2.J2). A clone with a gene as a control. Following purification of each enzyme, protein containing these modifications was identified, and found to concentration was measured by Bradford assay as known in encode a protein having altered kinetic properties over the art. GRG23(ACE3). This gene was designated grg23(ace4) (SEQ Example 10 ID NO: 18). The protein encoded by grg23(ace4) and desig nated as GRG23(ACE4) (SEQID NO: 19) contains a single Kinetic Analysis of GRG23(ace3) Variants amino acid change relative to GRG23 (ACE3) (Valine 101 to phenylalanine). Based on this result, a separate oligonucle GRG23(ace3) variants GRG23(ace3) R173K (SEQ ID otide mutagenesis was performed to test the kinetics of each 25 NO:29, encoded by SEQID NO:28), GRG23(ace3) G169V/ possible amino acid substitution at position 101. None of the L170V (SEQ ID NO:31, encoded by SEQ ID NO:30), amino acid changes resulted in further improvement in GRG23(ace3) R173Q (SEQID NO:33, encoded by SEQID kinetic properties compared to GRG23(ACE4). See Table 5. NO:32), and GRG23(ace3) I174V (SEQID NO:35, encoded by SEQID NO:34) were characterized by enzymatic assays 30 as described herein, and compared to the native GRG23 TABLE 5 (ace3) enzyme. For each enzyme the Km (app) was deter Kinetics of improved variants mined at each of several glyphosate concentrations, and a plot of Km (app) vs. glyphosate concentration was used to calcu Km Ki Vmax (IM) (IM) (nmol/ming) late the Ki for each enzyme. The thermostability for each 35 enzyme was assessed by incubating the enzyme at 37°C. for GRG-23 14 10,800 13 16 hours, and then quantifying the enzymatic activity remain GRGS1 15 21,048 13 GRG23(ACE1) 10 14,620 14 ing (as Vmax) vs. control enzyme that was incubated at 4°C. GRG23(ACE2) 11 18,104 15 Kinetic analysis reveals that GRG23(ace3)R 173K, GRGS1.4 19 26,610 17 GRG23(ace3)R 173O, GRG23 (ace3)I174V and GRG23 GRG23(ACE3) 15 20,000 17 40 (ace3)G169V/L170V are virtually indistinguishable from GRG23 (LSP2.J2) 15 2,500 23 native GRG23(ace3) and show nearly identical catalytic rate, GRG23(ACE4) 14 5,010 24 extremely high resistance to glyphosate, and very good affin ity for PEP. The observed K for PEP of 19 uM for G169V/ L170V (vs 15uM for GRG23(ace3) may represent a slightly Example 8 45 lower binding affinity for PEP relative to GRG23(ace3). Each of the four variants had a thermostability of greater than 95% Improved Thermostability of GRG23 Variants after 16 hours at 37°C. The kinetic values determined for the GRG23(ace3) variants GRG23(ace3) R173K, GRG23(ace3) R173Q, and GRG23(ace3) I 174V are shown in Table 6. The thermostability of GRG23 and several variants were 50 determined by incubation of protein samples at 37° C. for a TABLE 7 range of times, then determining the residual EPSPS activity as described herein, and comparing the activity to that of a Kinetics of GRG23 (ace) variants control sample incubated at 4°C. Table 6 shows that the Km (IM) Ki (mM) Vmax (nmol/ming) GRG23 variants GRG23(ACE1), GRG23(ACE2), and 55 GRG23(ace3) 16 14 16 GRG23(ACE3) have improved thermostability. GRG23(ace3) R173K 16.6 14.7 17.7 GRG23(ace3) R173O 14.3 14.2 15.8 TABLE 6 GRG23(ace3) I174V 15.5 1S.O 15.5

Thermostability of GRG23 and GRG23 variants 60 Protein to at 37° C. Example 11 GRG23 10.1 Cloning of Syngrg23 and grg23 Variants into a Plant GRG23(ACE1) 15.3 GRG23(ACE2) 34.2 Expression Cassette GRG23(ACE3) 65.4 65 For Syngrg23, and each of the grg23 variants described herein (including, for example, grg23(ace1), grg23(ace2), US 7,834,249 B2 21 22 grg23(ace3), grg23(ace4), grg23(ace3)R173K, grg23(ace3) overnight at 25°C. in the dark. However, it is not necessary R173Q, grg23(ace3)I174V, grg23(ace3)G169V/L170V and perse to incubate the embryos overnight. grg23(L5P2.J2)), the open reading frame (ORF) is amplified The resulting explants are transferred to mesh squares (30 by PCR from a full-length DNA template. HindIII restriction 40 perplate), transferred onto osmotic media for about 30-45 sites are added to each end of the ORFs during PCR. Addi minutes, then transferred to a beaming plate (see, for tionally, the nucleotide sequence ACC is added immediately example, PCT Publication No. WO/0138514 and U.S. Pat. 5' to the start codon of the gene to increase translational No. 5.240,842). efficiency (Kozak (1987) Nucleic Acids Research 15:8125 DNA constructs designed to express the GRG proteins of 8148; Joshi (1987) Nucleic Acids Research 15:6643-6653). the present invention in plant cells are accelerated into plant The PCR product is cloned and sequenced using techniques 10 tissue using an aerosol beam accelerator, using conditions well known in the art to ensure that no mutations are intro essentially as described in PCT Publication No. duced during PCR. WO/O138514. After beaming, embryos are incubated for The plasmid containing the PCR product is digested with about 30 min on osmotic media, and placed onto incubation Hind III and the fragment containing the intact ORF is iso media overnight at 25°C. in the dark. To avoid unduly dam lated. This fragment is cloned into the Hind III site of a 15 aging beamed explants, they are incubated for at least 24 plasmid such as paX200, a plant expression vector contain hours prior to transfer to recovery media. Embryos are then ing the rice actin promoter (McElroy et al. (1991) Molec. spread onto recovery period media, for about 5 days, 25°C. in Gen. Genet. 231: 150-160) and the PinII terminator (An et al. the dark, then transferred to a selection media. Explants are (1989) The Plant Cell 1: 115-122). The promoter-gene-termi incubated in selection media for up to eight weeks, depending nator fragment from this intermediate plasmid is then Sub on the nature and characteristics of the particular selection cloned into plasmid pSB11 (Japan Tobacco, Inc.) to form a utilized. After the selection period, the resulting callus is final pSB11-based plasmid. In some cases, it may be prefer transferred to embryo maturation media, until the formation able to generate an alternate construct in which a chloroplast of mature somatic embryos is observed. The resulting mature leader sequence is encoded as a fusion to the N-terminus of Somatic embryos are then placed under low light, and the the Syngrg23, grg23(acel), grg23(ace2), grg23(ace3), grg23 25 process of regeneration is initiated by methods known in the (ace4), grg23 (L5P2.J2), grg23(ace3)R 173K, grg23(ace3) art. The resulting shoots are allowed to root on rooting media, R173Q, grg23(ace3)I174V, or grg23(ace3)G169V/L170V and the resulting plants are transferred to nursery pots and constructs. These pSB11-based plasmids are typically orga propagated as transgenic plants. nized such that the DNA fragment containing the promoter gene-terminator construct, or promoter-chloroplast leader 30 gene-terminator construct may be excised by double Materials digestion by restriction enzymes, such as Kipn I and Pme I. DN62ASS Media and used fortransformation into plants by aerosol beam injec Components per liter Source tion. The structure of the resulting pSB11-based clones is 35 Verified by restriction digest and gel electrophoresis, and by Chu's N6 Basal Salt 3.98 g/L Phytotechnology Labs sequencing across the various cloning junctions. Mixture (Prod. No. C 416) Chu's N6 Vitamin Solution 1 mL Phytotechnology Labs The plasmid is mobilized into Agrobacterium tumefaciens (Prod. No. C 149) (of 1000x Stock) strain LBA4404 which also harbors the plasmid pSB1 (Japan L-Asparagine 800 mg/L Phytotechnology Labs Tobacco, Inc.), using triparental mating procedures well Myo-inositol 100 mg/L Sigma known in the art, and plating on media containing spectino 40 L-Proline 1.4 g/L Phytotechnology Labs Casamino acids 100 mg/L Fisher Scientific mycin. The pSB11-based plasmid clone carries spectinomy Sucrose 50 g/L Phytotechnology Labs cin resistance but is a narrow host range plasmid and cannot 2,4-D (Prod. No. D-7299) 1 ml/L Sigma replicate in Agrobacterium. Spectinomycin resistant colonies (of 1 mg/ml Stock) arise when pSB11-based plasmids integrate into the broad 45 Adjust the pH of the solution to pH5.8with 1NKOH1NKC1, add Gelrite (Sigma) to 3 g/L, host range plasmid pSB1 through homologous recombina and autoclave. After cooling to 50° C., add 2 ml/L of a 5 mg/ml stock solution of Silver tion. The cointegrate product of pSB1 and the pSB11-based Nitrate (Phytotechnology Labs). Recipe yields about 20 plates. plasmid is verified by Southern hybridization. The Agrobac terium strain harboring the cointegrate is used to transform Example 13 maize by methods known in the art, such as, for example, the 50 PureIntro method (Japan Tobacco). Transformation of EPSP Synthase Enzymes into Maize Plant Cells by Agrobacterium-Mediated Example 12 Transformation

Transformation of Plant Cells by 55 Ears are best collected 8-12 days after pollination. Agrobacterium-Mediated Transformation Embryos are isolated from the ears, and those embryos 0.8- 1.5 mm in size are preferred for use in transformation. Maize ears are best collected 8-12 days after pollination. Embryos are plated Scutellum side-up on a Suitable incuba Embryos are isolated from the ears, and those embryos 0.8- tion media, and incubated overnight at 25°C. in the dark. 1.5 mm in size are preferred for use in transformation. 60 However, it is not necessary perse to incubate the embryos Embryos are plated Scutellum side-up on a Suitable incuba overnight. Embryos are contacted with an Agrobacterium tion media, such as DN62A5S media (3.98 g/L N6 Salts: 1 strain containing the appropriate vectors having an EPSP ml/L (of 1000x Stock) N6 Vitamins; 800 mg/L L-Asparagine; synthase enzyme of the present invention for Ti plasmid 100 mg/L Myo-inositol: 1.4 g/L L-Proline; 100 mg/L mediated transfer for about 5-10 min, and then plated onto Casamino acids; 50 g/L Sucrose; 1 ml/L (of 1 mg/ml stock) 65 co-cultivation media for about 3 days (25°C. in the dark). 2,4-D). However, media and salts other than DN62A5S are After co-cultivation, explants are transferred to recovery suitable and are known in the art. Embryos are incubated period media for about five days (at 25°C. in the dark). US 7,834,249 B2 23 24 Explants are incubated in selection media for up to eight in the art to which this invention pertains. All publications and weeks, depending on the nature and characteristics of the patent applications are herein incorporated by reference to the particular selection utilized. After the selection period, the same extent as if each individual publication or patent appli resulting callus is transferred to embryo maturation media, until the formation of mature somatic embryos is observed. 5 cation was specifically and individually indicated to be incor The resulting mature Somatic embryos are then placed under porated by reference. low light, and the process of regeneration is initiated as Although the foregoing invention has been described in known in the art. The resulting shoots are allowed to root on rooting media, and the resulting plants are transferred to Some detail by way of illustration and example for purposes nursery pots and propagated as transgenic plants. 10 of clarity of understanding, it will be obvious that certain All publications and patent applications mentioned in the changes and modifications may be practiced within the scope specification are indicative of the level of skill of those skilled of the appended claims.

SEQUENCE LISTING

<16 Os NUMBER OF SEO ID NOS: 35

<21 Os SEQ ID NO 1 &211s LENGTH: 1892 &212s. TYPE: DNA <213> ORGANISM: Arthrobacter globiformis 22 Os. FEATURE: <221s NAME/KEY: CDS <222s. LOCATION: (178) . . . (1417) <221s NAMEAKEY: misc feature &222s. LOCATION: 1801 <223> OTHER INFORMATION: n = A, T, C or G

<4 OOs SEQUENCE: 1 gggaccacat gctgctic ctg attt Cagggc tigctg.ccggit atggaccagg gtttagagag 60

ggacggcacg catc.cgggcc ctitat cqgac Caacgc.caac agcggit cqgt ggccttggag 12O cggggcc agc acggc.cgatc acgtag actic tittggagctt cqct caaag gatcacc atg 18O Met 1.

gala act gat ca ct a gtg at C cca gga t c aaa agc at C acc aac C9g 228 Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn Arg 5 1O 15

gct ttg Ctt ttg gct gcc gca gcg aag ggc acg tog gtc. Ctg gtg aga 276 Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val Arg 2O 25 3O

cca ttg gtc agc gcc gat acc tica gca ttcaaa act gca att cag goc 324 Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin Ala 35 4 O 45

ctic ggit gcc aac gtc. tca gcc gac ggt gac aat tig gtC gtt gala ggc 372 Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu Gly SO 55 6 O 65

Ctgggt cag gca ccc cac ctic gac gcc gac at C tog tec gag gat gca 42O Lieu. Gly Glin Ala Pro His Lieu. Asp Ala Asp Ile Trp Cys Glu Asp Ala 70 7s 8O

ggit acc gtg gCC cqg tt C ctic cct cca ttc gt C gcc gca gga cag ggg 468 Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin Gly 85 90 95

aag titc acc gtC gaC gga agc gag cag Ctg C9g cgg cqc ccg Ctt cqg 516 Llys Phe Thr Val Asp Gly Ser Glu Glin Lieu. Arg Arg Arg Pro Lieu. Arg 1 OO 105 11O

Ccc ctg gtc gaC ggc at C cqc cac ctg. g.gc gcc cqc gtc. tcc ticc gag 564 Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser Glu 115 12O 125

cag ctg. CCC cta aca att gala gcg agc ggg Ctg gca ggc ggg gag tac 61.2 Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu Tyr 13 O 135 14 O 145

gala att gaa goc cat cag agc agc cag titc gcc to C ggc Ctg atc atg 660

US 7,834,249 B2 27 28

- Continued aataaggagt tittncgaggc cacccagatt C cccdggtgg aaggcgatat gggct tcatg 1847

Ctgaactatgggg.tc.cggat ggaagtgact tttcaactict gcc.ca 1892

<210s, SEQ ID NO 2 &211s LENGTH: 413 212. TYPE: PRT <213> ORGANISM: Arthrobacter globiformis <4 OOs, SEQUENCE: 2 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3O Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu SO 55 6 O Gly Lieu. Gly Glin Ala Pro His Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Ser Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala US 7,834,249 B2 29 30

- Continued

355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 3 &211s LENGTH: 436 212. TYPE: PRT <213> ORGANISM: Arthrobacter globiformis <4 OOs, SEQUENCE: 3 Met Ala Lieu. Glu Arg Gly Gln His Gly Arg Ser Arg Arg Lieu. Phe Gly 1. 5 1O 15 Ala Ser Lieu. Glu Arg Ile Thr Met Glu Thir Asp Arg Lieu Val Ile Pro 2O 25 3O Gly Ser Lys Ser Ile Thr Asn Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala 35 4 O 45 Lys Gly. Thir Ser Val Lieu Val Arg Pro Lieu Val Ser Ala Asp Thir Ser SO 55 6 O Ala Phe Llys Thr Ala Ile Glin Ala Lieu. Gly Ala Asn. Wal Ser Ala Asp 65 70 7s 8O Gly Asp Asn Trp Val Val Glu Gly Lieu. Gly Glin Ala Pro His Lieu. Asp 85 90 95 Ala Asp Ile Trp Cys Glu Asp Ala Gly Thr Val Ala Arg Phe Lieu Pro 1OO 105 11 O Pro Phe Val Ala Ala Gly Glin Gly Llys Phe Thr Val Asp Gly Ser Glu 115 12 O 125 Glin Lieu. Arg Arg Arg Pro Lieu. Arg Pro Lieu Val Asp Gly Ile Arg His 13 O 135 14 O Lieu. Gly Ala Arg Val Ser Ser Glu Glin Lieu Pro Lieu. Thir Ile Glu Ala 145 150 155 160 Ser Gly Lieu Ala Gly Gly Glu Tyr Glu Ile Glu Ala His Glin Ser Ser 1.65 17O 17s Glin Phe Ala Ser Gly Lieu. Ile Met Ala Ala Pro Tyr Ala Arg Glin Gly 18O 185 19 O Lieu. Arg Val Arg Ile Pro Asn Pro Val Ser Gln Pro Tyr Lieu. Thir Met 195 2OO 2O5 Thr Lieu. Arg Met Met Arg Asp Phe Gly Lieu. Glu Thir Ser Thr Asp Gly 21 O 215 22O Ala Thr Val Ser Val Pro Pro Gly Arg Tyr Thr Ala Arg Arg Tyr Glu 225 23 O 235 24 O Ile Glu Pro Asp Ala Ser Thr Ala Ser Tyr Phe Ala Ala Ala Ser Ala 245 250 255 Val Ser Gly Arg Ser Phe Glu Phe Glin Gly Lieu. Gly Thr Asp Ser Ile 26 O 265 27 O Glin Gly Asp Thir Ser Phe Phe Asin Val Lieu. Gly Arg Lieu. Gly Ala Glu 27s 28O 285 Val His Trp Ala Pro Asn Ser Val Thir Ile Ser Gly Pro Glu Arg Lieu. 29 O 295 3 OO Asn Gly Asp Ile Glu Val Asp Met Gly Glu Ile Ser Asp Thr Phe Met 3. OS 310 315 32O

US 7,834,249 B2 35 36

- Continued

Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asp Trp Val Val Glu SO 55 6 O Gly Lieu. Gly Glin Ala Pro Asn Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Val Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Ile Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Ser Asn. Ser 26 O 265 27 O Val Thir Ile Ser Gly Pro Glu Arg Lieu. Thr Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lieu. Arg Met Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 6 &211s LENGTH: 1242 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: synthetic grg23 <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

US 7,834,249 B2 39 40

- Continued ttg gcc gat gga CCC atc acg at a acc aac att ggt cat gca C9g ttg 96.O Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O aag gala to C gaC cc atc. tca gcc atg gala acc aac Ctg cqc acg ct c OO8 Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 ggt gta caa acc gac gtc gga cac gaC tdg atg aga atc tac ccc. tct O56 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O acc cc.g. cac ggc ggit aga gtgaat to cac cqg gaC cac agg at C gct 104 Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg ttt to a atc Ctg gga ctg aga gtg gac ggg att acc ct c gac 152 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O gac cct caa togc gtc. gigg aag acc titt colt ggc titc titc gac tac citt 2OO Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO gga cqc ctt tt C C cc gala aag gCd Ctt acg ct C C cc ggc tag 242 Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly * 4 OS 41O

<210s, SEQ ID NO 7 &211s LENGTH: 413 212. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: synthetic GRG23 <4 OO > SEQUENCE: 7 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3O Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu SO 55 6 O Gly Lieu. Gly Glin Ala Pro His Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O US 7,834,249 B2 41 42

- Continued

Ala S er Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O

Phe G ln Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn. W all Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn Ser 26 O 265 27 O

Val Thir Ile Ser Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Wall Asp 285

Met G ly Glu Ile Ser Asp Thr Phe Met Thr Luell Ala Ala Ile Ala Pro 2 9 O 295 3 OO

Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn Ile Gly His Ala Arg Luell 3. OS 310 315

Lys G lu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lell Arg Thir Luell 3.25 330 335

Gly V al Glin Thr Asp Val Gly His Asp Trip Met Arg Ile Tyr Pro Ser 34 O 345 35. O

Thir P ro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365

Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thir Luell Asp 3 70 375

Asp P ro Glin Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO

Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Luell Pro Gly 4 OS 41O

SEQ ID NO 8 LENGTH: 1242 TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: grg23 (ace1) NAME/KEY: CDS LOCATION: (1) . . . (1242)

< 4 OOs SEQUENCE: 8 atgg aa act gat cqc Ctt gtg at C cca gga tog a.a.a. agc at C acc aac 48 Met G lu. Thir Asp Arg Lieu Val Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 1O 15

ct ttg Ctt ttg gct gcc gca gCd aag ggc acg tcg gtC Ctg gtg 96 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 2O 25 3 O aga C ca ttg gt C agc gcc gat acc to a gca titc a.a.a. act gca at C cag 144 Arg P ro Lieu Val Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45 a CC C. tc ggit gcc aac gtc. tca gcc gaC ggit gac aat tgg gtC gtt gala 192 Thir L eu Gly Ala Asn. Wal Ser Ala Asp Gly Asp Asn Trp Wall Wall Glu SO 55 60 ggc C tg ggit cag gca CCC cac ct c gaC gcc gac atc. tgg tgc gag gac 24 O Gly L. eu Gly Glin Ala Pro His Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O gca ggit act gtg gcc cq9 ttic ct C cct coa titc gta gcc gca ggt cag 288 Ala G ly Thr Val Ala Arg Phe Leu Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95 ggg a ag tt C acc gtc gac gga t c a gag cag Ctg cgg cgg cgc cc.g citt 336 Gly Llys Phe Thr Val Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 1OO 105 11 O cgg C cc ctg gt c gac ggc atc. c9c cac ctg ggc gcc cgc gtC to c to c 384 Arg P ro Lieu Val Asp Gly Ile Arg His Lieu Gly Ala Arg Wall Ser Ser

US 7,834,249 B2 45 46

- Continued

22 Os. FEATURE: <223> OTHER INFORMATION: GRG23 (ACE1)

<4 OOs, SEQUENCE: 9 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3O Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 Thir Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu SO 55 6 O Gly Lieu. Gly Glin Ala Pro His Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro ASn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Thr Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO

US 7,834,249 B2 49

- Continued aat gta Ctt ggg C9g Ctic ggit gcg gag gtC cac tig gca ccc aac tog 816 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O gtc. acc at a cqg gga ccg gaa agg ctgaac ggc gac att gala gtg gat 864 Val Thir Ile Arg Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 atg ggc gag att tog gac acc tt C atg aca ctic gcg gcg att gcc cct 912 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO ttg gcc gat gga CCC atc acg at a acc aac att ggt cat gca C9g ttg 96.O Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O aag gala to C gaC cc atc. tca gcc atg gala acc aac Ctg cqc acg ct c OO8 Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 ggt gta caa acc gac gtc gga cac gaC tdg atg aga atc tac ccc. tct O56 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O acc cc.g. cac ggc ggit aga gtgaat to cac cqg gaC cac agg at C gct 104 Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg ttt to a atc Ctg gga ctg aga gtg gac ggg att acc ct c gac 152 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O gac cct caa togc gtc. gigg aag acc titt colt ggc titc titc gac tac citt 2OO Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO gga cqc ctt tt C C cc gala aag gCd Ctt acg ct C C cc ggc tag 242 Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly * 4 OS 41O

<210s, SEQ ID NO 11 &211s LENGTH: 413 212. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: GRG23 (ACE2)

<4 OOs, SEQUENCE: 11 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3O Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu SO 55 6 O Gly Lieu. Gly Glin Ala Pro His Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile US 7,834,249 B2 51

- Continued

145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Arg Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 12 &211s LENGTH: 1242 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: grg51.4 <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 12 atg gala act gat ca cta gtg at C cca gga tog aaa agc at C acc aac 48 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 cgg gct ttg ctt ttg gct gcc gca gcg aag ggc acg tcg gt C ctg gtg 96 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3 O aga cca ttg gtc agc gcc gat acc to a goa titc aaa act gca att cag 144 Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 gcc ct c ggit gcc aac gtc. tca gC9 gaC ggit gat gat t gtC gtt gala 192 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asp Trp Val Val Glu SO 55 60 ggc ctg ggc cag goa CCC aac Ct c gaC gcc gac atc tig tec gag gat 24 O

US 7,834,249 B2 55 56

- Continued gac Cct caa tt gtc ggg aag acc titt cott ggc tto tto gac tac citt 12 OO Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu 385 390 395 4 OO gga cgc Ctt tt C C C9 gaa aag gCd Ctt acg citc. c cc ggc tag 1242 Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Luell Pro Gly k 4 OS 41O

<210s, SEQ ID NO 13 &211s LENGTH: 413 212. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: 223 OTHER INFORMATION: GRGS1.4

<4 OOs, SEQUENCE: 13

Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 15

Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Gly Thir Ser Wall Luell Wall 2O 25 3O

Arg Pro Leu Val Ser Ala Asp Thr Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 6 O

Gly Lieu. Gly Glin Ala Pro Asn Lieu. Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

Gly Llys Phe Thr Val Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 1OO 105 11 O

Arg Pro Val Val Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125

Glu Gln Lieu. Pro Lieu. Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Arg Ile Pro Asn 1.65 17O 17s

Pro Val Ser Glin Pro Tyr Lieu. Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Ile Glu Thir Ser Thr Asp Gly Ala Thir Wall Ser Wall Pro Pro 195 2OO

Gly Arg Tyr Thir Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Tyr Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O

Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O

Wall Thir Ile Arg Gly Pro Glu Arg Lieu Thir Gly Asp Ile Glu Wall Asp 27s 28O 285

Met Gly Glu Ile Ser Asp Thr Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO

Lell Ala Asp Gly Pro Ile Thr Ile Thr Asn Ile Gly His Ala Arg Luell 3. OS 310 315

Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell

US 7,834,249 B2 61

- Continued Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Val Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Ile Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Ser Asn. Ser 26 O 265 27 O Val Thir Ile Arg Gly Pro Glu Arg Lieu. Thr Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 16 &211s LENGTH: 1242 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: grg23 (L5P2. J2) <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 16 atg gala act gat cqc ctt gtg at C cca gga tog aaa agc at C acc aac Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15

US 7,834,249 B2 65 66

- Continued ggit gta Cala acc gac gtc gga CaC gac tgg atg aga atc. tac cc c tot Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O a CC cc.g CaC ggc ggit aga gtg aat CaC cgg gac CaC agg at C gct 104 Thir Pro His Gly Gly Arg Wall Asn Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg titt to a atc. Ctg gga aga gtg gac 999 att acc citc. gac 152 Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375 38O gac cott Cala tgc gtc 999 acc titt cott ggc tto tto gac tac citt 2OO Asp Pro Glin Cys Wall Gly Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO gga cgc citt ttic c cc gaa gcg citt acg citc. c cc ggc tag 242 Gly Arg Luell Phe Pro Glu Ala Luell Thir Luell Pro Gly 4 OS 41O

SEO ID NO 17 LENGTH: 413 TYPE : PRT ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: GRG23 (L5P2. J2)

<4 OOs, SEQUENCE: 17

Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 15

Arg Ala Luell Luell Lell Ala Ala Ala Ala Gly Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Thir Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asn Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro His Luell Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Gly Glin 85 90 95

Gly Phe Thir Phe Asp Gly Ser Glu Glin Luell Arg Arg Pro Luell 105

Arg Pro Luell Wall Asp Gly Ile Arg His Luell Gly Ala Arg Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Arg Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Luell Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195

Gly Arg Thir Ser Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255 US 7,834,249 B2 67

- Continued Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Thr Gly Pro Glu Arg Lieu. Asn Gly Asn. Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 18 &211s LENGTH: 1242 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: grg23 (ace4) <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 18 atg gala act gat cqc ctt gtg at C cca gga tog aaa agc at C acc aac 48 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 cgg gct ttg ctt ttg gct gcc gca gcg aag ggc acg tcg gt C ctg gtg 96 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3 O aga cca ttg gtc agc gcc gat acc to a goa titc aaa act gca at C cag 144 Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 gcc ct c ggit gcc aac gtc. tca gcc gaC ggit gac gat t gtC gtt gala 192 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asp Trp Val Val Glu SO 55 60 ggc ctg ggit cag goa CCC aac Ct c gaC gcc gac atc tig tec gag gac 24 O Gly Lieu. Gly Glin Ala Pro Asn Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O gca ggit act gtg gcc cq9 ttic ct C cct coa titc gta gcc gca ggit cag 288 Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 ggg aag tt C acc titc gac gga to a gag cag ctg. C9g C9g cgc cc.g. citt 336 Gly Llys Phe Thr Phe Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O cgg ccc gtg gtC gac ggc atc. c9c cac ctg ggc gcc cqc gtc. tcc to C 384 Arg Pro Val Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 gag cag Ctg ccc Ctt aca att gala gC9 agc ggg Ctg gca ggc ggg gag 432 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O tac gala att gala gcc cat cag agc agc cag titc gcc ticc ggc ctg at C 48O

US 7,834,249 B2 71 72

- Continued

Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Gly Glin 85 90 95

Gly Phe Thir Phe Asp Gly Ser Glu Glin Luell Arg Arg Pro Luell 105

Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Arg Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Ile Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195

Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn Wall Luell Gly Arg Lell Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O

Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 285

Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO

Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315

Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335

Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O

Thir Pro His Gly Gly Arg Wall Asn His Arg Asp His Arg Ile Ala 355 360 365

Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375

Asp Pro Glin Wall Gly Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO

Gly Arg Luell Phe Pro Glu Ala Luell Thir Luell Pro Gly 4 OS 41O

<210s, SEQ ID NO 2 O &211s LENGTH: 1244

US 7,834,249 B2 75 76

- Continued

28O 285 atg ggc gag att tcg gac a CC ttic atg aca Citc. gcg gcg att gcc cott 912 Met Gly Glu Ile Ser Asp Thir Phe Met Thir Lieu. Ala Ala Ile Ala Pro 29 O 295 3 OO ttg gcc gat gga c cc atc. acg at a acc aac att ggit Cat gca cgg ttg 96.O Lell Ala Asp Gly Pro Ile Thir Ile Thir ASn Ile Gly His Ala Arg Luell 3. OS 310 315 32O aag gala to c gac cgc atc. toa gcg atg gala acc aac Ctg cgc acg citc. OO8 Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lell Arg Thir Luell 3.25 330 335 ggit gta Cala acc gac gtc gga CaC gac tgg atg aga atc. tac cc c tot Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O a CC cc.g CaC ggc ggit aga gtg aat cac cqg gac CaC agg at C gct 104 Thir Pro His Gly Gly Arg Wall Asn Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg titt to a atc. Ctg gga aga gtg gac 999 att acc citc. gac 152 Met Ala Phe Ser Ile Lell Gly Luell Arg Val Asp Gly Ile Thir Luell Asp 37 O 375 38O gac cott Cala tgc gtc 999 aag acc titt Cct ggc tto tto gac tac citt 2OO Asp Pro Glin Cys Wall Gly Lys Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO gga cgc citt ttic c cc gaa aag gcg citt acg ctic c cc ggc tag 242 Gly Arg Luell Phe Pro Glu Lys Ala Luell Thir Lieu. Pro Gly 4 OS 41O

99 244

SEQ ID NO 21 LENGTH: 413 TYPE : PRT ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: GRG23 (L3P1.

<4 OOs, SEQUENCE: 21

Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 1O 15

Arg Ala Luell Luell Lell Ala Ala Ala Ala Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asn Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro His Luell Asp Ala Asp Ile Trp Glu Asp 65 70 7s

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Gly Glin 85 90 95

Gly Phe Thir Wall Asp Gly Ser Glu Gln Lieu. Arg Arg Pro Luell 105

Arg Pro Luell Wall Asp Gly Ile Arg His Lieu. Gly Ala Arg Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Wall Arg Pro Asn 1.65 17O 17s US 7,834,249 B2 77

- Continued Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Ile Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Ser Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 22 &211s LENGTH: 1244 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: grg23 (L3P1. B3) <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 22 atg gala act gat cqc ctt gtg at C cca gga tog aaa agc at C acc aac 48 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 cgg gct ttg ctt ttg gct gcc gca gcg aag ggc acg tcg gt C ctg gtg 96 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3 O aga cca ttg gtc agc gcc gat acc to a goa titc aaa act gca at C cag 144 Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 gcc ct c ggit gcc aac gtc. tca gcc gaC ggit gac alat t gtC gtt gala 192 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu SO 55 60 ggc ctg ggit cag goa CCC cac ct c gaC gcc cac atc tig tec gag gac 24 O Gly Lieu. Gly Glin Ala Pro His Lieu. Asp Ala His Ile Trp Cys Glu Asp 65 70 7s 8O gca ggit act gtg gcc cq9 ttic ct C cct coa titc gta gcc gca ggit cag 288 Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin

US 7,834,249 B2 81

- Continued Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly * 4 OS 41O

99 1244

<210s, SEQ ID NO 23 &211s LENGTH: 413 212. TYPE: PRT <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: GRG23 (L3P1. B3)

<4 OOs, SEQUENCE: 23 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3O Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asn Trp Val Val Glu SO 55 6 O Gly Lieu. Gly Glin Ala Pro His Lieu. Asp Ala His Ile Trp Cys Glu Asp 65 70 7s 8O Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Lys Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Ser Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser

US 7,834,249 B2 87

- Continued

85 90 95 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O Arg Pro Lieu Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Lieu. Ile 145 150 155 160 Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Val Arg Ile Pro Asn 1.65 17O 17s Pro Val Ser Glin Pro Tyr Lieu. Thr Met Thr Lieu. Arg Met Met Arg Asp 18O 185 19 O Phe Gly Lieu. Glu Thir Ser Thr Asp Gly Ala Thr Val Ser Val Pro Pro 195 2OO 2O5 Gly Arg Tyr Thr Ala Arg Arg Tyr Glu Ile Glu Pro Asp Ala Ser Thr 21 O 215 22O Ala Ser Tyr Phe Ala Ala Ala Ser Ala Val Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O Phe Glin Gly Lieu. Gly Thr Asp Ser Ile Glin Gly Asp Thr Ser Phe Phe 245 250 255 Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Ile Ile Ser Gly Pro Glu Arg Lieu. ASn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 26 &211s LENGTH: 1244 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: grg23 (L3P1. O23) <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 26 atg gaa act gat cac citt gtg atc cca gga tog aaa agc atc acc aac Met Glu Thir Asp His Leu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 cgg gcg ttg ctt ttg gct gcc gca gcg aag ggc acg tcg gt C ctg gtg

US 7,834,249 B2 91 92

- Continued ggit gta Cala acc gac gtc gga CaC gac tgg atg aga atc. tac cc c tot Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O a CC cc.g CaC ggc ggit aga gtg aat cac cqg gac CaC agg at C gct 104 Thir Pro His Gly Gly Arg Wall Asn Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg titt to a atc. Ctg gga aga gtg gac 999 att acc citc. gac 152 Met Ala Phe Ser Ile Lell Gly Luell Arg Val Asp Gly Ile Thir Luell Asp 37 O 375 38O gac cott Cala tgc gtc 999 aag acc titt Cct ggc tto tto gac tac citt 2OO Asp Pro Glin Cys Wall Gly Lys Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO gga cgc citt ttic c cc gaa aag gcg citt acg citt c cc ggc tag 242 Gly Arg Luell Phe Pro Glu Lys Ala Luell Thir Lieu. Pro Gly 4 OS 41O

99 244

SEO ID NO 27 LENGTH: 413 TYPE : PRT ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: GRG23 (L3P1. O23)

<4 OOs, SEQUENCE: 27

Met Glu Thir Asp His Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 1O 15

Arg Ala Luell Luell Lell Ala Ala Ala Ala Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asn Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro His Luell Asp Ala Asp Ile Trp Glu Asp 65 70 7s

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

Gly Phe Thir Wall Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Luell 105 11 O

Arg Pro Luell Wall Asp Gly Ile Arg His Lieu. Gly Ala Arg Wall Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Lieu. Arg Wall Arg Ile Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Lieu. Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Luell Glu Thir Ser Thir Asp Gly Ala Thr Wall Ser Wall Pro Pro 195

Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Ser Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Gln Gly Asp Thir Ser Phe Phe 245 250 255 US 7,834,249 B2 93

- Continued

Asn Val Lieu. Gly Arg Lieu. Gly Ala Glu Val His Trp Ala Pro Asn. Ser 26 O 265 27 O Val Thir Ile Ser Gly Pro Glu Arg Lieu. Asn Gly Asp Ile Glu Val Asp 27s 28O 285 Met Gly Glu Ile Ser Asp Thr Phe Met Thr Lieu Ala Ala Ile Ala Pro 29 O 295 3 OO Lieu Ala Asp Gly Pro Ile Thir Ile Thr Asn. Ile Gly His Ala Arg Lieu. 3. OS 310 315 32O Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Thir Asn Lieu. Arg Thr Lieu. 3.25 330 335 Gly Val Glin Thr Asp Val Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O Thr Pro His Gly Gly Arg Val Asn. Cys His Arg Asp His Arg Ile Ala 355 360 365 Met Ala Phe Ser Ile Lieu. Gly Lieu. Arg Val Asp Gly Ile Thr Lieu. Asp 37 O 375 38O Asp Pro Gln Cys Val Gly Lys Thr Phe Pro Gly Phe Phe Asp Tyr Lieu. 385 390 395 4 OO Gly Arg Lieu. Phe Pro Glu Lys Ala Lieu. Thir Lieu Pro Gly 4 OS 41O

<210s, SEQ ID NO 28 &211s LENGTH: 1244 &212s. TYPE: DNA <213> ORGANISM: Artificial Sequence 22 Os. FEATURE: <223> OTHER INFORMATION: grg23 (ace3) variant sequence (grg23 (ace3) R173K) <221s NAME/KEY: CDS <222s. LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 28 atg gala act gat cqc ctt gtg at C cca gga tog aaa agc at C acc aac 48 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 1O 15 cgg gct ttg ctt ttg gct gcc gca gcg aag ggc acg tcg gt C ctg gtg 96 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 2O 25 3 O aga cca ttg gtc agc gcc gat acc to a goa titc aaa act gca at C cag 144 Arg Pro Lieu Val Ser Ala Asp Thir Ser Ala Phe Llys Thr Ala Ile Glin 35 4 O 45 gcc ct c ggit gcc aac gtc. tca gcc gaC ggit gac gat t gtC gtt gala 192 Ala Lieu. Gly Ala Asn Val Ser Ala Asp Gly Asp Asp Trp Val Val Glu SO 55 60 ggc ctg ggit cag goa CCC aac Ct c gaC gcc gac atc tig tec gag gac 24 O Gly Lieu. Gly Glin Ala Pro Asn Lieu. Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O gca ggit act gtg gcc cq9 ttic ct C cct coa titc gta gcc gca ggit cag 288 Ala Gly Thr Val Ala Arg Phe Leu Pro Pro Phe Val Ala Ala Gly Glin 85 90 95 ggg aag tt C acc gtc gac gga to a gag cag ctg. C9g C9g cgc cc.g. citt 336 Gly Llys Phe Thr Val Asp Gly Ser Glu Gln Lieu. Arg Arg Arg Pro Lieu. 1OO 105 11 O cgg ccc gtg gtC gac ggc atc. c9c cac ctg ggc gcc cqc gtc. tcc to C 384 Arg Pro Val Val Asp Gly Ile Arg His Lieu. Gly Ala Arg Val Ser Ser 115 12 O 125 gag cag Ctg ccc Ctt aca att gala gC9 agc ggg Ctg gca ggc ggg gag 432 Glu Gln Lieu Pro Lieu. Thir Ile Glu Ala Ser Gly Lieu Ala Gly Gly Glu 13 O 135 14 O

US 7,834,249 B2 97 98

- Continued

<4 OOs, SEQUENCE: 29

Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 15

Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

Gly Phe Thir Wall Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 105 11 O

Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Ile Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Ile Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195

Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn Wall Luell Gly Arg Lell Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O

Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 285

Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO

Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315

Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335

Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O

Thir Pro His Gly Gly Arg Wall Asn His Arg Asp His Arg Ile Ala 355 360 365

Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375

Asp Pro Glin Wall Gly Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO

Gly Arg Luell Phe Pro Glu Ala Luell Thir Luell Pro Gly US 7,834,249 B2 99 100

- Continued

4 OS

SEQ ID NO 3 O LENGTH: 1244 TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: grg23 (ace3) variant sequence (grg23 (ace3) NAME/KEY: CDS LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 3 O atg gala act gat cgc citt gtg at C CC a gga tog a.a.a. agc at C acc aac 48 Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 1O 15 cgg gct ttg citt ttg gct gcc gca gcg aag ggc acg tcg gtC Ctg gtg 96 Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 2O 25 3 O aga CC a ttg gt C agc gcc gat acc to a gca titc a.a.a. act gca at C cag 144 Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45 gcc citc. ggt gcc aac gtc toa gcc gac ggt gac gat tgg gtC gtt gala 192 Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 60 ggc Ctg ggt cag gca c cc aac citc. gac gcc gac atc. tgg tgc gag gac 24 O Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O gca ggit act gtg gcc cgg tto citc. cott CC a titc gta gcc gCa ggit cag 288 Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

999 aag ttic acc gtc gac gga to a gag cag Ctg cgg cgg cgc cc.g citt 336 Gly Lys Phe Thir Wall Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 1OO 105 11 O cgg cc c gtg gt C gac ggc atc. cgc CaC Ctg ggc gcc cgc gtC to c to c 384 Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125 gag cag Ctg cc c citt a Ca att gala gcg agc 999 Ctg gca ggc 999 gag 432 Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O tac gala att gala gcc Cat Cag agc agc cag titc gcc t cc ggc Ctg at C Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160 atg gcc gcc cc.g tac gcg aga Cala gt C gtg cgt. gtg aag at a CC a aat 528 Met Ala Ala Pro Tyr Ala Arg Glin Wall Wall Arg Wall Lys Ile Pro Asn 1.65 17O 17s c cc gtg to a cag c cc tac citc. acg atg aca Ctg cgg atg atg agg gac 576 Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O tto ggc att gag a CC agc a CC gac gga gcc acc gtc agc gtC cott CC a 624 Phe Gly Ile Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195 2OO 2O5

999 cgc aca gcc cgg cgg tat gala at a gaa cc.g gat gcg to a act 672 Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O gcg tog ttic gcc gcc gct to c gcc gt C tot ggc agg cgc ttic gala 72 O Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O titt Cala ggc citt ggc a Ca gac agc at C Cala ggc gac acg to a ttic ttic 768 Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255 US 7,834,249 B2 101 102

- Continued aat gta citt 999 cgg citc. ggit gcg gag gt C CaC tgg gca to c aac tog 816 Asn Wall Luell Gly Arg Lell Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O gtc acc at a cgg gga cc.g gaa agg Ctg acc ggc gac att gala gtg gat 864 Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 27s 28O 285 atg ggc gag att tcg gac a CC ttic atg aca citc. gcg gcg att gcc cott 912 Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO ttg gcc gat gga c cc atc. acg at a acc aac att ggit Cat gca cgg ttg 96.O Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315 32O aag gala to c gac cgc atc. toa gcg atg gala agc aac Ctg cgc acg citc. OO8 Lys Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335 ggit gta Cala acc gac gtc gga CaC gac tgg atg aga atc. tac cc c tot Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O a CC cc.g CaC ggc ggit aga gtg aat CaC cgg gac CaC agg at C gct 104 Thir Pro His Gly Gly Arg Wall Asn Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg titt to a atc. Ctg gga aga gtg gac 999 att acc citc. gac 152 Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375 38O gac cott Cala tgc gtc 999 aag acc titt cott ggc tto tto gac tac citt 2OO Asp Pro Glin Cys Wall Gly Lys Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO gga cgc citt ttic c cc gaa aag gcg citt acg citc. c cc ggc tag 242 Gly Arg Luell Phe Pro Glu Lys Ala Luell Thir Luell Pro Gly 4 OS 41O

99 244

SEQ ID NO 31 LENGTH: 413 TYPE : PRT ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: GRG23 (ace3) variant sequence (GRG23 (ace3)

SEQUENCE: 31

Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 15

Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

Gly Phe Thir Wall Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 105 11 O

Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O US 7,834,249 B2 103 104

- Continued

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Wall Wall Arg Wall Ile Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Ile Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195

Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn Wall Luell Gly Arg Lell Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O

Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 285

Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO

Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315

Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335

Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O

Thir Pro His Gly Gly Arg Wall Asn His Arg Asp His Arg Ile Ala 355 360 365

Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375

Asp Pro Glin Wall Gly Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO

Gly Arg Luell Phe Pro Glu Ala Luell Thir Luell Pro Gly 4 OS 41O

SEQ ID NO 32 LENGTH: 1244 TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: grg23 (ace3) variant sequence (grg23 (ace3) R173Q) NAME/KEY: CDS LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 32 atg gala act gat cgc citt gtg at C CC a gga tog a.a.a. agc at C acc aac 48 Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 1O 15 cgg gct ttg citt ttg gct gcc gca gcg aag ggc acg tcg gtC Ctg gtg 96 Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 2O 25 3 O aga CC a ttg gt C agc gcc gat acc to a gca titc a.a.a. act gca at C cag 144 Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45 gcc citc. ggt gcc aac gtc toa gcc gac ggt gac gat tgg gtC gtt gala 192 Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu

US 7,834,249 B2 107 108

- Continued Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Lieu. Asp 37 O 375 38O gac cott Cala tgc gtc 999 aag acc titt cott ggc tto tto gac tac citt 12 OO Asp Pro Glin Cys Wall Gly Lys Thir Phe Pro Gly Phe Phe Asp Tyr Lieu 385 390 395 4 OO gga cgc citt ttic c cc gaa aag gcg citt acg citc. c cc ggc tag 1242 Gly Arg Luell Phe Pro Glu Lys Ala Luell Thir Luell Pro Gly k 4 OS 41O

99 1244

SEQ ID NO 33 LENGTH: 413 TYPE : PRT ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: GRG23 (ace3) variant sequence (GRG23 (ace3) R173Q)

<4 OOs, SEQUENCE: 33

Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 15

Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 25 3O

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

Gly Phe Thir Wall Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 105 11 O

Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Glin Ile Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Ile Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195

Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn Wall Luell Gly Arg Lell Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O

Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 27s 285

Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro US 7,834,249 B2 109 110

- Continued

29 O 295 3 OO

Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315 32O

Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335

Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O

Thir Pro His Gly Gly Arg Wall Asn His Arg Asp His Arg Ile Ala 355 360 365

Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375 38O

Asp Pro Glin Wall Gly Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO

Gly Arg Luell Phe Pro Glu Ala Luell Thir Luell Pro Gly 4 OS 41O

SEQ ID NO 34 LENGTH: 1244 TYPE: DNA ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: grg23 (ace3) variant sequence (grg23 (ace3) I174V) NAME/KEY: CDS LOCATION: (1) . . . (1242)

<4 OOs, SEQUENCE: 34 atg gala act gat cgc citt gtg at C CC a gga tog a.a.a. agc at C acc aac 48 Met Glu Thir Asp Arg Lell Wall Ile Pro Gly Ser Ser Ile Thir Asn 1. 5 15 cgg gct ttg citt ttg gct gcc gca gcg aag ggc acg tcg gtC Ctg gtg 96 Arg Ala Luell Luell Lell Ala Ala Ala Ala Lys Gly Thir Ser Wall Luell Wall 2O 25 3 O aga CC a ttg gt C agc gcc gat acc to a gca titc a.a.a. act gca at C cag 144 Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Thir Ala Ile Glin 35 4 O 45 gcc citc. ggt gcc aac gtc toa gcc gac ggt gac gat tgg gtC gtt gala 192 Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 60 ggc Ctg ggt cag gca c cc aac citc. gac gcc gac atc. tgg tgc gag gac 24 O Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Cys Glu Asp 65 70 7s 8O gca ggt act gtg gcc cgg tto citc. cott CC a titc gta gcc gca ggt cag 288 Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

999 aag ttic acc gtc gac gga to a gag cag Ctg cgg cgg cgc cc.g citt 336 Gly Lys Phe Thir Wall Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 1OO 105 11 O cgg cc c gtg gt C gac ggc atc. cgc CaC Ctg ggc gcc cgc gtC to c to c 384 Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125 gag cag Ctg cc c citt a Ca att gala gcg agc 999 Ctg gca ggc 999 gag 432 Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O tac gala att gala gcc Cat Cag agc agc cag titc gcc t cc ggc Ctg at C Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160 atg gcc gcc cc.g tac gcg aga Cala ggc Ctg cgt. gtg cgg gtC CC a aat 528 Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Arg Wall Pro Asn 1.65 17O 17s US 7,834,249 B2 111 112

- Continued c cc gtg to a cag c cc tac citc. acg atg aca Ctg cgg atg atg agg gac 576 Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O tto ggc att gag a CC agc a CC gac gcc acc gtc agc gtC cott CC a 624 Phe Gly Ile Glu Thir Ser Thir Asp Ala Thir Wall Ser Wall Pro Pro 195 2OO 2O5

999 cgc aca gcc cgg cgg tat at a gaa cc.g gat gcg to a act 672 Gly Arg Thir Ala Arg Arg Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O gcg tog ttic gcc gcc gct to c gt C tot ggc agg cgc ttic gala 72 O Ala Ser Phe Ala Ala Ala Ser Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O titt Cala ggc citt ggc a Ca gac agc Cala ggc gac acg to a ttic ttic 768 Phe Glin Gly Luell Gly Thir Asp Ser Glin Gly Asp Thir Ser Phe Phe 245 250 255 aat gta citt 999 cgg citc. ggit gcg gag gt C CaC tgg gca to c aac tog 816 Asn Wall Luell Gly Arg Lell Gly Ala Wall His Trp Ala Ser Asn Ser 26 O 265 27 O gtc acc at a cgg gga cc.g gaa agg Ctg acc ggc gac att gala gtg gat 864 Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 27s 28O 285 atg ggc gag att tcg gac a CC ttic atg aca citc. gcg gcg att gcc cott 912 Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO ttg gcc gat gga c cc atc. acg at a acc aac att ggit Cat gca cgg ttg 96.O Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315 32O aag gala to c gac cgc atc. toa gcg atg gala agc aac Ctg cgc acg citc. OO8 Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335 ggit gta Cala acc gac gtc gga CaC gac tgg atg aga atc. tac cc c tot Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O a CC cc.g CaC ggc ggit aga gtg aat CaC cgg gac CaC agg at C gct 104 Thir Pro His Gly Gly Arg Wall Asn Cys His Arg Asp His Arg Ile Ala 355 360 365 atg gcg titt to a atc. Ctg gga Ctg aga gtg gac 999 att acc citc. gac 152 Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375 38O gac cott Cala tgc gtc 999 aag acc titt cott ggc tto tto gac tac citt 2OO Asp Pro Glin Cys Wall Gly Lys Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO gga cgc citt ttic c cc gaa aag gcg citt acg citc. c cc ggc tag 242 Gly Arg Luell Phe Pro Glu Lys Ala Luell Thir Luell Pro Gly 4 OS 41O

99 244

SEO ID NO 35 LENGTH: 413 TYPE : PRT ORGANISM: Artificial Sequence FEATURE: OTHER INFORMATION: GRG23 (ace3) variant sequence (GRG23 (ace3) I174V)

<4 OOs, SEQUENCE: 35 Met Glu Thir Asp Arg Lieu Val Ile Pro Gly Ser Lys Ser Ile Thr Asn 1. 5 15 Arg Ala Lieu. Lieu. Lieu Ala Ala Ala Ala Lys Gly. Thir Ser Val Lieu Val 25 US 7,834,249 B2 113 114

- Continued

Arg Pro Luell Wall Ser Ala Asp Thir Ser Ala Phe Lys Thir Ala Ile Glin 35 4 O 45

Ala Luell Gly Ala Asn Wall Ser Ala Asp Gly Asp Asp Trp Wall Wall Glu SO 55 6 O

Gly Luell Gly Glin Ala Pro Asn Luell Asp Ala Asp Ile Trp Glu Asp 65 70

Ala Gly Thir Wall Ala Arg Phe Luell Pro Pro Phe Wall Ala Ala Gly Glin 85 90 95

Gly Phe Thir Wall Asp Gly Ser Glu Glin Luell Arg Arg Arg Pro Luell 105 11 O

Arg Pro Wall Wall Asp Gly Ile Arg His Luell Gly Ala Arg Wall Ser Ser 115 12 O 125

Glu Glin Luell Pro Lell Thir Ile Glu Ala Ser Gly Lell Ala Gly Gly Glu 13 O 135 14 O

Tyr Glu Ile Glu Ala His Glin Ser Ser Glin Phe Ala Ser Gly Luell Ile 145 150 155 160

Met Ala Ala Pro Tyr Ala Arg Glin Gly Luell Arg Wall Arg Wall Pro Asn 1.65 17O 17s

Pro Wall Ser Glin Pro Lell Thir Met Thir Luell Arg Met Met Arg Asp 18O 185 19 O

Phe Gly Ile Glu Thir Ser Thir Asp Gly Ala Thir Wall Ser Wall Pro Pro 195

Gly Arg Thir Ala Arg Arg Glu Ile Glu Pro Asp Ala Ser Thir 21 O 215 22O

Ala Ser Phe Ala Ala Ala Ser Ala Wall Ser Gly Arg Arg Phe Glu 225 23 O 235 24 O

Phe Glin Gly Luell Gly Thir Asp Ser Ile Glin Gly Asp Thir Ser Phe Phe 245 250 255

Asn Wall Luell Gly Arg Lell Gly Ala Glu Wall His Trp Ala Ser Asn Ser 26 O 265 27 O

Wall Thir Ile Arg Gly Pro Glu Arg Luell Thir Gly Asp Ile Glu Wall Asp 285

Met Gly Glu Ile Ser Asp Thir Phe Met Thir Luell Ala Ala Ile Ala Pro 29 O 295 3 OO

Lell Ala Asp Gly Pro Ile Thir Ile Thir Asn Ile Gly His Ala Arg Luell 3. OS 310 315

Glu Ser Asp Arg Ile Ser Ala Met Glu Ser Asn Lell Arg Thir Luell 3.25 330 335

Gly Wall Glin Thir Asp Wall Gly His Asp Trp Met Arg Ile Tyr Pro Ser 34 O 345 35. O

Thir Pro His Gly Gly Arg Wall Asn His Arg Asp His Arg Ile Ala 355 360 365

Met Ala Phe Ser Ile Lell Gly Luell Arg Wall Asp Gly Ile Thir Luell Asp 37 O 375 38O

Asp Pro Glin Wall Gly Thir Phe Pro Gly Phe Phe Asp Luell 385 390 395 4 OO

Gly Arg Luell Phe Pro Glu Ala Luell Thir Luell Pro Gly 4 OS 41O US 7,834,249 B2 115 116 That which is claimed: wherein said nucleotide sequence is operably linked to a 1. A recombinant nucleic acid molecule selected from the promoter that drives expression of a coding sequence in a group consisting of plant cell. a) a nucleic acid molecule comprising the nucleotide 15. The plant of claim 14, wherein said plant is a plant cell. sequence of SEQID NO:28; and, 16. The plant of claim 14, wherein said plant is a soybean b) a nucleic acid molecule which encodes a polypeptide plant. comprising the amino acid sequence of SEQID NO:29. 17. The plant of claim 14, wherein said plant is a corn plant. 2. The recombinant nucleic acid molecule of claim 1, 18. The plant of claim 14, wherein said plant is selected wherein said nucleotide sequence is a synthetic sequence that from the group consisting of maize, Sorghum, wheat, Sun has been designed for expression in a plant. 10 flower, tomato, crucifers, peppers, potato, cotton, rice, Soy 3. A vector comprising the nucleic acid molecule of claim bean, Sugarbeet, Sugarcane, tobacco, barley, and oilseed rape. 1. 19. A method for increasing vigor or yield in a plant com 4. The vector of claim 3, further comprising a nucleic acid prising: molecule encoding a heterologous polypeptide. a) providing a plant or seed thereof comprising the nucle 5. A host cell that contains the vector of claim 3. 15 otide sequence set forth in SEQID NO:28: 6. The host cell of claim 5 that is a bacterial host cell. b) contacting said plant with an effective concentration of 7. The host cell of claim 5 that is a plant cell. glyphosate; and, 8. A transgenic plant comprising the host cell of claim 7. c) growing said plant under conditions wherein the tem 9. The plant of claim 8, wherein said plant is selected from perature is higher than ambient environmental tempera the group consisting of maize, Sorghum, wheat, Sunflower, ture for at least two consecutive hours per day for at least tomato, crucifers, peppers, potato, cotton, rice, soybean, Sug four days following contact with said glyphosate, arbeet, Sugarcane, tobacco, barley, and oilseed rape. wherein said days following contact is within the grow 10. A transgenic seed comprising the nucleic acid molecule ing season of the plant, of claim 1. wherein the vigororyield of said plant is higher than the vigor 11. A method for producing a polypeptide with herbicide 25 or yield of a plant expressing a glyphosate tolerance EPSP resistance activity, comprising culturing the host cell of claim synthase that does not have a temperature optimum higher 5 under conditions in which a nucleic acid molecule encoding than ambient environmental temperature. the polypeptide is expressed, said polypeptide being selected 20. The method of claim 19, wherein the temperature in from the group consisting of step (c) is about 32° C. to about 60° C. a) a polypeptide comprising the amino acid sequence of 30 21. A method for conferring resistance to glyphosate in a SEQID NO:29; and, plant comprising: b) a polypeptide encoded by the nucleotide sequence of a) providing a plant or seed thereof comprising the nucle SEQID NO: SEQID NO:28. otide sequence set forth in SEQID NO:28: 12. A method for conferring resistance to an herbicide in a b) contacting said plant with an effective concentration of plant, said method comprising transforming said plant with a 35 glyphosate; and, DNA construct, said construct comprising a promoter that c) growing said plant under conditions wherein the tem drives expression in a plant cell operably linked with a nucle perature is higher than ambient environmental tempera otide sequence selected from the group consisting of SEQID ture for at least two consecutive hours per day for at least NO:28, and regenerating a transformed plant. four days following contact with said glyphosate, 13. The method of claim 12, wherein said herbicide is 40 wherein said days following contact is within the grow glyphosate. ing season of the plant. 14. A plant having stably incorporated into its genome a 22. The method of claim 21, wherein the temperature in DNA construct comprising a nucleotide sequence that step (c) is about 32° C. to about 60° C. encodes a protein having herbicide resistance activity, 23. The method of claim 22, wherein the temperature in wherein said nucleotide sequence is selected from the group 45 step (b) is about 37° C. consisting of: 24. The recombinant nucleic acid sequence of claim 1, a) a nucleic acid molecule comprising the nucleotide wherein said nucleic acid sequence is operably linked to a sequence of SEQID NO:28; and, promoter that drives expression of said nucleic acid sequence b) a nucleic acid molecule which encodes a polypeptide in a plant cell. comprising the amino acid sequence of SEQID NO:29: