US007943340B2

(12) United States Patent
Monod et al.

(10) Patent No.: US 7,943,340 B2
(45) Date of Patent: *May 17, 2011

(54) FUNGAL DIPEPTIDYL PEPTIDASE IV (DPPIV) ENZYME FOR REAGENT USE

(75) Inventors: Michel Monod, Lausanne (CH); Reto Stocklin, Plan-les-Ouates (CH); Eric Grouzmann, La Conversion (CH)

(73) Assignee: Funzyme Biotechnologies SA, Plan-les-Ouates (CH)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 12/217,470

(22) Filed: Jul. 3, 2008

(65) Prior Publication Data
US 2009/0130737 A1 May 21, 2009

Related U.S. Application Data

(62) Division of application No. 10/569,908, filed as application No. PCT/IB2004/002963 on Aug. 25, 2004, now Pat. No. 7,468,267.

(60) Provisional application No. 60/498,318, filed on Aug. 25, 2003.

(51) Int. Cl. C12P 21/06 (2006.01)

(52) U.S. Cl. 435/68.1

(58) Field of Classification Search: None
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

4,522,811 A 6/1985 Eppstein et al.
4,736,866 A 4/1988 Leder et al.
4,855,231 A 8/1989 Stroman et al.
4,857,467 A 8/1989 Sreekrishna et al.
4,870,009 A 9/1989 Evans et al.
4,873,191 A 10/1989 Wagner et al.
4,879,231 A 11/1989 Stroman et al.
4,929,555 A 5/1990 Cregg et al.
5,328,470 A 7/1994 Nabel et al.
5,603,793 A 2/1997 Yoshida et al.
5,811,238 A 9/1998 Stemmer et al.
5,830,721 A 11/1998 Stemmer et al.
5,994,113 A 11/1999 Kauppinen et al.
6,127,161 A 10/2000 Umitsuki et al.
2005/0158298 A1 7/2005 Monod et al.

FOREIGN PATENT DOCUMENTS

WO 95/22625 A1 8/1995
WO 96/33207 A1 10/1996
WO 97/20078 A1 6/1997
WO 97/33957 A1 9/1997
WO 97/35966 A1 10/1997
WO 98/13485 A1 4/1998
WO 98/13487 A1 4/1998
WO 98/27230 A1 6/1998
WO 98/31837 A1 7/1998
WO 98/42832 A1 10/1998

OTHER PUBLICATIONS

Galye et al. Identification of regions in interleukin-1 alpha important for activity. J Biol Chem. Oct. 15, 1993;268(29):22105-11.*
Whisstock et al. Prediction of protein function from protein sequence and structure. Q Rev Biophys. Aug. 2003;36(3):307-40. Review.*
Brenda; definition of dipeptidyl-peptidase IV, downloaded Jun. 18, 2010.*
ExPASy Proteomics Server; definition of dipeptidyl-peptidase IV, downloaded Jun. 18, 2010.
GenBank AC AAC34310, Beauvais, A. et al., "Dipeptidyl-peptidase IV Secreted by Aspergillus fumigatus, a pathogenic to humans," Infect. Immun., vol. 65(8):3042-3047 (Sep. 4, 1998).
GenBank AC CAA05343, Doumas, A. et al., "Characterisation of the prolyl dipeptidyl peptidase-encoding gene (dppV) from the koji mould Aspergillus oryzae," (Mar. 2, 1998).
GenBank AC CAA33512, Roberts, C.J. et al., "Structure, biosynthesis, and localization of dipeptidyl aminopeptidase B, an integral membrane glycoprotein of the yeast vacuole," J. Cell Biol., vol. 108(4):1363-1373 (Feb. 17, 1997).
GenBank AC CAC41019, Jalving, R. et al., "Cloning of an Aspergillus niger vacuolar dipeptidyl peptidase," (Jun. 6, 2001).
GenBank AC NP_593970, Wood, V. et al., "The genome sequence of Schizosaccharomyces pombe," Nature, vol. 415(6874):871-880 (Mar. 5, 2002).
Stemmer, Willem P.C., "DNA shuffling by random fragmentation and reassembly: In vitro recombination for molecular evolution," Proc. Natl. Acad. Sci. USA, vol. 91:10747-10751 (1994).
Stemmer, Willem P.C., "Rapid evolution of a protein in vitro by DNA shuffling," Nature, vol. 370:389-391 (1994).
Stoeckel, Angela et al., "Specific Inhibitors of Aminopeptidase P. Peptides and Pseudopeptides of 2-Hydroxy-3-Amino Acids," Cellular Peptidases in Immune Functions and Diseases, edited by Ansorge and Langner, Plenum Press, New York, Chapt. 5, pp. 31-35 (1997).
Stoecklin, Reto et al., "A new affinity tag-removal system (TRS) in protein expression and peptide synthesis," The First International Congress on Natural Peptides to Drugs (NP2D), pp. 98-99 (2004).
Stoecklin, Reto et al., "Creation of a Spin-off to Industrialise Novel Fungal Enzymes," retrieved online at: http://www.bioalps.ch/Bioalps/FHomePageBioalps.aspx?tokenPage=XDDC744ZODyQwaiuV899fFJMQ9nNMtVGNkBVCVMm8%29%29 (2005).
Vader, Willemijn et al., "The Gluten Response in Children With Celiac Disease Is Directed Toward Multiple Gliadin and Glutenin Peptides," Gastroenterology, vol. 122:1729-1737 (2002).
Vanbreuseghem, R. et al., "The Dermatophytia or Ringworms," Practical Guide to Medical and Veterinary Mycology, 2nd Edition, Masson Publishing USA, Inc., pp. 8-13, 81-150 and 242-249 (1978).
Villain, Matteo et al., "Covalent capture: a new tool for the purification of synthetic and recombinant polypeptides," Chemistry & Biology, vol. 8:673-679 (2001).

(Continued)

Primary Examiner: Sheridan Swope
(74) Attorney, Agent, or Firm: Nelson Mullins Riley & Scarborough LLP; Debra J. Milasincic

(57) ABSTRACT

Disclosed herein are fungal nucleic acid sequences that encode novel dipeptidyl peptidase IV (DPPIV) polypeptides. Also disclosed are polypeptides encoded by these nucleic acid sequences, as well as derivatives, variants, mutants, or fragments of the aforementioned polypeptide, polynucleotide, or antibody. The aminopeptidase polypeptides, referred to herein as DPPIV proteins of the invention, are useful in a variety of medical, research, and commercial applications.

18 Claims, 18 Drawing Sheets

OTHER PUBLICATIONS

Weitzman, Irene et al., "The Dermatophytes," Clinical Microbiology Reviews, vol. 8(2):240-259 (1995).
Woodfolk, Judith A. et al., "Trichophyton Antigens Associated with IgE Antibodies and Delayed Type Hypersensitivity," The Journal of Biological Chemistry, vol. 273(45):29489-29496 (1998).
Yelton, M. Melanie et al., "Transformation of Aspergillus nidulans by using a trpC plasmid," Proc. Natl. Acad. Sci. USA, vol. 81:1470-1474 (1984).
Arentz-Hansen, Helene et al., "Celiac Lesion T Cells Recognize Epitopes That Cluster in Regions of Gliadins Rich in Proline Residues," Gastroenterology, vol. 123:803-809 (2002).
Arkin, Adam P. et al., "An algorithm for protein engineering: Simulations of recursive ensemble mutagenesis," Proc. Natl. Acad. Sci. USA, vol. 89:7811-7815 (1992).
Ausubel, Frederick M., "Hybridization with Radioactive Probes," Current Protocols in Molecular Biology, vol. 1, Section II, John Wiley & Sons, N.Y., pp. 6.3.1-6.3.6 (1989).
Beauvais, Anne et al., "Biochemical and Antigenic Characterization of a New Dipeptidyl-Peptidase Isolated from Aspergillus fumigatus," The Journal of Biological Chemistry, vol. 272(10):6238-6244 (1997).
Beauvais, A. et al., "Dipeptidyl-Peptidase IV Secreted by Aspergillus fumigatus, a Fungus Pathogenic to Humans," Infection and Immunity, vol. 65(8):3042-3047 (1997).
Ben-Meir, Daniella et al., "Specificity of Streptomyces griseus aminopeptidase and modulation of activity by divalent metal ion binding and substitution," Eur. J. Biochem., vol. 212:107-112 (1993).
Borg-von Zepelin, M. et al., "The expression of the secreted aspartyl proteinases Sap4 to Sap6 from Candida albicans in murine macrophages," Molecular Microbiology, vol. 28(3):543-554 (1998).
Brouta, F. et al., "Purification and characterization of a 43.5 kDa keratinolytic metalloprotease from Microsporum canis," Medical Mycology, vol. 39:269-275 (2000).
Brouta, Frederic et al., "Secreted Metalloprotease Gene Family of Microsporum canis," Infection and Immunity, vol. 70(10):5676-5683 (2002).
Chambers, Steve P. et al., "The pMTL nic- cloning vectors. I. Improved pUC polylinker regions to facilitate the use of sonicated DNA for nucleotide sequencing," Gene, vol. 68:139-149 (1988).
Chen, Shu-Hsia et al., "Gene therapy for brain tumors: Regression of experimental gliomas by adenovirus-mediated gene transfer in vivo," Proc. Natl. Acad. Sci. USA, vol. 91:3054-3057 (1994).
Crameri, Andreas et al., "Construction and evolution of antibody phage libraries by DNA shuffling," Nature Medicine, vol. 2(1):100-103 (1996).
Danew, P. et al., "Untersuchungen zur Peptidaseaktivität hautpathogener Pilze," Mykosen, vol. 23(9):502-511 (1980).
De Bersaques, J. et al., "Proteolytic and Leucylnaphthylamidase Activity in some Dermatophytes," Archives Belges de Dermatologie, vol. 28(2):135-140 (1973).
Delagrave, Simon et al., "Recursive ensemble mutagenesis," Protein Engineering, vol. 6(3):327-331 (1993).
Descamps, Frederic et al., "Isolation of a Microsporum canis Gene Family Encoding Three Subtilisin-like Proteases Expressed in vivo," J. Invest. Dermatol., vol. 119:830-835 (2002).
Doumas, Agnes et al., "Characterization of the Prolyl Dipeptidyl Peptidase Gene (dppIV) from the Koji Mold Aspergillus oryzae," Applied and Environmental Microbiology, vol. 64(12):4809-4815 (1998).
Doumas, Agnes et al., "Cloning of the gene encoding neutral protease I of the koji mold Aspergillus oryzae and its expression in Pichia pastoris," J. Food Mycol., vol. 2(1):271-279 (1999).
Ellis, Keith J. et al., "Buffers of Constant Ionic Strength for Studying pH-Dependent Processes," Methods in Enzymology, vol. 87:405-426 (1982).
Greenblatt, H.M. et al., "Streptomyces griseus Aminopeptidase: X-ray Crystallographic Structure at 1.75 Å Resolution," J. Mol. Biol., vol. 265:620-636 (1997).
Grossberger, Dario, "Minipreps of DNA from bacteriophage lambda," Nucleic Acids Research, vol. 15(16):6737 (1987).
Hasselgren, Catrin et al., "Metal ion binding and activation of Streptomyces griseus dinuclear aminopeptidase: cadmium(II) binding as a model," J. Biol. Inorg. Chem., vol. 6:120-127 (2001).
Hausch, Felix et al., "Intestinal digestive resistance of immunodominant gliadin peptides," Am. J. Physiol. Gastrointest. Liver Physiol., vol. 283:G996-G1003 (2002).
Hauser, Melinda et al., "Multiplicity and regulation of genes encoding peptide transporters in Saccharomyces cerevisiae," Molecular Membrane Biology, vol. 18:105-112 (2001).
Ike, Yoshimasa et al., "Solid phase synthesis of polynucleotides. VII. Synthesis of mixed oligodeoxyribonucleotides by the phosphotriester solid phase method," Nucleic Acids Research, vol. 11(2):477-488 (1983).
Itakura, Keiichi et al., "Expression in Escherichia coli of a Chemically Synthesized Gene for the Hormone Somatostatin," Science, vol. 198:1056-1063 (1977).
Itakura, Keiichi et al., "Synthesis and Use of Synthetic Oligonucleotides," Ann. Rev. Biochem., vol. 53:323-356 (1984).
Kwon-Chung, K.J. et al., "Cryptococcosis, Torulosis, European blastomycosis, Busse-Buschke disease," Medical Mycology, Carroll Cann ed., Lea & Febiger, Chpt. 16, pp. 397-446 (1992).
Lin, Lung-Yu et al., "Metal-binding and active-site structure of di-zinc Streptomyces griseus aminopeptidase," JBIC, vol. 2:744-749 (1997).
Lubkowitz, Mark A. et al., "An oligopeptide transport gene from Candida albicans," Microbiology, vol. 143:387-396 (1997).
Luckow, Verne A. et al., "High Level Expression of Nonfused Foreign Genes with Autographa californica Nuclear Polyhedrosis Virus Expression Vectors," Virology, vol. 170:31-39 (1989).
Ma, Julian K-C. et al., "The Production of Recombinant Pharmaceutical Proteins in Plants," Nature, vol. 4(10):794-802 (2003).
McAdam, S. et al., "Getting to grips with gluten," Gut, vol. 47:743-745 (2000).
Mignon, B. et al., "Purification and characterization of a 31.5 kDa keratinolytic subtilisin-like serine protease from Microsporum canis and evidence of its secretion in naturally infected cats," Medical Mycology, vol. 36:395-404 (1998).
Molberg, Oyvind et al., "Tissue transglutaminase selectively modifies gliadin peptides that are recognized by gut-derived T cells in celiac disease," Nature Medicine, vol. 4(6):713-717 (1998).
Monod, Michel et al., "Aminopeptidases and dipeptidyl-peptidases secreted by the dermatophyte Trichophyton rubrum," Microbiology, vol. 151:145-155 (2005).
Monod, Michel et al., "Multiplicity of genes encoding secreted aspartic proteinases in Candida species," Molecular Microbiology, vol. 13(2):357-368 (1994).
Monod, Michel et al., "Secreted proteases from pathogenic fungi," Int. J. Med. Microbiol., vol. 292:405-419 (2002).
Monod, Michel et al., "Survey of Dermatophyte Infections in the Lausanne Area (Switzerland)," Dermatology, vol. 205:201-203 (2002).
Narang, Saran A., "Tetrahedron Report No. 40, DNA Synthesis," Tetrahedron, vol. 39(1):3-22 (1983).
Needleman, Saul B. et al., "A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins," J. Mol. Biol., vol. 48:443-453 (1970).
Nishizawa, Mayumi et al., "Molecular Cloning of the Aminopeptidase Y Gene of Saccharomyces cerevisiae," The Journal of Biological Chemistry, vol. 269(18):13651-13655 (1994).
O'Cuinn, G. et al., "Generation of non-bitter casein hydrolysates by using combination of a proteinase and aminopeptidases," Biochemical Society Transactions, vol. 27(4):730-734 (1999).
Rubio-Aliaga, Isabel et al., "Mammalian peptide transporters as targets for drug delivery," TRENDS in Pharmacological Sciences, vol. 23(9):434-440 (2002).
Schnoelzer, Martina et al., "In situ neutralization in Boc-chemistry solid phase peptide synthesis," Int. J. Peptide Protein Res., vol. 40:180-193 (1992).
Shan, Li et al., "Structural Basis for Gluten Intolerance in Celiac Sprue," Science, vol. 297:2275-2279 (2002).
Shilo, Ben-Zion et al., "DNA sequences homologous to vertebrate oncogenes are conserved in Drosophila melanogaster," Proc. Natl. Acad. Sci. USA, vol. 78(11):6789-6792 (1981).
Smith, Gale E. et al., "Production of Human Beta Interferon in Insect Cells Infected with a Baculovirus Expression Vector," Molecular and Cellular Biology, vol. 3(12):2156-2165 (1983).
Sollid, Ludvig M., "Coeliac Disease: Dissecting a Complex Inflammatory Disorder," Nature Review Immunol., vol. 2(9):647-655 (2002).
Stacey, Gary et al., "Peptide transport in plants," TRENDS in Plant Science, vol. 7(6):257-263 (2002).

* cited by examiner

U.S. Patent May 17, 2011 Sheet 1 of 18 US 7,943,340 B2


Fig. 4. Determination of T. rubrum AMPP activity at different pHs using Lys(Abz)-Pro-Pro-pNA as substrate. (Graph: activity plotted against pH.)

Enzymatic activity of T. rubrum AMPP at different temperatures. (Graph: one curve each for 25, 30, 40, 50, 60, 70 and 80 °C.)

Fig. 6. Digestion of gliadin 14mer without (A) or with (B) ruLAP2 over 4 h at 37 °C with an E/S ratio of 1/50 (w:w). (Panels A and B.)

Fig. 7. Digestion of gliadin 14mer over 4 h with ruDPPIV alone at an E/S ratio of 1/25 (w:w) (A) and with a mixture of ruLAP2 and DPPIV at the same ratio of 1/50 (w:w) (B). (Trace label: initial substrate.)

Digestion of gliadin 33mer without (A) and with (B) ruDPPIV over 4 h at 37 °C with an E/S ratio of 1/50 (w:w).

Fig. 9. Digestion of gliadin 33mer at 37 °C over 4 h with a mixture of ruLAP2 and ruDPPIV at an E/S ratio of 1:50 (w:w) each. (Trace label: initial substrate.)
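The complementary action suggested by the gliadin digestion figures can be sketched as a toy simulation. This is only an illustration under simplifying assumptions that are not stated in this section: a leucine aminopeptidase is taken to remove one N-terminal residue at a time and to stall when the second residue is proline, and DPPIV is taken to remove an N-terminal Xaa-Pro dipeptide. The peptide string and all function names are hypothetical.

```python
def lap_step(peptide):
    """Assumed LAP rule: drop one N-terminal residue unless residue 2 is Pro."""
    if len(peptide) >= 2 and peptide[1] != "P":
        return peptide[1:]
    return None  # enzyme cannot act on this N-terminus


def dppiv_step(peptide):
    """Assumed DPPIV rule: drop an N-terminal Xaa-Pro dipeptide when present."""
    if len(peptide) >= 3 and peptide[1] == "P":
        return peptide[2:]
    return None  # enzyme cannot act on this N-terminus


def digest(peptide, enzymes):
    """Apply the enzymes repeatedly until none of them can act any more."""
    history = [peptide]
    progressed = True
    while progressed:
        progressed = False
        for enzyme in enzymes:
            shorter = enzyme(history[-1])
            if shorter is not None:
                history.append(shorter)
                progressed = True
                break
    return history


# Hypothetical proline-rich test peptide (one-letter amino acid codes).
substrate = "LQLQPFPQPQLPY"
for label, enzymes in [
    ("LAP alone", [lap_step]),
    ("DPPIV alone", [dppiv_step]),
    ("LAP + DPPIV", [lap_step, dppiv_step]),
]:
    print(label, "->", digest(substrate, enzymes)[-1])
```

Under these assumed rules each enzyme alone stalls on the proline-rich peptide, while the mixture shortens it to a single residue. That is one way to read why the figures compare single enzymes with mixtures.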

Fig. 10A. Mass spectrum of Gly-Ser-proNPY (calculated molecular mass 8203.12 Da) before (A) and after (B) digestion with ruLAP2, which results in the cleavage of Gly-Ser (-144.1 Da). Panel A, before digestion: main peak at 8202.88 (component A: 8202.71 ± 0.45); mass scale 6000 to 10000.

Fig. 10B. Gly-Ser-proNPY after digestion with ruLAP2: main peak at 8058.63 (component A: 8058.37 ± 0.58); mass scale 6000 to 10000.

Fig. 11A. Mass spectrum of Ala-proNPY (A, calculated molecular mass 8130.0 amu) and of the digestion product (B) with ruLAP2. Panel A, Ala-proNPY control: main peak at 8129.63 (component A: 8129.52 ± 0.57); mass scale 6000 to 10000.

Fig. 11B. Ala-proNPY after digestion with ruLAP2 (1:20): main peak at 8058.75 (component A: 8058.76 ± 0.25; component B: 8129.67 ± 0.00); mass scale 6000 to 10000.

Fig. 12A. Mass spectrum of (A) TG47 (calculated molecular mass of 18894.9 amu) and of the digestion product with ruLAP2 (B), which results in the removal of the N-terminal methionine. Panel A, TG47 starting substrate: main peak at 18894.13 (component A: 18894.35 ± 0.87); mass scale 17000 to 20000.

Fig. 12B. TG47 after digestion with ruLAP2 (1:20, 15 h): main peak at 18762.00 (component A: 18762.37 ± 0.43); mass scale 17000 to 20000.
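The mass shifts quoted in the captions of Figs. 10 to 12 can be checked with simple arithmetic: the expected product mass is the calculated parent mass minus the residue masses of the amino acids removed. The sketch below assumes standard average residue masses rounded to two decimals (these values are supplied here and are not taken from the patent), and the function name is hypothetical.

```python
# Average residue masses in Da (standard textbook values, rounded;
# an assumption of this sketch, not data from the patent).
AVG_RESIDUE_MASS = {"G": 57.05, "S": 87.08, "A": 71.08, "M": 131.20}


def expected_product_mass(parent_mass, removed_residues):
    """Mass remaining after the listed N-terminal residues are cleaved off."""
    lost = sum(AVG_RESIDUE_MASS[r] for r in removed_residues)
    return parent_mass - lost


# (substrate, calculated parent mass from the captions, residues removed)
cases = [
    ("Gly-Ser-proNPY", 8203.12, "GS"),
    ("Ala-proNPY", 8130.0, "A"),
    ("TG47", 18894.9, "M"),
]
for name, calc_mass, removed in cases:
    product = expected_product_mass(calc_mass, removed)
    print(f"{name}: expected product mass {product:.2f} Da")
```

For Gly-Ser the summed residue mass is about 144.1 Da, matching the shift stated in the Fig. 10 caption. The expected product masses fall within roughly 1 to 2 Da of the main peaks printed on the B panels.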

Fig. 13A. Characterisation by ESI-MS of the digestion product of desMet-G-CSF without (A) or with (B) ruDPPIV. Panel A, des(Met)-G-CSF control: main peak at 18761.38 (component A: 18762.46 ± 0.83); mass scale 17000 to 20000.

Fig. 13B. des(Met)-G-CSF after digestion with DPPIV (1/20): main peak at 18563.38 (component A: 18563.82 ± 0.96); mass scale 17000 to 20000.
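The shift between the main peaks of Figs. 13A and 13B (18761.38 and 18563.38) can be compared with the masses of candidate Xaa-Pro dipeptides. The sketch below is a hedged consistency check only: the residue masses are standard average values supplied here, the tolerance and function name are hypothetical, and this section does not itself state which dipeptide is removed.

```python
# Average residue masses in Da for the 20 standard amino acids
# (standard values rounded to two decimals; not taken from the patent).
AVG_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.11, "M": 131.20,
    "H": 137.14, "F": 147.17, "R": 156.19, "Y": 163.17, "W": 186.21,
}


def xaa_pro_candidates(observed_shift, tolerance=1.0):
    """Return Xaa-Pro dipeptides whose mass is within tolerance of the shift."""
    pro = AVG_RESIDUE_MASS["P"]
    matches = []
    for residue, mass in AVG_RESIDUE_MASS.items():
        if abs(mass + pro - observed_shift) <= tolerance:
            matches.append(residue + "P")
    return sorted(matches)


# Main peaks read from the two panels of Fig. 13.
shift = 18761.38 - 18563.38
print(f"observed shift: {shift:.2f} Da")
print("Xaa-Pro dipeptides within 1 Da:", xaa_pro_candidates(shift))
```

With a 1 Da tolerance only Thr-Pro matches. This is a mass-based consistency check, not an identification, and a wider tolerance would admit further candidates.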

FUNGAL DIPEPTIDYL PEPTIDASE IV (DPPIV) ENZYME FOR REAGENT USE

RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 10/569,908, filed Sep. 19, 2006, now issued as U.S. Pat. No. 7,468,267, which is a 35 U.S.C. §371 filing of international application number PCT/IB/2004/002963, which was filed Aug. 25, 2004. PCT/IB/2004/002963 claims priority to U.S. Provisional Application No. 60/498,318, filed on Aug. 25, 2003. The contents of the aforementioned applications are hereby incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to novel polypeptides, and the nucleic acids encoding them, having unique catalytic properties. More particularly, the invention relates to nucleic acids encoding novel leucine aminopeptidase (LAP) and other amino- and carboxy-peptidase polypeptides, which will be herein collectively referred to as EXOX, as well as vectors, host cells, antibodies, and recombinant methods for producing these nucleic acids and polypeptides. These genes have been identified in two different fungal species, Trichophyton rubrum and Aspergillus fumigatus.

BACKGROUND OF THE INVENTION

Bacteria, yeast and filamentous fungi, as well as specialized cells of plants, invertebrates and vertebrates, express membrane proteins useful for the uptake of amino acids, dipeptides and tripeptides. Lubkowitz et al., Microbiology 143:387-396 (1997); Hauser et al., Mol. Membr. Biol. 18(1):105-112 (2001); Stacey et al., Trends Plant Sci. 7(6):257-263 (2002); Rubio-Aliaga & Daniel, Trends Pharmacol. Sci. 23(9):434-440 (2002). Transporters that also accept larger oligopeptides (4-5 amino acid residues) are known in yeast, filamentous fungi and plants. Protein digestion into amino acids has been investigated in microorganisms used in the food fermentation industry. Bacteria of the genus Lactobacillus (O'Cuinn et al., Biochem. Soc. Trans. 27(4):730-734 (1999)) and fungi of the genus Aspergillus (Doumas et al., Appl. Environ. Microbiol. 64:4809-4815 (1998)) secrete endoproteases and exoproteases, which cooperate very efficiently in protein digestion.

Aminopeptidase activity, which may also play a role in the development of fungus during infection, has been detected in the mycelium and culture supernatant of a species of fungi (De Bersaques & Dockx, Arch. Belg. Dermatol. Syphiligr. 29:135-140 (1973); Danew & Friedrich, Mykosen 23:502-511 (1980)); however, no aminopeptidase or carboxypeptidase has been isolated and characterized from dermatophytes to date.

SUMMARY OF THE INVENTION

The invention is based in part upon the discovery of isolated polypeptides containing the mature form of an amino acid sequence selected from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. The invention also provides isolated polypeptides containing an amino acid sequence selected from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35, as well as isolated polypeptides that are at least 90% identical to polypeptides having these sequences, wherein the polypeptide optionally has aminopeptidase or carboxypeptidase activity. For example, the polypeptide may be a leucine aminopeptidase such as ruLAP2.

Also provided are isolated polypeptides having one or more conservative amino acid substitutions. Such polypeptides may possess aminopeptidase activity.

The invention also encompasses polypeptides that are naturally occurring allelic variants of the sequence selected from the group consisting of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. These allelic variants include amino acid sequences that are the translations of nucleic acid sequences differing by one or more nucleotides from nucleic acid sequences selected from the group consisting of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. In the variant polypeptide, any amino acid changed in the chosen sequence is changed to provide a conservative substitution.

The invention also involves a method of removing particular amino acids from peptides, for instance tags from recombinant proteins, wherein the active polypeptide removing the amino acids is a polypeptide having an amino acid sequence at least 90% identical to a polypeptide having the amino acid sequence selected from the group consisting of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35, or a biologically active fragment thereof.

Any of the polypeptides of the invention may be naturally occurring. Further, any of these polypeptides can be in a composition including a carrier, and the composition can be in a kit including one or more containers.

Also provided are dermatophytes containing the polypeptides of the invention. For example, suitable dermatophytes include Epidermophyton floccosum, Microsporum audouinii, Microsporum ferrugineum, Trichophyton concentricum, Trichophyton kanei, Trichophyton megninii, Trichophyton mentagrophytes, Trichophyton raubitschekii, Trichophyton rubrum, Trichophyton schoenleinii, Trichophyton soudanense, Trichophyton tonsurans, Trichophyton violaceum, Trichophyton yaoundei, Microsporum canis, Microsporum equinum, Microsporum nanum, Microsporum persicolor, Trichophyton equinum, Trichophyton simii, Trichophyton verrucosum, Microsporum gypseum, Trichophyton ajelloi, and Trichophyton terrestre.

The invention also provides microbial culture supernatants containing the polypeptides of the invention.

The invention also relates to the use of therapeutics in the manufacture of a medicament for treating a syndrome associated with a human disease, where the therapeutic includes the polypeptides of the invention and the disease is selected from a pathology associated with these polypeptides.

The invention also relates to methods of degrading a polypeptide substrate. Such methods include contacting the polypeptide substrate with one or more of the polypeptides, which have been isolated. For example, the polypeptide substrate can be a full-length protein. Further, the one or more isolated polypeptides can be used to sequentially digest the polypeptide substrate. The polypeptide substrate can be selected from denatured casein, gliadin, gluten, bovine serum albumin or fragments thereof. For example, the isolated polypeptide can be an aminopeptidase, which can be a leucine aminopeptidase such as ruLAP2.

The invention further relates to methods for identifying a potential therapeutic agent for use in treatment of fungal infections, wherein the fungal infection is related to aberrant expression or aberrant physiological interactions of the polypeptides of the invention. Such methods include providing a cell expressing the polypeptide and having a property or function ascribable to the polypeptide, contacting the cell with a composition comprising a candidate substance, and determining whether the substance alters the property or function ascribable to the polypeptide. If no alteration is observed in the presence of the substance when the cell is contacted with a composition in the absence of the substance, the substance is identified as a potential therapeutic agent. For example, the property or function ascribable to the polypeptide can be aminopeptidase or carboxypeptidase activity.

The invention further relates to methods of treating a pathological state in a mammal by administering a polypeptide to the mammal in an amount that is sufficient to alleviate the pathological state. Typically, the polypeptide has an amino acid sequence at least 90% identical to a polypeptide containing the amino acid sequence selected from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35, or a biologically active fragment thereof. The pathological state to be treated includes a fungal infection, celiac disease, digestive tract malabsorption, sprue, an allergic reaction and an enzyme deficiency. For example, the allergic reaction can be a reaction to gluten.

The invention additionally relates to methods of treating a pathological state in a mammal by administering a protease inhibitor to the mammal in an amount that is sufficient to alleviate the pathological state. The protease inhibitor includes an amino acid sequence at least 90% identical to a polypeptide having the amino acid sequence selected from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35, or a biologically active fragment thereof. For example, the pathological state can be a fungal infection.

The invention further relates to isolated polypeptides having an amino acid sequence selected from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. These polypeptides can be produced by culturing a cell under conditions that lead to expression of the polypeptide. In some embodiments, the cell includes a vector containing an isolated nucleic acid molecule having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34. Optionally, the cell may be a fungal cell, a bacterial cell, an insect cell (with or without a baculovirus), a plant cell and a mammalian cell.

The invention also provides isolated nucleic acid molecules containing a nucleic acid sequence selected from SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34. For example, such nucleic acid molecules can be naturally occurring.

The invention also relates to nucleic acid molecules that differ by a single nucleotide from a nucleic acid sequence selected from SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34, as well as to isolated nucleic acid molecules encoding the mature form of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. Further, the nucleic acid molecules can be ones that hybridize under stringent conditions to the nucleotide sequence selected from the group consisting of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34 or a complement of that nucleotide sequence. In some embodiments, the nucleic acid molecules can be included in a vector that further includes a promoter operably linked to said nucleic acid molecule. Also provided are cells that include the vector.

The invention also provides methods of producing polypeptides of the invention. The methods include culturing a cell under conditions that lead to expression of the polypeptide, and the cell includes a vector having an isolated nucleic acid molecule containing a nucleic acid sequence selected from SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34. In some instances, the cell is selected from a fungal cell, a bacterial cell, an insect cell, a plant cell or mammalian cell.

The invention also relates to methods for producing a protein by culturing a dermatophyte containing the polypeptide under conditions sufficient for the production of the protein and isolating the protein from the dermatophyte culture. For example, the protein can be a secreted protein. Likewise, the protein can also be an aminopeptidase or a carboxypeptidase. Specifically, the aminopeptidase can be a leucine aminopeptidase, such as ruLAP2. Additionally, the dermatophyte can be selected from Epidermophyton floccosum, Microsporum audouinii, Microsporum ferrugineum, Trichophyton concentricum, Trichophyton kanei, Trichophyton megninii, Trichophyton mentagrophytes, Trichophyton raubitschekii, Trichophyton rubrum, Trichophyton schoenleinii, Trichophyton soudanense, Trichophyton tonsurans, Trichophyton violaceum, Trichophyton yaoundei, Microsporum canis, Microsporum equinum, Microsporum nanum, Microsporum persicolor, Trichophyton equinum, Trichophyton mentagrophytes, Trichophyton simii, Trichophyton verrucosum, Microsporum gypseum, Trichophyton ajelloi, and Trichophyton terrestre.

The produced proteins can be applied to polypeptide substrates. In some instances, the produced protein can degrade the polypeptide or can sequentially digest a full-length polypeptide substrate. Optionally, the polypeptide substrate length can be from 2 to 200 amino acids.

In some instances, the produced protein adds one or more amino acids to the polypeptide substrate. In other instances, the produced protein removes one or more amino acids from the polypeptide substrate to form a modified polypeptide substrate, and the produced protein subsequently adds one or more amino acids to the modified polypeptide substrate, thereby forming a polypeptide product comprising a different amino acid sequence than the polypeptide substrate.

The invention also provides methods for treating mycoses in a patient suffering therefrom. Such methods include administering an effective amount of an inhibitor with the activity of an EXOX protein selected from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. For example, the EXOX protein can include SEQ ID NO: 2.

The invention further provides methods of degrading a polypeptide substrate. These methods include contacting the polypeptide substrate with one or more of the isolated polypeptides of the invention. Optionally, the polypeptide substrate is a full-length protein, and the one or more isolated polypeptides can be polypeptides that sequentially digest the polypeptide substrate. The polypeptide substrate can be selected from denatured casein, gliadin, gluten, bovine serum albumin or fragments thereof. Further, in some instances, the isolated polypeptide is an aminopeptidase. The aminopeptidase can be a leucine aminopeptidase, such as ruLAP2.

Additionally, the method optionally includes contacting the polypeptide substrate with one or more proteases. In some instances, the proteases are selected from trypsin, pronase, chymotrypsin, and proteinase K.

The invention further provides methods of removing amino acids from the amino terminus of a protein. The methods include contacting the protein with one or more of the isolated polypeptides of the invention. In some instances, the amino terminus of a protein includes a His tag. In other instances the amino terminus of a protein includes an Xaa-Pro tag. Optionally, Xaa is an amino acid including at least two vicinal nucleophilic groups, with examples including serine, threonine or cysteine.

The invention further provides isolated polypeptides of the invention that can have reverse proteolytic activity.

The invention further provides methods of adding one or more amino acids to a polypeptide substrate. The method includes contacting the polypeptide substrate with one or more of the isolated polypeptides of the invention.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be limiting.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a photograph of a Western blot of T. rubrum supernatant preparation probed with anti-A. oryzae Alp (Panel A, left) and Mep antisera (Panel C, right). Panel B shows a 10% SDS-PAGE gel stained with Coomassie blue. In lane 1, the proteins of 0.25 ml of T. rubrum culture supernatant were precipitated with TCA before loading on the SDS-PAGE gel. 0.2 μg of purified recombinant A. oryzae ALP and MEP were loaded on lane 2 and lane 3, respectively. The molecular masses of protein standards are shown in the left margin.

FIG. 2 is a photograph of an SDS-PAGE gel illustrating a protein profile of recombinant ruLAP2 (1, 2), fuLAP2 (3, 4), ruLAP1 (5, 6) and fuLAP1 (7, 8) produced in P. pastoris. 1 μg of each purified recombinant LAP was loaded on a 10% SDS-PAGE gel. Lanes 2, 4, 6 and 8 show the proteins deglycosylated by N-glycosidase F treatment. The gel was stained with Coomassie brilliant blue R-250.

FIG. 3 is a photograph of a Western blot of T. rubrum culture supernatant and recombinant LAPs used as controls probed with anti-ruLAP2 (lanes 1-4) and anti-ruLAP1 antisera (lanes 5-8). In lanes 1, 2, 5 and 6 the proteins of 0.25 ml of T. rubrum culture supernatant were precipitated with TCA before loading on the SDS-PAGE gel. 0.1 μg of purified recombinant ruLAP2 (lanes 3, 4) and ruLAP1 (lanes 7, 8) was loaded as a control. N-glycosidase F was used for deglycosylation of proteins. The molecular masses of protein standards are shown in the left margin.

FIG. 4 is a graph of the enzymatic activity of T. rubrum

FIGS. 12A and 12B are mass spectra of TG47 (A) before and (B) after digestion with ruLAP2.

FIGS. 13A and 13B are mass spectra of desMet-G-CSF (A) before and (B) after digestion with DPPIV.

FIG. 14 is an alignment of deduced amino acid sequences of aminopeptidases of the M28E subfamily, including ruLAP1 (SEQ ID NO: 6), fuLAP1 (SEQ ID NO: 12), orLAP1 (SEQ ID NO: 80), AbispLAP1 (SEQ ID NO: 81), and VibrioLap (SEQ ID NO: 82).

FIG. 15 is an alignment of deduced amino acid sequences of aminopeptidases of the M28A subfamily, including ruLAP2 (SEQ ID NO: 3), fuLAP2 (SEQ ID NO: 9), orLAP2 (SEQ ID NO: 85), and ScerY (SEQ ID NO: 86).

DETAILED DESCRIPTION OF THE INVENTION

As used herein, the term protease is synonymous with peptidase, proteolytic enzyme and peptide hydrolase. The proteases include all enzymes that catalyse the cleavage of the peptide bonds (CO-NH) of proteins, digesting these proteins into peptides or free amino acids. Exopeptidases act near the ends of polypeptide chains at the amino (N) or carboxy (C) terminus. Those acting at a free N terminus liberate a single amino acid residue and are termed aminopeptidases. A large variety of highly specific proteases are involved in a number of different biological and physiological processes. Thus, these represent targets of choice for new drug applications as well as for controlled peptidic and/or proteic degradations.

Dermatophytes are human and animal pathogenic fungi, which cause cutaneous infections. Vanbreuseghem et al., GUIDE PRATIQUE DE MYCOLOGIE MEDICALE ET VETERINAIRE (1978); Kwon-Chung & Bennett, MEDICAL MYCOLOGY (1992); Weitzman & Summerbell, Clin. Microbiol. Rev. 8:240-259 (1995). Examples of dermatophytes include, but are not limited to, T. ajelloi, A. uncinatum, K. ajelloi, T. asteroides, T. mentagrophytes, T. concentricum, T. cruris, E. floccosum, T. dankalienese, G. dankaliensis, T. equinum, T. equinum var. autotrophicum, T. equinum var. equinum, T. erinacei, T. fischeri, T. flavescens, T. floccosum, E. floccosum, T. gloriae, T. gourvili, T. granulare, T. granulosum, T. gypseum, T. inguinale, T. interdigitale, T. intertriginis, T. kanei, T. krajdenii, T. longifusum, T. megninii, A. quinckianum, A. benhamiae, A. vanbreuseghemii, T. pedis, T. proliferans, T. quickaneum, T. radiolatum, T. mentrophytes var.
erinacei, T. mentagrophytes AMPP (aminopeptidase P) at various pH values. It appears var. interdigitale, T. mentagrophytes var: mentagrophytes, T. that AMPP has activity over abroad range of pH values, from mentagrophytes var: nodulare, T. mentagrophytes var. pH 6 to 11. quininckeanum, T. niveum, T. nodulare, T. persicolor; M. per FIG. 5 is a graph of the enzymatic activity of T. rubrum sicolor, T. phaseolforme, T. proliferans, T. purpureum, T. AMPP at various temperatures. The enzyme exhibits activity 50 quinckeanum, T. radiolatum, T. raubitschekii, T. rubrum, S. at temperatures ranging from 25 to 60 C with an optimal ruber, T. Schoenleinii, T. Simii, A. Simii, T. Soudanense, T. temperature of 50 C. sulphureum, T. tonsurans, A. insingulare, A. lenticularum, A. FIG. 6 is a graph showing the digestion of gliadin 14 mer quadrifidum, T. tonsurans, T. Sulphureum, T. terrestre, T. ton (A) without rul AP2 or (B) with rul AP2 over 4 hat 37° C. surans var. Sulphureum, T. tonsurans var tonsurans subvar. with an E/S ratio (w:w) of 1/50. 55 perforans, T. vanbreuseghemii, T verrucosum, T. violaceum, FIG. 7 is a graph showing the digestion of gliadin 14 mer T. yaoundei, E. floccosum, M. audouinii, M. ferrugineum, T. (A) with ruldPPIV alone and (B) with a rulDPPIV/ruIAP2 kanei, T. megninii, T. mentragrophytes, T. raubitschekii, T. cocktail. Schoenleinii, T. Soudanese, T. violaceum, M. canis, M. equi FIG. 8 is a graph showing the digestion of gliadin 33 mer num, M. nanum, M. persicolor, T. verrucosum, and M. gyp with ruDPPIV over 4 hat 37° C. with an E/S ratio (w:w) of 60 seum. Among the pathogenic species isolated in hospitals and 1 FSO. private practices in Europe, Trichophyton rubrum, T. menta FIG. 9 is a graph showing the digestion of gliadin 33 mer grophytes and Microsporum canis are most commonly with a DPPIV/ruIAP2 cocktail. observed. Monod et al., Dermatology, 205:201-203 (2002). FIGS. 
10A and 10B are mass spectrum of Gly-Ser-proNPY In fact, dermatophytes can grow exclusively in the stratum (A) before and (B) after digestion with rul AP2. 65 corneum, nails or hair, and digest components of the cornified FIGS. 11A and 11B are mass spectra of Ala-proNPY (A) cell envelope. To date, all investigated dermatophytes pro before and (B) after digestion with rul AP2. duce proteolytic activity in vitro and many investigators US 7,943,340 B2 7 8 report the isolation and characterization of one or two viruses, and degrading toxic or contaminant proteins; (V) and, secreted endoproteases from an individual species. For a since rul AP2 and/or other proteases secreted by the fungi is review, see Monodet al., Int. J. Med. Microbiol. 292:405-419 necessary for dermatophytes to grow on the cornified Sub (2002). In particular, M. canis was shown to possess two gene strate of the nail, inhibitors of rul AP2 and/or other proteases families encoding endoproteases of the S8 (subtilisins) and secreted by the fungi would be a new method of treatment for M36 (fungalysins) family as classified in the MEROPS pro mycoses. teolytic enzyme database (at Merops at The Sanger Institute, This invention provides novel fungal nucleic acids and UK). Brouta et al., Infect. Immun. 70:5676-5683 (2002); proteins, which have leucine aminopeptidase activity. LAPs Descamps et al., J. Invest. Dermatol. 70:830-835 (2002). One play a role in diverse functions including, but not limited to member of each isolated M. canis gene family encoded one of 10 blood clotting, controlled cell death, tissue differentiation, the two previously characterized endoproteases from culture tumor invasion, and in the infection cycle of a number of supernatants. Mignon et al., Med. Mycol. 36:395-404 (1998); pathogenic microorganisms and viruses making these Brouta et al., Med. Mycol. 39:269-275 (2001). 
Both enzymes enzymes a valuable target and a powerful tool for new phar were shown to be keratinolytic and produced during infection maceuticals. Besides having a function in physiology, ami in cats. Mignon et al., Med. Mycol. 36:395-404 (1998); 15 nopeptidases also have commercial applications, mainly in Brouta et al., Med. Mycol. 39:269-275 (2001). This pro the detergent and food industries. Microorganisms, such as teolytic activity enables dermatophytes to grow exclusively in fungi, are an excellent source of these enzymes due to their the stratum corneum, nails or hair, and to use digested com broad biochemical diversity and their susceptibility to genetic ponents of the cornified cell envelope, i.e., singleamino acids manipulation. Microorganisms degrade proteins and utilize or short peptides, as nutrients for in vivo growing. the degradation products as nutrients for their growth. Thus, Two new leucine aminopeptidases (LAP) from the der the novel LAPs identified herein are useful in a multitude of matophyte T. rubrum, rul AP1 and rul AP2 are described industrial applications including but not limited to hydrolysis herein. T. rubrum is a species of the genus Trichophyton, of proteins in the food industry, degradation of by-products which includes, e.g., T. aielloi, T. asteroides, T. mentagro (e.g., feathers); degradation of prions; degradation of proteins phytes, T. concentricum, T. cruris, T. dankalienese, T. equi 25 for proteomics; hydrolysis of polypeptides for amino acid num, T. equinum var. autotrophicum, T. equinum var. equi analysis; wound cleaning (e.g., attacking the dead tissue); num, T. erinacei, T. fischeri, T. flavescens, T. floccosum, T. prothesis cleaning and/or preparation; fabric Softeners; Soaps; gloriae, T. gourvili, T. granulare, T. granulosum, T. gypseum, cleaning or disinfection of septic tanks or any container (Such T. inguinale, T. interdigitale, T. intertriginis, T. kanei, T. kra as Vats of retention, bottles, etc.) 
containing proteins that idenii, T. long fusum, T. megnini, T. pedis, T. proliferans, T. 30 should be removed or sterilized; and cleaning of surgical quickaneum, T. radiolatum, T. mentrophytes var. erinacei, T. instruments. mentagrophytes var. interdigitale, T. mentagrophytes var. This invention provides novel enzymes and enzyme cock mentagrophytes, T. mentagrophytes var. nodulare, T. menta tails, i.e. a mixture of more than one enzyme that digest grophytes var. quininckeanum, T. niveum, T. nodulare, T. per insoluble protein structures, such as the cornified cell enve sicolor, T. phaseolforme, T. proliferans, T. purpureum, T. 35 lope into short peptides and free amino acids. In fact, in quinckeanum, T. radiolatum, Traubitschekii, T. Schoenleinii, addition to endoproteases of the S8 and M36 family, T. T. Simi, T. Soudanense, T. Sulphureum, T. tonsurans, T. Sul rubrum secretes two LAPs each with different substrate activ phureum, T. terrestre, T. tonsurans var. Sulphureum, T. ton ity. RuLAP1 and rul AP2 each belong to the same family of surans Var tonsurans Subvar. perforans, T. Vanbreuseghemii, LAPs (MEROPS>M28). The properties of both LAPs were T verrucosum, T. violaceum, T. yaoundei, T. kanei, T. 40 compared to those of the secreted enzymes encoded by the raubitschekii, T. Soudanese. The properties of both LAPs orthologue genes of the opportunistic fungus A. fumigatus, were compared to those of the secreted enzymes encoded by ful AP1 and ful AP2, and the commercially available the orthologue genes of the opportunistic fungus Aspergillus microsomal LAP from porcine kidney (pkLAP) filmigatus, fuIAP1 andfulLAP2, and the commercially avail (MEROPS>M1 family). All of these enzymes exhibit leucine able microsomal LAP from porcine kidney (pkLAP) 45 aminopeptidase activity. Furthermore, rul AP2 has an origi (MEROPS>M1 family). All of these enzymes exhibit a leu nal primary structure and is unique in that it is able, in the cine aminopeptidase activity. Also; the A. 
fumigatus, ami presence of ruDPPIV, to sequentially digest a polypeptide nopepeptidases fuIAP1 and fuIAP2 display about 70% chain, Such as a fragment of gliadin known to be resistant to amino acid identity with the A. oryzae orthologues reported in other proteases. Partially purified rul AP2 is also able, in the U.S. Pat. Nos. 6,127,161 and 5,994,113, which are incorpo 50 presence of a trypsin-like endoprotease originating from the rated herein by reference. Furthermore, rul AP2 appears to be P. pastoris expression system, to sequentially digest a full unique because (i) rul AP1 and rul AP2 display about 50% length polypeptide chain, Such as denatured casein. amino acid identity with the A. fumigatus orthologues The invention is based, in part, upon the isolation of novel ful AP1 and fuIAP2 and with the A. oryzae orthologues nucleic acid sequences that encode novel polypeptides. The reported U.S. Pat. Nos. 6,127,161 and 5,994,113; (ii) a cock 55 novel nucleic acids and their encoded polypeptides are tail of rul AP2 and a trypsin-like endoprotease originating referred to individually as rul AP1, rul AP2, fulLAP1 and from the P pastoris expression system. Sequentially digests a ful AP2. The nucleic acids, and their encoded polypeptides, full length polypeptide chain Such as denatured casein; (iii) a are collectively designated herein as “EXOX'. cocktail of rul AP2 and ruDPPIV (another exoprotease of T. The novel EXOX nucleic acids of the invention include the rubrum) degrades afragment of gliadin known to be resistant 60 nucleic acids whose sequences are provided in Tables 1A, 1B, to protease action, thereby providing evidence that rul AP2 2A, 2B,3A, 3B, 4A, 4B, 5A, 5B, 6A, 6B, 7A, 7B, 8A, 8B, 9A, alone or in combination with ruDPPIV could be used for the 9B, 10A, 10B, 11A, 11B, and 12A, or a fragment, derivative, treatment of celiac disease or any disease of the digestive tract analog or homolog thereof. 
The novel EXOX proteins of the such as malabsorption; (iv) rul AP2 in combination with invention include the protein fragments whose sequences are other proteases (cocktails) is useful in the food industry. Such 65 provided in Tables 1C, 2C, 3C, 4C, 5C, 6C, 7C, 8C,9C, 10C, as degrading Substrates for bitterness, theves degradation, 11C, and 12B. The individual EXOX nucleic acids and pro treatment of meat, Soap industry, degrading prions, degrading teins are described below. US 7,943,340 B2 10 Also, within the scope of this invention is a methodofusing is caused by wheat gluten. Gluten is the characteristic term for protease inhibitors in the treatment or prevention of a fungal the protein mixture of glutelins and gliadins (prolamines) infection and/or opportunistic infection due to fungi, yeast found in cereals. Due to its inherent physicochemical prop cells and/or bacteria. erties such as acting as a binding and extending agent, gluten Using a reverse genetic approach, two aminopeptidases is commonly used as an additive in food. Detection of gluten secreted by T. rubrum have been characterized in comparison is important in the quality control and selection of food for with orthologues from A. fumigatus and the microsomalami individuals with diseases related to or caused by gluten intol nopeptidase pkLAP from porcine kidney. The four fungal erance including, gluten intolerance enteropathy, celiac dis enzymes identified herein (rul AP1, fulLAP1, rul AP2 and ease, sprue and related allergic reactions, where a diet free ful AP2) as well as pkLAP share a common preference for 10 from the gluten contained in wheat, rye barley, and in some Leu-AMC as a Substrate, and function as leucine aminopep cases oat is necessary. tidases. In addition, the aminopeptidase pkLAP, which acts Exoprotease Nucleic Acids and Polypeptides also with an extremely high efficiency towards Ala-AMC, is T. 
rubrum aminopeptidase activity demonstrated here and also called alanine aminopeptidase (MEROPSDM1.001). previous studies on Subtilisins and metalloproteases Secreted The EXOX nucleic acids of the invention, encoding EXOX 15 by M. canis show that dermatophytes secrete a battery of proteins, include the nucleic acids whose sequences are pro proteases similar to those of the Aspergillus species in a vided herein or fragments thereof. The invention also medium containing protein as sole carbon and nitrogen includes mutant or variant nucleic acids any of whose bases source. Moreover, two genes, ruDPPIV and ruDPPV: EMBL may be changed from the corresponding base shown herein, AF082514 for rul PPV, coding for dipeptidyl-aminopepti while still encoding a protein that maintains its EXOX-like dases highly similar to DPPIV and DPPV from both A. fumi activities and physiological functions, or a fragment of such a gatus and A. oryzae (Beauvais et al., J. Biol. Chem. 272:6238 nucleic acid. The invention further includes nucleic acids 6244 (1997); Beauvais et al., Infec. Immun. 65:3042-3047 whose sequences are complementary to those described (1997); Doumas et al., Appl. Environ. Microbiol. 64:4809 herein, including nucleic acid fragments that are complemen 4815 (1998); Doumas et al., J. Food Mycol. 2:271-279 tary to any of the nucleic acids just described. The invention 25 (1999)) were isolated from genomic and cDNA libraries of T. additionally includes nucleic acids or nucleic acid fragments, rubrum. The intron-exon structures of the T. rubrum genes or complements thereto, whose structures include chemical encoding these proteases are similar to the homologous genes modifications. Such modifications include, by way of non isolated from A. fumigatus and A. Oryzae. These results are limiting example, modified bases and nucleic acids whose not Surprising since the teleomorphs of Aspergillus species Sugar phosphate backbones are modified or derivatized. 
30 and the teleomorphs of dermatophyte species are closely These modifications are carried out at least in part to enhance related, as they belong to the same taxonomic group of Asco the chemical stability of the modified nucleic acid, such that mycetes producing prototunicate asci in cleistothecia (class they may be used, for example, as antisense binding nucleic ). In contrast to the genes encoding Subtilisins acids in therapeutic applications in a Subject. and fungalysins, rul AP1 and rul AP2 are not members of The EXOX proteins of the invention include the EXO 35 large gene families in the T. rubrum genome. proteins whose sequences are provided herein. The invention RuLAP1 displays about 50% amino acid identity with also includes mutant or variant proteins any of whose residues fulLAP1 and/or LAP1 (See Tables 19A and FIG. 14. These may be changed from the corresponding residue shown three enzymes structurally belong to the same Subfamily herein, while still encoding a protein that maintains its EXO M28E as Aeromonas and Vibrio leucyl aminopeptidases like activities and physiological functions, or a functional 40 (MEROPS>M28.002). In addition, rul AP2 displays about fragment thereof. The invention further encompasses anti 50% amino acid identity with fulLAP2 and/or LAP2 (See bodies and antibody fragments, such as F, or (F), that Tables 19B and FIG. 15). These three enzymes structurally bind immunospecifically to any of the proteins of the inven belong to the same subfamily M28A as the vacuolar protease tion. Y of S. cerevisiae (MEROPS>M28.001) and the Streptomy EXOX nucleic acids and proteins are useful in potential 45 ces griseus secreted aminopeptidase (MEROPS-M28.00X). therapeutic applications such as the treatment of fungal infec In addition, the members of the M28A and M28E subfamilies tions. The EXOX nucleic acids, proteins and inhibitors also share low similarities. 
However, the amino acids of the two have other functions that include but are not limited to: (i) Zn" binding sites in these aminopeptidases are conserved biotechnology reagent for improved protein production, e.g., and were identified in the fungal LAPs characterized herein tag removal, production of rare amino acids; (ii) drug devel 50 (See Tables 20 and 21). In S griseus and Aeromonas pro opment for certain disease indications, e.g., celiac disease teolytica secreted aminopeptidases, the two amino acid resi (gluten intolerance); (iii) drug development for dermatologi dues His and Asp bind a first Zn" ion and two additional cal conditions, e.g., anti-mycosis agents, wart treatment, residues His and Glu bind a second Zn" ion, while a second wound healing; (iv) cosmetology, e.g., with peeling tools, Asp residue bridges the two Zn" ions. Greenblatt et al., J. depilation, dermabrasion and dermaplaning, (v) food indus 55 Mol. Biol. 265:620-636 (1997); Hasselgren et al., J. Biol. try, e.g., production of nutrition Supplements, Sweetners, gen Inorg. Chem. 6:120-127 (2001). Substitution of Zn" by dif erating hypoallergenic foods by predigestion; (vi) disinfect ferent divalent ions in S. griseus secreted aminopeptidase is ing agent, e.g., decontaminating protein-based contaminants affected by Ca" and has variable effects. Ben-Meir et al., Eur. Such as prions or viruses (by digesting coat protein), cleaning J. Biochem 212:107-112 (1993); Lin et al., J. Biol. Inorg. Surgery instruments or preparing items for Surgery Such as 60 Chem. 2:744-749 (1997); Hasselgren et al., J. Biol. Inorg. prosthesis or medical devices; (vii) sanitizing or recycling Chem. 6:120-127 (2001). The aminopeptidases of this inven certain wastes, e.g., feathers, bones, hair and fur; (viii) clean tion were found to be sensitive to different ions. Like the S. ing agent, e.g., shampoo or liquid detergent. 
griseus aminopeptidase, rul AP2 and fulLAP2 are highly acti Inhibitors of the EXOs, specifically of rul AP2, may also vated by Co". be used as fungal anti-mycotic agents to treat mycoses. The 65 RuLAP2 and fuIAP2 possess substantially different pro LAPs themselves may also be used to treat diseases of the teolytic activities despite a high percentage of sequence iden digestive tract, such as malabsorption or celiac disease, which tity. In particular, rul AP2 is able to efficiently hydrolyze Asp US 7,943,340 B2 11 12 and Glu-7-amine-4-methylcoumarin (AMC), and rul AP2 is TABLE 1A.- Continued the sole LAP identified so far that is able, first in the presence of rul PPIV, to digest a peptide of gliadin known to be resis rul AP2 genomic nucleotide sequence (SEQ ID NO: 1) . tant to digestion by gastric and pancreatic proteases, or sec CCTCGTTGCTTCTGGTAAGATTGATGTCACCATGAACGTTATCAGTCTGT ond, in the form of a partially purified extract that contains a trypsin-like endoprotease originating from the P. pastoris TTGAGAACCGAACCACGTAAGTAGCTCAACGGCTGATCCAGCATCAATTG expression system, to digest a full length polypeptide chain TCTCGAGTATATACTAAATCGATACCTCATAGCTGGAACGTCATTGCTGA such as denatured casein. The ability of a LAP to degrade a long polypeptide is not predictable solely on the basis of its GACCAAGGGAGGAGACCACAACAACGTTATCATGCTCGGTGCTCACTCCG 10 capacity to cleave aminoacyl-AMC residues. Particular prop ACTCCGTCGATGCCGGCCCTGGTATTAACGACAACGGCTCGGGCTCCATT erties of dermatophyte enzymes have been observed with endoproteases secreted by M. canis. The 31.5 kDa M. canis GGTATCATGACCGTTGCCAAAGCCCT CACCAACTTCAAGCTCAACAACGC subtilisin and the 43.5kDa M. canis metalloprotease are both able to digest keratine azure in contrast to homologous CGTCCGCTTTGCCTGGTGGACCGCTGAGGAATTCGGTCTCCTTGGAAGCA 15 secreted proteases from A. fumigatus and A. Oryzae. 
As der CCTTCTACGTCAACAGCCTCGATGACCGTGAGCTGCACAAGGTCAAGTTG matophytes evolved from their natural habitat in soil, they have developed a strategy of infection using particular pro TACCTCAACTTCGACATGATCGGCTCTCCCAACTTCGCCAACCAGATCTA teases to degrade the keratinized tissues. The unique proper ties of rul AP2 could reflect highly specialized organisms CGACGGTGACGGTTCGGCCTACAACATGACCGGCCCCGCTGGCTCTGCTG parasiting the stratum corneum and the nails. AAATCGAGTACCTGTTCGAGAAGTTCTTTGACGACCAGGGTATCCCACAC In addition to the LAPs disclosed herein, a series of novel proteases have also been isolated from the pathogenic fungi T. CAGCCCACTGCCTTCACTGGCCGATCCGACTACTCTGCTTTCATCAAGCG rubrum and are disclosed below. Like the LAPs these pro CAACGTGCCCGCTGGCGGCCTCTTCACTGGAGCCGAGGTTGTCAAGACCC teases are all characterised as exoproteases. They include: two carboxypeptidases, a prolylaminopeptidase, an amino 25 CCGAGCAAGTCAAGTTGTTCGGTGGTGAGGCTGGCGTTGCCTATGACAAG peptidase P, a prolidase, and a dipeptidylpeptidase IV. Two AACTACCATCGCAAGGGCGACACCGTTGCCAACATCAACAAGGGAGCTAT additional novel proteases have been also characterized: a leucine aminopeptidase (ca. AP1) from Microsporum canis CTTCCTTAACACTCGAGCCATCGCCTACGCTATCGCCGAGTATGCCCGAT and meLAP1, a Trichophyton mentagrophytes leucine ami nopeptidase. 30 CCCTCAAGGGATTCCCAACCCGCCCAAAGACCGGCAAGCGTGACGTCAAC ruIAP2 CCCCAGTATTCTAAGATGCCTGGTGGTGGCTGCGGACACCACACTGTCTT rul AP2 is a T. rubrum leucine aminopeptidase. A rul AP2 nucleic acid of 1757 nucleotides (SEQID NO: 1) is shown in CATGTAA Table 1A. 35 A disclosed rul AP2 open reading frame (“ORF) of 1488 TABL E 1A nucleotides begins with an ATG start codon at position 1 rul AP2 genomic nucleotide sequence (SEQ ID NO: 1) . (underlined in Table 1B).

ATGAAGTCGCAACTGTTGAGCCTGGCTGTGGCCGTCACAACCATCTCCCA 40 TABL E 1B GGGCGTTGTTGGTCAAGAGCCCTTCGGATGGCCTTTCAAGCCTATGGTCA ruIAP2 nucleotide sequence (SEQ ID NO: 2). CTCAGGTGAGTTGCTCTCAACAGATCGATCGATCGATCTACCTTTGTCCC ATGAAGTCGCAACTGTTGAGCCTGGCTGTGGCCGTCACAACCATCTCCCA

TGTCACATCAAACTCCAGCAGAGCCAAAGAAACAGACACAATGTTCCTGG GGGCGTTGTTGGTCAAGAGCCCTTCGGATGGCCTTTCAAGCCTATGGTCA 45 GGAATTCTTATGGGCTAATGTAAATGTATAGGATGACCTGCAAAACAAGA CTCAGGATGACCTGCAAAACAAGATAAAGCTCAAGGATATCATGGCAGGC

TAAAGCTCAAGGATATCATGGCAGGCGTCGAGAAGCTGCAAAGCTTTTCT GTCGAGAAGCTGCAAAGCTTTTCTGATGCTCATCCTGAAAAGAACCGAGT

GATGCTCATCCTGAAAAGAACCGAGTGTTTGGTGGTAATGGCCACAAGGA GTTTGGTGGTAATGGCCACAAGGACACTGTAGAGTGGATCTACAATGAGA 50 CACTGTAGAGTGGATCTACAATGAGATCAAGGCCACTGGCTACTACGATG TCAAGGCCACTGGCTACTACGATGTGAAGAAGCAGGAGCAAGTACACCTG

TGAAGAAGCAGGAGCAAGTACACCTGTGGTCTCATGCCGAGGCTGCTCTC TGGTCTCATGCCGAGGCTGCTCTCAATGCCAATGCCAAGGACCTCAAGGC

AATGCCAATGGCAAGGACCTCAAGGCCAGCGCCATGTCCTACAGCCCTCC CAGCGCCATGTCCTACAGCCCTCCTGCCAGCAAGATCATGGCTGAGCTTG 55 TGCCAGCAAGATCATGGCTGAGCTTGTTGTTGCCAAGAACAATGGCTGCA TTGTTGCCAAGAACAATGGCTGCAATGCTACTGATTACCCAGCGAACACT

ATGCTGTATGTGCCATACACTTTCTATACGTCACATTCTCTCTAGAATGA CAGGGCAAGATCGTCCTCGTTGAGCGTGGTGTCTGCAGCTTCGGCGAGAA

AGAGCACGGGAGAACTAACTTTATGTATACAGACTGATTACCCAGCGAAC GTCTGCTCAGGCTGGTGATGCAAAGGCTGCTGGTGCCATTGTCTACAACA 60 ACTCAGGGCAAGATCGTCCTCGTTGAGCGTGGTGTCTGCAGCTTCGGCGA ACGTCCCCGGATCCCTTGCTGGCACTCTTGGTGGCCTTGACAAGCGCCAT

GAAGTCTGCTCAGGCTGGTGATGCAAAGGCTGCTGGTGCCATTGTCTACA GTCCCAACCGCTGGTCTTTCCCAGGAGGATGGAAAGAACCTTGCTACCCT

ACAACGTCCCCGGATCCCTTGCTGGCACTCTTGGTGGCCTTGACAAGCGC CGTTGCTTCTGGTAAGATTGATGTCACCATGAACGTTATCAGTCTGTTTG

CATGTCCCAACCGCTGGTCTTTCCCAGGAGGATGGAAAGAACCTTGCTAC 65 AGAACCGAACCACCTGGAACGTCATTGCTGAGACCAAGGGAGGAGACCAC US 7,943,340 B2 13 14 TABLE 1B - continued TABLE 1C-continued rul AP2 nucleotide sequence (SEQ ID NO: 2). Encoded ruIAP2 protein sequence (SEQ ID NO : 3) .

AACAACGTTATCATGCTCGGTGCTCACTCCGACTCCGTCGATGCCGGCCC WSHAEAALNANGKDLKASAMSYSPPASKIMAELWWAKNNGCNATDYPANT TGGTATTAACGACAACGGCTCGGGCTCCATTGGTATCATGACCGTTGCCA QGKIWLVERGWCSFGEKSAOAGDAKAAGAIWYNNWPGSLAGTLGGLDKRH AAGCCCT CACCAACTTCAAGCTCAACAACGCCGTCCGCTTTGCCTGGTGG VPTAGLSOEDGKNLATLVASGKIDWTMNVISLFENRTTWNVIAETKGGDH ACCGCTGAGGAATTCGGTCTCCTTGGAAGCACCTTCTACGTCAACAGCCT 10 NNWIMLGAHSDSWDAGPGINDINGSGSIGIMTWAKALTNFKLNNAWRFAWW CGATGACCGTGAGCTGCACAAGGTCAAGTTGTACCTCAACTTCGACATGA TAEEFGLLGSTFYWNSLDDRELHKWKLYLNFDMIGSPNFANOIYDGDGSA TCGGCTCTCCCAACTTCGCCAACCAGATCTACGACGGTGACGGTTCGGCC YNMTGPAGSAEIEYLFEKFFDDOGIPHOPTAFTGRSDYSAFIKRNVPAGG TACAACATGACCGGCCCCGCTGGCTCTGCTGAAATCGAGTACCTGTTCGA 15 LFTGAEVVKTPEQWKLFGGEAGWAYDKNYHRKGDTVANINKGAIFLNTRA GAAGTTCTTTGACGACCAGGGTATCCCACACCAGCCCACTGCCTTCACTG IAYAIAEYARSLKGFPTRPKTGKRDVNPOYSKMPGGGCGHHTVFM GCCGATCCGACTACTCTGCTTTCATCAAGCGCAACGTGCCCGCTGGCGGC

CTCTTCACTGGAGCCGAGGTTGTCAAGACCCCCGAGCAAGTCAAGTTGTT The disclosed rul AP2 has homology to the amino acid sequences shown in the BLAST data listed in Table 1D, 1E, CGGTGGTGAGGCTGGCGTTGCCTATGACAAGAACTACCATCGCAAGGGCG and 1F. ACACCGTTGCCAACATCAACAAGGGAGCTATCTTCCTTAACACTCGAGCC The following program options were used: thlastn—compares the protein “Sequence 1 against the ATCGCCTACGCTATCGCCGAGTATGCCCGATCCCTCAAGGGATTCCCAAC nucleotide “Sequence 2' which has been translated in all six reading frames CCGCCCAAAGACCGGCAAGCGTGACGTCAACCCCCAGTATTCTAAGATGC 25 blastX-compares the nucleotide "Sequence 1 against the CTGGTGGTGGCTGCGGACACCACACTGTCTTCATGTAA protein “Sequence 2 blastp—for protein-protein comparisons A disclosed rul AP2 nucleic acid (SEQID NO: 2) encodes In all BLAST alignments herein, the “E-value” or a protein having 495 amino acid residues (SEQ ID NO:3), 30 “Expect value is a numeric indication of the probability that which is presented in Table 1C using the one-letteramino acid the aligned sequences could have achieved their similarity to code. the BLAST query sequence by chance alone, within the data TABL E base that was searched. The Expect value (E) is a parameter that describes the number of hits one can “expect to see just Encoded ruIAP2 protein sequence (SEQ ID NO : 3) . 35 by chance when searching a database of a particular size. It decreases exponentially with the Score (S) that is assigned to MKSOLLSLAVAVTTISOGVWGOEPFGWPFKPMVTODDLONKIKLKDIMAG a match between two sequences. Essentially, the E value WEKLOSFSDAHPEKNRVFGGNGHKDTVEWIYNEIKATGYYDWKKOEOVHL describes the random background noise that exists for matches between sequences. TABLE 1D

TBLASTN results for ruLAP2 Gene Length Identity Positives Index/Identifier Protein/Organism (aa) (%) (%) Expect gi469363 Saccharomyces cerevisiae 32421 170,477 239.437 8e-65 aminopeptidase Y gene (35%) (55%) gi158398.05 Mycobacterium tuberculosis 18857 152,424 225,424 Se-57 CDC15551, section 33 of 280 of the (35%) (53%) complete genome gi9949032 Pseudomonas aeruginosa 12547 129,317 180,317 1e-56 PAO1, section of 281 of (40%) (56%) 529 of the complete genome
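
The exponential dependence of the Expect value on the Score described before Table 1D is commonly written in the Karlin-Altschul form E = K·m·n·exp(−λS), where m is the query length and n is the database length. The following is a minimal sketch of that relationship only. The function name and the default values of `k` and `lam` are assumptions chosen for illustration; they are not parameters reported in this specification, and a real BLAST run derives them from the scoring system in use.

```python
import math


def expect_value(score, query_len, db_len, k=0.041, lam=0.267):
    """Karlin-Altschul expectation: E = K * m * n * exp(-lambda * S).

    k and lam are illustrative placeholder values (assumptions); actual
    values depend on the substitution matrix and gap penalties in use.
    """
    return k * query_len * db_len * math.exp(-lam * score)


if __name__ == "__main__":
    # Hypothetical search: a 495-residue query against 1e9 database letters.
    for s in (50, 100, 200):
        print(s, expect_value(s, 495, 1_000_000_000))
```

Under this form, raising the score lowers E exponentially, while doubling the database size doubles E, which is why an E value is only meaningful for the database that was searched.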

TABLE 1E

BLASTX results for ruLAP2

Gene Index/Identifier | Protein/Organism | Length (aa) | Identity (%) | Positives (%) | Expect
gi28918599 | Hypothetical protein / Neurospora crassa | 508 | 219/467 (46%) | 287/467 (61%) | e-112
gi584764 | APE3_YEAST: Aminopeptidase precursor / Saccharomyces cerevisiae | 537 | 170/477 (35%) | 239/437 (55%) | 1e-65
gi23017467 | Hypothetical protein / Thermobifida fusca | 514 | 151/460 (32%) | 237/460 (51%) | 5e-61
gi15839805 | Hydrolase / Mycobacterium tuberculosis CDC15551 | 493 | 152/424 (35%) | 225/424 (53%) | 6e-58
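
The Identity and Positives columns pair a ratio (matching positions over aligned positions) with a percentage. As a small illustration, the sketch below recomputes the percentage from the ratio for a few entries of Table 1E. The helper names are hypothetical, and truncation toward zero is an assumption inferred from the tabulated values (for example, 219/467 is 46.9% but is listed as 46%).

```python
def parse_ratio(cell: str) -> tuple[int, int]:
    """Split a 'matches/aligned' cell such as '219/467' into two integers."""
    matches, aligned = cell.split("/")
    return int(matches), int(aligned)


def percent(matches: int, aligned: int) -> int:
    """Percentage truncated to a whole number (assumed convention)."""
    return matches * 100 // aligned


if __name__ == "__main__":
    for cell in ("219/467", "287/467", "152/424", "225/424"):
        print(cell, f"{percent(*parse_ratio(cell))}%")
```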

TABLE1F

BLASTP results for ruLAP2 Gene Length Identity Positives Index/Identifier Protein Organism (aa) (%) (%) Expect G28918S99 Hypothetical protein SO8 219,467 287,467 e-105 Neurospora crassa (46%) (61%) GS84764 APE3 YEAST: Aminopeptidase 537 169.477 237,477 2e-64 precursor (35%) (49%) Saccharomyces cerevisiae G158398OS Hydrolasef 493 152,424 225,424 Se-57 Mycobacterium tuberculosis (35%) (53%) CDC15551 G23O17467 Hypothetical protein S14 150,46O 237,460 1e-56 Thermobifida fisca (32%) (51%) ruIAP1 30 TABLE 2A- continued rul AP1 is a T. rubrum leucine aminopeptidase. A rul AP1 nucleic acid of 1256 nucleotides is shown in Table 2A (SEQ ruIAP1 genomic nucleotide sequence (SEQ ID NO : 4) . ID NO: 4). CTTCGGTATCATCACCGACAACGTCAACGCTAACTTGACCAAGTTCGTCC

TABL E 2A 35 GCATGGTCAT CACCAAGGTAAGCTTCAACTCTTGATAAATATATTTTTCA

rul AP1 genomic nucleotide sequence (SEQ ID NO : 4) . TCGATGAAATGATGTCCTAATAATGCTTAAGTACTGCTCAATCCCAACCA

ATGAAGCTCCTCTCTGTTCTTGCGCTGAGCGCTACCGCTACCTCCGTCCT TCGACACCCGCTGCGGCTATGCTTGCTCTGACCACGCCTCTGCCAACCGC

CGGAGCTAGCATTCCTGTTGATGCCCGGGCCGAGAAGTTCCTCATCGAAC 40 AATGGCTACCCATCTGCCATGGTTGCCGAGTCTCCCATCGATCTCCTCGA

TTGCCCCTGGTGAGACTCGCTGGGTTACCGAGGAGGAGAAGTGGGAGCTT CCCT CACCTCCACACTGACTCTGACAACATTAGCTACCTCGACTTCGACC

AAGCGGGTATGTACCACTATCCTACGCAAAAGTTGTATTTTCACTAGATA ACATGATCGAGCACGCTAAGCTCATTGTCGGCTTCGTCACTGAGCTCGCT

ATATTGGTTATTAACACCCATTCTAGAAGGGTCAAGACTTCTTTGACATC 45 AAGTAA ACTGACGAGGAGGTTGGATTCACTGCTGCTGTTGCACAGCCAGCCATTGC

CTACCCAACCTCCATCCGCCATGCTAATGCTGTTAACGCCATGATTGCTA A disclosed rul AP1 open reading frame (“ORF) of 1122

CCCTCTCCAAGGAGAACATGCAGCGCGATCTGACCAAGCTCAGCTCGTTC nucleotides begins with an ATG codon (underlined in Table 50 2B) at position 1. CAAACCGCTTACTATAAGGTTGACTTTGGCAAGCAGTCTGCCACCTGGCT TABL E 2B CCAGGAGCAAGTCCAGGCTGCCATCAATACCGCTGGTGCCAATCGCTACG ruTAP1 nucleotide sequence (SEQ ID NO: 5). GAGCCAAGGTCGCCAGCTTCCGACACAACTTCGCTCAGCACAGCATCATT 55 ATGAAGCTCCTCTCTGTTCTTGCGCTGAGCGCTACCGCTACCTCCGTCCT GCCACTATTCCCGGCCGCTCCCCTGAAGTCGTTGTCGTCGGTGCT CACCA CGGAGCTAGCATTCCTGTTGATGCCCGGGCCGAGAAGTTCCTCATCGAAC AGACAGCATCAACCAACGCAGCCCCATGACCGGCCGCGCTCCAGGTGCCG TTGCCCCTGGTGAGACTCGCTGGGTTACCGAGGAGGAGAAGTGGGAGCTT ATGACAACGGCAGTGGCTCCGT CACCATCCTTGAGGCCCTCCGTGGTGTT AAGCGGAAGGGTCAAGACTTCTTTGACATCACTGACGAGGAGGTTGGATT 60 CTCCGGGACCAGACCATCCTCCAGGGCAAGGCTGCCAACACCATTGAGTT CACTGCTGCTGTTGCACAGCCAGCCATTGCCTACCCAAC CTCCATCCGCC

CCACTGGTACGCCGGTGAGGAAGCTGGTCTTCTGGGCTCCCAGGCCATCT ATGCTAATGCTGTTAACGCCATGATTGCTACCCTCTCCAAGGAGAACATG

TCGCCAACTACAAACAGACCGGCAAGAAGGTCAAGGGCATGCTCAACCAG CAGCGCGATCTGACCAAGCTCAGCTCGTTCCAAACCGCTTACTATAAGGT 65 GACATGACCGGTTACATCAAGGGAATGGTCGACAAGGGTCTCAAGGTGTC TGACTTTGGCAAGCAGTCTGCCACCTGGCTCCAGGAGCAAGTCCAGGCTG US 7,943,340 B2 17 18 TABLE 2B - continued A disclosed rul AP1 nucleic acid (SEQID NO. 5) encodes a protein having 377 amino acid residues (SEQ ID NO: 6), ruLAP1 nucleotide sequence (SEQ ID NO: 5). which is presented in Table 2C using the one-letter amino acid code. CCATCAATACCGCTGGTGCCAATCGCTACGGAGCCAAGGTCGCCAGCTTC TABL E CGACACAACTTCGCTCAGCACAGCATCATTGCCACTATTCCCGGCCGCTC

CCCTGAAGTCGTTGTCGTCGGTGCTCACCAAGACAGCATCAACCAACGCA Encoded ruIAP1 protein sequence (SEQ ID NO : 6).

GCCCCATGACCGGCCGCGCTCCAGGTGCCGATGACAACGGCAGTGGCTCC 10 MKLLSWLALSATATSWLGASIPWDARAEKFLIELAPGETRWWTEEEKWEL

GTCACCATCCTTGAGGCCCTCCGTGGTGTTCTCCGGGACCAGACCATCCT KRKGODFFDITDEEVGFTAAVAOPAIAYPTSIRHANAVNAMIATLSKENM CCAGGGCAAGGCTGCCAACACCATTGAGTTCCACTGGTACGCCGGTGAGG ORDLTKLSSFOTAYYKVDFGKOSATWLOEQVOAAINTAGANRYGAKVASF AAGCTGGTCTTCTGGGCTCCCAGGCCATCTTCGCCAACTACAAACAGACC 15 GGCAAGAAGGTCAAGGGCATGCTCAACCAGGACATGACCGGTTACATCAA RHNFAOHSIIATIPGRSPEVVVVGAHODSINORSPMTGRAPGADDNGSGS

GGGAATGGTCGACAAGGGTCTCAAGGTGTCCTTCGGTATCATCACCGACA WTILEALRGVLRDOTILOGKAANTIEFHWYAGEEAGLLGSQAIFANYKOT ACGTCAACGCTAACTTGACCAAGTTCGTCCGCATGGTCATCACCAAGTAC GKKWKGMLNODMTGYIKGMVDKGLKVSFGIITDNVNANLTKFVRMVITKY TGCTCAATCCCAACCATCGACACCCGCTGCGGCTATGCTTGCTCTGACCA CSIPTIDTRCGYACSDHASANRNGYPSAMWAESPIDLLDPHLHTDSDNIS CGCCTCTGCCAACCGCAATGGCTACCCATCTGCCATGGTTGCCGAGTCTC

CCATCGATCTCCTCGACCCT CACCTCCACACTGACTCTGACAACATTAGC YLDFDHMIEHAKLIWGFWTELAK 25 TACCTCGACTTCGACCACATGATCGAGCACGCTAAGCTCATTGTCGGCTT

CGTCACTGAGCTCGCTAAGTAA

The disclosed ruLAP1 has homology to the amino acid sequences shown in the BLAST data listed in Table 2D, 2E, and 2F. This data was analyzed by the program PAIRWISE BLAST.

TABLE 2D

TBLASTN results for ruLAP1

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

>gi|1762234     Polyketide synthase PKSL2/                  9894    131/247 (53%)   171/247 (69%)   1e-95
                Aspergillus parasiticus                             40/76 (52%)     57/76 (75%)
                                                                    20/24 (83%)     22/24 (91%)

>gi|23393798    Leucine aminopeptidase (Lap1)/              2547    77/159 (48%)    97/159 (61%)    4e-64
                Aspergillus sojae                                   63/148 (42%)    89/148 (60%)
                                                                    14/30 (46%)     23/30 (76%)

>gi|927685      Saccharomyces cerevisiae                    78500   137/350 (39%)   201/350 (57%)   3e-62
                chromosome IV lambda3641 and
                cosmid 9831, and 9410

>gi|7413486     Agaricus partial mRNA for                   1089    130/346 (37%)   189/346 (54%)   2e-55
                aminopeptidase

TABLE 2E

BLASTX results for ruLAP1

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

>gi|23393799    Leucine aminopeptidase/                     377     126/248 (50%)   162/248 (65%)   5e-87
                Aspergillus sojae                                   37/78 (47%)     55/78 (70%)
                                                                    13/24 (54%)     20/24 (83%)
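The percentages shown in parentheses in the BLAST tables follow from the match counts and the alignment length. The following is a minimal sketch only: the helper name `percent` is hypothetical, and truncation to a whole number is an assumption inferred from the rows of Table 2E rather than a documented property of the BLAST program. Under that assumption, the values for the Aspergillus sojae hit can be recomputed as:

```python
def percent(matches: int, aligned: int) -> int:
    """Return matches/aligned as a whole-number percentage, truncated (assumed)."""
    if aligned <= 0:
        raise ValueError("alignment length must be positive")
    return matches * 100 // aligned


# (identities, positives, alignment length) for the three aligned segments
# listed for the Aspergillus sojae leucine aminopeptidase hit in Table 2E.
segments = [(126, 162, 248), (37, 55, 78), (13, 20, 24)]

for identities, positives, aligned in segments:
    print(
        f"{identities}/{aligned} ({percent(identities, aligned)}%)  "
        f"{positives}/{aligned} ({percent(positives, aligned)}%)"
    )
```

The same check can be applied to any row of the other BLAST tables to catch transcription errors in the counts.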

TABLE 3A-continued: fuLAP2 genomic nucleotide sequence (SEQ ID NO: 7).

TABLE 3B-continued: fuLAP2 nucleotide sequence (SEQ ID NO: 8).

TACGCCCTGATGATCTATGACGGCGACGGCTCGGCCTTCAACCTGACGGG ACCCCGCACACGCACACCGGCGGAACAGGATGCTACAAGGACCGGGTTGA GCCGGCCGGCTCGGCGCAGATCGAGCGGCTCTTCGAGGACTACTACACGT GCAGTAG CGATCCGCAAGCCGTTCGTGCCGACCGAGTTCAACGGCCGCTCCGACTAC A disclosed fulLAP2 open reading frame (“ORF) of 1497 CAGGCCTTTATTCTCAACGGCATCCCCGCGGGAGGCCTCTTCACCGGCGC nucleotides begins with an ATG codon (underlined in Table 10 3B) at position 1. GGAGGCGATCAAGACCGAGGAACAGGCCCAATTGTTTGGCGGCCAGGCCG GCGTGGCTCTGGACGCCAACTACCACGCCAAGGGTGACAACATGACTAAT TABL E CTCAACCGCGAGGCTTTCCTGATCAATTCCAGGGCGACGGCCTTTGCCGT fuLAP2 nucleotide sequence (SEQ ID NO: 8). 15 GGCGACGTACGCCAACAGCCTTGACTCGATCCCCCCACGCAACATGACCA ATGAAGCTGCTCTACCTCACATCGTTTGCCTCTCTGGCCGTGGCCAATGG CCGTGGTCAAGCGGTCGCAGCTGGAGCAAGCCATGAAGAGGACCCCGCAC CCCAGGATGGGACTGGAAGCCCCGAGTTCATCCGAAAGTCCTGCCCCAAA ACGCACACCGGCGGAACAGGATGCTACAAGGACCGGGTTGAGCAGTAG TGATCCATTTGTGGGATCTTCTGCAGGGCGCTCAACAGCTGGAAGACTTC

GCCTATGCCTACCCCGAGCGCAACCGCGTCTTTGGTGGACGGGCCCACGA A disclosed fulLAP2 nucleic acid (SEQID NO: 8) encodes a protein having 498 amino acid residues (SEQ ID NO: 9), GGACACCGTCAACTACCTCTACCGTGAGTTGAAGAAAACCGGCTACTACG which is presented in Table3C using the one-letter amino acid code. ACGTTTACAAGCAGCCCCAGGTTCACCAGTGGACCCGAGCCGACCAGGC 25 TCTCACCGTCGACGGCCAGTCCTATGACGCCACAACCATGACTTACAGCC TABL E

CCAGCGTAAACGCCACGGCGCCGCTGGCAGTGGTGAACAACCTGGGCTGC Encoded fuIAP2 protein sequence (SEQ ID NO: 9).

GTCGAGGCTGACTATCCCGCCGATCTGACGGGCAAGATTGCTCTGATCTC MKLLYLTSFASLAVANGPGWDWKPRWHPKVLPOMIHLWDLLOGAQOLEDF 30 GCGGGGCGAGTGCACCTTTGCGACCAAATCCGTCTTGAGCGCCAAGGCCG AYAYPERNRVFGGRAHEDTVNYLYRELKKTGYYDVYKOPOWHOWTRADOA

GGGCGGCGGCGGCACTCGTGTACAACAATATCGAGGGTTCGATGGCGGGA LTVDGOSYDATTMTYSPSVNATAPLAVWNNLGCVEADYPADLTGKIALIS

ACTCTGGGCGGCGCGACCAGCGAGCTGGGTGCCTACGCTCCCATCGCCGG RGECTFATKSWLSAKAGAAAALWYNNIEGSMAGTLGGATSELGAYAPIAG 35 CATCAGCCTCGCGGACGGACAGGCGCTGATCCAGATGATCCAGGCGGGCA ISLADGOALIOMIOAGTVTANLWIDSOVENRTTYNVIAOTKGGDPNNVVA

CGGTGACAGCCAACCTGTGGATCGACAGCCAGGTCGAGAACCGTACCACC LGGHTDSWEAGPGINDDGSGIISNLWWAKALTRFSWKNAWRFCFWTAEEF

TACAACGTGATCGCGCAGACCAAGGGCGGCGACCCCAACAACGTCGTCGC GLLGSNYYWNSLNATEOAKIRLYLNFDMIASPNYALMIYDGDGSAFNLTG 40 GCTGGGTGGCCACACGGACTCGGTCGAGGCCGGGCCCGGCATCAACGACG PAGSAOIERLFEDYYTSIRKPFWPTEFNGRSDYOAFILNGIPAGGLFTGA

ACGGCTCCGGCATCATCAGCAACCTCGTCGTCGCCAAGGCGCTGACCCGC EAIKTEEQAOLFGGOAGVALDANYHAKGDNMTNLNREAFLINSRATAFAW

TTCTCGGTCAAGAACGCGGTGCGCTTCTGCTTCTGGACGGCGGAGGAGTT ATYANSLDSIPPRNMTTV WKRSOLEOAMKRTPHTHTGGTGCYKDRVEO 45 CGGCCTGCTGGGCAGCAACTACTACGTCAACAGCCT CAATGCCACCGAGC The disclosed fuIAP2 has homology to the amino acid AGGCCAAGATCCGCCTGTATCTCAACTTCGACATGATCGCCTCCCCCAAC sequences shown in the BLAST data listed in Table 3D, 3E, and 3F. This data was analyzed by the program PAIRWISE BLAST. TABLE 3D

TBLASTN results for fuLAP2

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

                Saccharomyces cerevisiae                    2272    184/464 (39%)   243/464 (52%)   7e-69
                aminopeptidase Y gene

>gi|9949032     Pseudomonas aeruginosa PAO1,                12547   165/445 (37%)   231/445 (51%)   9e-67
                section 281 of 529 of the
                complete genome

>gi|23017467    Mycobacterium tuberculosis CDC1551,         18857   166/426 (38%)   218/426 (51%)   2e-62
                section 33 of 280 of complete genome

TABLE 3E

BLASTX results for fuLAP2

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

>gi|28918599    Hypothetical protein/                       508     250/479 (52%)   314/479 (65%)
                Neurospora crassa

>gi|23017467    Hypothetical protein/                       514     173/465 (37%)   251/465 (53%)
                Thermobifida fusca

>gi|584764      APE3_YEAST: Aminopeptidase precursor/       537     184/464 (39%)   243/464 (52%)
                Saccharomyces cerevisiae

>gi|15598135    Probable aminopeptidase/                    536     165/445 (37%)   231/445 (51%)
                Pseudomonas aeruginosa PAO1

>gi|15839805    Hydrolase/                                  493     166/426 (38%)   218/426 (51%)
                Mycobacterium tuberculosis CDC1551

TABLE 3F

BLASTP results for fulLAP2 Gene Length Identity Positives Index/Identifier Protein Organism (aa) (%) (%) Expect >gi28918599 Hypothetical protein? SO8 2SO469 314,479 e-128 Neurospora crassa (52%) (65%) >gi23017467 Hypothetical protein? S14 173,46S 251,465 3e-71 Thermobifida fisca (37%) (53%) >gi584764 APE3 YEAST: Aminopeptidase 537 183,464. 243,464 precursor (39%) (52%) Saccharomyces cerevisiae >gi15598135 Probable aminopeptidasef S36 164,445 230,445 Pseudomonas aeruginosa (36%) (51%) PAO1 fuIAP1 TABLE 4A- continued fulLAP1 is an A. fumigatus leucine aminopeptidase. A fuIAP1 genomic nucleotide sequence (SEQ ID fulLAP1 nucleic acid of 1298 nucleotides is shown in Table a NO: 10). 4A (SEQID NO: 10). TCTCCCGT CAATCTTGGCTGCTCCCGGTGCTGATGACGATGGAAGTGGAA TABL E 4A CTGTCACCATTCTTGAAGCGTTGCGCGGTCTGCTGCAGTCAGACGCCATT fuIAP1 genomic nucleotide sequence (SEQ ID 45 NO: 10). GCCAAGGGTAATGCATCCAATACTGTCGAGTTCCACTGGTACTCTGCAGA

ATGAAAGTTCTTACAGCTATTGCGCTGAGCGCAATAGCTTTCACAGGGGC AGAAGGCGGAATGCTGGGCTCCCAGGCAATATTTTCCAATTACAAGCGGA

TGTAGCTGCAGTGATTACT CAGGAAGCATTCTTAAACAACCCCCGCATCC ATAGGCGGGAAATCAAAGCCATGCTCCAGCAAGACATGACTGGCTACGTC

ATCATGACCAGGAGAAGTACTTGATCGAACTGGCCCCTTATCGAACACGA 50 CAGGGAGCTTTGAACGCCGGTGTTGAGGAAGCCATAGGAATTATGGTCGA

TGGGTGACTGAAGAGGAGAAATGGGCATTGAAATTGGTACCATACTTCCC TTATGTCGACCAGGGCCTCACACAGTTTCTCAAGGACGTTGTTACAGCGG CAAAATTTGGGTCTCCAAGTCCACGGGCGACTAACTGCACGATTGCTTGA TAAGCCTCAGTTGTCCCCCACGAAAAGCTGTTTAGTCGACAAATGAAATT AGGACGGCGTGAATTTTATCGATATCACAGAAGAGCACAACACCGGATTT 55 GACGGCTGCATTAGTACTGCTCTGTGGGTTACCTGGAGACGAAGTGCGGA TACCCGACTCTCCACAGCGCCAGCTATGTGAAATATCCACCGAAGATGCA TATGCCTGCTCCGACCACACCTCGGCCAGTAAATATGGTTATCCCGCGGC GTATGCAGAAGAAGTGGCTGCTCTTAACAAGAATTTATCGAAAGAAAACA TATGGCGACAGAAGCAGAGATGGAAAATACCAATAAGAAGATACATACTA. TGAAGGCCAACCTGGAACGATTCACATCATTTCATACTCGCTATTACAAA 60 CCGACGACAAGATCAAGTATTTGAGCTTCGATCATATGTTGGAGCATGCC TCTCAGACGGGAATCCGATCGGCAACGTGGCTGTTCGACCAAGTTCAGAG AAGTTGAGTCTTGGCTTCGCTTTCGAATTGGCATTTGCGCCGTTTTAA AGTTGTCTCTGAGTCTGGAGCCGCTGAGTATGGTGCAACTGTTGAGCGAT

TCTCTCATCCATGGGGTCAGTTCAGCATTATTGCCCGAATACCCGGCCGA 65 A disclosed fulLAP1 open reading frame (“ORF) of 1167 ACGAACAAGACTGTGGTGCTGGGCGCCCATCAGGACAGCATCAATTTGTT nucleotides begins with an ATG codon at position 1 (under lined in Table 4B). US 7,943,340 B2 25 26 TABL E TABLE 4B- continued

fuLAP1 nucleotide sequence (SEQ ID NO: 11). fuLAP1 nucleotide sequence (SEQ ID NO: 11).

ATGAAAGTTCTTACAGCTATTGCGCTGAGCGCAATAGCTTTCACAGGGGC AATATGGTTATCCCGCGGCTATGGCGACAGAAGCAGAGATGGAAAATACC TGTAGCTGCAGTGATTACT CAGGAAGCATTCTTAAACAACCCCCGCATCC AATAAGAAGATACATACTACCGACGACAAGATCAAGTATTTGAGCTTCGA 10 ATCATGACCAGGAGAAGTACTTGATCGAACTGGCCCCTTATCGAACACGA TCATATGTTGGAGCATGCCAAGTTGAGTCTTGGCTTCGCTTTCGAATTGG TGGGTGACTGAAGAGGAGAAATGGGCATTGAAATTGGACGGCGTGAATTT CATTTGCGCCGTTTTAA 15 TATCGATATCACAGAAGAGCACAACACCGGATTTTACCCGACTCTCCACA A disclosed fulLAP1 nucleic acid (SEQ ID NO: 11) GCGCCAGCTATGTGAAATATCCACCGAAGATGCAGTATGCAGAAGAAGTG encodes a protein having 388 amino acid residues (SEQ ID NO: 12), which is presented in Table 4C using the one-letter GCTGCTCTTAACAAGAATTTATCGAAAGAAAACATGAAGGCCAACCTGGA amino acid code.

ACGATTCACATCATTTCATACTCGCTATTACAAATCTCAGACGGGAATCC

GATCGGCAACGTGGCTGTTCGACCAAGTTCAGAGAGTTGTCTCTGAGTCT

TABLE 4C

Encoded fuLAP1 protein sequence (SEQ ID NO: 12).

GGAGCCGCTGAGTATGGTGCAACTGTTGAGCGATTCTCTCATCCATGGGG MKVLTAIALSAIAFTGAVAAVITOEAFLNNPRIHHDOEKYLIELAPYRTR

TCAGTTCAGCATTATTGCCCGAATACCCGGCCGAACGAACAAGACTGTGG 30 WVTEEEKWALKLDGVNFIDITEEHNTGFYPTLHSASYWKYPPKMOYAEEV

TGCTGGGCGCCCATCAGGACAGCATCAATTTGTTTCTCCCGTCAATCTTG AALNKNLSKENMKANLERFTSFHTRYYKSOTGIRSATWLFDOVORVVSES

GCTGCTCCCGGTGCTGATGACGATGGAAGTGGAACTGTCACCATTCTTGA 35 GAAEYGATVERFSHPWGOFSIIARIPGRTNKTVVLGAHODSINLFLPSIL

AGCGTTGCGCGGTCTGCTGCAGTCAGACGCCATTGCCAAGGGTAATGCAT AAPGADDDGSGTWTILEALRGLLOSDAIAKGNASNTVEFHWYSAEEGGML

CCAATACTGTCGAGTTCCACTGGTACTCTGCAGAAGAAGGCGGAATGCTG GSOAIFSNYKRNRREIKAMLOODMTGYWOGALNAGVEEAIGIMIDYWDOG 40

GGCTCCCAGGCAATATTTTCCAATTACAAGCGGAATAGGCGGGAAATCAA LTOFLKDVVTAYCSVGYLETKCGYACSDHTSASKYGYPAAMATEAEMENT

AGCCATGCTCCAGCAAGACATGACTGGCTACGTCCAGGGAGCTTTGAACG NKKIHTTDDKIKYLSFDHMLEHAKLSLGFAFELAFAPF 45

CCGGTGTTGAGGAAGCCATAGGAATTATGGTCGATTATGTCGACCAGGGC

CTCACACAGTTTCTCAAGGACGTTGTTACAGCGTACTGCTCTGTGGGTTA The disclosed fuIAP1 has homology to the amino acid 50 sequences shown in the BLAST data listed in Table 4D, 4E, CCTGGAGACGAAGTGCGGATATGCCTGCTCCGACCACACCTCGGCCAGTA and 4F. This data was analyzed by the program PAIRWISE BLAST. TABLE 4D

TBLASTN results for fuLAP1

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

>gi|1762234     Polyketide synthase PKSL2/                  9894    208/249 (80%)   226/249 (90%)   e-169
                Aspergillus parasiticus                             61/84 (72%)     67/84 (79%)
                                                                    46/62 (74%)     55/62 (88%)

>gi|23393798    Leucine aminopeptidase (LAP1)/              2547    66/110 (60%)    82/110 (74%)    7e-82
                Aspergillus sojae                                   68/152 (44%)    92/152 (60%)
                                                                    37/75 (49%)     52/75 (69%)
                                                                    15/30 (50%)     21/30 (70%)

                Saccharomyces cerevisiae                    78500   152/341 (44%)   207/341 (60%)   1e-71
                chromosome IV lambda3641 and
                cosmid 9831, and 9410

                Botrytis cinerea strain T4                  780     89/134 (66%)    106/134 (79%)   7e-58
                cDNA library under condition of                     27/53 (50%)     33/53 (62%)
                nitrogen deprivation


TABLE 4E

BLASTX results for fuLAP1

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

>gi|28918132    Hypothetical protein/                       402     208/352 (59%)   255/352 (72%)   e-116
                Neurospora crassa

>gi|23393799    Leucine aminopeptidase/                     377     183/355 (51%)   241/355 (67%)   3e-97
                Aspergillus sojae

>gi|6320623     Hypothetical ORF; Ydr415cp/                 374     152/341 (44%)   207/341 (60%)   2e-72
                Saccharomyces cerevisiae

>gi|18250467    Aminopeptidase/                             384     139/352 (39%)   186/352 (52%)   1e-58
                Agaricus bisporus
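The Expect column prints some values in shorthand, such as e-116, where a leading coefficient of 1 is implied. The following is a minimal sketch under that assumed convention (the function name `parse_expect` is hypothetical). It converts the Expect entries of Table 4E to floating-point numbers so that hits can be compared numerically:

```python
def parse_expect(text: str) -> float:
    """Convert a BLAST-style Expect string such as 'e-116' or '3e-97' to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        # Shorthand with an implied coefficient of 1 (assumed convention).
        text = "1" + text
    return float(text)


# Expect values as listed for the four hits in Table 4E.
table_4e = {
    "Neurospora crassa": "e-116",
    "Aspergillus sojae": "3e-97",
    "Saccharomyces cerevisiae": "2e-72",
    "Agaricus bisporus": "1e-58",
}

# Smaller Expect values indicate matches less likely to arise by chance,
# so sorting in ascending order puts the strongest hit first.
ranked = sorted(table_4e, key=lambda name: parse_expect(table_4e[name]))
print(ranked)
```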

TABLE 4F

BLASTP results for fulLAP1

Gene Length Identity Positives Index. Identifier Protein?Organism (aa) (%) (%) Expect >gi28918132 Hypothetical protein 402 208,352 2SS,352 e-116 Neurospora crassa (59%) (72%) >gi23393799 Leucine aminopeptidase 377 183,355 241,355 6e-98 (LAP1)/ (51%) (67%) Aspergilius Soiae Hypothetical ORFYor415cp? 374 152,341. 207,341 3e-73 Saccharomyces cerevisiae (44%) (60%) >gi18250467 Aminopeptidase 384 140,352 190,352 7e-59 Agarict is bisportis (39%) (53%) ruCBPS1 TABLE 5A- continued ruCBPS1 is a T. rubrum carboxypeptidase. Genomic DNA 55 ruCBPS1 genomic nucleotide sequence (SEQ ID sequence of a ruCBPS1 nucleic acid of 2106 nucleotides NO: 13). (SEQ ID NO: 13) is shown in Table 5A. AGTCGAAATTCGGCAGCGGTGCTCGCATCACTTATAAGGAGGTCCGTTAG TABL E 5A 60 CTGCATAGAAAGTCCACGTGAAGACGCTGTAGCTAACAATCCACTAGCCT ruCBPS1 genomic nucleotide sequence (SEQ ID NO: 13). GGCCTCTGTGAGACGACAGAGGGCGTCAAGTCGTACGCCGGATATGTCCA

ATGGTGTCATTCTGCGGAGTGGCAGCCTGCCTGCTGACAGTTGCTGGCCA TCTGCCTCCAGGCACGCTCAGGGACTTCGGTGTCGAGCAGGACTACCCTA 65 TCTTGCGCAGGCTCAGTTCCCACCAAAACCGGAGGGAGTCACTGTCCTGG TCAACACCTTTTTTTGGTTCTTTGAGGCAAGAAAGGACCCTGAAAATGCC


TABLE 5B - continued TABLE 5C-continued ruCBPS1 nucleotide sequence (SEQ ID NO : 14) . Encoded ruCBPS1 protein sequence (SEQ ID NO: 15). PVOVGLSYDTLANFTRNLVTDEITKLKPGEPIPEONATFLVGTYASRNMN GCCGCCGTACGCAGCTGGATCATTGTCGACTCCAACTCGACCTCTCTGTT 5 TTAHGTRHAAMALWHFAOVWFOEFPGYHPRNNKISIATESYGGRYGPAFT CCCCGAGGTAGTTGGCTCAGGGGAACCCACGCCAACCCCTATGCCTGGAG AFFEEONOKIKNGTWKGHEGTMHVLHLDTLMIVNGCIDRLVOWPAYPOMA GGGCTACTACACTATCTGCTCACGGGTTCTTGTATGGCGTGACATTATGG YNNTYSIEAWNASIHAGMLDALYRDGGCRDKINHCRSLSSWFDPENLGIN GCTGTTATTGTTGTAGCTGTTATAGAGCTGGCAATGTAA 10 STVNDWCKDAETFCSNDVRDPYLKFSGRNYYDIGOLDPSPFPAPFYMAWL A disclosed ruCBPS1 nucleic acid (SEQ ID NO: 14) NOPHVOAALGVPLNWTOSNDVVSTAFRAIGDYPRPGWLENLAYLLENGIK encodes a protein having 662 amino acid residues (SEQ ID NO: 15), which is presented in Table 5C using the one-letter WSLWYGDRDYACNWFGGELSSLGINYTDTHEFHNAGYAGIOINSSYIGGO amino acid code. 15 WROYGNLSFARVYEAGHEVPSYOPETALOIFHRSLFNKDIATGTKDTSSR

TABLE 5C MDGGKFYOTSGPADSFGFKNKPPPOHVHFCHILDTSTCTKEOIOSVENGT Encoded rucBPS1 protein sequence (SEQ ID NO: 15). AAWRSWIIWDSNSTSLFPEWWGSGEPTPTPMPGGATTLSAHGFLYGWTLW.

MVSFCGVAACLLTVAGHLAOAOFPPKPEGVTVLESKFGSGARITYKEPGL 2O AWIWWAWIELAM

CETTEGWKSYAGYWHLPPGTLRDFGVEODYPINTFFWFFEARKDPENAPL The disclosed ruCBPS1 has homology to the amino acid sequences shown in the BLAST data listed in Table 5D, 5E GIOMNGGPGSSSMFGMMTENGPCFVNADSNSTRLNPHSWNNEVNMLYIDO and 5F. This data was analyzed by the program PAIRWISE BLAST. TABLE 5D

TBLASTN results for ruCBPS1

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

gi|32410708     Neurospora crassa                           1947    222/632 (35%)   321/632 (50%)   1e-90
                strain OR74A

gi|3046860      Schizosaccharomyces pombe                   4308    137/481 (28%)   204/481 (42%)   6e-41
                cpy1 gene for carboxypeptidase Y

gi|18152938     Pichia angusta                              2214    141/520 (27%)   228/520 (43%)   4e-40
                carboxypeptidase Y (CPY) gene

gi|4028157      Pichia angusta                              2509    140/520 (26%)   226/520 (43%)   7e-40
                carboxypeptidase Y precursor (CPY) gene

gi|170828       Candida albicans                            1985    131/482 (27%)   205/482 (42%)   3e-36
                carboxypeptidase Y precursor (CPY1) gene

TABLE 5E

BLASTX results for ruCBPS1

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

gi|15004616     carboxypeptidase S1/                        555     209/535 (39%)   294/535 (54%)   1e-98
                Aspergillus oryzae

gi|435818       carboxypeptidase S1, CPD-S1/                423     159/498 (31%)   234/498 (46%)   6e-64
                Penicillium janthinellum

gi|1995456      preprocarboxypeptidase Z/                   460     147/506 (29%)   219/506 (43%)   8e-48
                Absidia zychae

gi|3046861      carboxypeptidase Y/                         1002    137/481 (28%)   204/481 (42%)   7e-42
                Schizosaccharomyces pombe

gi|18152939     carboxypeptidase Y/                         537     141/520 (27%)   228/520 (43%)   4e-41
                Pichia angusta

gi|4028158      carboxypeptidase Y precursor;               541     140/520 (26%)   226/520 (43%)   7e-41
                vacuolar carboxypeptidase/
                Pichia angusta

gi|7597001      carboxypeptidase Y precursor/               542     131/482 (27%)   206/482 (42%)   2e-37
                Candida albicans

TABLE 5F

BLASTP results for ruCBPS1 Gene Length Identity Positives Index/Identifier Protein Organism (aa) (%) (%) Expect gil 15004616 carboxypeptidase S1/ 555 210,537 296,537 Aspergilius Oryzae (39%) (55%) gi435818 carboxypeptidase S1, CPD-S1/ 423 159,498 234,498 Penicilium janthinelium (31%) (46%) gi1995456 preprocarboxypeptidase Z. 46O 146 SOO 217, SOO Absidia zychae (29%) (43%) gil 19115337 carboxypeptidasey 1002 136,481 204,481 Schizosaccharomyces pombe (28%) (42%) ruCBPS1 15 TABLE 6A- continued ruCBPS1 is a T. rubrum carboxypeptidase. Genomic DNA ruCBPS1' genomic nucleotide sequence (SEQ ID sequence of a ruCBPS1 nucleic acid of 2030 nucleotides NO: 16). (SEQ ID NO: 16) is shown in Table 6A. CTTCTCCCAATACGCCCAAGCTGTGGGAAAATCATTCCATGGAGTTGGCG TABL E 6A ACTACGCTCGCCCTGATGTGCGCGGCTTCACCGGTGACATTGCTTATCTT ruCBPS1' genomic nucleotide sequence (SEQ ID NO: 16). CTCGAGAGCGGAGTCAAGGTTGCTCTCGTCTATGGTGACAGAGACTACAT

ATGCGCTTTGCTGCTAGCATTGCCGTGGCCCTGCCAGTCATTCACGCGGC 25 CTGCAATTGGTTCGGTGGTGAGCAGGTCAGTCTTGGCTTGAACTACACTG

GAGTGCTCAAGGCTTCCCTCCACCCGTTAAGGGCGTCACCGTGGTCAAAT GCACCCAAGACTTCCACAGGGCAAAATATGCCGATGTCAAGGTCAACTCT

CCAAGTTCGACGAAAACGTAAAGATCACATACAAGGAGGTATGTGTTTAC TCATACGTCGGAGGCGTAGTGCGTCAACATGGAAACTTCTCTTTCACCAG

ATCATTTTCACATCCAGATCTTATATCCTTACAATAAATCTGGCTAACTC 30 AGTTTTCGAGGCCGGTCATGAAGTCCCTGGTTACCAACCCGAGACTGCCC

ACTGGATAGAATGACATATGTGAAACCACTCAAGGAGTTAGATCATTCAC TCAAGATCTTTGAGCGCATCATGTTCAACAAGGATATTTCTACCGGTGAG CGGTCATGTCCACCTTCCTCCAGACAACGATGACTTTGGTGTCTACCGGA ATCGACATTGCTCAGAAACCAGACTACGGTACCACTGGAACTGAGTCTAC ACTACTCCATCAACACATTCTTCTGGTTCTTTGAAGCTCGTGAAGACCCT 35 GTTCCATATCAAAAACGATATCCCTCCTTCGCCTGAGCCGACCTGCTACC AAGAATGCTCCTCTCTCCATCTGGCTGAACGGTGGTCCGGGATCGTCATC TCCTCAGTGCTGACGGAACCTGTACCCCGGAGCAGCTTAATGCTATTAAG CATGATTGGACTCTTCCAGGAAAACGGTCCATGCTGGGTCAATGAAGACT CATGGAACTGCAGTTGTTGAGAACTACATTATTAAGAGCCCTGCTGCGTC CTAAATCTACCACCAACAATTCATTTTCATGGAACAATAAAGTAAATATG 40 GAAGGGGAACCCTCCACCAACCACGACCTCATCTCCCACAGCAGCCCCTA CTCTACATTGATCAGCCAAACCAAGTCGGTTTCAGTTATGACGTACCTAC CCGCTGGAAGTGCCATGCTAAAGGCTCCTGTGGCAATGCTAGCAATATCA CAACATCACTTACTCTACCATCAATGATACAATATCTGTTGCGGACTTCT GCTCTCACTCTCCTTGCTTTC TTCTTGTAG CTAACGGTGTCCCTGCGCAAAATCTTTCTACGTTGGTTGGAACCGGCAGC 45 AGCCAGAACCCTTGGGCAACTGCCAATAACACTGTGAACGCTGCTCGTTC A ruCBPS1' nucleic acid of 1959 (SEQ ID NO: 17) is TATCTGGCACTTTGCACAACTGTGGTTCCAGGAATTCCCTGAACACAAGC shown in Table 6B. A disclosed ruCBPS1' open reading frame CTAACAATAACAAGATCAGTATTTGGACAGAGTCCTATGGAGGAAGATAT (“ORF) begins with an ATG start codon at position 1 (under 50 lined in Table 6B). GGTCCCTCATTCGCCTCTTACTTCCAGGAACAGAACGAAAAGATCAAAAA TABL E CCATACCATTACTGAAGAAGGAGAGATGCATATTCTGAACCTCGACACCC ruCBPS1' nucleotide sequence (SEQ ID NO: 17). TCGGTATCATCAACGGCTGCATCGATCTTATGTTCCAAGCAGAAAGTTAT 55 ATGCGCTTTGCTGCTAGCATTGCCGTGGCCCTGCCAGTCATTCACGCGGC GCTGAATTCCCATACAACAACACCTATGGCATCAAAGCTTATACCAAGGA GAGTGCTCAAGGCTTCCCTCCACCCGTTAAGGGCGTCACCGTGGTCAAAT GAAGCGTGACGCTATATTACACGACATCCACCGTCCTGACGGCTGCTTCG CCAAGTTCGACGAAAACGTAAAGATCACATACAAGGAGAATGACATATGT ACAAGGTTACCAAGTGCCGTGAGGCCGCGAAAGAAGGAGACCCTCACTTC GAAACCACTCAAGGAGTTAGATCATTCACCGGTCATGTCCACCTTCCTCC 60 TACAGCAACAATGCAACCGTCAACACAATCTGTGCGGATGCTAACTCTGC AGACAACGATGACTTTGGTGTCTACCGGAACTACTCCATCAACACATTCT

CTGCGACAAATATCTAATGGATCCTTTCCAAGAGACCAATCTTGGTTACT TCTGGTTCTTTGAAGCTCGTGAAGACCCTAAGAATGCTCCTCTCTCCATC

ATGATATTGCTCATCCTCTTCAGGATCCCTTCCCCCCACCATTCTATAAG TGGCTGAACGGTGGTCCGGGATCGTCATCCATGATTGGACTCTTCCAGGA 65 GGCTTCCTCAGCCAATCCAGCGTTCTATCTGACATGGGATCGCCAGTCAA AAACGGTCCATGCTGGGTCAATGAAGACTCTAAATCTACCACCAACAATT US 7,943,340 B2 35 36 TABLE 6B-continued TABLE 6B-continued ruCBPS1' nucleotide sequence (SEQ ID NO: 17). ruCBPS1' nucleotide sequence (SEQ ID NO: 17).

CATTTTCATGGAACAATAAAGTAAATATGCTCTACATTGATCAGCCAAAC GTACCCCGGAGCAGCTTAATGCTATTAAGGATGGAACTGCAGTTGTTGAG

CAAGTCGGTTTCAGTTATGACGTACCTACCAACATCACTTACTCTACCAT AACTACATTATTAAGAGCCCTGCTGCGTCGAAGGGGAACCCTCCACCAAC

CAATGATACAATATCTGTTGCGGACTTCTCTAACGGTGTCCCTGCGCAAA CACGACCTCATCTCCCACAGCAGCCCCTACCGCTGGAAGTGCCATGCTAA

ATCTTTCTACGTTGGTTGGAACCGGCAGCAGCCAGAACCCTTGGGCAACT 10 AGGCTCCTGTGGCAATGCTAGCAATATCAGCTCTCACTGTCCTTGCTTTC

GCCAATAACACTGTGAACGCTGCTCGTTCTATCTGGCACTTTGCACAAGT TTCTTGTAG

GTGGTTCCAGGAATTCCCTGAACACAAGCCTAACAATAACAAGATCAGTA A disclosed ruCBPS1' nucleic acid (SEQ ID NO: 17) TTTGGACAGAGTCCTATGGAGGAAGATATGGTCCCTCATTCGCCTCTTAC 15 encodes a protein having 652 amino acid residues (SEQ ID NO: 18), which is presented in Table 6C using the one-letter TTCCAGGAACAGAACGAAAAGATCAAAAACCATACCATTACTGAAGAAGG amino acid code. AGAGATGCATATTCTGAACCTCGACACCCTCGGTATCATCAACGGCTGCA TABL E TCGATCTTATGTTCCAAGCAGAAAGTTATGCTGAATTCCCATACAACAAC Encoded ruCBPS1' protein sequence (SEQ ID NO: 18). ACCTATGGCATCAAAGCTTATACCAAGGAGAAGCGTGACGCTATATTACA MRFAASIAVALPVIHAASAOGFPPPVKGVTVVKSKFDENVKITYKENDIC CGACATCCACCGTCCTGACGGCTGCTTCGACAAGCTTACCAAGTGCCGTG ETTOGVRSFTGHVHLPPDNDDFGVYRNYSINTFFWFFEAREDPKNAPLSI AGGCCGCGAAAGAAGGAGACCCTCACTTCTACAGCAACAATGCAACCGTC 25 WLNGGPGSSSMIGLFOENGPCWWNEDSKSTTNNSFSWNNKVNMLYIDOPN AACACAATCTGTGCGGATGCTAACTCTGCCTGCGACAAATATCTAATGGA OWGFSYDWPTNITYSTINDTISVADFSNGVPAONLSTLVGTGSSONPWAT TCCTTTCCAAGAGACCAATCTTGGTTACTATGATATTGCTCATCCTCTTC ANNTVNAARSIWHFAOVWFOEFPEHKPNNNKISIWTESYGGRYGPSFASY AGGATCCCTTCCCCCCACCATTCTATAAGGGCTTCCTCAGCCAATCCAGC 30 FOEONEKIKNHTITEEGEMHILNLDTLGIINGCIDLMFOAESYAEFPYNN GTTCTATCTGACATGGGATCGCCAGTCAACTTCTCCCAATACGCCCAAGC TYGIKAYTKEKRDAILHDIHRPDGCFDKWTKCREAAKEGDPHFYSNNATW TGTGGGAAAATCATTCCATGGAGTTGGCGACTACGCTCGCCCTGATGTGC NTICADANSACDKYLMDPFOETNLGYYDIAHPLODPFPPPFYKGFLSOSS GCGGCTTCACCGGTGACATTGCTTATCTTCTCGAGAGCGGAGTCAAGGTT 35 VLSDMGSPVNFSOYAOAVGKSFHGVGDYARPDVRGFTGDIAYLLESGVKV GCTCTCGTCTATGGTGACAGAGACTACATCTGCAATTGGTTCGGTGGTGA ALWYGDRDYICNWFGGEOWSLGLNYTGTODFHRAKYADVKVNSSYWGGVW GCAGGTCAGTCTTGGCTTGAACTACACTGGCACCCAAGACTTCCACAGGG ROHGNFSFTRVFEAGHEVPGYOPETALKIFERIMFNKDISTGEIDIAOKP CAAAATATGCCGATGTCAAGGTCAACTCTTCATACGTCGGAGGCGTAGTG DYGTTGTESTFHIKNDIPPSPEPTCYLLSADGTCTPEOLNAIKDGTAVWE CGTCAACATGGAAACTTCTCTTTCACCAGAGTTTTCGAGGCCGGTCATGA 40 NYIIKSPAASKGNPPPTTTSSPTAAPTAGSAMLKAPWAMLAISALTWLAF AGTCCCTGGTTACCAACCCGAGACTGCCCTCAAGATCTTTGAGCGCATCA FL TGTTCAACAAGGATATTTCTACCGGTGAGATCGACATTGCTCAGAAACCA

GACTACGGTACCACTGGAACTGAGTCTACGTTCCATATCAAAAACGATAT 45 The disclosed ruCBPS1 has homology to the amino acid sequences shown in the BLAST data listed in Table 6D, 6E CCCTCCTTCGCCTGAGCCGACCTGCTACCTCCTCAGTGCTGACGGAACCT and 6F. This data was analyzed by the program PAIRWISE BLAST. TABLE 6D

TBLASTN results for ruCBPS1'

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

gi|32410708     Neurospora crassa                           1947    246/632 (38%)   337/632 (53%)   e-104
                strain OR74A

gi|3046860      Schizosaccharomyces pombe                   4308    137/480 (28%)   215/480 (44%)   1e-45
                cpy1 gene for carboxypeptidase Y

gi|18152938     Pichia angusta                              2214    139/508 (27%)   227/508 (44%)   2e-42
                carboxypeptidase Y (CPY) gene

TABLE 6E

BLASTX results for ruCBPS1'

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

gi|15004616     carboxypeptidase S1/                        555     221/567 (38%)   310/567 (54%)   e-102
                Aspergillus oryzae

gi|435818       carboxypeptidase S1, CPD-S1/                423     174/499 (34%)   258/499 (51%)   4e-77
                Penicillium janthinellum

gi|1995456      preprocarboxypeptidase Z/                   460     155/491 (31%)   243/491 (49%)   2e-58
                Absidia zychae

gi|19115337     carboxypeptidase Y/                         1002    137/480 (28%)   215/480 (44%)   1e-46
                Schizosaccharomyces pombe

gi|4028158      carboxypeptidase Y precursor;               541     139/508 (27%)   226/508 (44%)   2e-43
                vacuolar carboxypeptidase/
                Pichia angusta

TABLE 6F

BLASTP results for ruCBPS1 Gene Length Identity Positives Index/Identifier Protein Organism (aa) (%) (%) Expect gil 15004616 carboxypeptidase S1/ 555 222,567 310,567 7e-98 Aspergilius Oryzae (39%) (54%) gi435818 carboxypeptidase 423 174,499 259,499 1e-71 S1, CPD-S1/ (34%) (51%) Penicilium janthinelium gi1995456 preprocarboxypeptidase Z. 460 156,491 244,491 2e-57 Absidia zychae (31%) (49%) gil 19115337 carboxypeptidasey 1002 137,480 215,480 4e-44 Schizosaccharomyces pombe (28%) (44%) ruPAP 35 TABLE 7A- continued ruPAP is a T. rubrum prolylaminopeptidase. Genomic DNA sequence of a ruPAP nucleic acid of 1795 nucleotides ruPAP genomic nucleotide sequence (SEQ ID NO : 19). (SEQ ID NO: 19) is shown in Table 7A. GAGGGACTTAAAGAAGTCTTCACAACTGGTGGATTACCCCCTCTTGTGTC

TABLE 7A

AAAGCCTGATCCTGTGTACGAGAGGACCTACGGTAAGTTGGGATAGATTG

ruPAP genomic nucleotide sequence (SEQ ID NO : 19). GGCTATTTTTAGTTTAATATACAGCTGACATCTACAGACAAGGTCCAGTC

ATGCAAGCAGCAAAATTGTTGAGCCGGTACTGGCAAAATGTACCTGGTTA CCGGAATAAAGTGTACTATTCCACTTTCCCCGAAGACGAAGATCGAGTGC

GTGCAGCTAATCTTGAGTCACATCATGCATAGTTAACCGAGTATCACAAC 45 GGATTATACT CAAGCATCTCCAAACCCACGATGTTAAGCTCCCCGATGGC ACAATCTACTATTGCGTTTTTGCTAATGGCTACCATAGGAAGACTGAGGG TCACCGTTAACTCCGGAACGCTTTCTCCAGCTAGGAATTCATTTTGGAAT

TATCTGAGCTCCTTTTCCATGTCCCTTTAGACTACT CAAACCCGTCTTCC GAAAGGTACGCCATACTTCGCAGGTGACTTCTCGTAACCAATGACTAACA

ACTTCGCTCCGGTTGTTCGCCAGGAGTGTGCAGCGGCGAATTCCAGGGTC TATGCATATAGGGGGCATCGGCTTAGTTCATAGTATGATACCATCAATAA 50 CTCTCTCGATGATAAAGACAGACAGCTACCCTNGGATTGTTTTCCTGCAG CTTACATTATACTTATTCACTGACTAACAATGTCGAAATATCAGGCATAA

GGTGGACCAGGAGGAGCTTGCCCACAACCTCAGGAGGTAGGCTGGGTTGG TTTTGAAGTGCATTAATGAACTGGAATACTTTGGCTTCCTCACACGACCT

GCCATTGCTGGATCGAGGATTCCAGGTGAGTCTCCAGAATCGGGATGAGT ACTTTATCTCTGATTGAGAACGACACGAGTGCAGACAACGGCATTCTATA 55 AACTGTAGAACACCTTGTTGAATTTCTTGATTAGATCCTTCTCCTTGACC TGCCATAATGCATGAATCTATCTACTGCCAAGGGTAAAACGTCTCTCCTG

AGCGAGGAACAGGGCTTTCAACCCCTATAACCGCTGCGACGCTTGCTCTT ATCGAGTCAATATCAGAATCTAACGTGATACCGTAGGGAGGCCTCAAACT

CAGGGAAACGCAGTAAAGCAAGCCGAATATCTTAGGCTATTCCGTGCCGA GGGCTGCCGAAAGACTACTACCAAAGTTCTCTGGCTTCCGAGGCGCTCAT 60 TAATATCGTGCGAGACTGTGAAGCAGTGCGTAAACTATTGACTGCTTATT AATCCTGATGGCATCTACTTCACTGGGGAGATGGTATACAAACACTGGTT

ACCCTCCAGATAAGCAGAAATGGAGCGTCCTTGGCCAGAGTTTTGGAGGA TGAGTCGTCCACAGAACTCGGCCAGCTCAAAGAGGTAGCCGATATTCTTG

TTCTGTGCCGTCACGTATGTTTCTAAG TAGTGAGTAACTACTCCTTCAAA CTTCCTACAATGACTGGCCGCAGTTGTATGATAAGGAACAGCTCGCGCGC 65 TCCACCTGCTATAGATTGTCGTGCAAATCTAACCTTCATCATCTAGTCCT AACGAGGTGCCAGTGTATTCCGCTACATATGTCGAGGATATGTACGTGCA US 7,943,340 B2 39 40 TABLE 7A- continued TABLE 7B - continued ruPAP genomic nucleotide sequence (SEQ ID NO: 19). ruPAP nucleotide sequence (SEQ ID NO: 2O).

CTTCAGCTACGCCAACGAAACAGCTGCCACTATTCACAATTGCAAACAGT TCTATCTACTGCCAAGGGGAGGCCT CAAACTGGGCTGCCGAAAGACTACT

TCAT CACCAACACGATGTACCACAACGGACTGCGTTCAGATTCCGCTGAA ACCAAAGTTCTCTGGCTTCCGAGGCGCTCATAATCCTGATGGCATCTACT

CTTATTGCGCAGCTGTTTGCTCTTCGTGATGATACGATTGACTAG TCACTGGGGAGATGGTATACAAACACTGGTTTGAGTCGTCCACAGAACTC

10 GGCCAGCTCAAAGAGGTAGCCGATATTCTTGCTTCCTACAATGACTGGCC A ruPAP nucleic acid of 1326 (SEQID NO: 20) is shown in Table 7B. A disclosed ruPAP open reading frame (“ORF) GCAGTTGTATGATAAGGAACAGCTCGCGCGCAACGAGGTGCCAGTGTATT begins with an ATG start codon at position 1 (underlined in CCGCTACATATGTCGAGGATATGTACGTGCACTTCAGCTACGCCAACGAA Table 7B). is ACAGCTGCCACTATTCACAATTGCAAACAGTTCATCACCAACACGATGTA TABLE 7B CCACAACGGACTGCGTTCAGATTCCGCTGAACTTATTGCGCAGCTGTTTG

ruPAP nucleotide sequence (SEQ ID NO: 2O). CTCTTCGTGATGATACGATTGACTAG ATGCAAGCAGCAAAATTGTTGAGCCGGTACTGGCAAAATGTACCTGGAAG ACTGAGGGTATCTGAGCTCCTTTTCGATGTCCCTTTAGACTACT CAAACC 2O A disclosed ruPAP nucleic acid (SEQID NO:20) encodes a protein having 441 amino acid residues (SEQID NO: 21), CGTCTTCCACTTCGCTCCGGTTGTTCGCCAGGAGTGTGCAGCGGCGAATT which is presented in Table 7C using the one-letter amino acid CCAGGGTCCTCTCTCGATGATAAAGACAGACAGCTACCCTGGATTGTTTT code.

CCTGCAGGGTGGACCAGGAGGAGCTTGCCCACAACCTCAGGAGGTAGGCTr 25 TABLE 7C GGGTTGGGCCATTGCTGGATCGAGGATTCCAGATCCTTCTCCTTGACCAG Encoded ruPAP protein sequence (SEQ ID NO: 21).

CGAGGAACAGGGCTTTCAACCCCTATAACCGCTGCGACGCTTGCTCTTCA MOAAKLLSRYWONVPGRLRVSELLFDVPLDYSNPSSTSLRLFARSVORRI GGGAAACGCAGTAAAGCAAGCCGAATATCTTAGGCTATTCCGTGCCGATA 50 PGSSLDDKDROLPWIVPLOGGPGGACPQPQEvGWVGPLLDRGFOILLLDo ATATCGTGCGAGACTGTGAAGCAGTGCGTAAACTATTGACTGCTTATTAC RGTGLSTPITAATLALOGNAVKOAEYLRLFRADNIVRDCEAVRKLLTAYY

CCTCCAGATAAGCAGAAATGGAGCGTCCTTGGCCAGAGTTTTGGAGGATT PPDKOKWSVLGOSFGGFCAVTYVSNPEGLKEWFTTGGLPPLVSKPDPVYE CTGTGCCGTCACGTATGTTTCTAATCCTGAGGGACTTAAAGAAGTCTTCA 35 RTYDKvosRNKVYYSTFPEDEDRVRIILKHLoTHDvKLPDGSPLTPERFL CAACTGGTGGATTACCCCCTCTTGTGTCAAAGCCTGATCCTGTGTACGAG OLGIHFGMKGIILKCINELEYFGFLTRPTLSLIENDTSADNGILYAIMHE

AGGACCTACGACAAGGTCCAGTCCCGGAATAAAGTGTACTATTCCACTTT SIYCOGEASNWAAERLLPKFSGFRGAHNPDGIYFTGEMVYKHWFESSTEL CCCCGAAGACGAAGATCGAGTGCGGATTATACTCAAGCATCTCCAAACCC 40 GOLKEVADILASYNDWPOLYDKEOLARNEVPVYSATYVEDMYVHFSYANE ACGATGTTAAGCTCCCCGATGGCTCACCGTTAACTCCGGAACGCTTTCTC TAATIHNCKOFITNTMYHNGLRSDSAELIAOLFALRDDTID

CAGCTAGGAATTCATTTTGGAATGAAAGGCTAATTTTGAAGTGCATTAAT GAAACTGGAATACTTTGGCTTCCTCACACGACC TACTTTATCTCTGATTG The disclosed ruPAP has homology to the amino acid 45 sequences shown in the BLAST data listed in Table 7D, 7E AGAACGACACGAGTGCAGACAACGGCATTCTATATGCCATAATGCATGAA and 7F. This data was analyzed by the program PAIRWISE BLAST. TABLE 7D

TBLASTN results for ruPAP

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

gi|14329656     Aspergillus niger                           3752    151/307 (49%)   190/307 (61%)   e-118
                papA gene for prolyl aminopeptidase A

gi|32414442     Neurospora crassa                           1449    212/477 (44%)   285/477 (59%)   e-100
                strain OR74A

gi|604877       Aeromonas sobria                            1740    175/420 (41%)   239/420 (56%)   4e-77
                gene for prolyl aminopeptidase

TABLE 7E

BLASTX results for ruPAP

Gene Index/     Protein/Organism                            Length  Identity        Positives       Expect
Identifier                                                  (aa)    (%)             (%)

gi|18307408     prolylaminopeptidase A/                     442     266/442 (60%)   334/442 (75%)   e-152
                Aspergillus niger

gi|14456054     putative prolylaminopeptidase/              365     211/366 (57%)   263/366 (71%)   e-114
                Aspergillus nidulans

gi|22507295     prolylaminopeptidase/                       300     181/301 (60%)   226/301 (75%)   4e-99
                Talaromyces emersonii

gi|1236731      prolylaminopeptidase/                       425     175/420 (41%)   239/420 (56%)
                Aeromonas sobria
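The nucleotide and residue counts stated for the disclosed open reading frames can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming each coding sequence runs from the ATG start codon through a single terminal stop codon, so that the encoded protein has one residue fewer than the number of codons. The sequences shown for ruPAP, fuLAP1, fuLAP2, and ruCBPS1' end in TAG or TAA, which is consistent with that assumption. The function name `residues_from_orf` is hypothetical.

```python
def residues_from_orf(nucleotides: int) -> int:
    """Residue count for an ORF that includes its terminal stop codon."""
    if nucleotides <= 0 or nucleotides % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    # One codon per residue, minus the stop codon, which encodes no residue.
    return nucleotides // 3 - 1


# (ORF length in nucleotides, stated protein length in residues),
# taken from the descriptions of SEQ ID NOs: 8, 11, 17 and 20.
stated = {
    "fuLAP2": (1497, 498),
    "fuLAP1": (1167, 388),
    "ruCBPS1'": (1959, 652),
    "ruPAP": (1326, 441),
}

for name, (orf_nt, protein_aa) in stated.items():
    computed = residues_from_orf(orf_nt)
    status = "consistent" if computed == protein_aa else "mismatch"
    print(f"{name}: {orf_nt} nt -> {computed} residues ({status})")
```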

TABLE 7F

BLASTP results for ruPAP Gene Length Identity Positives Index/Identifier Protein Organism (aa) (%) (%) Expect gil 183074.08 prolylaminopeptidase Af 442 267,443 336,443 e-157 Aspergilius niger (60%) (75%) gil 14456054 putative prolylaminopeptidase 36S 211,366 263,366 e-116 Aspergilius nidians (57%) (71%) gi22507295 prolylaminopeptidase 3OO 181,301. 226,301 e-102 Taiaromyces emersonii (60%) (75%) gi|1236731 prolylaminopeptidase 42S 175,420 239.42O Aeromonas (41%) (56%) ruAMPP TABLE 8A-continued ruAMPP is a T. rubrum aminopeptidase P. Genomic DNA sequence of a ruAMPP nucleic acid of 2418 nucleotides ruAMPP genomic nucleotide sequence (SEQ ID NO: 22). (SEQ ID NO: 22) is shown in Table 8A. 35 CGTCTGCTATGGTTATTTGTATGACGCTAGATCTATTTTTGATCAAACAT

TABLE 8A

ATACTAACAAACGCAATATAGCCACCTTGGATGAGATTGCATGGCTCTTC

ruAMPP genomic nucleotide sequence (SEQ ID NO: 22). AACCTCCGTGGAAGCGAGTAAGTTTCTATATAAATGGTATCTTTCACTTT 40 ATGCCGCCACCACCGGTTGACACGACCCAGCGTCTCGCAAAGCTGCGAGA ATACAAAAAGCCATGCTGACTGGTGTAGTATTCCATATAACCCCGTCTTT

GCTGATGGCTCAGAACAAGGTCGATGTATATAGTATGCAATTCAGATACA TTCTCGTACGCAATTGTGACGCCCTCAGTTGCGGAACTCTATGTCGATGA

CCATTAAAGCTCCCTTGATAATAACAGTCGTATACT CATTCTTCTTTCTT GAGCAAGCTGTCTCCAGAAGCCAGAAAACATCTCGAAGGCAAGGTCGTTC CTACTCCTCGCCTTAAAGTTGTGCCTTCGGAAGACAGCCATCAGTCGAGT 45 TCAAGCCATACGAGTCCATCTTCCAAGCTTCCAAAGTCCTCGCCGAATCA ACATTGCTCCATGTGATGGGCGTCGAGGTTAGACCTGTCCCTCCATAAAA AAGGCATCGGCTAGCAGCGGTTCCTCTGGGAAGTTCTTGTTGTCTAACAA GAATAC CTACCCGTAATACCAGCCGGCAGACGCTCATACGTATCACTGCA GGCTTCGTGGTCTTTGAGCCTCGCCCTCGGTGGGGAACAGAACGTCGTTG GCTTTCATATCCAGCTTCACTGGCTCGGCAGGATGTGCCATCGTCTCTAT 50 AGGTTCGAAGTCCCATCACTGACGCCAAAGCCATCAAGAACGAAGTTGAA GAGTAAAGCTGCTCTGTCTACAGACGGCAGATACTTCAGCCAAGCTGCAA CTGGAAGGATTCAGAAAATGCCATATCCGAGACGGTGCAGCTCTGATCGA AACAGCTCGATGCCAACTGGATCCTGTTGAAGCGAGGTGTCGAGGGTGTC GTACTTCGCCTGGCTTGAAAATGCATTGATCAAAGAAGGTGCCAAGCTAG CCAACCTGGGAAGAATGGTATATCTGCCCCTGGTATCGACTTTTCCGGTA 55 ACGAAGTAGATGGAGCCGACAAACTCTTCGAGATCCGCAAGAAATATGAC TAATGGTTGACAGGCTGGATATAGGACCGCTGAGCAGGCCGAGACACGGC CTCTTCGTCGGCAACTCCTTCGACACCATCTCTTCTACCGGTGCTAACGG AAGGTTGTGGGTGTTGACCCGTCACTTATTACGGCAGGTGAGAATCTACA TGCTACCATTCATTACAAACCCGAGAAGTCAACTTGCGCTATCATTGACC GTATGCGTCTCTTACAAGTGTCATCGTGACTAACTGTATGTTATAGCGGA 60 TGCACGAAAGCTTTCTCAGACGTTGAAGACCACCGGAGGCTCCTTGGTTG CGAAGGCTATGTACCTGTGTGACTCTGGTGGCCAATACCTTGATGGTACT

GAATTGATCAGAACCTGATTGATGCCGTCTGGGGAGATGAACGTCCTGCA ACTGATACTACCCGAACTCTCCACTTTGGAGAGCCCACGGAGTTCCAGAA

CGGCCTGCCAACCAAATTACGGTACAGCCTGTTGAGCGCGCGGGAAAGTC GAAGGCTTATGCACTTGTTCTAAAGGGACATATCAGCATTGACAATGCCA 65 ATTCGAGGAGAAAGTGGAAGACCTGCGAAAGGAATTGACTGCGAAGAAGA TTTTCCCCAAAGGAACCACCGGATACGCCATTGACTCGTTTGCTCGACAG

The disclosed ruAMPP has homology to the amino acid sequences shown in the BLAST data listed in Tables 8D, 8E and 8F. This data was analyzed by the program PAIRWISE BLAST.

TABLE 8D

TBLASTN results for ruAMPP

Gene Index/Identifier  Protein/Organism                                 Length (aa)  Identity (%)   Positives (%)  Expect
gi|32403169            Neurospora crassa strain OR74A                   1845         339/630 (53%)  433/630 (68%)  0.0
gi|20453016            Drosophila melanogaster aminopeptidase P gene    12647        268/638 (42%)  369/638 (57%)  e-127
gi|17571207            Drosophila melanogaster (ApepP) on chromosome 2  12001        268/638 (42%)  369/638 (57%)  e-127
gi|4583560             Drosophila melanogaster Daminopep-p gene         2358         268/638 (42%)  369/638 (57%)  e-127
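The bracketed percentages in these BLAST tables can be re-derived from the match counts. The following is a minimal Python sketch, assuming (from the rows above, not from any BLAST documentation) that the percentage is truncated rather than rounded; the function name `percent` and the dictionary `table_8d` are illustrative only.

```python
def percent(matches: int, aligned: int) -> int:
    """Integer percentage of matching positions, truncated (not rounded)."""
    return 100 * matches // aligned


# (identities, positives, alignment length) copied from two rows of Table 8D.
table_8d = {
    "gi|32403169": (339, 433, 630),
    "gi|20453016": (268, 369, 638),
}

for gi, (identities, positives, aligned) in table_8d.items():
    # Print the identifier with its identity and positives percentages.
    print(gi, f"{percent(identities, aligned)}%", f"{percent(positives, aligned)}%")
```

Rounding to the nearest integer would give 54% and 69% for the first row, so truncation is what reproduces the 53% and 68% shown in the table.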

TABLE 8E

BLASTX results for ruAMPP

Gene Index/Identifier  Protein/Organism                                                Length (aa)  Identity (%)   Positives (%)  Expect
gi|25529603            X-Pro aminopeptidase, cytosolic form / Drosophila melanogaster  613          268/638 (42%)  369/638 (57%)  e-127
gi|4107172             aminopeptidase P / Drosophila melanogaster                      613          258/638 (40%)  369/638 (57%)  e-124
gi|15384991            Xaa-Pro aminopeptidase 2 / Lycopersicon esculentum              654          268/674 (39%)  365/674 (54%)  e-120
gi|8489879             cytosolic aminopeptidase P / Homo sapiens                       623          254/646 (39%)  358/646 (55%)  e-119
gi|2584787             Aminopeptidase P-like / Homo sapiens                            623          254/646 (39%)  357/646 (55%)  e-119

TABLE 8F

BLASTP results for ruAMPP

Gene Index/Identifier  Protein/Organism                                                Length (aa)  Identity (%)   Positives (%)  Expect
gi|30923284            Probable peptidase C22G7.01c                                    598          291/629 (46%)  384/629 (61%)  e-156
gi|25529603            X-Pro aminopeptidase, cytosolic form / Drosophila melanogaster  613          268/638 (42%)  369/638 (57%)  e-124
gi|15384991            Xaa-Pro aminopeptidase 2 / Lycopersicon esculentum              654          268/674 (39%)  365/674 (54%)  e-123
gi|8489879             cytosolic aminopeptidase P / Homo sapiens                       623          254/646 (39%)  358/646 (55%)  e-122
gi|2584787             Aminopeptidase P-like / Homo sapiens                            623          254/646 (39%)  357/646 (55%)  e-122
gi|4107172             aminopeptidase P / Drosophila melanogaster                      613          258/638 (40%)  369/638 (57%)  e-121
gi|18777778            cytoplasmic aminopeptidase P / Rattus norvegicus                623          253/645 (39%)  353/645 (54%)  e-120
gi|18875372            cytosolic aminopeptidase P / Mus musculus                       623          250/645 (38%)  354/645 (54%)  e-118
gi|15384989            Xaa-Pro aminopeptidase 1 / Lycopersicon esculentum              655          264/674 (39%)  361/674 (53%)  e-117

TABLE 9B-continued

ruPLD nucleotide sequence (SEQ ID NO: 26).

TAACCCGGCTGACCCGAATCGCATGTTTAAATACTTGCGTCTGCGAGGCA
CTGTTCCAGAGGGATCCGTCATTACAATTGAGCCCGGTGTCTACTTCTGC
CGTTACATCATTGAGCCATTCCTTACTAACCCCGAGACCAGCAAGTACAT
CAACTCCGAAGTTCTAGACAAGTACTGGGCTGTTGGAGGTGTACGTATCG
AGGACAACGTCGTCGTCCGCGCCAATGGCTTTGAGAACCTGACCACGGTG
CCAAAGGAGCCCGAGGAGGTCGAACGCATTGTCCAGGAGGGTGCTAAATA
A

A disclosed partial ruPLD nucleic acid (SEQ ID NO: 26) encodes a protein with a partial sequence having 466 amino acid residues (SEQ ID NO: 27), which is presented in Table 9C using the one-letter amino acid code.

TABLE 9C

Encoded ruPLD protein sequence (SEQ ID NO: 27).

PNSAIMDIHVDKYPAKSHARRVAEKLKAAGHGSTGIIFVEGQKEHIIDDS
DEPFHFRQRRNFLYLSGCLEAECSWAYNIEKDELTLFIPPVDPASVMWSG
LPLEPAEALKQFDVDAVLLTTEINNYLAKCGGEKVFTIADRVCPEVSFSS
FKHNDTDALKLAIESCRIVKDEYEIGLLRRANEVSSQAHIEVMKAATKSK
NERELYATLNYWCMSNGCSDQSYHPILACGPNAATLHYTKNNGDLTNPAT
GIKDQLVLIDAGCQYKAYCADITRAFPLSGKFTTEGRQIYDIALEMQKVA
FGMIKPNWLFDDMHAAWHRWAIKGLLKIGILTGSEDEIFDKGISTAFFPH
GLGHHLGMDTHDWGGNPNPADPNRMFKYLRLRGTVPEGSVITIEPGVYFC
RYIIEPFLTNPETSKYINSEVLDKYWAVGGVRIEDNVVVRANGFENLTTV
PKEPEEVERIVQEGAK

The disclosed partial ruPLD has homology to the amino acid sequences shown in the BLAST data listed in Tables 9D, 9E and 9F. This data was analyzed by the program PAIRWISE BLAST.

TABLE 9D

TBLASTN results for ruPLD

Gene Index/Identifier  Protein/Organism                                         Length (aa)  Identity (%)   Positives (%)  Expect
gi|14272360            Aspergillus nidulans pepP gene for prolidase, exons 1-3  2632         199/348 (57%)  249/348 (71%)  e-143
gi|32420910            Neurospora crassa strain OR74A                           2562         235/457 (51%)  324/457 (70%)  e-136
gi|3114965             Suberites domuncula mRNA for prolidase, form 1           1688         157/464 (33%)  235/464 (50%)  4e-66
gi|22531161            Arabidopsis thaliana X-Pro dipeptidase-like protein      1672         160/477 (33%)  242/477 (50%)  2e-64
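As a consistency check on the two ruPLD sequence tables, the nucleotide fragment in Table 9B-continued can be translated and compared with the end of the protein in Table 9C. The sketch below is illustrative Python and not part of the disclosure: `translate`, `FRAGMENT` and the reading-frame offset of 1 are assumptions chosen for this example, the codon table is the standard genetic code, and the fragment is typed from the table with the stray period in its third line dropped.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Table 9B-continued fragment of SEQ ID NO: 26 (301 nucleotides).
FRAGMENT = (
    "TAACCCGGCTGACCCGAATCGCATGTTTAAATACTTGCGTCTGCGAGGCA"
    "CTGTTCCAGAGGGATCCGTCATTACAATTGAGCCCGGTGTCTACTTCTGC"
    "CGTTACATCATTGAGCCATTCCTTACTAACCCCGAGACCAGCAAGTACAT"
    "CAACTCCGAAGTTCTAGACAAGTACTGGGCTGTTGGAGGTGTACGTATCG"
    "AGGACAACGTCGTCGTCCGCGCCAATGGCTTTGAGAACCTGACCACGGTG"
    "CCAAAGGAGCCCGAGGAGGTCGAACGCATTGTCCAGGAGGGTGCTAAATA"
    "A"
)


def translate(dna: str, frame: int = 0) -> str:
    """Translate dna starting at a 0-based offset; '*' marks a stop codon."""
    seq = dna[frame:]
    # Step through complete codons only; a trailing partial codon is ignored.
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Offset 1 is the frame in which the fragment ends exactly on its TAA codon.
peptide = translate(FRAGMENT, frame=1)
print(len(FRAGMENT), peptide)
```

In this frame the fragment gives 99 residues followed by a single stop, ending in PKEPEEVERIVQEGAK, in agreement with the C-terminal residues of SEQ ID NO: 27 in Table 9C (reading any scanned letter O in the protein table as glutamine, Q).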

TABLE 9E

BLASTX results for ruPLD

Gene Index/Identifier  Protein/Organism                                       Length (aa)  Identity (%)   Positives (%)  Expect
gi|14272361            prolidase / Emericella nidulans                        496          267/463 (57%)  336/463 (72%)  e-153
gi|3114966             prolidase / Suberites domuncula                        501          157/464 (33%)  235/464 (50%)  1e-66
gi|22531162            X-Pro dipeptidase-like protein / Arabidopsis thaliana  486          160/477 (33%)  242/477 (50%)  6e-65
gi|30582223            peptidase D / Homo sapiens                             493          152/452 (33%)  231/452 (51%)  2e-63
gi|20271451            peptidase D / Homo sapiens                             493          152/452 (33%)  230/452 (50%)  3e-63

TABLE 9F

BLASTP results for ruPLD

Gene Index/Identifier  Protein/Organism                                       Length (aa)  Identity (%)   Positives (%)  Expect
gi|14272361            prolidase / Emericella nidulans                        496          267/463 (57%)  336/463 (72%)  e-158
gi|3114966             prolidase / Suberites domuncula                        501          158/466 (33%)  235/466 (50%)  6e-67
gi|22531162            X-Pro dipeptidase-like protein / Arabidopsis thaliana  486          159/477 (33%)  241/477 (50%)  6e-64
gi|30584879            Homo sapiens peptidase D                               494          152/452 (33%)  231/452 (51%)  2e-63
gi|15929143            peptidase D, Homo sapiens                              493          152/452 (33%)  231/452 (51%)  2e-63
gi|20271451            peptidase D, Homo sapiens                              493          152/452 (33%)  230/452 (50%)  4e-63

caLAP2

caLAP2 is a Microsporum canis leucine aminopeptidase. A caLAP2 nucleic acid of 1730 nucleotides (SEQ ID NO: 28) is shown in Table 10A.

TABLE 10A-continued

caLAP2 genomic nucleotide sequence (SEQ ID NO: 28).

GGAGTTCGGCCTTCTCGGCAGCACTTTCTACGTCGACAGCCTTGACGACC
GTGAACTGCACAAGGTCAAGCTGTACCTCAACTTCGACATGATTGGCTCC
CCCAACTTCGCCAACCAGATCTACGACGGAGACGGCTCCGCCTACAACAT

TABLE 10A

caLAP2 genomic nucleotide sequence (SEQ ID NO: 28).

ATGAAGACACAGTTGTTGAGTCTGGGAGTTGCCCTCACGGCCATCTCTCA GACTGGCCCCGCCGGATCTGCTGAAATCGAGTACCTGTTCGAGAAGTTCT GGGCGTTATTGCTGAGGATGCCTTGAACTGGCCATTCAAGCCGTTGGTTA 30 TCGATGACCAGGGAATCCCACACCAGCCCACCGCCTTCACCGGCCGCTCC

ATGCTGTGAGTATATACACAAGATCGATCGATCGTCCTCTTGTCCCTGTC GACTACTCTGCCTTCATCAAGCGCAACGTCCCTGCCGGAGGTCTGTTTAC

ACTTATCGCTCTACAGTAAGCAAAAATACTGGAGAATCATGTGCTGATGT TGGTGCTGAGGTCGTCAAGACCGCCGAGCAGGCTAAGCTATTTGGCGGCG AAATTATAATAccTAAAAAAATTAATAAATTTAT 35 AGGCTGGCGTTGCTTATGACAAGAACTACCACGGCAAGGGCGACACTGTA GACAACATCAACAAGGGTGCTATCTACCTCAACACTCGAGGAATCGCGTA CTGGCGTACAGAAACTCCAAGACTTCGCCTACGCTCACCCTGAGAAGAAT TGCCACTGCTCAGTATGCTAGTTCGCTGCGCGGATTCCCAACCCGCCCAA CGAGTATTCGGTGGTGCTGGCCACAAGGATACCGTCGACTGGATCTACAA

TGAGCTCAAGGCTACCGGCTACTACGATGTGAAGATGCAGCCACAAGTCC 40 AGACGGGTAAGCGTGACGTGAGCCCCCGTGGCCAGTCTATGCCTGGTGGT GGATGCGGACACCACAGCGTCTTCATGTAA ACCTGTGGTCTCATGCTGAGGCAGCTGTCAATGCCAATGGCAAGGATCTC

ACTGCCAGTGCCATGTCCTACAGCCCTCCAGCCGACAAGATCACTGCCGA A disclosed calAP2 open reading frame (“ORF) of 1488 GCTTGTCCTGGCCAAGAACATGGGATGCAATGCTGTATGTGCGCCCCTTT 45 nucleotides begins with an ATG start codon at position 1 (underlined in Table 10B). TCCATTCTATATATCGACTGGTCGCTTGGAAATTCAGAAGAGCTGACAAT

TGCAAACAGACTGATTACCCAGAGGGTACCAAGGGCAAGATTGTCCTCAT TABLE 1 OB

CGAGCGTGGTGTCTGCAGCTTTGGCGAGAAGTCCGCTCAGGCTGGCGATG 50 caLAP2 nucleotide sequence (SEQ ID NO : 29). ATGAAGACACAGTTGTTGAGTCTGGGAGTTGCCCTCACGGCCATCTCTCA CAAAGGCTATTGGTGCCATCGTCTACAACAACGTCCCTGGAAGCTTGGCC GGGCGTTATTGCTGAGGATGCCTTGAACTGGCCATTCAAGCCGTTGGTTA GGCACCCTGGGTGGCCTTGACAACCGCCATGCTCCAACTGCTGGAATCTC ss ATGCTGATGACCTGCAAAACAAGATTAAGCTCAAGGATCTTATGGCTGGC TCAGGCTGATGGAAAGAACCTCGCTAGCCTTGTCGCCTCTGGCAAGGTTA GTACAGAAACTCCAAGACTTCGCCTACGCTCACCCTGAGAAGAATCGAGT CCGT CACCATGAACGTTATCAGCAAGTTTGAGAACAGGACTACGTGAGTA ATTCGGTGGTGCTGGCCACAAGGATACCGTCGACTGGATCTACAATGAGC TTGTTCCATACTTTGGTCAACAATGATATATACACGTACTAACACTGCTC TCAAGGCTACCGGCTACTACGATGTGAAGATGCAGCCACAAGTCCACCTG TATAGCTGGAACGTCATTGCCGAGACCAAGGGAGGAGACCACAACAACGT TGGTCTCATGCTGAGGCAGCTGTCAATGCCAATGGCAAGGATCTCACTGC

CATCATGCTCGGTTCTCACTCTGACTCTGTCGACGCCGGCCCTGGTATCA CAGTGCCATCTCCTACAGCCCTCCAGCCGACAAGATCACTGCCGAGCTTG

ACGACAACGGCTCCGGTACCATTGGTATCATGACCGTTGCCAAAGCCCTC TCCTGGCCAAGAACATGGGATGCAATGCTACTGATTACCCAGAGGGTACC 65 ACCAACTTCAAGGTCAACAACGCCGTCCGCTTCGGCTGGTGGACCGCCGA AAGGGCAAGATTGTCCTCATCGAGCGTGGTGTCTGCAGCTTTGGCGAGAA US 7,943,340 B2 53 54 TABLE 1 OB - continued TABLE 1 OB - continued

caLAP2 nucleotide sequence (SEQ ID NO : 29). caLAP2 nucleotide sequence (SEQ ID NO : 29).

GTCCGCTCAGGCTGGCGATGCAAAGGCTATTGGTGCCAATGGCAAGGATC CCGCCCAAAGACGGGTAAGCGTGACGTGAGCCCCCGTGGCCAGTCTATGC

TCGTCCCTGGAAGCTTGGCCGGCACCCTGGGTGGCCTTGACAACCGCCAT CTGGTGGTGGATGCGGACACCACAGCGTCTTCATGTAA 10

GCTCCAACTGCTGGAATCTCTCAGGCTGATGGAAAGAACCTCGCTAGCCT A disclosed calAP2 nucleic acid (SEQ ID NO: 29) encodes a protein having 495 amino acid residues (SEQ ID TGTCGCCTCTGGCAAGGTTACCGT CACCATGAACGTTATCAGCAAGTTTG 15 NO:30), which is presented in Table 10C using the one-letter amino acid code. AGAACAGGACTACCTGGAACGTCATTGCCGAGACCAAGGGAGGAGACCAC TABLE AACAACGTCATCATGCTCGGTTCTCACTCTGACTCTGTCGACGCCGGCCC Encoded calAP2 protein sequence (SEQ ID NO: 30). TGGTATCAACGACAACGGCTCCGGTACCATTGGTATCATGACCGTTGCCA MKTOLLSLGVALTAISQGVIAEDALNWPFKPLVNADDLONKIKLKDLMAG

AAGCCCT CACCAACTTCAAGGTCAACAACGCCGTCCGCTTCGGCTGGTGG WOKLODFAYAHPEKNRVFGGAGHKDTVDWIYNELKATGYYDWKMOPOVHL 25

ACCGCCGAGGAGTTCGGCCTTCTCGGCAGCACTTTCTACGTCGACAGCCT WSHAEAAWNANGKDLTASAMSYSPPADKITAELWLAKNMGCNATDYPEGT

TGACGACCGTGAACTGCACAAGGTCAAGCTGTACCTCAACTTCGACATGA KGKIWLIERGWCSFGEKSAQAGDAKAIGAIWYNNWPGSLAGTLGGLDNRH 30 TTGGCTCCCCCAACTTCGCCAACCAGATCTACGACGGAGACGGCTCCGCC APTAGISOADGKNLASLVASGKVTVTMNVISKFENRTTWNVIAETKGGDH

TACAACATGACTGGCCCCGCCGGATCTGCTGAAATCGAGTACCTGTTCGA NNWIMLGSHSDSWDAGPGINDINGSGTIGIMTWAKALTNFKWNNAWRFGWW

35 GAAGTTCTTCGATGACCAGGGAATCCCACACCAGCCCACCGCCTTCACCG TAEEFGLLGSTFYVDSLDDRELHKWKLYLNFDMIGSPNFANOIYDGDGSA

GCCGCTCCGACTACTCTGCCTTCATCAAGCGCAACGTCCCTGCCGGAGGT YNMTGPAGSAEIEYLFEKFFDDOGIPHOPTAFTGRSDYSAFIKRNVPAGG

40 CTGTTTACTGGTGCTGAGGTCGTCAAGACCGCCGAGCAGGCTAAGCTATT LFTGAEVVKTAEOAKLFGGEAGWAYDKNYHGKGDTVDNINKGAIYLNTRG

IAYATAOYASSLRGFPTRPKTGKRDVSPRGOSMPGGGCGHHSVFM TGGCGGCGAGGCTGGCGTTGCTTATGACAAGAACTACCACGGCAAGGGCG

45 ACACTGTAGACAACATCAACAAGGGTGCTATCTACCTCAACACTCGAGGA The disclosed calAP2 has homology to the amino acid sequences shown in the BLAST data listed in Table 10D, 10E ATCGCGTATGCCACTGCTCAGTATGCTAGTTCGCTGCGCGGATTCCCAAC and 1 OF. This data was analyzed by the program PAIRWISE BLAST. TABLE 1 OD

TBLASTN results for caLAP2

Gene Index/Identifier  Protein/Organism                                                   Length (aa)  Identity (%)   Positives (%)  Expect
gi|600025              Saccharomyces cerevisiae (s288c) RIF1, DPB3, YmL27 and SNF5 genes  32421        182/477 (38%)  254/477 (53%)  8e-77
gi|469463              Saccharomyces cerevisiae aminopeptidase Y gene                     2272         182/477 (38%)  254/477 (53%)  8e-77
gi|16033407            Bacillus licheniformis leucine aminopeptidase precursor, gene      2054         132/474 (27%)  215/474 (45%)  3e-27

TABLE 10E

BLASTX results for caLAP2

Gene Index/Identifier  Protein/Organism                                                 Length (aa)  Identity (%)   Positives (%)  Expect
gi|1077010             aminopeptidase Y precursor, vacuolar / Saccharomyces cerevisiae  537          182/477 (38%)  254/477 (53%)
gi|6319763             aminopeptidase yscIII; Ape3p / Saccharomyces cerevisiae          563          182/477 (38%)  254/477 (53%)
gi|31791596            probable lipoprotein aminopeptidase LPQL / Mycobacterium bovis   500          188/485 (38%)  269/485 (55%)
gi|15839805            hydrolase / Mycobacterium tuberculosis                           493          187/481 (38%)  268/481 (55%)

TABLE 10F

BLASTP results for caLAP2

Gene Index/Identifier  Protein/Organism                                                 Length (aa)  Identity (%)   Positives (%)  Expect
gi|6319763             aminopeptidase yscIII; Ape3p / Saccharomyces cerevisiae          563          182/477 (38%)  254/477 (53%)  5e-78
gi|1077010             aminopeptidase Y precursor, vacuolar / Saccharomyces cerevisiae  537          182/477 (38%)  254/477 (53%)
gi|15839805            hydrolase / Mycobacterium tuberculosis                           493          187/481 (38%)  268/481 (55%)
gi|31617182            probable lipoprotein aminopeptidase LPQL / Mycobacterium bovis   500          188/485 (38%)  269/485 (55%)
gi|15598135            probable aminopeptidase / Pseudomonas aeruginosa                 536          166/445 (37%)  242/445 (54%)
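The caLAP2 description above refers to an open reading frame of 1488 nucleotides that begins with an ATG start codon. A minimal forward-strand ORF scan is sketched below in Python. `find_orfs` and its `min_aa` parameter are hypothetical names for this illustration, the stop codons are the standard TAA, TAG and TGA, and the 50-residue default mirrors the minimum size that the specification later mentions for candidate ORFs; the reverse strand is deliberately not scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_aa: int = 50) -> list[tuple[int, int, int]]:
    """Return (start, end, n_residues) for forward-strand ORFs.

    start and end are 0-based, end-exclusive, and span the stop codon;
    n_residues counts the ATG-encoded methionine but not the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                n_residues = (i - start) // 3
                if n_residues >= min_aa:
                    orfs.append((start, i + 3, n_residues))
                start = None  # resume looking for the next ATG in this frame
    return orfs


# Toy input: two leading bases, ATG, 60 alanine codons, a stop, two trailing bases.
toy = "CC" + "ATG" + "GCT" * 60 + "TAA" + "GG"
print(find_orfs(toy))
```

By the same arithmetic, a 1488-nucleotide ORF that includes its stop codon corresponds to (1488 - 3) / 3 = 495 residues, which is the length stated for the caLAP2 protein (SEQ ID NO: 30).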

meLAP2

meLAP2 is a Trichophyton mentagrophytes leucine aminopeptidase. A meLAP2 nucleic acid of 1775 nucleotides (SEQ ID NO: 31) is shown in Table 11A.

TABLE 11A-continued

meLAP2 genomic nucleotide sequence (SEQ ID NO: 31).

ACGTCCCCGGATCCCTTGCTGGCACTCTTGGTGGCCTTGACAAGCGCCAT

TABLE 11A GTCCCAACCGCTGGTCTTTCCCAGGAGGATGGAAAGAATCTTGCTAGCCT meLAP2 genomic nucleotide sequence (SEQ ID NO : 31). CGTTGCTTCTGGCAAGGTTGATGTCACCATGAACGTTGTCAGTCTGTTTG ATGAAGTCGCAACTGTTGAGCCTAGCCGTGGCCGT CACCACCATTTCCCA 45 AGAACCGAACCACGTAAGTAACT CAACGTCATATCCAGCATTAATCTTCA GGGCGTTGTTGGTCAAGAGCCCTTTGGATGGCCCTTCAAGCCTATGGTCA GGAGTATATATACTAATTCGGTATCTCACAGCTGGAACGTCATTGCTGAG CTCAGGTGAGTTGCTGTCAACAGATCGATCGATCGATCTACCTTCGTCCC ACCAAGGGAGGAGACCACAACAATGTTGTCATGCTTGGTGCTCACTCCGA TGTCACCTATAACTCCACAGCAGGACCAAGAAAACACAAGTTTTCCGGGG 50 CTCCGTCGATGCCGGCCCCGGTATCAACGACAACGGCTCCGGCTCCATTG AATTCTTATGTGCTGATGTAAATGTATAGGATGACCTGCAAAACAAGATT GTATCATGACCGTTGCCAAAGCCCTTACTAACTTCAAGCTCAACAACGCC AAGCTCAAGGATATCATGGCAGGTGTGGACACTGTCGAGTGGATCTACAA GTTCGCTTTGCCTGGTGGACCGCTGAGGAATTCGGTCTCCTTGGAAGCAC TGAGCTCAAGGCCACCGGCTACTACAATGTGAAGAAGCAGGAGCAGGTAC 55 CTTCTACGTCGACAGCCTTGATGACCGTGAGCTGCACAAGGTCAAGCTGT ACCTGTGGTCTCACGCTGAGGCCGCTCTCAGTGCCAATGGCAAGGACCTC ACCTCAACTTCGACATGATCGGCTCTCCCAACTTCGCCAACCAGATCTAC AAGGCCAGCGCCATGTCGTACAGCCCTCCTGCCAACAAGATCATGGCCGA GACGGTGACGGTTCGGCCTACAACATGACTGGTCCCGCTGGCTCTGCTGA GCTTGTCGTTGCCAAGAACAATGGCTGCAATGCTGTAAGTGCCATACACT 60 TCCTATACATCACATTCACTTTAGAATGAAGAGCGCGGGAGAACTGATTT AATCGAGTACCTGTTCGAGAAGTTCTTTGACGACCAGGGTCTCCCACACC

TTTTTTTTTTTTTTTTTTTTTTGTAACAGACCGATTACCCAGAGAACACT AGCCCACTGCCTTCACCGGCCGATCCGACTACTCTGCATTCATCAAGCGC

CAGGGAAAGATAGTCCTCATTCAGCGTGGTGTCTGCAGCTTCGGCGAGAA AACGTCCCCGCTGGAGGTCTTTTCACTGGTGCCGAGGTTGTCAAGACCCC 65 GTCTTCTCAGGCTGGTGATGCGAAGGCTATTGGTGCCGTTGTCTACAACA CGAGCAAGTTAAGCTGTTCGGTGGTGAGGCTGGCGTTGCCTATGACAAGA

TABLE 11E

BLASTX results for meLAP2

Gene Index/Identifier  Protein/Organism                                                 Length (aa)  Identity (%)   Positives (%)  Expect
gi|1077010             aminopeptidase Y precursor, vacuolar / Saccharomyces cerevisiae  537          180/479 (37%)  251/479 (52%)
gi|6319763             aminopeptidase yscIII; Ape3p / Saccharomyces cerevisiae          563          180/479 (37%)  251/479 (52%)
gi|15839805            hydrolase / Mycobacterium tuberculosis                           493          159/440 (36%)  236/440 (53%)
gi|31791596            probable lipoprotein aminopeptidase LPQL / Mycobacterium bovis   500          159/440 (36%)  236/440 (53%)
gi|15598135            probable aminopeptidase / Pseudomonas aeruginosa                 536          158/445 (35%)  237/445 (53%)
gi|1045225             N-acetylpuromycin N-acetylhydrolase / Streptomyces anulatus      485          154/477 (32%)  218/477 (45%)
gi|29831415            putative aminopeptidase / Streptomyces avermitilis               315          95/244 (38%)   131/244 (53%)

TABLE 11F

BLASTP results for meLAP2

Gene Index/Identifier  Protein/Organism                                                 Length (nt)  Identity (%)   Positives (%)  Expect
gi|6319763             aminopeptidase yscIII; Ape3p / Saccharomyces cerevisiae          563          179/479 (37%)  248/479 (51%)
gi|1077010             aminopeptidase Y precursor, vacuolar / Saccharomyces cerevisiae  537          179/479 (37%)  248/479 (51%)
gi|31617182            probable lipoprotein aminopeptidase LPQL / Mycobacterium bovis   500          159/440 (36%)  236/440 (53%)
gi|15839805            hydrolase / Mycobacterium tuberculosis                           493          159/440 (36%)  236/440 (53%)

ruDPPIV

ruDPPIV is a T. rubrum dipeptidylpeptidase IV. A ruDPPIV nucleic acid of 2326 nucleotides (SEQ ID NO: 34) is shown in Table 12A. A disclosed ruDPPIV open reading frame ("ORF") begins with an ATG start codon at position 1 (underlined in Table 12A).

TABLE 12A-continued

ruDPPIV nucleotide sequence (SEQ ID NO: 34).

CTGGAACAATGGCAAGACCAAGCGTATTACCGAAAATGGCGGCCCGGATA
TCTTCAATGGTGTCCCTGACTGGGATATACGAGGAAGAAATCTTCGGGGA

TABLE 12A CCGGTTCGTCTTTGGTTCT CACCTGACGGTGAATACCTTGCGTACCTCCG ruDPPIV nucleotide sequence (SEQ ID NO. 34). 50 CTTTAACGAGACTGGAGTCCCGACCTACACTATTCCGTACTACAAGAACA

ATGAAGCTCCTCTCGCTACTTATGCTGGCGGGCATCGCCCAAGCCATCGT AGCAAAAGATTGCCCCTGCCTACCCAAGGGAGCTGGAGATCCGTTACCCT

TCCTCCTCGTGAGCCCCGTTCACCAACTGGTGGCGGCAACAAGCTGTTGA AAAGTCTCTGCGAAGAACCCAACCGTGCAGTTCCACCTGTTAAACATTGC

CCTACAAGGAGTGTGTCCCTAGAGCTACTATCTCTCCAAGGTCGACGTCC 55 TTCATCCCAGGAGACAACTATCCCAGTTACTGCGTTCCCGGAAAACGATC

CTTGCCTGGATTAACAGTGAAGAAGATGGCCGGTACATCTCCCAGTCCGA TTGTGATCGGTGAGGTTGCTTGGCTCAGCAGTGGCCATGATAGTGTAGCA

CGATGGAGCATTGATCCTCCAGAACATCGTCACGAACACCAACAAGACTC TATCGTGCTTTCAACCGTGTCCAGGATAGAGAAAAGATTGTCAGCGTCAA

TCGTGGCCGCAGACAAGGTACCCAAGGGTTACTATGACTACTGGTTCAAG GGTTGAGTCCAAGGAATCCAAGGTTATTCGCGAAAGAGATGGCACCGACG 60 CCAGACCTTTCTGCTGTCTTATGGGCAACCAATTACACCAAGCAGTACCG GCTGGATCGACAACCTTCTCTCATGTCATATATCGGAAACGTTAACGGCA

TCACTCTTACTTTGCCAACTACTTATTCTAGACATCAAAAAAGGGATCGT AGGAGTACTACGTCGATATATCTGATGCTTCTGGCTGGGCACATATCTAC

TGACCCCTCTAGCCCAGGACCAGGCTGGTGACATCCAGTATGCTCAATGG CTCTACCCGGTTGATGGAGGAAAGGAGATTGCACTAACAAAGGGAGAATG 65 AGCCCCATGAACAACTCTATCGCCTATGTCCGTGRAAACGACCTGTATAT GGAAGTCGTTGCCATTCTCAAGGTTGACACGAAGAAGAAGCTGATCTACT

TABLE 12D

BLASTX results for ruDPPIV

Gene Index/Identifier  Protein/Organism                                  Length (aa)  Identity (%)   Positives (%)  Expect
gi|2351700             dipeptidyl-peptidase IV / Aspergillus fumigatus   765          218/341 (63%)  270/341 (79%)
gi|2924305             prolyl dipeptidyl peptidase / Aspergillus oryzae  771          213/344 (61%)  270/344 (78%)
gi|1621279             dipeptidyl-peptidase IV / Xenopus laevis          748          118/349 (33%)  186/349 (53%)
gi|535388              dipeptidyl peptidase IV / Homo sapiens            766          125/375 (33%)  191/375 (50%)

TABLE 12E

BLASTP results for ruDPPIV

Gene Index/Identifier  Protein/Organism                                        Length (aa)  Identity (%)   Positives (%)  Expect
gi|2351700             dipeptidyl-peptidase IV / Aspergillus fumigatus         765          468/761 (61%)  585/761 (76%)
gi|2924305             prolyl dipeptidyl peptidase / Aspergillus oryzae        771          448/769 (58%)  568/769 (73%)
gi|14330263            dipeptidyl aminopeptidase type IV / Aspergillus niger   901          261/733 (35%)  387/733 (52%)  e-114
gi|19114882            dipeptidyl aminopeptidase / Schizosaccharomyces pombe   793          258/742 (34%)  396/742 (53%)  e-106
gi|3660                dipeptidyl aminopeptidase B / Saccharomyces cerevisiae  841          254/750 (33%)  370/750 (49%)
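The Expect column in these tables mixes full values such as 4e-66 with the shorthand e-114. A small Python helper for comparing such entries is sketched below; it assumes that the shorthand stands for 1e-114, which is an interpretation of the report format rather than something stated in the tables, and `parse_expect` is a hypothetical name used only here.

```python
def parse_expect(text: str) -> float:
    """Convert a BLAST Expect entry such as 'e-114' or '4e-66' to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        # Assumed shorthand: a bare exponent means a mantissa of 1.
        text = "1" + text
    return float(text)


# Expect values for the two Table 12E rows that report one.
hits = {"Aspergillus niger": "e-114", "Schizosaccharomyces pombe": "e-106"}

# Smaller Expect values indicate more significant alignments.
ranked = sorted(hits, key=lambda name: parse_expect(hits[name]))
print(ranked)
```

Sorting on the parsed value puts the Aspergillus niger hit ahead of the Schizosaccharomyces pombe hit, matching the order in which Table 12E lists them.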

One aspect of the invention pertains to isolated nucleic acid Thus a mature form arising from a precursor polypeptide or molecules that encode EXOX polypeptides or biologically protein that has residues 1 to N, where residue 1 is the N-ter active portions thereof. Also included in the invention are minal methionine, would have residues 2 through N remain nucleic acid fragments Sufficient for use as hybridization 40 ing after removal of the N-terminal methionine. Alternatively, probes to identify EXOX-encoding nucleic acids (e.g., a mature form arising from a precursor polypeptide or protein EXOX mRNAs) and fragments for use as PCR primers for the having residues 1 to N, in which an N-terminal signal amplification and/or mutation of EXOX nucleic acid mol sequence from residue 1 to residue M is cleaved, would have ecules. As used herein, the term “nucleic acid molecule' is 45 the residues from residue M+1 to residue N remaining. Fur intended to include DNA molecules (e.g., cDNA or genomic ther as used herein, a “mature' form of a polypeptide or DNA), RNA molecules (e.g., mRNA), analogs of the DNA or protein may arise from a step of post-translational modifica RNA generated using nucleotide analogs, and derivatives, tion other than a proteolytic cleavage event. Such additional fragments and homologs thereof. The nucleic acid molecule processes include, by way of non-limiting example, glycosy may be single-stranded or double-stranded. 50 lation (N-, O- and W types), myristoylation, phosphorylation, An EXOX nucleic acid can encode a mature EXOX Sulfation, N-terminus cyclisation, or C-terminus amidation. polypeptide. As used herein, a “mature' form of a polypep In general, a mature polypeptide or protein may result from tide or protein disclosed in the present invention is the product the operation of only one of these processes, or a combination of a naturally occurring polypeptide or precursor form or of any of them. proprotein. 
The naturally occurring polypeptide, precursor or 55 The term “probes’, as utilized herein, refers to nucleic acid proprotein includes, by way of nonlimiting example, the full sequences of variable length, preferably between at least length gene product, encoded by the corresponding gene. about 10 nucleotides (nt), 100 nt, or as many as approxi Alternatively, it may be defined as the polypeptide, precursor mately, e.g., 6.000 nt, depending upon the specific use. Probes or proprotein encoded by an ORF described herein. The prod are used in the detection of identical, similar, or complemen uct “mature' form arises, again by way of nonlimiting 60 tary nucleic acid sequences. Longer length probes are gener example, as a result of one or more naturally occurring pro ally obtained from a natural or recombinant source, are highly cessing steps as they may take place within the cell, or host specific, and much slower to hybridize than shorter-length cell, in which the gene product arises. Examples of Such oligomer probes. Probes may be single- or double-stranded processing steps leading to a “mature' form of a polypeptide and designed to have specificity in PCR, membrane-based or protein include the cleavage of the N-terminal methionine 65 hybridization technologies, or ELISA-like technologies. residue encoded by the initiation codon of an ORF, or the The term "isolated nucleic acid molecule, as utilized proteolytic cleavage of a signal peptide or leader sequence. herein, is one, which is separated from other nucleic acid US 7,943,340 B2 65 66 molecules, which are present in the natural Source of the preferably about 15 nt to 30 nt in length. In one embodiment nucleic acid. 
Preferably, an "isolated nucleic acid is free of of the invention, an oligonucleotide comprising a nucleic acid sequences, which naturally flank the nucleic acid (e.g., molecule less than 100 nt in length would further comprise at sequences located at the 5’- and 3'-termini of the nucleic acid) least 6 contiguous nucleotides of SEQID NOS: 2, 5, 8, 11, 14, in the genomic DNA of the organism from which the nucleic 17, 20, 23, 26, 29, 32, or 34, or a complement thereof. Oli acid is derived. For example, in various embodiments, the gonucleotides may be chemically synthesized and may also isolated EXOX nucleic acid molecules can contain less than be used as probes. about 5kb. 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide In another embodiment, an isolated nucleic acid molecule sequences which naturally flank the nucleic acid molecule in of the invention comprises a nucleic acid molecule that is a genomic DNA of the cell/tissue/species from which the 10 complement of the nucleotide sequence shown in SEQ ID nucleic acid is derived. Moreover, an "isolated nucleic acid NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34, or a portion molecule, such as a cDNA molecule, can be substantially free of this nucleotide sequence (e.g., a fragment that can be used of other cellular material or culture medium when produced as a probe or primer or a fragment encoding a biologically by recombinant techniques, or of chemical precursors or active portion of a EXOX polypeptide). A nucleic acid mol other chemicals when chemically synthesized. Particularly, it 15 ecule that is complementary to the nucleotide sequence means that the nucleic acid or protein is at least about 50% shown in SEQID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, pure, more preferably at least about 85% pure, and most or 34 is one that is sufficiently complementary to the nucle preferably at least about 99% pure. 
otide sequence shown in SEQID NOS: 2, 5, 8, 11, 14, 17, 20, As used herein, the term “recombinant' when used with 23, 26, 29, 32, or 34 that it can hydrogen bond with little or no reference to a cell indicates that the cell replicates a heterolo mismatches to the nucleotide sequence shown in SEQ ID gous nucleic acid, or expresses a peptide or protein encoded NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34, thereby by a heterologous nucleic acid. Recombinant cells can con forming a stable duplex. tain genes that are not found within the native (non-recombi As used herein, the term “complementary” refers to Wat nant) form of the cell. Recombinant cells can also contain son-Crick or Hoogsteen base pairing between nucleotide genes found in the native form of the cell wherein the genes 25 units of a nucleic acid molecule. The term “binding means are modified and re-introduced into the cell by artificial the physical or chemical interaction between two polypep means. The term also encompasses cells that contain a nucleic tides or compounds or associated polypeptides or compounds acid endogenous to the cell that has been modified without or combinations thereof. Binding includes ionic, non-ionic, removing the nucleic acid from the cell; Such modifications van der Waals, hydrophobic interactions, and the like. A include those obtained by gene replacement, site-specific 30 physical interaction can be either direct or indirect. Indirect mutation, and related techniques. One skilled in the art will interactions may be through or due to the effects of another recognize that these cells can be used for unicellular or mul polypeptide or compound. Direct binding refers to interac ticellular transgenic organisms, for example transgenic fungi tions that do not take place through, or due to, the effect of producing EXOX. another polypeptide or compound, but instead are without A nucleic acid molecule of the invention, e.g., a nucleic 35 other Substantial chemical intermediates. 
acid molecule having the nucleotide sequence of SEQ ID Fragments provided herein are defined as sequences of at NOS: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 or a least 6 (contiguous) nucleic acids or at least 4 (contiguous) complement of this aforementioned nucleotide sequence, can amino acids, a length sufficient to allow for specific hybrid be isolated using standard molecular biology techniques and ization in the case of nucleic acids or for specific recognition the sequence information provided herein. Using all or a 40 of an epitope in the case of amino acids, respectively, and are portion of the nucleic acid sequence of SEQID NOS: 2, 5, 8, at most some portion less than a full length sequence. Frag 11, 14, 17, 20, 23, 26, 29, 32, or 34 as a hybridization probe, ments may be derived from any contiguous portion of a EXOX molecules can be isolated using standard hybridiza nucleic acid oramino acid sequence of choice. Derivatives are tion and cloning techniques (e.g., as described in Sambrooket nucleic acid sequences or amino acid sequences formed from al., (eds.), MOLECULAR CLONING: A LABORATORY MANUAL 2" 45 the native compounds either directly or by modification or Ed., Cold Spring Harbor Laboratory Press, Cold Spring Har partial Substitution. Analogs are nucleic acid sequences or bor, N.Y., 1989; and Ausubel et al., (eds.), CURRENT PROTOCOLS amino acid sequences that have a structure similar to, but not IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, N.Y., identical to, the native compound but differ from it with 1993.) respect to certain components or side chains. Analogs may be A nucleic acid of the invention can be amplified using 50 synthetic or from a different evolutionary origin and may cDNA, mRNA or alternatively, genomic DNA, as a template have a similar or opposite metabolic activity compared to and appropriate oligonucleotide primers according to stan wild type. 
Homologs or orthologs are nucleic acid sequences dard PCR amplification techniques. The nucleic acid so or amino acid sequences of a particular gene that are derived amplified can be cloned into an appropriate vector and char from different species. acterized by DNA sequence analysis. Furthermore, oligo 55 Derivatives and analogs may be full length or other than nucleotides corresponding to EXOX nucleotide sequences full length, if the derivative or analog contains a modified can be prepared by standard synthetic techniques, e.g., using nucleic acid oramino acid, as described below. Derivatives or an automated DNA synthesizer. analogs of the nucleic acids or proteins of the invention As used herein, the term "oligonucleotide' refers to a series include, but are not limited to, molecules comprising regions of linked nucleotide residues, which oligonucleotide has a 60 that are Substantially homologous to the nucleic acids or sufficient number of nucleotide bases to be used in a PCR proteins of the invention, in various embodiments, by at least reaction. A short oligonucleotide sequence may be based on, about 70%, 80%, or 95% identity (with a preferred identity of or designed from, a genomic or cDNA sequence and is used to 80-95%) over a nucleic acid or amino acid sequence of iden amplify, confirm, or reveal the presence of an identical, simi tical size or when compared to an aligned sequence in which lar or complementary DNA or RNA in a particular cell or 65 the alignment is done by a computer homology program tissue. Oligonucleotides comprise portions of a nucleic acid known in the art, or whose encoding nucleic acid is capable of sequence having about 10 nt, 50 nt, or 100 nt in length, hybridizing to the complement of a sequence encoding the US 7,943,340 B2 67 68 aforementioned proteins under Stringent, moderately strin degeneracy of the genetic code and thus encode the same gent, or low stringent conditions. 
See, e.g., Ausubel et al., EXOX proteins that are encoded by the nucleotide sequences CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & shown in SEQID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, Sons, New York, N.Y., 1993, and below. or 34. In another embodiment, an isolated nucleic acid mol A "homologous nucleic acid sequence' or "homologous ecule of the invention has a nucleotide sequence encoding a amino acid sequence, or variations thereof, refer to protein having an amino acid sequence shown in SEQ ID sequences characterized by a homology at the nucleotide NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35. In addition level or amino acid level as discussed above. Homologous to the fungal EXOX nucleotide sequences shown in SEQID nucleotide sequences encode those sequences coding for iso NOS: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34, it will be forms of EXOX polypeptides. Isoforms can be expressed in 10 appreciated by those skilled in the art that DNA sequence the same organism as a result of, for example, alternative polymorphisms that lead to changes in the amino acid splicing of RNA. Alternatively, isoforms can be encoded by sequences of the EXOX polypeptides may exist within a different genes. In the invention, homologous nucleotide population of various species. Such genetic polymorphisms sequences can include nucleotide sequences encoding an in the EXOX genes may exist among individual fungal spe EXOX polypeptide of species other than fungi. Homologous 15 cies within a population due to natural allelic variation. As nucleotide sequences also include, but are not limited to, used herein, the terms “gene' and “recombinant gene' refer naturally occurring allelic variations and mutations of the to nucleic acid molecules comprising an open reading frame nucleotide sequences set forth herein. Homologous nucleic (ORF) encoding an EXOX protein, preferably a fungal acid sequences include those nucleic acid sequences that EXOX protein. 
Such natural allelic variations can typically encode conservative amino acid Substitutions (see below) in result in 1-5% variance in the nucleotide sequence of the SEQ ID NOS: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34, as EXOX genes. Any and all such nucleotide variations and well as a polypeptide possessing EXOX biological activity. resulting amino acid polymorphisms in the EXOX polypep Various biological activities of the EXOX proteins. are tides, which are the result of natural allelic variation and that described below. do not alter the functional activity of the EXOX polypeptides, A EXOX polypeptide is encoded by the open reading 25 are intended to be within the scope of the invention. frame (“ORF) of an EXOX nucleic acid. A stretch of nucleic Moreover, nucleic acid molecules encoding EXOX pro acids comprising an ORF is uninterrupted by a stop codon. An teins from other species, and, thus, that have a nucleotide ORF that represents the coding sequence for a full protein sequence that differs from the fungal sequence SEQID NOs: begins with an ATG “start codon and terminates with one of 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 are intended to be the three “stop' codons, namely, TAA, TAG, or TGA. For the 30 within the scope of the invention. Nucleic acid molecules purposes of this invention, an ORF may be any part of a corresponding to natural allelic variants and homologues of coding sequence, with or without a start codon, a stop codon, the EXOX cDNAs of the invention can be isolated based on or both. For an ORF to be considered as a good candidate for their homology to the fungal EXOX nucleic acids disclosed coding for a bona fide cellular protein, a minimum size herein using the fungal cDNAS, or a portion thereof, as a requirement is often set, e.g., a stretch of DNA that would 35 hybridization probe according to standard hybridization tech encode a protein of 50 amino acids or more. niques under stringent hybridization conditions. 
The nucleotide sequences determined from the cloning of the fungal EXOX genes allow for the generation of probes and primers designed for use in identifying and/or cloning EXOX homologues in other species, as well as EXOX homologues from other fungi. The probe/primer typically comprises a substantially purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200, 250, 300, 350 or 400 consecutive sense strand nucleotide sequence of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34, or an anti-sense strand nucleotide sequence of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34; or of a naturally occurring mutant of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34.
Accordingly, in another embodiment, an isolated nucleic acid molecule of the invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34.
In another embodiment, the nucleic acid is at least 10, 25, 50, 100, 250, 500, 750, 1000, 1500, or 2000 or more nucleotides in length. In yet another embodiment, an isolated nucleic acid molecule of the invention hybridizes to the coding region. As used herein, the term "hybridizes under stringent conditions" is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60% homologous to each other typically remain hybridized to each other.
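As a rough illustration of the probe description above, the sketch below enumerates consecutive windows of a chosen length from a sense-strand sequence and pairs each with its anti-sense (reverse-complement) counterpart. The input sequence is a hypothetical placeholder, not one of the sequences in the SEQ ID NOs, and the helper names `reverse_complement` and `candidate_probes` are assumptions for this example. Real probe design would also weigh melting temperature, secondary structure, and specificity, none of which is modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the anti-sense strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(sense, length=12):
    """Yield (offset, sense_probe, antisense_probe) for each window.

    Every window of `length` consecutive nucleotides is a candidate;
    12 is the smallest probe length listed in the text.
    """
    sense = sense.upper()
    for i in range(len(sense) - length + 1):
        window = sense[i:i + length]
        yield i, window, reverse_complement(window)


# Hypothetical 14-nt placeholder sequence.
for offset, fwd, rev in candidate_probes("ATGCATGCATGCAT", length=12):
    print(offset, fwd, rev)
```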
A polypeptide having a "biologically-active portion of an EXOX polypeptide" refers to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a polypeptide of the invention, including mature forms, as measured in a particular biological assay, with or without dose dependency. A nucleic acid fragment encoding a "biologically-active portion of EXOX" can be prepared by isolating a portion of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 that encodes a polypeptide having an EXOX biological activity (the biological activities of the EXOX proteins are described below), expressing the encoded portion of EXOX protein (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of EXOX.
EXOX Nucleic Acid and Polypeptide Variants
The invention further encompasses nucleic acid molecules that differ from the nucleotide sequences shown in SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 due to
Homologs or other related sequences (e.g., orthologs, paralogs) can be obtained by low, moderate or high stringency hybridization with all or a portion of the particular fungal sequence as a probe using methods well known in the art for nucleic acid hybridization and cloning.
As used herein, the phrase "stringent hybridization conditions" refers to conditions under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures than shorter sequences. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. Since the target sequences are generally present at excess, at Tm, 50% of the probes are occupied at equilibrium. Typically, stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes, primers or oligonucleotides (e.g., 10 nt to 50 nt) and at least about 60° C. for longer probes, primers and oligonucleotides. Stringent conditions may also be achieved with the addition of destabilizing agents, such as formamide.
Stringent conditions are known to those skilled in the art and can be found in Ausubel et al. (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. Preferably, the conditions are such that sequences at least about 65%, 70%, 75%, 85%, 90%, 95%, 98%, or 99% homologous to each other typically remain hybridized to each other. A non-limiting example of stringent hybridization conditions are hybridization in a high salt buffer comprising 6xSSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm DNA at 65° C., followed by one or more washes in 0.2xSSC, 0.01% BSA at 50° C. An isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequences of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 corresponds to a naturally occurring nucleic acid molecule. As used herein, a "naturally occurring" nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 or fragments, analogs or derivatives thereof, under conditions of moderate stringency is provided. A non-limiting example of moderate stringency hybridization conditions are hybridization in 6xSSC, 5xDenhardt's solution, 0.5% SDS and 100 mg/ml denatured salmon sperm DNA at 55° C., followed by one or more washes in 1xSSC, 0.1% SDS at 37° C. Other conditions of moderate stringency that may be used are well-known within the art. See, e.g., Ausubel et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, NY, and Kriegler, 1990, GENE TRANSFER AND EXPRESSION, A LABORATORY MANUAL, Stockton Press, NY.
In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule comprising the nucleotide sequences of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 or fragments, analogs or derivatives thereof, under conditions of low stringency, is provided. A non-limiting example of low stringency hybridization conditions are hybridization in 35% formamide, 5xSSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (w/v) dextran sulfate at 40° C., followed by one or more washes in 2xSSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS at 50° C. Other conditions of low stringency that may be used are well known in the art (e.g., as employed for cross-species hybridizations). See, e.g., Ausubel et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, NY, and Kriegler, 1990, GENE TRANSFER AND EXPRESSION, A LABORATORY MANUAL, Stockton Press, NY; Shilo & Weinberg, Proc Natl Acad Sci USA 78:6789-6792 (1981).
Conservative Mutations
In addition to naturally-occurring allelic variants of EXOX sequences that may exist in the population, the skilled artisan will further appreciate that changes can be introduced by mutation into the nucleotide sequences of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 thereby leading to changes in the amino acid sequences of the encoded EXOX proteins, without altering the functional ability of said EXOX proteins. For example, nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid residues can be made in the sequence of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35. A "non-essential" amino acid residue is a residue that can be altered from the wild-type sequences of the EXOX proteins without altering their biological activity, whereas an "essential" amino acid residue is required for such biological activity.
As used herein, the term "biological activity" or "functional activity" refers to the natural or normal function of the EXO proteins, for example the ability to degrade other proteins. Amino acid residues that are conserved among the EXOX proteins of the invention are predicted to be particularly non-amenable to alteration. Amino acids for which conservative substitutions can be made are well known within the art. One of skill in the art will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine) can be modified to yield a functionally identical molecule by standard techniques. Furthermore, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are "conservative mutations" where the alterations result in the substitution of an amino acid with a chemically similar amino acid.
Another aspect of the invention pertains to nucleic acid molecules encoding EXOX proteins that contain changes in amino acid residues that are not essential for activity. Such EXOX proteins differ in amino acid sequence from SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35 yet retain biological activity. In one embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an amino acid sequence at least about 45% homologous to the amino acid sequences of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35. Preferably, the protein encoded by the nucleic acid molecule is at least about 60% homologous to SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35; more preferably at least about 70% homologous to SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35; still more preferably at least about 80% homologous to SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35; even more preferably at least about 90% homologous to SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35; and most preferably at least about 95% homologous to SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35.
An isolated nucleic acid molecule encoding an EXOX protein homologous to the protein of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35 can be created by introducing one or more nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34 such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein.
Mutations can be introduced into SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35 by standard techniques, such as site-directed mutagenesis, PCR-mediated mutagenesis and DNA shuffling. Preferably, conservative amino acid substitutions are made at one or more predicted, non-essential amino acid residues. Single base substitutions are among the most common changes to human DNA. These base changes can occur in the coding or the non-coding regions of the DNA. If they occur in the coding region, they can be conservative or non-conservative substitutions. A "conservative amino acid substitution" is a new amino acid that has similar properties and is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Non-conservative substitutions refer to a new amino acid, which has different properties. Families of amino acid residues having similar side chains have been defined within the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, hydroxyproline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, for a conservative substitution, a predicted non-essential amino acid residue in the EXOX protein is replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of an EXOX coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for EXOX biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NOs: 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, or 34, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined.
The relatedness of amino acid families may also be determined based on side chain interactions. Substituted amino acids may be fully conserved "strong" residues or fully conserved "weak" residues. The "strong" group of conserved amino acid residues may be any one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW, wherein the single letter amino acid codes are grouped by those amino acids that may be substituted for each other. Likewise, the "weak" group of conserved residues may be any one of the following: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, HFY, wherein the letters within each group represent the single letter amino acid code.
In one embodiment, a mutant EXOX protein can be assayed for (i) the ability to form protein:protein interactions with other EXOX proteins, other cell-surface proteins, or biologically-active portions thereof, (ii) complex formation between a mutant EXOX protein and an EXOX ligand; or (iii) the ability of a mutant EXOX protein to bind to an intracellular target protein or biologically-active portion thereof (e.g., avidin proteins).
In yet another embodiment, a mutant EXOX protein can be assayed for the ability to regulate a specific biological function (e.g., proteolytic activity).
EXOX Polypeptides
A polypeptide according to the invention includes a polypeptide including the amino acid sequence of EXOX polypeptides whose sequences are provided in SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, and 35. The invention also includes a mutant or variant protein any of whose residues may be changed from the corresponding residues shown in SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35 while still encoding a protein that maintains its EXOX activities and physiological functions, or a functional fragment thereof.
In general, an EXOX variant that preserves EXOX-like function includes any variant in which residues at a particular position in the sequence have been substituted by other amino acids, and further include the possibility of inserting an additional residue or residues between two residues of the parent protein as well as the possibility of deleting one or more residues from the parent sequence. Any amino acid substitution, insertion, or deletion is encompassed by the invention. In favorable circumstances, the substitution is a conservative substitution as defined above.
One aspect of the invention pertains to isolated EXOX proteins, and biologically active portions thereof, or derivatives, fragments, analogs or homologs thereof. Biologically active portions refer to regions of the EXOX proteins, which are necessary for normal function, for example, aminopeptidase activity. Also provided are polypeptide fragments suitable for use as immunogens to raise anti-EXOX antibodies. In one embodiment, native EXOX proteins can be isolated from cells, tissue sources or culture supernatants by an appropriate purification scheme using appropriate protein purification techniques. In another embodiment, EXOX proteins are produced by recombinant DNA techniques. Alternative to recombinant expression, an EXOX protein or polypeptide can be synthesized chemically using standard peptide synthesis techniques.
An "isolated" or "purified" polypeptide or protein or biologically-active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue source from which the EXOX protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. The language "substantially free of cellular material" includes preparations of EXOX proteins in which the protein is separated from cellular components of the cells from which it is isolated or recombinantly-produced. In one embodiment, the language "substantially free of cellular material" includes preparations of EXOX proteins having less than about 30% (by dry weight) of non-EXOX proteins (also referred to herein as a "contaminating protein"), more preferably less than about 20% of non-EXOX proteins, still more preferably less than about 10% of non-EXOX proteins, and most preferably less than about 5% of non-EXOX proteins. When the EXOX protein or biologically-active portion thereof is recombinantly-produced, it is also preferably substantially free of any constituent of the culture medium, e.g., culture medium components may represent less than about 20%, more preferably less than about 10%, and most preferably less than about 5% of the EXOX protein preparation.
The language "substantially free of chemical precursors or other chemicals" includes preparations of EXOX proteins in which the protein is separated from chemical precursors or other chemicals that are involved in the synthesis of the protein. In one embodiment, the language "substantially free of chemical precursors or other chemicals" includes preparations of EXOX proteins having less than about 30% (by dry weight) of chemical precursors or non-EXOX chemicals, more preferably less than about 20% chemical precursors or non-EXOX chemicals, still more preferably less than about 10% chemical precursors or non-EXOX chemicals, and most preferably less than about 5% chemical precursors or non-EXOX chemicals. Furthermore, "substantially free of chemical precursors or other chemicals" would include oxidation byproducts. One of skill in the art would know how to prevent oxidation, for example, by keeping chemicals in an oxygen free environment.
Biologically-active portions of EXOX proteins include peptides comprising amino acid sequences sufficiently homologous to or derived from the amino acid sequences of the EXOX proteins (e.g., the amino acid sequence shown in SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35) that include fewer amino acids than the full-length EXOX proteins, and exhibit at least one activity of an EXOX protein. Typically, biologically active portions comprise a domain or motif with at least one activity of the EXOX protein. A biologically active portion of an EXOX protein can be a polypeptide that is, for example, 10, 25, 50, 100 or more amino acid residues in length.
Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native EXOX protein.
In an embodiment, the EXOX protein has an amino acid sequence shown in SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35. In other embodiments, the EXOX protein is substantially homologous to SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35, and retains the functional activity of the protein of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35, yet differs in amino acid sequence due to natural allelic variation or mutagenesis, as described in detail, below. Accordingly, in another embodiment, the EXOX protein is a protein that comprises an amino acid sequence at least about 90% homologous to the amino acid sequence SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35, and retains the functional activity of the EXOX proteins of SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35. As used herein, the term "biological activity" or "functional activity" refers to the natural or normal function of the EXO proteins, for example the ability to degrade other proteins.
nucleotide comprises a sequence that has at least 80 percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison region.
Chimeric and Fusion Proteins
The invention also provides EXOX chimeric or fusion proteins. As used herein, an EXOX "chimeric protein" or "fusion protein" comprises an EXOX polypeptide operatively linked to a non-EXOX polypeptide. An "EXOX polypeptide" refers to a polypeptide having an amino acid sequence corresponding to an EXOX protein (SEQ ID NOs: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, or 35), whereas a "non-EXOX polypeptide" refers to a polypeptide having an amino acid sequence corresponding to a protein that is not substantially homologous to the EXOX protein, e.g., a protein that is different from the EXOX protein and that is derived from the same or a different organism. Within an EXOX fusion protein the EXOX polypeptide can correspond to all or a portion of an EXOX protein. In one embodiment, an EXOX fusion protein comprises at least one biologically active portion of an EXOX protein. In another embodiment, an EXOX fusion protein comprises at least two biologically active portions of an EXOX protein. In yet another embodiment, an EXOX fusion protein comprises at least three biologically active portions of an EXOX protein.
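The "strong" and "weak" groups of conserved residues listed earlier in this section lend themselves to a simple lookup. The sketch below classifies a single substitution against those groups exactly as they are listed in the text. The function name `classify_substitution` and the four return labels are assumptions for illustration; the specification does not prescribe any particular procedure for applying the groups.

```python
# Group lists copied from the text; each string is one substitution group.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA",
               "SGND", "SNDEQK", "NDEQHK", "NEQHRK", "HFY"]


def classify_substitution(original, replacement):
    """Classify a one-letter amino acid substitution.

    Returns "identical", "strong", "weak", or "non-conservative",
    checking the strong groups before the weak groups.
    """
    a, b = original.upper(), replacement.upper()
    if a == b:
        return "identical"
    if any(a in group and b in group for group in STRONG_GROUPS):
        return "strong"
    if any(a in group and b in group for group in WEAK_GROUPS):
        return "weak"
    return "non-conservative"


for pair in [("I", "L"), ("S", "G"), ("W", "G")]:
    print(pair, classify_substitution(*pair))
```

Checking the strong groups first means a pair that appears in both lists is reported under the stricter category.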
Within the fusion protein, the term “opera Determining Homology Between Two or More Sequences tively-linked' is intended to indicate that the EXOX polypep To determine the percent of similarity or homology of two tide and the non-EXOX polypeptide are fused in-frame with amino acid sequences or of two nucleic acid sequences, the one another. The non-EXOX polypeptide can be fused to the sequences are aligned for optimal comparison purposes (e.g., 30 N-terminus and/or C-terminus of the EXOX polypeptide. gaps can be introduced in the sequence of a first amino acid or In one embodiment, the fusion protein is a GST-EXOX nucleic acid sequence for optimal alignment with a second fusion protein in which the EXOX sequences arefused to the amino acid or nucleic acid sequence). The amino acid resi C-terminus of the GST (glutathione S-transferase) dues or nucleotides at corresponding amino acid positions or sequences. Such fusion proteins can facilitate the purification nucleotide positions are then compared. When a position in 35 of recombinant EXOX polypeptides. the first sequence is occupied by the same amino acid residue In another embodiment, the fusion protein is an EXOX or nucleotide as the corresponding position in the second protein containing a heterologous signal sequence at its sequence, then the molecules are homologous at that position N-terminus. In certain host cells (e.g., mammalian host cells), (i.e., as used herein amino acid or nucleic acid “homology” is expression and/or secretion of EXOX can be increased equivalent to amino acid or nucleic acid “identity”). 40 through use of a heterologous signal sequence. The nucleic acid sequence homology may be determined In yet another embodiment, the fusion protein is an EXOX as the degree of identity between two sequences. 
The homol immunoglobulin fusion protein in which the EXOX ogy may be determined using computer programs known in sequences are fused to sequences derived from a member of the art, such as GAP software provided in the GCG program the immunoglobulin protein family. The EXOX-immunoglo package. See Needleman & Wunsch, J. Mol. Biol. 48:443 45 bulin fusion proteins of the invention can be incorporated into 453 1970. Using GCG GAP software with the following pharmaceutical compositions and administered to a subject to settings for nucleic acid sequence comparison: GAP creation inhibit an interaction between a EXOX ligand and a EXOX penalty of 5.0 and GAP extension penalty of 0.3, the coding protein on the surface of a cell, to thereby suppress EXOX region of the analogous nucleic acid sequences referred to mediated signal transduction in vivo. The EXOX-immuno above exhibits a degree of identity preferably of at least 70%, 50 globulin fusion proteins can be used to affect the bioavailabil 75%, 80%, 85%, 90%, 95%, 98%, or 99%, with the CDS ity of an EXOX cognate ligand. Inhibition of the EXOX (encoding) part of the DNA sequence shown in SEQID NOs: ligand/EXOX interaction may be useful therapeutically for 2, 5, 8, 11, 14, 17, 20, 23, 26, 29, 32, and 34. both the treatment of proliferative and differentiative disor The term “sequence identity” refers to the degree to which ders, as well as modulating (e.g. promoting or inhibiting) cell two polynucleotide or polypeptide sequences are identical on 55 survival. Moreover, the EXOX-immunoglobulin fusion pro a residue-by-residue basis over a particular region of com teins of the invention can be used as immunogens to produce parison. 
The term "percentage of sequence identity” is calcu anti-EXOX antibodies in a subject, to purify EXOX ligands, lated by comparing two optimally aligned sequences over that and in Screening assays to identify molecules that inhibit the region of comparison, determining the number of positions at interaction of EXOX with an EXOX ligand. which the identical nucleic acid base (e.g., A, T, C, G, U, or I. 60 A EXOX chimeric or fusion protein of the invention can be in the case of nucleic acids) occurs in both sequences to yield produced by standard recombinant DNA techniques. For the number of matched positions, dividing the number of example, DNA fragments coding for the different polypep matched positions by the total number of positions in the tide sequences are ligated together in-frame in accordance region of comparison (e.g., the window size), and multiplying with conventional techniques, e.g., by employing blunt the result by 100 to yield the percentage of sequence identity. 65 ended or stagger-ended temmini for ligation, restriction The term “substantial identity” as used herein denotes a char enzyme digestion to provide for appropriate termini, filling-in acteristic of a polynucleotide sequence, wherein the poly of cohesive ends as appropriate, alkaline phosphatase treat US 7,943,340 B2 75 76 ment to avoid undesirable joining, and enzymatic ligation. In sequence with a nuclease under conditions wherein nicking another embodiment, the fusion gene can be synthesized by occurs only about once per molecule, denaturing the double conventional techniques including automated DNA synthe stranded DNA, renaturing the DNA to form double-stranded sizers. 
Alternatively, PCR amplification of gene fragments DNA that can include sense/antisense pairs from different can be carried out using anchor primers that give rise to nicked products, removing single stranded portions from complementary overhangs between two consecutive gene reformed duplexes by treatment with S nuclease, and ligat fragments that can Subsequently be annealed and reamplified ing the resulting fragment library into an expression vector. to generate a chimeric gene sequence (See, e.g., Ausubeletal. By this method, expression libraries can be derived which (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley encode N-terminal and internal fragments of various sizes of & Sons, 1992). Moreover, many expression vectors are com 10 the EXOX proteins. mercially available that already encode a fusion moiety (e.g., Various techniques are known in the art for Screening gene a GST polypeptide). A EXOX-encoding nucleic acid can be products of combinatorial libraries made by point mutations cloned into Such an expression vector Such that the fusion or truncation, and for Screening cDNA libraries for gene moiety is linked in-frame to the EXOX protein. products having a selected property. Such techniques are EXOX Agonists and Antagonists 15 adaptable for rapid screening of the gene libraries generated The invention also pertains to variants of the EXOX pro by the combinatorial mutagenesis of EXOX proteins. The teins that function as either EXOX agonists (e.g., mimetics) most widely used techniques, which are amenable to high or as EXOXantagonists. Variants of the EXOX protein can be throughput analysis, for Screening large gene libraries typi generated by mutagenesis (e.g., discrete point mutation or cally include cloning the gene library into replicable expres truncation of the EXOX protein). 
An agonist of the EXOX sion vectors, transforming appropriate cells with the resulting protein can retain Substantially the same, or a Subset of the library of vectors, and expressing the combinatorial genes biological activities of the naturally occurring form of the under conditions in which detection of a desired activity EXOX protein. An antagonist of the EXOX protein can facilitates isolation of the vector encoding the gene whose inhibit one or more of the activities of the naturally occurring product was detected. Recursive ensemble mutagenesis form of the EXOX protein by, for example, competitively 25 (REM), a new technique that enhances the frequency of func binding to a downstream or upstream member of a cellular tional mutants in the libraries, can be used in combination signaling cascade, which includes the EXOX protein. Thus, with the screening assays to identify EXOX variants. See, specific biological effects can be elicited by treatment with a e.g., Arkin & Yourvan, Proc. Natl. Acad. Sci. USA 89:7811 variant of limited function. In one embodiment, treatment of 7815 (1992); Delgrave et al., Protein Engineering 6:327-331 a Subject with a variant having a Subset of the biological 30 (1993). activities of the naturally occurring form of the protein has Libraries can also be generated by DNA shuffling. DNA fewer side effects in a subject relative to treatment with the shuffling uses related genes from different species or genes naturally occurring form of the EXOX proteins. that are related in their function, fragments them and reas Variants of the EXOX proteins that function as either sembles them through recombination. It can then be deter EXOX agonists (e.g., mimetics) or as EXOX antagonists can 35 mined if the recombined genes comprise usable or potentially be identified by screening combinatorial libraries of mutants interesting products. 
Any recombined gene found to be useful (e.g., truncation mutants) of the EXOX proteins for EXOX are again fragmented and reassembled to form new recombi protein agonist or antagonist activity. In one embodiment, a nant genes. As the various fragments of different species and variegated library of EXOX variants is generated by combi genes are annealed and extended, diversity is created in the natorial mutagenesis at the nucleic acid level and is encoded 40 library. The process can be performed until a protein of inter by a variegated gene library. A variegated library of EXOX est is found. The important factors in creating recombined variants can be produced by, for example, enzymatically genes with DNA shuffling include the temperature at which ligating a mixture of synthetic oligonucleotides into gene annealing occurs, the similarity of the genes and the size of sequences such that a degenerate set of potential EXOX the DNA fragments. sequences is expressible as individual polypeptides, or alter 45 Stemmer et al., Nature 370:389-391 (1994); Stemmer, natively, as a set of larger fusion proteins (e.g., for phage Proc. Natl. Acad. USA'91: 10747-10751 (1994); U.S. Pat. No. display) containing the set of EXOX sequences therein. There 5,603,793; U.S. Pat. No. 5,830,721; and U.S. Pat. No. 5,811, are a variety of methods, which can be used to produce librar 238, which are incorporated herein by reference, describe ies of potential EXOX variants from a degenerate oligonucle e.g., in vitro protein shuffling methods, e.g., by repeated otide sequence. Chemical synthesis of a degenerate gene 50 cycles of mutagenesis, shuffling and selection as well as a sequence can be performed in an automatic DNA synthesizer, variety of methods of generating libraries of displayed pep and the synthetic gene then ligated into an appropriate expres tides and antibodies as well as a variety of DNA reassembly sion vector. 
Use of a degenerate set of genes allows for the provision, in one mixture, of all of the sequences encoding the desired set of potential EXOX sequences. Methods for synthesizing degenerate oligonucleotides are well-known within the art. See, e.g., Narang, Tetrahedron 39:3 (1983); Itakura et al., Annu. Rev. Biochem. 53:323 (1984); Itakura et al., Science 198:1056 (1984); Ike et al., Nucl. Acids Res. 11:477 (1983).
Polypeptide Libraries
In addition, libraries of fragments of the EXOX protein coding sequences can be used to generate a variegated population of EXOX fragments for screening and subsequent selection of variants of an EXOX protein. In one embodiment, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of an EXOX coding

techniques following DNA fragmentation, and their application to mutagenesis in vitro and in vivo. Moreover, various applications of DNA shuffling technology are also known in the art. In addition to the publications noted above, see U.S. Pat. No. 5,837,458, which provides for the evolution of new metabolic pathways and the enhancement of bio-processing through recursive shuffling techniques, and Crameri et al., Nature Medicine 2(1):100-103 (1996), which describes antibody shuffling for antibody phage libraries. See also, WO95/22625, WO97/20078, WO96/33207, WO97/33957, WO98/27230, WO97/35966, WO98/31837, WO98/13487, WO98/13485 and WO98/42832.
Expression Vectors
Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding an EXOX protein, or derivatives, fragments, analogs or homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors used in recombinant DNA techniques are often in the form of plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.
The production of a functional protein is intimately related to the cellular machinery of the organism producing the protein. E. coli has typically been the "factory" of choice for the expression of many proteins because its genome has been fully mapped and the organism is easy to handle; grows rapidly; requires an inexpensive, easy-to-prepare medium for growth; and secretes protein into the medium which facilitates recovery of the protein. However, E. coli is a prokaryote and lacks intracellular organelles, such as the endoplasmic reticulum and the golgi apparatus that are present in eukaryotes, which contain enzymes which modify the proteins being produced. Many eukaryotic proteins can be produced in E. coli but these may be produced in a nonfunctional, unfinished form, since glycosylation or post-translational modifications do not occur.
Therefore, researchers have recently turned to eukaryotic yeast, mammalian and plant expression systems for protein production. For example, the methylotrophic yeast P. pastoris has become a powerful host for the heterologous expression of proteins during the last few years and has been established as an alternative eukaryotic host for the expression of human proteins with high-throughput technologies.
As another example, plants are being utilized as expression hosts for large-scale heterologous expression of proteins and offer potential advantages of cost-effectiveness, scalability and safety over traditional expression systems. There are currently a variety of plant heterologous expression systems including transient expression, plant cell-suspension cultures, recombinant plant viruses and chloroplast transgenic systems. While proteins expressed in plants have some variations from mammalian proteins (e.g., glycosylation), there is currently no evidence that these differences result in adverse reactions in human patients. See, e.g., Julian et al., Nat. Rev. Gen. 4:794-805 (2003).
Another suitable heterologous expression system uses insect cells, often in combination with baculovirus expression vectors. Baculovirus vectors available for expressing proteins in cultured insect cells, e.g., SF9 cells include the pAc series (Smith et al., Mol. Cell. Biol. 3:2156-2165 (1983)) and the pVL series (Luckow & Summers, Virology 170:31-39 (1989)).

Host cells of the invention can also be used to produce non-human transgenic animals in which exogenous sequences have been introduced into their genome. The transgenic animal is a non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include, e.g., non-human primates, sheep, dogs, cows, goats, chickens, amphibians. Methods for generating transgenic animals via embryo manipulation and micro-injection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866; 4,870,009; and 4,873,191; and Hogan, 1986. In: MANIPULATING THE MOUSE EMBRYO, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Similar methods are used for production of other transgenic animals.
Pichia pastoris Expression System
One such eukaryotic yeast is the methylotrophic Pichia pastoris. P. pastoris has been developed to be an outstanding host for the production of foreign proteins since its alcohol oxidase promoter was isolated and cloned. The P. pastoris transformation was first reported in 1985. The P. pastoris heterologous protein expression system was developed by Phillips Petroleum, see, e.g., U.S. Pat. Nos. 4,855,231, 4,857,467, 4,879,231 and 4,929,555, each of which is incorporated herein by reference. This system is currently marketed by Invitrogen. Compared to other eukaryotic expression systems, Pichia offers many advantages, because it does not have the endotoxin problem associated with bacteria nor the viral contamination problem of proteins produced in animal cell cultures. Furthermore, P. pastoris can utilize methanol as a carbon source in the absence of glucose. The P. pastoris expression system uses the methanol-induced alcohol oxidase (AOX1) promoter, which controls the gene that codes for the expression of alcohol oxidase, the enzyme that catalyzes the first step in the metabolism of methanol. This promoter has been characterized and incorporated into a series of P. pastoris expression vectors. Since the proteins produced in P. pastoris are typically folded correctly and secreted into the medium, the fermentation of genetically engineered P. pastoris provides an excellent alternative to E. coli expression systems. Furthermore, P. pastoris has the ability to spontaneously glycosylate expressed proteins, which also is an advantage over E. coli. A number of proteins have been produced using this system, including tetanus toxin fragment, Bordetella pertussis pertactin, human serum albumin and lysozyme.
Tag Removal with EXOX Proteins
Several systems have been developed to allow for rapid and efficient purification of recombinant proteins expressed in bacteria. Most of these rely on the expression of the protein as a fusion protein with a glutathione-S-transferase (GST) domain, a calmodulin binding peptide (CBP) or a His-tag. For example, the expression of polypeptides in frame with glutathione S-transferase (GST) allows for purification of the fusion proteins from crude bacterial extracts under nondenaturing conditions by affinity chromatography on glutathione agarose.
Furthermore, this vector expression system generally incorporates a specific protease cleavage site to facilitate proteolysis of the bacterial fusion proteins, which is, depending on the vector used, a thrombin, enterokinase or Factor Xa protease cleavage site. Thrombin specifically cleaves target proteins containing the recognition sequence Leu-Val-Pro-Arg↓Gly-Ser (SEQ ID NO: 44). The enterokinase cleavage site is Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 45). Like enterokinase, Factor Xa cleaves at the C-terminal side of its recognition sequence Ile-Glu-Gly-Arg (SEQ ID NO: 46), and can therefore be used for removing all vector-encoded sequences from appropriately designed constructs. All of the

more amino acids under conditions permitting the addition of one or more amino acids to the polypeptide.
There are multiple utilities for using the EXOX proteins of the present invention as reverse proteolytic enzymes. For example, the reverse proteolytic activity of the EXOX proteins can be used in the synthesis of a polypeptide chain. The EXOX proteins can also be used as a coupling agent to add one or more amino acids to another amino acid, a polypeptide, or any composition with an accessible secondary amine.
Pharmaceutical Compositions
The EXOX nucleic acid molecules, EXOX proteins, and anti-EXOX antibodies (also referred to herein as "active compounds") of the invention, and derivatives, fragments, analogs and homologs thereof, can be incorporated into pharmaceutical compositions suitable for administration. Such compositions typically comprise the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein, "pharmaceutically acceptable carrier" is

these enzymes are now commercially available in a high purity to avoid secondary cleavage arising from contaminating proteases. These enzymes are provided either in a kit often including all the tools for the enzyme capture, or biotinylated to facilitate removal of the enzyme from cleavage reaction medium. More recently Qiagen also developed the TAGZyme system for an efficient removal of N-terminal His tags from proteins which involves exopeptidases that cleave dipeptides sequentially from the N-terminus up to a "stop point" amino acid motif, which is either Lys-Xaa-, Arg-Xaa-, Xaa-Xaa-Pro-Xaa-, Xaa-Pro-Xaa-Xaa- or Gln-Xaa-.
Although it is not always necessary to remove the short His affinity tag (whatever the number of His residues) from a recombinant protein after purification, there are some applications, such as structural analysis by X-ray crystallography or NMR, where removal of the tag is desirable.
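The sequential dipeptide removal with "stop point" motifs described above can be sketched as a small simulation. This is a simplified reading of the motifs listed in the text (Lys, Arg or Gln in the first position, or Pro in the second or third position), not a model of the commercial enzyme system; the function names and the tagged sequences in the example are hypothetical.

```python
def is_stop_point(seq):
    """True if the current N-terminus matches one of the stop motifs in the text."""
    if not seq:
        return True
    if seq[0] in "KRQ":                  # Lys-Xaa-, Arg-Xaa-, Gln-Xaa-
        return True
    if len(seq) > 1 and seq[1] == "P":   # Xaa-Pro-Xaa-Xaa-
        return True
    if len(seq) > 2 and seq[2] == "P":   # Xaa-Xaa-Pro-Xaa-
        return True
    return False


def trim_dipeptides(seq):
    """Remove N-terminal dipeptides one at a time until a stop point is reached.

    Returns the remaining sequence and the list of removed dipeptides.
    """
    removed = []
    while len(seq) >= 2 and not is_stop_point(seq):
        removed.append(seq[:2])
        seq = seq[2:]
    return seq, removed


if __name__ == "__main__":
    # Hypothetical construct: an 8-residue tag followed by a protein starting with Arg.
    print(trim_dipeptides("MAHHHHHHRSTLE"))
    # Hypothetical construct whose mature N-terminus carries Pro in position 2.
    print(trim_dipeptides("HHHHAPGT"))
```

Because residues are removed two at a time, the tag length in such a design must place the stop motif exactly at the start of a dipeptide window; an odd offset would remove the first residue of the intended product.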
The same thing is also true for the residual residues Gly-Ser of the thrombin cleavage site or any supplementary residual N-terminal amino acid that could be still present and which could be related to the expression system used.
A more recent approach to affinity purification involves utilizing a condensation reaction between a carbonyl group and a molecule with two vicinal nucleophilic groups. Examples of amino acids with two vicinal nucleophilic groups include, e.g., serine, threonine and cysteine. Purifying a protein or peptide involves forming a reversible covalent bond between, e.g., an N-terminal cysteine, threonine or serine residue, and an appropriate resin. See Villain et al., Chem. & Biol. 8:673-679 (2001). Addition of a pair of residues, e.g., Thr-Pro, Cys-Pro or Ser-Pro, to the N-terminus of a recombinant protein, or of a protein (peptide) obtained by chemical synthesis, permits two-step purification: (1) purification by covalent capture; and (2) removal of the di-peptide tag. This method permits efficient recovery of recombinant protein in its mature form, without the di-peptide flag sequence.
Reverse Proteolytic Activity of EXOX Proteins
Another aspect of the invention pertains to methods of adding one or more amino acids to amino acids, peptides, oligopeptides, polypeptides or any composition with an accessible secondary amine, by using the reverse proteolytic activity of one or more EXOX proteins. As used herein, the term "reverse proteolytic activity" refers to enzymatic activity that catalyzes the addition of one or more amino acids to an amino acid, a peptide, an oligopeptide, a polypeptide or any composition with an accessible secondary amine. One of ordinary skill in the art will recognize that, under suitable thermodynamic conditions, proteolytic enzymes can have reverse proteolytic activity.
An example of a proteolytic enzyme with reverse proteolytic activity is trypsin, which is a pancreatic serine protease with substrate specificity based upon positively charged lysine and arginine side chains. Trypsin is widely used in the manufacture of human insulin from porcine insulin, which is similar to the human form except the last amino acid residue in the B-chain is alanine rather than threonine. Reacting porcine insulin with a threonine ester in the presence of trypsin yields a human insulin threonine ester by removing the terminal alanine and adding the threonine ester. Subsequent treatment of the human insulin threonine ester with trifluoroacetic acid hydrolyzes the ester to yield human insulin.
In some embodiments, the EXOX proteins are used to catalyze reverse proteolytic reactions. In some instances, the EXOX proteins are incubated with a polypeptide and one or

intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Suitable carriers are described in the most recent edition of Remington's Pharmaceutical Sciences, a standard reference text in the field, which is incorporated herein by reference. Preferred examples of such carriers or diluents include, but are not limited to, water, saline, Ringer's solutions, dextrose solution, and 5% human serum albumin. Liposomes and non-aqueous vehicles such as fixed oils may also be used. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, use thereof in the compositions is contemplated. Supplementary active compounds can also be incorporated into the compositions.
Encapsulation technologies are also widely applied in many industries. Examples include pharmaceuticals for controlled release of drugs; pigments in foods and beverages; antioxidants in foods; and controlled release of insect pheromones in agriculture. Capsules, microcapsules and microspheres are small spherical particles, which contain an active ingredient within the particle matrix or attached to the particle surface. For example, encapsulation in biodegradable alginate microparticles has been shown. Bioencapsulation technologies are intended to encapsulate cells, enzymes, and biologically active materials.
A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (e.g., topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates, and agents for the adjustment of tonicity such as sodium chloride or dextrose. The pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.
Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion.
For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringeability exists. It must be stable under the conditions of manufacture and storage and must be preserved against contamination by microorganisms, such as bacteria, fungi or viruses. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent that delays absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions can be prepared by incorporating the active compound (e.g., an EXOX protein or anti-EXOX antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, methods of preparation are vacuum drying and freeze-drying that yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
A crude preparation of cell culture medium from T. rubrum or transgenic fungi producing EXOX, or EXOX purified from T. rubrum or transgenic fungi producing EXOX can be administered orally since the proteases are secreted. Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.
For administration by inhalation, the compounds are delivered in the form of an aerosol spray from pressured container or dispenser, which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.
Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.
The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.
In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from, for example, Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.
It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit contains a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specification for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.
The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (see, e.g., U.S. Pat. No. 5,328,470) or by stereotactic injection. See, e.g., Chen, et al., Proc. Natl. Acad. Sci. USA 91:3054-3057 (1994). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells that produce the gene delivery system.
The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

EXAMPLES
Example 1
Methods and Materials
Strains and Plasmids
A clinical isolate, T. rubrum CHUV 862-00, was used in this study. E. coli LE392 was used for the propagation of the

An A. fumigatus cDNA library was previously constructed with the CHUV 192-88 strain grown 40 h at 30° C. in liquid medium containing 0.2% collagen as a sole nitrogen and carbon source (Monod et al., 1991). Total RNA was extracted as described (Applegate and Monod) and the mRNA was purified using oligo(dT) cellulose (Sigma, Buchs, Switzerland) according to standard protocols (Sambrook et al., 1989). A library was prepared with this mRNA using lambda phage gt11 (Promega) and the protocols of the manufacturer.
TABLE 13 shows T. rubrum and A. fumigatus genes encoding aminopeptidases.
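The gene structures reported in Table 13 are mutually constrained: the ORF length should equal three times the number of encoded amino acids plus one stop codon, the genomic length should equal the ORF length plus the summed intron lengths, and the per-exon amino acid counts should add up to the total. The following minimal Python sketch checks these relations on the values transcribed from Table 13. The dictionary layout and function name are hypothetical, and the gene labels follow the naming used in Table 14.

```python
# Values transcribed from Table 13 (intron coordinates are inclusive,
# counted on the genomic DNA from the ATG codon).
TABLE_13 = {
    "ruLAP2": {"genomic_bp": 1757, "orf_bp": 1488, "aa": 495,
               "introns": [(106, 231), (556, 632), (917, 982)],
               "exon_aa": [35, 108, 95, 257]},
    "fuLAP2": {"genomic_bp": 1557, "orf_bp": 1497, "aa": 498,
               "introns": [(85, 144)],
               "exon_aa": [28, 470]},
    "ruLAP1": {"genomic_bp": 1256, "orf_bp": 1122, "aa": 373,
               "introns": [(157, 226), (968, 1031)],
               "exon_aa": [52, 247, 74]},
    "fuLAP1": {"genomic_bp": 1298, "orf_bp": 1167, "aa": 388,
               "introns": [(187, 252), (1000, 1064)],
               "exon_aa": [62, 249, 77]},
}


def check_gene(rec):
    """Return the three consistency checks for one Table 13 row."""
    intron_bp = sum(end - start + 1 for start, end in rec["introns"])
    return {
        # ORF = codons for every amino acid plus one stop codon.
        "orf_matches_aa": rec["orf_bp"] == 3 * (rec["aa"] + 1),
        # Genomic span from ATG to STOP = ORF plus introns.
        "genomic_matches_orf_plus_introns":
            rec["genomic_bp"] == rec["orf_bp"] + intron_bp,
        # Exon-wise amino acid counts add up to the full protein.
        "exons_sum_to_aa": sum(rec["exon_aa"]) == rec["aa"],
    }


if __name__ == "__main__":
    for gene, rec in TABLE_13.items():
        print(gene, check_gene(rec))
```

All four rows satisfy the three relations, which supports the reading of the flattened table values.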

TABLE 13

Gene     Genomic DNA (bp, from the    cDNA: ORF length (bp)    aa number encoded      Introns (bp of the genomic DNA
         ATG to the STOP codon)       from the ATG codon       from the ATG codon     from the ATG codon)
ruLAP2   1757                         1488                     495                    3 introns (bp 106-231; 556-632; 917-982); 4 exons coding for 35, 108, 95, 257 aa
fuLAP2   1557                         1497                     498                    1 intron (bp 85-144); 2 exons coding for 28, 470 aa
ruLAP1   1256                         1122                     373                    2 introns (bp 157-226; 968-1031); 3 exons coding for 52, 247, 74 aa
fuLAP1   1298                         1167                     388                    2 introns (bp 187-252; 1000-1064); 3 exons coding for 62, 249, 77 aa

bacteriophage λEMBL3 (Promega, Wallisellen, Switzerland). All plasmid-subcloning experiments were performed in E. coli DH5α using plasmid pMTL21. Chambers et al., Gene 68:139-149 (1988). P. pastoris GS115 and the expression vector pKJ113 (Borg-von Zepelin et al., Mol. Microbiol. 28:543-554 (1998)) were used to express recombinant peptidases. It is known in the art that P. pastoris can be utilized to express a multitude of recombinant proteins.
T. rubrum Growth Media
T. rubrum was grown on Sabouraud agar and liquid medium (Bio-Rad, Munchen, Germany) or, to promote production of proteolytic activity, in liquid medium containing 0.2% soy protein (Supro 1711, Protein Technologies International, St. Louis, Mo.) as a sole nitrogen and carbon source. No salt was added in this medium. Those skilled in the art will recognize it is also possible to utilize growth media in which salt is added to the medium. A volume of 100 ml of liquid medium was inoculated with a plug of freshly growing mycelium in 800 ml tissue culture flasks. The cultures were incubated 10 days at 30° C. without shaking.
Genomic and cDNA Libraries
A T. rubrum genomic DNA library was prepared using DNA isolated from freshly growing mycelium (Yelton et al., Proc. Natl. Acad. Sci. USA 81:1470-1474 (1984)). The DNA was partially digested with Sau3A and DNA fragments ranging from 12 to 20 kb were isolated from low-melting-point agarose (Roche Diagnostics, Rotkreuz, Switzerland) with agarase (Roche Diagnostics). These DNA fragments were inserted into bacteriophage λEMBL3 using an appropriate cloning system (Promega).
A T. rubrum cDNA library was prepared in a pSPORT6 plasmid (Invitrogen Life Technologies; Rockville, Md., USA) using the microquantity mRNA system and 500 µg of total RNA. The RNA was prepared from 10-day-old cultures in soy protein liquid medium (10x100 ml). The mycelium was ground under liquid nitrogen to a fine powder using a mortar and pestle, and the total RNA was isolated using an RNeasy total RNA purification kit for plant and fungi (Qiagen, Basel, Switzerland).

Lap Gene Cloning
Recombinant plaques (10) of the genomic library were immobilized on GeneScreen nylon membranes (NEN Life science products, Boston, Mass.). The filters were hybridized with 32P-labelled probe using low-stringency conditions. Monod et al., Mol. Microbiol. 13:357-368 (1994). All positive plaques were purified and the associated bacteriophage DNAs were isolated as described by Grossberger. Grossberger, Nucleic Acid Res. 15:6737 (1987). Hybridizing fragments from EMBL3 bacteriophages were subcloned into pMTL21 following standard procedures. Nucleotide sequencing was performed by Microsynth (Balgach, Switzerland).
Isolation of cDNA by Standard PCR
T. rubrum and A. fumigatus cDNAs were obtained by PCR using DNA prepared from 10^6 clones of the cDNA libraries. PCR was performed according to standard conditions using homologous primers derived from DNA sequences of the different peptidase genes (Table 13). Two hundred ng of DNA, 10 µl of each sense and antisense oligonucleotides at a concentration of 42 mM and 8 µl of deoxynucleotide mix (containing 10 mM of each dNTP) were dissolved in 100 µl PCR buffer (10 mM Tris-HCl pH 8.3, 50 mM KCl and 1.5 mM MgCl2). To each reaction 2.5 units of AmpliTAQ DNA polymerase (Perkin Elmer, Zurich, Switzerland) were added. The reaction mixtures were incubated 5 min at 94° C., subjected to 25 cycles of 0.5 min at 94° C., 0.5 min at 55° C. and 0.5 min at 72° C. and finally incubated 10 min at 72° C.
Production of Recombinant LAPs
Expression plasmids were constructed by cloning cDNA PCR products in the multiple cloning site of the E. coli-P. pastoris shuttle vector pKJ113. The PCR products were purified using a PCR purification kit (Roche Diagnostics) and digested by restriction enzymes for which a site was previously designed at the 5' extremity of the primers (Table 14). P. pastoris GS115 (Invitrogen) was transformed by electroporation with 10 µg of plasmid DNA linearized by EcoRI or SmaI. Transformants selected on histidine-deficient medium (1 M sorbitol, 1% (w/v) dextrose, 1.34% (w/v) yeast nitrogen base (YNB) without amino acids, 4x10^-5% (w/v) biotin, 5x10^-3% amino acids (e.g. 5x10^-3% (w/v) of each L-glutamic acid, L-methionine, L-lysine, L-leucine, L-isoleucine), 2% (w/v) agarose) were screened for insertion of the construct at the AOX1 site on minimal methanol plates (1.34% (w/v) YNB without amino acids, 4x10^-5% (w/v) biotin, 0.5% (v/v) methanol, 2% (w/v) agarose). The transformants unable to grow on media containing only methanol as a carbon source were assumed to contain the construct at the correct yeast genomic location by integration events in the AOX1 locus displacing the AOX1 coding region. These transformants were grown to near saturation (OD 20 at 600 nm) at 30° C. in 10 ml of glycerol-based yeast media (0.1 M potassium phosphate buffer at pH 6.0, containing 1% (w/v) yeast extract, 2% (w/v) peptone, 1.34% (w/v) YNB without amino acids, 1% (v/v) glycerol and 4x10^-5% (w/v) biotin). Cells were harvested and resuspended in 2 ml of the same medium with 0.5% (v/v) methanol instead of glycerol and incubated for 2 days. After 2 days of incubation, the supernatant was harvested and tested for protein production on SDS-PAGE gels. Recombinant peptidase enzymes were produced in large quantities from 400 ml cell culture supernatant.
Table 14 describes materials used for the expression of the different LAPs in P. pastoris.

TABLE 14

Gene      Oligonucleotide primers                                                          Orientation   Encoded amino acid sequence
ruLAP2    GT TG/T CGA CTT GTT GGT CAA GAG CCC TTC GGA TGG (SEQ ID NO: 47)                  sense         (R)(L)VGQEPFGW (SEQ ID NO: 63)
          GT TGC/GGC CGC TTA CAT GAA GAC AGT TGG GTG TCC (SEQ ID NO: 48)                   antisense     GHPTVFM stop (SEQ ID NO: 64)
fuLAP2    GT TC/T CGA GGC CCA GGA TGG GAC TGG AAG (SEQ ID NO: 49)                          sense         (R)GPGWDWK (SEQ ID NO: 65)
          CGC AAA GG/T GCA CTC GCC CCG CGA (SEQ ID NO: 50)                                 antisense     SRGECTFA (SEQ ID NO: 66)
          TCG CGG GGC GAG/TGC ACC TTT GCG (SEQ ID NO: 51)                                  sense         SRGECTFA (SEQ ID NO: 67)
          CTT A/GA TCT CTA CTG CTC AAC CCG GTC CTT (SEQ ID NO: 52)                         antisense     KDRVEQ stop (SEQ ID NO: 68)
ruLAP1    GT TC/T CGA GGC ATT CCT GTT GAT GCC CGG GCC G (SEQ ID NO: 53)                    sense         (R)(G)IPVDARA (SEQ ID NO: 69)
          CTT A/GA TCT TTA CTT AGC AAG CTC AGT GAC GAA GCC GAC (SEQ ID NO: 54)             antisense     VGFVTELAK stop (SEQ ID NO: 70)
fuLAP1    GT TC/T CGA GGG GCT GTA GCT GCA GTG ATT (SEQ ID NO: 55)                          sense         (R)GAVAAVI (SEQ ID NO: 71)
          CTT A/GA TCT TTA AAA CGG CGC AAA TGC CAA (SEQ ID NO: 56)                         antisense     LAFAPF stop (SEQ ID NO: 72)
ruDPPIV   CT TC/T CGA GTC GTT CCT CCT CGT GAG CCC CG (SEQ ID NO: 57)                       sense         (R)(V)VPPREPR (SEQ ID NO: 73)
          G TTC CAT GGT/CAT GAC CTT TGT GTC ATA CGA GAC AG (SEQ ID NO: 58)                 antisense     VSYDTKVM (SEQ ID NO: 74)
          GT TCC ATG GT/C ATG ACC CCT CTC GTC AAC GAT AAG G (SEQ ID NO: 59)                sense         VMTPLVNDK (SEQ ID NO: 75)
          CTT G/GA TCC TCA TTC CTC TGC CCT CTC ACC (SEQ ID NO: 60)                         antisense     GERAEE stop (SEQ ID NO: 76)
ruDPPV    CCG G/AA TTC TTT ACC CCA GAG GAC TTC (SEQ ID NO: 61)                             sense         (E)(F)FTPEDF (SEQ ID NO: 77)
          GAG T/CT AGA CTA GTA GTC GAA GTA AGA GTG (SEQ ID NO: 62)                         antisense     HSYFDY stop (SEQ ID NO: 78)

Gene      PCR product (with cloning sites)                                        Vector
ruLAP2    ruLAP2 (58-1485) SalI-NotI                                              pKJ113 XhoI-NotI
fuLAP2    fuLAP2a (49-460) XhoI-ApaLI; fuLAP2b (461-1494) ApaLI-BglII             pKJ113 XhoI-BamHI
ruLAP1    ruLAP1 (61-1119) XhoI-BglII                                             pKJ113 XhoI-BamHI
fuLAP1    fuLAP1 (46-1164) XhoI-BglII                                             pKJ113 XhoI-BamHI
ruDPPIV   ruDPPIVa (49-1266) XhoI-RcaI; ruDPPIVb (1267-2325) RcaI-BamHI           XhoI-BamHI
ruDPPV    ruDPPV (58-2178) EcoRI-XbaI                                             pPICZαA EcoRI-XbaI

*In parentheses are shown amino acids encoded by the restriction site sequences and added to the N-terminal extremity of recombinant enzymes. The numbers in parentheses represent nucleotide positions on LAP and DPP cDNAs. FuLAP2 and ruDPPIV PCR fragments were inserted end to end into E. coli-P. pastoris shuttle vectors.
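The relationship between a primer in Table 14 and the "encoded amino acid sequence" listed next to it can be checked by translating the in-frame part of the primer with the standard genetic code. The following is a minimal sketch, not part of the original disclosure: the function names are hypothetical, and the primer strings are transcribed from Table 14 as printed here, so any transcription error in the table carries over.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(dna):
    """Keep only A/C/G/T, dropping spaces and the '/' cut-site marks."""
    return "".join(ch for ch in dna.upper() if ch in BASES)


def translate(dna):
    """Translate from the first base; '*' marks a stop codon."""
    dna = clean(dna)
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Reverse complement, used to read an antisense primer on the coding strand."""
    return clean(dna).translate(str.maketrans("ACGT", "TGCA"))[::-1]


# In-frame part of the ruLAP2 sense primer (after the leading GT TG/T of the site).
print(translate("CGA CTT GTT GGT CAA GAG CCC TTC GGA TGG"))

# fuLAP1 antisense primer, read on the coding strand.
print(translate(reverse_complement("CTT A/GA TCT TTA AAA CGG CGC AAA TGC CAA")))
```

The first call reproduces the (R)(L)VGQEPFGW entry, with the first two residues contributed by the restriction site; the second begins with LAFAPF followed by a stop codon, and the remaining residues come from the BglII site and flanking bases of the primer.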

Purification of Recombinant LAPs

The secreted proteins from 400 ml of P. pastoris culture supernatant were concentrated by ultrafiltration using an Amicon cell and an Ultracel Amicon YM30 membrane (30 kDa cut-off) (Millipore, Volketswil, Switzerland). The concentrate was washed with 50 mM Tris-HCl, pH 7.5 and applied to a Mono Q-Sepharose (Amersham Pharmacia, Dübendorf, Switzerland) column equilibrated with the same buffer. After washing the column with 50 mM Tris-HCl, pH 7.5, elution was performed with a linear gradient of 0-0.5 M NaCl at a flow-rate of 1 ml/min. The different fractions eluted from the Mono Q-Sepharose column were screened for enzymatic activity using Leucine-7-amino-4-methylcoumarin (Leu-AMC) as a substrate and LAP-containing fractions were pooled. After concentration in an Amicon ultrafiltration cell with an Ultracel Amicon YM30 membrane and washing with 20 mM Tris-HCl, pH 6.0, the LAP extract was loaded on a size exclusion Superose 6 FPLC column (Amersham Pharmacia) and elution was performed at a flow-rate of 0.2 ml/min using 20 mM Tris-HCl, pH 6.0 as eluant. The eluted active fractions were pooled. The LAP enzyme was concentrated to a final volume of 0.4-1.0 ml in a Centricon concentrator with a 30 kDa cut-off (Millipore) at 4° C. prior to further functional characterization.

In an alternative purification scheme, each step of purification was performed at 4° C. The secreted proteins from 400 ml of P. pastoris culture supernatant were concentrated by ultrafiltration using an Amicon cell and an Ultracel Amicon YM30 membrane (30 kDa cut-off) (Millipore, Volketswil, Switzerland). The concentrate was washed with 100 ml of 20 mM sodium acetate, pH 6.0 and applied to a Mono Q-Sepharose (Amersham Pharmacia, Dübendorf, Switzerland) column equilibrated with the same buffer. After washing the column with 20 mM Tris-HCl pH 6.0 buffer, the enzyme was eluted with a linear gradient of 0-0.2 M NaCl at a flow-rate of 1 ml/min over 142 min. The different fractions eluted from the Mono Q-Sepharose column were screened for enzymatic activity using Leucine-7-amino-4-methylcoumarin (Leu-AMC) as a substrate (see below) and LAP-containing fractions were pooled. After concentration in an Amicon ultrafiltration cell with an Ultracel Amicon YM30 membrane and washing with PBS, the LAP extract was loaded on a size exclusion Superdex 200 FPLC column (Amersham Pharmacia) using 20 mM sodium acetate pH 6.0 buffer and elution was performed at a flow-rate of 0.2 ml/min. The eluted active fractions were pooled. The LAP enzyme was subjected to further characterization after concentration to a final volume of 0.4-1.0 ml in a Centricon concentrator with a 30 kDa cut-off (Millipore) at 4° C.

A fraction containing ruLAP2 activity elutes from MonoQ at 30-40 min (approx. 50 mM NaCl) and at 65-70 min with Superdex 200 (Peak 3). However, a large amount of LAP2 activity was not retained and eluted in the flow-through at 1 M NaCl. Therefore, after desalting this fraction with 20 mM sodium acetate, the sample was applied on the same MonoQ column with a wider gradient between 0 and 1 M NaCl over 142 min at 0.5 ml/min. A first peak of activity elutes at 7-15 min corresponding to 70-140 mM NaCl and a second peak elutes at 150-250 mM NaCl (with more activity content). The fraction at 70-140 mM NaCl elutes at 78-80 min on Superdex and was therefore pooled with Peak 3 obtained above. The fraction at 150-250 mM NaCl gives two active fractions eluting respectively at 44-49 min (Peak 1) and 50-63 min (Peak 2) on Superdex.

Protein Extract Analysis

Protein extracts were analyzed by SDS-PAGE with a separation gel of 12% polyacrylamide. Gels were stained with Coomassie brilliant blue R-250 (Bio-Rad). N-glycosidase F digestion was performed as previously described. Doumas et al., Appl. Environ. Microbiol. 64:4809-4815 (1998).

Western Blots

The membranes were first stained with Red-Ponceau and the major protein bands were marked with a needle. Immunoblots were performed using rabbit antisera and alkaline phosphatase conjugated goat anti-rabbit IgG (Bio-Rad) or peroxidase-conjugated goat anti-rabbit IgG (Amersham Pharmacia) as secondary labeled antibodies. Rabbit antisera to ruLAP1, ruLAP2, A. oryzae secreted alkaline protease (ALP) and A. oryzae secreted neutral protease (NPI) of the fungalysin family (Doumas et al., J. Food Mycol. 2:271-279 (1999)) were made by Eurogentec (Liege, Belgium) using purified recombinant enzyme.

Aminopeptidase Activity Assay

Aminopeptidase activity was determined using different fluorogenic aminoacyl-4-methylcoumaryl-7-amide derivatives of peptides and the internally quenched fluorogenic substrate Lys(Abz)-Pro-Pro-pNA for specific determination of aminopeptidase P activity. Stockel et al., Adv. Exp. Med. Biol. 421:31-35 (1997). All substrates were from Bachem (Bubendorf, Switzerland). Substrate stock solutions were prepared at 0.1 M according to the recommendations of the manufacturer and stored at -20° C. The reaction mixture contained a concentration of 5 mM substrate and enzyme preparation (between 56 and 2,662 ng per assay depending on the cleavage activity of each enzyme for the substrates) in 25 µl of 50 mM Tris-HCl buffer adjusted at the optimal pH for each LAP (between 7 and 8). After incubation at 37° C. for 60 min, the reaction was terminated by adding 5 µl of glacial acetic acid and the reaction mixture was diluted with 3.5 ml of water. The released 7-amino-4-methylcoumarin (AMC) was measured using a spectrofluorophotometer (Perkin Elmer LS-5 fluorometer, Zurich, Switzerland) at an excitation wavelength of 370 nm and an emission wavelength of 460 nm. A standard curve made with synthetic AMC was used to assess the released AMC. The released diprolyl-p-nitroanilide was measured at an excitation wavelength of 310 nm and an emission wavelength of 410 nm. The LAP activities were expressed in nmoles of released AMC or pNA/min/µg protein.

Table 15 details the hydrolytic activity of different LAPs toward various aminoacyl-MCA in comparison (%) to Leu-MCA used as a standard.

TABLE 15

Substrate              ruLAP2   fuLAP2   ruLAP1   fuLAP1   pkLAP
Leu-AMC                100.0    100.0    100.0    100.0    100.0
Ile-AMC                  6.4      1.8      7.4     13.2      6.3
Val-AMC                  4.8      0.8      4.9     27.6      4.0
Ala-AMC                 33.3     11.7      5.2      4.7    584.7
Gly-AMC                  3.3      2.2      5.1      0.8     74.8
Ser-AMC                 26.1     10.3      5.9     10.3     24.6
Thr-AMC                  0.9      0.1      1.7      5.1      4.4
Cys-AMC                 14.9      2.1     18.5      5.0     35.5
Met-AMC                119.7     89.5     41.3    116.9     46.1
Asn-AMC                114.6     73.5      6.8     29.4     33.9
Gln-AMC                 49.9     37.0      2.3     44.9     50.7
Asp-AMC                  3.8      0.3      0.0      0.8      0.9
Glu-AMC                  3.7      1.1      0.0      0.0      4.7
Lys-AMC                  4.6      2.3      9.1      7.7     70.1
Arg-AMC                  1.9      2.3     12.3     53.9    174.8
His-AMC                  0.6      1.9      0.1      0.8     17.6
Phe-AMC                 17.1      8.9      4.6    163.7    184.4
Pro-AMC                 21.4      7.4      1.4     12.0      7.9
Hyp-AMC                 14.2     13.3      0.3      3.9      1.7
Gly-Pro-AMC              7.2     74.1      0.0      5.4     16.7
Pyr-AMC                  0.0      0.0      0.0      0.0      0.0
Lys(Abz)-Pro-Pro-pNA     0.0      0.0      0.0      0.0      0.0

Effect of Various Chemical Reagents on LAPs

Inhibitors and metallic cations were pre-incubated with the enzymes for 15 min at 37° C. Then, Leu-AMC at a 5 mM final concentration was added. After further incubation for 60 min, enzyme activity was measured as described above. The inhibitors and the concentrations at which they were tested on purified LAPs were: 500 µM amastatin (Bachem), 40 µM benzamidine (Sigma), 500 µM bestatin (Bachem), 5 mM/1 mM EDTA (Sigma), 100 µM E-64 (L-trans-epoxysuccinyl-leu-4-guanidinobutylamide) (Bachem), 100 µM leupeptin (Sigma), 5 mM/1 mM ortho-phenanthroline (Sigma), 500 µM p-chloromercuribenzoic acid (Sigma), 100 µM pepstatin A (Sigma), 40 µM PMSF (Sigma), 20 µM TLCK (Roche Diagnostics), and 20 µM TPCK (Roche Diagnostics). CaCl2, MgCl2, MnCl2, CoCl2, ZnCl2, NiCl2 and CuCl2 were tested at concentrations of 0.5 mM and 1 mM.

Table 16 details the hydrolytic activity of different EXOXs in the presence of various protease inhibitors using Leu-MCA as a substrate for LAP. The activity is given as a percentage of the activity of control enzymatic reaction without inhibitor.

TABLE 16

Inhibitor                              ruLAP2   fuLAP2   ruLAP1   fuLAP1   pkLAP
EDTA 5 mM                                 5       50        0       16       99
EDTA 1 mM                                 7       77        7       19       68
ortho-phenanthroline 5 mM                 0        0        0        0        0
ortho-phenanthroline 1 mM                 0        0        0        0        0
Bestatin 500 µM                          55       88        0       11       24
Amastatin 500 µM                          0        0        0       17        0
p-chloromercuribenzoic acid 500 µM       21       96       32       90       59
E-64 100 µM                              34       71      103      190       93
Leupeptin 100 µM                        113       61      233      149       86
Pepstatin 100 µM                         45       73      160       14       64
PMSF 40 µM                               79       84       78      156       58
Benzamidine 40 µM                        89       91       85       77       75
TLCK 20 µM                               96      120       68       80      113
TPCK 20 µM                               79       87       68       95      108

Table 17 details the hydrolytic activity of different EXOXs in the presence of various cations using Leu-MCA as a substrate for LAP. The activity is given as the percentage of the activity of control enzymatic reaction without any cation.

TABLE 17

Cation           ruLAP2   fuLAP2   ruLAP1   fuLAP1    pkLAP
CaCl2 0.5 mM     126.6    110.0    151.7     54.9     177.4
CaCl2 1 mM       141.9    165.4    175.6     43.3     161.8
MgCl2 0.5 mM     121.2     97.6    129.9     68.5     130.1
MgCl2 1 mM       110.2    108.0    132.6     72.6     146.1
MnCl2 0.5 mM      77.5     84.3    120.7     25.9     157.6
MnCl2 1 mM        86.8    140.2    105.2     28.4     165.8
CoCl2 0.5 mM     591.2    378.0    210.2    104.3     876.1
CoCl2 1 mM       789.7    662.7    202.1     96.5     899.8
ZnCl2 0.5 mM      77.9     51.4     43.0     60.7     437.6
ZnCl2 1 mM        88.9    119.5     68.9     53.2     297.9
NiCl2 0.5 mM     130.5     98.4     74.8     51.7    1187.7
NiCl2 1 mM       147.9    149.3     58.1     37.2    1158.7
CuCl2 0.5 mM      50.9     68.9     40.1     25.8    1422.0
CuCl2 1 mM        34.7     73.6     13.7     17.0    1092.4

Optimal pH of Activity of EXOXs

The optimal pH for enzymatic activities was determined using the Ellis and Morrison buffer system. Ellis & Morrison, Methods Enzymol. 87:405-426 (1982). The buffer contained three components with different pKa values while the ionic strength of buffer remained constant throughout the entire pH range examined. The pH of the buffer was adjusted from 6 to 11 in half-pH unit increments with 1 M HCl or 1 M NaOH. The assay conditions for activity on Leu-AMC substrates were the same as above except that the Tris/HCl buffer was replaced by the Ellis and Morrison buffer (composition) at the pH values indicated.

Table 18 details characteristics of native and recombinant T. rubrum and A. fumigatus secreted aminopeptidases.

TABLE 18

Gene     Gene length (nt)  Number of introns  Preprotein (aa)  Signal (aa)  Mature domain (aa)  Molecular mass of the polypeptidic chain of the mature enzyme (kDa)  Molecular mass of the native/recombinant enzyme (kDa)
ruLAP1   1256              2                  373              19           354                 38.804                                                               31-33/38-40
fuLAP1   1298              2                  388              17           371                 41.465                                                               ND/40
ruLAP2   1757              3                  495              18           477                 51.487                                                               58/58-65
fuLAP2   1557              1                  498              15           383                 52.270                                                               ND/75-100
ruDPPIV  2326              0                  775              15           760                 86.610                                                               90/90

Gene     Molecular mass of recombinant enzyme after deglycosylation (kDa)  Number of putative glycosylation sites  Calculated pI (mature domain)*  Yield of recombinant protein (µg/ml)  GenBank accession number
ruLAP1   38-40                                                             3                                       6.39 (6.23)                     40                                    AY496930
fuLAP1   40                                                                3                                       5.67 (5.67)                     80                                    AY436356
ruLAP2   52                                                                4                                       7.32 (6.94)                     40                                    AY496929
fuLAP2   52                                                                6                                       5.57 (5.46)                     100                                   AY436357
ruDPPIV  84                                                                4                                       (8.05)                          10                                    AY497021

ND: not determined.
*The value in brackets corresponds to full-length polypeptide without prosequence.
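The length columns of Table 18 can be cross-checked with simple arithmetic. The sketch below assumes that the mature domain length equals the preprotein length minus the signal sequence length. That assumption is ours, not a statement from the original text, and it ignores any additional prosequence processing. The dictionary name and function name are hypothetical; the numbers are transcribed from Table 18.

```python
# Lengths in amino acids, transcribed from Table 18:
# gene -> (preprotein, signal sequence, mature domain as listed)
TABLE_18_LENGTHS = {
    "ruLAP1": (373, 19, 354),
    "fuLAP1": (388, 17, 371),
    "ruLAP2": (495, 18, 477),
    "fuLAP2": (498, 15, 383),
    "ruDPPIV": (775, 15, 760),
}


def inconsistent_rows(table):
    """Return genes whose listed mature length differs from preprotein - signal."""
    return [
        gene
        for gene, (preprotein, signal, mature) in table.items()
        if preprotein - signal != mature
    ]


for gene, (preprotein, signal, mature) in TABLE_18_LENGTHS.items():
    print(gene, "expected", preprotein - signal, "listed", mature)

print("rows to re-check:", inconsistent_rows(TABLE_18_LENGTHS))
```

Under this assumption four of the five rows agree exactly. The fuLAP2 row does not (498 - 15 = 483, against a listed value of 383), so that entry may be worth checking against the sequence listing.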

Temperature Optima of Activity of EXOXs encoding the Subtilisins and the fungalysins secreted by A. The optimal temperature conditions were determined by Oryzae and A. fumigatus. In addition, the M. Canis and measuring the enzymatic activity their pH optima after incu 25 Aspergillus genes showed colinear intron-exon structures. bating each of the LAPs with Leu-AMC (5 mM) at 20,30, 40, Therefore, DNA sequences available for A. oryzae and 50, 60, 70 and 80° C. for 10, 30 and 60 min. Sacharomyces cerevisiae genes coding for aminopeptidases Proteolytic Assays were used to design probes for screening a T. rubrum genomic The proteolytic activity was measured using resorufin DNA library. Characterization of the T. rubrum secreted ami labeled casein in phosphate buffer (20 mM; pH 7.4). The 30 nopeptidases in comparison to those secrets by the opportun reaction mixture contained 0.02% substrate in a total volume ist A. fumigatus was performed using recombinant proteins. of 0.5 ml. After incubation at 37°C., the undigested substrate was precipitated by trichloroacetic add (4% final concentra Example 4 tion) and separated from the Supernatant by centrifugation. The absorbance at 574 nm of the supernatant was measured 35 Cloning of Genes Encoding T. rubrum and A. after alkalinization by adding 500 ul Tris buffer (500 mM; pH filmigatus Aminopeptidases 9.4). For practical purposes, one unit (U) of proteolytic activ ity was defined as that producing an absorbance of 0.001 per Tables 19A and 19B detail a pairwise comparison of vari 1. ous LAPs. 40 Example 2 TABLE 19A T. rubrum Secreted Proteolytic Activity %. Similarity or Identity T. rubrum was grown at 30° C. in a medium containing 45 M28E 0.2% soy protein as a sole carbon and nitrogen source. 
After Enzyme ruIAP1 fulLAP1 orLAP1 Vibrio LAP 14 days of growth, a concomitant clarification of the culture ruIAP1 72 72 41 fulLAP1 50 70 39 medium was noted and a substantial proteolytic activity (400 orLAP1 48 49 42 Uml") detected using resorufin-labeled casein as substrate. Vibrio LAP 22 21 23 This proteolytic activity was 15% and 85% inhibited by 50 PMSF and ortho-phenanthroline. respectively, attesting that serine and metalloproteases were secreted by T. rubrum. Western blot analysis of culture supernatant revealed that T. TABLE 19B rubrum, like M. canis, secreted endoproteases of the subtili %. Similarity or Identity sin family (MEROPSDS8) and of the fungalysin family 55 (MEROPS>M36) similar to the alkaline protease ALP and M28A the neutral metalloprotease NPI secreted by A. oryzae (See Enzyme ruIAP2 fulLAP2 orLAP2 S. cer. aaY FIG. 1). In addition, a high activity on Substrates such as ruIAP2 69 71 53 Leu-AMC and Leu-pNA was detected in the T. rubrum cul fulLAP2 51 85 52 ture Supernatant. 60 orLAP2 49 72 53 S. cer. aaY 32 33 34 Example 3 The percent of similarity (top right-hand corner) and percent of identity (bottom left-hand corner values were obtained with the program Gap implemented in the GCG package of the T. rubrum Secreted Aminopeptidase Activity Genetics Computer Group, University of Wisconsin, Madison. 65 FIG. 14 is an alignment of deduced amino acid sequences The nucleotide sequences of Microsporum canis endopro of aminopeptidases of the M28E subfamily. Putative signal tease genes showed 50-70% similarity to homologous genes sequence processing sites are underlined. A putative KR pro US 7,943,340 B2 93 94 cessing site in rul AP1 is indicated by a solid triangle. The Example 5 amino acids of the two Zn" binding sites in S. griseus ami nopeptidase and conserved in the other LAPs are indicated by Production of Recombinant T. rubrum and A. 
an open arrow The alignment was performed with the Pileup filmigatus Aminopeptidases algorithm implemented in the GCG package of the University of Wisconsin and reformatted with Boxshade 3.2. Abis The T. rubrum and A. fumigatus cDNAs obtained by RT pIAP1 is for LAP of Agaricus bisporus. PCR were cloned in pKJ113 (Borg-von Zepelin et al., 1998) FIG. 15 is an alignment of deduced amino acid sequences and expressed in P. pastoris. Depending on the peptidase of aminopeptidases of the M28A subfamily. Putative signal produced, about 10-80g/ml of active enzyme on Leu-AMC sequence processing sites are underlined. Two amino acid 10 was obtained (See Table 18). Under identical culture condi residues. His and Asp, conserved in the fungal LAPS and tions wild type P pastoris did not secrete any leucine ami binding a first Zn" ion in S. griseus aminopeptidase are nopeptidase activity into the culture medium. SDS-PAGE analysis of recombinant rul AP2, fulLAP1 and fuIAP2 indicated by open triangles. Two additional residues His and secreted by P. pastoris transformants showed a Smearing Glu binding a second Zn" ion are indicated by solid dia 15 band (FIG. 2). Upon treatment with N-glycosidase F, only a monds, while the Asp residue bridging the two Zn'" ions is major band with a faster migration appeared on the gels indicated by an open arrow. The * represent methionine resi attesting that, in contrast to rul AP 1, these three LAPs were dues found only in rul AP2. The alignment was performed glycoproteins (FIG. 2). The apparent molecular mass of each with the Pileup algorithm implemented in the GCG package deglycosylated recombinant LAP was close to that of the of the University of Wisconsin and reformatted with Box calculated molecular mass of the polypeptide chain deduced shade 3.2. from the nucleotide sequence of the genes encoding the pro The amino acid sequences GPGINDDGSG (SEQID NO: tease. 
The deduced primary structures (amino acid 36) and DM(Q/M)ASPN (SEQID NO:37) were foundina A. sequences) of each recombinant enzyme are provided in oryzae secreted 52 kDa aminopeptidase (U.S. Pat. No. 6,127. Table 18. 161) and the S. cerevisiae aminopeptidase. Nishizawa et al., J. 25 Biol. Chem. 269:13651-13655 (1994). From these data, two Example 6 consensus oligonucleotides (GGXATXAAYGAYGAYG GXTCXGG (SEQID NO: 38) and TTXGGXGAXGCXAT Detection of rul AP1 and rul AP2 in T. rubrum CATRTC (SEQID NO:39) were used as sense and antisense, Culture Supernatant respectively, to amplify DNA from T. rubrum. A 220 bp PCR 30 Using anti-rul AP1 antiserum, an accumulation of a LAP1 product was obtained and sequenced. The deduced amino product with an electrophoretic mobility higher than that of acid sequence showed high similarity to the amino acid recombinant rul AP1 was detected in the T. rubrum culture sequence of the A. oryzae and the S. cerevisiae aminopepti supernatant (See FIG. 3). dases. This 220 bp PCR fragment was used as a probe for 35 Using anti-rul AP2 antiserum, Western blot analysis of a T. screening a phage EMBL3 T. rubrum genomic DNA library rubrum culture supernatant revealed that T. rubrum secreted and a nucleotide sequence coding for a putative aminopepti glycosylated LAP2 with the same electrophoretic mobility as dase (rul AP2) was found. A nucleotide sequence coding for that of the recombinant enzyme from P pastoris (See FIG.3). a similar secreted aminopeptidase (fulLAP2) was found in the A. fumigatus genome sequence (at website address www.TI 40 Example 7 GR.com). A 1200 bp fragment containing the nucleotide sequence of Properties of Recombinant LAPs the gene encoding an A. Oryzae 31 kDa aminopeptidase (U.S. Pat. No. 5,994,113) was obtained by PCR of A. 
oryzae The aminopeptidases rul AP1, rul AP2, fuIAP1, ful AP2, genomic DNA using the oligonucleotides GCATTCCT 45 as well as the microsomal porcine kidney aminopeptidase GUGATGCCCGGGCCG (sense) (SEQ ID NO: 40) and (pkLAP) each efficiently hydrolyzed Leu-AMC. This sub TTACTTAGCAAGCTCAGTGACGAAGCCGAC (anti strate was used to determine the optimum temperature and pH sense) (SEQID NO: 41). This fragment was used as a probe of activity, and to further characterize the enzymes by mea for a second screening of the T. rubrum genomic DNA library. suring the effect of (i) various known peptidase inhibitors A nucleotide sequence (EMBL) similar to those coding for 50 (See Table 16) and (ii) different divalentions (See Table 17). the A. oryzae 30 kDa aminopeptidase and to another putative Each LAP was capable of cleaving Leu-AMC at 20° C. and secreted aminopeptidase from the A. fumigatus genome had a temperature optimum ranging from 40 to 50° C. The sequence (at website address www.TIGR.com) was found in optimum pH was between 7.0 and 8.5 (See Table 18). A 10 phage EMBL3 DNA of the T. rubrum genomic library. min pre-treatment at 80° C. totally and irreversibly inacti These T. rubrum and A. fumigatus putative aminopeptidases 55 vated the enzymes. were called rul AP1 and ful AP1, respectively. The aminopeptidases tested were strongly or totally inhib The identified nucleotide sequences of rul AP1, rul AP2, ited by amastatin (See Table 16) at a concentration of 500 uM. ful AP1 and fuIAP2 each contain a 17-20 amino acid signal RuLAP1, fuIAP1 and pkLAP were also inhibited by besta sequence. The intron-exon structure of the T. rubrum and A. tin, but this inhibitor had only partial inhibitory effect on both filmigatus genes was determined by sequencing a PCR prod 60 rul AP2 and fulLAP2. 
Of the chelating agents tested, ortho uct using 5'-sense and 3'-antisense primers based on isolated phenantroline totally inhibited the five enzymes at concentra genomic DNA (See Table 14) and total DNA from a pool of tions of 1 and 5 mM. Ful AP1, rul AP2 and rul AP1 were 10 clones of the T. rubrum or A. fumigatus cDNA libraries as more sensitive to EDTA than the other LAPs. E64 and p-chlo a target. The first of the three introns in rul AP2 was in romercuribenzoate (cysteine protease inhibitors) blunted the position similar to that of the unique intron of fulLAP2 (See 65 activity of rul AP2 indicating the presence of critical thiol Table 13). The genes rul AP1 and fulLAP1 have similar colin residues for activity on the amino acid sequence of this ear structures with two introns and three exons. enzyme. Leupeptin (serine/cysteine protease inhibitor), US 7,943,340 B2 95 96 PMSF (serine protease inhibitor), benzamidine, TLCK and of T-cellactivation by compounds that block peptide binding TPCK had no clear inhibitory effects on all the LAPs tested. to HLA-DQ2, inhibitors of tissue transglutaminase that pre Surprisingly, fulLAP1 and rul AP1 exhibited some sensitivity vent gluten deamidation (Solid, Nat. Rev. Immunol. 2:647 to 0.1 mM pepstatin (aspartic acid protease inhibitor). 655 (2002)) and peroral peptidase supplementation. This lat With the exception of fulLAP1, which exhibits a general ter approach is considered to aid complete digestion of sensitivity to divalent ions, Co ++ ions increased the activity immunostimulatory peptides by involvement of bacterial pro of the LAPs from 200% to 900% at a concentration up to 1 lyl endopeptidases which have broad tolerance for proline mM. The four fungal LAPs showed variable sensitivities to containing peptides. Shan et al., Science 297:2275-2279 divalent cations. For instance, fulLAP2 was activated by Mn" (2002); Hausch et al., Am. J. Physiol. Gastrointest Liver and Ca", while fuIAP1 was inhibited by the same ions. 
The 10 Physiol. 283:G996-G1003 (2002). A relatively large frag microsomal pkLAP, highly activated by Zn, Ni and Cu" ment of gliadin that is resistant to digestive enzymes degra differs from the four fungal LAPs of the M28 family. dation was identified. Furthermore, this peptide was shown to The hydrolytic activity of the enzymes toward different be a potent stimulator of different HLA-DQ2-restricted T cell aminoacyl-AMC was compared to Leu-AMC used as a ref clones derived from intestinal biopsies of CD patients stimu erence (See Table 15). Following the aminopeptidase tested, 15 lated with gluten, each of these clones recognizing a different various preferences for the different aminoacyl residue were epitope of the 33 mer. The prolyl endopeptidase, which has a detected. For example, the aminopeptidase pkLAP differs preference for Pro-Xaa-Promotif, is able to cleave the 33 mer from the four fungal LAPs by an extremely high efficiency gliadin peptide and the synergistic effect of brush border towards Ala-AMC and Arg-AMC. rul AP1 was clearly the aminopeptidase rapidly decreases the T-cell stimulatory most selective for Leu-AMC. However, some other preferen potential of the peptide. tial cleavage activities were observed with rul AP2, fuIAP1 Though there are stable homologs to this 33 mer in barley and fuIAP2. For instance Ser- and Pro-AMC were more and rye, these gluten peptide motifs that are described as efficiently cleaved by rul AP2, whereas fulLAP1 appreciated resistant to gastrointestinal degradation were used in our case Arg-, Val-, and Phe-AMC. Only rul AP2 efficiently cleaved as model substrates for different LAPs, either alone or in Asp- and Glu-AMC. None of these enzymes exhibited an 25 combination with ruDPPIV: PQPQLPYPQPQLPY (SEQID aminopeptidase Pactivity since they were not able to cleave NO: 42)(14 mer) corresponding to fragment 82-95 of C/B Lys(Abz)-Pro-Pro-pNA. 
gliadin AIV (P04724) O LQLQPF PQPQLPYPQPQLPYPQPQLPYPQPQPF (SEQID NO:43) Example 8 (33 mer) corresponding to fragment 57-89 of gliadin MM1 (P 30 18573). Application of rul AP2 Together with rulDPPIV in A N-terminal acetylated form of the 33 mer (Ac-33 mer) the Digestion of Gliadin Peptides was also synthesized as control for the digestion experiments with exopeptidases to preclude any endoproteolytic cleavage Celiac disease (CD) is a digestive disease that damages the by a contaminant enzyme. small intestine and interferes with absorption of nutrients 35 The enzymes that have been evaluated include: rul AP1 from food. People who have celiac disease cannot tolerate a (aminopeptidase I of Trichophyton rubrum), rul AP2 (ami protein called gluten, which is found in wheat, rye and barley. nopeptidase II of Trichophyton rubrum), or LAP2 (ami When people with celiac disease eat foods containing gluten, nopeptidase II of Aspergillus Orizae), ful AP2 (aminopepti their immune system responds by damaging the Small intes dase II of Aspergillus fumigatus), MicpKLAP (microsomal tine. The disease has a prevalence of s1:200 in most of the 40 leucine aminopeptidase from porcine kidney, Sigma), Cytp world's population groups and the only treatment for celiac KLAP (cytosolic leucine aminopeptidase from porcine kid disease is to maintain a life-long, strictly gluten-free diet. For ney, Sigma), and rul PPIV. most people, following this diet will stop symptoms, heal Synthesis of the Peptides existing intestinal damage, and prevent further damage. Solid-phase synthesis was performed on a custom-modi The principal toxic components of wheat glutenare a fam 45 fied 430A peptide synthesizer from Applied Biosystems, ily of Pro- and Gln-rich proteins called gliadins, which are using in situ neutralization/2-(1H-benzotriazol-1-yl)-1,1,1,3, resistant to degradation in the gastrointestinal tract and con 3-tetramethyluronium hexafluoro-phosphate (HBTU) acti tain several T-cellstimulatory epitopes. 
There is some contro Vation protocols for stepwise Boc chemistry chain elongation versy about the epitopes that effectively induce an immuno on a standard —O—CH-phenylacetamidomethyl resin. logical activation of HLA-DQ2 positive gut-derived and 50 Schnólzer et al., Int. J. Peptide Protein Res. 40:180-193 peripheral T cells (Vader et al., Gastroenterology 122:1729 (1992). 1737 (2002)) because different in vitro systems have been At the end of the synthesis, the peptides were deprotected used for these studies. The capacity of gliadin peptides to and cleaved from the resin by treatment with anhydrous HF induce toxicity in an organ culture model of CD does not for 1 hr at 0° C. with 5% p-cresol as a scavenger. After correspond to that of stimulating T-cells and vice versa. 55 cleavage, the peptides were precipitated with ice-cold dieth McAdam & Sollid, Gut 47: 743-745 (2000). Moreover, the ylether, dissolved in aqueous acetonitrile and lyophilized. binding of many gluten epitopes to HLA-DQ2 and HLA The peptides were purified by RP-HPLC with a Cs column DQ8 but not all is enhanced by deamidation of certain from Waters by using linear gradients of buffer B (90% aceto glutamine residues into glutamic acids through the action of nitile/10% HO/0.1% trifluoroacetic acid) in buffer A (HO/ the Small intestinal enzyme tissue transglutaminase, which 60 0.1% trifluoroacetic acid) and UV detection at 214 nm. potentiates their ability to stimulate T-cells. Molberg et al., Samples were analyzed by electrospray mass spectrometry Nat. Med. 4:713-717 (1998). However, deamidation is not an with a Platform II instrument (Micromass, Manchester, absolute requirement for T-cell activation. Arentz-Hansen et England). al., Gastroenterology 123:803-809 (2002). Conditions of Degradation Reaction: Other strategies for treating or preventing CD, with the 65 Incubation was carried out at 37° C. 
ultimate hope being an alternative for the “gluten free” diet, have been suggested over the last years, including inhibition

in 50 mM Tris-HCl, pH 7.2 supplemented with 1 mM CoCl2 with a substrate concentration of 1 mg/mL and an E/S ratio of 1:20. The reaction was stopped by acidification with CH3COOH and the medium analysed by RP-HPLC on a C8 column using a 2%/min CH3CN gradient in 0.1% TFA. All peaks were characterized by ESI-MS.

Digestion of the 14 Mer:

As shown in FIG. 6, the 14 mer is not digested with ruLAP2 within 4 h. There is no change in the HPLC profile when compared with the control. In fact, digestion results only in the cleavage of the N-terminal Proline. On the other hand, supplementation with ruDPPIV results in a complete breakdown into amino acids and dipeptides, while ruDPPIV alone is not able to hydrolyse the peptide (FIG. 7).

Digestion of the 33 Mer:

Digestion of the 33 mer with ruLAP2 alone results in partial degradation (less than 50%) of the peptide within 4 h (data not shown). This peptide is not a substrate for ruDPPIV (FIG. 8). However, when both enzymes are mixed, the 33 mer is totally digested (FIG. 9) into amino acids and dipeptides, some of which could be identified by ESI-MS (Y, L, F, P, PY, and PF).

The same HPLC pattern is obtained when ruDPPIV is mixed with ruLAP2 or fuLAP2. However, with ruLAP1 some higher molecular weight compounds are still present, but represent less than 10% of the initial substrate.

On the other hand, incubation with microsomal porcine kidney aminopeptidase results only in a partial deletion of N-terminal Leu and C-terminal Phe (due to a carboxypeptidasic contaminant) and addition of DPPIV does not modify the profile. Cytosolic porcine kidney aminopeptidase is totally inactive towards the 33 mer.

The stability of the Ac-gliadin 33 mer in the digestion experiments with either LAP or DPPIV alone, or mixed together, confirms that a free amino group is required for the complete breakdown of the gliadin 33 mer by these exopeptidases.

Digestion with Other Enzymes:

Digestion with Pronase (E/S=1/25) over 20 h is only partial (less than 40%) and the addition of ruLAP2 (both enzymes at an E/S ratio (w:w) of 1:50) does not improve the hydrolysis. On the other hand, addition of DPPIV under the same conditions results in a complete breakdown of the peptide due to the complementary action of an aminopeptidase and dipeptidylpeptidase. Chymotrypsin alone or supplemented with ruLAP or DPPIV is not able to break down the peptide.

Example 9

Application of ruLAP2 in the Processing of Expressed Recombinant Proteins Fused with Another Protein or with a N-terminal Tag

LAP2 was evaluated in the cleavage of the Gly-Ser from the N-terminus of proNPY and of a supplementary Ala from the N-terminus of the same peptide. In order to widen the applicability of LAP2 either alone or in conjunction with another exopeptidase in the processing of larger recombinant proteins, a G-CSF recombinant protein (Cys17->Ser, Lys16,23,34,40->Arg) with an N-terminal sequence Met-Thr-Pro-, was successively incubated with ruLAP2 and ruDPPIV to remove sequentially Met and Thr-Pro dipeptide from the 175 residue protein.

Digestion of Gly-Ser-proNPY with ruLAP2:

The peptide was incubated overnight at 37° C. and 1 mg/ml in a 50 mM Tris.HCl, 1 mM CoCl2 buffer with ruLAP2 at an E/S ratio of 1:20 and 1:100 (w:w). The digested material was isolated by RP-HPLC and characterized by ESI-MS. As shown in FIG. 10, incubation with ruLAP2 results in the cleavage of the two N-terminal residues Gly-Ser with a theoretical loss of 144.1 amu (found 144.2). The same result is obtained at a 1:100 E/S ratio. Digestion halts when the enzyme reaches a Xaa-Pro motif, which in the case of proNPY is Tyr-Pro.

Digestion of Ala-proNPY with ruLAP2:

Conditions of incubation were the same as for Gly-Ser-proNPY. FIG. 11B shows that the N-terminal alanine was almost totally removed (molecular mass loss of 71 amu) from proNPY.

Successive Cleavage of Met and Thr-Pro from the N-terminus of G-CSF:

The mutant analogue of G-CSF known as TG47 used in these experiments is methionyl-C17S, K16,23,34,40R G-CSF with a theoretical mass of 18,894.90 for the refolded protein.

Digestion with ruLAP2:

Stock solution of G-CSF (1.9 mg/ml in PBS containing 0.1% Sarcosyl) was diluted 4 times in 50 mM Tris-HCl at pH 7.2 supplemented with 1 mM CoCl2, and incubated with ruLAP2 (E/S=1/20 and 1:100, w:w) for 15 h at 37° C. The solution was diluted with 30% (v:v) acetonitrile, acidified with acetic acid and the protein isolated by RP-HPLC for MS characterization. As shown in FIGS. 12A and B, the overnight incubation results in the complete cleavage of the N-terminal methionine with a theoretical mass loss of 131.2 amu. With an E/S ratio (w:w) of 1:100, traces of uncleaved material are still present after an overnight incubation.

This experiment was repeated at a 2 mg scale in order to isolate the truncated material on a semi-preparative RP-HPLC column, by carrying out the digestion with an E/S ratio of 1:25 (w:w) at 37° C. over 15 h. The isolated material (0.8 mg) was characterized by ESI-MS (FIG. 12B, desMet-G-CSF, calculated molecular mass at 18,763.7 amu; measured molecular mass at 18,762.5).

Digestion of desMet-G-CSF with DPPIV:

The freeze-dried material was suspended at a 1 mg/ml concentration in 50 mM Tris-HCl, pH 7.5 containing 0.1% Sarcosyl and incubated overnight at 37° C. with DPPIV at an E/S ratio of 1/20 (w:w). The protein was isolated by RP-HPLC as before and characterized by ESI-MS (FIGS. 13A and B). DPPIV digestion (FIG. 13B) results in the cleavage of the N-terminal dipeptide Thr-Pro (calculated molecular mass of 18,564.8 amu; measured molecular mass at 18,563). Traces of undigested material are still present in the reaction medium.

Thus, a sequential application of LAP2 and DPPIV results in the efficient removal of an N-terminal sequence from a recombinant protein. Digestion with ruLAP2 halts when the enzyme reaches a “stop point” amino acid motif such as Xaa-Pro-Xaa; the Xaa-Pro motif, which may be specifically introduced as a LAP2 “stop point”, is subsequently cleaved with DPPIV.

However, initial cleavage of the N-terminal residues is highly dependent on the sequence since the Met(His) tag was not removed from Met(His)-proNPY by incubating with LAP and DPPIV.

OTHER EMBODIMENTS

Although particular embodiments have been disclosed herein in detail, this has been done by way of example for purposes of illustration only, and is not intended to be limiting with respect to the scope of the appended claims, which follow. In particular, it is contemplated by the inventors that various substitutions, alterations, and modifications may be made to the invention without departing from the spirit and
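
The theoretical mass losses quoted in Example 9 (144.1 amu for Gly-Ser, 131.2 amu for Met) and the "stop point" behaviour at an Xaa-Pro motif can be illustrated with a short sketch. This is a simplified model written for illustration only, not a method disclosed in the patent: the function names (`lap_trim`, `dppiv_step`, `mass_loss`) are hypothetical, the residue masses are standard average values, and the one-letter peptide strings are illustrative N-terminal fragments rather than sequences quoted from the text. The model deliberately ignores the sequence dependence noted above for the Met(His) tag.

```python
# Average residue masses in Da (standard reference values); only the residues
# removed in the examples are listed. The peptide strings used below are
# illustrative N-terminal fragments, not sequences quoted from the patent.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782,
    "P": 97.1167, "T": 101.1051, "M": 131.1926,
}


def lap_trim(seq: str) -> tuple[str, str]:
    """Simplified aminopeptidase model: remove one N-terminal residue at a
    time and stop as soon as the second residue is Pro (an Xaa-Pro motif)."""
    removed = ""
    while len(seq) >= 2 and seq[1] != "P":
        removed += seq[0]
        seq = seq[1:]
    return seq, removed


def dppiv_step(seq: str) -> tuple[str, str]:
    """Simplified DPPIV model: remove a single N-terminal Xaa-Pro dipeptide."""
    if len(seq) >= 2 and seq[1] == "P":
        return seq[2:], seq[:2]
    return seq, ""


def mass_loss(removed: str) -> float:
    """Sum of the average residue masses of the removed residues, in Da."""
    return sum(RESIDUE_MASS[r] for r in removed)


# Gly-Ser extension in front of a Tyr-Pro start: trimming stops at Tyr-Pro.
npy_rest, npy_cut = lap_trim("GSYPSK")
# Met-Thr-Pro-... N-terminus: aminopeptidase step, then one DPPIV step.
gcsf_rest, met = lap_trim("MTPLGPASS")
gcsf_rest2, thr_pro = dppiv_step(gcsf_rest)

for label, cut in [("Gly-Ser", npy_cut), ("Met", met), ("Thr-Pro", thr_pro)]:
    print(f"{label}: removed {cut!r}, expected mass loss {mass_loss(cut):.1f} Da")
```

Under these assumptions the sketch gives 144.1 Da for Gly-Ser and 131.2 Da for Met, matching the theoretical losses stated in the text, and `mass_loss("A")` gives about 71.1 Da, in line with the 71 amu reported for Ala-proNPY.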


Gln Val His Leu Trp Ser His Ala Glu Ala Ala Leu Asn Ala Asn Gly 105 110

Asp Leu Ala Ser Ala Met Ser Ser Pro Pro Ala Ser 115 120 125

Ile Met Ala Glu Leu Val Val Ala Asn Asn Gly Asn Ala Thr 130 135 140

Asp Pro Ala Asn Thr Gln Gly Ile Val Leu Val Glu Arg Gly 145 150 155 160

Val Ser Phe Gly Glu Ser Ala Gln Ala Gly Asp Ala Lys Ala 165 170 175

Ala Gly Ala Ile Val Asn Asn Val Pro Gly Ser Leu Ala Gly Thr 180 185 190

Leu Gly Gly Leu Asp Arg His Val Pro Thr Ala Gly Leu Ser Gln 195

Glu Asp Gly Asn Leu Ala Thr Leu Val Ala Ser Gly Ile Asp 210 215 220

Val Thr Met Asn Val Ile Ser Leu Phe Glu Asn Arg Thr Thr Trp Asn 225 230 235 240

Val Ile Ala Glu Thr Gly Gly Asp His Asn Asn Val Ile Met Leu 245 250 255

Gly Ala His Ser Asp Ser Val Asp Ala Gly Pro Gly Ile Asn Asp Asn 260 265 270

Gly Ser Gly Ser Ile Gly Ile Met Thr Val Ala Ala Leu Thr Asn 280 285

Phe Lys Leu Asn Asn Ala Val Arg Phe Ala Trp Trp Thr Ala Glu Glu 290 295 300

Phe Gly Leu Leu Gly Ser Thr Phe Val Asn Ser Leu Asp Asp Arg 305 310 315

Glu Leu His Val Leu Leu Asn Phe Asp Met Ile Gly Ser 325 330 335

Pro Asn Phe Ala Asn Ile Asp Gly Asp Gly Ser Ala Asn 340 345 350

Met Thr Gly Pro Ala Ser Ala Glu Ile Glu Leu Phe Glu 355 360 365

Phe Phe Asp Asp Gln Ile Pro His Gln Pro Thr Ala Phe Thr Gly 370 375

Arg Ser Asp Ser Phe Ile Arg Asn Val Pro Ala Gly Gly 385 395 400

Leu Phe Thr Gly Ala Val Val Thr Pro Glu Gln Val Lys Leu 405 415

Phe Gly Gly Glu Ala Val Ala Tyr Asp Asn His Arg 425 430

Gly Asp Thr Val Ala Asn Ile Asn Gly Ala Ile Phe Leu Asn Thr 435 440 445

Arg Ala Ile Ala Tyr Ala Ile Ala Glu Ala Arg Ser Leu Gly 450 455 460

Phe Pro Thr Arg Pro Lys Thr Gly Arg Asp Val Asn Pro Gln Tyr 465 470 480

Ser Met Pro Gly Gly Gly Gly His His Thr Val Phe Met 485 490 495

<210> SEQ ID NO 4
<211> LENGTH: 1256
<212> TYPE: DNA

gacatgaccg gttacatcaa gggaatggtc gacaagggtc. tca aggtgtc. Ctt.cggitatic 840
atcaccgaca acgtcaacgc taacttgacc aagttcgt.cc gcatggit cat caccalagtac 900
tgct caatcc caac catcga caccc.gctgc ggctatoctit gctctgacca cqc ct ctogcc 960
aaccocaatig gct acc catc togc catggitt gcc.gagt citc ccatcgatct c ct cqaccct 1020
cacctic caca citgactictoga caa.cattagc tacct cqact tcgaccacat gatcgagcac 1080
gctaagct cattgtcggctt cqt cactgag Ctcgctaagt aa 1122

<210> SEQ ID NO 6
<211> LENGTH: 373
<212> TYPE: PRT
<213> ORGANISM: Trichophyton rubrum

<400> SEQUENCE: 6

Met Lys Leu Leu Ser Val Leu Ala Leu Ser Ala Thr Ala Thr Ser Val 1 5 10 15

Leu Gly Ala Ser Ile Pro Val Asp Ala Arg Ala Glu Lys Phe Leu Ile 20 25 30

Glu Leu Ala Pro Gly Glu Thr Arg Trp Val Thr Glu Glu Glu Lys Trp 35 40 45

Glu Leu Lys Arg Lys Gly Gln Asp Phe Phe Asp Ile Thr Asp Glu Glu 50 55 60

Val Gly Phe Thr Ala Ala Val Ala Gln Pro Ala Ile Ala Tyr Pro Thr 65 70 75 80

Ser Ile Arg His Ala Asn Ala Val Asn Ala Met Ile Ala Thr Leu Ser 85 90 95

Lys Glu Asn Met Gln Arg Asp Leu Thr Lys Leu Ser Ser Phe Gln Thr 100 105 110

Ala Tyr Tyr Lys Val Asp Phe Gly Lys Gln Ser Ala Thr Trp Leu Gln 115 120 125

Glu Gln Val Gln Ala Ala Ile Asn Thr Ala Gly Ala Asn Arg Tyr Gly 130 135 140

Ala Lys Val Ala Ser Phe Arg His Asn Phe Ala Gln His Ser Ile Ile 145 150 155 160

Ala Thr Ile Pro Gly Arg Ser Pro Glu Val Val Val Val Gly Ala His 165 170 175

Gln Asp Ser Ile Asn Gln Arg Ser Pro Met Thr Gly Arg Ala Pro Gly 180 185 190

Ala Asp Asp Asn Gly Ser Gly Ser Val Thr Ile Leu Glu Ala Leu Arg 195 200 205

Gly Val Leu Arg Asp Gln Thr Ile Leu Gln Gly Lys Ala Ala Asn Thr 210 215 220

Ile Glu Phe His Trp Tyr Ala Gly Glu Glu Ala Gly Leu Leu Gly Ser 225 230 235 240

Gln Ala Ile Phe Ala Asn Tyr Lys Gln Thr Gly Lys Lys Val Lys Gly 245 250 255

Met Leu Asn Gln Asp Met Thr Gly Tyr Ile Lys Gly Met Val Asp Lys 260 265 270

Gly Leu Lys Val Ser Phe Gly Ile Ile Thr Asp Asn Val Asn Ala Asn 275 280 285

Leu Thr Lys Phe Val Arg Met Val Ile Thr Lys Tyr Cys Ser Ile Pro 290 295 300

Thr Ile Asp Thr Arg Cys Gly Tyr Ala Cys Ser Asp His Ala Ser Ala 305 310 315 320


115 120 125

Asn Asn Leu Gly Val Glu Ala Asp Pro Ala Asp Leu Thr Gly 130 135 140

Lys Ile Ala Leu Ile Ser Arg Gly Glu Thr Phe Ala Thr Ser 145 150 155 160

Val Leu Ser Ala Lys Ala Gly Ala Ala Ala Ala Leu Val Asn Asn 165 170 175

Ile Glu Gly Ser Met Ala Gly Thr Leu Gly Gly Ala Thr Ser Glu Leu 180 185 190

Gly Ala Tyr Ala Pro Ile Ala Gly Ile Ser Leu Ala Asp Gly Gln Ala 195

Leu Ile Gln Met Ile Gln Ala Gly Thr Val Thr Ala Asn Leu Trp Ile 210 215

Asp Ser Gln Val Glu Asn Arg Thr Thr Asn Val Ile Ala Gln Thr 225 230 235 240

Gly Gly Asp Pro Asn Asn Val Val Ala Leu Gly Gly His Thr Asp 245 250 255

Ser Val Glu Ala Gly Pro Gly Ile Asn Asp Asp Gly Ser Gly Ile Ile 260 265 270

Ser Asn Leu Val Val Ala Ala Leu Thr Arg Phe Ser Val Asn 285

Ala Val Arg Phe Phe Trp Thr Ala Glu Glu Phe Gly Leu Leu Gly 290 295 300

Ser Asn Val Asn Ser Leu Asn Ala Thr Glu Gln Ala Ile 305 310 315

Arg Leu Leu Asn Phe Asp Met Ile Ala Ser Pro Asn Tyr Ala Leu 325 330 335

Met Ile Asp Gly Asp Gly Ser Ala Phe Asn Leu Thr Gly Pro Ala 340 345 350

Gly Ser Ala Gln Ile Glu Arg Leu Phe Glu Asp Tyr Thr Ser Ile 355 360 365

Arg Lys Pro Phe Val Pro Thr Glu Phe Asn Gly Arg Ser Asp Gln 370 375 380

Ala Phe Ile Leu Asn Gly Ile Pro Ala Gly Gly Leu Phe Thr Gly Ala 385 390 395 400

Glu Ala Ile Thr Glu Glu Gln Ala Gln Leu Phe Gly Gly Gln Ala 405 415

Gly Val Ala Leu Asp Ala Asn His Ala Gly Asp Asn Met Thr 425 430

Asn Leu Asn Arg Glu Ala Phe Leu Ile Asn Ser Arg Ala Thr Ala Phe 435 440 445

Ala Val Ala Thr Tyr Ala Asn Ser Leu Asp Ser Ile Pro Pro Arg Asn 450 455 460

Met Thr Thr Val Val Lys Arg Ser Gln Leu Glu Gln Ala Met Arg 465 470 475

Thr Pro His Thr His Thr Gly Gly Thr Gly Cys Asp Arg Val 485 490 495

Glu Gln

<210> SEQ ID NO 10
<211> LENGTH: 1298
<212> TYPE: DNA
<213> ORGANISM: Aspergillus fumigatus

<400> SEQUENCE: 10

gctittgaacg ccggtgttga ggaagccata gga attatgg tcgattatgt caccagggc 900
ct cacacagt ttct caagga cgttgttaca gcgtactgct Ctgtgggitta Cctggagacg 960
aagtgcggat atgcctgctic cgaccacacc tcggc.cagta aatatggitta tcc.cgcggct atggcgacag alagcagagat ggaaaatacc aataagaaga taCatactac cgacgacaag 1080
atcaagtatt tgagct tcga t catatgttg gag catgc.ca agttgagtict tggct tcgct 1140
titcgaattgg catttgcgcc gttittaa 1167

<210> SEQ ID NO 12
<211> LENGTH: 388
<212> TYPE: PRT
<213> ORGANISM: Aspergillus fumigatus

<400> SEQUENCE: 12

Met Lys Val Leu Thr Ala Ile Ala Leu Ser Ala Ile Ala Phe Thr Gly 1 5 10 15

Ala Val Ala Ala Val Ile Thr Gln Glu Ala Phe Leu Asn Asn Pro Arg 25 30

Ile His His Asp Gln Glu Lys Tyr Leu Ile Glu Leu Ala Pro 35 40 45

Thr Arg Trp Val Thr Glu Glu Glu Lys Trp Ala Leu Leu Asp Gly 50 55 60

Val Asn Phe Ile Asp Ile Thr Glu Glu His Asn Thr Phe Tyr Pro 65 70 75 80

Thr Leu His Ser Ala Ser Tyr Val Lys Tyr Pro Pro Met Gln Tyr 85 90 95

Ala Glu Glu Val Ala Ala Leu Asn Lys Asn Leu Ser Glu As yet 105 110

Ala Asn Leu Glu Arg Phe Thr Ser Phe His Thr Arg 115 120 125

Ser Gln Thr Gly Ile Arg Ser Ala Thr Trp Leu Phe Asp Gln Val Gln 130 135 140

Arg Val Val Ser Glu Ser Gly Ala Ala Glu Tyr Gly Ala Thr Val Glu 145 150 155 160

Arg Phe Ser His Pro Trp Gly Gln Phe Ser Ile Ile Ala Arg Ile Pro 165 170 175

Gly Arg Thr Asn Lys Thr Val Val Leu Gly Ala His Gln Asp Ser Ile 180 185 190

Asn Leu Phe Leu Pro Ser Ile Leu Ala Ala Pro Gly Ala Asp Asp Asp 195

Gly Ser Gly Thr Val Thr Ile Leu Glu Ala Leu Arg Gly Leu Leu Gln 210 215

Ser Asp Ala Ile Ala Lys Gly Asn Ala Ser Asn Thr Val Glu Phe His 225 230 235 240

Trp Ser Ala Glu Glu Gly Gly Met Leu Gly Ser Gln Ala Ile Phe 245 250 255

Ser Asn Tyr Lys Arg Asn Arg Arg Glu Ile Lys Ala Met Leu Gln Gln 260 265 270

Asp Met Thr Gly Tyr Val Gln Gly Ala Leu Asn Ala Gly Val Glu Glu 275 285

Ala Ile Gly Ile Met Val Asp Tyr Val Asp Gln Gly Leu Thr Gln Phe 290 295 300

Leu Asp Val Val Thr Ala Tyr Cys Ser Val Gly Leu Glu Thr 305 310 315 320

ggcc ct gcigg act cqtttgg tttcaagaac aaacctic cac cqcagcacgt. c cactitctgt 1740
catat ctitag acaccagcac ctgcaccalag gag cagatcc agt cagttga gaacggcact 1800
gcc.gc.cgitac gcagctggat cattgtcgac tocaactica cct Ctctgtt C cccdaggta 1860
gttggcticag gggaac ccac gccaa.cccct atgcctggag gggct actac act atctgct 1920
cacgggttct t tatggcgt gaCattatgg gctgttattgttgtagctgt tatagagctg 1980
gcaatgtaa 1989

<210> SEQ ID NO 15
<211> LENGTH: 662
<212> TYPE: PRT
<213> ORGANISM: Trichophyton rubrum

<400> SEQUENCE: 15

Met Val Ser Phe Cys Gly Val Ala Ala Cys Leu Leu Thr Val Ala Gly 1 5 10 15

His Leu Ala Gln Ala Gln Phe Pro Pro Lys Pro Glu Gly Val Thr Val 20 25 30

Leu Glu Ser Lys Phe Gly Ser Gly Ala Arg Ile Thr Tyr Lys Glu Pro 35 40 45

Gly Leu Cys Glu Thr Thr Glu Gly Val Lys Ser Tyr Ala Gly Tyr Val 50 55 60

His Leu Pro Pro Gly Thr Leu Arg Asp Phe Gly Val Glu Gln Asp Tyr 65 70 75 80

Pro Ile Asn Thr Phe Phe Trp Phe Phe Glu Ala Arg Lys Asp Pro Glu 85 90 95

Asn Ala Pro Leu Gly Ile Trp Met Asn Gly Gly Pro Gly Ser Ser Ser 100 105 110

Met Phe Gly Met Met Thr Glu Asn Gly Pro Cys Phe Val Asn Ala Asp 115 120 125

Ser Asn Ser Thr Arg Leu Asn Pro His Ser Trp Asn Asn Glu Val Asn 130 135 140

Met Leu Tyr Ile Asp Gln Pro Val Gln Val Gly Leu Ser Tyr Asp Thr 145 150 155 160

Leu Ala Asn Phe Thr Arg Asn Leu Val Thr Asp Glu Ile Thr Lys Leu 165 170 175

Lys Pro Gly Glu Pro Ile Pro Glu Gln Asn Ala Thr Phe Leu Val Gly 180 185 190

Thr Tyr Ala Ser Arg Asn Met Asn Thr Thr Ala His Gly Thr Arg His 195 200 205

Ala Ala Met Ala Leu Trp His Phe Ala Gln Val Trp Phe Gln Glu Phe 210 215 220

Pro Gly Tyr His Pro Arg Asn Asn Lys Ile Ser Ile Ala Thr Glu Ser 225 230 235 240

Tyr Gly Gly Arg Tyr Gly Pro Ala Phe Thr Ala Phe Phe Glu Glu Gln 245 250 255

Asn Gln Lys Ile Lys Asn Gly Thr Trp Lys Gly His Glu Gly Thr Met 260 265 270

His Val Leu His Leu Asp Thr Leu Met Ile Val Asn Gly Cys Ile Asp 275 280 285

Arg Leu Val Gln Trp Pro Ala Tyr Pro Gln Met Ala Tyr Asn Asn Thr 290 295 300

Tyr Ser Ile Glu Ala Val Asn Ala Ser Ile His Ala Gly Met Leu Asp 305 310 315 320

Ala Leu Tyr Arg Asp Gly Gly Arg Asp Ile Asn His Cys Arg 325 330 335

Ser Leu Ser Ser Val Phe Asp Pro Glu Asn Leu Gly Ile Asn Ser Thr 340 345 350

Val Asn Asp Val Cys Asp Ala Glu Thr Phe Cys Ser Asn Asp Val 355 360 365

Arg Asp Pro Leu Phe Ser Gly Arg Asn Tyr Asp Ile Gly 370 375

Gln Leu Asp Pro Ser Pro Phe Pro Ala Pro Phe Met Ala Trp Leu 385 390 395 400

Asn Gln Pro His Val Gln Ala Ala Leu Gly Val Pro Leu Asn Trp Thr 405 415

Gln Ser Asn Asp Val Val Ser Thr Ala Phe Arg Ala Ile Gly Asp 425 430

Pro Arg Pro Gly Trp Leu Glu Asn Leu Ala Leu Leu Glu Asn Gly 435 440 445

Ile Lys Val Ser Leu Val Tyr Gly Asp Arg Asp Tyr Ala Asn Trp 450 455 460

Phe Gly Gly Glu Leu Ser Ser Leu Gly Ile Asn Tyr Thr Asp Thr His 465 470

Glu Phe His Asn Ala Gly Tyr Ala Gly Ile Gln Ile Asn Ser Ser 485 490 495

Ile Gly Gly Gln Val Arg Gln Tyr Gly Asn Leu Ser Phe Ala Arg Val 500 505

Glu Ala Gly His Glu Val Pro Ser Tyr Gln Pro Glu Thr Ala Leu 515 525

Gln Ile Phe His Arg Ser Leu Phe Asn Asp Ile Ala Thr Gly Thr 530 535 540

Lys Asp Thr Ser Ser Arg Met Asp Gly Gly Lys Phe Gly Thr Ser 545 550 555 560

Gly Pro Ala Asp Ser Phe Gly Phe Asn Lys Pro Pro Pro Gln His 565 570 575

Val His Phe Cys His Ile Leu Asp Thr Ser Thr Thr Lys Glu Gln 585 590

Ile Gln Ser Val Glu Asn Gly Thr Ala Ala Val Arg Ser Trp Ile Ile 595 605

Val Asp Ser Asn Ser Thr Ser Leu Phe Pro Glu Val Val Gly Ser Gly 610 615 620

Glu Pro Thr Pro Thr Pro Met Pro Gly Gly Ala Thr Thr Leu Ser Ala 625 630 635 640

His Gly Phe Leu Tyr Gly Val Thr Leu Trp Ala Val Ile Val Val Ala 645 650 655

Val Ile Glu Leu Ala Met 660

<210> SEQ ID NO 16
<211> LENGTH:
<212> TYPE: DNA
<213> ORGANISM: Trichophyton rubrum

<400> SEQUENCE: 16
atgcgctttg Ctgctago at tcc.gtggcc ctgcc agt cattcacgcggc gagtgct caa ggct tcc.ctic cacccgittaa gggcgtcacc gtggtcaaat C caagttcga caaaacgta 120
aagat cacat acaaggaggt atgtgtttac at catttitca catccagat c titat atccitt 180


85 90 95

Pro Leu Ser Ile Trp Leu Asn Gly Gly Pro Gly Ser Ser Ser Met Ile 100 105 110

Gly Leu Phe Gln Glu Asn Gly Pro Trp Val Asn Glu Asp Ser 115 120 125

Ser Thr Thr Asn Asn Ser Phe Ser Trp Asn Asn Lys Val Asn Met Leu 130 135 140

Tyr Ile Asp Gln Pro Asn Gln Val Gly Phe Ser Asp Val Pro Thr 145 150 155 160

Asn Ile Thr Ser Thr Ile Asn Asp Thr Ile Ser Val Ala Asp Phe 165 175

Ser Asn Gly Val Pro Ala Gln Asn Leu Ser Thr Leu Val Gly Thr Gly 180 185 190

Ser Ser Gln Asn Pro Trp Ala Thr Ala Asn Asn Thr Val Asn Ala Ala 195

Arg Ser Ile Trp His Phe Ala Gln Val Trp Phe Gln Glu Phe Pro Glu 210 215 220

His Pro Asn Asn Asn Ile Ser Ile Trp Thr Glu Ser Gly 225 230 235 240

Gly Arg Gly Pro Ser Phe Ala Ser Tyr Phe Gln Glu Gln Asn Glu 245 250 255

Ile Asn His Thr Ile Thr Glu Glu Gly Glu Met His Ile Leu 260 265 270

Asn Leu Asp Thr Leu Gly Ile Ile Asn Gly Cys Ile Asp Leu Met Phe 275 285

Gln Ala Glu Ser Tyr Ala Glu Phe Pro Asn Asn Thr Gly Ile 290 295 300

Lys Ala Thr Glu Arg Asp Ala Ile Leu His Asp Ile His 305 310 315

Arg Pro Asp Gly Cys Phe Asp Val Thr Arg Glu Ala Ala 325 330 335

Glu Gly Asp Pro His Phe Ser Asn Asn Ala Thr Val Asn Thr 340 345 350

Ile Ala Asp Ala Asn Ser Ala Asp Leu Met Asp Pro 355 360 365

Phe Gln Glu Thr Asn Leu Gly Tyr Asp Ile Ala His Pro Leu Gln 370 375 380

Asp Pro Phe Pro Pro Pro Phe Gly Phe Leu Ser Gln Ser Ser 385 390 395 400

Val Leu Ser Asp Met Gly Ser Pro Val Asn Phe Ser Gln Tyr Ala Gln 405 415

Ala Val Gly Lys Ser Phe His Gly Val Gly Asp Ala Arg Pro Asp 425 430

Val Arg Gly Phe Thr Gly Asp Ile Ala Leu Leu Glu Ser Gly Val 435 440 445

Val Ala Leu Val Gly Asp Arg Asp Ile Asn Trp Phe 450 455 460

Gly Gly Glu Gln Val Ser Leu Gly Leu Asn Tyr Thr Gly Thr Gln Asp 465 470 475

Phe His Arg Ala Lys Ala Asp Val Lys Val Asn Ser Ser Tyr Val 485 490 495

Gly Gly Val Val Arg Gln His Gly Asn Phe Ser Phe Thr Arg Val Phe 500 505 510

Glu Ala Gly His Glu Val Pro Gly Gln Pro Glu Thr Ala Leu Lys 515 525

Ile Phe Glu Arg Ile Met Phe Asn Asp Ile Ser Thr Gly Glu Ile 530 535 540

Asp Ile Ala Gln Lys Pro Asp Tyr Gly Thr Thr Gly Thr Glu Ser Thr 545 550 555 560

Phe His Ile Lys Asn Asp Ile Pro Pro Ser Pro Glu Pro Thr Cys Tyr 565 570 575

Leu Leu Ser Ala Asp Gly Thr Cys Thr Pro Glu Gln Leu Asn Ala Ile 585 590

Asp Gly Thr Ala Val Val Glu Asn Ile Ile Lys Ser Pro Ala 595 605

Ala Ser Gly Asn Pro Pro Pro Thr Thr Thr Ser Ser Pro Thr Ala 610 615

Ala Pro Thr Ala Gly Ser Ala Met Leu Ala Pro Val Ala Met Leu 625 630 635 640

Ala Ile Ser Ala Leu Thr Val Leu Ala Phe Phe Leu 645 650

<210> SEQ ID NO 19
<211> LENGTH: 1795
<212> TYPE: DNA
<213> ORGANISM: Trichophyton rubrum
<220> FEATURE:
<221> NAME/KEY: modified_base
<222> LOCATION: (283)
<223> OTHER INFORMATION: a, c, g, t, unknown or other

<400> SEQUENCE: 19
atgcaa.gcag caaaattgtt gag.ccggtac tggcaaaatg tacctggitta gtgcagotaa 60
t ctitgagt ca cat catgcat agitta accga gtat cacaac acaat Ctact attgcgttitt 120
tgctaatggc taccatagga agactgaggg tat ctdagct ccttitt cqat gtc. cctittag 180
actact Caaa. ccc.gtottco actitcgct co ggttgttcgc Caggagtgttg Cagcggcgaa 240
titcCagggtc citct ct cqat gataaagaca gacagct acc ctinggattgt titt cotgcag 300
ggtggaccag gaggagcttg CCCaCaac Ct Caggagg tag gctgggttgg gcc attgctg 360
gatcgaggat tcc aggtgag t ct coagaat cgggatgagt aactgtagaa caccittgttg aatttcttga ttagat cott ctic cttgacc agc gaggaac agggctitt.ca accoctataa. gcttgct citt Cagggaaacg Cagtaaagca agc.cgaatat cittaggctat 540
taatat cqtg cgagactgtg aag cagtgcg taalactattg actgct tatt accctic caga taa.gcagaaa tggagcgt.cc ttggc.ca.gag ttittggagga ttctgtgcc.g 660
t cacg tatgt ttctaagtag tgagtaacta CtcCttcaaa. tccacctgct atagattgtc 720
gtgcaaatct aac Ctt Catc. atc tagt cct gagggactta aagaagttctt cacaactggit ggattacc cc citc.ttgttgtc aaa.gc.ctgat Cctgtgtacg agaggacct a cggtaagttg 840
ggatagattg ggctatttitt agtttaat at acagctgaca tctacagaca agg to cagtic 900
ccggaataaa gtgtactatt c. cact t t c cc cgalagacgaa gat.cgagtgc ggattatact 960
caag catcto caaacccacg atgttaagct CCC catggc t caccgittaa Ctc.cggaacg ctitt ct coag c taggaattic attittggaat gaaaggtacg c catact tcg Caggtgactt 1080
citcgtaacca atgactaa.ca tatgcatata gggggcatcg gct tagttca tag tatgata 1140
CCat Caataa. Cttacattat act tattoac tgactaacaa tgtcgaaata t caggcatala 1200
ttittgaagtg cattaatgaa ctggaatact ttggct tcct cacacgacct act titat citc. 1260


Met Gln Ala Ala Lys Leu Leu Ser Arg Tyr Trp Gln Asn Val Pro Gly 10 15

Arg Leu Arg Val Ser Glu Leu Leu Phe Asp Val Pro Leu Asp Ser 25 30

Asn Pro Ser Ser Thr Ser Leu Arg Leu Phe Ala Arg Ser Val Gln Arg 35 40 45

Arg Ile Pro Gly Ser Ser Leu Asp Asp Asp Arg Gln Leu Pro Trp 50 55 60

Ile Val Phe Leu Gln Gly Gly Pro Gly Gly Ala Pro Gln Pro Gln 65 70

Glu Val Gly Trp Val Gly Pro Leu Leu Asp Arg Gly Phe Gln Ile Leu 85 90 95

Leu Leu Asp Gln Arg Gly Thr Gly Leu Ser Thr Pro Ile Thr Ala Ala 105 110

Thr Leu Ala Leu Gln Gly Asn Ala Val Gln Ala Glu Leu Arg 115 120 125

Leu Phe Arg Ala Asp Asn Ile Val Arg Glu Ala Val Arg 130 135 140

Leu Leu Thr Ala Tyr Tyr Pro Pro Asp Gln Trp Ser Val Leu 145 150 155 160

Gly Gln Ser Phe Gly Gly Phe Ala Val Thr Val Ser Asn Pro 165 170 175

Glu Gly Leu Lys Glu Val Phe Thr Thr Gly Gly Leu Pro Pro Leu Val 180 185 190

Ser Pro Asp Pro Val Glu Arg Thr Asp Lys Val Gln Ser 195

Arg Asn Val Tyr Ser Thr Phe Pro Glu Asp Glu Asp Arg Val 210 215

Arg Ile Ile Leu Lys His Leu Gln Thr His Asp Val Leu Pro Asp 225 230 235 240

Gly Ser Pro Leu Thr Pro Glu Arg Phe Leu Gln Leu Gly Ile His Phe 245 250 255

Gly Met Gly Ile Ile Leu Cys Ile Asn Glu Leu Glu Phe 260 265 270

Gly Phe Leu Thr Pro Thr Leu Ser Leu Ile Glu Asn Asp Thr Ser 285

Ala Asp Asn Gly Ile Leu Tyr Ala Ile Met His Glu Ser Ile 290 295 300

Gln Gly Glu Ala Ser Asn Trp Ala Ala Glu Arg Leu Leu Pro Phe 305 310 315

Ser Gly Phe Arg Gly Ala His Asn Pro Asp Gly Ile Phe Thr Gly 325 330 335

Glu Met Val Tyr Lys His Trp Phe Glu Ser Ser Thr Glu Leu Gly Gln 340 345 350

Leu Glu Val Ala Asp Ile Leu Ala Ser Asn Asp Trp Pro Gln 355 360 365

Leu Tyr Asp Glu Gln Leu Ala Arg Asn Glu Val Pro Val Ser 370 375

Ala Thr Val Glu Asp Met Val His Phe Ser Ala Asn Glu 385 390 395 400

Thr Ala Ala Thr Ile His Asn Gln Phe Ile Thr Asn Thr Met 405 410 415

His Asn Gly Leu Arg Ser Asp Ser Ala Glu Leu Ile Ala Gln Leu

gatgcttctic ticttgaccga agctgagaga aagtgggtga atgattacca togaaagtic 1800
tgggagaaga C cagtic cctt Ctttgagaag gacgagittaa Caaccgc.ctg gctaaag.cgc 1860
gaga cacaac ctatttaa 1878

<210> SEQ ID NO 24
<211> LENGTH: 625
<212> TYPE: PRT
<213> ORGANISM: Trichophyton rubrum

<400> SEQUENCE: 24

Met Pro Pro Pro Pro Val Asp Thr Thr Gln Arg Leu Ala Leu Arg 1 5 15

Glu Leu Met Ala Gln Asn Val Asp Val Ile Val Pro Ser Glu 25 30

Asp Ser His Gln Ser Glu Ile Ala Pro Asp Gly Arg Arg Ala 35 40 45

Phe Ile Ser Ser Phe Thr Gly Ser Ala Gly Ala Ile Val Ser Met 50 55 60

Ser Ala Ala Leu Ser Thr Asp Gly Arg Phe Ser Gln Ala Ala 65 70

Gln Leu Asp Ala Asn Trp Ile Leu Leu Arg Gly Val Glu Gly 85 90 95

Val Pro Thr Trp Glu Glu Trp Thr Ala Glu Gln Ala Glu Thr Arg Gln 105 110

Gly Gly Ser Asp Ala Arg Lys Leu Ser Gln Thr Leu Thr Thr 115 120 125

Gly Gly Ser Leu Val Gly Ile Asp Gln Asn Leu Ile Asp Ala Val Trp 130 135 140

Gly Asp Glu Arg Pro Ala Arg Pro Ala Asn Gln Ile Thr Val Gln Pro 145 150 155 160

Val Glu Arg Ala Gly Ser Phe Glu Glu Val Glu Asp Leu Arg 165 170 175

Glu Leu Thr Ala Arg Ser Ala Met Val Ile Ser Ser Lys 180 185 190

Phe Leu Tyr Trp Leu Ser Leu Thr Ser His Ala Asp 195 205

Trp Cys Ser Ile Pro Asn Pro Val Phe Phe Ser Ala Ile Val 210 215 220

Thr Pro Ser Val Ala Glu Leu Val Asp Glu Ser Leu Ser Pro 225 230 235 240

Glu Ala Arg His Leu Glu Gly Val Val Leu Pro Tyr Glu 245 250 255

Ser Ile Phe Gln Ala Ser Val Leu Ala Glu Ser Ala Ser Ala 260 265 270

Ser Ser Gly Ser Ser Gly Phe Leu Leu Ser Asn Lys Ala Ser Trp 280 285

Ser Leu Ser Leu Ala Leu Gly Gly Glu Gln Asn Val Val Glu Val Arg 290 295 300

Ser Pro Ile Thr Asp Ala Ala Ile Asn Glu Val Glu Leu Glu 305 310 315

Gly Phe Arg Cys His Ile Arg Asp Gly Ala Ala Leu Ile Glu Tyr 325 330 335

Phe Ala Trp Leu Glu Asn Ala Leu Ile Glu Gly Ala Lys Leu Asp 340 345 350

Glu Val Asp Gly Ala Asp Lys Leu Phe Glu Ile Arg Lys Lys Tyr Asp 355 360 365

Leu Phe Val Gly Asn Ser Phe Asp Thr Ile Ser Ser Thr Gly Ala Asn 370 375

Gly Ala Thr Ile His Tyr Lys Pro Glu Lys Ser Thr Cys Ala Ile Ile 385 390 395 400

Asp Pro Ala Met Tyr Leu Cys Asp Ser Gly Gly Gln Leu Asp 405 415

Gly Thr Thr Asp Thr Thr Arg Thr Leu His Phe Gly Glu Pro Thr Glu 425 430

Phe Gln Lys Lys Ala Tyr Ala Leu Val Leu Gly His Ile Ser Ile 435 440 445

Asp Asn Ala Ile Phe Pro Lys Gly Thr Thr Gly Tyr Ala Ile Asp Ser 450 455 460

Phe Ala Arg Gln His Leu Trp Lys Glu Gly Leu Asp Leu His Gly 465 470 480

Thr Gly His Gly Val Gly Ser Phe Leu Asn Val His Glu Gly Pro Met 485 490 495

Gly Ile Gly Ser Arg Ala Gln Tyr Ala Glu Val Pro Leu Ser Ala Ser 500 505

Asn Ser Leu Asp Ile Met Lys Thr Ala Thr Ser Ala Phe Val Ser Arg 515 525

Val Ser Ser Met Thr Ala Tyr Ser Ser Phe Phe Ile Leu Thr Ala Ser 530 535 540

Leu Asp Leu Val Ile Cys Lys Glu Val Gln Thr Ala His Phe Gly 545 550 555 560

Asp Pro Phe Leu Gly Phe Glu Ser Ile Thr Leu Val Pro Phe Cys 565 570 575

Gln Leu Leu Asp Ala Ser Leu Leu Thr Glu Ala Glu Arg 585 590

Val Asn Asp Tyr His Ala Lys Val Trp Glu Lys Thr Ser Pro Phe Phe 595 605

Glu Lys Asp Glu Leu Thr Thr Ala Trp Leu Lys Arg Glu Thr Gln Pro 610 615 620

Ile 625

<210> SEQ ID NO 25
<211> LENGTH: 2344
<212> TYPE: DNA
<213> ORGANISM: Trichophyton rubrum

<400> SEQUENCE: 25
at Caac Ct. Ca Cct Ctt Cacc gtct cacgcc citt cqtc.ccg to caact citt cattt cqccc 60
tctic tatgat aac Caacaaa. catcc.gctgt tatgtaatcg aac cc.gc.cgt. tagc.catcc c 120
tagcc.ccg.cg titt to tcc ca. gcatcaatac gaccgaaatg alagacagacg gggalaga.cga 180
ggcaaaacaa talacacat Ca acaatttaac Ctt CtaCCCa t cittgtctac 240
gcatcgt.cca acct t t tott gcc ctatat c agc.cgaactic ggc catcatg gatat coacg 300
tcqacaaata cc.cggctaag agt cacgc.ca cgaga agctic aaggcc.gcgg 360
ggcacggctic taccggcatc aaggccaaaa ggagcatatt atcgatgata gcgacgagcc gttt cactitc tgggaataca Ctcgactggg cggaataagc 480
taacaaaagg gtgttgatagt Caacgc.cgaa act tcc tota tctgtc.cggc tgtcttgagg 540

citcc ct cittg agc.ccgc.cga agccttgaag Cagttcgatg ttgatgc.cgit gct cotcaca 360
actgagataa acaact at Ct cgcgaagtgt ggggg.cgaga aggtott cac cattgcagac agagtttgcc cggaggtotC citt cit catcc. ttcaa.gcaca acgacaccga tgc cctgaag cittgc.cat cq agt cct gcc.g tat agtgaaa gacgagtatg aaattggtct t ct cogacgt. 540
gctaatgagg t ct coagc.ca agct catatt gaagtgatga aagcc.gcaac Caagttcaaag aacgagagag agctictatgc tactCtcaac tatgtctgca tgtctaatgg ctgct cogac 660
cagt cittacc atcCaatt Ct tgcatgtggc cc caatgctg CCaCtect CCa ctacaccaag 720
aacaacggtg accitalactaa cc.cggctacc gggattalagg accagct cqt actitat coac gctggatgcc agtaca aggc gtact.gtgca gatat cactic gtgcatt coc Cttgtc.cggc 840
aaatticacca cggagggcc.g ccagatctat gatattgcct tggagatgca gaaagtc.gc.g 900
tittggcatga toaaac Ctaa tgttttgttc gacga catgc atgctg.cggit ccaccgggitt 960
gcgatcaagg ggctgctcaa gattggcatt ct cactggct Ctgaggatga gattitt.cgat aagggaat Ca gcactgcc tt titt cocacat ggt ct aggcc accatct cqg catggacact 1080
cacgatgttg gaggaaac cc taacccggct gaccc.gaatc gcatgtttaa at acttgcgt. 1140
Ctgcgaggca ctgttccaga gggat.ccgt.c attacaattg agc.ccggtgt c tact tctgc 1200
cgttacat ca ttgagc catt CCttactaac CCC gagacca gcaagtacat caact cogala 1260
gttctaga.ca agtactgggc tgttggaggit gtacgitatcg aggacaacgt. cgt.cgt.ccgc 1320
gccalatggct ttgagaac ct gaccacggtg ccalaaggagc CC9aggaggt cgaacgcatt 1380
gtcCaggagg gtgctaaata a. 1401

<210> SEQ ID NO 27
<211> LENGTH: 466
<212> TYPE: PRT
<213> ORGANISM: Trichophyton rubrum

<400> SEQUENCE: 27

Pro Asn Ser Ala Ile Met Asp Ile His Val Asp Pro Ala Lys 1 5 10 15

Ser His Ala Arg Arg Val Ala Glu Lys Leu Lys Ala Ala Gly His Gly 25 30

Ser Thr Gly Ile Ile Phe Val Glu Gly Gln Lys Glu His Ile Ile Asp 35 40 45

Asp Ser Asp Glu Pro Phe His Phe Arg Gln Arg Arg Asn Phe Leu Tyr 50 55 60

Leu Ser Gly Cys Leu Glu Ala Glu Cys Ser Val Ala Asn Ile Glu 65 70 80

Lys Asp Glu Leu Thr Leu Phe Ile Pro Pro Val Asp Pro Ala Ser Val 85 90 95

Met Trp Ser Gly Leu Pro Leu Glu Pro Ala Glu Ala Leu Lys Gln Phe 105 110

Asp Val Asp Ala Val Leu Leu Thr Thr Glu Ile Asn Asn Leu Ala 115 120 125

Gly Glu Lys Val Phe Thr Ile Ala Asp Arg Val Cys Pro 130 135 140

Glu Val Ser Phe Ser Ser Phe Lys His Asn Asp Thr Asp Ala Leu Lys 145 150 155 160

Leu Ala Ile Glu Ser Cys Arg Ile Val Lys Asp Glu Glu Ile Gly 165 170 175

Leu Leu Arg Arg Ala Asn Glu Val Ser Ser Gln Ala His Ile Glu Val 180 185 190

Met Ala Ala Thr Lys Ser Lys Asn Glu Arg Glu Leu Tyr Ala Thr 195 205

Leu Asn Tyr Val Cys Met Ser Asn Gly Cys Ser Asp Gln Ser Tyr His 210 215

Pro Ile Leu Ala Cys Gly Pro Asn Ala Ala Thr Leu His Thr Lys 225 230 235 240

Asn Asn Gly Asp Leu Thr Asn Pro Ala Thr Gly Ile Asp Gln Leu 245 250 255

Val Leu Ile Asp Ala Gly Cys Gln Tyr Lys Ala Ala Asp Ile 260 265 270

Thr Arg Ala Phe Pro Leu Ser Gly Lys Phe Thr Thr Glu Gly Arg Gln 285

Ile Tyr Asp Ile Ala Leu Glu Met Gln Lys Val Ala Phe Gly Met Ile 290 295 300

Lys Pro Asn Val Leu Phe Asp Asp Met His Ala Ala Val His Arg Val 305 310 315 320

Ala Ile Gly Leu Leu Lys Ile Gly Ile Leu Thr Gly Ser Glu Asp 325 330 335

Glu Ile Phe Asp Lys Gly Ile Ser Thr Ala Phe Phe Pro His Gly Leu 340 345 350

Gly His His Leu Gly Met Asp Thr His Asp Val Gly Gly Asn Pro Asn 355 360 365

Pro Ala Asp Pro Asn Arg Met Phe Lys Tyr Leu Arg Leu Arg Gly Thr 370 375

Val Pro Glu Gly Ser Val Ile Thr Ile Glu Pro Gly Val Phe Cys 385 390 395 400

Arg Ile Ile Glu Pro Phe Leu Thr Asn Pro Glu Thr Ser Lys Tyr 405 410 415

Ile Asn Ser Glu Val Leu Asp Lys Tyr Trp Ala Val Gly Gly Val Arg 425 430

Ile Glu Asp Asn Val Val Val Arg Ala Asn Gly Phe Glu Asn Leu Thr 435 440 445

Thr Val Pro Lys Glu Pro Glu Glu Val Glu Arg Ile Val Gln Glu Gly 450 455 460

Ala 465

<210> SEQ ID NO 28
<211> LENGTH: 1730
<212> TYPE: DNA
<213> ORGANISM: Microsporum canis

<400> SEQUENCE: 28
atgaagacac agttgttgag tctgggagtt gcc ct cacgg CCatctotCa gggcgittatt 60
gctgaggatg ccttgaactg gcc attcaag cc.gttggitta atgctgtgag tatatacaca 120
agat.cgat.cg atcgtcct ct tgtcc ctdtc actitat cqct Ctacagtaag Caaaaatact 180
ggagaatcat gtgctgatgt aaatgtatag gatgacctgc aaaacaagat taa.gct caag 240
gatctt atgg Ctggcgtaca gaaacticcaa gactitcqcct acgct caccc tgagaagaat 300
cgagtatt.cg ccacaaggat acct cqact ggat.ctacaa tgagct caag 360
gctaccggct actacgatgt gaagatgcag ccacaagt cc acctgtggit c t catgctgag gcagctgtca atgccaatgg caaggat citc actgc.ca.gtg c catgtc.cta cagcc ct coa 480

accgc.cgagg agttcggc ct t ct cqgcagc actitt citacg tcgacagcct tacgaccgt. 960
gaactgcaca aggtoaa.gct gtacct caac titcgacatga ttggctic ccc caactitcgc.c aaccagat ct acgacggaga cggct Cogcc tacaa.catga Ctggcc.ccgc cggatctgct gaaatcgagt acctgttcga gaagttctitc gatgacCagg gaatc.ccaca c cagoccacc 1140
gcct tcaccg gcc.gct Coga c tact ctdcc ttcatcaagc gcaacgt.ccc tic.cggaggit 1200
ctgttt actg gtgctgaggt cgt caagacic gcc.gagc agg Ctaagctatt toggc gag 1260
cittatgacaa gaactaccac ggcaa.gggcg acactgtaga caa.catcaac 1320
aagggtgcta totacct cala cactic gagga atcgc.gt atg c cactgctica gitatgctagt gatt.cccaac cc.gc.ccaaag acgggtaagc gtgacgtgag ccc.ccgtggc 1440
cagtictatgc atgcggacac cacagcgt.ct t catgitaa 1488

SEQ ID NO 3 O LENGTH: 495 TYPE : PRT ORGANISM: Microsporum cani S

< 4 OOs SEQUENCE: 3 O

Met Lys Thr Gln Leu Leu Ser Leu Gly Val Ala Leu Thr Ala Ile Ser 1 5 10 15

Gln Gly Val Ile Ala Glu Asp Ala Leu Asn Trp Pro Phe Lys Pro Leu 25 30

Val Asn Ala Asp Asp Leu Gln Asn Lys Ile Lys Leu Lys Asp Leu Met 35 40 45

Ala Gly Val Gln Lys Leu Gln Asp Phe Ala Tyr Ala His Pro Glu Lys 50 55 60

Asn Arg Val Phe Gly Gly Ala Gly His Lys Asp Thr Val Asp Trp Ile 65 70

Asn Glu Leu Lys Ala Thr Gly Tyr Tyr Asp Val Met Gln Pro 85 90 95

Gln Val His Leu Trp Ser His Ala Glu Ala Ala Val Asn Ala Asn Gly 105 110

Asp Leu Thr Ala Ser Ala Met Ser Tyr Ser Pro Pro Ala Asp Lys 115 120 125

Ile Thr Ala Glu Leu Val Leu Ala Lys Asn Met Gly Asn Ala Thr 130 135 140

Asp Pro Glu Gly Thr Lys Gly Lys Ile Val Leu Ile Glu Arg Gly 145 150 155 160

Val Ser Phe Gly Glu Lys Ser Ala Gln Ala Gly Asp Ala Lys Ala 165 170 175

Ile Gly Ala Ile Val Tyr Asn Asn Val Pro Gly Ser Leu Ala Gly Thr 180 185 190

Leu Gly Gly Leu Asp Asn Arg His Ala Pro Thr Ala Gly Ile Ser Gln 195

Ala Asp Gly Lys Asn Leu Ala Ser Leu Val Ala Ser Gly Lys Val Thr 210 215 220

Val Thr Met Asn Val Ile Ser Lys Phe Glu Asn Arg Thr Thr Trp Asn 225 230 235 240

Val Ile Ala Glu Thr Lys Gly Gly Asp His Asn Asn Val Ile Met Leu 245 250 255

Gly Ser His Ser Asp Ser Val Asp Ala Gly Pro Gly Ile Asn Asp Asn 260 265 270

Gly Ser Gly Thr Ile Gly Ile Met Thr Val Ala Lys Ala Leu Thr Asn 285

Phe Lys Val Asn Asn Ala Val Arg Phe Gly Trp Trp Thr Ala Glu Glu 290 295 300

Phe Gly Leu Leu Gly Ser Thr Phe Tyr Val Asp Ser Leu Asp Asp Arg 305 310 315

Glu Leu His Lys Val Lys Leu Tyr Leu Asn Phe Asp Met Ile Gly Ser 325 330 335

Pro Asn Phe Ala Asn Gln Ile Tyr Asp Gly Asp Gly Ser Ala Tyr Asn 340 345 350

Met Thr Gly Pro Ala Gly Ser Ala Glu Ile Glu Tyr Leu Phe Glu Lys 355 360 365

Phe Phe Asp Asp Gln Gly Ile Pro His Gln Pro Thr Ala Phe Thr Gly 370 375

Arg Ser Asp Tyr Ser Ala Phe Ile Lys Arg Asn Val Pro Ala Gly Gly 385 395 400

Leu Phe Thr Gly Ala Glu Val Val Lys Thr Ala Glu Gln Ala Lys Leu 405 410 415

Phe Gly Gly Glu Ala Gly Val Ala Tyr Asp Lys Asn Tyr His Gly Lys 420 425 430

Gly Asp Thr Val Asp Asn Ile Asn Lys Gly Ala Ile Tyr Leu Asn Thr 435 440 445

Arg Gly Ile Ala Tyr Ala Thr Ala Gln Tyr Ala Ser Ser Leu Arg Gly 450 455 460

Phe Pro Thr Arg Pro Lys Thr Gly Val Ser Pro Arg Gly 465 470

Gln Ser Met Pro Gly Gly Gly Cys Gly His His Ser Val Phe Met 485 490 495

<210> SEQ ID NO 31 <211> LENGTH: 1775 <212> TYPE: DNA <213> ORGANISM: Trichophyton mentagrophytes

<400> SEQUENCE: 31

atgaagtc.gc aactgttgag Cctagocgtg gcc.gt cacca CCattitcCCa gggcgttgtt 60 ggtcaagagc CCtttggatg gcc cttcaag cctatggtca Ctcaggtgag ttgctgtcaa 120 cagat.cgatc gatcgatcta cct tcgtocc tgtcaccitat aactic cacag Caggaccaag 180 aaaacacaag tttitc.cgggg aattct tatg tgctgatgta aatgtatagg atgacctgca 240 aaacaagatt aagct caagg atat catggc aggtgtcgag aagctgcaaa gcttittctga 300 tgct catcct gaaaagaacc gagtgttcgg tggtaatggc Cacaaggaca Ctgtcgagtg 360 gatctacaat gagctcaagg ccaccggcta ctacaatgtg aagaa.gcagg agcaggtaca

Cctgtggit ct cacgctgagg cc.gct ct cag tgccaatggc aaggacctica aggc.ca.gc.gc catgtcgtac agc cct cotg c caacaagat Catggcc.gag cittgtcgttg c caagaacaa 54 O tggctgcaat gctgtaagtg CCatacaCtt CCtataCatc. acatt Cactt. tagaatgaag agcgcgggag aactgattitt tttitt tttitt tttitt tttitt tgtaiacagac cgattaccca 660 gagaac actic agggaaagat agt cct catt Cagcgtggtg tctgcagctt cgg.cgagaag 72 O tottct cagg Ctggtgatgc gaaggctatt ggtgcc.gttg totacaacaa. cgt.ccc.cgga

gcacticttgg tggccttgac aag.cgc.catg tcc caac cqc tggtc.tttcc 84 O

Caggaggatg gaaagaat ct tgctagocto gttgcttctg gcaaggttga tgtcaccatg 9 OO