
(12) United States Patent (10) Patent No.: US 8,828,703 B2 Ladner (45) Date of Patent: Sep. 9, 2014

(54) PROTEASE INHIBITION

(75) Inventor: Robert C. Ladner, Ijamsville, MD (US)

(73) Assignee: Dyax Corp., Burlington, MA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1314 days.

(21) Appl. No.: 11/646,148

(22) Filed: Dec. 27, 2006

(65) Prior Publication Data: US 2008/0255025 A1, Oct. 16, 2008

Related U.S. Application Data

(60) Provisional application No. 60/754,903, filed on Dec. 29, 2005.

(51) Int. Cl.: C12N 9/48 (2006.01); C12N 15/00 (2006.01); C12P 21/06 (2006.01); C12P 21/04 (2006.01)

(52) U.S. Cl.: USPC 435/212; 435/69.1; 435/71.1; 435/440

(58) Field of Classification Search: USPC 435/212, 69.1, 71.1, 440. See application file for complete search history.

(56) References Cited

OTHER PUBLICATIONS

Branden et al., Introduction to Protein Structure, Garland Publishing Inc., New York, p. 247, 1991.*
Sheffield, Modification of clearance of therapeutic and potentially therapeutic proteins, Curr Drug Targets Cardiovasc Haematol Disord., Jun. 2001;1(1):1-22.*
Wriggers et al., Control of protein functional dynamics by peptide linkers, Biopolymers, 2005;80(6):736-46.*
International Search Report of International Application No. PCT/US06/49322 mailed Dec. 7, 2007.
Fries et al., Inter-alpha-inhibitor, hyaluronan and inflammation, Acta Biochimica Polonica, 2003, vol. 50, No. 3, pp. 735-742.
Magklara et al., Characterization of the enzymatic activity of human kallikrein 6: Autoactivation, substrate specificity, and regulation by inhibitors, Biochem. Biophys. Res. Commun., Aug. 8, 2003;307(4):948-55, abstract only.
Scarff et al., Targeted Disruption of SPI3/Serpinb6 Does Not Result in Developmental or Growth Defects, Leukocyte Dysfunction, or Susceptibility to Stroke, Molecular and Cellular Biology, May 2004, pp. 4075-4082.
Taggart et al., Inactivation of Human beta-Defensins 2 and 3 by Elastolytic Cathepsins, The Journal of Immunology, 2003, 170:931.*

(Continued)

Primary Examiner: Yong Pak

(74) Attorney, Agent, or Firm: Wolf, Greenfield & Sacks, P.C.

(57) ABSTRACT

Proteins including engineered sequences which inhibit proteases are disclosed, including proteins having two or more engineered Kunitz domains, and uses of such proteins.

7 Claims, No Drawings

(56) References Cited

OTHER PUBLICATIONS

Borregaard and Cowland, "Granules of the Human Neutrophilic Polymorphonuclear Leukocyte," Blood, 89(10):3503-3521 (1997).
Burrage et al., "Matrix Metalloproteinases: Role in Arthritis," Frontiers in Bioscience, 11:529-543 (2006).
Cassim et al., "Kallikrein cascade and cytokines in inflamed joints," Pharmacology & Therapeutics, 94:1-34 (2002).
Churg and Wright, "Proteases and emphysema," Curr. Opin. Pulm. Med., 11:153-159 (2005).
Colman, Robert W., "Plasma and tissue kallikrein in arthritis and inflammatory bowel disease," Immunopharmacology, 43:103-108 (1999).
Cunningham et al., "Structural and functional characterization of tissue factor pathway inhibitor following degradation by matrix metalloproteinase-8," Biochem. J., 367:451-458 (2002).
Dela Cadena et al., "Inhibition of plasma kallikrein prevents peptidoglycan-induced arthritis in the Lewis rat," FASEB J., 9:446-452 (1995).
Devani et al., "Kallikrein-kinin system in inflammatory bowel diseases: Intestinal involvement and correlation with the degree of tissue inflammation," Digestive and Liver Disease, 37:665-673 (2005).
Eigenbrot et al., "Structural effects induced by removal of a disulfide bridge: the X-ray structure of the C30A/C51A mutant of basic pancreatic trypsin inhibitor at 1.6 Å," Protein Engineering, 3(7):591-598 (1990).
Girard et al., "Functional significance of the Kunitz-type inhibitory domains of lipoprotein-associated coagulation inhibitor," Nature, 338:518-520 (1989).
Gulberg et al., "Biosynthesis, processing and sorting of neutrophil proteins: insight into neutrophil granule development," Eur. Journal of Haematology, 58:137-153 (1997).
Hagihara et al., "Screening for Stable Mutants with Pairs Substituted for the Disulfide Bond between Residues 14 and 38 of Bovine Pancreatic Trypsin Inhibitor (BPTI)," The Journal of Biological Chemistry, 277(52):51043-51048 (2002).
Herter et al., "Hepatocyte growth factor is a preferred in vitro substrate for human hepsin, a membrane-anchored serine protease implicated in prostate and ovarian cancers," Biochem. J., 390:125-136 (2005).
Huang et al., "Kinetics of Factor Xa Inhibition by Tissue Factor Pathway Inhibitor," The Journal of Biological Chemistry, 268(36):26950-26955 (1993).
Huang et al., "Novel Peptide Inhibitors of Angiotensin-converting Enzyme 2," The Journal of Biological Chemistry, 278(18):15532-15540 (2003).
Hynes et al., "X-ray Crystal Structure of the Protease Inhibitor Domain of Alzheimer's Amyloid β-Protein Precursor," Biochemistry, 29:10018-10022 (1990).
Kido et al., "Kunitz-type Protease Inhibitor Found in Rat Mast Cells," The Journal of Biological Chemistry, 263(34):18104-18107 (1988).
Kirchhofer et al., "Hepsin activates pro-hepatocyte growth factor and is inhibited by hepatocyte growth factor activator inhibitor-1B (HAI-1B) and HAI-2," FEBS Letters, 579:1945-1950 (2005).
Kuno et al., "Possible involvement of neutrophil elastase in impaired mucosal repair in patients with ulcerative colitis," Journal of Gastroenterology, 37(Suppl XIV):22-32 (2002).
Léonetti et al., "Increasing Immunogenicity of Antigens Fused to Ig-Binding Proteins by Cell Surface Targeting," The Journal of Immunology, 160:3820-3827 (1998).
Mine et al., "Structural Mechanism for Heparin-Binding of the Third Kunitz Domain of Human Tissue Factor Pathway Inhibitor," Biochemistry, 41:78-85 (2002).
Novotny et al., "Purification and Characterization of the Lipoprotein-associated Coagulation Inhibitor from Human Plasma," The Journal of Biological Chemistry, 264(31):18832-18837 (1989).
Petersen et al., "Inhibitory properties of separate recombinant Kunitz-type-protease-inhibitor domains from tissue-factor-pathway inhibitor," Eur. J. Biochem., 235:310-316 (1996).
Pintigny and Dachary-Prigent, "Aprotinin can inhibit the proteolytic activity of thrombin: A fluorescence and an enzymatic study," Eur. J. Biochem., 207:89-95 (1992).
Piro et al., "Role for the Kunitz-3 Domain of Tissue Factor Pathway Inhibitor-α in Cell Surface Binding," Circulation, 110:3567-3572 (2004).
Rahman et al., "Identification and functional importance of plasma kallikrein in the synovial fluids of patients with rheumatoid, psoriatic, and osteoarthritis," Annals of the Rheumatic Diseases, 54:345-350 (1995).
Sprecher et al., "Molecular Cloning, Expression, and Partial Characterization of a Second Human Tissue-Factor-Pathway Inhibitor," Proceedings of the National Academy of Sciences of the United States of America, 91(8):3353-3357 (1994).
Stadnicki et al., "Activation of Plasma Contact and Coagulation Systems and Neutrophils in the Active Phase of Ulcerative Colitis," Digestive Diseases and Sciences, 42(11):2356-2366 (1997).
Taby et al., "Inhibition of Activated Protein C by Aprotinin and the Use of the Insolubilized Inhibitor for its Purification," Thrombosis Research, 59:27-35 (1990).
Tremblay et al., "Anti-inflammatory activity of neutrophil elastase inhibitors," Current Opinion in Investigational Drugs, 4(5):556-565 (2003).
Volpe-Junior et al., "Augmented plasma and tissue kallikrein like activity in synovial fluid of patients with inflammatory articular diseases," Inflamm. Res., 45:198-202 (1996).
Wun et al., "Cloning and Characterization of a cDNA Coding for the Lipoprotein-associated Coagulation Inhibitor Shows that it Consists of Three Tandem Kunitz-type Inhibitory Domains," The Journal of Biological Chemistry, 263(13):6001-6004 (1988).
Delgado, C., "The Uses and Properties of PEG-Linked Proteins," Critical Reviews in Therapeutic Drug Carrier Systems, 9(3,4):249-304 (1992).
Dhalluin, C., "Structural, Kinetic, and Thermodynamic Analysis of the Binding of the 40 kDa PEG-Interferon-α2a and Its Individual Positional Isomers to the Extracellular Domain of the Receptor IFNAR2," Bioconjugate Chem., 16:518-527 (2005).
Molineux, G., "Pegylation: engineering improved pharmaceuticals for enhanced therapy," Cancer Treatment Reviews, 28(Suppl. A):13-16 (2002).
Roberts, M.J., "Chemistry for peptide and protein PEGylation," Advanced Drug Delivery Reviews, 54:459-476 (2002).
Veronese, F.M., "Peptide and protein PEGylation: a review of problems and solutions," Biomaterials, 22:405-417 (2001).
Wark, P.A.B., "DX-890," Drugs, 5(6):586-589 (2002).
U.S. Appl. No. 12/471,875, Ley et al.
Supplementary European Search Report including the European Search Opinion from corresponding European Application No. EP06848187.8, dated Dec. 7, 2009.
Delaria et al., "Characterization of placental bikunin, a novel human serine protease inhibitor," J. Biological Chemistry, vol. 272, No. 18, pp. 12209-12214, May 2, 1997.

* cited by examiner

PROTEASE INHIBITION

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of prior U.S. provisional application 60/754,903, filed Dec. 29, 2005.

BACKGROUND

Aspects of the pathology of many diseases include the excessive activity of proteases. Pharmaceutical agents that reduce such protease activity can be used to ameliorate disease conditions.

SUMMARY

Kunitz domains are robust protease inhibitors. Examples of Kunitz domains include non-naturally occurring Kunitz domains that inhibit human plasma kallikrein, human plasmin, human hepsin, human matriptase, human endotheliase 1 (hET-1), human endotheliase 2 (hET-2), and human neutrophil elastase (also known as human leukocyte elastase). Kunitz domains can be engineered to have very high affinity and specificity for most serine proteases.

Sometimes in a single disease state there is more than one protease whose activity is detrimental. In different patients, a given symptom may result from the aberrant activity of two different proteases, one in one patient, another in another patient. In cases such as these, it would be useful to have a single active pharmacologic agent that inhibits two or more proteases. Often, low-specificity inhibitors (e.g., small molecule inhibitors) cause serious side effects by inhibiting proteases other than the intended targets. A molecule that contains multiple domains, each having high affinity and specificity for a different protease, can be used to provide inhibition of different target proteases without causing side effects due to cross reaction of the inhibitor with protease(s) other than the target(s). This allows the development of a single agent for the treatment of patients with multiple or various dysregulated proteases.

In one aspect, the disclosure features an isolated protein that includes one or more engineered protease inhibitory sequences, wherein the protein inhibits two or more human proteases. For example, the protein inhibits the two or more human proteases, each with a Ki of less than 100 nM, 10 nM, 1 nM, or 0.1 nM.

In one embodiment, the protein of the invention comprises two or more Kunitz domains, each of which inhibits the same protease, and the protein has longer serum residence than would a single Kunitz domain due to increased molecular mass. The protein may or may not be glycosylated.

In one embodiment, the protein contains a single engineered Kunitz domain that inhibits two or more human proteases as the engineered protease inhibitory sequence.

The protease inhibitory sequences can be, for example, Kunitz domains, peptides (e.g., cyclic peptides or linear peptides of length less than 24 amino acids), and variable domains of antigen binding fragments. For example, the protease inhibitory sequences can include the amino acid sequence of a heavy chain variable domain that binds and inhibits a human protease in conjunction with a light chain variable domain.

In one embodiment, the protein includes at least two, three, four, five, six, seven, or eight Kunitz domains, as the engineered protease inhibitory sequences. For example, the protein contains two, three, four, five, or six Kunitz domains. In some cases, at least one of the Kunitz domains comprises the framework of a naturally occurring human Kunitz domain (e.g., TFPI or other human protein mentioned herein), but differs from the naturally occurring human Kunitz domain at no less than one residue in the protease binding loops.

In one embodiment, the protein has a molecular weight of at least 15 kiloDaltons and/or a beta-phase half life of at least 30 minutes in mice. Molecular weight of proteins and/or beta-phase half life may be increased by, for example, modifying the protein to include one or more PEG moieties or by covalently linking the protein to a carrier such as human serum albumin (HSA). Such covalent linkage may be accomplished by covalent cross-linking of the protein and the carrier following production of the protein or by expression of the protein as a fusion protein with the carrier.

In one embodiment, at least one of the Kunitz domains contains only two disulfide bonds. For example, the protein does not include cysteines at positions corresponding to amino-acid positions 14 and 38 in bovine pancreatic trypsin inhibitor (BPTI).

Exemplary proteins can inhibit:

  two or more proteases whose excessive activity contributes to rheumatoid arthritis, multiple sclerosis, Crohn's disease, cancer, and chronic obstructive pulmonary disorder;
  two or more of (e.g., all of) neutrophil elastase, proteinase 3, and cathepsin G;
  two or more of (e.g., all of) human neutrophil elastase, proteinase 3, cathepsin G, and tryptase;
  human plasma kallikrein and human neutrophil elastase;
  two or more of (e.g., all of) human kallikrein 6 (hK6), human kallikrein 8 (hK8), and human kallikrein 10 (hK10);
  hK6 and human neutrophil elastase;
  human tissue kallikrein (hK1) and human plasma kallikrein;
  two or more of human plasmin, human hepsin, human matriptase, human endotheliase 1, and human endotheliase 2.

For example, the protein includes at least two Kunitz domains, a first Kunitz domain specific for neutrophil elastase and a second Kunitz domain specific for proteinase 3. The first Kunitz domain can be N-terminal or C-terminal to the second Kunitz domain.

DX-88 is a Kunitz-domain inhibitor of human plasma kallikrein (hpKall) and DX-890 is a Kunitz-domain inhibitor of human neutrophil elastase (hNE). In one embodiment, the protein includes (i) DX-88, a domain at least 85, 90, 92, 95, 96, 97, 98 or 99% identical to DX-88, or a Kunitz domain that includes at least 70, 80, 85, 90, 95, or 100% of the amino acid residues in the protease binding loops of DX-88 and (ii) DX-890, a domain at least 85, 90, 92, 95, 96, 97, 98 or 99% identical to DX-890, or a Kunitz domain that includes at least 70, 80, 85, 90, 95, or 100% of the amino acid residues in the protease binding loops of DX-890.

In one embodiment, the protein includes at least two Kunitz domains, at least two of which are identical or at least 90 or 95% identical. In one example, the protein includes at least two Kunitz domains, each of which is DX-88, a domain at least 85, 90, 92, 95, 96, 97, 98 or 99% identical to DX-88, or a Kunitz domain that includes at least 70, 80, 85, 90, 95, or 100% of the amino acid residues in the protease binding loops of DX-88. In another example, the protein includes at least two Kunitz domains, each of which is DX-890, a domain at least 85, 90, 92, 95, 96, 97, 98 or 99% identical to DX-890, or a Kunitz domain that includes at least 70, 80, 85, 90, 95, or 100% of the amino acid residues in the protease binding loops of DX-890.

In one embodiment, the protein includes at least two Kunitz domains, and the Kunitz domains differ from each other.

In one embodiment, the at least two Kunitz domains are connected by a linker, e.g., a flexible hydrophilic linker. For example, the linker is less than 35, 22, 15, 13, 11, or 9 amino acids in length. The linker can include at least two, three, or four glycine residues. In one embodiment, the at least two Kunitz domains are separated by a serum albumin moiety. In another embodiment, the at least two Kunitz domains are not separated by a serum albumin moiety. In one embodiment, the at least two Kunitz domains are connected by a linker that does not include T cell epitopes. For example, the at least two Kunitz domains are connected by a linker that lacks medium to large hydrophobic residues and charged residues. The linker may or may not include a glycosylation site. For example, the linker has fewer than three, two, or one glycosylation sites, i.e., no glycosylation sites. The linker can be devoid of N-linked glycosylation sites.

In one embodiment, the protein inhibits the human proteases in the presence of neutrophil defensins. The protein can be resistant to degradation by resident proteases, e.g., metalloproteases, e.g., MT6-MMP (Leukolysin). For example, the protein does not include cleavage sites recognized by MMP-8 or MMP-12. In one embodiment, the protein does not inhibit both of factor VII, and . The protein may differ from human bikunin by at least two amino acids.

In another aspect, the disclosure features a method of treating a subject. The method includes administering, to the subject, a protein described herein, e.g., a protein that includes one or more engineered protease inhibitory sequences, wherein the protein inhibits two or more human proteases. For example, the subject is a subject in need of a reduction in the protease activity of at least two different proteases. The subject can have a disorder to which first and second human proteases contribute and the protein administered inhibits the first and second human proteases. For example, the protein includes a Kunitz domain that specifically inhibits the first human protease and a Kunitz domain that specifically inhibits the second human protease.

In another aspect, the disclosure features an isolated protein that includes (i) a Kunitz domain that inhibits a human serine protease, and (ii) a polypeptide sequence that contributes to binding and inhibition of a human protease, e.g., a metalloprotease. In one example, the protein consists of a single polypeptide chain. In another example the protein includes at least two polypeptide chains. In one embodiment, one polypeptide chain includes a heavy chain variable domain of an antigen-binding fragment and another polypeptide chain includes a light chain variable domain of an antigen-binding fragment, such that the antigen-binding fragment binds and inhibits a human protease, e.g., a metalloprotease.

In one embodiment, the Kunitz domain is a component of the same polypeptide chain as the heavy chain variable domain of the antigen-binding fragment. In another embodiment, the Kunitz domain is a component of the same polypeptide chain as the light chain variable domain of the antigen-binding fragment. For example, the Kunitz domain may be C-terminal to a constant domain of an antibody or an antigen-binding fragment thereof or N-terminal to a variable domain. In another embodiment, one Kunitz domain is N-terminal to a variable antibody domain and a second Kunitz domain is N-terminal to the final constant domain of the same chain. The two Kunitz domains may have the same inhibitory activity or different inhibitory activities.

In another aspect, the disclosure features a method of providing a protein that inhibits two or more human proteases. The method includes: providing amino acid sequences of a first Kunitz domain that inhibits a first human protease and a second Kunitz domain that inhibits a second human protease; providing the amino acid sequence of a recipient protein that includes a plurality of Kunitz domains; and preparing a polynucleotide encoding a modified amino acid sequence of the recipient protein, wherein the recipient protein is modified such that (a) amino acids of one of the Kunitz domains in the recipient protein are substituted with amino acid residues that contribute to binding or specificity of the first Kunitz domain and (b) amino acids of another of the Kunitz domains in the recipient protein are substituted with amino acid residues that contribute to binding or specificity of the second Kunitz domain.

In another aspect, the disclosure features a method of providing a protein that inhibits two or more human proteases. The method includes: providing amino acid sequences of a first Kunitz domain that inhibits a first human protease and a second Kunitz domain that inhibits a second human protease; and preparing a polynucleotide encoding a polypeptide in which sequences encoding the first and second Kunitz domains are in frame. The domains can be separated by a sequence encoding a linker sequence. In certain embodiments, (a) the linker does not include medium or large hydrophobic residues, nor charged residues, or (b) MHC I epitopes are not formed by amino acids from the first and second Kunitz domain and the linker. In one embodiment, the first and second human proteases contribute to a single human disease.

These methods can further include expressing the prepared polynucleotide in a cell, e.g., a cell in a transgenic animal (e.g., a mammary cell), or a cultured host cell. The method can further include formulating a protein encoded by the prepared polynucleotide as a pharmaceutical composition.

In another aspect, the invention features a method of treating rheumatoid arthritis. The method includes: administering, to a subject having or suspected of having rheumatoid arthritis, an effective amount of a protein that comprises one or more engineered protease inhibitory sequences. In some embodiments, the protein inhibits both human plasma kallikrein and human tissue kallikrein.

In another aspect, the disclosure features a method of treating lung inflammation. The method includes: administering, to a subject having or suspected of having lung inflammation, an effective amount of a protein that comprises one or more engineered protease inhibitory sequences. In some embodiments, the protein inhibits two or more of human neutrophil elastase, proteinase 3, cathepsin G, and tryptase.

In another aspect, the disclosure features a method of treating multiple sclerosis. The method includes: administering, to a subject having or suspected of having multiple sclerosis, an effective amount of a protein that comprises one or more engineered protease inhibitory sequences. In some embodiments, the protein inhibits two or more of human kallikrein 6 (hK6), human kallikrein 8 (hK8), and human kallikrein 10 (hK10).

In another aspect, the disclosure features a method of treating spinal cord injury. The method includes: administering, to a subject having or suspected of having spinal cord injury, an effective amount of a protein that comprises one or more engineered protease inhibitory sequences. In some embodiments, the protein inhibits both of hK6 and human neutrophil elastase.

In another aspect, the disclosure features a method of treating chronic obstructive pulmonary disorder. The method includes: administering, to a subject having or suspected of having chronic obstructive pulmonary disorder, an effective amount of a protein that comprises one or more engineered protease inhibitory sequences. In some embodiments, the protein inhibits two or more of human neutrophil elastase, PR3, and cathepsin G.

In another aspect, the disclosure features a method of treating an inflammatory bowel disease (e.g., Crohn's disease or ulcerative colitis). The method includes: administering, to a subject having or suspected of having Crohn's disease, an effective amount of a protein that comprises one or more engineered protease inhibitory sequences. In some embodiments, the protein inhibits human tissue kallikrein (hK1) and human plasma kallikrein.

In another aspect, the disclosure features a protein comprising an engineered Kunitz domain that inhibits a human protease, wherein the Kunitz domain lacks the 14-38 disulfide found in most naturally occurring Kunitz domains. In some embodiments, the protease is selected from the group consisting of plasma kallikrein, human neutrophil elastase, plasmin, or other protease described herein. In one embodiment, the binding loops of the Kunitz domain are identical to the binding loops of DX-88, DX-890, or DX-1000, or at least 80, 85, 90, or 95% identical. In one embodiment, the Kunitz domain is at least 85, 90, 95, or 100% identical to DX-88, DX-890, or DX-1000, but does not include cysteines at the positions corresponding to cysteine 14 and cysteine 38 of BPTI.

In another aspect, the disclosure features a library of polynucleotides. The library includes a plurality of polynucleotides, each of which comprises a sequence encoding a Kunitz domain that lacks cysteines at positions corresponding to positions 14 and 38 of BPTI. The polynucleotides of the plurality vary such that one or more residues in the first and/or second binding loops of the Kunitz domains differ among the polynucleotides of the plurality. In one embodiment, codons encoding positions 13, 15, 16, 17, 18, 19, 31, 32, 34, 39, and 40 (numbered with respect to the sequence of BPTI) are varied.

In another aspect, the invention features a polynucleotide encoding a polypeptide that includes at least two Kunitz domains that differ by no more than 80, 85, 90, 95, 98%. The polynucleotide sequences encoding the Kunitz domains use different codons at positions corresponding to one or more residues that are identical between the at least two Kunitz domains. In one embodiment, the amino-acid sequences of the Kunitz domains are identical. For example, the amino acid sequences of the Kunitz domains are identical, yet the polynucleotide sequences encoding the identical Kunitz domains differ at at least ten or twenty codons. For example, the nucleic acid sequences encoding the Kunitz domain are less than 90, 80, 70, 60, or 50% identical.

It is possible to engineer proteins that contain multiple Kunitz domains, e.g., at least two, three, four, five, or six Kunitz domains. Each Kunitz domain is engineered to have high affinity and specificity for one or more serine proteases such that the resulting protein can inhibit one to several proteases with high affinity and specificity. In addition, the size of the engineered protein can be such that the lifetime in serum is enhanced (e.g., by including amino acid residues outside of the Kunitz domains to increase the molecular weight of the protein). Any method can be used to engineer a protein with multiple Kunitz domains. Examples include fusing different domains and selecting one or more domains from a library of varied Kunitz domains.

In one aspect of the invention, a library of Kunitz domains lacking the 14-38 disulfide is constructed and high-affinity inhibitors are selected. For example, the library includes variations at one or more positions in the first and/or second binding loops of the Kunitz domains. For example, positions 13, 14, 15, 16, 17, 18, 19, 31, 32, 34, 38, 39, and 40 with respect to the sequence of BPTI are varied. In other aspects, the invention comprises the construction of genetic diversity that encodes constant or nearly constant amino acid sequences to avoid genetic instability, e.g., recombination between nucleic acid sequences encoding different segments of the protein.

All patents, patent applications, and other references cited herein are incorporated by reference in their entireties.

DETAILED DESCRIPTION

A number of disorders involve multiple proteases (Churg and Wright, 2005, Curr Opin Pulm Med, 11:153-9), and thus proteins that inhibit at least two human proteases would be useful pharmacological agents for treating (e.g., improving and/or preventing) such disorders. Such proteins can include one or more Kunitz domains, or other sequences that inhibit proteases.

Some of these disorders are associated with the excessive protease activity of dissimilar serine proteases. In such cases, the catalytic sites of any pair of serine proteases may differ enough that no single Kunitz domain can bind both proteases with useful affinity. Accordingly, provided herein are compositions which include two or more Kunitz domains, wherein the composition includes at least two different engineered Kunitz domains (or Kunitz domains with different specificities) bound via a covalent linkage. The Kunitz domains are typically linked via a peptide bond (although it should be understood that the Kunitz domains need not be directly linked by a peptide bond in the sense that the carboxy terminus of one Kunitz domain is directly linked via a peptide bond to the amino terminus of another Kunitz domain; a linker of one or more amino acids may separate the Kunitz domains). A peptide linker may be any desired length, but in some embodiments the linker is limited in size, for example, less than 35, 22, 15, 13, 11, or 9 amino acids in length. The linker can include at least two, three, or four glycine residues.

In one embodiment, the at least two Kunitz domains are not separated by a serum albumin moiety. In one embodiment, the at least two Kunitz domains are connected by a linker that does not include T cell epitopes. For example, the at least two Kunitz domains are connected by a linker that lacks medium to large hydrophobic residues and charged residues. The linker may or may not include a glycosylation site. For example, the linker has fewer than three, two, or one glycosylation sites, i.e., no glycosylation sites. The linker can be devoid of N-linked glycosylation sites.

In some embodiments, proteins comprising multiple different engineered Kunitz domains are produced by transferring the binding determinants of an engineered Kunitz domain to a recipient protein that naturally includes a plurality of Kunitz domains (e.g., modifying the naturally occurring sequences by reference to an engineered sequence so as to produce a protein comprising at least two engineered Kunitz domains (or Kunitz domains with different specificities)).

If two or more serine proteases are similar, one can select for a Kunitz domain that inhibits all of the target serine proteases, e.g., inhibits each target protease with a Ki of less than 10 nM. As used herein, two serine proteases are "similar" if the approximately 20 residues that correspond to the trypsin residues that touch BPTI in the trypsin-BPTI complex are 80% identical. As will be understood, the likelihood of obtaining a single Kunitz domain that inhibits two or more target serine proteases increases with increasing identity between the target proteases. In some embodiments, proteins comprising Kunitz domains with specificity for more than one protease have a Ki of less than 100 nM, 10 nM, 1 nM, or 0.1 nM for each of the target proteases.

Additionally, in some embodiments, the protein comprises a Kunitz domain linked to an antibody (or antigen binding fragment or derivative thereof, such as Fab's, (Fab)2, scFv, Fd, Fv) or non-Kunitz domain peptide which binds and inhibits a serine protease.

Some disorders involve not only serine proteases, but other types of proteases as well (e.g., metalloproteases, such as the matrix metalloproteases, or MMPs, or cysteine proteases). Accordingly the invention also provides proteins which com

BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410), particularly BLAST 2 Sequences as described by Tatusova and Madden (1999, FEMS Microbiol. Lett. 174:247-250) and as implemented by the National Center for Biotechnology Information (available via the world wide web at ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi). Parameters for comparison of two nucleotide sequences (e.g., BLASTN) are Reward for a match: 1; Penalty for a mismatch: -2; Open gap penalty: 5; extension penalty gap: 2; gap X dropoff 50; expect: 10.0; word size: 11. Parameters for comparison of two amino acid sequences (e.g., BLASTP) are Matrix: BLOSUM62; Open gap penalty: 11; extension gap penalty: 1; gap X dropoff 50; expect: 10.0; and word size 3.
prise at least one KunitZ domain linked to at least one peptide As used herein, reference to hybridization under “low or protein which inhibits a metalloprotease or a cysteine 15 stringency.” “medium stringency.” “high Stringency, or protease. Exemplary proteins which inhibit metalloproteases “very high Stringency' conditions describes conditions for are antibodies and antigen-binding fragments or derivatives hybridization and washing. Guidance for performing hybrid thereof (e.g., Fab's, (Fab), scFv, Fd, FV) which bind to and ization reactions can be found in Current Protocols in inhibit metalloproteases, particularly those that bind to and Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1- inhibit members of the matrix metalloprotease (MMP) fam 6.3.6. Aqueous and nonaqueous methods are described in that ily. Exemplary proteins which inhibit cysteine proteases are reference and either can be used. Specific hybridization con antibodies and antigen-binding fragments or derivatives ditions referred to herein are as follows: (a) low stringency thereof (Fab's, (Fab), scFv, Fd, Fv) which bind to and inhibit hybridization conditions in 6x sodium chloride/sodium cit cysteine proteases, particularly those that bind to and inhibit rate (SSC) at about 45° C., followed by two washes in 0.2x members of the cathepsin family (e.g. cathepsin L and/or 25 SSC, 0.1% SDS at least at 50° C. (the temperature of the cathepsin W). washes can be increased to 55° C. 
for low stringency condi Fusions between a single or multiple KunitZ domain-con tions); (b) medium stringency hybridization conditions in taining sequences and an antibody (or antigen-binding frag 6xSSC at about 45° C., followed by one or more washes in ment or derivative thereof) may be in any configuration, Such 0.2xSSC, 0.1% SDS at 60° C.; (c) high stringency hybridiza as fusion of the Kunitz domain-containing sequence to: a) the 30 tion conditions in 6xSSC at about 45°C., followed by one or N-term of the light chain of an antigen binding fragment or more washes in 0.2xSSC, 0.1% SDS at 65° C.; and (d) very full length antibody, b) the C-term of the light chain of an high stringency hybridization conditions are 0.5M sodium antigen binding fragment or full length antibody, c) the phosphate, 7% SDS at 65° C., followed by one or more N-term of the heavy chain of an antigen binding fragment or washes at 0.2xSSC, 1% SDS at 65° C. full length antibody, or d) the C-term of heavy chain of an 35 Kunitz, Domains antigen binding fragment or full length antibody. Antibodies A“KunitZ domain is a polypeptide domain having at least can be arranged as Fab's, (Fab), scFv, Fd. Fv. For example, 51 amino acids and containing at least two, and perhaps three, it is can be sufficient to use a protein that includes the light disulfides. The domain is folded such that the first and sixth chain variable domain and the heavy chain variable domain. cysteines, the third and fifth cysteine, and optionally the sec Peptides which inhibit may also be incorporated into the 40 ond and fourth form disulfide bonds (e.g., in a Kunitz domain proteins of the invention. 
Such inhibitory peptides may be having 58 amino acids, cysteines can be present at positions fused to a KunitZ domain or to a protein containing multiple corresponding to amino acids 5, 14, 30, 38, 51, and 55, Kunitz domains and either antibody or albumin components according to the number of the BPTI sequence provided that will ensure long residence in blood. Peptides that inhibit below, and disulfides can form between the cysteines at posi metalloproteinases could be fused to either end of a protein 45 tion 5 and 55, 14 and 38, and 30 and 51), or, if two disulfides containing one or more therapeutic Kunitz domains. Another are present, they can form between 5 and 55 and 30 and 51. exemplary configuration is the insertion of an inhibitory pep The spacing between respective cysteines can be within 7, 5, tide between Kunitz domains (e.g., the peptide may be used to 4, 3, 2, or 1 amino acids of the following spacing between replacea inter-KunitZ domain linker in an engineered version positions corresponding to: 5 to 55, 14 to 38, and 30 to 51, of human TFPI, or it may be used as a linker to connect two 50 according to the numbering of the BPTI sequence provided separate Kunitz domains). herein (Table 7). The BPTI sequence can be used as a refer Antibodies and/or peptides which bind proteases may be ence to refer to specific positions in any generic Kunitz identified by selection of antibody or peptide libraries with domain. Comparison of Kunitz domains of interest to BPTI the target protease (e.g., U.S. 2005/0180977 and Huanget al., can be performed by identifying the best fit alignment in J Biol Chem. 2003, 278:15532-40). Such selection is typi 55 which the number of aligned cysteines is maximized. cally coupled with a screening step to identify those library The three dimensional structure (at high resolution) of the members that inhibit the target protease as well. Such meth BPTI Kunitz domain is known. 
One of the X-ray structures of ods are similar to the selection and Screening methods dis BPTI is deposited in the Brookhaven Protein Data Bank as closed herein, as will be understood by practitioners in the art. “6PTI. The three dimensional structure of some BPTIhomo As referred to herein, an “engineered protein sequence 60 logues (Eigenbrot et al., (1990) Protein Engineering, 3(7): (e.g., an engineered Kunitz domain) is a protein that has a 591-598: Hynes et al., (1990) Biochemistry, 29:10018 non-naturally occurring sequence. In some embodiments, an 10022) are likewise known. At least seventy natural Kunitz engineered Kunitz domain will vary from a naturally occur domain sequences are known. Known human homologues ring Kunitz domain by at least 2, 3, 4, 5, 6, 7, 8, 9, or amino include three Kunitz domains of TFPI (Wun et al., (1988).J. acids. 65 Biol. Chem. 263(13):6001-6004: Girardet al., (1989) Nature, The comparison of sequences and determination of percent 338:518-20; Novotny et al. (1989) J. Biol. Chem., 264(31): identity between two sequences can be performed using 18832-18837; in some of these references, TFPI is identified US 8,828,703 B2 10 as LACI (Lipoprotein Associated Coagulation Inhibitor) or TFPI-2, the second Kunitz domain has this non-canonical EPI (Extrinsic Pathway Inhibitor), two Kunitz domains of feature. To re-engineer TFPI-2 Kunitz domain2, a preferred Inter-Ci-Trypsin Inhibitor, APP-I (Kido et al., (1988) J. Biol. embodiment is to build a library of TFPI-2 Kunitz domain2s Chem..., 263(34): 18104-18107), a Kunitz domain from col and select the desired binding properties. Alternatively, one lagen, and three Kunitz domains of TFPI-2 (Sprecher et al., could change residues 8-12 and 39-43 to be canonical. That is, (1994) PNAS USA, 91:3353-3357). TFPI is a human serum we would delete S9a and change D12 to G. 
At 13 and 39 we glycoprotein with a molecular weight of 39 kDa (amino acid would put the amino acids needed for binding and change sequence in Table 1) containing three Kunitz domains. 40-45 to GNANNF (SEQ ID NO:203) since these are the KunitZ domains interact with target protease using, prima most common amino acids at these positions. rily, amino acids in two loop regions ("binding loops'). The 10 When a Kunitz domain binds to a serine protease, only a first loop region is between about residues corresponding to Small number of the amino acids of the Kunitz domain contact amino acids 11-21 of BPTI. The second loop region is the serine protease. The exact number depends on the serine between about residues corresponding to amino acids 31-42 protease and the KunitZ domain, but is usually in the range 12 of BPTI. An exemplary library of Kunitz domains varies one to 20. In the case of BPTI and trypsin, 13 residues (T11, G12, or more amino acid positions in the first and/or second loop 15 P13, C14, K15,A16, R17, I18, I19, G36, G37, C38, and R39) regions. Particularly useful positions to vary include: posi of BPTI have at least one atom positioned within 4 A of an tions 13, 15, 16, 17, 18, 19, 31, 32, 34, and 39 with respect to atom in trypsin. Residues 12, 14, 16, 37, and 38 are highly the sequence of BPTI. At least some of these positions are conserved in canonical Kunitz domains. In BPTI, residue 34 expected to be in close contact with the target protease. is valine which comes within 5 A of an atom of trypsin. The “framework region' of a Kunitz domain is defined as Charge (or lack thereof) and/or a bulky side chain at this those residues that are a part of the KunitZ domain, but spe position could influence binding, thus Substitution of a large cifically excluding residues in the first and second binding or charged residue at 34 could influence binding. 
loops regions, e.g., about residues corresponding to amino There are about 20 residues of trypsin (using chymotrypsi acids 11-21 of BPTI and 31-42 of BPTI. The framework nogen numbering) that come within 4 A of an atom in BPTI: region can be derived from a human KunitZ domain, e.g., 25 Y39, H40, F41, C42, H57, N97, L99, Y151, D189, S190, TFPI domain 1. Exemplary frameworks can include at least C191, Q192, G193, D194, S195, S214, W215, G216, G219, one, two, or three lysines. In one embodiment, the lysines are and G226. Of these positions, C42, H57, G193, D194, S195, present at positions corresponding to the positions found in S214, are absolutely conserved among trypsin-homologous the framework of LACI, or within at least three, two, or one serine proteases. Positions 39, 41, 97, 99, 151, and 192 are amino acid(s) from Such a position. 30 highly variable. Position 40 is often histidine but shows con Conversely, residues that are not at these particular posi siderable variability. Position 189 is most often D, but S, G, A, tions or which are not in the loop regions may tolerate a wider and T occur in known serine proteases. Position 190 is often range of amino acid Substitutions (e.g., conservative and/or S, but G. V. T. A., I, and P are observed. Position 191 is most non-conservative Substitutions) than these amino acid posi often C, but F and Y are observed. Position 215 is most often tions. 35 W, but F.Y. G., N, S, and Kare observed. Position 216 is most BPTI is the prototypic Kunitz domain with 58 amino acids often G, but V. S., and Kare observed. Position 219 is most and cysteine residues at position 5, 14, 30, 38, 51, and 55. often G, but S, T, E. P. D., N, and T are observed, as is deletion These form disulfides 5:55, 14:38, and 30:51. of the residue. Position 226 is often G, but TD, S, E, and Aare There are phenylalanine residues at 33 and 45 in almost all observed. These residues are the ones most directly respon Kunitz domains. 
There are glycine residues at positions 12 40 sible for the substrate specificity of a particular serine pro and 37 in almost all Kunitz domains. There is the possibility tease because they contact the Substrate. For other proteases, of an extra residue between positions 9 and 10, called 9a. If 9a the number of amino acids that would contact a Kunitz is present, 12 need not be glycine. Insertion of up to three domain could be more or less than 20. residues after position 26 is observed in some natural Kunitz BPTI and other Kunitz domains inhibit serine proteases by domains; these inserted residues can be numbered 26a, 26b, 45 binding in the protease catalytic site but resisting cleavage and 26c. Insertion of up to three residues after position 42 is and release. For a KunitZ domain to inhibit a serine protease, observed in Some natural KunitZ domains (numbered 42a, it must fit tightly into the catalytic site. In some cases, the 42b, and 42c). Cysteine residues other than those named are bond between residue 15 of the Kunitz domain and residue 16 observed (e.g., as seen in bungarotoxin) and can lead to dis is cleaved, but the cleaved Kunitz domain does not release ulfides linking the KunitZ domain to other proteins. 50 from the protease. Most Kunitz domains have 58 amino acids with Cys at BPTI is not especially specific and inhibits a variety of positions 5, 14, 30, 38, 51, and 55. These domains are termed serine proteases with affinity better than 10 nM. For example, “canonical Kunitz domains' herein. All canonical Kunitz its affinities for different proteases are: (human urine kal domains have essentially identical main-chain 3D structures. likrein (4 nM), porcine pancreatic kallikrein (0.55 nM), The first Kunitz domain of TFPI is a canonical Kunitz 55 human plasmin (0.89 nM), activated protein C (1 uM) (Taby, domain. We have constructed a phage-display library of et al., 1990, Thromb Res, 59:27-35) thrombin (30 uM) (Pin TFPI-K1 Kunitz domains (see, e.g., U.S. 
Pat. No. 6,333,402). tigny and Dachary-Prigent, 1992, Eur J. Biochem, 207:89 Moreover, it is possible to select a Kunitz domain having the 95), and bovine trypsin (0.06 pM)). TFPI-K1 framework for binding to a particular target pro Other Kunitz domains can be highly specific. For example, tease. Then residues 13, 15-21, 34, 39-40 can be transferred 60 DX-88 inhibits human plasma kallikrein with K-30 LM into any other canonical Kunitz domain Such that the recipient while its affinity for (human urine kallikrein (>100 uM), Kunitz domain obtains affinity for the particular target pro porcine pancreatic kallikrein (27 LM), human C1re, tease. (>20 LM). C1s (>20 LM), human plasmin (137 nM), One class of non-canonical KunitZ domains has an extra human neutrophil elastase (>20 uM), human plasma F.XIIa amino acid at position 9a. In this class of KunitZ domains, 65 (>20 uM), human plasma F.XIa (10 nM), human thrombin residue 12 need not be glycine and the 3D structure in that (>20 uM), human trypsin (>20 uM), and human chymot region is different from that of canonical KunitZ domains. In rypsin (>20 uM)) is less by a factor at least 250,000. The US 8,828,703 B2 11 12 amino acid sequence of DX-88 is shown in Table 5 and DNA affinity for their targets, thus engineering TFPI-based pro encoding DX-88 is shown in Table 6. Table 8 shows the teins to reduce or eliminate sensitivity to MMP digestion may DX-88 gene with restriction sites. not be necessary. Engineered Kunitz domains are further disclosed in U.S. TFPI itself does not have especially high affinity for factor Pat. Nos. 6,333,402, 5,663,143, 6,953,674, each of which is X, and factor VII. Petersen et al. (Petersen, et al., 1996, Eur incorporated by reference. J Biochem, 235:310-6) report that Kunitz domain 1 binds A Kunitz domain can be located within a larger protein, factor VII/TF with K-250 nM and Kunitz domain 2 binds e.g., there can be sequences present N- and/or C-terminal to factor Xa with K-90 nM. 
Engineered Kunitz domain pro the KunitZ domain. teins based on TFPI will have much higher affinity for their 10 targets (e.g., TFPI Kunitz domains may be engineered to TFPI: increase affinity for factor X, and factor VII, to a K, of less The sequence of tissue-factor pathway inhibitor (TFPI: than or equal to 10 nM or 1 nM). The basic C-terminal also termed LACI) (GeneBank NM 006287) is shown in segment (Linker 4) is also involved in binding F.Xa (Cun Table 3. The kinetics of TFPI binding to Factor X, was pub ningham, et al., 2002, Biochem J., 367:451-8). For example, a lished in 1993 (Huang, et al., 1993, J Biol Chem, 268:26950 15 protein described herein has a K, of greater than 10 nM or 5). The sequences of TFPI proteins are known for Danio rerio 100 nM for one or both of factor VII, and factor X. (Zebrafish), Canis familiaris (dog), Macaca mulatta (rhesus TFPI-2: macaque), Ogctolagus cuniculus (rabbit), Pan troglodytes The amino acid sequence of TFPI-2 is shown in Table 11. (chimpanzee), Mus musculus (mouse), Gallus gallus Note that the second Kunitz domain is non-canonical, having (chicken), and Homo sapiens (man). TFPI contains three a residue 9a and residues 42a and 42b. The linkers are shorter Kunitz domains. The first Kunitz domain binds and inhibits than in TFPI. The C-terminal linker is quite basic, but linkers factor VII in the factor VII/tissue factor complex, Kunitz 1, 2, and 3 are much shorter than the corresponding linkers in domain2 binds and inhibits factor X, and Kunitz domain-3 is TFPI. TFPI-2 may be engineered to be a multiple protease involved in binding to cells (Piro and Broze, 2004, Circula inhibitor by remodeling Kunitz domain 1 and Kunitz domain tion, 110:3567-72). The cell binding by Kunitz domain-3 can 25 3 in the same way that the Kunitz domains of TFPI are be reduced by changing the P1 arginine to leucine (Piro and remodeled in Examples 2, 3, and 5. Kunitz domain 2 could Broze, 2004, Circulation, 110:3567-72). 
either be forced to be canonical or one could make a library of Human TFPI may be modified to provide an engineered Kunitz domain 2 derivatives, select a member having the protein that inhibits at least two different human proteases desired binding properties, and remodel Kunitz domain 2 as (e.g., proteases other than factor VII, and factor X) and that 30 needed. TFPI-2, like TFPI, binds cell surfaces. Preferred has a plurality of Kunitz domains. Many of the modified derivatives of TFPI-2 are modified to abolish binding to cells proteins of the present invention will be very similar to human and effects on gene expression. TFPI. A protein that includes TFPI or its Kunitz domains can Table 12 shows a DNA sequence that would cause display be altered so that it does not bind to cells as binding a polypep of TFPI-2 Kunitz domain 2 on the stump of M13 III on tide to cell Surfaces can increase antigenicity (Leonetti, et al., 35 M13-derived phage. The sequence of the parental phage vec 1998, JImmunol. 160:3820-7). For example, modification of tor is shown in Table 13. The restriction sites that bound the human TFPI to delete the C-terminal peptide GGLIKT display cassette of Table 12 are bold in Table 13. To generate KRKRKKQRVKIAYEEIFVKNM (SEQ ID NO:145) while a TFPI-2 Kunitz domain 2 library, the DNA between Pstland retaining Kunitz domain 3 reduces cell binding and other BspEI is synthesized with the diversity shown in Table 12. consequent cell signaling events. In a preferred embodiment, 40 Table 14 shows oligonucleotides (ONs) that can be used for the therapeutic protein has no effect on the expression of any this synthesis. ON1U96 and ON71L97 are complementary at genes and does not bind to cells. their 3' ends where underScored. ON1U21 T2 and ON 147L21 Mine et al. (Biochemistry (2002) 41:78-85.) 
have reported are identical to the 5' ends of the long oligonucleotides and that three lysine residues in the third Kul Dom of TFPI (which can be used to prime PCR amplification of the long oligo correspond to K241, K260, and K268 in Table 1) form a 45 nucleotides. There are Pst and BspEI sites near the ends of positively-charged cluster that may bind heparin and foster the final product. The double-stranded DNA is cut with PstI binding to cells. In one embodiment of the invention, one or and BspEI and ligated to similarly cleaved vector. The theo more of K241, K260, and K268 are changed to an amino acid retical diversity of the library is 6.7x10" amino acid that is not positively charged. At position 29, K is the most sequences. Actual libraries can include at least 10, 10', or common AA type and Q is second most common, thus 50 10' transformants. One could also use codon-based K241Q is a preferred mutation for removal of positive charge mutagenesis to synthesize the mixtures shown in Table 12. in Kul Dom 3 of TFPI. At position 48, E is the most common Hepatocyte Growth Factor Activator Inhibitor-1 and -2: AA type. Thus K260E is a preferred mutation for removal of Additional examples of a mammalian proteins that include positive charge in Kul Dom 3 of TFPI. At position 56, G is the multiple Kunitz domains are: Hepatocyte Growth Factor most common AA type. Thus K268G is a preferred mutation 55 Activator Inhibitor-1 (HAI-1) and Hepatocyte Growth Factor for removal of positive charge in Kul Dom 3 of TFPI. Removal Activator Inhibitor-2 (HAI-2). Each contains two Kunitz of at least one of the three lysines will reduce the possibility domains. of the TFPI variant binding to glycosaminoglycans. HAI-1 and HAI-2 are cell bound proteins. Derivatives that TFPI is cleaved a particular sites by metalloproteinases, are not associated with cells can be prepared, e.g., by produc e.g., by MMP-8 after S174 and after K20. 
P18, L19, and L21 60 ing a form of the protein that includes only the KunitZ domain conform to substrate 3 in Table 42. In one embodiment, the and intervening sequences. The regions of HAI-1 and HAI-2 protein used is based on TFPI, but includes modifications in responsible for cell binding can be inactivated, e.g., by dele the linker sequences to prevent MMP cleavage. For example, tion. the sites of cleavage by MMPs are changed to avoid such Bikunin: cleavage. In certain cases S174 and K20 are altered, e.g., to 65 Table 39 shows four amino acid sequences for “humin alanine. In another embodiment, the sites are left as is. In most bikunin' reported in U.S. Pat. No. 6,583,108 (the 108 cases, the engineered KunitZ domains will have very high patent). The SEQ ID NOs of 108 are shown as “SEQ ID US 8,828,703 B2 13 14 NO:123”, “SEQ ID NO.124", and “SEQ ID NO.125” have Libraries of KunitZ domains can be generated by varying differences in the signal sequence and in amino acid sequence one or more binding loop amino acid residues of a Kunitz that follows the second Kunitz domain. Table 38 shows affin domain described herein, e.g., a KunitZ domain having a ity data from the 108 patent for the two Kunitz domains of framework described herein, e.g., a modified or naturally bikunin against a variety of serine proteases. 5 occurring framework region. In one embodiment, the resi Protease Targets dues that are varied are varied among a plurality of amino Proteases are involved in a wide variety of biological pro acids. cesses, including inflammation and tissue injury. Serine pro The library can be an expression library that is used to teases produced by inflammatory cells, including neutro produce proteins. The proteins can be arrayed, e.g., using a 10 protein array. U.S. Pat. No. 5,143,854; De Wildt et al. (2000) phils, are implicated in various disorders, such as pulmonary Nat. Biotechnol. 18:989-994; Lueking et al. (1999) Anal. emphysema. 
As such, multiple different proteases can con Biochem. 270: 103-111; Ge (2000) Nucleic Acids Res. 28, e3, tribute to a single disorder. I-VII; MacBeath and Schreiber (2000) Science 289: 1760 Examples of particular proteases include the following. 1763; WO 01/98534, WO 01/83827, WO 02/12893, WO Neutrophil elastase is a serine protease produced by polymor 15 00/63701, WO 01/40803 and WO99/51773. phonuclear leukocytes with activity against extracellular KunitZ domains can also be displayed on a replicable matrix components and pathogens. Pulmonary emphysema is genetic package, e.g., in the form of a phage library Such as a characterized by alveolar destruction leading to a major phage display, yeast display library, ribosome display, or impairment in lung function. Human neutrophil elastase con nucleic acid-protein fusion library. See, e.g., U.S. Pat. No. sists of approximately 238 amino acid residues, contains 2 5,223,409; Garrard et al. (1991) Bio/Technology 9:1373 asparagine-linked carbohydrate side chains, and is joined 1377: WO 03/029456; and U.S. Pat. No. 6,207,446. Binding together by 4 disulfide bonds (GeneBank Entry P08246). It is members of such libraries can be obtained by selection and normally synthesized in the developing neutrophilas a proen screened in a high throughput format. See, e.g., U.S. 2003 Zyme but stored in the primary granules in its active form, O129659. ready with full enzymatic activity when released from the 25 Selection from Display Libraries granules, normally at sites of inflammation (Gullberg U, et al. This section describes exemplary methods of selecting EurJ Haematol. 1997:58:137-153; Borregaard N, Cowland members of a display library to identify a Kunitz domain that J. B. Blood. 1997; 89:3503-3521). inhibits a protease. 
The methods disclosed herein can also be Other exemplary protease targets include: plasmin, tissue modified and used in combination with other types of librar kallikrein, plasma kallikrein, Factor VII, Factor XI, throm 30 ies, e.g., an expression library or a protein array, and so forth. bin, , trypsin 1, trypsin 2, pancreatic , Kunitz domains can be displayed on phage, especially on , tryptase, and Factor II. Classes of rel filamentous phage (e.g., M13). In a preferred embodiment, evant proteases include: proteases associated with blood the library contains members that have a variety of amino acid coagulation, proteases associated with , proteases types at the positions of the Kunitz domain that are likely to associated with complement, proteases that digest extracel 35 contact a serine protease where they can affect the binding. lular matrix components, proteases that digest basement Using immobilized or immobilizable (e.g., by modification membranes, and proteases associated with endothelial cells. with one member of a binding pair, such as avidin or strepta For example, the protease is a serine protease. Proteins vidin) serine protease as a target, one can select those mem described herein that inhibit one or more proteases can be bers of the library that have high affinity for the serine pro engineered to inhibit one or more proteases within these 40 tease of interest. classes. In an exemplary display library selection, a phage library is Identifying Kunitz, Domains and Other Protease Inhibitors contacted with and allowed to bind to the target protease A variety of methods can be used to identify a Kunitz which may be an active oran inactivated form of the protease domain that binds to and/or inhibits a protease, e.g., a human (e.g., mutant or chemically inactivated protein) or a fragment protease. These methods can be used to identify natural and 45 thereof. 
To facilitate separation of binders and non-binders in non-naturally occurring Kunitz domains that can be used as the selection process, it is often convenient to immobilize the components of a protein that inhibits a plurality of proteases, protease on a solid Support, although it is also possible to first e.g., a plurality of human proteases. Other protease inhibitors permit binding to the target protease in Solution and then can be similarly identified, e.g., by varying known protease segregate binders from non-binders by coupling the target inhibitors or protease resistant proteins, by using libraries of 50 protease to a Support. By way of illustration, when incubated randomized peptides or proteins, or using libraries of anti in the presence of the protease, phage displaying a Kunitz bodies e.g., as described in US 2004-0071705 (including domain that interacts with target protease form complexes 102-112). As will be understood by those of skill in the art, with the target protease immobilized on a solid Support while the description of methods of identifying inhibitors of whereas non-binding phage remain in Solution and are proteases herein are largely directed towards Kunitz domains, 55 washed away with buffer. Bound phage may then be liberated the methods can be adapted for identifying peptide-based or from the protease by a number of means, such as changing the antibody-based protease inhibitors. buffer to a relatively high acidic or basic pH (e.g., pH 2 or pH For example, a Kunitz domain can be identified from a 10), changing the ionic strength of the buffer, adding dena library of proteins in which each of a plurality of library turants, adding a competitor, adding host cells which can be members includes a varied KunitZ domain. Various amino 60 infected (Hogan et al. (2005) Biotechniques 38(4):536,538.), acids can be varied in the domain. See, e.g., U.S. Pat. No. or other known means. 5,223,409: U.S. Pat. No. 5,663,143, and U.S. Pat. No. 
6,333, For example, to identify protease-binding peptides, pro 402. KunitZ domains can be varied, e.g., using DNA tease can be adsorbed to a solid Surface. Such as the plastic mutagenesis, DNA shuffling, chemical synthesis of oligo Surface of wells in a multi-well assay plate. Subsequently, an nucleotides (e.g., using codons as subunits), and cloning of 65 aliquot of a phage display library is added to a well under natural genes. See, e.g., U.S. Pat. No. 5,223,409 and U.S. appropriate conditions that maintain the structure of the 2003-0129.659. immobilized protease and the phage. Such as pH 6-7. Phage US 8,828,703 B2 15 16 that display polypeptides that bind the immobilized protease be, for example, NNK which allows any base at the first two are bound to the protease and are retained in the well. Non positions of a varied codon and G or T at the final base and binding phage can be removed (e.g., by washing with buffer). allows all 20 amino-acid types to appear. Alternatively, varied It is also possible to include a blocking agent or competing codons that exclude Some amino-acid types and perhaps stop agent during the binding of the phage library to the immobi codons may be used. NNK provides 32 codons, one of which lized protease. is TAG, a stop. NNK also provides three codons for each of Phage bound to the immobilized protease may then be Ser, Leu, and Arg; two codons for each of Val, Ala, Pro, Thr, eluted by washing with a buffer solution having a relatively and Gly; the other twelve amino-acid types have a single strong acid pH (e.g., pH 2) or an alkaline pH (e.g., pH 8-9). codon. The varied codon NNT provides 15 amino-acid types The solutions of recovered phage that are eluted from the 10 and no stops. NNG provides 13 amino-acid types and one protease are then neutralized and may, if desired, be pooled as stop. an enriched mixed library population of phage displaying Mutagenesis may also be carried out in a more limited or protease binding peptides. 
Alternatively the eluted phage targeted fashion. For example, if the parental amino-acid in a from each library may be kept separate as a library-specific position of interest is Ser, a codon mix with (0.85 T, 0.05 C. enriched population of protease binders. Enriched popula 15 0.05A, 0.05 G) at the first base, (0.85 C, 0.05T, 0.05A, 0.5G) tions of phage displaying protease binding peptides may then at the second base and (0.85T, 0.05 C, 0.05A, 0.05 G) at the be grown up by standard methods for further rounds of selec third base may be employed. Most of the DNA molecules will tion and/or for analysis of peptide displayed on the phage have TCT encoding Ser, but others will have codons that and/or for sequencing the DNA encoding the displayed bind differ by one, two, or three bases, providing a diversity of ing peptide. amino-acid types. Recovered phage may then be amplified by infection of In one example of iterative selection, the methods bacterial cells, and the screening process may be repeated described herein are used to first identify Kunitz domains with the new pool of phage that is now depleted in non from a display library that bind a target protease with at least protease binders and enriched in protease binders. The recov a minimal binding specificity, e.g., an equilibrium dissocia ery of even a few binding phage may be sufficient to carry the 25 tion constant for binding of less than 100 nM, 10 nM, or 1 nM. process to completion. After a few rounds of selection (e.g. 
2 The nucleic acid sequences encoding the KunitZ domains that or 3), the gene sequences encoding the binding moieties are initially identified are used as a template nucleic acid for derived from selected phage clones in the binding pool are the introduction of variations, e.g., to identify a second Kunitz determined by conventional methods, revealing the peptide domain that has enhanced properties (e.g., binding affinity, sequence that imparts binding affinity of the phage to the 30 kinetics, or stability) relative to the initial protein Kunitz target. An increase in the number of phage recovered after domain. each round of selection and the recovery of closely related Selecting or Screening for Specificity. sequences indicate that the screening is converging on The display library selection and screening methods sequences of the library having a desired characteristic. described herein can include a selection or screening process After a set of binding polypeptides is identified, the 35 that discards display library members that bind to a non-target sequence information may be used to design secondary molecule, e.g., a protease other than the target protease. In one libraries. For example, the secondary libraries can explore a embodiment, the non-target molecule is a protease that has Smaller segment of sequence space in more detail than the been inactivated by treatment with an irreversibly bound initial library. In some embodiments, the secondary library inhibitor, e.g., a covalent inhibitor. includes proteins that are biased for members having addi 40 In one implementation, a so-called “negative selection' tional desired properties, e.g., sequences that have a high step or “depletion' is used to discriminate between the target percentage identity to a human protein. and a related, but distinct or an unrelated non-target mol Iterative Selection. ecules. 
The display library or a pool thereof is contacted with Display library technology may be used in an iterative the non-target protease. Members of the sample that do not mode. A first display library is used to identify one or more 45 bind the non-target protease are collected and used in Subse Kunitz domains that bind to a target protease. These identified quent selections for binding to the target molecule or even for Kunitz domains are then varied using a mutagenesis method Subsequent negative selections. The negative selection step to form a second display library. Higher affinity proteins are can be prior to or after selecting library members that bind to then selected from the second library, e.g., by using higher the target protease. stringency or more competitive binding and washing condi 50 In another implementation, a screening step is used. After tions. display library members are isolated for binding to the target In some implementations, the mutagenesis is targeted to molecule, each isolated library member is tested for its ability regions known or likely to be at the binding interface. For to bind to a non-target protease. For example, a high-through Kunitz domains, many positions near the binding interface put ELISA screen can be used to obtain this data. The ELISA are known. Such positions include, for example, positions 13, 55 screen can also be used to obtain quantitative data for binding 14, 15, 16, 17, 18, 19.31, 32,34,38, and 39 with respect to the of each library member to the target. The non-target and target sequence of BPTI. Some exemplary mutagenesis techniques binding data are compared (e.g., using a computer and soft include: synthesis of DNA with mixed nucleotides, error ware) to identify library members that specifically bind to the prone PCR (Leung et al. (1989) Technique 1:11-15), recom target protease. 
bination, DNA shuffling using random cleavage (Stemmer 60 Single Kunitz, Domains with Multiple Specificities (1994) Nature 389-391; termed “nucleic acid shuffling”), Single KunitZ domains with more than one specificity are RACHITTTM (Coco et al. (2001) Nature Biotech. 19:354), obtained by alternating selection of libraries using the target site-directed mutagenesis (Zoller et al. (1987) NuclAcids Res serine proteases. Members of a library of Kunitz domains are 10:6487-6504), cassette mutagenesis (Reidhaar-Olson first selected for binding to the first target serine protease, and (1991) Methods Enzymol. 208:564-586) and incorporation of 65 those library members that bind the first target are selected degenerate oligonucleotides (Griffiths et al. (1994) EMBOJ and screened against the second target serine protease. This 13:3245). The mixtures of nucleotides used in synthesis can alternating selection and screening may be carried out on as US 8,828,703 B2 17 18 many target serine proteases as desired. Additionally, the inhibit human neutrophil elastase with high affinity. TFPI selection may be repeated, such that the library members K2 88 is a modification of TFPI-K2 which should inhibit which bind to all of the target proteases are then screened for human plasma kallikrein. TFPI-K3 88 is a derivative of the desired binding. Two to four rounds of repeated selections TFPI Kunitz domain 3 which should inhibit human plasma are commonly performed. kallikrein. ITI-K2 is the second Kunitz domain of bikunin. For example, alternating selection can be used to obtain a DX-890 is a potent inhibitor of human neutrophil elastase that single Kunitz domain that inhibits both human kallikrein 1 was designed bases on inhibitor selected from a library of (h.K1, also known as tissue kallikrein) and murine kallikrein BPTI derivatives for binding to human neutrophil elastase. 1 (m.K1). 
As described above, a library would be selected first TFPI-K2 890 is a derivative of TFPI KunitZ domain 2 that againstone of the targets (e.g., hK1), then the Kunitz domains 10 was designed to inhibit human neutrophil elastase. TFPI identified in the first selection are then selected against the K3 890 is a derivative of TFPI KunitZ domain 3 that was other target (e.g., mK1). A Kunitz domain that inhibits a designed to inhibit human neutrophil elastase. human serine protease and its counterpart from another mam (*)New engineered KunitZ domain sequences may be malian species (e.g., murine) is useful because its character selected using display library technology. The base, or istics can be evaluated in an animal model of disease. 15 scaffold sequence can be any KunitZ domain-containing In another example, a Kunitz domain is selected that inhib protein, but in certain embodiments, the base sequence is its both human plasma kallikrein (h.pKal) and human tissue derived from a naturally-occurring protein that contains two kallikrein by alternating selection with the two targets. Such or more Kunitz domains (e.g., TFPI, for example, TFPI K1). an inhibitor would be advantageous in a disease. Such as Any serine protease of interest may be used as the target, arthritis, where both human plasma kallikrein and human including neutrophil elastase (e.g., human neutrophil tissue kallikrein are excessively active. In one embodiment, elastase), proteinase 3 (e.g., human proteinase 3), cathepsin G the Kunitz domain differs from human bikunin (e.g., as (e.g., human cathepsin G), tryptase (e.g., human tryptase), described in U.S. Pat. No. 6,583,108) by at least two, three, plasma kallikrein (e.g., human plasma kallikrein), neutrophil five, or ten amino acids. 
elastase (e.g., human neutrophil elastase), kallikrein 6 (e.g., Transferring Specificity Determinants 25 human kallikrein 6), kallikrein 8 (e.g., human kallikrein 8), Natural proteins that have more than one Kunitz domain kallikrein 10 (e.g., human kallikrein 10), tissue kallikrein can be modified to provide a protein with at least two engi (e.g., human tissue kallikrein), plasmin (e.g., human plas neered Kunitz domains that inhibit, respectively, two differ min), hepsin (e.g., human hepsin), matriptase (e.g., human ent human proteases, e.g., plasma kallikrein (pKal) and matriptase), endotheliase 1 (e.g., human endotheliase 1), and human neutrophil elastase (hNE). For example, residues from 30 endotheliase 2 (e.g., human endotheliase 2). the protease binding loops of DX-88 can be transferred to one In some embodiments, the recipient Kunitz domain is Kunitz domain of the recipient protein and residues from the modified so that residues positions 13, 15, 16, 17, 18, 19, 31, protease binding loops of DX-890 can be transferred into 32, 34, and 39 (numbered with respect to the sequence of another of Kunitz domains. The resulting protein inhibits BPTI) are the same as the residues in the originating Kunitz plasma kallikrein and human neutrophil elastase. 35 domain. Alternatively, one could vary residues 11, 13, 15-19. In one embodiment, residues that contribute to binding 31, 32, 34, 39, and 40. In another embodiment, one could vary and/or specificity are substituted into the Kunitz domains of a 11-19, 31, 32,34, and 38-40; in this case one allows the 14-38 pre-existing protein, e.g., a naturally occurring protein Such disulfide to be changed to other amino-acid types. In other as a serum protein. Several serum proteins contain two or embodiments, the recipient Kunitz domain is modified so that three Kunitz domains. Each of these Kunitz domains could be 40 residues in the first and second binding loops are the same as engineered to inhibit one target. 
For example, the extracellu the residues in the originating Kunitz domain. lar compartment of human tissue contains tissue-factor path Modifying and Varying Polypeptides way inhibitor (TFPI, also known as lipoprotein-associated It is also possible to vary a KunitZ domain or other protein coagulation inhibitor (LACI)), tissue-factor pathway inhibi described herein to obtain useful variant that has similar or tor-2 (TFPI-2), hepatocyte growth factor activator inhibitor-1 45 improved or altered properties. Typically, a number of vari (HAI-1), hepatocyte growth factor activator inhibitor-2 ants are possible. A variant can be prepared and then tested, (HAI-2, also known as placental bikunin), and bikunin (also e.g., using a binding assay described herein (Such as fluores known as inter-alpha trypsin inhibitor). One strategy is to use cence anisotropy). one of the multi-KunitZ domain human proteins as a template One type of variant is a truncation of a KunitZ domain or and change only those residues that confer needed affinity and 50 other protein described herein. In this example, the variant is specificity. prepared by removing one or more amino acid residues of the The residues that confer affinity and specificity can be domain from the N or C terminus. In some cases, a series of transferred from one Kunitz domain to another with high Such variants is prepared and tested. Information from testing retention of the desired properties by substitution of a few the series is used to determine a region of the domain that is residues. For example, DX-88 is based on Kunitz domain 1 of 55 essential for protease binding and/or stability in serum. A TFPI and has five amino acids within the Kunitz domain series of internal deletions or insertions can be similarly con changed based on selection from a phage-displayed library. structed and tested. 
For Kunitz domains, it can be possible to DX-88 is a 30 pM inhibitor of human plasma kallikrein (hip remove, e.g., between one and five residues or one and three Kal) while TFPI Kunitz domain 1 has hardly any affinity for residues that are N-terminal to Cs, the first cysteine, and h.pKal. Table 7 shows the amino acid sequences of several 60 between one and five residues or one and three residues that Kunitz domains, some wildtype, some selected from phage are C-terminal to Css, the final cysteine, wherein each of the libraries, some designed, and some designed, built and tested. cysteines corresponds to a respectively numbered cysteine in BPTI, TFPI-K1, TFPI-K2, and TFPI-K3 are naturally occur BPTI. ring Kunitz domains. DX-88 and DX-1000 were selected Another type of variant is a Substitution. In one example, from a library of TFPIKunitz domain 1 derivatives and inhibit 65 the Kunitz domain is subjected to alanine scanning to identify human plasma kallikrein and human plasmin respectively. residues that contribute to binding activity and or stability. In TF890 is designed by comparison with DX-890 and should another example, a library of Substitutions at one or more US 8,828,703 B2 19 20 positions is constructed. The library may be unbiased or, e.g., with a hydrophilic polymer (e.g., to increase the affinity particularly if multiple positions are varied, biased towards an and/or activity of the peptides), its molecular weights can be original residue. In some cases, the Substitutions are all con greater and can range anywhere from about 500 to about servative substitutions. 50,000 Daltons. In another example, each lysine is replaced by arginine and Pegylation and Other Modifications the binding affinity is measured. Thus, one may obtain a A single KunitZ domain has low molecular mass and at binding protein having no lysines and retaining the selected least in Some cases are rapidly filtered by the kidneys. Increas binding properties. 
Proteins lacking lysine residues with the ing molecular mass can increase serum residence. polypeptide chain provide only a single site for pegylation A variety of moieties can be covalently attached to the (the amino terminus), and thus provide a more homogenous 10 protein to increase the molecular weight of a protein that product upon pegylation. includes a KunitZ domain. In one embodiment, the moiety is Another type of variant includes one or more non-naturally a polymer, e.g., a water Soluble and/or Substantially non occurring amino acids. Such variant domains can be pro antigenic polymer Such as a homopolymer or a non-biologi duced by chemical synthesis or modification. One or more cal polymer. Substantially non-antigenic polymers include, positions can be substituted with a non-naturally occurring 15 e.g., polyalkylene oxides or polyethylene oxides. The moiety amino acid. In some cases, the Substituted amino acid may be may improve stabilization and/or retention of the Kunitz chemically related to the original naturally occurring residue domain in circulation, e.g., in blood, serum, lymph, or other (e.g., aliphatic, charged, basic, acidic, aromatic, hydrophilic) tissues, e.g., by at least 1.5, 2, 5, 10, 50, 75, or 100 fold. One or an isostere of the original residue. or a plurality of moieties can be attached to protein that It is also be possible to include non-peptide linkages and includes a Kunitz domain. For example, the protein is other chemical modifications. For example, part or all of the attached to at least three moieties of the polymer. In one Kunitz domain or other protein may be synthesized as a embodiment, particularly if there are no lysines in the pro peptidomimetic, e.g., a peptoid (see, e.g., Simon et al. (1992) tease binding loops, each lysine of the protein can be attached Proc. Natl. Acad. Sci. USA 89:9367-71 and Horwell (1995) to a moiety of the polymer. Trends Biotechnol. 13:132-4). 25 Suitable polymers can vary substantially by weight. 
For Protein Production example, it is possible to use polymers having average Recombinant Production of Polypeptides. molecular weights ranging from about 200 Daltons to about Standard recombinant nucleic acid methods can be used to 40kDa, e.g., 1-20 kDa, 4-12 kDa or 3-8 kDa, e.g., about 4, 5, express a protein described herein (e.g., a protein that 6, or 7 kDa. In one embodiment, the average molecular includes one or more Kunitz domains). Generally, a nucleic 30 weight of individual moieties of the polymer that are associ acid sequence encoding the protein is cloned into a nucleic ated with the compound are less than 20, 18, 17, 15, 12, 10, 8, acid expression vector. If the protein is sufficiently small, e.g., or 7 kDa. The final molecular weight can also depend upon the protein is a peptide of less than 60 amino acids, the protein the desired effective size of the conjugate, the nature (e.g. can be synthesized using automated organic synthetic meth structure, Such as linear or branched) of the polymer, and the ods. 35 degree of derivatization. The expression vector for expressing the protein can A non-limiting list of exemplary polymers include poly include a segment encoding the protein and regulatory alkylene oxide homopolymers such as polyethylene glycol sequences, for example, a promoter, operably linked to the (PEG) or polypropylene glycols, polyoxyethylenated poly coding segment. Suitable vectors and promoters are known to ols, copolymers thereof and block copolymers thereof, pro those of skill in the art and are commercially available for 40 vided that the water solubility of the block copolymers is generating the recombinant constructs of the present inven maintained. The polymer can be a hydrophilic polyvinyl tion. See, for example, the techniques described in Sambrook polymers, e.g. polyvinylalcohol and polyvinylpyrrolidone. 
& Russell, Molecular Cloning: A Laboratory Manual, 3" Additional useful polymers include polyoxyalkylenes Such as Edition, Cold Spring Harbor Laboratory, N.Y. (2001) and polyoxyethylene, polyoxypropylene, and block copolymers Ausubel et al., Current Protocols in Molecular Biology 45 of polyoxyethylene and polyoxypropylene (Pluronics); poly (Greene Publishing Associates and Wiley Interscience, N.Y. lactic acid; polyglycolic acid; polymethacrylates; carbomers; (1989). branched or unbranched polysaccharides which comprise the Expression of proteins that contain no sites for N-linked saccharide monomers D-mannose, D- and L-galactose, glycosylation can be carried out in bacteria, yeast, or mam fucose, fructose, D-xylose, L-arabinose, D-glucuronic acid, malian cells. Proteins having sites for N-linked glycosylation 50 sialic acid, D-galacturonic acid, D-mannuronic acid (e.g. can be expressed in bacteria to obtain non-glycosylated ver polymannuronic acid, oralginic acid), D-glucosamine, D-ga sions or in yeast or mammalian cells if glycosylation is lactosamine, D-glucose and neuraminic acid including desired. For use in humans, glycosylated versions are prefer homopolysaccharides and heteropolysaccharides Such as lac ably produced in mammalian cells such as CHO or HEK293T tose, cellulose, amylopectin, starch, hydroxyethyl starch, cells. 55 amylose, dextrane Sulfate, dextran, dextrins, glycogen, or the Scopes (1994) Protein Purification: Principles and Prac polysaccharide subunit of acid mucopolysaccharides, e.g. tice, New York: Springer-Verlag and other texts provide a hyaluronic acid; polymers of Sugar alcohols such as polysor number of general methods for purifying recombinant (and bitol and polymannitol; heparin or heparon. In some embodi non-recombinant) proteins. ments, the polymer includes a variety of different copolymer Synthetic Production of Peptides. 60 blocks. A polypeptide can also be produced by synthetic means. 
Additional exemplary methods of attaching more than one See, e.g., Merrifield (1963).J. Am. Chem. Soc., 85: 2149. For polymer moiety to a protein are described in US 2005 example, the molecular weight of synthetic peptides or pep O089515. tide mimetics can be from about 250 to about 8,0000 Daltons. Another method of increasing the molecular weight of a A peptide can be modified, e.g., by attachment to a moiety 65 protein is by making multimers of the protein (e.g., producing that increases the effective molecular weight of the peptide. If the protein such that at least two copies of the protein are the peptide is oligomerized, dimerized and/or derivatized, fused, typically in tandem, but head to head fusions are also US 8,828,703 B2 21 22 possible). Multimers may be produced by chemical pare different proteins. Information from SPR can also be crosslinking or by expression of a construct encoding the used to develop structure-activity relationship (SAR). For multimer. Because long repeats of DNA may be unstable, it is example, if the proteins are all mutated variants of a single preferred that the polynucleotides encoding the different cop parental antibody or a set of known parental antibodies, vari ies of the protein have different sequences. antamino acids at given positions can be identified that cor Characterization of Binding Interactions relate with particular binding parameters, e.g., high affinity The binding properties of a protein (e.g., a protein that and slow k, includes one or more KunitZ domains) can be readily assessed Additional methods for measuring binding affinities using various assay formats. For example, the binding prop include nuclear magnetic resonance (NMR), and binding erties can be assayed by quantitative measurement of the 10 titrations (e.g., using fluorescence energy transfer). ability to inhibit the target and non-target proteases. 
Alterna Other Solution measures for studying binding properties tively, for example, the binding property of a protein can be include fluorescence resonance energy transfer (FRET), measured in solution by fluorescence anisotropy (Bentley et NMR, X-ray crystallography, molecular modeling, and mea al., (1985).J. Biol. Chem. 260(12):7250-56: Shimada et al., suring bound vs. free molecules. Measurement of bound vs. (1985).J. Biochem.97(6):1637-44: Terpetschnig et al., (1995) 15 free molecules can be accomplished with a KinEXA instru Anal. Biochem. 227(1): 140-7. Durkop et al., (2002) Anal. ment from Sapidyne Instruments Inc., Boise, Id. Bioanal. Chem. 372(5-6):688-94), which provides a conve Characterization of Protease Inhibition nient and accurate method of determining a dissociation con It may be useful to characterize the ability of a protein stant (K) or association constant (Ka) of the protein for a described herein to inhibit target proteases, as well as non particular target protease. In one such procedure, the protein target proteases. to be evaluated is labeled with fluorescein. The fluorescein KunitZ domains can be selected for their potency and selec labeled protein is mixed in wells of a multi-well assay plate tivity of inhibition of the target proteases. In one example, with various concentrations of the particular target protease target protease and a Substrate are combined under assay or non-target protease. Fluorescence anisotropy measure conditions permitting reaction of the protease with its Sub ments are carried out using a fluorescence polarization plate 25 strate. The assay is performed in the absence of the protein reader. being tested, and in the presence of increasing concentrations The binding interactions can also be analyzed using an of the proteinbeing tested. The concentration of test protein at ELISA assay. 
For example, the protein to be evaluated is which the protease activity is 50% inhibited is the ICs value contacted to a microtitre plate whose bottom surface has been (Inhibitory Concentration) or ECs (Effective Concentration) coated with the target protease, e.g., a limiting amount of the 30 value for that test protein. Proteins having lower ICs or ECso target protease. The molecule is contacted to the plate. The values are considered more potent inhibitors of the target plate is washed with buffer to remove non-specifically bound protease than those proteins having higher ICs or ECso val molecules. Then the amount of the protein bound to the plate ues. Preferred proteins according to this aspect have an ICso is determined by probing the plate with an antibody that value of 100 nM, 10 nM, 1 nM, or 0.1 nMorless as measured recognizes the protein. For example, the protein can include 35 in an in vitro assay for inhibition of protease activity. an epitope tag. The antibody can be linked to an enzyme Such A test protein can also be evaluated for selectivity toward as alkaline phosphatase, which produces a colorimetric prod different protease(s). A test protein is assayed for its potency uct when appropriate Substrates are provided. In the case toward a panel of proteases and other and an ICso where a display library member includes the protein to be value is determined for each. A protein that demonstrates a tested, the antibody can recognize a region that is constant 40 low ICs value for the target protease, and a higher ICs value among all display library members, e.g., for a phage display for other proteases within the test panel (e.g., trypsin, plas library member, a major phage coat protein. min, chymotrypsin), is considered to be selective toward the A binding interaction between a protein and a particular target protease. 
Generally, a protein is deemed selective if its target protease can be analyzed using Surface plasmon reso ICso value is at least one order of magnitude less than the next nance (SPR). For example, after sequencing of a display 45 Smallest ICso value measured in the panel of enzymes. library member present in a sample, and optionally verified, It is also possible to evaluate Kunitz domain activity in vivo e.g., by ELISA, the displayed protein can be produced in or in samples (e.g., pulmonary lavages or serum) of Subjects quantity and assayed for binding the target using SPR. SPR or to which a compound described herein has been adminis real-time Biomolecular Interaction Analysis (BIA) detects tered. biospecific interactions in real time, without labeling any of 50 Pharmaceutical Compositions & Treatments the interactants (e.g., BIAcore). Changes in the mass at the Also featured is a composition, e.g., a pharmaceutically binding surface (indicative of a binding event) of the BIA chip acceptable composition, that includes a protein described result in alterations of the refractive index of light near the herein (e.g., a protein that inhibits at least two human pro Surface (the optical phenomenon of Surface plasmon reso teases). In one embodiment, the protein includes one or more nance (SPR)). The changes in the refractivity generate a 55 Kunitz domains. For example, the protein inhibits a plurality detectable signal, which are measured as an indication of of target proteases. The pharmaceutical composition can real-time reactions between biological molecules. Methods include a pharmaceutically acceptable carrier. for using SPR are described, for example, in U.S. Pat. No. Exemplary carriers include any and all solvents, dispersion 5,641,640; Raether (1988) Surface Plasmons Springer Ver media, coatings, antibacterial and antifungal agents, isotonic lag; Solander, S, and Urbaniczky, C. (1991) Anal. Chem. 
60 and absorption delaying agents, and the like that are physi 63:2338-2345; Szabo et al. (1995) Curr. Opin. Struct. Biol. ologically compatible. Preferably, the carrier is suitable for 5:699-705. intravenous, intramuscular, Subcutaneous, parenteral, spinal Information from SPR can be used to provide an accurate or epidermal administration (e.g., by injection or infusion). and quantitative measure of the equilibrium dissociation con Depending on the route of administration, the active com stant (K), and kinetic parameters, including k, and k, for 65 pound may be coated in a material to protect the compound the binding of a protein (e.g., a protein include Kunitz from the action of acids and other natural conditions that may domains) to a target protease. Such data can be used to com inactivate the compound. US 8,828,703 B2 23 24 The pharmaceutical composition can include a pharmaceu be 100 ug/Kg, 500 ug/Kg, 1 mg/Kg, 5 mg/Kg, 10 mg/Kg, or tically acceptable salt, e.g., a salt that retains the desired mg/Kg. The route and/or mode of administration will vary biological activity of the parent compound and does not depending upon the desired results. impart any undesired toxicological effects (see e.g., Berge, S. In certain embodiments, the protein is prepared with a M., et al. (1977).J. Pharm. Sci. 66:1-19). Examples of such carrier that protects against rapid release, such as a controlled salts include acid addition salts and base addition salts. Acid release formulation, including implants, and microencapsu addition salts include those derived from nontoxic inorganic lated delivery systems. 
acids, such as hydrochloric, nitric, phosphoric, sulfuric, hydrobromic, hydroiodic, phosphorous and the like, as well as from nontoxic organic acids such as aliphatic mono- and dicarboxylic acids, phenyl-substituted alkanoic acids, hydroxy alkanoic acids, aromatic acids, aliphatic and aromatic sulfonic acids and the like. Base addition salts include those derived from alkali and alkaline earth metals, such as sodium, potassium, magnesium, calcium and the like, as well as from nontoxic organic amines, such as N,N'-dibenzylethylenediamine, N-methylglucamine, chloroprocaine, choline, diethanolamine, ethylenediamine, procaine and the like.

The pharmaceutical composition may be in a variety of forms. These include, for example, liquid, semi-solid and solid dosage forms, such as liquid solutions (e.g., injectable and infusible solutions), dispersions or suspensions, liposomes and suppositories. The preferred form depends on the intended mode of administration and therapeutic application. Typical compositions are in the form of injectable or infusible solutions, such as compositions similar to those used for administration of antibodies to humans. A common mode of administration is parenteral (e.g., intravenous, intramuscular, intraarterial, intrathecal, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, transtracheal, subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, epidural and intrasternal injection and infusion). In one embodiment, the compound is administered by intravenous infusion or injection. In another embodiment, the compound is administered by intramuscular or subcutaneous injection. Pharmaceutical compositions typically must be sterile and stable under the conditions of manufacture and storage.

The composition can be formulated as a solution, microemulsion, dispersion, liposome, or other ordered structure suitable to high drug concentration. Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying that yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof. The proper fluidity of a solution can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prolonged absorption of injectable compositions can be brought about by including in the composition an agent that delays absorption, for example, monostearate salts and gelatin.

The proteins described herein can be administered by a variety of methods known in the art. For many applications, the route/mode of administration is intravenous injection or infusion. For example, for therapeutic applications, the compound can be administered by intravenous infusion at a rate of less than 30, 20, 10, 5, or 1 mg/min to reach a dose of about 1 to 100 mg/m² or 7 to 25 mg/m². Alternatively, the dose could

Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Many methods for the preparation of such formulations are patented or generally known. See, e.g., Sustained and Controlled Release Drug Delivery Systems, J. R. Robinson, ed., Marcel Dekker, Inc., New York, 1978. Pharmaceutical formulation is a well-established art, and is further described in Gennaro (ed.), Remington: The Science and Practice of Pharmacy, 20th ed., Lippincott, Williams & Wilkins (2000) (ISBN: 0683306472); Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery Systems, 7th Ed., Lippincott Williams & Wilkins Publishers (1999) (ISBN: 0683305727); and Kibbe (ed.), Handbook of Pharmaceutical Excipients, American Pharmaceutical Association, 3rd ed. (2000) (ISBN: 091733096X).

Also within the scope of the invention are kits comprising a protein described herein (e.g., a protein that inhibits at least two human proteases) and instructions for use, e.g., treatment, prophylactic, or diagnostic use. In one embodiment, the kit includes (a) the compound, e.g., a composition that includes the compound, and, optionally, (b) informational material. The informational material can be descriptive, instructional, marketing or other material that relates to the methods described herein and/or the use of the compound for the methods described herein.

In one embodiment, the informational material can include instructions to administer the compound in a suitable manner, e.g., in a suitable dose, dosage form, or mode of administration (e.g., a dose, dosage form, or mode of administration described herein). In another embodiment, the informational material can include instructions for identifying a suitable subject, e.g., a human, e.g., a human having, or at risk for a disorder characterized by excessive elastase activity. The informational material can include information about production of the compound, molecular weight of the compound, concentration, date of expiration, batch or production site information, and so forth. The informational material of the kits is not limited in its form. In many cases, the informational material, e.g., instructions, is provided in printed matter, e.g., a printed text, drawing, and/or photograph, e.g., a label or printed sheet. However, the informational material can also be provided in other formats, such as Braille, computer readable material, video recording, or audio recording. In another embodiment, the informational material of the kit is a link or contact information, e.g., a physical address, email address, hyperlink, website, or telephone number, where a user of the kit can obtain substantive information about the compound and/or its use in the methods described herein. Of course, the informational material can also be provided in any combination of formats.

In addition to the compound, the composition of the kit can include other ingredients, such as a solvent or buffer, a stabilizer or a preservative, and/or a second agent for treating a condition or disorder described herein. Alternatively, the other ingredients can be included in the kit, but in different compositions or containers than the compound. In such embodiments, the kit can include instructions for admixing the compound and the other ingredients, or for using the compound together with the other ingredients.

The kit can include one or more containers for the composition containing the compound. In some embodiments, the kit contains separate containers, dividers or compartments for the composition and informational material. For example, the composition can be contained in a bottle, vial, or syringe, and the informational material can be contained in a plastic sleeve or packet.

is to be further understood that for any particular subject, specific dosage regimens should be adjusted over time according to the individual need and the professional judgment of the person administering or supervising the administration of the compositions (e.g., the supervising physician), and that dosage ranges set forth herein are only exemplary.
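The infusion figures quoted in this part of the description reduce to simple arithmetic: a dose stated per square metre of body surface area is multiplied by the subject's surface area to give a total mass, and dividing that mass by a rate cap gives the shortest infusion time that stays under the cap. The sketch below is purely illustrative and is not dosing guidance. The function names and the 1.8 m² surface area are assumptions that do not come from the description; the 25 mg/m² dose and the 30, 20, 10, 5, and 1 mg/min caps are taken from the ranges quoted in the text.

```python
def total_dose_mg(dose_mg_per_m2: float, bsa_m2: float) -> float:
    """Total dose in mg for a dose expressed per m^2 of body surface area."""
    if dose_mg_per_m2 < 0 or bsa_m2 <= 0:
        raise ValueError("dose must be non-negative and surface area positive")
    return dose_mg_per_m2 * bsa_m2


def min_infusion_minutes(dose_mg_per_m2: float, bsa_m2: float,
                         rate_cap_mg_per_min: float) -> float:
    """Lower bound on infusion time (minutes) for a given rate cap.

    The text speaks of rates "less than" the cap, so a real infusion
    would take longer than the value returned here.
    """
    if rate_cap_mg_per_min <= 0:
        raise ValueError("rate cap must be positive")
    return total_dose_mg(dose_mg_per_m2, bsa_m2) / rate_cap_mg_per_min


if __name__ == "__main__":
    bsa = 1.8    # hypothetical body surface area in m^2 (assumption)
    dose = 25.0  # mg/m^2, upper end of the "7 to 25 mg/m^2" range in the text
    print(f"total dose: {total_dose_mg(dose, bsa):.1f} mg")
    for cap in (30, 20, 10, 5, 1):  # rate caps in mg/min listed in the text
        minutes = min_infusion_minutes(dose, bsa, cap)
        print(f"cap {cap:>2} mg/min -> more than {minutes:.2f} min")
```

Keeping the surface area as an explicit input, rather than estimating it from height and weight, keeps the sketch free of any formula the description does not itself give.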
In other embodiments, the separate elements of the kit are contained within a single, undivided container. For example, the composition is contained in a bottle, vial or syringe that has attached thereto the informational material in the form of a label. In some embodiments, the kit includes a plurality (e.g., a pack) of individual containers, each containing one or more unit dosage forms (e.g., a dosage form described herein) of the compound. For example, the kit includes a plurality of syringes, ampules, foil packets, or blister packs, each containing a single unit dose of the compound. The containers of the kits can be air tight, waterproof (e.g., impermeable to changes in moisture or evaporation), and/or light-tight.

A protein described herein (e.g., a protein that inhibits at least two human proteases) can be administered, alone or in combination with a second agent, to a subject, e.g., a patient who has a disorder (e.g., a disorder as described herein), a symptom of a disorder or a predisposition toward a disorder, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disorder, the symptoms of the disorder or the predisposition toward the disorder. The treatment may also delay onset, e.g., prevent onset, or prevent deterioration of a condition.

A therapeutically effective amount can be administered, typically an amount of the compound which is effective, upon single or multiple dose administration to a subject, in treating a subject, e.g., curing, alleviating, relieving or improving at least one symptom of a disorder in a subject to a degree beyond that expected in the absence of such treatment. A therapeutically effective amount of the composition may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of the compound to elicit a desired response in the individual. A therapeutically effective amount is also one in which any toxic or detrimental effects of the composition are outweighed by the therapeutically beneficial effects. A therapeutically effective dosage preferably modulates a measurable parameter, favorably, relative to untreated subjects. The ability of a compound to inhibit a measurable parameter can be evaluated in an animal model system predictive of efficacy in a human disorder.

Dosage regimens can be adjusted to provide the optimum desired response (e.g., a therapeutic response). For example, a single bolus may be administered, several divided doses may be administered over time or the dose may be proportionally reduced or increased as indicated by the exigencies of the therapeutic situation. It is especially advantageous to formulate parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subjects to be treated; each unit contains a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier.

An exemplary, non-limiting range for a therapeutically or prophylactically effective amount of a compound described herein is 0.1-20 mg/kg, more preferably 1-10 mg/kg. The compound can be administered by parenteral (e.g., intravenous or subcutaneous) infusion at a rate of less than 20, 10, 5, or 1 mg/min to reach a dose of about 1 to 50 mg/m² or about 5 to 20 mg/m². It is to be noted that dosage values may vary with the type and severity of the condition to be alleviated. It

Therapeutic Uses

The proteins of the invention may be used in therapies for the treatment or prophylaxis of disorders mediated by two or more proteases. Likewise, the proteins of the invention may be used for the manufacture of a medicament for the treatment of disorders mediated by two or more proteases.

Rheumatoid Arthritis (RA):

Human plasma kallikrein, human tissue kallikrein (hK1), and human neutrophil elastase have all been implicated in RA (Volpe-Junior, et al., 1996, Inflamm Res, 45:198-202; Cassim, et al., 2002, Pharmacol Ther, 94:1-34; Colman, 1999, Immunopharmacology, 43:103-8; Dela Cadena, et al., 1995, Faseb J, 9:446-52; Rahman, et al., 1995, Ann Rheum Dis, 54:345-50; Tremblay, et al., 2003, Curr Opin Investig Drugs, 4:556-65). Thus, a method of treating rheumatoid arthritis (e.g., stabilizing or improving total Sharp score, the Sharp erosion score, or the HAQ disability index) by administering to a subject having (or suspected of having) RA an effective amount of a protein that inhibits at least two proteases from the group of human plasma kallikrein, human tissue kallikrein (hK1), and human neutrophil elastase is provided.

MMPs 1, 2, 3, 9, and 13 have also been implicated in RA (Burrage et al. (2006) Front Biosci. 11:529-43). Accordingly, also provided is a method of treating RA (e.g., stabilizing or improving total Sharp score, the Sharp erosion score, or the HAQ disability index) by administering an effective amount of a protein which inhibits (a) at least two proteases from the group of human plasma kallikrein, human tissue kallikrein (hK1), and human neutrophil elastase, and (b) at least one protease from the group of MMP-1, MMP-2, MMP-3, MMP-9, and MMP-13.

RA can be assessed by a variety of clinical measures. Some exemplary indicia include the total Sharp score (TSS), Sharp erosion score, and the HAQ disability index. A protein of the invention is administered to treat RA, e.g., to improve an assessment on one or more of these clinical measures.

Pulmonary Disorders:

Lung inflammatory disorders involve a number of different proteases, including human neutrophil elastase, proteinase 3, cathepsin G, tryptase, MMP-12, and MMP-9. Accordingly, the invention provides methods for treatment and/or prophylaxis of pulmonary disorders involving inflammation (e.g., improving or stabilizing PFT scores or slowing, preventing, or reducing a deterioration in PFT scores, or for COPD patients by reducing the destructive index score by at least 10, 20, 30 or 40% or to at least within 50, 40, 30, or 20% of normal destructive index scores for the age- and gender-matched normal population) by administering to a subject having (or suspected of having) a pulmonary disorder involving inflammation an effective amount of a protein which inhibits the activity of two or more (or all) of these proteases. Exemplary pulmonary disorders involving inflammation include asthma, emphysema, cystic fibrosis, and chronic obstructive pulmonary disease (COPD), particularly acute exacerbations of COPD. In particular, human neutrophil elastase, proteinase 3, cathepsin G, and MMP-12 have been associated with COPD, and thus proteins for the treatment of COPD inhibit at least two proteases from the group: human neutrophil elastase, proteinase 3, and cathepsin G, and, in some embodiments, additionally inhibit MMP-9 and/or MMP-12.

A variety of clinical measures are available to assess pulmonary diseases and the efficacy of treatment thereof. Pulmonary function tests (PFT) such as forced expiratory volume in one second (FEV1), forced expiratory volume in six seconds (FEV6), forced vital capacity (FVC), peak expiratory flow, the ratio of FEV1 to FVC, expressed as a percentage (FEV1/FVC), the forced expiratory flow less the average forced expiratory flow during the mid (25-75%) portion of the FVC (FEF25-75), and thin section computed tomography may be used to assess pulmonary diseases and the efficacy of treatment. The “destructive index,” which is a measure of alveolar septal damage and emphysema, may be particularly useful in monitoring COPD and the therapeutic effect of the proteins and methods herein. See, e.g., Am Rev Respir Dis 1991 July; 144(1):156-59.

Multiple Sclerosis:

Multiple sclerosis (MS) may involve human kallikrein 6 (hK6), human kallikrein 8 (hK8), and human kallikrein 10 (hK10). Accordingly, the invention provides methods of treating multiple sclerosis (e.g., reducing or stabilizing EDSS score, reducing the intensity or frequency of exacerbations, or reducing the size, number, or rate of increase in size or number of sclerotic lesions) by administering to a subject having (or suspected of having) MS an effective amount of a protein that inhibits two or more (e.g., all three) of these proteases. Patients having MS may be identified by criteria recognized in the medical arts as establishing a diagnosis of clinically definite MS, such as the criteria defined by the workshop on the diagnosis of MS (Poser et al., Ann. Neurol. 13:227, 1983).

Efficacy of the treatment of multiple sclerosis may be examined using any method or criteria recognized in the art. Three commonly used criteria are: EDSS (extended disability status scale, Kurtzke, Neurology 33:1444, 1983), appearance of exacerbations or MRI (magnetic resonance imaging).

Spinal Cord Injury:

Spinal cord injury may be exacerbated by hK6, human neutrophil elastase, and other serine proteases. Accordingly, the invention provides methods of treating spinal cord injury (e.g., improving gross or fine motor skills, increasing the activities of daily living (ADLs) which can be performed by the patient, improvement in reflex responses, or improvement in complex locomotion behavior) by administering to a subject with a spinal cord injury an effective amount of a protein that inhibits at least the two serine proteases hK6 and human neutrophil elastase.

Behavioural assessments often fail to define the exact nature of motor deficits or to evaluate the return of motor behaviour following spinal cord injury. In addition to the assessment of gross motor behaviour, it is appropriate to use complex tests for locomotion to evaluate the masked deficits in the evaluation of functional recovery after spinal cord injury. Suresh et al. have designed sensitive quantitative tests for reflex responses and complex locomotor behaviour as a combined behavioural score (CBS) to assess the recovery of function in the Bonnet monkey (Macaca radiata) (J Neurol Sci. (2000) 178(2):136-52). Since these measurements are non-invasive, they may be adapted for use on human patients.

Inflammatory Bowel Disease:

Inflammatory bowel diseases (IBD) include two distinct disorders, Crohn's disease and ulcerative colitis (UC). Ulcerative colitis occurs in the large intestine, while in Crohn's, the disease can involve the entire GI tract as well as the small and large intestines. For most patients, IBD is a chronic condition with symptoms, which include intermittent rectal bleeding, crampy abdominal pain, weight loss and diarrhea, lasting for months to years. Human tissue kallikrein (hK1), human plasma kallikrein and human neutrophil elastase have been implicated in both Crohn's disease and ulcerative colitis (Colman, 1999, Immunopharmacology, 43:103-8; Devani et al., (2005) Dig. Liver Dis. 37(9):665-73; Kuno et al., (2002) J. Gastroenterol. 37(Suppl 14):22-32; Stadnicki et al., (1997) Dig. Dis. Sci. 42(11):2356-66). Accordingly, the invention also provides methods of treatment and/or prophylaxis of Crohn's disease and/or ulcerative colitis (e.g., reducing or eliminating rectal bleeding, abdominal pain/cramping, weight loss, or diarrhea, improving hematocrit, eliminating or improving anemia, or improving Clinical Activity Index for Ulcerative Colitis scores in UC patients) by administering to a subject an effective amount of a protein which inhibits at least two (or all three) of human tissue kallikrein (hK1), human plasma kallikrein and human neutrophil elastase. The subject has, is suspected of having, or is at risk of developing an IBD.

Many of the symptoms of IBD (e.g., rectal bleeding, abdominal pain/cramping, diarrhea, weight loss, hematocrit/anemia) can be quantitated using standard methodologies. Additionally, a clinical index such as the Clinical Activity Index for Ulcerative Colitis may be used to monitor IBD (see also, e.g., Walmsley et al. Gut. 1998 July; 43(1):29-32 and Jowett et al. (2003) Scand J. Gastroenterol. 38(2):164-71).

Pancreatitis

Acute pancreatitis is characterized by sudden intrapancreatic activation of pancreatic zymogens, resulting in an essentially autolytic destruction of the pancreas if unchecked. The major enzymes involved are human trypsins 1 and 2, human pancreatic chymotrypsin, and human pancreatic elastase. Accordingly, the invention provides a method for the treatment of acute pancreatitis (e.g., stabilizing or improving Ranson score, Acute Physiology and Chronic Health Evaluation (APACHE II) score, or Balthazar computed tomography severity index (CTSI)) by administering to a subject having (or suspected of having) acute pancreatitis a protein which inhibits at least two of the proteases human trypsin 1, human trypsin 2, human pancreatic chymotrypsin, and human pancreatic elastase.

Cancer

A number of serine proteases are involved in cancer and tumorigenic pathways. Hepsin and matriptase are involved in the activation of hepatocyte growth factor. Hepatocyte growth factor (HGF) is a growth factor for many tumors and blocking its release will deprive the tumor cells of a needed growth factor. In addition, matriptase can also cleave pro-uPA into uPA, the enzyme responsible for the generation of plasmin from plasminogen. Plasmin is implicated in the activation of various matrix metalloproteinases (MMPs). Upon conversion from plasminogen to active plasmin, plasmin remains associated with the surface of a variety of epithelial cancer cells. MMP activity is important in cleaving a variety of matrix proteins that allow cancer cells to escape from one site or to invade at new sites. Endotheliases 1 and 2 have also been implicated in cancer.

Accordingly, the disclosure provides methods of treating (e.g., slowing, eliminating, or reversing tumor growth, preventing or reducing, either in number or size, metastases, reducing or eliminating tumor cell invasiveness, providing an increased interval to tumor progression, or increasing disease-free survival time) cancer (e.g., breast cancer (including Her2+, Her2-, ER+, ER-, Her2+/ER+, Her2+/ER-, Her2-/ER+, and Her2-/ER- breast cancer), head and neck cancer, oral cavity cancer, laryngeal cancer, chondrosarcoma, ovarian cancer, testicular carcinoma, melanoma, brain tumors (e.g., astrocytomas, glioblastomas, gliomas)) by administering an effective amount of an engineered multi-Kunitz domain protein of the invention that inhibits at least two proteases involved in cancer, for example at least two proteases from the group of human plasmin, human hepsin, human matriptase, human endotheliase 1, and human endotheliase 2.

potential site in Kunitz domain 3 is distant from the binding site and glycosylation at this site will not prevent binding to or inhibition of a protease.

Table 3 shows the cDNA that encodes TFPI. The DNA
The disclosed methods are useful in the prevention and treatment of solid tumors, soft tissue tumors, and metastases thereof. Solid tumors include malignancies (e.g., sarcomas, adenocarcinomas, and carcinomas) of the various organ systems, such as those of lung, breast, lymphoid, gastrointestinal (e.g., colon), and genitourinary (e.g., renal, urothelial, or testicular tumors) tracts, pharynx, prostate, and ovary. Exemplary adenocarcinomas include colorectal cancers, renal-cell carcinoma, liver cancer, non-small cell carcinoma of the lung, and cancer of the small intestine. Additional exemplary solid tumors include: fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, gastrointestinal system carcinomas, colon carcinoma, pancreatic cancer, breast cancer, genitourinary system carcinomas, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, endocrine system carcinomas, testicular tumor, lung carcinoma, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, and retinoblastoma. Metastases of the aforementioned cancers can also be treated or prevented in accordance with the methods described herein.

EXAMPLES

Example 1

Expression of TFPI in Eukaryotic Cells

Table 1 shows the amino acid sequence of LACI (also known as tissue-factor pathway inhibitor (TFPI)). Table 2 shows the amino acid sequence of the different Kunitz domains within LACI. The positions marked with “I” (11, 13, 15-19, 34, 39-40) are examples of residues that can be varied to change the protease specificity of the domain. Positions marked with “” (21 and 32) may also be varied.

The positions marked “S” can contribute to the structure of the Kunitz domain. In some implementations, these positions are kept invariant. In other implementations, they are varied, but to a limited degree, e.g., to allow for conservative substitutions.

LACI contains three potential N-linked glycosylation (NX(S/T)) sites. The Asn (N) residues are shown bold and underscored in Table 1. In Kunitz domain 2, N25 (numbering from the beginning of the domain) is a potential glycosylation site. This position is distant from the residues that contact serine proteases and glycosylation at this site is unlikely to interfere with protease binding or inhibition. Glycosylation of a second Asn in Linker 3 is also unlikely to affect activity. The third

encoding the putative signal sequence runs from base 133 to base 216 (underlined). Table 32 shows the cDNA for rabbit TFPI. There a shorter signal sequence is identified. The mature protein is encoded by bases 217-1044. Table 4 shows the annotated DNA sequence of a vector, pFTFPI V1, having a modified sequence of TFPI so that it would be expressed in CHO or HEK293T cells. The modifications provide restriction sites so that parts of TFPI can be readily altered. Other vectors and other signal sequences can be used. The DNA sequence has been modified to introduce a number of restriction sites so that parts of TFPI can be removed or modified easily. Altered bases are shown as uppercase, bold, or underscored. On each line of Table 4, every character that follows an exclamation point (!) is a comment. The naked DNA sequence in this table and other tables of similar appearance can be recovered by dropping the exclamation points and all characters that follow them on that line.

The DNA sequence in Table 4 can be constructed using the human TFPI cDNA and the oligonucleotides in Table 15. First the cDNA is used as template and ONMatureT and ON1019L27 are used to amplify the coding region, a 829 base segment. This DNA can be used as a template for ten PCR reactions that use primers 1: (ONSfiT, ONSachinL), 2: (ONSachinU, ONMluIL), 3: (ONMluIU, ONBBBL), 4: (ONBBBU, ONXhoBamL), 5: (ONXhoBamU, ONBsmBIL), 6: (ONBsmBIU, ONApaSalL), 7: (ONApaSalU, ONSac2L), 8: (ONSac2U, ONBspE2L), 9: (ONBspE2U, ONBssH2L), and 10: (ONBssH2U, ONfinalI). The ten products (which have 20 or more bases of overlap with one other product) are mixed and amplified with the primers ONSfiT and ONfinalI to produce a 874 base product. This nucleic acid is cut with SfiI and XbaI and ligated to a similarly cut expression vector.

Example 2

A TFPI Derivative that Inhibits h.pKal and Human Neutrophil Elastase

Human neutrophil elastase and human plasma kallikrein may both be involved in the inflammatory response associated with coronary by-pass surgery (CBPS). Thus a protein that inhibits both human neutrophil elastase and human plasma kallikrein may have synergistic beneficial effects when used for treating patients, e.g., before, during or after coronary by-pass surgery.

An inhibitor that is highly potent and specific for each of human plasma kallikrein (h.pKal) and human neutrophil elastase is encoded by the gene displayed in Table 9. Altered amino acids are shown bold and mutated bases are shown bold and uppercase. Restriction sites are shown bold. Any other signal sequence functional in eukaryotic cells could be used. This protein is named TF88890. This gene can be expressed in the same vector as was used for full-length TFPI shown in Table 4 or in any suitable expression vector. One cuts the expression vector with SfiI and XbaI and ligates the DNA shown in Table 9 to the digested vector DNA. Table 10 shows the oligonucleotides that can be used to make changes in TFPI Kunitz domain 1 and Kunitz domain 2. The procedure is as follows.

PCR of pFTFPI V1 with ON1U23 and ON718L21 gives a 522 base product, PNK1, comprising the mature sequence up to S174. Using PNK1 as template, ON1U23 and ONK1Bot give a 141 base product, PNK2, that includes changes around residues 13-20 of TFPI Kunitz domain 1. Within Kunitz domain 1, the changes are K15R, R17A, I18H, I19P, and F21W. PCR of PNK1 with ONK1T and ONK2B1 gives a 265 base product, PNK3, that overlaps the first product by 57 bases. This PCR introduces changes into Kunitz domain 2: I13P, R15I, G16A, Y17F, I18F, T19P, and Y21W. PCR of PNK1 with ONK2T1 and ONK2B2 gives PNK4 of length 104, which comprises an overlap of 51 bases with PNK3 and the changes in TFPI Kunitz domain 2. This introduces the changes R32L, K34P and L39Q. PCR of PNK1 with ONK2T2 and ON718L21 gives PNK5 with length 165. Mix PNK2, PNK3, PNK4, and PNK5 and use ONSfiTop and BotAmp to amplify a 573 base fragment. Cut the final PCR product with SfiI and XbaI and ligate to similarly cleaved expression vector. The new vector is called pFTFPI V2.

The molecule encoded by pFTFPI V2, TF88890, has the following expected characteristics. The first Kunitz domain will inhibit human plasma kallikrein and the second Kunitz domain will inhibit human neutrophil elastase (hNE). The mature protein should be 174 amino acids and may be glycosylated at N117 and N167. N117 is at position 26 in Kunitz domain 2. This site is located on a different side of the protein from the protease , and, accordingly, is not expected to interfere with protease binding. The N167 is in Linker 3 and is not expected to interfere with inhibition of hNE. The protein is terminated at S174. MMP-8 and MMP12 cleave wildtype TFPI at this site. Thus, the presence of a protein ending with this sequence is less likely to cause an

linker is used (e.g., for a protein with at least three Kunitz domains), the linkers between the Kunitz domains could be different.

For example, TF88890-2 is an alternative to the design of TF88890 and is shown in Table 20. TF88890-2 contains two copies of TFPI-Kunitz domain 1, the first modified to make it into an inhibitor of human plasma kallikrein (h.pKal) and the second modified to make it an inhibitor of human neutrophil elastase.

The amino acid sequence of mature TF88890-3 is shown in Table 21. TF88890-3 contains three copies of TFPI-Kunitz domain 1, the first and third modified to make it into an inhibitor of human plasma kallikrein and the second modified to make it an inhibitor of human neutrophil elastase. The instances of TFPI-Kunitz domain 1 are joined by hydrophilic, flexible linkers (GGSGGSSSSSG (SEQ ID NO:147)) which are not expected to provide any T-cell epitope. The first or third Kunitz domain could be modified to inhibit a different protease, giving a tri-specific inhibitor. The increased molecular mass is expected to increase the residence time in serum.

The amino acid sequence of mature TF88890-4 is shown in Table 22. TF88890-4 contains four copies of TFPI-Kunitz domain 1, the first and third modified to make it into an inhibitor of h.pKal and the second and third modified to make it an inhibitor of human neutrophil elastase. The instances of TFPI-Kunitz domain 1 are joined by hydrophilic, flexible linkers (GGSGGSSSSSG (SEQ ID NO:147)) which are not expected to provide any T-cell epitope. The linkers need not all be the same.
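Two of the sequence rules this description applies to its linkers are mechanical enough to check in a few lines. The description characterizes MHC ligands as containing medium-to-large hydrophobic residues (VILMFWY) or charged residues (EDRKH) and prefers linkers with neither, and it defines a potential N-linked glycosylation site as NX(S/T). The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and the exclusion of proline at the X position is a commonly applied refinement of the sequon rule that the description itself does not state. The two linker sequences are the ones given as SEQ ID NO:147 and SEQ ID NO:150.

```python
import re

HYDROPHOBIC = set("VILMFWY")  # class (a) named in the description
CHARGED = set("EDRKH")        # class (b) named in the description


def anchor_residues(linker: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue) for every residue in class (a) or (b)."""
    return [(i, aa) for i, aa in enumerate(linker.upper(), start=1)
            if aa in HYDROPHOBIC or aa in CHARGED]


# N-X-(S/T) sequon. The lookahead keeps overlapping sequons visible.
# Excluding Pro at X is an assumption beyond the NX(S/T) rule in the text.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(seq: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-(S/T) sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


if __name__ == "__main__":
    linkers = {
        "SEQ ID NO:147": "GGSGGSSSSSG",
        "SEQ ID NO:150": "GGSGNGSSSSGGSGSG",
    }
    for name, seq in linkers.items():
        print(name, "flagged residues:", anchor_residues(seq),
              "sequons at:", sequon_positions(seq))
```

A check like this only screens residue composition. It does not predict MHC binding, and a sequon marks a potential site rather than confirmed glycosylation.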
immune response because just such a peptide results from MMP-8 or MMP12 cleavage of wildtype TFPI. The expected molecular weight of TF88890 is 19 kDa plus the glycosylation. TF88890 is not expected to bind cell surfaces unlike TFPI because the region of TFPI implicated in cell-surface binding is absent. The molecular mass of TF88890 is approximately 14 kDa and would have a short residence in serum. A short residency may be useful for treating patients undergoing coronary bypass surgery. However, increased residency time can be engineered, e.g., by attaching one or more polymer moieties (e.g., PEG moieties).

Alternative designs for TF88890 are shown below in Tables 20-25. An additional alternative design is to use the sequence shown in Table 9 with bases 508-579 deleted. This construct would be TF88890S and would comprise 150 amino acids with two potential N-linked sites of glycosylation. TF88890S would inhibit human plasma kallikrein and human neutrophil elastase. The sequences in these tables are exemplary mature amino acid sequences. To produce a protein including these sequences by secretion, a suitable signal sequence can be added for expression in eukaryotic cells (such as mammalian (e.g., CHO or HEK293T) or yeast cells (e.g., Pichia pastoris)), or bacterial cells (such as E. coli or Caulobacter crescentus).

The protein can be designed so that it does not include more than two glycosylation sites, e.g., the protein can be devoid of N-linked glycosylation sites.

Kunitz domains can be joined with a linker, e.g., a hydrophilic, flexible linker (e.g., by the linker: GGSGGSSSSSG (SEQ ID NO:147)). Linkers that lack T-cell epitopes are particularly useful. All ligands for MHC molecules contain at least one or two amino-acid types picked from: (a) medium to large hydrophobic (VILMFWY (SEQ ID NO:148)), or (b) charged (EDRKH (SEQ ID NO:149)). Accordingly, linkers (e.g., of twelve amino acids) that do not contain any of these amino acids avoid creation of an MHC-I neo-epitope, including epitopes that may arise from juxtaposition of residues from adjoining Kunitz domains. Where more than one

One could have all four Kunitz domains directed to different proteases.

The amino acid sequence of mature TF88890-5 is shown in Table 23. TF88890-5 contains four copies of TFPI-Kunitz domain 1, the first and third modified to make it into an inhibitor of h.pKal and the second and third modified to make it an inhibitor of human neutrophil elastase. The instances of TFPI-Kunitz domain 1 are joined by hydrophilic, flexible linkers (GGSGNGSSSSGGSGSG (SEQ ID NO:150)) which are not expected to provide any T-cell epitope. The linkers shown each contain an N-linked glycosylation site (NGS). If produced in eukaryotic cells, TF88890-5 should carry glycosylation at three sites which should increase solubility and serum residence time. The linkers need not all be the same. One could have all four Kunitz domains directed to different proteases.

Example 3

An Inhibitor of Human Plasmin, h.pKal, and Human Neutrophil Elastase Based on TFPI

Table 16 shows the amino acid sequence of a protein derived from human TFPI and named PlaKalE101. The sequence KGGLIKTKRKRKKQRVKIAYEEIFVKNM (SEQ ID NO:151) is deleted from the C-terminus of w.t. TFPI; this sequence is involved with binding to cells and the effects that TFPI has on gene expression. Removal of the sequence and changes in Kunitz domain 3 provides a protein that neither binds cells nor modulates cellular gene expression. In Kunitz domain 1, we have changed residue 15 of Kunitz domain 1 from Lys to Arg. At residue 17, we have changed Ile to Arg. At residue 18, Met is changed to Phe. At residue 19, we change Lys to Asp. At residue 21 we change Phe to Trp. These changes produce a Kunitz domain identical to DX-1000, a potent inhibitor of human plasmin.

In Kunitz domain 2 we make the following changes (all denoted by position within the Kunitz domain): I13P, G16A, Y17A, I18H, T19P, Y21W, K34I, and L38E. These changes make a Kunitz domain that is very similar to DX-88 in the part of the Kunitz domain that contacts human plasma kallikrein. This domain is highly likely to be a potent and specific inhibitor of human plasma kallikrein.

In Kunitz domain 3, we make the following changes: L13P, R15I, N17F, E18F, N19P, K34P, S36G, G39Q. Thus, Kunitz domain 3 is very similar to DX-890, which is a highly specific inhibitor of human neutrophil elastase. As reported by Piro and Broze (Circulation 2004, 110:3567-72), removal of the highly basic C-terminal tail greatly reduces binding to cells. Piro and Broze also reported that changing R15 of Kunitz domain 3 to Leu greatly reduced cell binding.

A gene encoding PlaKalE101 can be constructed as follows. Oligonucleotides shown in Table 17 are used as primers with the cDNA of human TFPI (shown in Table 3). The oligonucleotides Get TFPI top and ON937L24 are first used to amplify the coding sequence from codon 1 to codon 276, thus truncating the 3' terminal codons. The 828 base product is called PKE P1 and is used as template for six PCR reactions: a) Get TFPI top and MutK1:310L43 make PKE P2 (220 bases), b) MutK1:310U43 and MutK2:517L56 make PKE P3 (263 bases), c) MutK2:517U56 and MutK2:577L47 make PKE P4 (107 bases), d) MutK2:577U47 and MutK3:792L55 make PKE P5 (270 bases), e) MutK3:792U55 and MutK3:867L31 make PKE P6 (106 bases), and f) MutK3:867U31 and ONLowCapPKE make PKE P7 (114 bases). PKE P2 through PKE P7 are mixed and PCR amplified with primers ONLowCapPKE and ONNcoI to give a 861 base product having NcoI and XbaI sites at the ends. This fragment is cut with SfiI and XbaI and ligated to similarly cleaved expression vector having an expression cassette as shown in Table 9 (e.g., pFcFuse) for expression in eukaryotic cells.

Example 4

A Modification of PlaKalE101 that is Resistant to Cleavage by MMPs

Known substrates for MMP-8 are shown in Table 42. The preference for P at P3, G at P1, and I or L at P1' should be noted. Known substrates for MMP-12 are shown in Table 43.

TABLE 43
Some known substrate cleavages for MMP-12

Substrate  P4 P3 P2 P1 P1' P2' P3' P4'  SEQ ID NO:
1  R P Q W K D T  157
2  G A M L E A I  158
3  E A I P M S I P  159
4  L W E A L Y L W  160
5  E A L Y L W C G  161

Substrates
1: alpha1-proteinase inhibitor (Homo sapiens)
2: alpha1-proteinase inhibitor
3: alpha1-proteinase inhibitor
4: insulin B chain (oxidized)
5: insulin B chain (oxidized)
Reference for all: Shapiro, S. D., Hartzell, W. O. & Senior, R. M. Macrophage elastase. In Handbook of Proteolytic Enzymes, 2nd edn (Barrett, A. J., Rawlings, N. D. & Woessner, J. F. eds), p. 540-544, Elsevier, London (2004)

Table 30 shows the amino acid sequence of PlaKalE102. This protein differs from PlaKalE101 in that codon 49 is changed from a codon encoding Leu to a codon encoding Pro, codon 112 is changed from a codon encoding Ile to a codon encoding Pro, and codon 203 is changed from a codon encoding Thr to a codon encoding Pro to prevent cleavage by MMP-8 or MMP-12. Many proteases have difficulty cleaving before Pro and none of the known substrates of MMP-8 or MMP-12 have Pro at the P1' position.

Example 5

Half Life, Metalloproteinases, and Cysteine Proteases

The half-life of a protein (by way of example, the protein PlaKalE101) may be extended by methods including: 1) PEGylation, 2) fusion to Fc (e.g., the Fc domain of an IgG) at either the amino or carboxy terminus of Fc (e.g., as
PlaKalE101::Fc or Fc::PlaKalE101), 3) fusion to human serum albumin (or a fragment thereof) at the amino or car TABL E 42 45 boxy terminus (e.g., as PlaKalE101::hSA or hSA:: Substrates of MMP-8 (From MEROPS (http:// PlaKalE101, or 4) making multimers of the protein until the meCOs. sance. ac: uk/ molecular mass is large enough to give an acceptable half life. SEQ Table 18 shows the amino acid sequence of a PlaKalEl:: ID 50 GGSGGSGGSGGSGG (SEQID NO:162):hSA fusion pro Substrate P4 P3 P2 P1 P1 P2' P3' P4" Ref NO : tein. Although the molecules are large, the half life is very 1. G772 P O G I A G Q 1 152 likely to be on the order of weeks, similar to IgGs or hSA. A PlaKalE101 dimer is shown in Table 19. After Linker 5, 2 Dpn P Q G I Q G 1 153 a 13 amino acid glycine-rich hydrophilic linker avoiding 3 Dpn P L G L. W. A Dr 1 154 55 known antigenic sequences is inserted. This is followed by another copy of the mature PlaKalE101 sequence. The result 4. G P Q G I W. G Q 1 155 ing protein has a 509 mature amino acid sequence, PKEPKE02. The increased molecular mass is expected to 5 Mica P L G L. Dpa A. R. 1 156 give the molecule a much longer half-life than PlaKalE101. Substrates: 1: Collagen alpha 1 (I) chain (Homo sapiens); 60 Cleavage after G775. PKEPKE02 and related proteins are expected to have 2: Dnp-P-Q-G-I-I-A-G-Q- d.R-OH (SEQ ID NO: 2O1) molecular masses in excess of 60 kDa and may have half lives of several days. PKEPKE02 has two inhibitory sites for each of human Ref 1 = Tsche sche, H., Pieper, M. & Wenzel, H. Neutrophill plasmin, h.pKal, and human neutrophil elastase. The Kunitz Collagenase. In Handbook of Proteolytic Enzymes, 2 edin (Bar 65 rett, A. J., Rawlings, N. D. & Woessner, J. F. eds), p. 480-486, domains of this protein may be further engineered to create a Elsevier London (2004) protein that specifically inhibits up to six different serine proteases. 
Additionally, the protein may be modified by US 8,828,703 B2 35 36 fusion with one or more short peptide sequences that inhibit: strand PCR primers. ONUATF890 and ONUCTF890 are top a) other serine protease, b) metalloproteinases, or c) cysteine Strands and ONLBTF890 is a lower Strand. The dsDNA is cut proteinases. with Bsu36I which cleaves each strand after CC. Thus the A gene encoding PKEPKE02 is inserted after a functional overlap is 5'-TCA-3" in the upper strand and 5'-TGA-3" in the signal sequence in a vector Suitable for expression in mam lower Strand. This means that ligations will put top strands malian cells or yeast, e.g., secretory expression. For example, together. Had we used an enzyme that produces a palindromic a DNA sequence encoding PKEPKE02 is inserted between overhang, top and bottom Strands would be randomly joined. the Sfil and Xbal sites of an expression vector having an After cleavage, the long inserts are ligated into a vector hav expression cassette similar to the one shown in Table 9. It is ing a compatible Bsu36I site under conditions that favor also possible to employ the signal sequence as shown in Table 10 insertion of multiple inserts. Using priming sites outside the 9 and follow the amino acid sequence shown with one or two Bsu36I insertion site, we screen isolates to find those with 3, stop codons, as in Table 9. 4, 5, 6, 7, or 8 inserts. The vector can be designed to allow Table 31 shows the sequence of Elsix, a protein comprising expression in eukaryotic cells, preferably mammalian cells. six KunitZ domains designed to inhibit human neutrophil As shown in Table 28, the codons for residues S3-F20, I25 elastase. The sequence is derived from a dimer of TFPI. The 15 R47, and C51-X h are varied in a way that maintains the mutations are indicated in bold. In KunitZ domain 1 the muta encoded KunitZ domain amino acid sequence but maximized tions are K151, I17F, M18F. K19P, and E39Q. In Kunitz the DNA diversity. 
Residues in the linker can be either G or S. domain 2, the mutations are I13P R15I, G16A, Y17F, I18F, Thus, chains of units will have only short segments of exact T19P, and L39Q. In Kunitz domain 3, the mutations are L13P. identity. R15I, N17F, E18F, N19PS36G, and G39Q. Kunitz domain 4 is a repeat of KunitZ domain 1: Kunitz domain 5 is a repeat of Example 6 Kunitz domain 2, and Kunitz domain 6 is a repeat of Kunitz domain 3. The linker that joins the two halves is composed of Stabilization of Kunitz, Domains Gly and Ser. There are six possible glycosylation sites shown as n, one at N25 of Kunitz domain 2, one in the Linker2 at 25 KunitZ domains with disulfide bonds can oligomerize at N17, one at N44 in Kunitz domain 3, and three more in the high concentration and 37° C. on prolonged storage. Oligo second copy of modified TFPI. merization may result form disulfide exchange between Table 24 shows the mature amino acid sequence of TF890 monomers. The C14-C38 disulfide is exposed on the surface 6. This molecule has six identical KunitZ domains, each of canonical Kunitz domains and can exchange with cysteines designed to inhibit human neutrophil elastase. TFPI Kunitz 30 on other Kunitz domains to form intermolecular bonds. domain 1 is modified by the mutations K15I, I17F. M18F Accordingly, KunitZ domains that inhibit target proteases but K19P, and E39Q. There are more than 1. E24 DNA sequences are less likely to form oligomers can be designed by removal that encode this amino acid sequence. Preferably, each of the C14-C38 disulfide. In particular, the cysteine residues instance of the Kunitz domain would be different to avoid at these positions can be mutated to amino acids other than possible genetic instability. The linkers are each twelve amino 35 cysteine, e.g., alanine, glycine, or serine. In one embodiment, acids long and are composed of glycine and serine. The length the mutations are C14G or C38V. 
Such a mutated Kunitz is such that the carboxy terminal amino acid of one Kunitz domain may be less stable with respect to melting but more domain in conjunction with the amino terminal amino acids stable with respect to multimerization. of the next Kunitz domain would not form an epitope for TFPI Kunitz domain 1 can be engineered into a potent MHC-I. Peptides composed entirely of glycine and serine are 40 inhibitor of human neutrophil elastase by the mutations unlikely to make tight-binding MHC-I epitopes. K151, I17F. M18F. K19P, and E39Q. To eliminate the 14-38 For expression by secretion, a sequence encoding a signal disulfide, the mutations C14G and C38V as found by Hagi sequence can be added before the DNA that encodes TF890 hara et al. (Hagihara, et al., 2002, J Biol Chem, 277:51043-8) 6. For eukaryotic expression by secretion, it is possible to add may be made. This could decrease the binding to human a eukaryotic signal sequence and express the protein in Pichia 45 neutrophil elastase. Table 26 shows a display cassette for pastoris or mammalian cells. Since TF890-6 includes 408 TFPI Kunitz domain 1 in which the mutations K15I, I17F, amino acids, the molecular mass is approximately 44.8 KDa M18F, and K19P have been made in order to confer affinity which could give the protein an increased serum residence for human neutrophil elastase. In addition, positions 12, 13. time relative to proteins having fewer Kunitz domains. 14, 38, and 39 are allowed to be any amino acid except Cys. Table 25 shows the amino acid sequence of TF890-8 which 50 This gives a diversity of 2,476,099. Note that NHK encodes comprises 548 amino acids and would have an approximate amino acids FLSYPHQIMTNKVADE (SEQ ID molecular mass of 60.3 KDa. TF890-8 contains eight Kunitz NOS:146 and 202) while BGG encodes amino acids WRG. domains of identical amino acid sequence. 
The linkers are By using both of these codons, we allow all amino acids made of glycine and serine and are all of length 12 amino except Cys. ONTAAAXC has a NcoI restriction site, and acids. This molecule is monospecific for human neutrophil 55 codons up to T27 with NHK at codons 12, 13, and 14. elastase, but may have a much greater half-life than DX-890 ONTAABXC covers the same range except that 12 and 13 are (a single Kunitz domain with molecular mass of approxi NHK and 14 is BGG.ONTABAXC has BGG at 13 and NHK mately 7 KDa). at 12 and 14. ONTABBXC has BGG at both 13 and 14 and To avoid having multiple repeats of identical DNA, one can NHK at 12. ONTBAAXC, ONTBABXC, ONTBBAXC, and engineer the gene for (TF890-linker), (n can be 1 to 10) as 60 ONTBBBXC all have BGG at 12 and NHK or BGG at 13 and follows. Table 28 shows DNA that encodes a short linker, a 14 in the same pattern as the first four oligonucleotides. Thus modified version of TFPI Kunitz domain 1 (numbered 1-58), 12, 13, and 14 are allowed to be any AA except C. and more linker (lettered a-l). These sequences are flanked by ONLoAAXC covers codons R20 to E49 (where the Xbal site a primer annealing sequence that includes sequences that is) plus scab DNA. Because ONLOAAXC is a lower strand, it facilitate restriction enzymes digestion. At the end of Table 28 65 has the reverse complements of NHK and BGG, namely are oligonucleotides that can be used with PCR to produce MDN and ccV. Table 27 shows oligonucleotides that would dsDNA. ONUPTF890 and ONLPTF890 are upper and lower implement this strategy. The eight long oligonucleotides are US 8,828,703 B2 37 38 mixed and the short oligonucleotides are used as primers to ONrK2T2 and ONrK3B1 to produce PrTF5, a 236 base oli generate DNA that extends from NcoI to Xbal. The PCR gonucleotide. PrTF1 is used as a template with ONrK3T1 and product is cleaved with NcoI and Xbal and is ligated to a ONrK3B2 to produce PrTF6, a 99 base oligonucleotide. 
version of DY3P82 containing TFPI Kunitz domain 1 or a PrTF1 is used as a template with ONrK3T2 and ONrXbalL to derivative thereof having NcoI and Xbal sites. Table 29 con produce PrTF7, a 117 base oligonucleotide. PrTF2-7 are tains the annotated sequence of DY3P82TFPI1, a phage vec mixed and ONrSfi and ONrXbalL are used as primers to tor that displays TFPI Kunitz domain1 on the stump of M13 produce PrTF8, a 790 base oligonucleotide which is cut with III. DY3P82TFPI1 has a wildtype M13 iii gene and a display Sfil and Xbaland ligated to a similarly cut expression vector. gene regulated by P. So that a) expression of the display can Table 40 shows the amino acid sequence of mature RaTF be suppressed with glucose and induced with IPTG, and b) 10 PKE04, a derivative of rabbit TFPI. RaTFPKE04 has all of the most phage display one or fewer KunitZ domain::III stump mutations of RaTFPKEO1 and has C14G and C38V in each of proteins. This allows distinguishing very good binders from the KunitZ domains. The purpose is to make a protein that is merely adequate binders. We select binders to human neutro less likely to aggregate than RaTFPKE01 might be. phil elastase from the library of phage. Since DX-890 binds Table 35 shows the amino acid sequence of RaTFPKE02, a human neutrophil elastase with K, of about 3 pM, very strin 15 509 AA protein formed by joining two copies of mature gent washes are used. In addition, DX-890 may be used as an RaTFPKE01 with a linker-GGSGSSGSGGSGSSG (SEQID eluting agent. NO:163). The mature protein should have molecular mass of It is expected that Kunitz domains that lack the 14-38 about 56 Kda plus glycosylation. RaTFPKE02 should have a disulfide and that inhibit target proteases can be selected. long half-like in rabbits and should not be immunogenic. Once an amino acid sequence is determined that gives high Table 41 shows the amino acid sequence of mature RaTF affinity binding and inhibition of human neutrophil elastase, PKE05. 
RaTFPKE05 is similar to RaTFPKEO2 but in each one can construct a library of multimers as given in Example Kunitz domain, C14 is changed to G and C38 is changed to V. 5. This approach can be applied to any Kunitz domain and to RaTFPKE05 is less likely to aggregate than is RaTFPKE02. any target protease, and to plural target proteases. For example, if human plasma kallikrein is the target, it is pos 25 Example 8 sible to vary the sequence of DX-88 at positions 12, 13, 14, 38, and 39 excluding only Cys from the varied amino acids. An Inhibitor for hNE, hPr3, and hCatG Example 7 Neutrophils release three serine proteases when activated: 30 neutrophil elastase (hNE, also known as leukocyte elastase), A Derivative of Rabbit TFPI for PK and proteinase 3 (Pr3), and cathepsin G (CatG). To block the Immunogenicity Studies damage caused by excessive neutrophil activation, one can use an agent that inhibits two or more of these proteases, e.g., Table 32 shows the amino acid sequence of rabbit TFPI. all three proteases, each with high affinity and specificity. Table 33 shows the cDNA of rabbit TFPI. Table 34 shows 35 Kunitz-domain that inhibit human neutrophil elastase are RaTFPKE01, an engineered derivative of rabbit TFPI. In known. Inhibitors of CatG can be obtained by selection from Kunitz domain 1, the mutations are Y17R, I18F. K19D, and a library of Kunitz domains (U.S. Pat. No. 5,571,698). Inhibi F21W which are designed to confer human plasmin inhibi tors to PR3 could be obtained from a Kunitz domain library in tory activity on KunitZ domain 1. In Kunitz domain 2, the a similar manner. mutations are I13P. G.16A, Y17A, I18H, T19P, and L39E 40 In one embodiment, the amino acid resides that confer Pr3 which are designed to confer inhibition of h.pKal on Kunitz inhibitory activity can then be transferred to another Kunitz domain 2. In Kunitz domain 3, the mutations are L13P. 
Q15I, domain, e.g., a Kunitz domain of a multi-domain protein, e.g., N17F, E18F, I19P, S36G, and G39Q which are designed to a protein that includes three KunitZ domains. The amino acid confer human neutrophil elastase inhibition on Kunitz residues that confer CatG inhibitory activity can likewise be domain 3. The C-terminal basic peptide has been truncated. 45 transferred to another Kunitz domain of the protein. The This protein can be produced in CHO cells in the same man amino acid residues that confer hNE inhibitory activity can be ner as PlaKalE101. The protein should have a molecular mass transferred to the third domain. of about 30 KDa plus glycosylation. There are six possible In another embodiment, Kunitz domains that inhibit neu NX(S/T) sites but one is NPT and is unlikely to be glycosy trophil elastase, proteinase 3, and cathepsin G are physically lated. Because we have altered the third Kunitz domain and 50 associated with each other, e.g., by forming a fusion protein removed the lysine-rich C-terminal peptide, RaTFPKE01 that includes the different Kunitz domains. should have reasonable half-life and should not bind to cells. In some embodiments, the engineered protein does not Table 36 shows a gene that encodes RaTFPKE01. The bind to cells and can have one or more of the following sequence is arranged to show: a) signal sequence, b) first attributes (e.g., two, three or four of the following): (a) ability linker from rabbit TFPI, c) Kunitz domain 1 with cysteine 55 to function in the presence of neutrophil defensins, (b) ability codons underlined and mutations shown uppercase bold, d) to resist degradation by MT6-MMP (Leukolysin), (c) ability the second linker, e) Kunitz domain 2, f) third linker, g) to inhibit proteases bound to anionic surfaces, (d) ability to Kunitz domain 3, and g) final linker. Table 37 shows oligo reduce production of IL-8 from neutrophils and TNFC. 
and nucleotides that can be used to construct the gene of Table 36 IL-1B from monocytes. from rabbit TFPI cDNA. One first uses rabbit TFPI cDNA and 60 ONrSfi and TruncLo to produce a 776 base product, PrTF1. Example 9 PrTF1 is used as a template with ONrSfi and ONrK1Bot to produce PrTF2, a 162 base oligonucleotide. PrTF1 is used as Pharmacokinetics in Animals a template with ONrK1 Top and ONrK2Bot1 to produce PrTF3, a 239 base oligonucleotide. PrTF1 is used as a tem 65 The following methods can be used to evaluate the phar plate with ONrK2To1 and ONrK2B2 to produce PrTF4, a 103 macokinetics (PK) of proteins such as proteins that include base oligonucleotide. PrTF1 is used as a template with Kunitz domains in animals, e.g., mice and rabbits. US 8,828,703 B2 39 40 The protein to be tested is labeled with iodine ("I) using Example 10 the iodogen method (Pierce). The reaction tube is rinsed with reaction buffer (25 mM Tris, 0.4MNaCl, pH 7.5). The tube is A Gene that Encodes TIFP Minus the Positively emptied and then replaced with 0.1 ml of reaction buffer and Charged C-Terminal Region 12 uL of carrier free iodine-126, about 1.6 mCi. After six minutes, the activated iodine is transferred to a tube contain Table 50 shows a gene that has been optimized for expres ing the protein to be tested. After nine minutes, the reaction is Sion in CHO cells and which encodeshTFPImC1.hTFPImC1 terminated with 25 uL of saturated tyrosine solution. The comprises: a) a signal sequence, b) a short linker to foster reaction can be purified on a 5 ml D-salt polyacrylamide 6000 cleavage by signal peptidase, c) a six-his tag to facilitate column in Tris/NaCl. HSA can be used to minimize sticking purification of the encoded multi-Kul Dom protein, d) a throm to the gel. 10 bin cleavage site, e) codons corresponding to TFPI 2-270. A sufficient number of mice (about 36) are obtained. The This gene can be expressed in mammalian cells, such as CHO weight of each animal is recorded. 
In the case of mice, the cells. Codons 271-304 have been eliminated. Residues 271 animals are injected in the tail vein with about 5 ug of the 304 are involved with causing TFPI to bind to cell surfaces, to protein to be tested. Samples are recovered at each time point be internalized, and to alter gene expression. These effects are per animal, with four animals per time point, at approxi 15 unwanted in a therapeutic protease inhibitor, hTFPImC1 can mately 0, 7, 15, 30, and 90 minutes, 4 hours, 8 hours, 16 hours, be purified on a copper-chelating resin via the His6 tag. The and 24 hours post injection. Samples (about 0.5 ml) are col His tag is removable by treatment with thrombin, which lected into anti-coagulant (0.02 ml EDTA). Cells are spun should cleave after R38. down and separated from plasma/serum. Samples can be Table 53 shows the DNA of a vector that contains the gene analyzed by radiation counting and SEC peptide column on of Table 50 so that it can be expressed in eukaryotic cells. HPLC with inline radiation detection. Table 56 shows the amino-acid sequence of hTFPImC1, The For rabbits, the material is injected into the ear vein. ASn residues where N-linked glycosylation is expected are Samples can be collected at 0, 7, 15, 30.90 minutes, 4, 8, 16, shown as bold N with a “*” above. 24, 48,72, 96, 120, and 144 hours post-injection. Samples can hTFPImC1 was expressed in HEK293T cells transformed be collected and analyzed as for mice. 25 with the vector shown in Table 53. The protein was purified by Data can be fit to a bi-exponential (equation 1) or a tri affinity chromatography, and a sample was analyzed by gel exponential (equation 2) decay curve describing “fast'. chromatography, which showed an apparent molecular mass “slow', and “slowest phases of in vivo clearance: of ~42.5 kDa. 
y=Aexp(-Cut)+Bexp(-Bt) Equation 1 Cell surface binding of hTFPImC1 was tested by flow 30 cytometry, which showed no detectable (above background) Equation 2 cell Surface binding. Where: Example 11 y Amount of label remaining in plasma at time t post-ad ministration 35 A Modified TFPI, hTFPIPpKhNE A=Total label in “fast clearance phase B=Total label in “slow clearance phase Table 51 shows an annotated gene for hTFPIPpKhNE. C=Total label in “slowest clearance phase hTFPIPpKhNE has the first KulDom of TFPI modified so C="Fast clearance phase decay constant that it will inhibit human plasmin (hPla), the second Kul Dom B="Slow clearance phase decay constant 40 modified so that it will inhibit human plasma kallikrein (h.p- Y="Slowest clearance phase decay constant Kal), and the third Kul Dom modified so that it will inhibit t=Time post administration human neutrophil elastase (hNE). To achieve these specifici The C. 3, and Y phase decay constants can be converted to ties, the following mutations have been made. In the first half-lives for their respective phases as: KuDom, we made the mutations K15R, I17R, M18F, K19D, 45 and F21W to confer anti plasmin activity. In the second C. Phase Half-life=0.69(1/C) KuDom, we made the mutations I13P. G.16A, Y17A, I18H, T19PY21W, K34I, and L39E to confer antiplasmakallikrein B Phase Half-life=0.69(1/B) activity. In the third Kul Dom, we made the mutations L13P. R15I, N17F, E18F, N19P, K34P S36G, and G39Q to confer Y Phase Half-life=0.69(1/y) 50 anti human neutrophil elastase activity. Table 54 shows the In the case where the data are fit using the bi-exponential DNA sequence of a vector that allows expression of the gene equation, the percentages of the total label cleared from in shown in Table 51 in eukaryotic cells such as CHO cells. Vivo circulation through the C. and B phases are calculated as: Table 57 shows the amino-acid sequence of hTFPIPpKhNE with annotation. For each of the Kuoms, the w.t. 
AAS are 55 shown below the actual AA sequence. Because the protein has been truncated and the third Kuom has been modified, it is not expected to bind to cell surfaces (see Example 10, above). It is expected that hTFPIPpKhNE will strongly inhibit In the case where the data are fit using the tri-exponential human plasmin (h.Pla), human plasma kallikrein (h.pKal), equation, the percentages of the total label cleared from in 60 and human neutrophil elastase (hNE). Vivo circulation through the C. and B phases are calculated as: Example 12 Pla301B, a Triple Inhibitor of Human Plasmin 65 Table 52 shows a gene that encodes Pla301 B, a protein % Y Phase=C(A+B+C)x100 derived from human TFPI and having three Kul Doms, each of US 8,828,703 B2 41 42 which is designed to inhibit h.Pla. Table 55 shows the FEBSLett. 2005579(9): 1945-50.) (the first Kunitz domain of sequence of a vector that allows the expression of Pla301B in HAI-1B is identical to the first Kunitz domain of HAI-1). eukaryotic cells. Table 58 shows the amino-acid sequence of HAI-2, domain 1 inhibits hepsin (Parr and Jiang, IntJ Cancer. Pla301B. To confer antih.Pla activity on Pla301B, we made 2006 Sep. 1; 119(5):1176-83.). Table 61 shows an amino-acid the following mutations. In the first Kul Dom, we made the sequence of PlaMathep, a protein designed to inhibit plas mutations K15R, I17R, M18F, K19D, F21W. In the second Kul Dom, we made the mutations I13P. G.16A, Y17R, I18F, min, matriptase, and hepsin. In the first Kul Dom of TFPI, we T19D, Y21W, and K34I. In the third Kudom, we made the have mutated K15 to R, I17 to R, M18 to F, K19 to D, and F21 mutations L13P. N17R, E18F, N19D, F21W, K34I, S36G, to W. This corresponds to DX-1000 (a potent inhibitor of and G39E. Each Kul Dom should now be a potent inhibitor of plasmin) shown in Table 7. In the second Kul Dom, we have h.Pla. 10 mutated I13 to R. Y 17 to S, I18 to F, T19 to PY21 to W. 
E31 Human plasmin is known to be activated by urokinase-type to K, R32 to S, and K34 to V. The mutated domain is shown in (uPA). uPA is often bound to cell Sur Table 7 as TFPI-K2matr with w.t. AA in lower case and faces by the urokinase-type plasminogen activator receptor mutations in upper case. These changes should make the (uPA-R). h.Pla that is activated by the uPA-uPA-R complex 15 second domain of PlaMathep be a potent inhibitor of often binds to the plasmin receptor (Pla-R). matriptase. In the third domain of TFPI, we have made the Pla301B was expressed in HEK293T cells with the vector mutations R11 to V, L13 to R, N17 to S, E18 to M, N19 to P. shown in Table 55. The protein was assayed for plasmin F21 to W. K34 to V, S36 to G, E42 to S, and K48 to E. The inhibition and was found to be active. K48E mutation will reduce the chance of binding glycoami Example 13 noglycans. These changes are shown in Table 7 as TFPI K3hep and should make the third domain of PlaMatHepbe a A Multiple Kul Dom Inhibitor that Inhibits Plasmin, potent inhibitor of hepsin. A gene to express the protein Matriptase, and Hepsin shown in Table 61 is readily constructed from the genes of other TFPI derivatives shown herein. HAI-1 inhibits matriptase (Herter et al., Biochem J. 2005 390(Pt 1):125-36.) via Kunitz domain I (Kirchhofer et al., Tables TABLE 1.

Human LACI (TFPI) (SEQ ID NO: 1)

AAs      Function  AA sequence

1-28     Signal    MIYTMKKVHALWASVCLLLNLAPAPLNA
29-49    Linker 1  DSEEDEEHTIITDTELPPLKL
50-107   KuDom 1   MHSFCAFKADDGPCKAIMKRFFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
108-120  Linker 2  NANRIIKTTLQQE
121-178  KuDom 2   KPDFCFLEEDPGICRGYITRYFYNNQTKQCERFKYGGCLGNMNNFETLEECKNICEDG
179-212  Linker 3  PNGFQVDNYGTQLNAVNNSLTPQSTKVPSLFEFH
213-270  KuDom 3   GPSWCLTPADRGLCRANENRFYYNSVIGKCRPFKYSGCGGNENNFTSKQECLRACKKG
271-304  Linker 4  FIQRISKGGLIKTKRKRKKQRVKIAYEEIFVKNM

TABLE 2

AA sequences of the KuDoms in human LACI (TFPI)

AAs      Name     AA sequence                                                 SEQ ID NO:

50-107   KuDom 1  MHSFCAFKADDGPCKAIMKRFFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD  3
121-178  KuDom 2  KPDFCFLEEDPGICRGYITRYFYNNQTKQCERFKYGGCLGNMNNFETLEECKNICEDG  4
213-270  KuDom 3  GPSWCLTPADRGLCRANENRFYYNSVIGKCRPFKYSGCGGNENNFTSKQECLRACKKG  5
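The point mutations in the Examples (e.g., K15I) are numbered by position within a 58-residue KuDom, so the sequences of Table 2 can serve as a consistency check on that notation. The following is a minimal sketch, not part of the disclosed methods: the function names are hypothetical, and the sequences are typed in as we read them from Table 2 (an assumption wherever the print is unclear). It confirms the domain lengths and the six cysteine positions, then applies the human neutrophil elastase mutation set K15I, I17F, M18F, K19P, and E39Q to KuDom 1, refusing any mutation whose stated wild-type residue does not match the sequence.

```python
import re

# KuDom sequences as read from Table 2 (human TFPI, SEQ ID NOs: 3-5).
# Assumption: letters were transcribed by hand from the printed table.
KUDOMS = {
    "KuDom 1": "MHSFCAFKADDGPCKAIMKRFFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD",
    "KuDom 2": "KPDFCFLEEDPGICRGYITRYFYNNQTKQCERFKYGGCLGNMNNFETLEECKNICEDG",
    "KuDom 3": "GPSWCLTPADRGLCRANENRFYYNSVIGKCRPFKYSGCGGNENNFTSKQECLRACKKG",
}


def cysteine_positions(seq):
    """Return the 1-based positions of every Cys in a domain."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "C"]


def apply_mutations(seq, mutations):
    """Apply point mutations written as <wild type><position><new>, e.g. 'K15I'.

    Positions are 1-based within the domain. A ValueError is raised when the
    stated wild-type residue does not match the sequence, which catches
    numbering mistakes.
    """
    residues = list(seq)
    for mut in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mut)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mut}")
        wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if residues[pos - 1] != wild_type:
            raise ValueError(f"{mut}: found {residues[pos - 1]} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


# Mutation set described in the text for making KuDom 1 an hNE inhibitor.
HNE_MUTATIONS = ["K15I", "I17F", "M18F", "K19P", "E39Q"]
kudom1_hne = apply_mutations(KUDOMS["KuDom 1"], HNE_MUTATIONS)

if __name__ == "__main__":
    for name, seq in KUDOMS.items():
        print(name, len(seq), cysteine_positions(seq))
    print(kudom1_hne)
```

The wild-type check is the useful part of the design: a mutation quoted with the wrong position (for example, one that names Leu at position 38 of KuDom 2, where the sequence has Cys) raises an error instead of silently producing a wrong sequence.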

TABL E 3 cDNA of human LACI (TFPI) (SEQ ID NO: 2) LOCUS NM_006.287 1431 bp mRNA linear PRI O8 - JUN-2005 DEFINITION Homo sapiens tissue factor pathway inhibitor (lipoprotein-associated coagulation inhibitor) (TFPI), mRNA. ACCESSION NM_006.287 WERSION NM_006287.3 GI: 40254845 sig peptide 133 . .216 /gene=TFPI' /note= lipoprotein-associated coagulation inhibitor Signal peptide" mat peptide 217. ... 1044 /product= lipoprotein-associated coagulation inhibitor" ORIGIN ggcgggtctg. Cttctaaaag aagaagtaga gaagataa at CCtgtctt.ca atacctggaa 6 ggaaaaacaa aataacct ca actic cqttitt gaaaaaaa.ca ttccaagaac titt catcaga 12 gattitt actt agatgattta cacaatgaad aaagtacate cactittgggc titctgitatec

18 citgctgctta atcttgcc cc tocc cctott aatgct9att Ct9aggalaga talaga acac

24 acaattatca cagatacgga gttgccacca citgaaactta togcatt catt ttgtgcattc

3O aaggcggatg atggcc catg taaa.gcaatc atgaaaagat ttittcttcaa tattitt cact 36 cgacagtgcg aagaattitat at atggggga tigtgaaggala at Cagaatcg atttgaaagt

42 Ctggaagagt gcaaaaaaat gtgtacaaga gataatgcaa acaggatt at aaaga caa.ca 48 ttgcaacaag aaaa.gc.ca.ga tittctgcttt ttggaagaag atcCtggaat atgtcgaggit

54 tat attacca gg tatttitta taacaatcag acaaaacagt gtgaacgttt caagtatggit 60 ggatgcctgg gcaatatgaa caattittgag acactggaag aatgcaagaa catttgttgaa

66 gatggit Coga atggitttcca ggtggata at tatggaac cc agct caatgc tigtgaataac 72 to cotgactic cqcaat caac calaggttc.cc agcc tittittgaattitcacgg tocct catgg

78 tgtctic actic cagoagacag aggattgttgt cqtgccalatg agaacagatt Ctact acaat 84 to agt cattg ggaaatgc.cg cc catttalag tacagtggat gtgggggaaa taaaacaat

90 tt tact tcca aacaagaatgtctgagggca totaaaaaag gttt catcca aagaatat ca 96 aaaggaggcc taattaaaac caaaagaaaa agaaagaa.gc agagagtgaa aatagcat at 1 O2 gaagaaattt ttgttaaaaa tatgttgaatt tdttatagoa atgtaacatt aattic tacta

108 aatattittat atgaaatgtt to actatoat tittctattitt tottctaaaa totgttittaat 114 taatatgttc attaaattitt citatgctitat td tacttgtt atcaac acgt ttgtat caga 12O gttgcttitt c taatcttgtt aaattgctta ttctaggit ct gtaatttatt aactggctac

126 tgggaaatta cittattittct ggat citat ct g tattitt cat ttaact acaa attat catac 132 taccggctac atcaaatcag to ctittgatt coatttggtg accatctgtt tdagaatatg

138 at catgtaaa tdattatcto ctittatagcc totalaccaga ttaa.gc.cccc c
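The feature coordinates in Table 3 (sig peptide 133..216, mat peptide 217..1044) can be checked against the residue numbering of Table 1 with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the codon table is the standard genetic code, and the 24-base fragment is our reading of bases 217-240 from the printed sequence (an assumption, since the print is imperfect). It counts the codons in each feature and translates the first codons of the mature peptide.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def feature_codons(start, end):
    """Number of codons in a 1-based, inclusive GenBank feature range."""
    length = end - start + 1
    if length % 3:
        raise ValueError("feature length is not a multiple of three")
    return length // 3


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


signal_len = feature_codons(133, 216)    # sig peptide feature of Table 3
mature_len = feature_codons(217, 1044)   # mat peptide feature of Table 3

# Assumed reading of bases 217-240, the start of the mature peptide.
first_mature = translate("gattctgaggaagatgaagaacac")

if __name__ == "__main__":
    print(signal_len, mature_len, signal_len + mature_len)
    print(first_mature)
```

Under these assumptions the signal feature spans 28 codons and the mature feature 276 codons, 304 in total, which agrees with the residue ranges 1-28 and 29-304 of Table 1, and the translated fragment matches the first eight residues of Linker 1.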

TABLE 4 - continued

The annotated DNA sequence of pFTFPI V1 72 O gag gala gat gala gala cac aca att at C aca gat acg gag Ctic cca SacI. . .

Linker 1...... KuDom 1 ...... 18 19 2 O 21 22 23 24 25 26 27 28 29 30 31. 32 P L K L M1 H S C5. A K A. D D11 765 cca citg aaG citt atg cat tca ttt togt gca ttcaag gog gat gat HindIII (1/1) NsiI. . . (1/3) ! Secondary cleavage site of MMP-8 Cleavage site of MMP-12 (Belaaouai, et al., 2 OOO, J Biol Chem, 275: 27123-8

KuDom 1 ...... 33 34 35 36 37 38 39 4 O 41 42 43 44 45 4 6 47 G P C14 K. A. I M K. R2O F N I 810 ggc cca tdt aaa goa atc atgaaa aga titt titc titc aat att titc BspHI. . .

KuDom 1 ...... 48 49 SO 51 52 53 54 55 56 5.7. 58 59 6 O 61 62 T R Q C3 O E E I Y G G C38 E G N 855 acG CT cag tec gala gaa titt at a tat ggg gga tigt gaa gg.T aaC Mlul. . . BstEII. . .

KuDom 1 ...... 63 64 65 66 6f 68 69 70 f1 72 73 74 75 f6 77 Q N R. F45 E S L E ESO C K K M. Ciss T 9 OO cag aat cqa titt gala TCt c td gaa gag tigc aaa aaa atg togt aca BstEII. . BspDI. . . BsrGI. . BSaBI......

Kuom 1 Linker 2...... 78 79 80 81 82 83 84 85 86 87 88 89 9 O 91 92 R D58 N A. N R I I K T T L Q Q E 945 aga gat aat goa aac agg attata aag aca aca ttg caa caa gaa ! Cleavage site of MMP-12 (B elaaouai, et al., 2000, iol Chem, 275: 27 123-8)

Kudom 2 ...... 93 94 95 96 97 98 99 1 OO 1O1 1 O2 103 104 105 106 107 K P D Cs L E E D P G I C14 R 990 aag cca gat ttic togc titt Ctc gag gag gat cott gga ata tgt Ca XhoI. . . BamHI. . .

Kudom 2 ...... 108 109 11 O 113 114 115 G Y I Y Y ggit tat att tat t t t t at ... (not U)

Kudom 2 ...... 123 124 125 126 127 128 129 13 O E R K Y G G C38 gaa cqt titc aag tat ggt giga tigc

Kudom 2 ...... Linker 3 138 139 14 O 141 142 1.43. 144. 145 146 147 148 149 1 SO 151 152 E T L E 5.1 K N I Css E 58 P N gag acG Ctg gala gala to aag aac att tt gala gat g ccC aat BSm3I. . . . - - - - - ApaI...... Bsp12 OI. . . .

Linker 3...... 153 154 155 156. 157 158 159 160 16.1. 162 163 164 1.65 166 167 G F Q W D N Y G T Q L N A. W N-k ggt ttic cag gtC gaC aat tat gga acc cag citc. aat get gtgaat Sall. . .

Linker 3......
168 169 170 171 172 173 174 175 176 177 178 179 180 181 182
 N   S   L   T   P   Q   S   T   K   V   P   S   L   F   E
aac tcc ctg act ccg caa tca acc aag gtt ccc agc ctt ttt gaa
Cleavage site of MMP8
Cleavage site of MMP12 (Belaaouai, et al., 2000, J Biol Chem, 275: 27123-8)

TABLE 4 - continued
The annotated DNA sequence of pFTFPI V1

Linker 3 KuDom 3...... 183 184 185 186 187 188 1.89 19 O 191. 192 193 194 195 196 197 H titt cac ggit coc to a tigg tdt Ct c act coC gcG gac aga gga ttg SacII. .

KuDom 3...... 198 1.99 2 OO 201 2 O2 2O3 2O4. 205 206 2. Of 208 209 21 O 211. 212 C14 R A. N E S W I G28 tgt cqt gcc aat gag aac aga t t c tac tac aat to a gtc att ggg

KuDom 3...... 213 214 215 216 217 218 219 22O 221 222 223 224 225 226 227 K C3 O R P Y S E N aaa togc cgc cca titt aag taT TCC gga tigt ggggga aat gala aac Bsp EI. .

KuDom 3...... 228 229 Nk F L R A C55 K. 1395 aat titt ctg. CdC gca tdt aaa BssHII. .

Linker 4 ...... 243 244 245 246 247 248 249 250 251 25.2 253 254. 255 256 257 I Q R I K T K R t to atc. caa aga at a tca aaa gga ggc cta att aaa acc aaa aga

Linker 4 ...... 258 259 26 O 261. 262 263. 264. 265 266 267 268 269 270 271. 272 K R K. K. Q I A. Y E E I aaa aga aag aag cag aga gtgaaa at a gca tat gaa gala att titt

Linker 4 273 2.74 275 276 W K N M (SEQ ID NO: 7). 53 O gtt aaa aat atg taa tda TCTAGA Xball. . 554 AGCTC GCTGATCAGC CTCGACTGTG CCTTC TAGTT GCCAGCCATC TGTTGTTTGC

609 CCC. TCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT TTCCTAATAA

669 AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT CTATTCTGGG GGGTGGGGTG

729 GGGCAGGACA GCAAGGGGGA GGATTGGGAA GACAATAGCA GGCATGCTGG GGATGGCCCG

789 GGCTCTATGG CTTCTGAGGC GGAAAGAACC AGCTGGGGCT CTAGGGGGTA TCCCCACGCG

849 CCCTGTAGCG GCGCATTAAG CGCGGCGGGT GTGGTGGTTA CGCGCAGCGT GACCGCTACA

909 CTTGCCAGCG CCCTAGCGCC CGCTCCTTTC GCTTTCTTCC CTTCCTT T.C.T. CGCCACGTTC

96.9 GCCGGCTTTC CCCGTCAAGC TCTAAATCGG GGCATCCCTT TAGGGTTCCG ATTTAGTGCT

TTACGGCACC TCGACCCCAA AAAACTTGAT TAGGGTGATG GTTCACGTAG TGGGCCATCG

CCCTGATAGA CGGTTTTTCG CCCTTTGACG TTGGAGTCCA CGTTCTTTAA TAGTGGACTC

2149 TTGTTCCAAA CTGGAACAAC AC CAACCCT ATCTCGGTCT ATTCTTTTGA TTTATAAGGG

22 O9 ATTTTGGGGA TTTCGGCCTA TTGGTTAAAA AATGAGCTGA TTTAACAAAA ATTTAACGCG

2269 AATTAATTCT GTGGAATGTG CAGTTAG GGTGTGGAAA GTCCCCAGGC TCCCCAGGCA

2329 GGCAGAAGTA TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCAGGTGTGG AAAGTGCTCA

2389 GGCTCCCCAG CAGGCAGAAG TA GCAAAGC ATGCATCTCA ATTAGTCAGC AACCATAGTC

2449 CCGCCCCTAA CTCCGCCCAT CCCGCCCCTA ACTCCGCCCA GTTCCGCCCA TTCTCCGCCC

2509 CTAGGCTGAC TAATTTTTTT TA TTATGCA GAGGCCGAGG CCGCCTCTGC CTCTGAGCTA

2569 TTCCAGAAGT AGTGAGGAGG C T TTTTGGA GGCCTAGGCT TTTGCAAAAA GCTCCCGGGA

26.29 GGTCCACAAT GATTGAACAA GA GGATTGC ACGCAGGTTC TCCGGCCGCT TGGGTGGAGA

2689 GGCTATTCGG CTATGACTGG GCACAACAGA CAATCGGCTG CTCTGATGCC GCCGTGTTCC

TABLE 8
DX-88 gene

! DX88 gene
      a   b   1   2   3
      E   A   M   H   S
     gag gct atg cac tct
     ...... mature DX88 ...... derived from human LACI-K1

      4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
      F   C   A   F   K   A   D   D   G   P   C   R   A   A   H
1224 ttc tgt gct ttc aag gct gac gaC GGT CCG tgc aga gct gct cac
     RsrII

     19  20  21  22  23  24  25  26  27  28  29  30  31  32  33
      P   R   W   F   F   N   I   F   T   R   Q   C   E   E   F
1269 cca aga tgg ttc ttc aac atc ttc acg cga caa tgc gag gag ttc

     34  35  36  37  38  39  40  41  42  43  44  45  46  47  48
      I   Y   G   G   C   E   G   N   Q   N   R   F   E   S   L
1314 atc tac ggt ggt tgt gag GGT AAC Caa aac aga ttc gag tct cta
     BstEII

     49  50  51  52  53  54  55  56  57  58
      E   E   C   K   K   M   C   T   R   D   (SEQ ID NO: 200)
1359 gag gag tgt aag aag atg tgt act aga gat (SEQ ID NO: 199)
     LACI domain------
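The restriction sites annotated in TABLE 8 can be located programmatically. The following is a minimal sketch, not part of the original disclosure: the function name `find_sites` is hypothetical, the two DNA fragments are the editor's transcription of codons 4-18 and 34-48 of TABLE 8, and the recognition sequences CGGWCCG (RsrII) and GGTNACC (BstEII) are the commonly published ones.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(dna: str, site: str) -> list[int]:
    """Return the 0-based start of every match of an IUPAC recognition site."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    # The lookahead reports overlapping matches as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", dna.upper())]


# Assumed transcriptions of two rows of TABLE 8 (spaces between codons removed).
row_4_18 = "ttctgtgctttcaaggctgacgacggtccgtgcagagctgctcac"
row_34_48 = "atctacggtggttgtgagggtaaccaaaacagattcgagtctcta"

print(find_sites(row_4_18, "CGGWCCG"))   # RsrII
print(find_sites(row_34_48, "GGTNACC"))  # BstEII
```

Each fragment contains exactly one match, at the position where the uppercase bases appear in the table.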

TABLE 9 Gene encoding TF88890 for expression in pFTFPI V2 ! Human mature TFPI (1-174) with changes in KuDoml and KuDom2 Signal sequence from pPoFuse3 ...... M G W S C I I L L W A. 1. aTG Giga tigg agc tigt atc atc ctic titc ttg g to gog Niru . . . Signal sequence...... T A. T G A. H S 37 acg gcc aca ggg gcc Cac toc NruI. . Sfil...... Bgll...... EcoQ109 I. ApaI. . . . (Bsp12 OI)

Mature modified TFPI 1. 2 3 4. 5 6 7 8 9 10 11 12 13 14 15 D S E E D E E H T I I T D T E 58 gat tot gag gaa gat gala gaa cac aca att atc aca gat acg gag

KuDom 1 ...... 16 17 18 19 2 O 21 22 23 24 25 26 27 28 29 30 L P P L K L M H S C A. K A. 103 ttg cca cca citg aaa citt atg cat tca ttt tdt goa titc aag gog Nsi. . .

KuDom 1. . . . Now like DX-88 31. 32 33 34 35 36 37 38 39 4 O 41 42 43 44 45 D D G P C R A. A. H P R W N 148 gat gat ggc cca togt aGa gca GCC CAT CCT aga togg titc ttic aat

46 47 48 49 SO 51 52 53 54 55 5.6 s 58 59 60 I T R Q C E E I Y G G C E 193 att tt C act ca cag tec gala gala titt at a tat ggg gga tigt gala

61. 62 63 64 65 66 6f 68 69 70 f1 72 73 74 75 G N Q N R E S L E E C K K M 238 gga aat cag alAT CGA Ttt gala agt ctd gaa gag togc aaa aaa atg BspDI. . .

End KuDom1 .... Linker ......
 76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
  C   T   R   D   N   A   N   R   I   I   K   T   T   L   Q
283 TGT ACA aga gat aat gca aac agg att ata aag aca aca ttg caa
    BsrGI..

TABLE 9 - continued
Gene encoding TF88890 for expression in pFTFPI V2

KuDom 2 ...... 92 93 94 95 96 97 98 99 1 OO 1 O1 1O2 103 104 105 E K P D Cs L E E D P G12 P13 caa gala aag cca gat ttic tgc ttt ttg gaa gaa gat Cct gga CCa

Kuom 2 . . . . Now like DX-890...... 106 1. Of 108 109 11O 111 112 113 114 115 116 117 118 119 12O C14 A F F P R Y N N O T K 373 tgt A.TC got tTt Titt Coc agg tGG titt tat aac aat cag aca a.a.a.

121 1 22 123 124 125 126 127 128 129 13 O 131 132 133 134 135 Q C E L P Y C Q G N M N 418 cag t gt gala cTt tto CCg tat ggt giga tgc cAg aat atg aac

KuDom 2 ...... 136 1 37 138 139 14 O 141 142 143 144 145 146 147 148 149 150 N E T L E E C N I C E D G 463 aat it tt gag a Ca Ctg gala gala tgc aag aac att tgt gala gat ggt

Linker ...... 151 1, 52 153 154 155 156 157 158 1.59 16 O 161 162 163. 164 1.65 P N G Q W D G T Q L N A. 508 ccg a at ggit tto Cag gat aat tat gga a CC cag Ct c aat get

Linker ...... 16 6 1 67 168 169 17O 171 172 173 174 W N N S L T P (SEQ ID NO : 11) 553 gtg a at aac t cc Ctg act cc.g caa toa Cleavage site of MMP8 Cleavage site of MMP-12 (Belaaoua, et al., 2 OOO, J Biol Chem, 275: 27123-8) taa ta totaga (SEQ ID NO: 10) Xbal. . .

Peptide Mol 2 OO 61

TABL E 1O TABLE 1 O - continued

ONs for Remodeling of KuDom 1 and KuDom 2 ONs for Remode ing of Kuldom 1 and KuDom 2 of mature TFPI (1-174) 40 (All ONs are written 5' to of mature TFPI (1-174) (All ONs are written 5 to 1 ON1U23 GAT TCT GAG GAA GAT (SEQ ID NO: 12) GAA GAA CA 45 7 ONK2B2 CAT ATT GCC Ct GCA (SEQ ID NO: 18) 2 ON71821 TGA TTG CGG AGT CAG (SEQ ID NO: 13) TCC ACC GAA GGA GTT ATA Cogg AaG TTC ACA CTG TTT 3 ONK1Bot GAA ATT GAA GAA (SEQ ID NO: 14) ccA. atg agg 50 AAA CAG TGT, GAA CitT (SEQ ID NO: 19) TGC ACA TGG GCC TTC ccG TAT GGT GGA ATC CGC CTT TGC CaG GGC AAT ATG 4 ONK1T AAG GAT GAT GGC (SEQ ID NO: 15) CCA AgA GCA 9 ON SfilTop TTAATCTTGCGGCCACAGG (SEQ ID NO: 55 cat AGA Tgg TTC GGCCCACTCT GAT TCT TTC ATT TTC GAG GAA. GAT GAA GAA 5 ONK2B1 T GT ATA AAA ccA. (SEQ ID NO: 16) CA CCT GGg AAa Aaa AgC gat ACA. Tgg TCC AGG 60 1.O BotAmp CAAAAAGGCT TCTAGA (SEQ ID NO: 21) ATC TTC TT TCA TTA TGA TTG CGG AGT CAG GGA GTT AA GAA. GAT CCT GGA (SEQ ID NO: 17) ccA. TGT atc GCT TT tTT ccC AGG Tgg TTT 65 Mutations are shown lowercase bold. TAT. AAC A Restriction sites are UPPERCASE BOLD.

US 8,828,703 B2 73 74 TABL E 14 TABLE 14- Continued ONs to make library of TFPI2 KuDom2 ONs to make library of TFPI2 KuDom2 (SEQ ID NO: 26) GTA CGA GTT CTG 3 #4 (SEQ ID NO: 29) s' CCTCCAGAGTTGACAGAATC CGG AAA CCG GTT CTC MNY ON147L21 (SEQ ID NO: 27) MYY GTT MYY MINN, GCA ACC ACC AKA MNN GAA CTT, TTC CCT CCA GAG TTG ACA GAA TCC 3 GCA AGT CAT AGA GCT CAG GTT GA 3 ON1U96 (SEQ ID NO: 28) 4. x 20' x 2 x 16 3. f. GGG TCT, TCT GTA CGA GTT CTG CAG GTTTCT GTT RRK 10 14 AA sequences RRK NNK. TGT NNK. GST INNK NNK NNK ARG TAT TTT TTC 8' x 32 x 2 x 16 1.8 16 DNA sequences AAC CTG AGC TCT ATG ACT TGC GAA 3 Fraction without stops O Only TAG allowed.
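The diversity figures quoted for the library oligonucleotides (numbers of DNA and amino-acid sequences, and the remark that only TAG is allowed as a stop) follow from expanding each degenerate codon. The sketch below is illustrative only and assumes the standard genetic code and standard IUPAC ambiguity codes; the function names `expand` and `summarize` are hypothetical.

```python
from itertools import product

# IUPAC ambiguity codes used in the variegated codons (NNK, RRK, GST, ...).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(codon: str) -> list[str]:
    """List every concrete codon covered by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in codon.upper()))]


def summarize(codon: str) -> tuple[int, int, list[str]]:
    """Return (DNA codons, distinct amino acids, stop codons) for a degenerate codon."""
    codons = expand(codon)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(amino_acids), stops


print(summarize("NNK"))  # (32, 20, ['TAG'])
print(summarize("RRK"))  # (8, 7, [])
print(summarize("GST"))  # (2, 2, [])
```

Multiplying the first number over all variegated positions gives the count of DNA sequences, and multiplying the second gives the count of amino-acid sequences.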

TABLE 15
ONs for construction of modified tfpi gene shown in TABLE 4.

No.  Name  Sequence (5'-to-3')  SEQ ID NO:  U/L  Len

ONMature gat tot gag gaa gat gaa gala Ca 3 O 829 T

ON101.9L2 aca tat titt taa Cala a.a.a. titt citt 31 7 Cat

ONS fiT ttaatc.ttgcg gcc a Ca 999 gcc CaC 32 95 to c gac tot gag gaa gat gala gala

ONSacHin Cat aag Ctt cag tgg GaG c to 33 L cgt at C

ONSacHin gat acg gag Citc. C Ca C Ca Ctg aaG 34 12O U citt atg

ONMlul ttic gca citg Acg Cdt gala a.a. 35

ONMIU tt titc acG cgT cag togc gaa 36 76

ONBBBL to ttc. Cag agA titc aaa tog att 37 ctg. Gitt Acc titc aca toc

ONBBBU gga tigt gala gg.T aaC cag aat 38 142 Ciga titt gala TCt cts gala ga

ONXhosam t to c agg atc ctic ct c gag a.a.a. 39 L gca gala a

ONXhosam t titc tdc titt Citc. gag gag gat 4 O 142 U cott gga a

ONBsmBIL ttic titc cag Cg t c to aaa a 41

ONBSmBIU t titt gag acG citg gaa gaa 42 74

ONApaSall cata att Gitc Gac ctd gaa acc 43 L att Ggg Coc atc titc a

ONApaSall t gala gat ggG ccCaat ggit ttic 44 148 U cag g to gaC aat tat g

ONSac2L to C tot gtC cqc ggg agt gag a 45

ONSac2U t ct c act coc goggac aga gga 46 1 O2

ONBspE2L cc aca. tcc GGA Ata citt aaa td 47

ONBspE2U cattt aag taT TCC gga tigt gigg g 48 74

ONBssH2L titt aca tyc GcG cag acatt c t 49

21 ONBssH2U a gala tot ctg. CC goa tit aaa. SO 147

22 ONfinal L. ttgTtTCTaGaTCAt Tacatatttittaacaa 51 aaattt Cttcat US 8,828,703 B2 75 76 TABLE 1.5 - continued ONs for construction of modified tfpi gene shown in TABLE 4.

SEQ ID No Name Sequence (5'-to-3') NO: UAL Len 3 ONS fi ttaatcttgcg gCC aca ggg gcc cac 52 U 874 to C gac tot gag gaa gat gaa gala 26 ONfinal L ttgTtTCTaGaTCAt Tacatatttittaacaa 53 L aaattt Cttcat

The primers are upper strand (U) or lower strand (L), as indicated in the U/L column. "Len" indicates the length of the product obtained with human tfpi cDNA as template and the U-L pair of primers. Uppercase bases indicate departures from the published cDNA sequence. Restriction sites are shown in bold. Where two restriction sites overlap, one is shown underscored.
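Because lower-strand (L) primers are written 5'-to-3' on the opposite strand, comparing them with the annotated gene requires taking the reverse complement. The following sketch is an illustration, not part of the original disclosure. The function names are hypothetical, and the two primer strings are the editor's reading of the OCR-damaged MluI primer pair (SEQ ID NOS: 35 and 36), so they should be treated as assumptions.

```python
# Translation table that complements bases while preserving letter case,
# since uppercase marks departures from the published cDNA sequence.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, keeping case."""
    return seq.translate(COMPLEMENT)[::-1]


def to_upper_strand(primer: str, strand: str) -> str:
    """Express a primer on the upper strand; strand is 'U' or 'L'."""
    cleaned = primer.replace(" ", "")
    if strand == "U":
        return cleaned
    if strand == "L":
        return reverse_complement(cleaned)
    raise ValueError("strand must be 'U' or 'L'")


# Assumed readings of the lower- and upper-strand MluI primers.
lower = "ttc gca ctg Acg Cgt gaa aa"
upper = "tt ttc acG cgT cag tgc gaa"

print(to_upper_strand(lower, "L"))
print(to_upper_strand(upper, "U"))
```

Under these readings the two primers are exact reverse complements of each other, and both carry the palindromic MluI site ACGCGT.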

TABLE 16
Modified Human TFPI = PlaKalEl01 (SEQ ID NO: 54)

AAs      Function                 AA sequence
1-21     Linker 1                 DSEEDEEHTIITDTELPPLKL
22-79    KuDom 1, inhibit plasmin MHSFCAFKADDGPCRARFDRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
80-92    Linker 2                 NANRIIKTTLQQE
93-150   KuDom 2, inhibit h.pKal  KPDFCFLEEDPGPCRAAHPRWFYNnQTKQCERFIYGGCEGNMNNFETLEECKNICEDG
151-184  Linker 3                 PNGFQVDNYGTQLNAVnNSLTPQSTKVPSLFEFH
185-242  KuDom 3, inhibit h.NE    GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFPYGGCQGNENnFTSKQECLRACKKG
                                  1   5    1    1    2    2    3    3    4    4    5    5  5
                                           0    5    0    5    0    5    0    5    0    5  8
243-248  Linker 4                 FIQRIS

Possible sites of glycosylation are shown as 'n'.
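The lowercase 'n' residues mark possible N-linked glycosylation sites. As a hedged illustration, the sketch below locates them with the conventional N-X-S/T sequon rule (X being any residue other than proline); that rule is an assumption of this example and is not stated in the table. The function names `glyco_sites` and `mark` are hypothetical, and the three sequences are taken from the KuDom 2, Linker 3, and KuDom 3 rows of the modified TFPI table.

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead allows overlapping candidates such as N-N-S.
SEQUON = re.compile(r"(?=(N[^P][ST]))", re.IGNORECASE)


def glyco_sites(seq: str) -> list[int]:
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


def mark(seq: str) -> str:
    """Write the sequence in uppercase with sequon Asn residues as 'n'."""
    sites = set(glyco_sites(seq))
    return "".join(
        ch.lower() if pos in sites else ch
        for pos, ch in enumerate(seq.upper(), start=1)
    )


kudom2 = "KPDFCFLEEDPGPCRAAHPRWFYNNQTKQCERFIYGGCEGNMNNFETLEECKNICEDG"
linker3 = "PNGFQVDNYGTQLNAVNNSLTPQSTKVPSLFEFH"
kudom3 = "GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFPYGGCQGNENNFTSKQECLRACKKG"

for name, seq in [("KuDom 2", kudom2), ("Linker 3", linker3), ("KuDom 3", kudom3)]:
    print(name, glyco_sites(seq), mark(seq))
```

With these inputs the rule reproduces the positions shown as 'n' in the table: residue 25 of KuDom 2, residue 17 of Linker 3, and residue 44 of KuDom 3.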

TABL E 17 TABL E 17- continued 45 ONs for construction of gene encoding PlaKalElO1 ONs for construction of gene encoding PlaKalElO1 (SEO ID NO : 55) #1 Get TFPI top SEO ID NO: 60 5. ATG ATT TAC ACA ATG AAG AAA GTA CAT 3 (SEQ ) so #6 MutK2: 517L56 (SEO ID NO. 56) 5 GT CTG ATT GTT ATA AAA CCA CCT GGG ATG AGC AGC E2 ON937L24 TCG ACA AGG TCC AGG ATC TTC 3' s' TGA TAT TCT TTG GAT GAA ACC TTT 3 (SEQ ID NO : 61) (SEO ID NO : 57) 7 MutK2:577U47 55 E3 MutK1:31OU43 5. CAG TGT GAA CGT TTC ATT TAT GGT GGA TGC GAG GGC s' GAT GGC CCA TGT CGT GCA CGC TTC GAT AGA TGG. TTC AAT ATG AAC AA 3 TTC AAT A 3'

(SEO ID NO. 58) (SEQSEO ID NO : 62 ) E4 MutK1:31 OL143 8 MutK2:577L47 5 T ATT GAA GAA CCA TCT ATC GAA. GCG TGC ACG ACA 60 5 TT GTT CAT ATT GCC CTC GCA TCC ACC ATA AAT GAA TGG GCC ATC 3' ACG TTC ACA CTG 3 ''

(SEO ID NO. 59) (SEQ ID NO: 63) 5 MutK2: 517U56 9 MutK3: 792U55 s' GAA GAT CCT GGA CCT TGT CGA GCT GCT CAT CCC AGG 65 5' A GCA GAC AGA. GGA CCT TOT ATT GCC TTT TTT CCC TGG TTT TAT. AAC AAT CAG AC 3' AGA TTC TAC TAC AAT TCA GTC 3'

TABLE 19 - continued
Modified Human TFPI dimer = PKEPKE02 (SEQ ID NO: 70)

AAs      Function   AA sequence
185-242  KuDom 3    GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFKYGGCQGNENnFTSKQECLRACKKG
                    1   5    1    1    2    2    3    3    4    4    5    5  5
                             0    5    0    5    0    5    0    5    0    5  8
243-248  Linker 4   FIQRIS
249-261  Linker 5   GGSGSSGGGSSGG
262-282  Linker 1   DSEEDEEHTIITDTELPPLKL
283-340  KuDom 1    MHSFCAFKADDGPCRARFDRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
341-353  Linker 2   NANRIIKTTLQQE
354-411  KuDom 2    KPDFCFLEEDPGPCRAAHPRWFYNnQTKQCERFIYGGCEGNMNNFETLEECKNICEDG
412-445  Linker 3   PNGFQVDNYGTQLNAVnNSLTPQSTKVPSLFEFH
446-503  KuDom 3    GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFKYGGCQGNENnFTSKQECLRACKKG
504-509             FIQRIS

Possible sites of glycosylation are shown as 'n'.

TABLE 20
TF88890-2 (SEQ ID NO: 71)

AA No.    Function          AA sequence
1-58      pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
59-70     Linker            GGSGGSSSSSGG
71-128    h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD

TABLE 21
TF88890-3 (SEQ ID NO: 72)

AA No.    Function          AA sequence
1-58      pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
59-70     Linker            GGSGGSSSSSGG
71-128    h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
129-140   Linker            GGSGGSSSSSGG
141-198   pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD

TABLE 22
TF88890-4 (SEQ ID NO: 73)

AA No.    Function          AA sequence
1-58      pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
59-70     Linker            GGSGGSSSSSGG
71-128    h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
129-140   Linker            GGSGGSSSSSGG
141-198   pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
199-210   Linker            GGSGGSSSSSGG
211-268   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD

TABLE 23
TF88890-5 (SEQ ID NO: 74)

AA No.    Function          AA sequence
1-58      pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
59-75     Linker            GGSGNGSSSSGGGSGSG
76-133    h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
134-150   Linker            GGSGNGSSSSGGGSGSG
151-208   pKal inhibition   MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
209-225   Linker            GGSGNGSSSSGGGSGSG
226-283   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD

TABLE 24
TF890-6 (SEQ ID NO: 75)

AA No.    Function          AA sequence
1-58      h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
59-70     Linker            GGSGGSSSSSGG
71-128    h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
129-140   Linker            GGSGGSSSSSGG
141-198   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
199-210   Linker            GGSGGSSSSSGG
211-268   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
269-280   Linker            GGSGGSSSSSGG
281-338   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
339-350   Linker            GGSGSSSGSSSG
351-408   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD

TABLE 25
TF890-8 (SEQ ID NO: 76)

AA No.    Function          AA sequence
1-58      h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
59-70     Linker            GGSGGSSGSSGG
71-128    h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
129-140   Linker            GGSGGSSSSSGG
141-198   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
199-210   Linker            GGSGGSGSSSGG
211-268   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
269-280   Linker            GGSGGSSSSGSG
281-338   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
339-350   Linker            GGSGSSSGSSSG
351-408   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
409-420   Linker            GSGSSGSSGSSG
421-478   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
479-490   Linker            GGSSSGSGSGSG
491-548   h.NE inhibition   MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
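The residue ranges in the multimer tables follow directly from concatenating 58-residue Kunitz domains with the stated linkers. The sketch below is illustrative and not part of the original disclosure: the helper name `assemble` is hypothetical, and the domain and linker strings are the editor's reading of the pKal-inhibiting domain, the h.NE-inhibiting domain, and the 12-residue linker in the TF88890-3 table.

```python
# Assumed readings of the building blocks used in the multimer tables.
PKAL = "MHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD"
HNE = "MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD"
LINKER = "GGSGGSSSSSGG"


def assemble(parts):
    """Concatenate (label, sequence) parts.

    Returns the full sequence and a list of (first, last, label) rows
    using 1-based inclusive residue numbering.
    """
    rows = []
    position = 0
    for label, seq in parts:
        rows.append((position + 1, position + len(seq), label))
        position += len(seq)
    return "".join(seq for _, seq in parts), rows


# Layout of TF88890-3: pKal domain, linker, h.NE domain, linker, pKal domain.
tf88890_3 = [
    ("pKal inhibition", PKAL),
    ("Linker", LINKER),
    ("h.NE inhibition", HNE),
    ("Linker", LINKER),
    ("pKal inhibition", PKAL),
]

full_sequence, layout = assemble(tf88890_3)
for first, last, label in layout:
    print(f"{first}-{last}\t{label}")
print(len(full_sequence))
```

The printed ranges match the AA No. column of the TF88890-3 table, and the same helper can be applied to the longer constructs by extending the parts list.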

TABLE 26 Library for removal of 14-38 disulfide ! LACI-D1 display cassette for removal of 14-38 disulfide Signal sequence of M13 iii ------fM K K L L A. I P L W W P Y gtgaaa aaa tta tta titc gca att cot tta gtt gtt cct titc tat

S M A. A. E M H S tcC. ATG God gcc gag ATGICATItcc titc Nico Nsi

C5. A K A D1 O D X12 X13 X14 I15 A. P R2O F tgc gcc titc aag got gac gaC GGT CCG tot a TT gct TTC TTc CCt cqt titc

N I25 F T R Q C30 E E I Y G G X38 ttic titc aac att titc ACG CGT cag togc gag GAA TTC att tac gigt gigt tdt Mlul EcoRI

X39 G4 O N Q N R. F45 E S L E49 gaa GGT AAC Cag aac CGG Ttc gaa to T CTA Gag BstEII AgeI Xbal Asu

E C K K M C T R D S A. S S A. (SEO ID NO : 78) gaa tdt aag aag atg tdt act cqt gat tot GCT AGC tot gct (SEO ID NO : 77) Nhe an anchor protein such as III stump follows.

equimolar A, C, G, and T equimolar A, C, and T

TABLE 29 - continued
A phage vector to display TFPI KuDom 1 and derivatives.

87.22 aca Cat tact ca ggc att go a titt a.a.a. ata t at gag ggt tot aaa aat titt

8773 tat cott tgc gtt gaa ata aag gCt tot c cc gca aaa gta tta Cag ggit Cat

8824 aat gtt titt ggit aca acc gat tta gct tta togc tict gag gct tta ttg citt

8875 aat titt gct aat t ct ttg cct togc Ctg tat gat tta ttg gat gtt (SEQ ID NO : 100) gene II continues

TABLE 30
Modified Human TFPI = PlaKalEl02 (SEQ ID NO: 102)

AAs      Function                 AA sequence
1-21     Linker 1                 DSEEDEEHTIITDTELPPLKP
22-79    KuDom 1, inhibit plasmin MHSFCAFKADDGPCRARFDRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
80-92    Linker 2                 NANRPIKTTLQQE
93-150   KuDom 2, inhibit h.pKal  KPDFCFLEEDPGPCRAAHPRWFYNnQTKQCERFIYGGCEGNMNNFETLEECKNICEDG
151-184  Linker 3                 PNGFQVDNYGTQLNAVnNSLTPQSPKVPSLFEFH
185-242  KuDom 3, inhibit h.NE    GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFPYGGCQGNENnFTSKQECLRACKKG
                                  1   5    1    1    2    2    3    3    4    4    5    5  5
                                           0    5    0    5    0    5    0    5    0    5  8
243-248  Linker 4                 FIQRIS

Mutations from w.t. are BOLD, UPPERCASE. Possible glycosylation sites are shown as 'n'.

TABLE 31
Modified Human TFPI dimer = ELsix (SEQ ID NO: 103)

AAs      Function   AA sequence
1-21     Linker 1   DSEEDEEHTIITDTELPPLKL
22-79    KuDom 1    MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
80-92    Linker 2   NANRIIKTTLQQE
93-150   KuDom 2    KPDFCFLEEDPGPCIAFFPRFFYNnQTKQCERFIYGGCQGNMNNFETLEECKNICEDG
151-184  Linker 3   PNGFQVDNYGTQLNAVnNSLTPQSTKVPSLFEFH
185-242  KuDom 3    GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFKYGGCQGNENnFTSKQECLRACKKG
                    1   5    1    1    2    2    3    3    4    4    5    5  5
                             0    5    0    5    0    5    0    5    0    5  8
243-248  Linker 4   FIQRIS
249-261  Linker 5   GGSGSGGSGSSGG
262-282  Linker 1   DSEEDEEHTIITDTELPPLKL
283-341  KuDom 1    MHSFCAFKADDGPCIAFFPRFFFNIFTRQCEEFIYGGCQGNQNRFESLEECKKMCTRD
342-354  Linker 2   NANRIIKTTLQQE
355-410  KuDom 2    KPDFCFLEEDPGPCIAFFPRFFYNnQTKQCERFIYGGCQGNMNNFETLEECKNICEDG
411-444  Linker 3   PNGFQVDNYGTQLNAVnNSLTPQSTKVPSLFEFH
445-502  KuDom 3    GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFKYGGCQGNENnFTSKQECLRACKKG
503-507             FIQRIS

Mutations from w.t. are BOLD, UPPERCASE. Possible glycosylation sites are shown as 'n'.

TABLE 32
AA sequence of rabbit TFPI (SEQ ID NO: 104)

LOCUS       AAB26836   299 aa   linear   MAM 25-AUG-1993
DEFINITION  tissue factor pathway inhibitor; lipoprotein-associated coagulation inhibitor; extrinsic pathway inhibitor; TFPI; LACI; EPI Oryctolagus cuniculus.
ACCESSION   AAB26836
VERSION     AAB26836.1 GI:386016

UPPERCASE BOLD Ns may be glycosylated.

AAs      Function   AA sequence
1-24     Signal     mkkehifwtsiclllgilvpapvss
25-44    Linker 1   aaeedeftNitolikpplgkp
45-102   KuDom 1    ths foamkvddgperayikrff finilithgoeefiyggcegnenrfesleeckekcard
103-115  Linker 2   ypkmttklt folkg
116-173  KuDom 2    kpdfcfleedpgicrgyitryfynngskocerfkyggcl.gnlinnfesleeckintceNp
174-207  Linker 3   tsdfovddhrtolntvNntlind ptkaprrwafh
208-265  KuDom 3    gpswclppadrigliccaneirffynaiigkcrpfkysgcggnenNft skkacitackkg
266-299  Linker 4   firNlskggliktkrkkkkgpvkity vetfvikkit

TABLE 33
cDNA sequence of rabbit TFPI

LOCUS       S61902   900 bp   mRNA   linear   MAM 25-AUG-1993
DEFINITION  tissue factor pathway inhibitor=Kunitz-type protease inhibitor rabbits, lung, mRNA, 938 nt.
ACCESSION   S61902 REGION: 21..920
VERSION     S61902.1 GI:386015
KEYWORDS
SOURCE      Oryctolagus cuniculus (rabbit)
  ORGANISM  Oryctolagus cuniculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Lagomorpha; Leporidae; Oryctolagus.
REFERENCE   1 (bases 1 to 900)
  AUTHORS   Belaaouai, A., Kuppuswamy, M.N., Birktoft, J.J. and Bajaj, S.P.
  TITLE     Revised cDNA sequence of rabbit tissue factor pathway inhibitor
  JOURNAL   Thromb. Res. 69 (6), 547-553 (1993)
  PUBMED    8503123
  REMARK    GenBank staff at the National Library of Medicine created this entry NCBI gibbsq 133173 from the original journal article.
FEATURES    Location/Qualifiers
     source 1..900
            /organism="Oryctolagus cuniculus"
            /mol_type="mRNA"
            /db_xref="taxon:9986"
     gene   <1..900
            /gene="tissue factor pathway inhibitor, lipoprotein-associated coagulation inhibitor, extrinsic

TABLE 50 - continued
hTFPImC1, an optimized, tagged w.t. TFPI minus C-terminal region.

727 aac aga ttc tac tac aat tca gtc att ggg aaa tgc cgc cca ttt

256 257 258 259 26 O 261. 262 263. 264. 265 266 267 268 269 270 K Y S G C38 G G N E N N T S K 772 aag tac agt gga tigt ggg gga aat gala aac aat titt act tcc aaa.

271. 272 2.73. 2.f4 275 276 277 2 78 279 28 O 281 Q E. C51 L R A C55 K. K G (SEQ ID NO: 131). 817 caa gaa tot ctg agg gca tt aaa aaa ggit taa 850 tga TCTAGA ag (SEQ ID NO: 13 O) Xball. .

TABLE 51
hTFPI[PpKhNE], a TFPI with plasmin, pKal, and hNE inhibition

1. GCCACC Nico . . . PfMI. . . Signal sequence ...... 1. 2 3 4. 5 6 7 8 9 10 11 12 13 14 15 M G W S C I I L L W A. T A. T 7 ATG Gga Tgg agc tigt atc atc ct c titc titg gTC GCG AcG GCC aca Nico . . . Nru.I. . . . Sfil...... PflMI...... ! Signal ...... Cleavage ...... His 6 purification tag. . 16 17 18 19 2 O 21 22 23 24 25 26 27 28 29 30 G A. H S E G R P G H H H H H H 52 GGG GCC Cact co gaa ggit cqt cog GGT CAC Cac cat cat cat cat Sfil ...... BstEII. . . EcoC)109I Apal. . . .

Thrombin cleavage site Codon 2 of mature TFPI

31. 32 33 34 35 36 37 38 39 4 O 41 42 43 44 45 G G S S L W P R. G. S E E D E E 97 ggc ggit t ct agt ctg gt C cc.g Cgt get tot gag gala gat gala gaa

Start 1st Kuom

46 47 48 49 SO 51 52 53 54 55 56 s 58 59 60 E. L. L K L M Calc aca att atc. aca gat acg gag ttg C Ca C Ca Ctg AAG CTT ATG HindIII Nsi. . .

61. 62 63 64 65 66 67 68 69 70 71. 72 73 fa 75 A D C14 R A. CAT to a titt tdt gca ttic aag gcg gat gat CC a tgt aga gca Nsi.

76 ft 78 79 8O 81 82 83 84 85 86 87 88 89 90 N I T R O C3 O E cgc titc gat aga tigg ttic ttic aat att tto act cga cag tic gaa

91 92 93 94 95 96 97 98 99 1 OO 102 103 104 105 C38 E G N R E gala titt at a tat ggg gga gala giga aat Cag aAT CGA Ttt gaa BspDI. . .

End of 1st Kuom Linker 2 106 107 108 109 110 111 112 113 114 115 11 6 117 118 11.9 120 S L E E CS1 M Css N A. N agt ctg gaa gag tic a.a.a. a.a.a. atg TGT ACA aga gat aat goa aac G.I. .

Linker 2...... Kudom 2 ...... 121 122 123 124 125 126 127 128 129 130 131. 132 133 134 135 R I I K T T L Q Q E K P D Cs

TABLE 56 - continued
AA sequence of hTFPImC1 (SEQ ID NO: 139)

Item       Extent    AA sequence
Thrombin   31-39     GGSSLVPRG
Linker 1   40-59     SEEDEEHTIITDTELPPLKL
KuDom 1    60-117    MHSFCAFKADDGPCKAIMKRFFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
Linker 2   118-130   NANRIIKTTLQQE
KuDom 2    131-188   KPDFCFLEEDPGICRGYITRYFYNNQTKQCERFKYGGCLGNMNNFETLEECKNICEDG
Linker 3   189-222   PNGFQVDNYGTQLNAVNNSLTPQSTKVPSLFEFH
KuDom 3    223-280   GPSWCLTPADRGLCRANENRFYYNSVIGKCRPFKYSGCGGNENNFTSKQECLRACKKG

TABLE 57
AA sequence of hTFPI[PpKhNE] (SEQ ID NO: 140)

Function   Limits    AA sequence
Signal     1-19      MGWSCIILFLVATATGAHS
Cleavage   20-24     EGRPG
His6       25-30     HHHHHH
Thrombin   31-39     GGSSLVPRG
Linker 1   40-59     SEEDEEHTIITDTELPPLKL
KuDom 1    60-117    MHSFCAFKADDGPCRARFDRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
                                   K IMK F  <- wild type
Linker 2   118-130   NANRIIKTTLQQE
KuDom 2    131-188   KPDFCFLEEDPGPCRAAHPRWFYNNQTKQCERFIYGGCEGNMNNFETLEECKNICEDG
                                 I  GYIT Y            K    L  <- wild type
Linker 3   189-222   PNGFQVDNYGTQLNAVNNSLTPQSTKVPSLFEFH
KuDom 3    223-280   GPSWCLTPADRGPCIAFFPRFYYNSVIGKCRPFPYGGCQGNENNFTSKQECLRACKKG
                                 L R NEN              K S  G  <- wild type

TABLE 58
Amino-acid sequence of PLA301B (SEQ ID NO: 141)

Function   Limits    AA sequence
Signal     1-19      MGWSCIILFLVATATGAHS
Linker 1   20-40     DSEEDEEHTIITDTELPPLKL
KuDom 1    41-98     MHSFCAFKADDGPCRARFDRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
                                   K IMK F  <- wild type
Linker 2   99-111    NANRIIKTTLQQE
KuDom 2    112-169   KPDFCFLEEDPGPCRARFDRWFYNNQTKQCERFIYGGCEGNMNNFETLEECKNICEDG
                                 I  GYIT Y            K  <- wild type

TABLE 60 - continued
Distributions of AA types in 103 KuDoms, including 19 human KuDoms

Res. Different Id. AA types Contents

44 8 K8 O3 F S W Y

45 5

46 17

47 8 N3 D2 R2 K L E

48 13

49 13

SO 11

51 3 K

52 3 R26 M22 L15 K13 E11 Q5 N3 I2 F H D S W

53 5 R34

54 3 T4s

55 4.

56 6 G32 5 W12 IS Ef Kf A6 S3 L3 P2 M2 H F D W -

f 6 G3 O

58 8 -23

59 7 - 43

6 O 6 - 45

61 3 -53

62 9 -53 12 D8 P4 R3 S3 K3 G3 Y2 E2 T2 A. L. M. W. W. N. I O

63 3 - 49 22 E1 O Q3 R3 S3 G3 K3 I2 T2 W L P

64 2

TABLE 61
PlaMatHep, a Modified Human TFPI inhibiting Plasmin, Matriptase, and Hepsin (SEQ ID NO: 142)

AAs      Function                     AA sequence
1-21     Linker 1                     DSEEDEEHTIITDTELPPLKL
22-79    KuDom 1, inhibit plasmin     MHSFCAFKADDGPCRARFDRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD
80-92    Linker 2                     NANRIIKTTLQQE
93-150   KuDom 2, inhibit matriptase  kpdfcfleedpg|RorgSFPrWfynndtkocKSf Vyggclgnmnnfetleecknicedg
151-184  Linker 3                     PNGFQVDNYGTQLNAVnNSLTPQSTKVPSLFEFH
185-242  KuDom 3, inhibit hepsin      gpswcltpadWgRoraSMPrWyynsvigkcrpfWyGgcggnSnnft seqeclirackkg
                                      1   5    1    1    2    2    3    3    4    4    5    5  5
                                               0    5    0    5    0    5    0    5    0    5  8
243-248  Linker 4                     FIQRIS

Possible sites of glycosylation are shown as 'n'.

Other embodiments are within the following claims.

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS:

<210> SEQ ID NO 1
<211> LENGTH:
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 1

Met Ile Tyr Thr Met Wall His Ala Luell Trp Ala Ser Wall Cys 1. 15

Lell Luell Luell Asn Lell Ala Pro Ala Pro Luell ASn Ala Asp Ser Glu Glu 2O 25

Asp Glu Glu His Thir Ile Ile Thir Asp Thir Glu Lell Pro Pro Luell Lys 35 4 O 45

Lell Met His Ser Phe Ala Phe Ala Asp Asp Gly Pro Lys SO 55 6 O

Ala Ile Met Phe Phe Phe Asn Ile Phe Thir Arg Glin Glu 65 70

Glu Phe Ile Gly Gly Glu Gly Asn Glin Asn Arg Phe Glu Ser 85 90 95

Lell Glu Glu Cys Lys Met Thir Arg Asp Asn Ala Asn Arg Ile 1OO 105 11 O

Ile Thir Thir Lell Glin Glin Glu Pro Asp Phe Cys Phe Luell Glu 115 12 O 125

Glu Asp Pro Gly Ile Arg Gly Ile Thir Arg Phe Asn 13 O 135 14 O

Asn Glin Thir Glin Cys Glu Arg Phe Tyr Gly Gly Luell Gly 145 150 155 160

Asn Met Asn Asn Phe Glu Thir Luell Glu Glu Asn Ile Cys Glu 1.65 17O 17s

Asp Gly Pro Asn Gly Phe Glin Wall Asp Asn Thir Glin Luell Asn 18O 185 19 O

Ala Wall Asn Asn Ser Lell Thir Pro Glin Ser Thir Wall Pro Ser Luell 195

Phe Glu Phe His Gly Pro Ser Trp Luell Thir Pro Ala Asp Arg Gly 21 O 215 22O

Lell Arg Ala Asn Glu Asn Arg Phe Tyr Asn Ser Wall Ile Gly 225 23 O 235 24 O

Arg Pro Phe Ser Gly Cys Gly Gly Asn Glu Asn Asn 245 250 255

Phe Thir Ser Lys Glin Glu Luell Arg Ala Cys Gly Phe Ile 26 O 265 27 O US 8,828,703 B2 145 146 - Continued

Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys Arg Lys
        275                 280                 285
Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys Asn Met
    290                 295                 300

<210s, SEQ ID NO 2 &211s LENGTH: 1431 &212s. TYPE: DNA <213> ORGANISM: Homo sapiens <4 OOs, SEQUENCE: 2 ggcgggtctg. Cttctaaaag aagaagtaga gaagataaat cctgtct tca atacctggaa 6 O ggaaaaacaa aataacct ca act cogttitt gaaaaaaa.ca ttccaagaac titt catcaga 12 O gattitt actt agatgattta cacaatgaag aaagtacatg cactittgggc titctg tatgc 18O ctgctgctta atcttgcc cc togc.ccct citt aatgctgatt citgaggaaga tigaagaacac 24 O acaattatca cagatacgga gttgccacca citgaaactta togcatt catt ttgtgcatt c 3OO aaggcggatg atggcc catg taaagcaatc atgaaaagat ttittcttcaa tattitt cact 360 cgacagtgcg aagaattitat atatggggga tigtgaaggala at Cagaatcg atttgaaagt 42O Ctggaagagt gcaaaaaaat gtgtacaaga gataatgcaa acaggattat aaaga caa.ca 48O ttgcaacaag aaaagccaga tittctgcttt ttggaagaag atcCtggaat atgtcgaggt 54 O tat attacca gg tatttitta taacaatcag acaaaac agt gtgaacgttt caagtatggit 6OO ggatgcctgg gcaatatgaa caattittgag acactggaag aatgcaagaa catttgttgaa 660 gatggit Coga atggtttcca ggtggata at tatggaacco agct caatgc tigtgaataac 72 O t ccctgactic cqcaat caac caaggttc.cc agc ctittittgaattt cacgg tocct catgg 78O tgtctic actic cagcagacag aggattgttgt cqtgccalatg agaacagatt Ctact acaat 84 O t cagt cattg ggaaatgc.cg cc catttalag tacagtggat gtgggggaala taaaacaat 9 OO tttact tcca aacaagaatgtctgagggca totaaaaaag gttt catcca aagaatat ca 96.O aaaggaggcc taattaaaac Caaaagaaaa agaaagaagc agaga.gtgaa aatagcatat 1 O2O gaagaaattt ttgttaaaaa tatgtgaatt tdttatagoa atgta acatt aattic tact a 108 O aatattittat atgaaatgtt toactatoat tittct attitt tottctaaaa totgttittaat 114 O taatatgttc attaaattitt citatgctitat td tacttgtt atcaacacgt ttgtat caga 12 OO gttgcttitt c taatcttgtt aaattgctta t t c taggtot gtaatttatt aactggctac 126 O tgggaaatta cittattittct ggat citat ct g tattitt cat ttalactacaa attat catac 132O taccggctac atcaaatcag toctittgatt coatttggtg accatctgtt tdagaatatg 1380 at catgtaaa tdattatcto ctittatagcc tdtaaccaga ttaagcc.ccc c 1431

<210> SEQ ID NO 3
<211> LENGTH: 58
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 3

Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro Cys Lys Ala
1               5                   10                  15
Ile Met Lys Arg Phe Phe Phe Asn Ile Phe Thr Arg Gln Cys Glu Glu
            20                  25                  30
Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe Glu Ser Leu
        35                  40                  45

US 8,828,703 B2 155 156 - Continued

Asn Glin Asn Arg Phe Glu Ser Luell Glu Glu Cys Lys Lys Met Cys Thir 85 90 95

Arg Asp Asn Ala Asn Arg Ile Ile Lys Thir Thir Lell Glin Glin Glu 1OO 105 11 O

Pro Asp Phe Phe Lell Glu Glu Asp Pro Gly Ile Cys Arg Gly Tyr 115 12 O 125

Ile Thir Arg Phe Asn Asn Glin Thir Glin Cys Glu Arg Phe 13 O 135 14 O

Lys Gly Gly Lell Gly Asn Met Asn ASn Phe Glu Thir Luell Glu 145 150 155 160

Glu Asn Ile Glu Asp Gly Pro ASn Gly Phe Glin Wall Asp 1.65 17O 17s

Asn Gly Thir Glin Lell Asn Ala Wall Asn ASn Ser Lell Thir Pro Glin 18O 185 19 O

Ser Thir Lys Wall Pro Ser Lell Phe Glu Phe His Gly Pro Ser Trp 195

Lell Thir Pro Ala Asp Arg Gly Luell Ala Asn Glu Asn Arg Phe 21 O 215 22O

Tyr Asn Ser Wall Ile Gly Pro Phe Ser Gly 225 23 O 235 24 O

Gly Asn Glu Asn Asn Phe Thir Ser Glin Glu Luell Arg 245 250 255

Ala Lys Gly Phe Ile Glin Arg Ile Ser Gly Gly Luell Ile 26 O 265 27 O

Thir Lys Arg Arg Lys Glin Arg Wall Ile Ala Glu 27s 28O 285

Glu Ile Phe Wall Lys Asn Met 29 O 295

<210> SEQ ID NO 8 <211> LENGTH: 180 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide

<400> SEQUENCE: 8 gaagctatgc actctttctg tgctttcaag gctgacgacg gtccgtgcag agctgctcac 60 ccaagatggt tcttcaacat cttcacgcga caatgcgagg agttcatcta cggtggttgt 120 gagggtaacc aaaacagatt cgagtctcta gaggagtgta agaagatgtg tactagagac 180

<210> SEQ ID NO 9 <211> LENGTH: 60 <212> TYPE: PRT <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polypeptide

<400> SEQUENCE: 9 Glu Ala Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro Cys 1 5 10 15 Arg Ala Ala His Pro Arg Trp Phe Phe Asn Ile Phe Thr Arg Gln Cys 20 25 30

Glu Glu Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe Glu 35 40 45

Ser Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp 50 55 60
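
As a consistency check on the listing, the 180-nucleotide sequence of SEQ ID NO: 8 can be translated and compared with the 60-residue polypeptide of SEQ ID NO: 9. The following is a minimal sketch and not part of the patent. It assumes the standard genetic code, and the sequence strings are a cleaned reading of the scanned listing, so they should be verified against the official sequence listing before any use. The names `CODON_TABLE`, `translate`, `SEQ_ID_NO_8` and `SEQ_ID_NO_9` are hypothetical.

```python
# Standard genetic code, codons enumerated with bases in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) in frame 1 to one-letter codes."""
    dna = dna.lower().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Assumed cleaned reading of SEQ ID NO: 8 (blocks of ten, as in the listing).
SEQ_ID_NO_8 = " ".join([
    "gaagctatgc", "actctttctg", "tgctttcaag", "gctgacgacg", "gtccgtgcag", "agctgctcac",
    "ccaagatggt", "tcttcaacat", "cttcacgcga", "caatgcgagg", "agttcatcta", "cggtggttgt",
    "gagggtaacc", "aaaacagatt", "cgagtctcta", "gaggagtgta", "agaagatgtg", "tactagagac",
])

# SEQ ID NO: 9 written in one-letter code (assumed reading of the listing).
SEQ_ID_NO_9 = "EAMHSFCAFKADDGPCRAAHPRWFFNIFTRQCEEFIYGGCEGNQNRFESLEECKKMCTRD"

if __name__ == "__main__":
    print(translate(SEQ_ID_NO_8) == SEQ_ID_NO_9)
```

Under these assumptions the translation of SEQ ID NO: 8 agrees residue for residue with SEQ ID NO: 9, which is why the garbled codons in the scan could be resolved with reasonable confidence.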

<210> SEQ ID NO 10 <211> LENGTH: 591 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide <220> FEATURE: <221> NAME/KEY: CDS <222> LOCATION: (1)..(579)

<400> SEQUENCE: 10
atg gga tgg agc tgt atc atc ctc ttc ttg gtc gcg acg gcc aca ggg 48
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15

gcc cac tcc gat tct gag gaa gat gaa gaa cac aca att atc aca gat 96
Ala His Ser Asp Ser Glu Glu Asp Glu Glu His Thr Ile Ile Thr Asp
20 25 30

acg gag ttg cca cca ctg aaa ctt atg cat tca ttt tgt gca ttc aag 144
Thr Glu Leu Pro Pro Leu Lys Leu Met His Ser Phe Cys Ala Phe Lys
35 40 45

gcg gat gat ggc cca tgt aga gca gcc cat cct aga tgg ttc ttc aat 192
Ala Asp Asp Gly Pro Cys Arg Ala Ala His Pro Arg Trp Phe Phe Asn
50 55 60

att ttc act cga cag tgc gaa gaa ttt ata tat ggg gga tgt gaa gga 240
Ile Phe Thr Arg Gln Cys Glu Glu Phe Ile Tyr Gly Gly Cys Glu Gly
65 70 75 80

aat cag aat cga ttt gaa agt ctg gaa gag tgc aaa aaa atg tgt aca 288
Asn Gln Asn Arg Phe Glu Ser Leu Glu Glu Cys Lys Lys Met Cys Thr
85 90 95

aga gat aat gca aac agg att ata aag aca aca ttg caa caa gaa aag 336
Arg Asp Asn Ala Asn Arg Ile Ile Lys Thr Thr Leu Gln Gln Glu Lys
100 105 110

cca gat ttc tgc ttt ttg gaa gaa gat cct gga cca tgt atc gct ttt 384
Pro Asp Phe Cys Phe Leu Glu Glu Asp Pro Gly Pro Cys Ile Ala Phe
115 120 125

ttt ccc agg tgg ttt tat aac aat cag aca aaa cag tgt gaa ctt ttc 432
Phe Pro Arg Trp Phe Tyr Asn Asn Gln Thr Lys Gln Cys Glu Leu Phe
130 135 140

ccg tat ggt gga tgc cag ggc aat atg aac aat ttt gag aca ctg gaa 480
Pro Tyr Gly Gly Cys Gln Gly Asn Met Asn Asn Phe Glu Thr Leu Glu
145 150 155 160

gaa tgc aag aac att tgt gaa gat ggt ccg aat ggt ttc cag gtg gat 528
Glu Cys Lys Asn Ile Cys Glu Asp Gly Pro Asn Gly Phe Gln Val Asp
165 170 175

aat tat gga acc cag ctc aat gct gtg aat aac tcc ctg act ccg caa 576
Asn Tyr Gly Thr Gln Leu Asn Ala Val Asn Asn Ser Leu Thr Pro Gln
180 185 190

tca taatgatcta ga 591
Ser

<210> SEQ ID NO 11 <211> LENGTH: 193 <212> TYPE: PRT <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polypeptide

<400> SEQUENCE: 11

Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly 1 5 10 15

Ala His Ser Asp Ser Glu Glu Asp Glu Glu His Thr Ile Ile Thr Asp 20 25 30

Thr Glu Leu Pro Pro Leu Lys Leu Met His Ser Phe Cys Ala Phe Lys 35 40 45

Ala Asp Asp Gly Pro Cys Arg Ala Ala His Pro Arg Trp Phe Phe Asn 50 55 60

Ile Phe Thr Arg Gln Cys Glu Glu Phe Ile Tyr Gly Gly Cys Glu Gly 65 70 75 80

Asn Gln Asn Arg Phe Glu Ser Leu Glu Glu Cys Lys Lys Met Cys Thr 85 90 95

Arg Asp Asn Ala Asn Arg Ile Ile Lys Thr Thr Leu Gln Gln Glu Lys 100 105 110

Pro Asp Phe Cys Phe Leu Glu Glu Asp Pro Gly Pro Cys Ile Ala Phe 115 120 125

Phe Pro Arg Trp Phe Tyr Asn Asn Gln Thr Lys Gln Cys Glu Leu Phe 130 135 140

Pro Tyr Gly Gly Cys Gln Gly Asn Met Asn Asn Phe Glu Thr Leu Glu 145 150 155 160

Glu Cys Lys Asn Ile Cys Glu Asp Gly Pro Asn Gly Phe Gln Val Asp 165 170 175

Asn Tyr Gly Thr Gln Leu Asn Ala Val Asn Asn Ser Leu Thr Pro Gln 180 185 190

Ser

<210> SEQ ID NO 12 <211> LENGTH: 23 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 12 gattctgagg aagatgaaga aca 23

<210> SEQ ID NO 13 <211> LENGTH: 21 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 13 tgattgcgga gtcagggagt t 21

<210> SEQ ID NO 14 <211> LENGTH: 57 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 14 gaaaatattg aagaaccatc taggatgggc tgctctacat gggccatcat ccgcctt 57

<210> SEQ ID NO 15 <211> LENGTH: 57 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 15 aaggcggatg atggcccatg tagagcagcc catcctagat ggttcttcaa tattttc 57

<210> SEQ ID NO 16 <211> LENGTH: 51 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 16 tgttataaaa ccacctggga aaaaaagcga tacatggtcc aggatcttct t 51

<210> SEQ ID NO 17 <211> LENGTH: 51 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 17 aagaagatcc tggaccatgt atcgcttttt ttcccaggtg gttttataac a 51

<210> SEQ ID NO 18 <211> LENGTH: 45 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 18 catattgccc tggcatccac catacgggaa aagttcacac tgttt 45

<210> SEQ ID NO 19 <211> LENGTH: 45 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 19 aaacagtgtg aacttttccc gtatggtgga tgccagggca atatg 45

<210> SEQ ID NO 20 <211> LENGTH: 52 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 20 ttaatcttgc ggccacaggg gcccactctg attctgagga agatgaagaa ca 52

<210> SEQ ID NO 21 <211> LENGTH: 43 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 21 caaaaaggct tctagatcat tatgattgcg gagtcaggga gtt 43

<210> SEQ ID NO 22 <211> LENGTH: 235 <212> TYPE: PRT <213> ORGANISM: Homo sapiens

<400> SEQUENCE: 22
Met Asp Pro Ala Arg Pro Leu Gly Leu Ser Ile Leu Leu Leu Phe Leu 1 5 10 15
Thr Glu Ala Ala Leu Gly Asp Ala Ala Gln Glu Pro Thr Gly Asn Asn 20 25 30
Ala Glu Ile Cys Leu Leu Pro Leu Asp Tyr Gly Pro Cys Arg Ala Leu 35 40 45
Leu Leu Arg Tyr Tyr Tyr Asp Arg Tyr Thr Gln Ser Cys Arg Gln Phe 50 55 60
Leu Tyr Gly Gly Cys Glu Gly Asn Ala Asn Asn Phe Tyr Thr Trp Glu 65 70 75 80
Ala Cys Asp Asp Ala Cys Trp Arg Ile Glu Lys Val Pro Lys Val Cys 85 90 95
Arg Leu Gln Val Ser Val Asp Asp Gln Cys Glu Gly Ser Thr Glu Lys 100 105 110
Tyr Phe Phe Asn Leu Ser Ser Met Thr Cys Glu Lys Phe Phe Ser Gly 115 120 125
Gly Cys His Arg Asn Arg Ile Glu Asn Arg Phe Pro Asp Glu Ala Thr 130 135 140
Cys Met Gly Phe Cys Ala Pro Lys Lys Ile Pro Ser Phe Cys Tyr Ser 145 150 155 160
Pro Lys Asp Glu Gly Leu Cys Ser Ala Asn Val Thr Arg Tyr Tyr Phe 165 170 175
Asn Pro Arg Tyr Arg Thr Cys Asp Ala Phe Thr Tyr Thr Gly Cys Gly 180 185 190
Gly Asn Asp Asn Asn Phe Val Ser Arg Glu Asp Cys Lys Arg Ala Cys 195 200 205
Ala Lys Ala Leu Lys Lys Lys Lys Lys Met Pro Lys Leu Arg Phe Ala 210 215 220
Ser Arg Ile Arg Lys Ile Arg Lys Lys Gln Phe 225 230 235

<210> SEQ ID NO 23 <211> LENGTH: 210 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide <220> FEATURE: <221> NAME/KEY: CDS <222> LOCATION: (1)..(210)

<400> SEQUENCE: 23
tcc atg gct gat gtt cct aag gtt tgt cgt ctg cag gtt tct gtt gac 48
Ser Met Ala Asp Val Pro Lys Val Cys Arg Leu Gln Val Ser Val Asp
1 5 10 15

gat caa tgt gag ggt tct act gaa aag tat ttt ttc aac ctg agc tct 96
Asp Gln Cys Glu Gly Ser Thr Glu Lys Tyr Phe Phe Asn Leu Ser Ser
20 25 30

atg act tgc gaa aag ttc ttt tct ggt ggt tgc cat cgt aac cgt atc 144
Met Thr Cys Glu Lys Phe Phe Ser Gly Gly Cys His Arg Asn Arg Ile
35 40 45

gag aac cgg ttt ccg gac gag gct act tgt atg ggt ttc tgt gct cct 192
Glu Asn Arg Phe Pro Asp Glu Ala Thr Cys Met Gly Phe Cys Ala Pro
50 55 60

aag tct gct gac gct agc 210
Lys Ser Ala Asp Ala Ser
65 70

<210> SEQ ID NO 24 <211> LENGTH: 70 <212> TYPE: PRT <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polypeptide

<400> SEQUENCE: 24

Ser Met Ala Asp Val Pro Lys Val Cys Arg Leu Gln Val Ser Val Asp 1 5 10 15

Asp Gln Cys Glu Gly Ser Thr Glu Lys Tyr Phe Phe Asn Leu Ser Ser 20 25 30

Met Thr Cys Glu Lys Phe Phe Ser Gly Gly Cys His Arg Asn Arg Ile 35 40 45

Glu Asn Arg Phe Pro Asp Glu Ala Thr Cys Met Gly Phe Cys Ala Pro 50 55 60 Lys Ser Ala Asp Ala Ser 65 70

<210> SEQ ID NO 25 <211> LENGTH: 8931 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polynucleotide

<4 OOs, SEQUENCE: 25 aatgctacta ct attagtag aattgatgcc accttitt cag citcgc.gc.ccc aaatgaaaat 6 O atagotaaac aggttattga c catttgcga aatgitat cita atggtcaaac taaat Ctact 12 O cgttcgcaga attgggaatc aactgttata tggaatgaaa citt coaga.ca cc.gtactitta 18O gttgcatatt taaaacatgt tgagctacag Cattatatto agcaattaag ctictaagc.ca 24 O tcc.gcaaaaa tacct citta tcaaaaggag Caattaaagg tactic totala tcc tigacctg 3OO ttggagtttg Ctt Coggit ct ggttcgctitt gaagcticgaa ttaaaacgc.g at atttgaag 360 t ctitt.cgggc titcct cittaa totttittgat gcaatcc.gct ttgcttctga ctataatagt 42O

Cagggit aaag acctgattitt tgatt tatgg t cattct cqt tittctgaact gtttaaag.ca 48O tittgaggggg attcaatgaa tatt tatgac gatt.ccgcag tattggacgc tat coagtict 54 O aaaCattitta CitattacCCC citctggcaaa actitcttittg caaaag.cctic tcqctattitt 6OO ggtttittatc gtcgtctggit aaacgagggit tatgatagtg ttgct cittac tatgcct cqt 660 aatticcittitt gg.cgittatgt atctgcatta gttgaatgtg gtatt cotaa atctoaactg 72 O

gcgc.cgc.cat cacggctitt atcggcgatgtcagtggttt ggccaacggc aacggagc.ca 786 O ccggagacitt cqcaggttcg aattct caga tiggcc caggit tagatggg gacaa.cagtic 7920 cgct tatgaa caactittaga cagtacct tc cqt ct ctitcc gcagagtgtc. gag togcc.gtc 798 O

Catt cqttitt C9gtgc.cggc aag ccttacg agttcagcat cactg.cgat aagat caatc 804 O ttitt cog.cgg cqttitt cqct ttcttgctat acgt.cgctac titt catgitac gttitt cagoa 81OO ctitt cqccaa tattittacgc aacaaagaaa got agtgatc. tcc taggaag ccc.gc.ctaat 816 O gag.cgggctt tttittittctg g tatgcatcc tdaggc.cgat actgtcgt.cg tcc cct caaa 822 O ctggcagatg cacggittacg atgcgcc.cat ctacaccaac gtgacctatic ccattacggit 828O caatcc.gc.cg tttgttcc.ca cqgagaatcc gacgggttgt tact.cgctica catttaatgt 834 O tgatgaaagc tiggctacagg aaggc.ca.gac gcgaattatt tttgatggcg titcct attgg 84 OO ttaaaaaatg agctgattta acaaaaattit aatgcgaatt ttaacaaaat attaacgttt 846 O acaatttaaa tatttgctta tacaatctitc ctdtttittgg ggcttittctg attat caacc 852O gggg tacata tdattgacat gctagttitta cqattaccgt to atcgattic ticttgtttgc 858 O tccagact ct caggcaatga cct gatagcc tttgtag atc tict caaaaat agctaccctic 864 O tccggcatta attitat cago tagaacggitt gaatat cata ttgatggtga tittgactgtc 87OO tccggc ctitt ct cacc ctitt tdaatctitta cct acacatt acticaggcat togcatttaaa 876O atatatgagg gttctaaaaa tttittatcct tcc.gttgaaa taaaggcttic ticcc.gcaaaa 882O gtattacagg gtcataatgt ttittggtaca accatttag citt tatgctic tdaggctitta 888 O ttgcttaatt ttgctaattic tittgccttgc ctgtatgatt tattggatgt t 89.31

<210> SEQ ID NO 26 <211> LENGTH: 21 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 26 gggtcttctg tacgagttct g 21

<210> SEQ ID NO 27 <211> LENGTH: 21 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 27 cctccagagt tgacagaatc c 21

<210> SEQ ID NO 28 <211> LENGTH: 96 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (40)..(41) <223> OTHER INFORMATION: a, c, g, t, unknown or other
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (46)..(47) <223> OTHER INFORMATION: a, c, g, t, unknown or other
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (52)..(53) <223> OTHER INFORMATION: a, c, g, t, unknown or other
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (55)..(56) <223> OTHER INFORMATION: a, c, g, t, unknown or other
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (58)..(59) <223> OTHER INFORMATION: a, c, g, t, unknown or other

<400> SEQUENCE: 28 gggtcttctg tacgagttct gcaggtttct gttrrkrrkn nktgtnnkgs tnnknnknnk 60 argtattttt tcaacctgag ctctatgact tgcgaa 96

<210> SEQ ID NO 29 <211> LENGTH: 97 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (37) <223> OTHER INFORMATION: a, c, g, t, unknown or other
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (49)..(50) <223> OTHER INFORMATION: a, c, g, t, unknown or other
<220> FEATURE: <221> NAME/KEY: modified_base <222> LOCATION: (64)..(65) <223> OTHER INFORMATION: a, c, g, t, unknown or other

<400> SEQUENCE: 29 cctccagagt tgacagaatc cggaaaccgg ttctcmnymy ygttmyymnn gcaaccacca 60 kamnngaact tttcgcaagt catagagctc aggttga 97

<210> SEQ ID NO 30 <211> LENGTH: 23 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 30 gattctgagg aagatgaaga aca 23

<210> SEQ ID NO 31 <211> LENGTH: 27 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 31 acatattttt aacaaaaatt tcttcat 27

<210> SEQ ID NO 32 <211> LENGTH: 50 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 32 ttaatcttgc ggccacaggg gcccactccg actctgagga agatgaagaa 50

<210> SEQ ID NO 33 <211> LENGTH: 30 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 33 cataagcttc agtggtggga gctccgtatc 30

<210> SEQ ID NO 34 <211> LENGTH: 30 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 34 gatacggagc tcccaccact gaagcttatg 30

<210> SEQ ID NO 35 <211> LENGTH: 20 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 35 ttcgcactga cgcgtgaaaa 20

<210> SEQ ID NO 36 <211> LENGTH: 20 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 36 ttttcacgcg tcagtgcgaa 20
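
Many of the synthetic primers in this listing appear in consecutive pairs that read as reverse complements of one another, for example SEQ ID NOS: 35 and 36. The following minimal sketch, which is not part of the patent, shows how such a pairing can be checked. The sequence strings are a cleaned reading of the scanned listing and are assumptions to be verified against the official sequence listing. The names `COMPLEMENT` and `reverse_complement` are hypothetical, and only the unambiguous bases a, c, g and t are handled.

```python
# Map each unambiguous base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string; spaces are ignored."""
    cleaned = seq.lower().replace(" ", "")
    return cleaned.translate(COMPLEMENT)[::-1]


# Assumed cleaned readings of the two primers (blocks of ten, as in the listing).
SEQ_ID_NO_35 = "ttcgcactga cgcgtgaaaa"
SEQ_ID_NO_36 = "ttttcacgcg tcagtgcgaa"

if __name__ == "__main__":
    # True when the two primers anneal to the same site on opposite strands.
    print(reverse_complement(SEQ_ID_NO_35) == SEQ_ID_NO_36.replace(" ", ""))
```

The same check can be applied to the other adjacent primer pairs as a way of resolving characters that the scan rendered ambiguously.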

<210> SEQ ID NO 37 <211> LENGTH: 41 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 37 tcttccagag attcaaatcg attctggtta ccttcacatc c 41

<210> SEQ ID NO 38 <211> LENGTH: 41 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 38 ggatgtgaag gtaaccagaa tcgatttgaa tctctggaag a 41

<210> SEQ ID NO 39 <211> LENGTH: 29 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 39 ttccaggatc ctcctcgaga aagcagaaa 29

<210> SEQ ID NO 40 <211> LENGTH: 29 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 40 tttctgcttt ctcgaggagg atcctggaa 29

<210> SEQ ID NO 41 <211> LENGTH: 19 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 41 ttcttccagc gtctcaaaa 19

<210> SEQ ID NO 42 <211> LENGTH: 19 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 42 ttttgagacg ctggaagaa 19

<210> SEQ ID NO 43 <211> LENGTH: 38 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 43 cataattgtc gacctggaaa ccattgggcc catcttca 38

<210> SEQ ID NO 44 <211> LENGTH: 38 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 44 tgaagatggg cccaatggtt tccaggtcga caattatg 38

<210> SEQ ID NO 45 <211> LENGTH: 22 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 45 tcctctgtcc gcgggagtga ga 22

<210> SEQ ID NO 46 <211> LENGTH: 22 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 46 tctcactccc gcggacagag ga 22

<210> SEQ ID NO 47 <211> LENGTH: 22 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 47 ccacatccgg aatacttaaa tg 22

<210> SEQ ID NO 48 <211> LENGTH: 24 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 48 catttaagta ttccggatgt gggg 24

<210> SEQ ID NO 49 <211> LENGTH: 22 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 49 tttacatgcg cgcagacatt ct 22

<210> SEQ ID NO 50 <211> LENGTH: 22 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 50 agaatgtctg cgcgcatgta aa 22

<210> SEQ ID NO 51 <211> LENGTH: 43 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 51 ttgtttctag atcattacat atttttaaca aaaatttctt cat 43

<210> SEQ ID NO 52 <211> LENGTH: 50 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 52 ttaatcttgc ggccacaggg gcccactccg actctgagga agatgaagaa 50

<210> SEQ ID NO 53 <211> LENGTH: 43 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 53 ttgtttctag atcattacat atttttaaca aaaatttctt cat 43

<210> SEQ ID NO 54 <211> LENGTH: 245 <212> TYPE: PRT <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polypeptide

<400> SEQUENCE: 54

Asp Ser Glu Glu Asp Glu Glu His Thr Ile Ile Thr Asp Thr Glu Leu 1 5 10 15

Pro Pro Leu Lys Leu Met His Ser Phe Cys Ala Phe Ala Asp Asp 20 25 30

Gly Pro Cys Arg Ala Arg Phe Asp Arg Trp Phe Phe Asn Ile Phe Thr 35 40 45

Arg Gln Glu Glu Phe Ile Gly Gly Cys Glu Gly Asn Gln Asn 50 55 60

Arg Phe Glu Ser Leu Glu Glu Lys Met Thr Arg Asp Asn 65 70

Ala Asn Arg Ile Ile Thr Thr Leu Gln Gln Glu Pro Asp Phe 85 90 95

Phe Leu Glu Glu Asp Pro Gly Pro Cys Arg Ala Ala His Pro Arg 105 110

Phe Tyr Asn Gln Thr Gln Glu Arg Phe Ile Gly Gly 115 120 125

Glu Gly Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Asn

130 135 140

Ile Glu Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr 145 150 155 160

Gln Leu Asn Ala Val Asn Ser Leu Thr Pro Gln Ser Thr Val Pro 165 170 175

Ser Leu Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp 180 185 190

Arg Gly Pro Ile Ala Phe Phe Pro Arg Phe Tyr Asn Ser Val 195

Ile Gly Pro Phe Pro Gly Gly Cys Gln Gly Asn Glu 210 215 220

Asn Phe Thr Ser Gln Glu Leu Arg Ala Gly Phe 225 230 235 240

Ile Gln Arg Ile Ser 245

<210> SEQ ID NO 55 <211> LENGTH: 27 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 55 atgatttaca caatgaagaa agtacat 27

<210> SEQ ID NO 56 <211> LENGTH: 24 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 56 tgatattctt tggatgaaac cttt 24

<210> SEQ ID NO 57 <211> LENGTH: 43 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 57 gatggcccat gtcgtgcacg cttcgataga tggttcttca ata 43

<210> SEQ ID NO 58 <211> LENGTH: 43 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 58 tattgaagaa ccatctatcg aagcgtgcac gacatgggcc atc 43

<210> SEQ ID NO 59 <211> LENGTH: 56 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 59 gaagatcctg gaccttgtcg agctgctcat cccaggtggt tttataacaa tcagac 56

<210> SEQ ID NO 60 <211> LENGTH: 56 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 60 gtctgattgt tataaaacca cctgggatga gcagctcgac aaggtccagg atcttc 56

<210> SEQ ID NO 61 <211> LENGTH: 47 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 61 cagtgtgaac gtttcattta tggtggatgc gagggcaata tgaacaa 47

<210> SEQ ID NO 62 <211> LENGTH: 47 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 62 ttgttcatat tgccctcgca tccaccataa atgaaacgtt cacactg 47

<210> SEQ ID NO 63 <211> LENGTH: 55 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 63 agcagacaga ggaccttgta ttgccttttt tcccagattc tactacaatt cagtc 55

<210> SEQ ID NO 64 <211> LENGTH: 55 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 64 gactgaattg tagtagaatc tgggaaaaaa ggcaatacaa ggtcctctgt ctgct 55

<210> SEQ ID NO 65 <211> LENGTH: 31 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 65 taagtacggt ggatgtcagg gaaatgaaaa c 31

<210> SEQ ID NO 66 <211> LENGTH: 31 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 66 gttttcattt ccctgacatc caccgtactt a 31

<210> SEQ ID NO 67 <211> LENGTH: 36 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 67 agattttact tccatggttt acacaatgaa gaaagt 36

<210> SEQ ID NO 68 <211> LENGTH: 44 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic primer

<400> SEQUENCE: 68 ctcttgtttc tagatcatta tgatattctt tggatgaaac cttt 44

<210> SEQ ID NO 69 <211> LENGTH: 872 <212> TYPE: PRT <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polypeptide

<400> SEQUENCE: 69
Met Ile Tyr Thr Met Lys Lys Val His Ala Leu Trp Ala Ser Val Cys 1 5 10 15
Leu Leu Leu Asn Leu Ala Pro Ala Pro Leu Asn Ala Asp Ser Glu Glu 20 25 30
Asp Glu Glu His Thr Ile Ile Thr Asp Thr Glu Leu Pro Pro Leu Lys 35 40 45
Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro Cys Arg 50 55 60

Ala Arg Phe Asp Arg Trp Phe Phe Asn Ile Phe Thr Arg Gln Cys Glu 65 70 75 80

Glu Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe Glu Ser 85 90 95

Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp Asn Ala Asn Arg Ile 100 105 110

Ile Thr Thr Leu Gln Gln Glu Pro Asp Phe Cys Phe Leu Glu 115 120 125

Glu Asp Pro Gly Pro Arg Ala Ala His Pro Arg Trp Phe Tyr Asn 130 135 140

Gln Thr Gln Glu Arg Phe Ile Tyr Gly Gly Glu Gly Asn 145 150 155 160

Met Asn Asn Phe Glu Thr Leu Glu Glu Cys Asn Ile Glu Asp 165

Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr Gln Leu Asn Ala 180 185 190

Val Asn Ser Leu Thr Pro Gln Ser Thr Val Pro Ser Leu Phe Glu 195

Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp Arg Gly Pro 210 215 220

Ile Ala Phe Phe Pro Arg Phe Asn Ser Val Ile Gly Cys 225 230 235 240

Arg Pro Phe Pro Tyr Gly Gly Gln Gly Asn Glu Asn Phe Thr Ser 245 250 255

Gln Glu Cys Leu Arg Ala Lys Lys Gly Phe Ile Gln Arg Ile 260 265 270

Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Asp 275 280 285

Ala His Ser Glu Val Ala His Arg Phe Asp Leu Gly Glu Glu 290 295 300

Asn Phe Ala Leu Val Leu Ile Ala Phe Ala Gln Tyr Leu Gln 305 310 315

Pro Phe Glu Asp His Val Leu Val Asn Glu Val Thr Glu Phe 325 330 335

Ala Thr Cys Val Ala Asp Glu Ser Ala Glu Asn Asp Ser 340 345 350

Leu His Thr Leu Phe Gly Asp Lys Leu Thr Val Ala Thr Leu Arg 355 360 365

Glu Thr Gly Glu Met Ala Asp Ala Lys Gln Glu Pro Glu 370 375

Arg Asn Glu Phe Leu Gln His Asp Asp Asn Pro Asn Leu Pro 385 390 395 400

Arg Leu Val Arg Pro Glu Val Asp Val Met Thr Ala Phe His Asp 405 415

Asn Glu Glu Thr Phe Leu Tyr Leu Tyr Glu Ile Ala Arg Arg 420 425 430

His Pro Tyr Phe Tyr Ala Pro Glu Leu Leu Phe Phe Ala Arg 435 440 445

Ala Ala Phe Thr Glu Cys Gln Ala Ala Asp Ala Ala 450 455 460

Leu Leu Pro Leu Asp Glu Leu Arg Asp Glu Gly Ala Ser Ser 465 470 480

Ala Gln Arg Leu Ala Ser Leu Gln Phe Gly Glu Arg 485 490 495

Ala Phe Ala Trp Ala Val Ala Arg Leu Ser Gln Arg Phe Pro 500 505

Ala Glu Phe Ala Glu Val Ser Lys Leu Val Thr Asp Leu Thr Val 515 525

His Thr Glu Cys His Gly Asp Leu Leu Glu Ala Asp Asp Arg

530 535 540

Ala Asp Leu Ala Tyr Ile Glu Asn Gln Asp Ser Ile Ser Ser 545 550 555 560

Leu Glu Cys Glu Pro Leu Leu Glu Ser His Cys 565 570 575

Ile Ala Glu Val Glu Asn Asp Glu Met Pro Ala Asp Leu Pro Ser Leu 580 585 590

Ala Ala Asp Phe Val Glu Ser Lys Asp Val Cys Asn Tyr Ala Glu 595 600 605

Ala Lys Asp Val Phe Leu Gly Met Phe Leu Tyr Glu Tyr Ala Arg Arg 610 615 620

His Pro Asp Ser Val Val Leu Leu Leu Arg Leu Ala Thr Tyr 625 630 635 640

Glu Thr Thr Leu Glu Ala Ala Ala Asp Pro His Glu 645 650 655

Ala Val Phe Asp Glu Phe Lys Pro Leu Val Glu Glu Pro Gln 660 665 670

Asn Leu Ile Gln Asn Glu Leu Phe Glu Gln Leu Gly Glu Tyr 675 685

Phe Gln Asn Ala Leu Leu Val Arg Tyr Thr Lys Val Pro Gln 690 695 700

Val Ser Thr Pro Thr Leu Val Glu Val Ser Arg Asn Leu Gly Val 705

Gly Ser Cys His Pro Glu Ala Lys Arg Met Pro Cys Ala 725 730 735

Glu Asp Leu Ser Val Val Leu Asn Gln Leu Val Leu His Glu 740 745 750

Thr Pro Val Ser Asp Arg Val Thr Cys Thr Glu Ser Leu 755 760 765

Val Asn Arg Arg Pro Phe Ser Ala Leu Glu Val Asp Glu Thr Tyr 770 775 780

Val Pro Glu Phe Asn Ala Glu Thr Phe Thr Phe His Ala Asp Ile 790 795

Thr Leu Ser Glu Glu Arg Gln Ile Lys Gln Thr Ala Leu 805 810 815

Val Glu Leu Val Lys His Pro Lys Ala Thr Glu Gln Leu 825 830

Ala Val Met Asp Asp Phe Ala Ala Phe Val Glu Cys Ala 835 840 845

Asp Asp Glu Thr Phe Ala Glu Glu Gly Lys Leu Val Ala 850 855 860

Ala Ser Gln Ala Ala Leu Gly Leu 865 870

<210> SEQ ID NO 70 <211> LENGTH: 503 <212> TYPE: PRT <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic polypeptide

<400> SEQUENCE: 70 Asp Ser Glu Glu Asp Glu Glu His Thr Ile Ile Thr Asp Thr Glu Leu 1 5 10 15

Pro Pro Leu Lys Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp 20 25 30

Gly Pro Cys Arg Ala Arg Phe Asp Arg Trp Phe Phe Asn Ile Phe Thr 35 40 45

Arg Gln Glu Glu Phe Ile Gly Gly Glu Gly Asn Gln Asn 50 55 60

Arg Phe Glu Ser Leu Glu Glu Met Thr Arg Asp Asn 65 70

Ala Asn Arg Ile Ile Thr Thr Leu Gln Glu Pro Asp Phe 85 95

Phe Leu Glu Glu Asp Pro Gly Pro Arg Ala Ala His Pro Arg 105 110

Phe Tyr Asn Gln Thr Gln Glu Arg Phe Ile Gly Gly 115 120 125

Glu Gly Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Asn 130 135 140

Ile Glu Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Gly Thr 145 150 155 160

Gln Leu Asn Ala Val Asn Ser Leu Thr Pro Gln Ser Thr Val Pro 165 170 175

Ser Leu Phe Glu Phe His Gly Pro Ser Trp Leu Thr Pro Ala Asp 180 185 190

Arg Gly Pro Ile Ala Phe Phe Pro Arg Phe Tyr Asn Ser Val 195 200

Ile Gly Arg Pro Phe Gly Gly Cys Gln Gly Asn Glu 210 215 220

Asn Phe Thr Ser Lys Gln Glu Leu Arg Ala Gly Phe 225 230 235 240

Ile Gln Arg Ile Ser Gly Gly Ser Gly Ser Ser Gly Gly Gly Ser Ser 245 250 255

Gly Gly Asp Ser Glu Glu Asp Glu Glu His Thr Ile Ile Thr Asp Thr 260 265 270

Glu Leu Pro Pro Leu Leu Met His Ser Phe Ala Phe Ala 285

Asp Asp Gly Pro Cys Arg Ala Arg Phe Asp Arg Trp Phe Phe Asn Ile 290 295 300

Phe Thr Arg Gln Cys Glu Glu Phe Ile Gly Gly Glu Gly Asn 305 310 315

Gln Asn Arg Phe Glu Ser Leu Glu Glu Cys Met Thr Arg 325 330 335

Asp Asn Ala Asn Arg Ile Ile Thr Thr Leu Gln Gln Glu Pro 340 345 350

Asp Phe Cys Phe Leu Glu Glu Asp Pro Gly Pro Arg Ala Ala His 355 360 365

Pro Arg Trp Phe Tyr Asn Gln Thr Gln Glu Arg Phe Ile 370 375

Gly Gly Glu Gly Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Cys 385 390 395 400

Asn Ile Glu Asp Gly Pro Asn Gly Phe Gln Val Asp Asn 405 410 415

Gly Thr Gln Leu Asn Ala Val Asn Ser Leu Thr Pro Gln Ser Thr Lys 420 425 430

Val Pro Ser Leu Phe Glu Phe His Gly Pro Ser Trp Leu Thr Pro