US009371369B2

(12) United States Patent (10) Patent No.: US 9,371,369 B2

Schellenberger et al. (45) Date of Patent: *Jun. 21, 2016

(54) EXTENDED RECOMBINANT POLYPEPTIDES AND COMPOSITIONS COMPRISING SAME

(71) Applicant: Amunix Operating Inc., Mountain View, CA (US)

(72) Inventors: Volker Schellenberger, Palo Alto, CA (US); Joshua Silverman, Sunnyvale, CA (US); Chia-Wei Wang, Milpitas, CA (US); Benjamin Spink, San Carlos, CA (US); Willem P. Stemmer, Los Gatos, CA (US); Nathan Geething, Santa Clara, CA (US); Wayne To, Fremont, CA (US); Jeffrey L. Cleland, San Carlos, CA (US)

(73) Assignee: Amunix Operating Inc., Mountain View, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 78 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 14/168,973

(22) Filed: Jan. 30, 2014

(65) Prior Publication Data
US 2014/0301974 A1 Oct. 9, 2014

Related U.S. Application Data

(63) Continuation of application No. 12/699,761, filed on Feb. 3, 2010, now Pat. No. 8,673,860.

(60) Provisional application No. 61/149,669, filed on Feb. 3, 2009, provisional application No. 61/268,193, filed (Continued)

(51) Int. Cl.
A61K 38/6 (2006.01)
A61K 38/8 (2006.01)
(Continued)

(52) U.S. Cl.
CPC ...... C07K 14/545 (2013.01); C07K 14/001 (2013.01); C07K 14/47 (2013.01); (Continued)

(58) Field of Classification Search
CPC ... C07K 14/001; C07K 14/47; C07K 14/545; C07K 14/605; C07K 14/61; C07K 14/745; C07K 2319/31; C07K 2319/35
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
3,992,518 A 11/1976 Chien et al.
4,088,864 A 5/1978 Theeuwes et al.
(Continued)

FOREIGN PATENT DOCUMENTS
CN 1761684 A 4/2006
CN 1933855 A 3/2007
(Continued)

OTHER PUBLICATIONS
Kangueane et al., “T-Epitope Designer: A HLA-peptide binding prediction server”. May 15, 2005, 1(1), 21-4.
(Continued)

Primary Examiner: James H Alstrum Acevedo
Assistant Examiner: Randall L. Beane
(74) Attorney, Agent, or Firm: Wilson Sonsini Goodrich &

(57) ABSTRACT
The present invention relates to compositions comprising biologically active proteins linked to extended recombinant polypeptide (XTEN), isolated nucleic acids encoding the compositions and vectors and host cells containing the same, and methods of using such compositions in treatment of glucose-related diseases, metabolic diseases, coagulation disorders, and growth hormone-related disorders and conditions.

37 Claims, 48 Drawing Sheets

[Representative drawing. Panel labels: "Long half-life, low clotting activity"; "XTEN release"; "Short half-life, high clotting activity".]

Related U.S. Application Data

(60) (continued) on Jun. 8, 2009, provisional application No. 61/185,112, filed on Jun. 8, 2009, provisional application No. 61/236,493, filed on Aug. 24, 2009, provisional application No. 61/236,836, filed on Aug. 25, 2009, provisional application No. 61/243,707, filed on Sep. 18, 2009, provisional application No. 61/245,490, filed on Sep. 24, 2009, provisional application No. 61/280,955, filed on Nov. 10, 2009, provisional application No. 61/280,956, filed on Nov. 10, 2009, provisional application No. 61/281,109, filed on Nov. 12, 2009.

(51) Int. Cl.
C07K 14/6 (2006.01)
C07K 14/46 (2006.01)
C07K 14/545 (2006.01)
C07K 14/00 (2006.01)
C07K 14/605 (2006.01)
C07K 14/745 (2006.01)
C07K 14/47 (2006.01)

(52) U.S. Cl.
CPC ...... C07K 14/605 (2013.01); C07K 14/61 (2013.01); C07K 14/745 (2013.01); C07K 2319/31 (2013.01); C07K 2319/35 (2013.01)





(56) References Cited Murtuza, et al. Transplantation of skeletal myoblasts secreting an IL-1 inhibitor modulates adverse remodeling in infarcted murine OTHER PUBLICATIONS myocardium. Proc Natl Acad Sci U S A. Mar. 23, 2004; 101(12):4216-21. Leong, et al. Adapting pharmacokinetic properties of a humanized Narmoneva, et al. Self-assembling short oligopeptides and the pro anti-interleukin-8 antibody for therapeutic applications using site motion of angiogenesis. Biomaterials. 2005; 26:4837-4846. specific pegylation. Cytokine. 2001; 16(3):106-19. NCBI Reference Sequence: WP 005 158338.1. Serine phosphatase Leong, et al. Optimized expression and specific activity of IL-12 by RsbU, regulator of sigma Subunit Amycolatopsis azurea. Available directed molecular evolution. Proc. Natl. Acad. Sci. USA 2003; at http://www.ncbi.nlm.nih.gov/protein/491300334?report= 100:1163-1168. genbank&logS-protalign&blast rank=1&RID=3ERSOM7501 R. Leung, et al. A method for random mutagenesis of a defined DNA Accessed on Sep. 16, 2013. segment using a modified polymerase chain reaction. Technique. NCBI Reference Sequence: XP 003746909. 1. Predicted: electron 1989; 1: 11-15. transfer flavoprotein subunit alpha, mitochondrial-like Metaseiulus Leung-Hagesteijn, et al. UNC-5, a transmembrane protein with occidentalis. Available at http://www.ncbi.nlm.nih.gov/protein/ immunoglobulin and thrombospondin type 1 domains, guides cell 391345263?report-genbank&logS-protalign&blast rank=1 and pioneer axon migrations in C. elegans. Cell 1992; 71:289-99. &RID=3ERSOM750 1R. Accessed on Sep. 16, 2013. Levitt. A simplified representation of protein conformations for rapid Nielsen, et al. Di-Tri-peptide transporters as drug delivery targets: simulation of protein folding. J Mol Biol 1976; 104,59-107. Regulation of transport under physiological and patho-physiological Levy, et al. Isolation of trans-acting genes that enhance soluble conditions. Current Drug Targets. 2003; 4:373-388. 
expression of scFv antibodies in the E. coli cytoplasm by lambda Nielsen, etal. Solution Structure of u-Conotoxin PIIIA, a Preferential phage display. J Immunol Methods. 2007: 321(1-2): 164-73. Inhibitor of Persistent -sensitive Sodium Channels. J. Lin, et al. Metal-chelating affinity hydrogels for Sustained protein Biol. Chem 2002; 277: 27247-27255. release. J Biomed Mater Res A. 2007; 83(4):954-64. Nord, et al. Binding proteins selected from combinatorial libraries of Lirazan, et al. The Spasmodic Peptide Defines a New Conotoxin an O-helical bacterial receptor domain. Nat Biotechnol, 1997: 15: Superfamily. Biochemistry, 2000; 39: 1583-8. 772-777. Liu et al. The Human beta-Defensin-1 and alpha-Defensins Are O'Connell, et al. Phage versus phagemid libraries for generation of Encoded by Adjacent Genes: Two Peptide Families with Differing human monoclonal antibodies. J Mol Biol. 2002; 321: 49-56. Disulfide Topology Share a Common Ancestry. Genomics. 1997; Ofir, et al. Versatile protein microarray based on carbohydrate-bind 43:316-320. ing modules. Proteomics. 2005; 5(7): 1806-14. Lowman, et al. Selecting high-affinity binding proteins by Okten, et al. Myosin VI walks hand-over-hand along actin. Nat Struct monovalent phage display. Biochemistry. 1991; 30: 10832-10838. Mol Biol. 2004; 11 (9):884-7. Maggio. IntravailTM: highly effective intranasal delivery of peptide O'Leary, et al. Solution Structure and Dynamics of a Prototypical and protein drugs Expert Opinion in Drug Delivery 2006; 3:529-539. Chordin-like Cysteine-rich Repeat (von Willebrand Factor Type C Maggio. A Renaissance in Peptide Therapeutics in Underway. Drug Module) from Collagen IIA. J. Biol Chem. 2004; 279: 53857-66. Delivery Reports. 2006; 23-26. Padiolleau-Lefevre, et al. Expression and detection strategies for an Maillere et al. Role of thiols in the presentation of a snake toxin to ScFv fragment retaining the same high affinity than Fab and whole murine T cells. J. Immunol 1993; 150, 5270-5280. 
antibody: Implications for therapeutic use in prion diseases. Mol Maillere, et al. Immunogenicity of a disulphide-containing Immunol 2007:44(8): 1888-96. : presentation to T-cells requires a reduction step. Toxicon, Pallaghy, et al. A common structural motif incorporating a cystine 1995; 33(4): 475-482. knot and a triple-stranded beta-sheet in toxic and inhibitory Marshall, et al. Enhancing the activity of a beta-helical antifreeze polypeptides. Protein Sci 1994; 3:1833-1839. protein by the engineered addition of coils. Biochemistry, 2004; 43: Pallaghy, et al. Three-dimensional Structure in Solution of the Cal 11637-11646. cium Channel Blocker co-Conotoxin. J Mol Biol 1993; 234:405-420. Martin, et al. Rational design of a CD4 mimic that inhabits HIV-1 Pan, et al. Structure and expression of fibulin-2, a novel extracellular entry and exposes cryptic neutralization epitopes. Nat. Biotechnol. matrix protein with multiple EGF-like repeats and consensus motifs 2003; 21: 71-76. for calcium binding. J. Cell. Biol. 1993; 123: 1269-127. Martineau, et al. Expression of an antibody fragment at high levels in Panda. Bioprocessing of therapeutic proteins from the inclusion bod the bacterial cytoplasm. J Mol Biol. 1998; 280(1): 117-27. ies of Escherichia coli. Adv Biochem Eng Biotechnol. 2003; 85:43 McDonald, et al. Significance of blood vessel leakiness in cancer. 93. Cancer Res. 2002; 62: 5381-5. Patra, et al. Optimization of inclusion body solubilization and McNulty, et al. High-resolution NMR structure of the chemically renaturation of recombinant human growth hormone from synthesized melanocortin receptor binding domain AGRP(87-132) Escherichia coli. Protein Expr Punt. 2000; 18(2):182-92. of the Agouti-Related Protein. Biochemistry, 2001; 40: 15520-7. Pelegrini, et al. Plant gamma-thionins. novel insights on the mecha Meier, et al. Determination of a high-precision NMR structure of the nism of action of a multi-functional class of defense proteins. 
Int J minicollagen cysteine rich domain from Hydra and characterization Biochem Cell Biol. 2005; 37:2239-53. of its disulfide bond formation. FEBS Lett. 2004; 569; 1 12-6. Pepinsky, et al. Improved pharmacokinetic properties of a polyeth Miljanich. Ziconotide: neuronal calcium channel blocker for treating ylene glycol-modified form of interferon-beta-1a with preserved in severe chronic pain. Curr. Med. Chem. 2004; 23: 3029. vitro bioactivity. J Pharmacol Exp Ther. 2001; 297: 1059-66. Misenheimer, et al. Biophysical Characterization of the Signature Petersen, et al. The dual nature of human extracellular Superoxide Domains of Thrombospondin-4 and Thrombospondin-2. J. Biol. dismutase: one sequence and two structures. Proc. Natl. Acad. Sci. Chem. 2005; 280:41229-41235. USA 2003; 100: 13875-80. Misenheimer, et al. Disulfide Connectivity of Recombinant C-termi Pi, et al. Analysis of expressed sequence tags from the venom ducts of nal Region of Human Thrombospondin 2 J. Biol. Chem. 2001; Conus striatus: focusing on the expression profile of . 276:45882-7. Biochimie 2006; 88(2): 131-40. Mitraki, et al. Protein Folding Intermediates and Inclusion Body Pimanda, et al. The von Willebrand factor-reducing activity of Formation. Bio/Technology. 1989: 7:690-697. thrombospondin-1 is located in the calcium-binding/C-terminal Mogk, et al. Mechanisms of protein folding: molecular chaperones sequence and requires a free thiol at position 974. Blood. 2002; 100: and their application in biotechnology. Chembiochem. Sep. 2. 2832-2838. 2002:3(9):807-14. Pokidysheva, et al. The Structure of the Cys-rich Terminal Domain of MrSny, et al. Bacterial toxins as tools for mucosal vaccination. Drug Hydra Minicollagen, Which Is Involved in Disulfide Networks of the Discovery Today, 2002; 4:247-258. Nematocyst Wall. J. Biol Chem. 2004; 279: 30395-401. US 9,371.369 B2 Page 8

(56) References Cited Srivastava, et al. Application of self-assembled ultra-thin film coat ings to stabilize macromolecule encapsulation in alginate OTHER PUBLICATIONS microspheres. J Microencapsul. 2005; 22(4):397-411. Stamos, et al. Crystal structure of the HGF beta-chain in complex Popkov, et al. Isolation of human prostate cancer cell reactive anti with the Sema domain of the Met receptor. Embo J. 2004; 23: 2325 bodies using phage display technology. J. Immunol Methods. 2004; 35. 291:137-151. Steipe, et al. Sequence statistics reliably predict stabilizing mutations Qi, et al. Structural Features and Molecular Evolution of Bowman in a . J Mol Biol. 1994; 240(3): 188-92. Birk Protease Inhibitors and Their Potential Application (283-292). Stemmer, et al. Single-step assembly of a gene and entire plasmid Act Biochim Biophys Sin. (Shanghai) 2005; 37: 283-292. from large numbers of oligodeoxyribonucleotides. Gene 1995; Rao, et al. Molecular and Biotechnological Aspects of Microbial Proteases. Microbiol Mol Biol Rev. 1998; 62(3): 597–635. 164(1):49-53. Rasmussen, et al. Tumor cell-targeting by phage-displayed peptides. Stemmer, W. Rapid evolution of a protein in vitro by DNA shuffling Cancer Gene Ther. 2002; 9: 606-12. Nature. 1994; 370: 389-391. Rawlings, et al. Evolutionary families of peptidase inhibitors. Stickler, et al. Human population-based identification of CD4(+) Biochem J. 2004; 378: 705-16. T-cell peptide epitope determinants. JImmunol Methods. 2003; 281: Rebay, et al. Specific EGF repeats of Notch mediate interactions with 95-108. Delta and Serrate: Implications for notch as a multifunctional recep Stites, etal. Empirical evaluation of the influence of side chains on the tor. Cell 1991; 67:687-699. conformational entropy of the polypeptide backbone. Proteins. 1995; Roberge, et al. Construction and optimization of a CC49-based scFv 22: 132-140. beta-lactamasefusion protein for ADEPT. Protein Eng Des Sel. 2006; Stoll, et al. 
A mechanistic analysis of carrier-mediated oral delivery 19(4): 141-5. of protein therapeutics. J Control Release. 2000; 64; 217-28. Rosenfeld, et al. Biochemical, Biophysical, and Pharmacological Sturniolo, et al. Generation of tissue-specific and promiscuous HLA Characterization of Bacterially Expressed Human Agouti-Related ligand databases using DNA microarrays and virtual HLA class II Protein. Biochemistry. 1998; 37: 16041-52. matrices. Natural Biotechnol. 1999; 17: 555-561. Roussel, et al. Complexation of Two Proteic Insect Inhibitors to the Suetake, et al. Chitin-binding Proteins in and Active Site of Chymotrypsin Suggests Decoupled Roles for Binding Comprise a Common Chitin-binding Structural Motif, J Biol Chem. and Selectivity. J Biol Chem. 2001; 276: 38893-8. 2000; 275: 17929-32. Sahdev, et al. Production of active eukaryotic proteins through bac Suetake, et al. Production and characterization of recombinant terial expression systems: a review of the existing biotechnology tachycitin, the Cys-rich chitin-binding protein. Protein Eng. 2002; strategies. Mol Cell Biochem. Jan. 2008:307(1-2):249-64. 15: 763-9. Salloum, et al. Anakinra in experimental acute myocardial infarc Summers, et al. Baculovirus structural polypeptides. Virology. 1978; tion—does dosage or duration of treatment matter? Cardiovasc 84(2):390-402. Drugs. Ther. Apr. 2009:23(2): 129-35. Takahashi, etal. Solution structure ofhanatoxinl, agating modifier of Schellenberger, et al. A recombinant polypeptide extends the in vivo voltage-dependent K+ channels: common Surface features of gating half-life of peptides and proteins in a tunable manner. Nat Biotechnol. modifier toxins. J Mol Biol, 2000; 297: 771-80. Dec. 2009:27(12): 1186-90. Takenobu, et al. Development of p53 protein transduction therapy Schlapschy, et al. 
Fusion of a recombinant antibody fragment with a using membrane-permeable peptides and the application to oral can homo-amino-acid polymer: effects on biophysical properties and cer cells. Mol Cancer Ther. 2002; 1: 1043-9. prolonged plasma half-life. Protein Eng Des Sel. Jun. Tam, et al. A biomimetic strategy in the synthesis and fragmentation 2007:20(6):273-84. Epub Jun. 26, 2007. of cyclic protein. Protein Sci. 1998; 7:1583. Schole, et al. Efficient construction of a large collection of phage Tavladoraki, et al. A single-chain antibody fragment is functionally displayed combinatorial peptide libraries. Comb. Chem. & HTP expressed in the cytoplasm of both Escherichia coli and transgenic Screening 2005; 8:545-551. plants. Eur J. Biochem. 1999; 262(2):617-24. Schultz-Cherry, et al. Regulation of Transforming Growth Factor Tax, et al. Sequence of C. elegans lag-2 reveals a cell-signalling beta Activation by Discrete Sequences of Thrombospondin. J. Biol. domain shared with Delta and Serrate of Drosophila. Nature 1994; Chem. 1995; 270:7304-7310. 368: 150-154. Schultz-Cherry, et al. The type 1 repeats of thrombospondin 1 acti Terpe, K. Overview of tag protein fusions: from molecular and bio vate latent transforming growth factor-beta. J. Biol. Chem. 1994; chemical fundamentals to commercial systems. Appl Microbiol 269:26783-8. Biotechnol. Jan. 2003:60(5):523-33. Shen, et al. A Type I Peritrophic Matrix Protein from the Malaria Thai, et al. Antigen stability controls antigen presentation. J. Biol. Vector Anopheles gambiae Binds to Chitin. Cloning. Expression, and Chem. 2004; 279: 50257-50266. Characterization. J Biol Chem. 1998; 273: 17665-70. Tolkatchev, et al. Design and Solution Structure of a Well-Folded Sidhu, et al. Phage display for selection of novel binding peptides. Stack of Two beta-Hairpins Based on the Amino-Terminal Fragment Methods Enzymol. 2000; 328: 333-63. of Human Granulin A. Biochemistry, 2000; 39: 2878-86. Silverman, et al. 
Multivalent avimer proteins evolved by exon shuf Torres, et al. Solution structure of a defensin-like peptide from platy fling of a family of human receptor domains. Nat Biotechnol 2005; pus venom. Biochem J. 1999; 341 (Pt3): 785-794. 23: 1556-1561. Tur, et al. A novel approach for immunization, screening and char Simonet, et al. Structural and functional properties of a novel serine acterization of selected ScFv libraries using membrane fractions of protease inhibiting peptide family in arthropods. Comp Biochem tumor cells. Int J Mol Med. 2003; 11:523-7. Physiol B Biochem Mol Biol. 2002; 132: 247-55. Uversky, et al. Why are “natively unfolded” proteins unstructured Singh, et al. ProPred: Prediction of HLA-DR binding sites. under physiologic conditions? Proteins. Nov. 15, 2000:41(3):415-27. Bioinformatics. 2001; 17: 1236-1237. Valente, et al. Optimization of the primary recovery of human inter Skinner , et al. Purification and characterization of two classes of feron alpha2b from Escherichia coli inclusion bodies. Protein Expr neurotoxins from the funnel web , Agelenopsis aperta. J. Biol. Purif 2006: 45(1):226-34. Chem. 1989; 264:2150-2155. Van Den Hooven, etal. Disulfide Bond Structure of the AVR9 Elicitor Smith, et al. Phage Display. Chem Rev. 1997: 97: 391-410. of the Fungal Tomato Pathogen Cladosporium fulvum: Evidence for Smith, et al. Single-step purification of polypeptides expressed in a Cystine Knot. Biochemistry 2001; 40:3458-34.66. Escherichia coli as fusions with glutathione S-transferase. Gene. Van Vlijmen, et al. A novel database of disulfide patterns and its 1988: 67(1):31-40. application to the discovery of distantly related homologs. J Mol. So, et al. Contribution of conformational stability ofhen lysozyme to Biol. 2004; 335: 1083-1092. induction of type 2 T-helper immune responses. Immunology. 2001; Vanhercke, et al. Reducing mutational bias in random protein librar 104: 259-268. ies. Anal Biochem. 2005; 339:9-14. US 9,371.369 B2 Page 9

(56) References Cited Worn, et al. Correlation between in vitro stability and in vivo perfor mance of anti-GCN4 intrabodies as cytoplasmic inhibitors. J Biol OTHER PUBLICATIONS Chem. 2000: 275(4):2795-803. Worn, et al. Stability engineering of antibody single-chain Fv frag Vardar, et al. Nuclear Magnetic Resonance Structure of a Prototype ments. J Mol Biol. 2001: 305(5):989-1010. Lin 12-Notch Repeat Module from Human Notchl. Biochemistry Wrammert, et al. Rapid cloning of high-affinity human monoclonal antibodies against influenza virus. Nature. 2008; 453(7195):667-71. 2003; 42:7061-7067. Wright, et al. Intrinsically unstructured proteins: re-assessing the Venkatachalam, et al. Conformation of polypeptide chains. Annu Rev protein structure-function paradigm. J Mol Biol. Oct. 22. Biochem. 1969; 38: 45-82. 1999:293(2):321-31. Ventura. Sequence determinants of protein aggregation: tools to Xiong, et al. A Novel Adaptation of the Integrin PSI Domain increase protein solubility. Microb Cell Fact. 2005; 4(1): 11. Revealed from Its Crystal Structure. J Biol Chem. 2004; 279: 40252 Vestergaard-Bogind, et al. Single-file diffusion through the Ca"- 4. activated K' channel of human red cells. J. Membrane Biol. 1985; Xu, et al. Solution Structure of BmP02, a New Potassium Channel 88:67-75. Blocker from the Venom of the Chinese Scorpion Buthus martensi Voisey, et al. Agouti: from Mouse to Man, from Skin to Fat Pigment Karsch Biochemistry 2000; 39:13669-13675. Cell Res. 2002; 15: 10-18. Yamazaki, et al. A possible physiological function and the tertiary Vranken, et al. A 30 residue fragment of the carp granulin 1 protein structure of a 4-kDa peptide in legumes. Eur J Biochem. 2003; 270: folds into a stack of two f hairpins similar to that found in the native 1269-1276. protein J Pept Res. 1999; 53: 590-7. Yang, et al. CDR walking mutagenesis for the affinity maturation of Walker, et al. Using protein-based motifs to stabilize peptides. 
J Pept a potent human anti-HIV-1 antibody into the picomolar range. J Mol Res. Nov. 2003;62(5):214-26. Biol. 1995; 254:392-403. Wang, et al. Structure-function studies of omega-atracotoxin, a Yang, et al. Intestinal Peptide transport systems and oral drug avail potent antagonist of insect voltage-gated calcium channels. Eur J ability. Pharmaceutical Research. 1999; 16: 1331-1343. Biochem. 1999; 264: 488-494. Yang, et al. Tailoring structure-function and pharmacokinetic prop Ward, et al. Binding activities of a repertoire of single erties of single-chain Fv proteins by site-specific PEGylation. Protein immunoglobulin variable domains secreted from Escherichia coli. Eng. 2003; 16:761-70. Nature. 1989; 341 (6242):544-6. Yankai, et al. Ten tandem repeats of beta-hCG 109-118 enhance Watters, et al. An optimized method for cell-based phage display immunogenicity and anti-tumor effects of beta-hCG C-terminal panning. Immunotechnology. 1997; 3: 21-29. peptide carried by mycobacterial heat-shock protein HSP65. Weimer, et al. Prolonged in-vivo half-life of factor Vila by fusion to Biochem Biophys Res Commun. 2006; 345(4): 1365-71. albumin. Thromb Haemost. Apr. 2008:99(4):659-67. (Abstract only). Yuan, et al. Solution structure of the transforming growth factor Weiss, et al. A cooperative model for receptor recognition and cell beta-binding protein-like module, a domain associated with matrix adhesion: evidence from the molecular packing in the 1.6-A crystal fibrils. Embo J. 1997; 16: 6659-66. structure of the pheromone Er-1 from the ciliated protozoan Euplotes Zaveckas, et al. Effect of surface histidine mutations and their num raikovi. Proc Natl Acad Sci USA 1995; 92: 10172-6. ber on the partitioning and refolding of recombinant human granu Wentzel, et al. 
Sequence requirements of the GPNG beta-turn of the locyte-colony stimulating factor (CyS 17Ser) in aqueous two-phase Ecballium elaterium trypsin inhibitor II explored by combinatorial systems containing chelated metal ions. J Chromatogr B Analyt library screening. J Biol Chem. Jul. 23, 1999:274(30):21037-43. Technol Biomed Life Sci. 2007: 852(1-2):409-19. Werle, et al. The potential of cystine-knot microproteins as novel Zhu, et al. Molecular cloning and sequencing of two short chain and pharmacophoric scaffolds in oral peptide drug delivery. J. Drug Tar two long chain K(+) channel-blocking peptides from the Chinese geting 2006; 14:137-146. scorpion Buthus martensii Karsch. FEBS Lett 1999; 457:509-514. Werther, et al. Humanization of an anti-lymphocyte function-associ Chou; et al., “Prediction of the Secondary Structure of Proteins from ated antigen (LFA)-1 monoclonal antibody and reengineering of the Their Amino Acid Sequence, from Advances in Enzymology vol. 47. humanized antibody for binding to rhesus LFA-1. J Immunol 1996; John Wiley and Sons. Published 1978, p. 60.”. 157(11):4986-95. Office action dated Aug. 23, 2012 for U.S. Appl. No. 12/848,984. Whitlow, et al. Multivalent Fvs: characterization of single-chain Fv Prinz, et al. The Role of the Thioredoxin and Glutaredoxin Pathways oligomers and preparation of a bispecific Fv. Protein Eng. 1994; in Reducing Protein Disulfide Bonds in the Escherichia coli Cyto 7(8): 1017-26. plasm. J Biol Chem. 1997: 272(25): 15661-7. Winter, et al. Humanized antibodies. Trends Pharmacol Sci. May Schulz, et al. Potential of NIR-FT-Raman spectroscopy in natural 1993;14(5): 139-43. carotenoid analysis. Biopolymers. Mar. 2005:77 (4):212-21. Wittrup. Protein engineering by cell-surface display. Curr Opin Voet; et al., “Biochemistry (3rd Ed.). John Wiley and Sons. Published Biotechnol. 2001: 12: 395-9. 2004, p. 230.”. U.S. Patent Jun. 21, 2016 Sheet 1 of 48 US 9,371.369 B2

[Drawing sheets 1 and 2: figure labels not recovered. Panels A through G carry reference numerals 100, 101, 105, and 106.]

[FIG. 3: schematic drawings on two sheets, with reference numerals 102, 104, 106, 107, and 109.]

[FIG. 4: flow chart of XTEN development. Labeled steps: Design Rules; Motif Library; XTEN Segments; Expression Screen; Assembly; Sequence analysis; Full length XTEN; Expression; Purification; Solubility; in vitro Assays; Stability; XTEN Leads; Pharmacokinetics.]

[FIG. 5: assembly scheme. Labeled steps: Anneal synthetic oligos; Ligation; Gel purification; Clone into BsaI/KpnI-digested vector. Restriction sites shown: BsaI, BbsI, KpnI. Reference numerals: 500, 502, 504, 506, 507, 508.]

[FIG. 6: flow chart. Labeled steps: XTEN Motifs; XTEN Assembly; XTEN Sequence; BP Sequence; BPXTEN Sequence; Host Transformation; Characterization; Alter Sequence (on an Unacceptable Profile); Candidate Product (on an Acceptable Profile).]

[FIG. 7: vector maps in several panels (including C and D), labeled with the T7 promoter, a TEV protease site, restriction sites including BsaI, BbsI, and HindIII, and the construct name pCBD-XTEN.]

[FIG. 8: two panels labeled "Benchmark CBD-AM875" and "Benchmark CBD-AE864".]

[FIG. 9: design of the batch 2 codon libraries.]
LCW0569: ATGGCTNNNNNNGCTGGCTCTCCAACCTCCACTGAGGAAGGT (M A X X A G S P T S T E E G)
LCW0570: ATGGCTNNNNNNGAAAGCGCAACCCCTGAGTCCGGTCCAGGT (M A X X E S A T P E S G P G)
LCW0571: ATGGCTNNNNNNACTCCGTCTGGTGCTACCGGTTCCCCAGGT (M A X X T P S G A T G S P G)
X = APST, GS, or GE, encoded by TCAG/C/TCAG, AG/G/TC, or G/AG/AG respectively (diversity: 16, 4, and 4).
Batch 2 libraries are based on the 3 best clones from batch 1 screening.
* All 24 codons for the 6 amino acids G, E, S, P, A, and T are included.
* Each new library is composed of 3 x 3 = 9 pairs of annealed oligos.
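The codon schemes listed for FIG. 9 (TCAG/C/TCAG, AG/G/TC, and G/AG/AG) can be checked with a short sketch. It assumes the standard genetic code and reads each slash-separated scheme as the bases allowed at codon positions 1, 2, and 3; that reading is an interpretation of the figure text, not something the source states. The names `expand`, `summarize`, and `SCHEMES` are illustrative only.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at every position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed reading of the figure: allowed bases per codon position.
SCHEMES = {"APST": "TCAG/C/TCAG", "GS": "AG/G/TC", "GE": "G/AG/AG"}


def expand(scheme):
    """Return every codon permitted by a scheme such as 'TCAG/C/TCAG'."""
    positions = scheme.split("/")
    return ["".join(bases) for bases in product(*positions)]


def summarize(scheme):
    """Return (number of codons, sorted one-letter residues they encode)."""
    codons = expand(scheme)
    residues = sorted({CODON_TABLE[codon] for codon in codons})
    return len(codons), "".join(residues)


if __name__ == "__main__":
    for label, scheme in SCHEMES.items():
        count, residues = summarize(scheme)
        print(label, scheme, count, residues)
```

Under these assumptions the three schemes expand to 16, 4, and 4 codons, 24 in total, covering the six residues G, E, S, P, A, and T, which agrees with the diversity figures and the "all 24 codons" note in the figure.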

[FIG. 10: "Histogram of Retest Results from LCW569-567 GFP Screen". X-axis: Baselined Fluorescence Signal (AU). The benchmark CBD-AM875 is marked on the plot.]

[FIG. 11: summary of the library design.]
• Two preferred sequences identified: AM875 and AE864.
• These sequences lack the optimal 3rd and 4th codons determined in round #2, so new sequences were made: LCW546_06_3opt, LCW546_09_3opt, LCW546_06_4opt, LCW546_09_4opt, LCW546_06_34opt, LCW546_09_34opt.
• Use current sequences from Round #3: LCW546_06 and LCW546_09 (2 sequences).
• 8 total sequences available.
• Six preferred 36mers (2 AE, 2 AF, 2 AG): LCW402_20, LCW402_41, LCW403_60, LCW403_64, LCW404_31, LCW404_40.
• Recommend making two libraries: one in front of AM864 and one in front of AE864.
• Test all possible combinations of 12mers and 36mers.
• A total of 48 unique clones per library.

[FIG. 12: gel image with molecular weight markers at 220, 160, 110, 60, 40, and 20 kDa; lanes labeled "XTEN Proteins".]

[FIG. 13: stability panels labeled "Monkey in vivo" and "Kidney homogenate", with time points at 0, 1, and 7 days.]

[FIG. 14: no text recovered.]

[FIG. 15: trace with a peak labeled BPXTEN and size markers at 158, 44, and 17 kDa.]

[FIG. 16: no text recovered.]

[FIG. 17: binding curves. X-axis: IL-1ra, log nM. Legend: XTEN-hGH; Anakinra; CBD-XTEN-IL-1ra; IL-1ra-XTEN.]

[FIG. 18: gel image with molecular weight markers at 160, 110, 60, 40, and 20 kDa; lanes labeled 25 and 80 for IL-1ra and for IL-1ra XTEN.]

[FIG. 20: pharmacokinetic profiles. X-axis: Time (Hr). Legend: IL1ra-AM875, 10 mg/kg; IL1ra-AM875, 1 mg/kg; IL1ra-AE864, 10 mg/kg; IL1ra-AM1296, 1 mg/kg.]

[FIG. 21: no text recovered.]

[FIG. 22: Blood Glucose (mg/dL), with a legend distinguishing measurement days, including Day 29.]

[FIG. 23: pharmacokinetic profiles of GFP constructs on a logarithmic vertical axis. X-axis: Time (hr). Legend only partly legible; it includes GFP-AD836 and GFP-L288.]

[FIG. 24: spectrum plotted against Wavelength (nm), 180 to 300 nm.]

[FIG. 25: pharmacokinetic profiles on a logarithmic vertical axis. X-axis: Time (hr), 0 to 350. Legend: Ex4-XTEN IV; Ex4-XTEN SC.]

[FIG. 26: three panels plotted against Mass (kg) on a logarithmic axis from 0.01 to 100. Annotations: "predicted 139 hr" in the first panel; "R2 = 0.7778" and "predicted 30 ml/hr" in panel B; "R = 0.8554" and a predicted value of 5970 (unit truncated in the source) in panel C.]

[FIG. 27: plot with legend entries Gcg-Y288 and Glucagon; axis labels not recovered.]

[FIG. 28: dose-response plot with a "Cells Only" control. X-axis: log Compound (M).]

[FIG. 29: gel image with molecular weight markers at 160, 110, 60, 40, and 20 kDa; lanes for hGH and AM864-hGH at 25 C and 80 C.]

[FIG. 30: binding curve. X-axis: hGH, log nM.]

[FIG. 31: dose-response curve with an EC50 annotation in nM (value not reliably legible). X-axis: Log (Concentration).]

[FIG. 32: pharmacokinetic profiles on a logarithmic vertical axis. X-axis: Time (hr). Legend: hGH; AM864-hGH; AE912-hGH; AE912-hGH-AE144; AE912-hGH-AE288, each with its dose in mg/kg and nmol/kg.]

[FIG. 33: pharmacokinetic profiles on a logarithmic vertical axis. X-axis: Time (hours), 0 to 400. Legend only partly legible; it includes AM864-hGH.]

[FIG. 34: time course with legend entries for daily hGH dosing and for AM864-hGH groups (dose labels garbled). X-axis: Time (days).]

[FIG. 35: no text recovered.]

[FIG. 36: schematics labeled "Activation Peptide".]

[FIG. 37: no text recovered.]

[FIG. 38: schematic of XTEN release. Before release: long half-life, low clotting activity. After release: short half-life, high clotting activity.]

[FIG. 39: diagram of the coagulation cascade showing the extrinsic and intrinsic pathways leading to fibrin monomer, fibrin polymer, and cross-linked polymer. Abbreviations: TF = Tissue Factor; PL = Phospholipids; PT = Prothrombin (FII); TH = Thrombin (FIIa); PK = Plasma Kallikrein; HK = High Molecular Weight Kininogen. Adapted from Ferguson et al., Eur Heart J 1998; Suppl 19:8.]

[FIG. 40: schematic constructs, each including an XTEN segment; other labels not recovered.]

[FIG. 41: image with lanes or panels labeled AC294, AC296, AC299, and AC300.]

[FIG. 42: expression vector pCW0590 containing the FIX-XTEN AE864 gene (AC296).]

[FIG. 43: domain map of FIX. Unprocessed FIX residue positions: 1, 29, 47, 93, 130, 172, 192, 227, 461. Mature FIX residue positions: 1, 47, 84, 126, 146, 181, 415. A second panel marks positions 192 and 227 (unprocessed FIX), corresponding to 146 and 181 (mature FIX), with a label printed as "FXa". Legend: SP = signal peptide; PP = propeptide; GLA = GLA domain; EGF1 = EGF-like domain 1; EGF2 = EGF-like domain 2; AP = activation peptide; Catalytic = serine protease-like domain.]

Reaction Profiles for Factor VI-XTEN Fusion

- FVI-XTEN S . p &

S. 100 m UFV s p 9 oA . . . 1 ...... * x x ...... 2. 0.8 mu FVII v 9. s 0.16 mu FVII H s: , Background FVII deficient plasma O.2 - v. S is a s is is a -- a e O 50 1OO 150 2OO 250 Time (sec) Mean PT SD CV Activity (Sec) (mu/ml) 28.2 1.1 4% 2O3

FIG. 45
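The coefficient of variation reported with FIG. 45 follows from the mean and standard deviation shown in the same table. A minimal sketch, assuming CV is the standard deviation divided by the mean and expressed as a percentage (the function name `cv_percent` is hypothetical):

```python
def cv_percent(mean: float, sd: float) -> float:
    """Coefficient of variation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return sd / mean * 100.0


# Values taken from the FIG. 45 table: mean PT 28.2 sec, SD 1.1 sec.
cv = cv_percent(28.2, 1.1)
print(round(cv))
```

Rounded to a whole percent, the result matches the 4% shown in the figure.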

Expression vector pSecTag2C containing the FVII-XTEN AE864 gene (AC280)

FIG. 46

Axis label: Retention Volume (ml)

1. Glucagon-Y288
2. Glucagon-Y144
4. Glucagon-Y36
Standards

FIG. 47

EXTENDED RECOMBINANT POLYPEPTIDES AND COMPOSITIONS COMPRISING SAME

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation Application which claims the priority benefit of U.S. application Ser. No. 12/699,761, filed Feb. 3, 2010; which claims the priority benefit of U.S. Provisional Application Ser. Nos. 61/149,669, filed on Feb. 3, 2009; 61/268,193, filed Jun. 8, 2009; 61/185,112, filed Jun. 8, 2009; 61/236,493, filed Aug. 24, 2009; 61/236,836, filed Aug. 25, 2009; 61/243,707, filed Sep. 18, 2009; 61/245,490, filed Sep. 24, 2009; 61/280,955, filed Nov. 10, 2009; 61/280,956, filed Nov. 10, 2009; and 61/281,109, filed Nov. 12, 2009, which are hereby incorporated herein by reference in their entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under SBIR grant 2R44GM079873-02 awarded by the National Institutes of Health. The government has certain rights in the invention.

STATEMENT ACCORDING TO 37 C.F.R. §1.52(e)(5)-SEQUENCE LISTING

The specification further incorporates by reference the Sequence Listing submitted as a text file via EFS on Jun. 19, 2014. Pursuant to 37 C.F.R. §1.52(e)(5), the Sequence Listing text file, identified as 32808720.txt, is 7,133,717 bytes and was created on Feb. 8, 2010. The Sequence Listing, electronically filed via EFS, does not extend beyond the scope of the specification and thus does not contain new matter.

BACKGROUND OF THE INVENTION

Biologically active proteins including those as therapeutics are typically labile molecules exhibiting short shelf-lives, particularly when formulated in aqueous solutions. In addition, many biologically active peptides and proteins have limited solubility, or become aggregated during recombinant productions, requiring complex solubilization and refolding procedures. Various chemical polymers can be attached to such proteins to modify their properties. Of particular interest are hydrophilic polymers that have flexible conformations and are well hydrated in aqueous solutions. A frequently used polymer is polyethylene glycol (PEG). These polymers tend to have large hydrodynamic radii relative to their molecular weight (Kubetzko, S., et al. (2005) Mol Pharmacol, 68:1439-54), and can result in enhanced pharmacokinetic properties. Depending on the points of attachment, the polymers tend to have limited interactions with the protein that they have been attached to such that the polymer-modified protein retains its relevant functions. However, the chemical conjugation of polymers to proteins requires complex multi-step processes. Typically, the protein component needs to be produced and purified prior to the chemical conjugation step. In addition, the conjugation step can result in the formation of heterogeneous product mixtures that need to be separated, leading to significant product loss. Alternatively, such mixtures can be used as the final pharmaceutical product, but are difficult to standardize. Some examples are currently marketed PEGylated Interferon-alpha products that are used as mixtures (Wang, B. L., et al. (1998) J Submicrosc Cytol Pathol, 30:503-9; Dhalluin, C., et al. (2005) Bioconjug Chem, 16:504-17). Such mixtures are difficult to reproducibly manufacture and characterize as they contain isomers with reduced or no therapeutic activity.

Albumin and immunoglobulin fragments such as Fc regions have been used to conjugate other biologically active proteins, with unpredictable outcomes with respect to increases in half-life or immunogenicity. Unfortunately, the Fc domain does not fold efficiently during recombinant expression and tends to form insoluble precipitates known as inclusion bodies. These inclusion bodies must be solubilized and functional protein must be renatured. This is a time-consuming, inefficient, and expensive process that requires additional manufacturing steps and often complex purification procedures.

Thus, there remains a significant need for compositions and methods that would improve the biological, pharmacological, safety, and/or pharmaceutical properties of a biologically active protein.

SUMMARY OF THE INVENTION

The present disclosure is directed to compositions and methods that can be useful for enhancing the biological, pharmaceutical, safety and/or therapeutic properties of biologically active proteins. The compositions and methods are particularly useful for enhancing the pharmacokinetic properties, such as half-life, and increasing the time spent within the therapeutic window of a biologically active protein, as well as simplifying the production process and pharmaceutical properties, such as solubility, of such a biologically active protein.

In part, the present disclosure is directed to pharmaceutical compositions comprising fusion proteins and the uses thereof for treating diseases, disorders or conditions. The particular disease to be treated will depend on the choice of the biologically active proteins. In some embodiments, the compositions and methods are useful for treating metabolic and cardiovascular diseases (including but not limited to glucose- or insulin-related diseases), coagulation and bleeding disorders, and growth-hormone related disorders.

In one aspect, the present invention provides compositions of extended recombinant polypeptides (XTENs), that when linked to a biologically active protein enhances the pharmacokinetic properties, and/or increases the solubility and stability of the resulting fusion protein, while retaining or enhancing overall biologic and/or therapeutic activity of the biologically active protein. Such compositions may have utility to treat certain diseases, disorders or conditions, as described herein. The resulting fusion protein can exhibit a better safety profile and permit less frequent dosing, which in turn can lead to better patient compliance. The present invention also provides polynucleotides encoding the XTEN and the fusion proteins of biologically active proteins linked with XTEN, as well as polynucleotides complementary to polynucleotides that encode the XTEN and the fusion proteins of biologically active proteins linked with XTEN.

In another aspect, the present invention provides compositions of extended recombinant polypeptides (XTEN) that are useful as fusion partners that can be linked to biologically active proteins (BPs), resulting in monomeric BPXTEN fusion proteins.

In one embodiment, the invention provides an isolated extended recombinant polypeptide (XTEN) comprising greater than about 400 to about 3000 amino acid residues, wherein the XTEN is characterized in that the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) residues constitutes more than about 80%, or about 85%, or about 90%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99% of the total amino acid sequence of the XTEN, the XTEN sequence is substantially non-repetitive, the XTEN sequence lacks a predicted T-cell epitope when analyzed by TEPITOPE algorithm, wherein the TEPITOPE algorithm prediction for epitopes within the XTEN sequence is based on a score of -5, or -6, or -7, or -8, or -9 or greater, the XTEN sequence has greater than 90%, or greater than 91%, or greater than 92%, or greater than 93%, or greater than 94%, or greater than 95%, or greater than 96%, or greater than 97%, or greater than 98%, or greater than 99% random coil formation as determined by GOR algorithm, and the XTEN sequence has less than 2% alpha helices and 2% beta-sheets as determined by Chou-Fasman algorithm.

In another embodiment, the invention provides XTEN comprising greater than about 400 to about 3000 amino acid residues, wherein the XTEN is characterized in that the sum of asparagine and glutamine residues is less than 10% of the total amino acid sequence of the XTEN, the sum of methionine and tryptophan residues is less than 2% of the total amino acid sequence of the XTEN, the XTEN sequence has less than 5% amino acid residues with a positive charge, the XTEN sequence has greater than 90% random coil formation as determined by GOR algorithm and the XTEN sequence has less than 2% alpha helices and 2% beta-sheets as determined by Chou-Fasman algorithm.

In another embodiment, the invention provides XTEN comprising greater than about 400 to about 3000 amino acid residues, wherein the XTEN is characterized in that at least about 80%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% of the XTEN sequence consists of non-overlapping sequence motifs wherein each of the sequence motifs has about 9 to about 14 amino acid residues and wherein the sequence of any two contiguous amino acid residues does not occur more than twice in each of the sequence motifs, the sequence motifs consist of four to six types of amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), and the XTEN enhances pharmacokinetic properties of a biologically active protein when linked to the biologically active protein wherein the pharmacokinetic properties are ascertained by measuring the terminal half-life of the biologically active protein administered to a subject in comparison to the XTEN linked to the biologically active protein and administered to a subject at a comparable dose.

In some cases of the foregoing embodiments, no one type of amino acid constitutes more than 30% of the XTEN sequence. In other cases of the foregoing embodiments, the XTEN can have a sequence in which no three contiguous amino acids are identical unless the amino acid is serine, in which case no more than three contiguous amino acids are serine residues. In other cases of the foregoing embodiments, the XTEN sequence has a subsequence score of less than 10, or 9, or 8, or 7, or 6. In still other cases of the foregoing embodiments, at least about 80%, or about 90%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99%, or 100% of the XTEN sequence consists of non-overlapping sequence motifs, wherein each of the sequence motifs has 12 amino acid residues. In one embodiment, the XTEN sequence consists of non-overlapping sequence motifs, wherein the sequence motifs are from one or more sequences of Table 1.

In some embodiments, the enhanced pharmacokinetic property of the resulting fusion protein encompasses an increase in terminal half-life of at least about two-fold, or at least about three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about eight-fold, or at least about ten-fold. In some cases, the enhanced pharmacokinetic property is reflected by the fact that the blood concentrations that remain within the therapeutic window for the fusion protein for a given period are at least about two-fold, or at least about three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about eight-fold, or at least about ten-fold longer compared to the corresponding BP not linked to XTEN. The increase in half-life and time spent within the therapeutic window can permit less frequent dosing and decreased amounts of the fusion protein (in moles equivalent) that are administered to a subject, compared to the corresponding BP not linked to XTEN. In one embodiment, the therapeutically effective dose regimen results in a gain in time of at least two-fold, or at least three-fold, or at least four-fold, or at least five-fold, or at least six-fold, or at least eight-fold, or at least 10-fold between at least two consecutive Cmax peaks and/or Cmin troughs for blood levels of the fusion protein compared to the corresponding BP not linked to the fusion protein and administered using a comparable dose regimen to a subject.

In one embodiment, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or about 99% of the XTEN comprises motifs selected from one or more sequences of Table 1. In another embodiment, an XTEN exhibits at least about 80%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity to a sequence selected from Table 2.

In some cases, the XTEN enhances thermostability of a biologically active protein when linked to the biologically active protein wherein the thermostability is ascertained by measuring the retention of biological activity after exposure to a temperature of about 37°C for at least about 7 days of the biologically active protein in comparison to the XTEN linked to the biologically active protein. In one embodiment of the foregoing, the retention of biological activity is increased by at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 100%, or about 150%, at least about 200%, at least about 300%, or about 500% longer compared to the BP not linked to the XTEN comprises of the XTEN.

In another aspect, the invention provides an isolated fusion protein comprising an XTEN of any of the foregoing embodiments linked to a biologically active protein (BP). In some embodiments, the BP of the fusion protein exhibits at least about 80%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity to a sequence selected from Table 3, Table 4, Table 5, Table 6, Table 7, and Table 8. In another embodiment of the foregoing, the isolated fusion protein further comprises a second XTEN sequence wherein the cumulative total of amino acid residues in the XTEN sequences is greater than about 400 to about 3000 residues.

In some cases, the isolated fusion protein with an XTEN of one of the foregoing embodiments comprises a BP wherein the BP is a glucose regulating peptide. In one embodiment of the foregoing, the glucose regulating peptide is exendin-4. In another embodiment of the foregoing, the glucose regulating peptide is glucagon.

In other cases, the isolated fusion protein comprises a BP wherein the BP is a metabolic protein. In one embodiment of the foregoing, the metabolic protein is IL-1ra.

In still other cases, the isolated fusion protein comprises a BP wherein the BP is a coagulation factor. In one embodiment of the foregoing, the coagulation factor is factor IX. In another embodiment of the foregoing, the coagulation factor is factor VII.

In other cases, the isolated fusion protein comprises a BP wherein the BP is growth hormone.

In one embodiment, the isolated fusion protein can be less immunogenic compared to the biologically active protein not linked to the XTEN, wherein immunogenicity is ascertained by measuring production of IgG antibodies selectively binding to the biologically active protein after administration of comparable doses to a subject.

The fusion protein comprising XTEN and BP of the foregoing embodiments can comprise a spacer sequence between the BP and XTEN, wherein the spacer sequence comprises between about 1 to about 50 amino acid residues that optionally comprises a cleavage sequence. In one embodiment, the cleavage sequence is susceptible to cleavage by a protease. Non-limiting examples of such protease include FXIa, FXIIa, kallikrein, FVIIa, FIXa, FXa, thrombin, elastase-2, granzyme B, MMP-12, MMP-13, MMP-17 or MMP-20, TEV, enterokinase, rhinovirus 3C protease, and sortase A.

In some cases, the isolated fusion protein is configured to have reduced binding affinity for a target receptor of the corresponding BP, as compared to the corresponding BP not linked to the fusion protein. In one embodiment, the fusion protein exhibits binding for a target receptor of the BP in the range of about 0.01%-30%, or about 0.1% to about 20%, or about 1% to about 15%, or about 2% to about 10% of the binding capability of the corresponding BP that is not linked to the fusion protein. In a related embodiment, a fusion protein with reduced affinity can have reduced receptor-mediated clearance and a corresponding increase in half-life of at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90%, or at least about 100%, or at least about 150%, or at least about 200%, or at least about 300%, or at least about 500% compared to the corresponding BP that is not linked to the fusion protein.

In one embodiment, the invention provides an isolated fusion protein that exhibits at least about 80%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, or 100% sequence identity to a sequence selected from Table 40, Table 41, Table 42, Table 43, and Table 44.

In one embodiment, the invention provides compositions comprising a fusion protein, wherein the fusion protein comprises at least a first BP comprising a sequence that exhibits at least about 80% sequence identity, or 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence from any one of Tables 3-8, wherein the BP is linked to one or more extended recombinant polypeptides (XTEN) each comprising greater than about 100 to about 3000 amino acid residues, more preferably greater than about 400 to about 3000 amino acid residues, with a substantially non-repetitive sequence wherein the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) residues constitutes more than about 80%, or about 85%, or about 90%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99% of the total amino acid sequence of the XTEN. The XTEN component of the BPXTEN can lack a predicted T-cell epitope when analyzed by TEPITOPE algorithm, wherein the TEPITOPE algorithm prediction for epitopes within the XTEN sequence is based on a score of -7 or greater, or -8 or greater, or -9 or greater. The XTEN component of the BPXTEN can have a sequence with greater than 80%, or about 85%, or about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99% random coil formation as determined by GOR algorithm; and the XTEN sequence can have less than 2% alpha helices and 2% beta-sheets as determined by Chou-Fasman algorithm.

In some embodiments, the invention provides BPXTEN fusion proteins wherein the BPXTEN administered to a subject exhibits an increase in the terminal half-life for the BPXTEN compared to the corresponding BP not linked to the fusion protein of at least about two-fold longer, or at least about three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about seven-fold, or at least about eight-fold, or at least about nine-fold, or at least about ten-fold, or at least about 15-fold, or at least a 20-fold or greater increase in terminal half-life compared to the BP not linked to the fusion protein.

In some embodiments, the invention provides BPXTEN fusion proteins wherein the BPXTEN exhibits increased solubility of at least three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about seven-fold, or at least about eight-fold, or at least about nine-fold, or at least about ten-fold, or at least about 15-fold, or at least a 20-fold, or at least 40-fold, or at least 60-fold at physiologic conditions compared to the BP not linked to the fusion protein.

In some embodiments, BPXTEN fusion proteins exhibit an increased apparent molecular weight as determined by size exclusion chromatography, compared to the actual molecular weight, wherein the apparent molecular weight is at least about 100 kD, or at least about 150 kD, or at least about 200 kD, or at least about 300 kD, or at least about 400 kD, or at least about 500 kD, or at least about 600 kD, or at least about 700 kD, while the actual molecular weight of each BP component of the fusion protein is less than about 25 kD. Accordingly, the BPXTEN fusion proteins can have an Apparent Molecular Weight that is about 4-fold greater, or about 5-fold greater, or about 6-fold greater, or about 7-fold greater, or about 8-fold greater than the actual molecular weight of the fusion protein. In some cases, the isolated fusion protein of the foregoing embodiments exhibits an apparent molecular weight factor under physiologic conditions that is greater than about 4, or about 5, or about 6, or about 7, or about 8.

The invention provides BPXTEN in various configurations wherein the BP retains at least a portion of the biologic activity of the corresponding BP not linked to XTEN. In one embodiment, the BPXTEN comprises a BP linked to an XTEN in the configuration of formula I

(BP)-(S)x-(XTEN) I

wherein independently for each occurrence, BP is a biologically active protein as described hereinabove; S is a spacer sequence having between 1 to about 50 amino acid residues that can optionally include a cleavage sequence (as described above); x is either 0 or 1; and XTEN is an extended recombinant polypeptide (as described herein). The embodiment has particular utility where the BP requires a free N-terminus for desired biological activity or where linking of the C-terminus of the BP to the fusion protein reduces biological activity and it is desired to reduce the biological activity and/or side effects of the administered BPXTEN.

In another embodiment of the BPXTEN configuration, the invention provides a fusion protein in the configuration of formula II (components as described above):

(XTEN)-(S)x-(BP) II

The embodiment is particularly useful where the BP requires a free C-terminus for desired biological activity, or where linking of the N-terminus of the BP to the fusion protein reduces biological activity and it is desired to reduce the biological activity and/or side effects of the administered BPXTEN.

In another aspect, the invention provides isolated BPXTEN fusion proteins comprising two molecules of the BP and optionally comprising a second XTEN and/or a spacer sequence wherein the fusion protein has a configuration selected from formula III, formula IV, and formula V:

(BP)-(S)-(XTEN)-(S)-(BP)-(S)-(XTEN) III

wherein independently for each occurrence BP is a biologically active protein as described hereinabove, S is a spacer sequence having between 1 to about 50 amino acid residues that optionally comprises a cleavage sequence; w is either 0 or 1; x is either 0 or 1; y is either 0 or 1; z is either 0 or 1; and XTEN is an extended recombinant polypeptide (as described herein) wherein the cumulative total of XTEN amino acid residues is greater than 400 to about 3000.

In another aspect, the invention provides isolated fusion proteins comprising one molecule of the BP and two molecules of XTEN and optionally one or two spacer sequences, wherein the fusion protein is of formula VI:

(XTEN)-(S)-(BP)-(S)-(XTEN) VI

wherein independently for each occurrence BP is a biologically active protein, S is a spacer sequence having between 1 to about 50 amino acid residues that optionally comprises a cleavage sequence; w is either 0 or 1; x is either 0 or 1; y is either 0 or 1; z is either 0 or 1; and XTEN is an extended recombinant polypeptide (as described herein) wherein the cumulative total of XTEN amino acid residues is greater than 400 to about 3000.

Thus, the fusion proteins can be designed to have different configurations, N- to C-terminus, of a BP, XTEN, and optional spacer sequences, including but not limited to XTEN-BP, BP-XTEN, XTEN-S-BP, BP-S-XTEN, XTEN-BP-XTEN, BP-BP-XTEN, XTEN-BP-BP, BP-S-BP-XTEN, and XTEN-BP-S-BP. The choice of configuration can, as disclosed herein, confer particular pharmacokinetic, physico/chemical, or pharmacologic properties.

In one embodiment, the BPXTEN fusion proteins of formulas I, II, III, IV, or V, or VI described above exhibit a biological activity of at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90%, or at least about 95% of the biological activity compared to the BP not linked to the fusion protein. In another embodiment, the BPXTEN fusion proteins of formula I, II, III, IV, or V bind the same receptors or ligands as the corresponding parental biologically active protein that is not covalently linked to the fusion protein.

The invention provides isolated nucleic acids comprising a polynucleotide sequence selected from (a) a polynucleotide encoding the fusion protein of any of the foregoing embodiments, or (b) the complement of the polynucleotide of (a). In one embodiment of the foregoing, the isolated nucleic acid comprises a polynucleotide sequence that has at least 80% sequence identity, or about 85%, or at least about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99% to about 100% sequence identity to (a) a polynucleotide sequence that encodes a polypeptide selected from Table 40, Table 41, Table 42, Table 43, and Table 44; or (b) the complement of the polynucleotide of (a). The invention provides expression vectors comprising the nucleic acid of any of the embodiments hereinabove described in this paragraph. In one embodiment, the expression vector of the foregoing further comprises a recombinant regulatory sequence operably linked to the polynucleotide sequence. In another embodiment, the polynucleotide sequence of the expression vectors of the foregoing is fused in frame to a polynucleotide encoding a secretion signal sequence, which can be a prokaryotic signal sequence. In one embodiment, the secretion signal sequence is selected from OmpA, DsbA, and PhoA signal sequences.

The invention provides a host cell, which can comprise an expression vector disclosed in the foregoing paragraph. In one embodiment, the host cell is a prokaryotic cell. In another embodiment, the host cell is E. coli.

In one embodiment, the invention provides pharmaceutical compositions comprising the fusion protein of any of the foregoing embodiments and at least one pharmaceutically acceptable carrier. In another embodiment, the invention provides kits, comprising packaging material and at least a first container comprising the pharmaceutical composition of the foregoing embodiment and a label identifying the pharmaceutical composition and storage and handling conditions, and a sheet of instructions for the reconstitution and/or administration of the pharmaceutical compositions to a subject.

The invention further provides use of the pharmaceutical compositions comprising the fusion protein of any of the foregoing embodiments in the preparation of a medicament for treating a disease condition in a subject in need thereof. In one embodiment of the foregoing, the disease, disorder or condition, comprising administering the pharmaceutical composition described above to a subject in need thereof. In one embodiment of the foregoing, the disease, disorder or condition is selected from type 1 diabetes, type 2 diabetes, obesity, hyperglycemia, hyperinsulinemia, decreased insulin production, insulin resistance, syndrome X, hibernating myocardium or diabetic cardiomyopathy, excessive appetite, insufficient satiety, metabolic disorder, glucagonomas, polycystic ovary syndrome, dyslipidemia, hibernating myocardium, insufficient urinary sodium excretion, excessive urinary potassium concentration, conditions or disorders associated with toxic hypervolemia, diabetic cardiomyopathy, and retinal neurodegenerative processes. In some cases, the pharmaceutical composition can be administered subcutaneously, intramuscularly, or intravenously. In one embodiment, the pharmaceutical composition is administered at a therapeutically effective dose. In some cases of the foregoing, the therapeutically effective dose results in a gain in time spent within a therapeutic window for the fusion protein compared to the corresponding BP of the fusion protein not linked to the fusion protein and administered at a comparable dose to a subject. The gain in time spent within the therapeutic window can be at least three-fold greater than the corresponding BP not linked to the fusion protein, or alternatively, at least four-fold, or five-fold, or six-fold, or seven-fold, or eight-fold, or nine-fold, or at least 10-fold, or at least 20-fold greater than the corresponding BP not linked to the fusion protein.

US 9,371,369 B2 10 In another embodiment, invention provides a method of patent, or patent application was specifically and individually treating a disease, disorder or condition, comprising admin indicated to be incorporated by reference. istering the pharmaceutical composition described above to a Subject using multiple consecutive doses of the pharmaceu BRIEF DESCRIPTION OF THE DRAWINGS tical composition administered using a therapeutically effec tive dose regimen. In one embodiment of the foregoing, the The features and advantages of the invention may be fur therapeutically effective dose regimen can result in again in ther explained by reference to the following detailed descrip time of at least three-fold, or alternatively, at least four-fold, tion and accompanying drawings that sets forth illustrative or five-fold, or six-fold, or seven-fold, or eight-fold, or nine embodiments. 10 FIG. 1 shows schematic representations of exemplary fold, or at least 10-fold, or at least 20-fold between at least two BPXTEN fusion proteins (FIGS. 1A-G), all depicted in an N consecutive C peaks and/or Cfia troughs for blood levels to C-terminus orientation. FIG.
1A shows two different con of the fusion protein compared to the corresponding BP of the figurations of BPXTEN fusion proteins (100), each compris fusion protein not linked to the fusion protein and adminis ing a single biologically active protein (BP) and an XTEN, the tered at a comparable dose regimen to a Subject. In another 15 first of which has an XTEN molecule (102) attached to the embodiment of the foregoing, the administration of the fusion C-terminus of a BP (103), and the second of which has an protein results in a comparable improvement in at least one XTEN molecule attached to the N-terminus of a BP (103). measured parameter using less frequent dosing or a lower FIG. 1B shows two different configurations of BPXTEN total dosage in moles of the fusion protein of the pharmaceu fusion proteins (100), each comprising a single BP, a spacer tical composition compared to the corresponding biologi sequence and an XTEN, the first of which has an XTEN cally active protein component(s) not linked to the fusion molecule (102) attached to the C-terminus of a spacer protein and administered to a subject dusingatherapeutically sequence (104) and the spacer sequence attached to the C-ter effective regimen to a subject. The parameter can be selected minus of a BP (103) and the second of which has an XTEN from fasting glucose level, response to oral glucose tolerance molecule attached to the N-terminus of a spacer sequence test, peak change of postprandial glucose from baseline, 25 (104) and the spacer sequence attached to the N-terminus of HA, caloric intake, satiety, rate of gastric emptying, insulin a BP (103). FIG. 
1C shows two different configurations of secretion, peripheral insulin sensitivity, response to insulin BPXTEN fusion proteins (101), each comprising two mol challenge, beta cell mass, body weight, prothrombin time, ecules of a single BP and one molecule of an XTEN, the first bleeding time, thrombin-antithrombin III complex (TAT), of which has an XTEN linked to the C-terminus of a first BP D-dimer, incidence of bleeding episodes, erythrocyte sedi 30 and that BP is linked to the C-terminus of a second BP, and the second of which is in the opposite orientation in which the mentation rate (ESR), C-reactive protein, bone density, XTEN is linked to the N-terminus of a first BP and that BP is muscle mass, blood pressure, plasma triglycerides, HDL, linked to the N-terminus of a second BP. FIG. 1D shows two cholesterol, LDL cholesterol, incidence of angina, and car different configurations of BPXTEN fusion proteins (101), diac output. 35 each comprising two molecules of a single BP, a spacer In another aspect, the invention provides a method of sequence and one molecule of an XTEN, the first of which has improving a property of a biologically active protein, com an XTEN linked to the C-terminus of a spacer sequence and prising the step of linking a biologically active protein to the the spacer sequence linked to the C-terminus of a first BP XTEN of any of the foregoing embodiments to achieve a which is linked to the C-terminus of a second BP, and the property characterized in that (a) terminal half-life of the 40 second of which is in the opposite orientation in which the biologically active protein linked to the XTEN is longer as XTEN is linked to the N-terminus of a spacer sequence and compared to the terminal half-life of the biologically active the spacer sequence is linked to the N-terminus of a first BP protein that is not linked to the XTEN; (b) shelf-life of the that that BP is linked to the N-terminus of a second BP. FIG. 
biologically active protein linked to the XTEN is longer as 1E shows two different configurations of BPXTEN fusion compared to the shelf-life of the biologically active protein 45 proteins (101), each comprising two molecules of a single BP, that is not linked to the XTEN, wherein shelf-life is ascer a spacer sequence and one molecule of an XTEN, the first of tained by retention of biological activity after an interval which has an XTEN linked to the C-terminus of a first BP and compared to a baseline sample; (c) solubility under physi the first BP linked to the C-terminus of a spacer sequence ologic conditions of the biologically active protein linked to which is linked to the C-terminus of a second BP molecule, the XTEN is increased as compared to the solubility of the 50 and the second of which is in the opposite configuration of biologically active protein that is not linked to the XTEN; (d) XTEN linked to the N-terminus of a first BP which is linked production of IgG antibodies selectively binding to the bio to the N-terminus of a spacer sequence which in turn is linked logically active protein linked to the XTEN when adminis to the N-terminus of a second molecule of BP. 
FIG.1F shows tered to a Subject is reduced as compared to production of the two different configurations of BPXTEN fusion proteins IgG when the biologically active protein not linked to the 55 (105), each comprising two molecules of a single BP, and two XTEN is administered to a subject at a comparable dose; molecules of an XTEN, the first of which has a first XTEN and/or (e) time spent within the therapeutic window of the linked to the C-terminus of a first BP which is linked to the biologically active protein linked to the XTEN when admin C-terminus of a second XTEN that is linked to the C-terminus istered to a Subject is longer as compared to the biologically of a second molecule of BP, and the second of which is in the active protein that is not linked to the XTEN when adminis 60 opposite configuration of XTEN linked to the N-terminus of tered to a subject. a first BP linked to the N-terminus of a second XTEN linked to the N-terminus of a second BP. FIG. 1G shows a configu INCORPORATION BY REFERENCE ration (106) of a single BP linked to two XTEN at the N- and C-termini of the BP. All publications, patents, and patent applications men 65 FIG. 2 is a schematic illustration of exemplary polynucle tioned in this specification are herein incorporated by refer otide constructs (FIGS. 2A-G) of BPXTEN genes that encode ence to the same extent as if each individual publication, the corresponding BPXTEN polypeptides of FIG. 1; all US 9,371,369 B2 11 12 depicted in a 5' to 3' orientation. In these illustrative examples FIG.8 shows results of expression assays for the indicated the genes encode BPXTEN fusion proteins with one BP and constructs comprising GFP and XTEN sequences. The XTEN (100); or two BP, one spacer sequence and one XTEN expression cultures were assayed using a fluorescence plate (201); two BP and two XTEN (205); or one BP and two reader (excitation 395 nm, emission 510 nm) to determine the XTEN (206). 
In these depictions, the polynucleotides encode amount of GFP reporter present. The results, graphed as box the following components: XTEN (202), BP (203), and and whisker plots, indicate that while median expression spacer amino acids that can include a cleavage sequence levels were approximately half of the expression levels com (204), with all sequences linked in frame. pared to the “benchmark” CBD N-terminal helper domain, FIG. 3 is a schematic illustration of an exemplary mono the best clones from the libraries were much closer to the 10 benchmarks, indicating that further optimization around meric BPXTEN acted upon by an endogenously available those sequences was warranted. The results also show that the protease and the ability of the monomeric fusion protein or libraries starting with amino acids MA had better expression the reaction products to bind to a target receptor on a cell levels than those beginning with ME (see Example 14). Surface, with Subsequent cell signaling. FIG. 3A shows a FIG.9 shows three randomized libraries used for the third BPXTEN fusion protein (101) in which a BP (103) and an 15 and fourth codons in the N-terminal sequences of clones from XTEN (102) are linked by spacer sequences that contain a LCW546, LCW547 and LCW552. The libraries were cleavable sequence (104), the latter being susceptible to designed with the third and fourth residues modified such that MMP-13 protease (105). FIG.3B shows the reaction products all combinations of allowable XTEN codons were present at of a free BP. spacer sequence and XTEN. FIG. 3C shows the these positions, as shown. In order to include all the allowable interaction of the reaction product free BP (103) or BPXTEN XTEN codons for each library, nine pairs of oligonucleotides fusion protein (101) with target receptors (106) to BP on a cell encoding 12 amino acids with codon diversities of third and surface (107). 
In this case, desired binding to the receptor is fourth residues were designed, annealed and ligated into the exhibited when BP has a free C-terminus, as evidenced by the NdeI/Bsal restriction enzyme digested stuffer vector binding of free BP (103) to the receptor while uncleaved pCW0551 (Stuffer-XTEN AM875-GFP), and transformed fusion protein does not bind tightly to the receptor. FIG. 3D 25 into E. coli BL21 Gold(DE3) competent cells to obtain colo shows that the free BP (103), with high binding affinity, nies of the three libraries LCW0569 (SEQID NOS 1,708 and remains bound to the receptor (106), whilean intact BPXTEN 1,709), LCW0570 (SEQ ID NOS 1,710 and 1,711), and (101) is released from the receptor. FIG.3E shows the bound LCW0571 (SEQ ID NOS 1,712 and 1,713). BP has been internalized into an endosome (108) within the FIG. 10 shows a histogram of a retest of the top 75 clones cell (107), illustrating receptor-mediated clearance of the 30 after the optimization step, as described in Example 15, for bound BP and triggering cell signaling (109), portrayed as GFP fluorescence signal, relative to the benchmark stippled cytoplasm. CBD AM875 construct. The results indicated that several FIG. 4 is a schematic flowchart of representative steps in clones were now Superior to the benchmark clones, as seen in the assembly, production and the evaluation of a XTEN. FIG 10. FIG. 5 is a schematic flowchart of representative steps in 35 FIG. 11 is a schematic of a combinatorial approach under the assembly of a BP-XTEN polynucleotide construct encod taken for the union of codon optimization preferences for two ing a fusion protein. Individual oligonucleotides 501 are regions of the N-terminus 48 amino acids. 
The approach annealed into sequence motifs 502 Such as a 12 amino acid created novel 48mers at the N-terminus of the XTEN protein motif (“12-mer'), which is subsequently ligated with an oligo for evaluation of the optimization of expression that resulted containing BbsI, and KpnI restriction sites 503. Additional 40 in leader sequences that may be a solution for expression of sequence motifs from a library are annealed to the 12-mer XTEN proteins where the XTEN is N-terminal to the BP. until the desired length of the XTEN gene 504 is achieved. FIG. 12 shows an SDS-PAGE gel confirming expression of The XTEN gene is cloned into a stuffer vector. The vector preferred clones obtained from the XTENN-terminal codon encodes a Flag sequence 506 followed by a stopper sequence optimization experiments, in comparison to benchmark that is flanked by Bsal, BbsI, and Kipni sites 507 and an 45 XTEN clones comprising CBD leader sequences at the N-ter exendin-4 gene 508, resulting in the gene 500 encoding an minus of the construct sequences. BP-XTEN fusion for incorporation into a BPXTEN combi FIG. 13 shows an SDS-PAGE gel of samples from a sta nation. bility study of the fusion protein of XTEN AE864 fused to FIG. 6 is a schematic flowchart of representative steps in the N-terminus of GFP (see Example 21). The GFP-XTEN the assembly of a gene encoding fusion protein comprising a 50 was incubated in cynomolgus plasma and rat kidney lysate for biologically active protein (BP) and XTEN, its expression up to 7 days at 37°C. In addition, GFP-XTEN administered and recovery as a fusion protein, and its evaluation as a to cynomolgus monkeys was also assessed. Samples were candidate BPXTEN product. withdrawn at 0, 1 and 7 days and analyzed by SDS PAGE FIG. 7 is a schematic representation of the design of followed by detection using Western analysis and detection IL-1 raXTEN expression vectors with different processing 55 with antibodies against GFP. strategies. FIG. 
7A shows an expression vector encoding FIG. 14 shows two samples of 2 and 10 mcg of final XTEN fused to the 3' end of the sequence encoding biologi purified protein of IL-1 ra linked to XTEN AE864 subjected cally active protein IL-1 ra. Note that no additional leader to non-reducing SDS-PAGE, as described in Example 23. The sequences are required in this vector. FIG. 7B depicts an results show that the IL-1 raXTEN composition was recov expression vector encoding XTEN fused to the 5' end of the 60 ered by the process, with an approximate MW of about 160 sequence encoding IL-1ra with a CBD leader sequence and a kDa. TEV protease site. FIG.7C depicts an expression vector as in FIG. 15 shows the output of a representative size exclusion FIG. 7B where the CBD and TEV processing site have been chromatography analysis performed as described in Example replaced with an optimized N-terminal leader sequence 23. The calibration standards, shown in the dashed line, (NTS). FIG. 7D depicts an expression vector encoding an 65 include the markers thyroglobulin (670kDa), bovine gamma NTS sequence, an XTEN, a sequence encoding IL-1 ra, and globulin (158 kDa), chicken ovalbumin (44 kDa), equine than a second sequence encoding an XTEN. myoglobulin (17 kDa) and vitamin B12 (1.35kDa). The BPX US 9,371,369 B2 13 14 TEN component fusion protein of IL-1 ra linked to XTE ous drug administration. Values shown are the average--/- N AM875 is shown as the solid line. The data show that the SEM of 10 animals per group (20 animals in the placebo apparent molecular weight of the BPXTEN construct is sig group). nificantly larger than that expected for a globular protein (as FIG. 
23 shows the pharmacokinetic profile (plasma con shown by comparison to the standard proteins run in the same centrations) in cynomolgus monkeys after single doses of assay), and has an Apparent Molecular Weight significantly different compositions of GFP linked to unstructured greater than that determined by SDS-PAGE (data not shown). polypeptides of varying length, administered either Subcuta FIG. 16 shows the reverse phase C18 analysis of purified neously or intravenously, as described in Example 20. The IL-1 ra. XTEN AM875 The output, in absorbance versus compositions were GFP-L288, GFP-L576, GFP time, demonstrates the purity of the final product fusion pro 10 tein. XTEN AF576, GFP-Y576 and XTEN AD836-GFP. Blood FIG. 17 shows the results of the IL-1 receptor binding samples were analyzed at various times after injection and the assay, plotted as a function of IL-1 ra-XTEN AM875 or concentration of GFP in plasma was measured by ELISA IL-1 ra concentration to produce a binding isotherm. To esti using a polyclonal antibody against GFP for capture and a mate the binding affinity of each fusion protein for the IL-1 15 biotinylated preparation of the same polyclonal antibody for receptor, the binding data was fit to a sigmoidal dose-response detection. Results are presented as the plasma concentration curve. From the fit of the data an EC50 (the concentration of Versus time (h) after dosing and show, in particular, a consid IL-1 ra or IL-1 ra-XTEN at which the signal is half maximal) erable increase in half-life for the XTEN AD836-GFP, the for each construct was determined, as described in Example composition with the longest sequence length of XTEN. The 23. The negative control XTEN AM875-hGH construct construct with the shortest sequence length, the GFP-L288 showed no binding under the experimental conditions. had the shortest half-life. FIG. 18 shows an SDS-PAGE of a thermal stability study FIG. 
24 shows the near UV circular dichroism spectrum of comparing IL-1ra to the BPXTEN of IL-1ra linked to XTE Ex-4-XTEN AE864, performed as described in Example 30. N AM875, as described in Example 23. Samples of IL-1 ra FIG. 25 shows the pharmacokinetic results of Ex-4-AE864 and the IL-1 ra linked to XTEN were incubated at 25°C. and 25 administered to cynomolgus monkeys by the Subcutaneous 85°C. for 15 min, at which time any insoluble protein was and intravenous routes (see Example 27 for experimental rapidly removed by centrifugation. The soluble fraction was details). then analyzed by SDS-PAGE as shown in FIG. 18, and shows FIG. 26 illustrates allometric scaling results for predicted that only IL-1 ra-XTEN remained soluble after heating, while, human response to Ex-4-XTEN AE864 based on measured in contrast, recombinant IL-1 ra (without XTEN as a fusion 30 partner) was completely precipitated after heating. results from four animal species; i.e., mice, rats, cynomolgus FIG. 19 shows the results of an IL-1ra receptor binding monkeys and dogs. FIG. 26A shows measured terminal half assay performed on the samples shown in FIG. 19. As life versus body mass, with a predicted T1/2 inhumans of 139 described in Example 23, the recombinant IL-1 ra, which was h. FIG. 26B shows measured drug clearance versus body fully denatured by heat treatment, retained less than 0.1% of 35 mass, with a predicted clearance rate value of 30 ml/h in its receptor activity following heat treatment. However, humans. FIG. 26C shows measured volume of distribution IL-1 ra linked to XTEN retained approximately 40% of its versus body mass, with a predicted value of 5970 ml in receptor binding activity. humans. FIG. 20 shows the pharmacokinetic profile (plasma con FIG. 
27 shows the results of an in vitro cellular assay for centrations) after single subcutaneous doses of three different 40 glucagon activity, comparing glucagon to glucagon linked to BPXTEN compositions of IL-1ra linked to different XTEN Y288 (see Example 31 for experimental details). sequences, separately administered Subcutaneously to cyno FIG. 28 shows the results of an in vitro cellular assay for molgus monkeys, as described in Example 24. GLP-1 activity, comparing exendin-4 from two commercial FIG. 21 shows body weight results from a pharmacody sources (closed triangles) to exendin-4 linked to Y288 (closed namic and metabolic study using a combination of two fusion 45 squares), with untreated cells (closed diamonds) used as a proteins; i.e., glucagon linked to Y288 (Gcg-XTEN) and negative control (see Example 31 for experimental details). exendin-4 linked to AE864 (Ex-4-XTEN) combination effi The EC50 is indicated by the dashed line. cacy in a diet-induced obesity model in mice (see Example 26 FIG. 29 shows the effects of heat treatment on stability of for experimental details). The graph shows change in body hCH and AM864-hGH. FIG. 29A is an SDS-PAGE gel of the weight in Diet-Induced Obese mice over the course of 28 days 50 two preparations treated at 25°C. and 80° C. for 15 minutes, continuous drug administration. Values shown are the aver while FIG. 29B shows the corresponding percentage of age+/-SEM of 10 animals per group (20 animals in the pla receptor binding activity of the 80°C. sample relative to the cebo group). 25° C. treatment. FIG. 22 shows change in fasting glucose levels from a FIG. 30 shows the results of in vitro binding affinity assay pharmacodynamic and metabolic study using single and 55 of hCGH-AM864 (circles) and AM864-hGH (inverted tri combinations of two BPXTEN fusion proteins; i.e., glucagon angles) to hCHR-Fc. Unmodified recombinant hCH linked to Y288 (Gcg-XTEN) and exendin-4 linked to AE864 (squares) is shown for comparison. 
(Ex-4-XTEN) in a diet-induced obesity model in mice (see FIG.31 shows the results of in vitro binding affinity assay Example 26 for experimental details). Groups areas follows: of a growth hormone fusion protein with XTEN sequences Gr. 1 Tris Vehicle; Gr. 2 Ex-4-AE576, 10 mg/kg: Gr. 3 Ex-4- 60 linked to the N- and C-terminus of the hCH, compared to AE576, 20 mg/kg. Gr. 4 Vehicle, 50% DMSO; Gr. 5 unmodified hCH, binding to hCHR-Fc. The apparent EC50 Exenatide, 30 ug/kg/day; Gr. 6 Exenatide, 30 uI/kg/day+ values for each compound are listed for the Gcg-Y288 20 g/kg; Gr. 7 Gcg-Y288, 20 ug/kg; Gr. 8 Gcg AE912 hCH AE144 (circles) and unmodified recombinant Y288, 40 g/kg; Gr. 9 Ex-4-AE576 10 mg/kg--Gcg-Y288 20 hGH (squares). ug/kg; Gr. 10 Gcg-Y288 40 ug/kg+Ex-4-AE576 20 mg/kg. 65 FIG. 32 shows the pharmacokinetic results of four hCGH The graph shows the change in fasting blood glucose levels in BTXEN fusion proteins administered to rats by the subcuta Diet-Induced Obese mice over the course of 28 days continu neous route, compared to unmodified recombinant hCH. US 9,371,369 B2 15 16 FIG. 33 shows the concentration profiles of three hCGH of FIX. FIG. 8B is a matching schematic of FIXa showing the XTEN constructs after subcutaneous administration to cyno residue numbers of the removed activation peptide in both the molgus monkeys. unprocessed and the mature forms. FIG. 34 shows the effects of administration of hCGH or FIG. 44 is a chart showing the sequence of the FIX-XTEN AM864-hGH at the indicated doses on body weight in a construct FIX-CFXIa-AG864 (SEQID NO: 1819), pre-FXIa hypox rat model. The results show retention of biologic activ cleavage, and the resulting fragments 1, 2, 3 and 4 (SEQID ity by the BPXTEN constructs that is equivalent in potency to NOS 1820-1823, respectively, in order of appearance) after hGH, yet with less frequent dosing. proteolytic cleavage and activation to FIXa. FIG.35 shows the comparative effects of administration of FIG. 
45 shows a graph of results of a FVII assay, with placebo, hCH, and AM864-hGH on growth of cartilage in the 10 reaction profiles for various concentrations of FVII standards tibial epiphyseal plate in hypox rats, shown in histologic and the reaction profile of an FVII-XTEN. Results of the cross-sections of the tibia after 9 days of treatment (groups assay for FVII-XTEN are shown in the table below the graph. were those used as per FIG. 34). FIG. 46 is a plasmid map of the expression vector FIG. 36 shows a schematic representation of exemplary pCW0590 containing the FVII-XTEN gene. FIX and FIX-XTEN fusion proteins. FIG. 36A shows the 15 FIG. 47 shows results of a of a size exclusion chromatog domain architecture of native FIX, with the gamma-carboxy raphy analysis of glucagon-XTEN construct samples mea glutamate domain, the EGF1 and EGF2 domains, the activa Sured against protein standards of known molecular weight, tion peptide, and the protease domain. Arrows indicate the with the graph output as absorbance versus retention Volume. cleavage sites for the activation peptide domain. FIG. 36B The glucagon-XTEN constructs are 1) glucagon-Y288; 2) shows a FIX molecule with an XTEN polypeptide attached to glucagonY-144; 3) glucagon-Y72; and 4) glucagon-Y36. The the C-terminus, and indicates a site for proteolytic cleavage to results indicate an increase inapparent molecular weight with release the XTEN. increasing length of XTEN moiety. FIG. 37 illustrates several examples of FIX-XTEN con figurations and associated protease cleavage sites. FIG. 37A DETAILED DESCRIPTION OF THE INVENTION shows an FIX-XTEN with two proteolytic cleavage sites 25 (arrows) within the XTEN, proximal to the FIX portion of the Before the embodiments of the invention are described, it is fusion protein. FIG. 37B shows an FIX-XTEN that can be to be understood that such embodiments are provided by way autocatalytically activated, with release of the XTEN. FIG. 
of example only, and that various alternatives to the embodi 37C shows four configurations of FIX-XTEN, with the ments of the invention described herein may be employed in XTEN integrated between the various domains of FIX. FIG. 30 practicing the invention. Numerous variations, changes, and 37D shows an FIX-XTEN with the XTEN portion inserted substitutions will now occur to those skilled in the art without into the activation peptide, which would release the XTEN departing from the invention. upon the proteolytic activation of FIX. FIG. 37E illustrates Unless otherwise defined, all technical and scientific terms FIX-XTEN that contain multiple XTEN sequences inserted used herein have the same meaning as commonly understood between different domains. FIG. 37F illustrates FIX-XTEN 35 by one of ordinary skill in the art to which this invention where the XTEN has been inserted within a domain of FIX. belongs. Although methods and materials similar or equiva FIG. 37G illustrates FIX-XTEN where the XTEN is linked to lent to those described herein can be used in the practice or the C-terminus of FIX and contains multiple cleavage sites testing of the present invention, Suitable methods and mate near the N-terminus of the XTEN rials are described below. In case of conflict, the patent speci FIG. 38 is a schematic illustration of the release of XTEN 40 fication, including definitions, will control. In addition, the from FIX-XTEN. FIX-XTEN with the associated XTEN has materials, methods, and examples are illustrative only and not increased half-life but is largely in an inactive, pro-drug form. intended to be limiting. Numerous variations, changes, and After proteolytic cleavage at the XTEN release sites (arrows), substitutions will now occur to those skilled in the art without the FIX portion can be activated by the coagulation cascade, departing from the invention. but has a short half-life. 45 Definitions FIG. 
39 is a schematic of the coagulation cascade, showing As used herein, the following terms have the meanings both the extrinsic and intrinsic pathways. ascribed to them unless specified otherwise. FIG. 40 illustrates a strategy for FIX-XTEN design As used in the specification and claims, the singular forms approach using exon location within the gene encoding FIX, a”, “an and “the include plural references unless the con with exemplary sites for XTEN insertion between exon 50 text clearly dictates otherwise. For example, the term “a cell boundaries indicate by arrows. includes a plurality of cells, including mixtures thereof. FIG. 41. FIG. 41A is a schematic of three representative The terms “polypeptide', 'peptide', and “protein’ are used FIX-XTEN constructs AC296, AC299, and AC300. Two interchangeably herein to refer to polymers of amino acids of arrows in the FIX sequence indicate the FXI activation site. any length. The polymer may be linear or branched, it may The additional arrow in AC299 and AC300 illustrates the 55 comprise modified amino acids, and it may be interrupted by XTEN release site. FIG. 41B is a Western blot of thrombin non-amino acids. The terms also encompass an amino acid treated proteins AC296, AC299, and AC300 which shows, in polymer that has been modified, for example, by disulfide the circle, that FIX has been released from the XTEN moiety, bond formation, glycosylation, lipidation, acetylation, phos at a similar location as the FIX positive control on the left side phorylation, or any other manipulation, such as conjugation of the Western blot. 60 with a labeling component. FIG. 42 is a schematic representation of the design of As used herein the term "amino acid' refers to either natu FIX-XTEN expression vectorpCW0590 containing the FIX ral and/or unnatural or synthetic amino acids, including but XTEN gene. not limited to glycine and both the D or L optical isomers, and FIG. 43. FIG. 
43A is a schematic of the unprocessed FIX amino acid analogs and peptidomimetics. Standard single or polypeptide chain which is the expressed product, and the 65 three letter codes are used to designate amino acids. matured FIX polypeptide chain that includes the post trans The term “natural L-amino acid means the L optical iso lation cleavage at the N-terminus that occurs during secretion merforms of glycine (G), proline (P), alanine (A), valine (V), US 9,371,369 B2 17 18 leucine (L), isoleucine (I), methionine (M), cysteine (C), of its naturally occurring counterpart. In general, a polypep phenylalanine (F), tyrosine (Y), tryptophan (W), histidine tide made by recombinant means and expressed in a host cell (H), lysine (K), arginine (R), glutamine (Q), asparagine (N). is considered to be "isolated.” glutamic acid (E), aspartic acid (D), serine (S), and threonine An "isolated polynucleotide or polypeptide-encoding (T). nucleic acid or other polypeptide-encoding nucleic acid is a The term “non-naturally occurring as applied to nucleic acid molecule that is identified and separated from at sequences and as used herein, means polypeptide or poly least one contaminant nucleic acid molecule with which it is nucleotide sequences that do not have a counterpart to, are not ordinarily associated in the natural source of the polypeptide complementary to, or do not have a high degree of homology encoding nucleic acid. An isolated polypeptide-encoding with a wild-type or naturally-occurring sequence found in a 10 nucleic acid molecule is other than in the form or setting in mammal. For example, a non-naturally occurring polypep which it is found in nature. Isolated polypeptide-encoding tide may share no more than 99%, 98%. 95%, 90%, 80%, nucleic acid molecules therefore are distinguished from the 70%, 60%, 50% or even less amino acid sequence identity as specific polypeptide-encoding nucleic acid molecule as it compared to a natural sequence when Suitably aligned. 
exists in natural cells. However, an isolated polypeptide-en The terms “hydrophilic' and “hydrophobic” refer to the 15 coding nucleic acid molecule includes polypeptide-encoding degree of affinity that a substance has with water. A hydro nucleic acid molecules contained in cells that ordinarily philic Substance has a strong affinity for water, tending to express the polypeptide where, for example, the nucleic acid dissolve in, mix with, or be wetted by water, while a hydro molecule is in a chromosomal or extra-chromosomal location phobic substance substantially lacks affinity for water, tend different from that of natural cells. ing to repel and not absorb water and tending not to dissolve A “chimeric' protein contains at least one fusion polypep in or mix with or be wetted by water. Amino acids can be tide comprising regions in a different position in the sequence characterized based on their hydrophobicity. A number of than that which occurs in nature. The regions may normally scales have been developed. An example is a scale developed exist in separate proteins and are brought together in the by Levitt, M, et al., J Mol Biol (1976) 104:59, which is listed fusion polypeptide; or they may normally exist in the same in Hopp, T P. et al., Proc Natl AcadSci USA (1981)78:3824. 25 protein but are placed in a new arrangement in the fusion Examples of “hydrophilic amino acids' are arginine, lysine, polypeptide. Achimeric protein may be created, for example, threonine, alanine, asparagine, and glutamine. Of particular by chemical synthesis, or by creating and translating a poly interest are the hydrophilic amino acids aspartate, glutamate, nucleotide in which the peptide regions are encoded in the and serine, and glycine. Examples of “hydrophobic amino desired relationship. acids' are tryptophan, tyrosine, phenylalanine, methionine, 30 “Conjugated”, “linked,” “fused,” and “fusion” are used leucine, isoleucine, and Valine. interchangeably herein. 
These terms refer to the joining A "fragment” is a truncated form of a native biologically together of two more chemical elements or components, by active protein that retains at least a portion of the therapeutic whatever means including chemical conjugation or recombi and/or biological activity. A “variant' is a protein with nant means. For example, a promoter or enhancer is operably sequence homology to the native biologically active protein 35 linked to a coding sequence if it affects the transcription of the that retains at least a portion of the therapeutic and/or biologi sequence. Generally, “operably linked' means that the DNA cal activity of the biologically active protein. For example, a sequences being linked are contiguous, and in reading phase variant protein may share at least 70%, 75%, 80%, 85%,90%, or in-frame. An “in-frame fusion” refers to the joining of two 95%, 96%, 97%, 98% or 99% amino acid sequence identity or more open reading frames (ORFs) to form a continuous with the reference biologically active protein. As used herein, 40 longer ORF, in a manner that maintains the correct reading the term “biologically active protein moiety' includes pro frame of the original ORFs. Thus, the resulting recombinant teins modified deliberately, as for example, by site directed fusion protein is a single protein containing two ore more mutagenesis, insertions, or accidentally through mutations. segments that correspond to polypeptides encoded by the A "host cell' includes an individual cell or cell culture original ORFs (which segments are not normally so joined in which can be or has been a recipient for the subject vectors. 45 nature). Host cells include progeny of a single host cell. 
The progeny In the context of polypeptides, a "linear sequence' or a may not necessarily be completely identical (in morphology "sequence' is an order of amino acids in a polypeptide in an or in genomic of total DNA complement) to the original amino to carboxyl terminus direction in which residues that parent cell due to natural, accidental, or deliberate mutation. neighbor each other in the sequence are contiguous in the A host cell includes cells transfected in vivo with a vector of 50 primary structure of the polypeptide. A "partial sequence' is this invention. a linear sequence of part of a polypeptide that is known to “Isolated, when used to describe the various polypeptides comprise additional residues in one or both directions. disclosed herein, means polypeptide that has been identified “Heterologous' means derived from a genotypically dis and separated and/or recovered from a component of its natu tinct entity from the rest of the entity to which it is being ral environment. Contaminant components of its natural envi 55 compared. For example, a glycine rich sequence removed ronment are materials that would typically interfere with from its native coding sequence and operatively linked to a diagnostic or therapeutic uses for the polypeptide, and may coding sequence other than the native sequence is a heterolo include enzymes, hormones, and other proteinaceous or non gous glycine rich sequence. The term "heterologous' as proteinaceous solutes. AS is apparent to those of skill in the applied to a polynucleotide, a polypeptide, means that the art, a non-naturally occurring polynucleotide, peptide, 60 polynucleotide or polypeptide is derived from a genotypi polypeptide, protein, antibody, or fragments thereof, does not cally distinct entity from that of the rest of the entity to which require "isolation” to distinguish it from its naturally occur it is being compared. ring counterpart. 
In addition, a "concentrated", "separated" or "diluted" polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, is distinguishable from its naturally occurring counterpart in that the concentration or number of molecules per volume is generally greater than that

The terms "polynucleotides", "nucleic acids", "nucleotides" and "oligonucleotides" are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.

The term "complement of a polynucleotide" denotes a polynucleotide molecule having a complementary base sequence and reverse orientation as compared to a reference sequence, such that it could hybridize with a reference sequence with complete fidelity.

"Recombinant" as applied to a polynucleotide means that the polynucleotide is the product of various combinations of in vitro cloning, restriction and/or ligation steps, and other procedures that result in a construct that can potentially be expressed in a host cell.

The terms "gene" or "gene fragment" are used interchangeably herein. They refer to a polynucleotide containing at least one open reading frame that is capable of encoding a particular protein after being transcribed and translated. A gene or gene fragment may be genomic or cDNA, as long as the polynucleotide contains at least one open reading frame, which may cover the entire coding region or a segment thereof. A "fusion gene" is a gene composed of at least two heterologous polynucleotides that are linked together.

"Homology" or "homologous" refers to sequence similarity or interchangeability between two or more polynucleotide sequences or two or more polypeptide sequences. When using a program such as BestFit to determine sequence identity, similarity or homology between two different amino acid sequences, the default settings may be used, or an appropriate scoring matrix, such as blosum45 or blosum80, may be selected to optimize identity, similarity or homology scores. Preferably, polynucleotides that are homologous are those which hybridize under stringent conditions as defined herein and have at least 70%, preferably at least 80%, more preferably at least 90%, more preferably 95%, more preferably 97%, more preferably 98%, and even more preferably 99% sequence identity to those sequences.

The terms "stringent conditions" or "stringent hybridization conditions" include reference to conditions under which a polynucleotide will hybridize to its target sequence, to a detectably greater degree than other sequences (e.g., at least 2-fold over background). Generally, stringency of hybridization is expressed, in part, with reference to the temperature and salt concentration under which the wash step is carried out. Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short polynucleotides (e.g., 10 to 50 nucleotides) and at least about 60° C. for long polynucleotides (e.g., greater than 50 nucleotides); for example, "stringent conditions" can include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and three washes for 15 min each in 0.1×SSC/1% SDS at 60 to 65° C. Alternatively, temperatures of about 65° C., 60° C., 55° C., or 42° C. may be used. SSC concentration may be varied from about 0.1 to 2×SSC, with SDS being present at about 0.1%. Such wash temperatures are typically selected to be about 5° C. to 20° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; specifically see volume 2 and chapter 9. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, sheared and denatured salmon sperm DNA at about 100-200 μg/ml. Organic solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily apparent to those of ordinary skill in the art.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences. Percent identity may be measured over the length of an entire defined polynucleotide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polynucleotide sequence, for instance, a fragment of at least 45, at least 60, at least 90, at least 120, at least 150, at least 210 or at least 450 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

"Percent (%) amino acid sequence identity," with respect to the polypeptide sequences identified herein, is defined as the percentage of amino acid residues in a query sequence that are identical with the amino acid residues of a second, reference polypeptide sequence or a portion thereof, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

The term "non-repetitiveness" as used herein in the context of a polypeptide refers to a lack or limited degree of internal homology in a peptide or polypeptide sequence. The term "substantially non-repetitive" can mean, for example, that there are few or no instances of four contiguous amino acids

Half-life typically refers to the time required for half the quantity of an administered substance deposited in a living organism to be metabolized or eliminated by normal biological processes. The terms "t1/2", "terminal half-life", "elimination half-life" and "circulating half-life" are used interchangeably herein.
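This section defines the terminal half-life as ln(2)/Kel, where Kel is the terminal elimination rate constant obtained by linear regression of the terminal linear portion of the log concentration vs. time curve. The following is a minimal sketch of that calculation, not a validated pharmacokinetic routine. The function name, the choice of natural logarithms, and the rule of simply taking the last `n_terminal` sampling points as the terminal portion are assumptions made for illustration.

```python
import math


def terminal_half_life(times, concentrations, n_terminal=3):
    """Estimate t1/2 = ln(2)/Kel from the last n_terminal sampling points.

    Kel is taken as the negative slope of an ordinary least-squares fit of
    ln(concentration) against time (assumption: natural log; with log10 the
    slope would have to be multiplied by ln(10)).
    """
    if len(times) != len(concentrations):
        raise ValueError("times and concentrations must have equal length")
    if n_terminal < 2 or len(times) < n_terminal:
        raise ValueError("need at least two terminal points")

    t = list(times[-n_terminal:])
    y = [math.log(c) for c in concentrations[-n_terminal:]]

    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))

    kel = -slope  # elimination lowers concentration, so the slope is negative
    if kel <= 0:
        raise ValueError("concentrations do not decline in the terminal phase")
    return math.log(2) / kel
```

For a purely mono-exponential decline the estimate does not depend on how many terminal points are used; with real data the choice of the terminal portion is a judgment call that this sketch does not attempt to automate.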
in the sequence that are identical amino acid types or that the polypeptide has a subsequence score (defined infra) of 10 or less or that there isn't a pattern in the order, from N- to C-terminus, of the sequence motifs that constitute the polypeptide sequence. The term "repetitiveness" as used herein in the context of a polypeptide refers to the degree of internal homology in a peptide or polypeptide sequence. In contrast, a "repetitive" sequence may contain multiple identical copies of short amino acid sequences. For instance, a polypeptide sequence of interest may be divided into n-mer sequences and the number of identical sequences can be counted. Highly repetitive sequences contain a large fraction of identical sequences while non-repetitive sequences contain few identical sequences. In the context of a polypeptide, a sequence can contain multiple copies of shorter sequences of defined or variable length, or motifs, in which the motifs themselves have non-repetitive sequences, rendering the full length polypeptide substantially non-repetitive. The length of polypeptide within which the non-repetitiveness is measured can vary from 3 amino acids to about 200 amino acids, from about 6 to about 50 amino acids, or from about 9 to about 14 amino acids. "Repetitiveness" used in the context of polynucleotide sequences refers to the degree of internal homology in the sequence such as, for example, the frequency of identical nucleotide sequences of a given length. Repetitiveness can, for example, be measured by analyzing the frequency of identical sequences.

A "vector" is a nucleic acid molecule, preferably self-replicating in an appropriate host, which transfers an inserted nucleic acid molecule into and/or between host cells.

"Apparent Molecular Weight Factor" or "Apparent Molecular Weight" are related terms referring to a measure of the relative increase or decrease in apparent molecular weight exhibited by a particular amino acid sequence. The Apparent Molecular Weight is determined using size exclusion chromatography (SEC) and similar methods compared to globular protein standards and is measured in "apparent kD" units. The Apparent Molecular Weight Factor is the ratio between the Apparent Molecular Weight and the actual molecular weight; the latter predicted by adding, based on amino acid composition, the calculated molecular weight of each type of amino acid in the composition.

The "hydrodynamic radius" or "Stokes radius" is the effective radius (Rh in nm) of a molecule in a solution measured by assuming that it is a body moving through the solution and resisted by the solution's viscosity. In the embodiments of the invention, the hydrodynamic radius measurements of the XTEN fusion proteins correlate with the Apparent Molecular Weight Factor, which is a more intuitive measure. The "hydrodynamic radius" of a protein affects its rate of diffusion in aqueous solution as well as its ability to migrate in gels of macromolecules. The hydrodynamic radius of a protein is determined by its molecular weight as well as by its structure, including shape and compactness. Methods for determining the hydrodynamic radius are well known in the art, such as by the use of size exclusion chromatography (SEC), as described in U.S. Pat. Nos. 6,406,632 and 7,294,513. Most proteins have globular structure, which is the most compact three-dimensional structure a protein can have with the smallest hydrodynamic radius.
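The Apparent Molecular Weight Factor defined above is a simple ratio, and the arithmetic can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the residue masses are standard average values that are not given in this specification, and the apparent molecular weight is assumed to have been measured separately by SEC against globular standards.

```python
# Average residue masses in daltons (standard reference values, assumed here;
# one water molecule is added per polypeptide chain for the free termini).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.015


def calculated_molecular_weight(sequence):
    """Actual molecular weight (Da) predicted from amino acid composition."""
    return sum(RESIDUE_MASS[residue] for residue in sequence.upper()) + WATER_MASS


def apparent_mw_factor(apparent_kd, sequence):
    """Ratio of the SEC-derived apparent MW (kD) to the calculated MW (kD)."""
    actual_kd = calculated_molecular_weight(sequence) / 1000.0
    return apparent_kd / actual_kd
```

A factor near 1 indicates migration like a globular protein of the same mass; a larger factor indicates a larger hydrodynamic radius than the mass alone would suggest.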
The term includes vectors that function primarily for insertion of DNA or RNA into a cell, replication of vectors that function primarily for the replication of DNA or RNA, and expression vectors that function for transcription and/or translation of the DNA or RNA. Also included are vectors that provide more than one of the above functions. An "expression vector" is a polynucleotide which, when introduced into an appropriate host cell, can be transcribed and translated into a polypeptide(s). An "expression system" usually connotes a suitable host cell comprised of an expression vector that can function to yield a desired expression product.

"Serum degradation resistance," as applied to a polypeptide, refers to the ability of the polypeptides to withstand degradation in blood or components thereof, which typically involves proteases in the serum or plasma. The serum degradation resistance can be measured by combining the protein with human (or mouse, rat, monkey, as appropriate) serum or plasma, typically for a range of days (e.g. 0.25, 0.5, 1, 2, 4, 8, 16 days), typically at about 37° C. The samples for these time points can be run on a Western blot assay and the protein is detected with an antibody. The antibody can be to a tag in the protein. If the protein shows a single band on the western, where the protein's size is identical to that of the injected protein, then no degradation has occurred. In this exemplary method, the time point where 50% of the protein is degraded, as judged by Western blots or equivalent techniques, is the serum degradation half-life or "serum half-life" of the protein.

The term "t1/2" as used herein means the terminal half-life calculated as ln(2)/Kel. Kel is the terminal elimination rate constant calculated by linear regression of the terminal linear portion of the log concentration vs. time curve.

Some proteins adopt a random and open, unstructured, or linear conformation and as a result have a much larger hydrodynamic radius compared to typical globular proteins of similar molecular weight.

"Physiological conditions" refer to a set of conditions in a living host as well as in vitro conditions, including temperature, salt concentration, pH, that mimic those conditions of a living subject. A host of physiologically relevant conditions for use in in vitro assays have been established. Generally, a physiological buffer contains a physiological concentration of salt and is adjusted to a neutral pH ranging from about 6.5 to about 7.8, and preferably from about 7.0 to about 7.5. A variety of physiological buffers is listed in Sambrook et al. (1989). Physiologically relevant temperature ranges from about 25° C. to about 38° C., and preferably from about 35° C. to about 37° C.

A "reactive group" is a chemical structure that can be coupled to a second reactive group. Examples for reactive groups are amino groups, carboxyl groups, sulfhydryl groups, hydroxyl groups, aldehyde groups, azide groups. Some reactive groups can be activated to facilitate coupling with a second reactive group. Examples for activation are the reaction of a carboxyl group with carbodiimide, the conversion of a carboxyl group into an activated ester, or the conversion of a carboxyl group into an azide function.

"Controlled release agent", "slow release agent", "depot formulation" or "sustained release agent" are used interchangeably to refer to an agent capable of extending the duration of release of a polypeptide of the invention relative to the duration of release when the polypeptide is administered in the absence of agent. Different embodiments of the present invention may have different release rates, resulting in different therapeutic amounts.

The terms "antigen", "target antigen" or "immunogen" are
used interchangeably herein to refer to the structure or binding determinant that an antibody fragment or an antibody fragment-based therapeutic binds to or has specificity against.

The term "payload" as used herein refers to a protein or peptide sequence that has biological or therapeutic activity; the counterpart to the pharmacophore of small molecules. Examples of payloads include, but are not limited to, cytokines, enzymes, hormones and blood and growth factors. Payloads can further comprise genetically fused or chemically conjugated moieties such as chemotherapeutic agents, antiviral compounds, toxins, or contrast agents. These conjugated moieties can be joined to the rest of the polypeptide via a linker which may be cleavable or non-cleavable.

The term "antagonist", as used herein, includes any molecule that partially or fully blocks, inhibits, or neutralizes a biological activity of a native polypeptide disclosed herein. Methods for identifying antagonists of a polypeptide may comprise contacting a native polypeptide with a candidate antagonist molecule and measuring a detectable change in one or more biological activities normally associated with the native polypeptide. In the context of the present invention, antagonists may include proteins, nucleic acids, carbohydrates, antibodies or any other molecules that decrease the effect of a biologically active protein.

The term "agonist" is used in the broadest sense and includes any molecule that mimics a biological activity of a native polypeptide disclosed herein. Suitable agonist molecules specifically include agonist antibodies or antibody fragments, fragments or amino acid sequence variants of native polypeptides, peptides, small organic molecules, etc. Methods for identifying agonists of a native polypeptide may comprise contacting a native polypeptide with a candidate agonist molecule and measuring a detectable change in one or more biological activities normally associated with the native polypeptide.

"Activity" for the purposes herein refers to an action or effect of a component of a fusion protein consistent with that of the corresponding native biologically active protein, wherein "biological activity" refers to an in vitro or in vivo biological function or effect, including but not limited to receptor binding, antagonist activity, agonist activity, or a cellular or physiologic response.

As used herein, "treatment" or "treating," or "palliating" or "ameliorating" are used interchangeably. These terms refer to an approach for obtaining beneficial or desired results including but not limited to a therapeutic benefit and/or a prophylactic benefit. By therapeutic benefit is meant eradication or amelioration of the underlying disorder being treated. Also, a therapeutic benefit is achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder such that an improvement is observed in the subject, notwithstanding that the subject may still be afflicted with the underlying disorder. For prophylactic benefit, the compositions may be administered to a subject at risk of developing a particular disease, or to a subject reporting one or more of the physiological symptoms of a disease, even though a diagnosis of this disease may not have been made.

A "therapeutic effect", as used herein, refers to a physiologic effect, including but not limited to the cure, mitigation, amelioration, or prevention of disease in humans or other animals, or to otherwise enhance physical or mental wellbeing of humans or animals, caused by a fusion polypeptide of the invention other than the ability to induce the production of an antibody against an antigenic epitope possessed by the biologically active protein.

Determination of a therapeutically effective amount is well within the capability of those skilled in the art, especially in light of the detailed disclosure provided herein.

The terms "therapeutically effective amount" and "therapeutically effective dose", as used herein, refer to an amount of a biologically active protein, either alone or as a part of a fusion protein composition, that is capable of having any detectable, beneficial effect on any symptom, aspect, measured parameter or characteristics of a disease state or condition when administered in one or repeated doses to a subject. Such effect need not be absolute to be beneficial.

The term "therapeutically effective dose regimen", as used herein, refers to a schedule for consecutively administered doses of a biologically active protein, either alone or as a part of a fusion protein composition, wherein the doses are given in therapeutically effective amounts to result in sustained beneficial effect on any symptom, aspect, measured parameter or characteristics of a disease state or condition.

I). General Techniques

The practice of the present invention employs, unless otherwise indicated, conventional techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics and recombinant DNA, which are within the skill of the art. See Sambrook, J. et al., "Molecular Cloning: A Laboratory Manual," 3rd edition, Cold Spring Harbor Laboratory Press, 2001; "Current protocols in molecular biology", F. M. Ausubel, et al. eds., 1987; the series "Methods in Enzymology," Academic Press, San Diego, Calif.; "PCR 2: a practical approach", M. J. MacPherson, B. D. Hames and G. R. Taylor eds., Oxford University Press, 1995; "Antibodies, a laboratory manual" Harlow, E. and Lane, D. eds., Cold Spring Harbor Laboratory, 1988; "Goodman & Gilman's The Pharmacological Basis of Therapeutics," 11th Edition, McGraw-Hill, 2005; and Freshney, R. I., "Culture of Animal Cells: A Manual of Basic Technique," 4th edition, John Wiley & Sons, Somerset, N.J., 2000, the contents of which are incorporated in their entirety herein by reference.

II). Extended Recombinant Polypeptides

The present invention provides compositions comprising extended recombinant polypeptides ("XTEN" or "XTENs"). In some embodiments, XTEN are generally extended length polypeptides with non-naturally occurring, substantially non-repetitive sequences that are composed mainly of small hydrophilic amino acids, with the sequence having a low degree or no secondary or tertiary structure under physiologic conditions.

In one aspect of the invention, XTEN polypeptide compositions are disclosed that are useful as fusion partners that can be linked to biologically active proteins ("BP"), resulting in BPXTEN fusion proteins (e.g., monomeric fusions). XTENs can have utility as fusion protein partners in that they can confer certain chemical and pharmaceutical properties when linked to a biologically active protein to create a fusion protein. Such desirable properties include but are not limited to enhanced pharmacokinetic parameters and solubility characteristics, amongst other properties described below. Such fusion protein compositions may have utility to treat certain diseases, disorders or conditions, as described herein. As used herein, "XTEN" specifically excludes antibodies or antibody fragments such as single-chain antibodies, Fc fragments of a light chain or a heavy chain.

In some embodiments, XTEN are long polypeptides having greater than about 100 to about 3000 amino acid residues, preferably greater than 400 to about 3000 residues when used as a single sequence, and cumulatively have greater than about 400 to about 3000 amino acid residues when more than one XTEN unit is used in a single fusion protein or conjugate. In other cases, where an increase in half-life of the fusion protein is not needed but where an increase in solubility or other physico/chemical property for the biologically active protein fusion partner is desired, an XTEN sequence shorter than 100 amino acid residues, such as about 96, or about 84, or about 72, or about 60, or about 48, or about 36 amino acid residues may be incorporated into a fusion protein composition with the BP to effect the property.

The selection criteria for the XTEN to be linked to the biologically active proteins to create the inventive fusion proteins generally relate to attributes of physical/chemical properties and conformational structure of the XTEN that can be, in turn, used to confer enhanced pharmaceutical and pharmacokinetic properties to the fusion proteins. The XTEN of the present invention may exhibit one or more of the following advantageous properties: conformational flexibility, enhanced aqueous solubility, high degree of protease resistance, low immunogenicity, low binding to mammalian receptors, and increased hydrodynamic (or Stokes) radii; properties that can make them particularly useful as fusion protein partners. Non-limiting examples of the properties of the fusion proteins comprising BP that may be enhanced by XTEN include increases in the overall solubility and/or metabolic stability, reduced susceptibility to proteolysis, reduced immunogenicity, reduced rate of absorption when administered subcutaneously or intramuscularly, and enhanced pharmacokinetic properties such as terminal half-life and area under the curve (AUC), slower absorption after subcutaneous or intramuscular injection (compared to BP not linked to XTEN) such that the Cmax is lower, which may, in turn, result in reductions in adverse effects of the BP that, collectively, can result in an increased period of time that a fusion protein of a BPXTEN composition administered to a subject remains within a therapeutic window, compared to the corresponding BP component not linked to XTEN.

A variety of methods and assays are known in the art for determining the physical/chemical properties of proteins such as the fusion protein compositions comprising the inventive XTEN; properties such as secondary or tertiary structure, solubility, protein aggregation, melting properties, contamination and water content. Such methods include analytical centrifugation, EPR, HPLC-ion exchange, HPLC-size exclusion, HPLC-reverse phase, light scattering, capillary electrophoresis, circular dichroism, differential scanning calorimetry, fluorescence, HPLC-ion exchange, HPLC-size exclusion, IR, NMR, Raman spectroscopy, refractometry, and UV/Visible spectroscopy. Additional methods are disclosed in Arnau et al., Prot Expr and Purif (2006) 48, 1-13. Application of these methods to the invention would be within the grasp of a person skilled in the art.

Typically, the XTEN component of the fusion proteins are designed to behave like denatured peptide sequences under physiological conditions, despite the extended length of the polymer. Denatured describes the state of a peptide in solution that is characterized by a large conformational freedom of the peptide backbone. Most peptides and proteins adopt a denatured conformation in the presence of high concentrations of denaturants or at elevated temperature. Peptides in denatured conformation have, for example, characteristic circular dichroism (CD) spectra and are characterized by a lack of long-range interactions as determined by NMR. "Denatured conformation" and "unstructured conformation" are used synonymously herein. In some cases, the invention provides XTEN sequences that, under physiologic conditions, can resemble denatured sequences largely devoid in secondary structure. In other cases, the XTEN sequences can be substantially devoid of secondary structure under physiologic conditions. "Largely devoid," as used in this context, means that less than 50% of the XTEN amino acid residues of the XTEN sequence contribute to secondary structure as measured or determined by the means described herein. "Substantially devoid," as used in this context, means that at least about 60%, or about 70%, or about 80%, or about 90%, or about 95%, or at least about 99% of the XTEN amino acid residues of the XTEN sequence do not contribute to secondary structure, as measured or determined by the means described herein.

A variety of methods have been established in the art to discern the presence or absence of secondary and tertiary structures in a given polypeptide. In particular, secondary structure can be measured spectrophotometrically, e.g., by circular dichroism spectroscopy in the "far-UV" spectral region (190-250 nm). Secondary structure elements, such as alpha-helix and beta-sheet, each give rise to a characteristic shape and magnitude of CD spectra. Secondary structure can also be predicted for a polypeptide sequence via certain computer programs or algorithms, such as the well-known Chou-Fasman algorithm (Chou, P. Y., et al. (1974) Biochemistry, 13: 222-45) and the Garnier-Osguthorpe-Robson ("GOR") algorithm (Garnier J, Gibrat JF, Robson B. (1996), GOR method for predicting protein secondary structure from amino acid sequence. Methods Enzymol 266:540-553), as described in US Patent Application Publication No. 20030228309A1. For a given sequence, the algorithms can predict whether there exists some or no secondary structure at all, expressed as the total and/or percentage of residues of the sequence that form, for example, alpha-helices or beta-sheets or the percentage of residues of the sequence predicted to result in random coil formation (which lacks secondary structure).

In some cases, the XTEN sequences used in the inventive fusion protein compositions can have an alpha-helix percentage ranging from 0% to less than about 5% as determined by a Chou-Fasman algorithm. In other cases, the XTEN sequences of the fusion protein compositions can have a beta-sheet percentage ranging from 0% to less than about 5% as determined by a Chou-Fasman algorithm. In some cases, the XTEN sequences of the fusion protein compositions can have an alpha-helix percentage ranging from 0% to less than about 5% and a beta-sheet percentage ranging from 0% to less than about 5% as determined by a Chou-Fasman algorithm. In preferred embodiments, the XTEN sequences of the fusion protein compositions will have an alpha-helix percentage less than about 2% and a beta-sheet percentage less than about 2%. In other cases, the XTEN sequences of the fusion protein compositions can have a high degree of random coil percentage, as determined by a GOR algorithm. In some embodiments, an XTEN sequence can have at least about 80%, more preferably at least about 90%, more preferably at least about 91%, more preferably at least about 92%, more preferably at least about 93%, more preferably at least about 94%, more preferably at least about 95%, more preferably at least about 96%, more preferably at least about 97%, more preferably at least about 98%, and most preferably at least about 99% random coil, as determined by a GOR algorithm.

1365-71; Hsu, C. T., et al. (2000) Cancer Res, 60:3701-5); Bachmann MF, et al. Eur J. Immunol. (1995). 25(12):3445-3451).

2. Exemplary Sequence Motifs

The present invention encompasses XTEN that can comprise multiple units of shorter sequences, or motifs, in which the amino acid sequences of the motifs are non-repetitive. In designing XTEN sequences, it was discovered that the non-repetitive criterion may be met despite the use of a "building block" approach using a library of sequence motifs that are multimerized to create the XTEN sequences.

1. Non-Repetitive Sequences

XTEN sequences of the subject compositions can be substantially non-repetitive. In general, repetitive amino acid sequences have a tendency to aggregate or form higher order structures, as exemplified by natural repetitive sequences such as collagens and leucine zippers, or form contacts resulting in crystalline or pseudocrystalline structures. In contrast, the low tendency of non-repetitive sequences to aggregate enables the design of long-sequence XTENs with a relatively low frequency of charged amino acids that would be likely to aggregate if the sequences were otherwise repetitive. Typically, the BPXTEN fusion proteins comprise XTEN sequences of greater than about 100 to about 3000 amino acid residues, preferably greater than 400 to about 3000 residues, wherein the sequences are substantially non-repetitive. In one embodiment, the XTEN sequences can have greater than about 100 to about 3000 amino acid residues, preferably greater than 400 to about 3000 amino acid residues, in which no three contiguous amino acids in the sequence are identical
Thus, while an amino acid types unless the amino acid is serine, in which XTEN sequence may consist of multiple units of as few as case no more than three contiguous amino acids are serine four different types of sequence motifs, because the motifs residues. In the foregoing embodiment, the XTEN sequence themselves generally consist of non-repetitive amino acid would be substantially non-repetitive. 15 sequences, the overall XTEN sequence is rendered substan The degree of repetitiveness of a polypeptide or a gene can tially non-repetitive. be measured by computer programs or algorithms or by other In one embodiment, XTEN can have a non-repetitive means known in the art. Repetitiveness in a polypeptide sequence of greater than about 100 to about 3000 amino acid sequence can, for example, be assessed by determining the residues, preferably greater than 400 to about 3000 residues, number of times shorter sequences of a given length occur wherein at least about 80%, or at least about 85%, or at least within the polypeptide. For example, a polypeptide of 200 about 90%, or at least about 95%, or at least about 97%, or amino acid residues has 192 overlapping 9-amino acid about 100% of the XTEN sequence consists of non-overlap sequences (or 9-mer “frames') and 1983-mer frames, but the ping sequence motifs, wherein each of the motifs has about 9 number of unique 9-mer or 3-mer sequences will depend on to 36 amino acid residues. In other embodiments, at least the amount of repetitiveness within the sequence. A score can 25 about 80%, or at least about 85%, or at least about 90%, or at be generated (hereinafter “subsequence score) that is reflec least about 95%, or at least about 97%, or about 100% of the tive of the degree of repetitiveness of the subsequences in the XTEN sequence consists of non-overlapping sequence overall polypeptide sequence. 
In the context of the present motifs wherein each of the motifs has 9 to 14 amino acid invention, “subsequence score” means the sum of occur residues. In still other embodiments, at least about 80%, or at rences of each unique 3-mer frame across a 200 consecutive 30 least about 85%, or at least about 90%, or at least about 95%, amino acid sequence of the polypeptide divided by the abso or at least about 97%, or about 100% of the XTEN sequence lute number of unique 3-mer subsequences within the 200 component consists of non-overlapping sequence motifs amino acid sequence. Examples of Such subsequence scores wherein each of the motifs has 12 amino acid residues. In derived from the first 200 amino acids of repetitive and non these embodiments, it is preferred that the sequence motifs be repetitive polypeptides are presented in Example 73. In some 35 composed mainly of small hydrophilic amino acids, such that embodiments, the present invention provides BPXTEN each the overall sequence has an unstructured, flexible character comprising XTEN in which the XTEN can have a subse istic. Examples of amino acids that can be included in XTEN, quence score less than 12, more preferably less than 10, more are, e.g., arginine, lysine, threonine, alanine, asparagine, preferably less than 9, more preferably less than 8, more glutamine, aspartate, glutamate, serine, and glycine. As a preferably less than 7, more preferably less than 6, and most 40 result of testing variables such as codon optimization, assem preferably less than 5. In the embodiments hereinabove bly polynucleotides encoding sequence motifs, expression of described in this paragraph, an XTEN with a subsequence protein, charge distribution and solubility of expressed pro score less than about 10 (i.e., 9, 8, 7, etc.) 
would be “substan tein, and secondary and tertiary structure, it was discovered tially non-repetitive.” that XTEN compositions with enhanced characteristics The non-repetitive characteristic of XTEN can impart to 45 mainly include glycine (G), alanine (A), serine (S), threonine fusion proteins with BP(s) a greater degree of solubility and (T), glutamate (E) and proline (P) residues wherein the less tendency to aggregate compared to polypeptides having sequences are designed to be substantially non-repetitive. In repetitive sequences. These properties can facilitate the for a preferred embodiment, XTEN sequences have predomi mulation of XTEN-comprising pharmaceutical preparations nately four to six types of amino acids selected from glycine containing extremely high drug concentrations, in some cases 50 (G), alanine (A), serine (S), threonine (T), glutamate (E) or exceeding 100 mg/ml. proline (P) that are arranged in a Substantially non-repetitive Furthermore, the XTEN polypeptide sequences of the sequence that is greater than about 100 to about 3000 amino embodiments are designed to have a low degree of internal acid residues, preferably greater than 400 to about 3000 resi repetitiveness in order to reduce or substantially eliminate dues in length. In some embodiments, XTEN can have immunogenicity when administered to a mammal. Polypep 55 sequences of greater than about 100 to about 3000 amino acid tide sequences composed of short, repeated motifs largely residues, preferably greater than 400 to about 3000 residues, limited to three amino acids, such as glycine, serine and wherein at least about 80% of the sequence consists of non glutamate, may result in relatively high antibody titers when overlapping sequence motifs wherein each of the motifs has 9 administered to a mammal despite the absence of predicted to 36 amino acid residues wherein each of the motifs consists T-cell epitopes in these sequences. 
This may be caused by the 60 of 4 to 6 types of amino acids selected from glycine (G), repetitive nature of polypeptides, as it has been shown that alanine (A), serine (S), threonine (T), glutamate (E) and pro immunogens with repeated epitopes, including protein aggre line (P), and wherein the content of any one amino acid type gates, cross-linked immunogens, and repetitive carbohy in the full-length XTEN does not exceed 30%. In other drates are highly immunogenic and can, for example, result in embodiments, at least about 90% of the XTEN sequence the cross-linking of B-cell receptors causing B-cell activa 65 consists of non-overlapping sequence motifs wherein each of tion. (Johansson, J., et al. (2007) Vaccine, 25:1676-82; Yan the motifs has 9 to 36 amino acid residues wherein the motifs kai, Z. et al. (2006) Biochem Biophy's Res Commun, 345: consist of 4 to 6 types of amino acids selected from glycine US 9,371,369 B2 29 30 (G), alanine (A), serine (S), threonine (T), glutamate (E) and sequence consists of multiple units of two or more non proline (P), and wherein the content of any one amino acid overlapping sequence motifs selected from the amino acid type in the full-length XTEN does not exceed 30%. In other sequences of Table 1. 
In some cases, the XTEN comprises embodiments, at least about 90% of the XTEN sequence non-overlapping sequence motifs in which about 80%, or at consists of non-overlapping sequence motifs wherein each of 5 least about 90%, or about 91%, or about 92%, or about 93%, the motifs has 12 amino acid residues consisting of 4 to 6 or about 94%, or about 95%, or about 96%, or about 97%, or types of amino acids selected from glycine (G), alanine (A), about 98%, or about 99% to about 100% of the sequence serine (S), threonine (T), glutamate (E) and proline (P), and consists of two or more non-overlapping sequences selected wherein the content of any one amino acid type in the full from a single motif family of Table 1, resulting in a “family' length XTEN does not exceed 30%. In yet other embodi 10 sequence in which the overall sequence remains Substantially ments, at least about 90%, or about 91%, or about 92%, or non-repetitive. Accordingly, in these embodiments, an XTEN about 93%, or about 94%, or about 95%, or about 96%, or sequence can comprise multiple units of non-overlapping about 97%, or about 98%, or about 99%, to about 100% of the sequence motifs of the AD motif family, or the AE motif XTEN sequence consists of non-overlapping sequence family, or the AF motif family, or the AG motif family, or the motifs wherein each of the motifs has 12 amino acid residues 15 AM motif family, or the AQ motif family, or the BC family, or consisting of glycine (G), alanine (A), serine (S), threonine the BD family of sequences of Table 1. In other cases, the (T), glutamate (E) and proline (P), and wherein in the content XTEN comprises motif sequences from two or more of the of any one amino acid type in the full-length XTEN does not motif families of Table 1. exceed 30%. In still other embodiments, XTENs comprise non-repeti tive sequences of greater than about 100 to about 3000 amino TABLE 1. 
acid residues, preferably greater than 400 to about 3000 XTEN Sequence Motifs of 12 Amino amino acid residues wherein at least about 80%, or at least Acids and Motif Families about 90%, or about 91%, or about 92%, or about 93%, or Motif Family * SEQ ID NO : MOTIF SEQUENCE about 94%, or about 95%, or about 96%, or about 97%, or 25 about 98%, or about 99% of the sequence consists of non AD 82 GESPGGSSGSES overlapping sequence motifs of 9 to 14 amino acid residues wherein the motifs consist of 4 to 6 types of amino acids AD 83 GSEGSSGPGESS selected from glycine (G), alanine (A), serine (S), threonine AD 84 GSSESGSSEGGP (T), glutamate (E) and proline (P), and wherein the sequence 30 of any two contiguous amino acid residues in any one motif is AD 85 GSGGEPSESGSS not repeated more than twice in the sequence motif. In other AE, AM 86 GSPAGSPTSTEE embodiments, at least about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about AE, AM, AQ 87 GSEPATSGSETP 96%, or about 97%, or about 98%, or about 99% of an XTEN 35 sequence consists of non-overlapping sequence motifs of 12 AE, AM, AQ 88 GTSESATPESGP amino acid residues wherein the motifs consist of 4 to 6 types AE, AM, AQ 89 GTSTEPSEGSAP of amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), and wherein AF, AM 90 GSTSESPSGTAP the sequence of any two contiguous amino acid residues in 40 any one sequence motif is not repeated more than twice in the AF, AM 91 GTSTPESGSASP sequence motif. 
In other embodiments, at least about 90%, or AF, AM 92 GTSPSGESSTAP about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about 97%, or about 98%, or AF, AM 93 GSTSSTAESPGP about 99% of an XTEN sequence consists of non-overlapping 45 AG AM 94 GTPGSGTASSSP sequence motifs of 12 amino acid residues wherein the motifs consist of glycine (G), alanine (A), serine (S), threonine (T), AG AM 95 GSSTPSGATGSP glutamate (E) and proline (P), and wherein the sequence of any two contiguous amino acid residues in any one sequence AG AM 96 GSSPSASTGTGP motif is not repeated more than twice in the sequence motif. 50 AG AM 97 GASPGTSSTGSP In yet other embodiments, XTENs consist of 12 amino acid sequence motifs wherein the amino acids are selected from AQ 98 GEPAGSPTSTSE glycine (G), alanine (A), serine (S), threonine (T), glutamate AQ 99 GTGEPSSTPASE (E) and proline (P), and wherein the sequence of any two contiguous amino acid residues in any one sequence motif is 55 AQ 2OO GSGPSTESAPTE not repeated more than twice in the sequence motif, and wherein the content of any one amino acid type in the full AQ 2O1 GSETPSGPSETA length XTEN does not exceed 30%. In the foregoing embodi AQ 2O2 GPSETSTSEPGA ments hereinabove described in this paragraph, the XTEN AQ 2O3 GSPSEPTEGTSA sequences would be substantially non-repetitive. 
60 In some cases, the invention provides compositions com BC 1715 GSGASEPTSTEP prising a non-repetitive XTEN sequence of greater than about 100 to about 3000 amino acid residues, preferably greater BC 1716 GSEPATSGTEPS than 400 to about 3000 residues, wherein at least about 80%, BC 1717 GTSEPSTSEPGA or at least about 90%, or about 91%, or about 92%, or about 65 93%, or about 94%, or about 95%, or about 96%, or about BC 1718 GTSTEPSEPGSA 97%, or about 98%, or about 99% to about 100% of the US 9,371,369 B2 31 32 TABLE 1 - continued result in a fusion protein with a disproportional increase in terminal half-life compared to fusion proteins with unstruc XTEN Sequence Motifs of 12 Amino tured polypeptide partners with shorter sequence lengths. Acids and Motif Families Non-limiting examples of XTEN contemplated for inclu Motif Family * SEQ ID NO: MOTIF SEQUENCE sion in the BPXTEN of the invention are presented in Table 2. Accordingly, the invention provides BPXTEN compositions BD 1719 GSTAGSETSTEA wherein the XTEN sequence length of the fusion protein(s) is greater than about 100 to about 3000 amino acid residues, and BD 1720 GSETATSGSETA 10 in some cases is greater than 400 to about 3000 amino acid BD 1721 GTSESATSESGA residues, wherein the XTEN confers enhanced pharmacoki netic properties on the BPXTEN in comparison to payloads BD 1722 GTSTEASEGSAS not linked to XTEN. 
In some cases, the XTEN sequences of *Denotes individual motif Sequences that, when used together in the BPXTEN compositions of the present invention can be various permutations, results in a family sequence" 15 about 100, or about 144, or about 288, or about 401, or about In other cases, BPXTEN composition can comprise a non 500, or about 600, or about 700, or about 800, or about 900, or repetitive XTEN sequence of greater than about 100 to about about 1000, or about 1500, or about 2000, or about 2500 or up 3000 amino acid residues, preferably greater than 400 to to about 3000 amino acid residues in length. In other cases, about 3000 residues, wherein at least about 80%, or at least the XTEN sequences can be about 100 to 150, about 150 to about 90%, or about 91%, or about 92%, or about 93%, or 250, about 250 to 400, 401 to about 500, about 500 to 900, about 94%, or about 95%, or about 96%, or about 97%, or about 900 to 1500, about 1500 to 2000, or about 2000 to about about 98%, or about 99% to about 100% of the sequence 3000 amino acid residues in length. In one embodiment, the consists of non-overlapping 36 amino acid sequence motifs BPXTEN can comprise an XTEN sequence wherein the selected from one or more of the polypeptide sequences of sequence exhibits at least about 80% sequence identity, or Tables 12-15. 25 alternatively 81%, 82%, 83%, 84%, 85%. 86%, 87%. 88%, In those embodiments wherein the XTEN component of 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, the BPXTEN fusion protein has less than 100% of its amino 99%, or 100% sequence identity to a XTEN selected from acids consisting of four to six amino acid selected from gly Table 2. In some cases, the XTEN sequence is designed for cine (G), alanine (A), serine (S), threonine (T), glutamate (E) optimized expression as the N-terminal component of the and proline (P), or less than 100% of the sequence consisting 30 of the sequence motifs of Tables 1 or the polypeptide BPXTEN. 
In one embodiment of the foregoing, the XTEN sequences to Tables 12-15, or less than 100% sequence iden sequence has at least 90% sequence identity to the sequence tity with an XTEN from Table 2, the otheramino acid residues of AE912 or AM923. In another embodiment of the forego can be selected from any other of the 14 natural L-amino ing, the XTEN has the N-terminal residues described in acids. The other amino acids may be interspersed throughout 35 Examples 14-17. the XTEN sequence, may be located within or between the In other cases, the BPXTEN fusion protein can comprise a sequence motifs, or may be concentrated in one or more short first and a second XTEN sequence, wherein the cumulative stretches of the XTEN sequence. In such cases where the total of the residues in the XTEN sequences is greater than XTEN component of the BPXTEN comprises amino acids about 400 to about 3000 amino acid residues. In embodiments other than glycine (G), alanine (A), serine (S), threonine (T), 40 of the foregoing, the BPXTEN fusion protein can comprise a glutamate (E) and proline (P), it is preferred that the amino first and a second XTEN sequence wherein the sequences acids not be hydrophobic residues and should not substan each exhibit at least about 80% sequence identity, or alterna tially confer secondary structure of the XTEN component. tively 81%, 82%, 83%, 84%, 85%. 86%, 87%, 88%, 89%, Thus, in a preferred embodiment of the foregoing, the XTEN 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or component of the BPXTEN fusion protein comprising other 45 100% sequence identity to at least a first or additionally a amino acids in addition to glycine (G), alanine (A), serine (S), second XTEN selected from Table 2. 
Examples where more threonine (T), glutamate (E) and proline (P) would have a than one XTEN is used in a BPXTEN composition include, sequence with less than 5% of the residues contributing to but are not limited to constructs with an XTEN linked to both alpha-helices and beta-sheets as measured by Chou-Fasman the N- and C-termini of at least one BP. algorithm and would have at least 90% random coil formation 50 As described more fully below, the invention provides as measured by GOR algorithm. methods in which the BPXTEN is designed by selecting the 3. Length of Sequence length of the XTEN to confer a target half-life on a fusion In a particular feature, the invention encompasses BPX protein administered to a subject. In general, longer XTEN TEN compositions comprising XTEN polypeptides with lengths incorporated into the BPXTEN compositions result in extended length sequences. The present invention makes use 55 of the discovery that increasing the length of non-repetitive, longer half-life compared to shorter XTEN. However, in unstructured polypeptides enhances the unstructured nature another embodiment, BPXTEN fusion proteins can be of the XTENs and the biological and pharmacokinetic prop designed to comprise XTEN with a longer sequence length erties of fusion proteins comprising the XTEN. As described that is selected to confer slower rates of systemic absorption more fully in the Examples, proportional increases in the 60 after Subcutaneous or intramuscular administration to a Sub length of the XTEN, even if created by a fixed repeat order of ject. In Such cases, the C is reduced in comparison to a single family sequence motifs (e.g., the four AE motifs of comparable dose of a BP not linked to XTEN, thereby con Table 1), can result in a sequence with a higher percentage of tributing to the ability to keep the BPXTEN within the thera random coil formation, as determined by GOR algorithm, peutic window for the composition. 
Thus, the XTEN confers compared to shorter XTEN lengths. In addition, it was dis 65 the property of a depot to the administered BPXTEN, in covered that increasing the length of the unstructured addition to the other physical/chemical properties described polypeptide fusion partner can, as described in the Examples, herein.
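
The "subsequence score" defined in the discussion of non-repetitive sequences above (the sum of occurrences of each unique 3-mer frame across a 200 amino acid window, divided by the number of unique 3-mers in that window) can be illustrated with a short sketch. This is only one possible reading of that definition, not the program used for Example 73; the function names `frame_count`, `subsequence_score` and `is_substantially_non_repetitive` are hypothetical, and the choice to score the first 200 residues and to use a cutoff of 10 is taken from the text above.

```python
from collections import Counter


def frame_count(length: int, k: int) -> int:
    """Number of overlapping k-mer frames in a sequence of the given length."""
    return max(length - k + 1, 0)


def subsequence_score(seq: str, k: int = 3, window: int = 200) -> float:
    """Total k-mer occurrences divided by the number of unique k-mers.

    Assumption: the score is taken over the first `window` residues.
    """
    segment = seq[:window]
    if len(segment) < k:
        raise ValueError("sequence is shorter than the frame length")
    # Count every overlapping k-mer frame in the window.
    counts = Counter(segment[i:i + k] for i in range(len(segment) - k + 1))
    return sum(counts.values()) / len(counts)


def is_substantially_non_repetitive(seq: str, threshold: float = 10.0) -> bool:
    """Apply the 'score less than about 10' criterion from the text."""
    return subsequence_score(seq) < threshold


if __name__ == "__main__":
    repetitive = "GGS" * 67  # 201 residues, only three distinct 3-mers
    print(frame_count(200, 9), frame_count(200, 3))
    print(subsequence_score(repetitive))
    print(is_substantially_non_repetitive(repetitive))
```

Under this reading, a strict three-residue repeat scores 198/3 = 66, far above the cutoff, while a window in which every 3-mer occurs once scores 1.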

TABLE 2 - continued
XTEN Polypeptides

XTEN Name    SEQ ID NO:    Amino Acid Sequence

TAGSETSTEAGSTAGSETSTEAGSETATSGSETAGTSESATSESGAGTSESATSESGA GSETATSGSETAGTSESATSESGAGTSESATSESGAGSETATSGSETAGSETATSGSE TAGTSTEASEGSASGSTAGSETSTEAGSETATSGSETAGTSESATSESGAGSTAGSET STEAGSTAGSETSTEAGSTAGSETSTEAGTSTEASEGSASGSTAGSETSTEAGSTAGS ETSTEAGTSTEASEGSASGSTAGSETSTEAGSETATSGSETAGTSTEASEGSASGTSE SATSESGAGSETATSGSETAGTSESATSESGAGTSESATSESGAGSETATSGSETAGT SESATSESGAGSETATSGSETAGTSTEASEGSASGTSTEASEGSASGSTAGSETSTEA GSTAGSETSTEAGSETATSGSETAGTSESATSESGAGTSESATSESGAGSETATSGSE TAGSETATSGSETAGSETATSGSETAGTSTEASEGSASGTSESATSESGAGSETATSG SETAGSETATSGSETAGTSESATSESGAGTSESATSESGAGSETATSGSETA
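
The motif-based criteria described earlier (a stated fraction of the XTEN consisting of non-overlapping 12 amino acid motifs from one family of Table 1, 4 to 6 amino acid types, and no single amino acid type above 30%) can be checked mechanically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the motifs are read in register from the first residue, and the demo string is the first four complete BD-family motifs of the sequence fragment above (its truncated ten-residue start is dropped).

```python
from collections import Counter

# BD-family motifs as listed in Table 1 (SEQ ID NOs 1719-1722).
BD_MOTIFS = frozenset({
    "GSTAGSETSTEA",
    "GSETATSGSETA",
    "GTSESATSESGA",
    "GTSTEASEGSAS",
})

# First four complete 12-mers of the Table 2 fragment shown above.
DEMO = "GSTAGSETSTEA" "GSETATSGSETA" "GTSESATSESGA" "GTSESATSESGA"


def motif_coverage(seq: str, motifs: frozenset, size: int = 12) -> float:
    """Fraction of residues lying in in-register blocks that match a motif."""
    if not seq:
        raise ValueError("empty sequence")
    covered = 0
    # Assumption: motifs are non-overlapping and aligned to position 0.
    for start in range(0, len(seq) - size + 1, size):
        if seq[start:start + size] in motifs:
            covered += size
    return covered / len(seq)


def max_residue_fraction(seq: str) -> float:
    """Largest share of the sequence taken by any single amino acid type."""
    counts = Counter(seq)
    return max(counts.values()) / len(seq)


def residue_types(seq: str) -> int:
    """Number of distinct amino acid types in the sequence."""
    return len(set(seq))


if __name__ == "__main__":
    print(motif_coverage(DEMO, BD_MOTIFS))
    print(residue_types(DEMO))
    print(max_residue_fraction(DEMO) <= 0.30)
```

A sequence whose motifs are shifted out of register would score zero coverage here, so a fuller check would need to search for the best alignment; that is left out to keep the sketch short.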

4. Net Charge

In other cases, the XTEN polypeptides can have an unstructured characteristic imparted by incorporation of amino acid residues with a net charge and/or reducing the proportion of hydrophobic amino acids in the XTEN sequence. The overall net charge and net charge density may be controlled by modifying the content of charged amino acids in the XTEN sequences. In some cases, the net charge density of the XTEN of the compositions may be above +0.1 or below -0.1 charges/residue. In other cases, the net charge of an XTEN can be about 0%, about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, or about 20% or more.

Since most tissues and surfaces in a human or animal have a net negative charge, the XTEN sequences can be designed to have a net negative charge to minimize non-specific interactions between the XTEN containing compositions and various surfaces such as blood vessels, healthy tissues, or various receptors. Not to be bound by a particular theory, the XTEN can adopt open conformations due to electrostatic repulsion between individual amino acids of the XTEN polypeptide that individually carry a high net negative charge and that are distributed across the sequence of the XTEN polypeptide. Such a distribution of net negative charge in the extended sequence lengths of XTEN can lead to an unstructured conformation that, in turn, can result in an effective increase in hydrodynamic radius. Accordingly, in one embodiment the invention provides XTEN in which the XTEN sequences contain about 8, 10, 15, 20, 25, or even about 30% glutamic acid. The XTEN of the compositions of the present invention generally have no or a low content of positively charged amino acids. In some cases the XTEN may have less than about 10% amino acid residues with a positive charge, or less than about 7%, or less than about 5%, or less than about 2% amino acid residues with a positive charge. However, the invention contemplates constructs where a limited number of amino acids with a positive charge, such as lysine, may be incorporated into XTEN to permit conjugation between the epsilon amine of the lysine and a reactive group on a peptide, a linker bridge, or a reactive group on a drug or small molecule to be conjugated to the XTEN backbone. In the foregoing, a fusion protein can be constructed that comprises XTEN, a biologically active protein, plus a chemotherapeutic agent useful in the treatment of metabolic diseases or disorders, wherein the maximum number of molecules of the agent incorporated into the XTEN component is determined by the numbers of lysines or other amino acids with reactive side chains (e.g., cysteine) incorporated into the XTEN.

In some cases, an XTEN sequence may comprise charged residues separated by other residues such as serine or glycine, which may lead to better expression or purification behavior. Based on the net charge, XTENs of the subject compositions may have an isoelectric point (pI) of 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, or even 6.5. In preferred embodiments, the XTEN will have an isoelectric point between 1.5 and 4.5. In these embodiments, the XTEN incorporated into the BPXTEN fusion protein compositions of the present invention would carry a net negative charge under physiologic conditions that may contribute to the unstructured conformation and reduced binding of the XTEN component to mammalian proteins and tissues.

As hydrophobic amino acids can impart structure to a polypeptide, the invention provides that the content of hydrophobic amino acids in the XTEN will typically be less than 5%, or less than 2%, or less than 1% hydrophobic amino acid content. In one embodiment, the amino acid content of methionine and tryptophan in the XTEN component of a BPXTEN fusion protein is typically less than 5%, or less than 2%, and most preferably less than 1%. In another embodiment, the XTEN will have a sequence that has less than 10% amino acid residues with a positive charge, or less than about 7%, or less than about 5%, or less than about 2% amino acid residues with a positive charge, the sum of methionine and tryptophan residues will be less than 2%, and the sum of asparagine and glutamine residues will be less than 10% of the total XTEN sequence.

5. Low Immunogenicity

In another aspect, the invention provides compositions in which the XTEN sequences have a low degree of immunogenicity or are substantially non-immunogenic. Several factors can contribute to the low immunogenicity of XTEN, e.g., the non-repetitive sequence, the unstructured conformation, the high degree of solubility, the low degree or lack of self-aggregation, the low degree or lack of proteolytic sites within the sequence, and the low degree or lack of epitopes in the XTEN sequence.

Conformational epitopes are formed by regions of the protein surface that are composed of multiple discontinuous amino acid sequences of the protein antigen. The precise folding of the protein brings these sequences into well-defined, stable spatial configurations, or epitopes, that can be recognized as "foreign" by the host humoral immune system, resulting in the production of antibodies to the protein or triggering a cell-mediated immune response. In the latter case, the immune response to a protein in an individual is heavily influenced by T-cell epitope recognition that is a function of the peptide binding specificity of that individual's HLA-DR allotype. Engagement of a MHC Class II peptide complex by a cognate T-cell receptor on the surface of the T-cell, together with the cross-binding of certain other co-receptors such as the CD4 molecule, can induce an activated state within the T-cell. Activation leads to the release of cytokines further activating other lymphocytes such as B cells to produce antibodies or activating T killer cells as a full cellular immune response.

The ability of a peptide to bind a given MHC Class II molecule for presentation on the surface of an APC (antigen presenting cell) is dependent on a number of factors, most notably its primary sequence. In one embodiment, a lower degree of immunogenicity may be achieved by designing XTEN sequences that resist antigen processing in antigen presenting cells, and/or choosing sequences that do not bind MHC receptors well. The invention provides BPXTEN fusion proteins with substantially non-repetitive XTEN polypeptides designed to reduce binding with MHC II receptors, as well as avoiding formation of epitopes for T-cell receptor or antibody binding, resulting in a low degree of immunogenicity. Avoidance of immunogenicity is, in part, a direct result of the conformational flexibility of XTEN sequences; i.e., the lack of secondary structure due to the selection and order of amino acid residues. For example, of particular interest are sequences having a low tendency to adopt compactly folded conformations in aqueous solution or under physiologic conditions that could result in conformational epitopes. The administration of fusion proteins comprising XTEN, using conventional therapeutic practices and dosing, would generally not result in the formation of neutralizing antibodies to the XTEN sequence, and may also reduce the immunogenicity of the BP fusion partner in the BPXTEN compositions.

In one embodiment, the XTEN sequences utilized in the subject fusion proteins can be substantially free of epitopes recognized by human T cells. The elimination of such epitopes for the purpose of generating less immunogenic proteins has been disclosed previously; see for example WO 98/52976, WO 02/079232, and WO 00/3317 which are incorporated by reference herein. Assays for human T cell epitopes have been described (Stickler, M., et al. (2003) J Immunol Methods, 281: 95-108). Of particular interest are peptide sequences that can be oligomerized without generating T cell epitopes or non-human sequences. This can be achieved by testing direct repeats of these sequences for the presence of T-cell epitopes and for the occurrence of 6 to 15-mer and, in particular, 9-mer sequences that are not human, and then altering the design of the XTEN sequence to eliminate or disrupt the epitope sequence. In some cases, the XTEN sequences are substantially non-immunogenic by the restriction of the numbers of epitopes of the XTEN predicted to bind MHC receptors. With a reduction in the numbers of epitopes capable of binding to MHC receptors, there is a concomitant reduction in the potential for T cell activation as well as T cell helper function, reduced B cell activation or upregulation and reduced antibody production. The low degree of predicted T-cell epitopes can be determined by epitope prediction algorithms such as, e.g., TEPITOPE (Sturniolo, T., et al. (1999) Nat Biotechnol, 17: 555-61), as shown in Example 74. The TEPITOPE score of a given peptide frame within a protein is the log of the Kd (dissociation constant, affinity, off-rate) of the binding of that peptide frame to multiple of the most common human MHC alleles, as disclosed in Sturniolo, T. et al. (1999) Nature Biotechnology 17:555. The score ranges over at least 20 logs, from about 10 to about -10 (corresponding to binding constraints of 10e10 Kd to 10e-10 Kd), and can be reduced by avoiding hydrophobic amino acids that can serve as anchor residues during peptide display on MHC, such as M, I, L, V, F. In some embodiments, an XTEN component incorporated into a BPXTEN does not have a predicted T-cell epitope at a TEPITOPE score of about -5 or greater, or -6 or greater, or -7 or greater, or -8 or greater, or at a TEPITOPE score of -9 or greater. As used herein, a score of "-9 or greater" would encompass TEPITOPE scores of 10 to -9, inclusive, but would not encompass a score of -10, as -10 is less than -9.

In another embodiment, the inventive XTEN sequences, including those incorporated into the subject BPXTEN fusion proteins, can be rendered substantially non-immunogenic by the restriction of known proteolytic sites from the sequence of the XTEN, reducing the processing of XTEN into small peptides that can bind to MHC II receptors. In another embodiment, the XTEN sequence can be rendered substantially non-immunogenic by the use of a sequence that is substantially devoid of secondary structure, conferring resistance to many proteases due to the high entropy of the structure. Accordingly, the reduced TEPITOPE score and elimination of known proteolytic sites from the XTEN may render the XTEN compositions, including the XTEN of the BPXTEN fusion protein compositions, substantially unable to be bound by mammalian receptors, including those of the immune system. In one embodiment, an XTEN of a BPXTEN fusion protein can have >100 nM Kd binding to a mammalian receptor, or greater than 500 nM Kd, or greater than 1 μM Kd towards a mammalian cell surface or circulating polypeptide receptor.

Additionally, the non-repetitive sequence and corresponding lack of epitopes of XTEN can limit the ability of B cells to bind to or be activated by XTEN. A repetitive sequence is recognized and can form multivalent contacts with even a few B cells and, as a consequence of the cross-linking of multiple T-cell independent receptors, can stimulate B cell proliferation and antibody production. In contrast, while an XTEN can make contacts with many different B cells over its extended sequence, each individual B cell may only make one or a small number of contacts with an individual XTEN due to the lack of repetitiveness of the sequence. As a result, XTENs typically may have a much lower tendency to stimulate proliferation of B cells and thus an immune response. In one embodiment, the BPXTEN may have reduced immunogenicity as compared to the corresponding BP that is not fused. In one embodiment, the administration of up to three parenteral doses of a BPXTEN to a mammal may result in detectable anti-BPXTEN IgG at a serum dilution of 1:100 but not at a dilution of 1:1000. In another embodiment, the administration of up to three parenteral doses of a BPXTEN to a mammal may result in detectable anti-BP IgG at a serum dilution of 1:100 but not at a dilution of 1:1000. In another embodiment, the administration of up to three parenteral doses of a BPXTEN to a mammal may result in detectable anti-XTEN IgG at a serum dilution of 1:100 but not at a dilution of 1:1000. In the foregoing embodiments, the mammal can be a mouse, a rat, a rabbit, or a cynomolgus monkey.

An additional feature of XTENs with non-repetitive sequences relative to sequences with a high degree of repetitiveness can be that non-repetitive XTENs form weaker contacts with antibodies. Antibodies are multivalent molecules. For instance, IgGs have two identical binding sites and IgMs contain 10 identical binding sites. Thus antibodies against repetitive sequences can form multivalent contacts with such repetitive sequences with high avidity, which can affect the potency and/or elimination of such repetitive sequences. In contrast, antibodies against non-repetitive XTENs may yield monovalent interactions, resulting in less likelihood of immune clearance such that the BPXTEN compositions can remain in circulation for an increased period of time.

6. Increased Hydrodynamic Radius

In another aspect, the present invention provides XTEN in which the XTEN polypeptides can have a high hydrodynamic radius that confers a corresponding increased Apparent Molecular Weight to the BPXTEN fusion protein incorporating the XTEN. As detailed in Example 19, the linking of XTEN to BP sequences can result in BPXTEN compositions that can have increased hydrodynamic radii, increased Apparent Molecular Weight, and increased Apparent Molecular Weight Factor compared to a BP not linked to an XTEN. For example, in therapeutic applications in which prolonged half-life is desired, compositions in which an XTEN with a high hydrodynamic radius is incorporated into a fusion protein comprising one or more BP can effectively enlarge the hydrodynamic radius of the composition beyond the glomerular pore size of approximately 3-5 nm (corresponding to an apparent molecular weight of about 70 kDa) (Caliceti. 2003. Pharmacokinetic and biodistribution properties of poly(ethylene glycol)-protein conjugates. Adv Drug Deliv Rev 55:1261-1277), resulting in reduced renal clearance of circulating proteins. The hydrodynamic radius of a protein is determined by its molecular weight as well as by its structure, including shape and compactness. Not to be bound by a particular theory, the XTEN can adopt open conformations due to electrostatic repulsion between individual charges of the peptide or the inherent flexibility imparted by the particular amino acids in the sequence that lack potential to confer secondary structure.

tions, an Apparent Molecular Weight Factor of at least three, alternatively of at least four, alternatively of at least five, alternatively of at least six, alternatively of at least eight, alternatively of at least 10, alternatively of at least 15, or an Apparent Molecular Weight Factor of at least 20 or greater. In another embodiment, the BPXTEN fusion protein has, under physiologic conditions, an Apparent Molecular Weight Factor that is about 4 to about 20, or is about 6 to about 15, or is about 8 to about 12, or is about 9 to about 10 relative to the actual molecular weight of the fusion protein.

III). Biologically Active Proteins of the BPXTEN Fusion Protein Compositions

The present invention relates in part to fusion protein compositions comprising biologically active proteins and XTEN and the uses thereof for the treatment of diseases, disorders or conditions of a subject.

In one aspect, the invention provides at least a first biologically active protein (hereinafter "BP") covalently linked to a fusion protein comprising one or more extended recombinant polypeptides ("XTEN"), resulting in an XTEN fusion protein composition (hereinafter "BPXTEN"). As described more fully below, the fusion proteins can optionally include spacer sequences that can further comprise cleavage sequences to release the BP from the fusion protein when acted on by a protease.

The term "BPXTEN", as used herein, is meant to encompass fusion polypeptides that comprise one or two payload regions each comprising a biologically active protein that mediates one or more biological or therapeutic activities and
The open, extended and unstructured 30 at least one other region comprising at least one XTEN conformation of the XTEN polypeptide can have a greater polypeptide. proportional hydrodynamic radius compared to polypeptides The BP of the subject compositions, particularly those of a comparable sequence length and/or molecular weight disclosed in Tables 3-8, together with their corresponding that have secondary and/or tertiary structure. Such as typical nucleic acid and amino acid sequences, are well known in the globular proteins. Methods for determining the hydrody 35 art and descriptions and sequences are available in public namic radius are well known in the art, Such as by the use of databases such as Chemical Abstracts Services Databases size exclusion chromatography (SEC), as described in U.S. (e.g., the CAS Registry), GenBank, The Universal Protein Pat. Nos. 6,406,632 and 7,294,513. As the results of Example Resource (UniProt) and subscription provided databases such 19 demonstrate, the addition of increasing lengths of XTEN as GenSeq (e.g., Derwent). Polynucleotide sequences may be results in proportional increases in the parameters of hydro 40 a wild type polynucleotide sequence encoding a given BP dynamic radius, Apparent Molecular Weight, and Apparent (e.g., either full length or mature), or in Some instances the Molecular Weight Factor, permitting the tailoring of BPX sequence may be a variant of the wild type polynucleotide TEN to desired characteristic cut-off Apparent Molecular sequence (e.g., a polynucleotide which encodes the wild type Weights or hydrodynamic radii. 
Accordingly, in certain biologically active protein, wherein the DNA sequence of the embodiments, the BPXTEN fusion protein can be configured 45 polynucleotide has been optimized, for example, for expres with an XTEN such that the fusion protein can have a hydro sion in a particular species; or a polynucleotide encoding a dynamic radius of at least about 5 nm, or at least about 8 nm, variant of the wild type protein, such as a site directed mutant or at least about 10 nm, or 12 nm, or at least about 15 nm. In or an allelic variant. It is well within the ability of the skilled the foregoing embodiments, the large hydrodynamic radius artisan to use a wild-type or consensus cDNA sequence or a conferred by the XTEN in an BPXTEN fusion protein can 50 codon-optimized variant of a BP to create BPXTEN con lead to reduced renal clearance of the resulting fusion protein, structs contemplated by the invention using methods known leading to a corresponding increase in terminal half-life, an in the art and/or in conjunction with the guidance and meth increase in mean residence time, and/or a decrease in renal ods provided herein, and described more fully in the clearance rate. Examples. In another embodiment, an XTEN of a chosen length and 55 The BP for inclusion in the BPXTEN of the invention can sequence can be selectively incorporated into a BPXTEN to include any protein of biologic, therapeutic, prophylactic, or create a fusion protein that will have, under physiologic con diagnostic interest or function, or that is useful for mediating ditions, an Apparent Molecular Weight of at least about 150 a biological activity or preventing or ameliorating a disease, kDa, or at least about 300 kDa, or at least about 400 kDa, or disorder or conditions when administered to a subject. 
Of at least about 500 kDA, or at least about 600 kDa, or at least 60 particular interest are BP for which an increase in a pharma about 700 kDA, or at least about 800 kDa, or at least about 900 cokinetic parameter, increased solubility, increased stability, kDa, or at least about 1000kDa, or at least about 1200 kDa, or or some other enhanced pharmaceutical property is sought, or at least about 1500 kDa, or at least about 1800 kDa, or at least those BP for which increasing the terminal half-life would about 2000 kDa, or at least about 2300 kDa or more. In improve efficacy, safety, or result in reduce dosing frequency another embodiment, an XTEN of a chosen length and 65 and/or improve patient compliance. Thus, the BPXTEN sequence can be selectively linked to a BP to result in a fusion protein compositions are prepared with various objec BPXTEN fusion protein that has, under physiologic condi tives in mind, including improving the therapeutic efficacy of US 9,371,369 B2 45 46 the bioactive compound by, for example, increasing the in exogenous insulin during their lifetime. Even in well-man vivo exposure or the length that the BPXTEN remains within aged subjects, episodic complications can occur, some of the therapeutic window when administered to a subject, com which are life-threatening. pared to a BP not linked to XTEN. InType II diabetics, rising blood glucose levels after meals ABP of the invention can be a native, full-length protein or 5 do not properly stimulate insulin production by the pancreas. can be a fragment or a sequence variant of a biologically Additionally, peripheral tissues are generally resistant to the active protein that retains at least a portion of the biological effects of insulin, and Such Subjects often have higher than activity of the native protein. normal plasma insulin levels (hyperinsulinemia) as the body In one embodiment, the BP incorporated into the subject attempts to overcome its insulin resistance. 
In advanced dis compositions can be a recombinant polypeptide with a 10 ease states insulin secretion is also impaired. sequence corresponding to a protein found in nature. In Insulin resistance and hyperinsulinemia have also been another embodiment, the BP can be sequence variants, frag linked with two other metabolic disorders that pose consid ments, homologs, and mimetics of a natural sequence that erable health risks: impaired glucose tolerance and metabolic retain at least a portion of the biological activity of the native obesity. Impaired glucose tolerance is characterized by nor BP. In non-limiting examples, a BP can be a sequence that 15 mal glucose levels before eating, with a tendency toward exhibits at least about 80% sequence identity, or alternatively elevated levels (hyperglycemia) following a meal. These indi 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, viduals are considered to be at higher risk for diabetes and 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% coronary artery disease. Obesity is also a risk factor for the sequence identity to a protein sequence selected from Tables group of conditions called insulin resistance syndrome, or 3-8. In one embodiment, a BPXTEN fusion protein can com “Syndrome X. as is hypertension, coronary artery disease prise a single BP molecule linked to an XTEN (as described (arteriosclerosis), and lactic acidosis, as well as related dis more fully below). In another embodiment, the BPXTEN can ease states. The pathogenesis of obesity is believed to be comprise a first BP and a second molecule of the same BP. multifactorial but an underlying problem is that in the obese, resulting in a fusion protein comprising the two BP linked to nutrient availability and energy expenditure are not in balance one or more XTEN (for example, two molecules of glucagon, 25 until there is excess adipose tissue. Other related diseases or or two molecules of hCGH). 
disorders include, but are not limited to, gestational diabetes, In general, BP will exhibit a binding specificity to a given juvenile diabetes, obesity, excessive appetite, insufficient target or another desired biological characteristic when used Satiety, metabolic disorder, glucagonomas, retinal neurode in vivo or when utilized in an in vitro assay. For example, the generative processes, and the "honeymoon period of Type I BP can be an agonist, a receptor, a ligand, an antagonist, an 30 diabetes. enzyme, or a hormone. Of particular interest are BP used or Dyslipidemia is a frequent occurrence among diabetics; known to be useful for a disease or disorder wherein the native typically characterized by elevated plasma triglycerides, low BP have a relatively short terminal half-life and for which an HDL (high density lipoprotein) cholesterol, normal to enhancement of a pharmacokinetic parameter (which option elevated levels of LDL (low density lipoprotein) cholesterol ally could be released from the fusion protein by cleavage of 35 and increased levels of small dense, LDL particles in the a spacer sequence) would permit less frequent dosing or an blood. Dyslipidemia is a main contributor to an increased enhanced pharmacologic effect. Also of interest are BP that incidence of coronary events and deaths among diabetic Sub have a narrow therapeutic window between the minimum jects. effective dose or blood concentration (C) and the maxi Most metabolic processes in glucose homeostatis and insu mum tolerated dose or blood concentration (C). In such 40 lin response are regulated by multiple peptides and hormones, cases, the linking of the BP to a fusion protein comprising a and many Such peptides and hormones, as well as analogues select XTEN sequence(s) can result in an improvement in thereof, have found utility in the treatment of metabolic dis these properties, making them more useful as therapeutic or eases and disorders. 
Many of these peptides tend to be highly preventive agents compared to BP not linked to XTEN. homologous to each other, even when they possess opposite The BP encompassed by the inventive compositions can 45 biological functions. Glucose-increasing peptides are exem have utility in the treatment in various therapeutic or disease plified by the peptide hormone glucagon, while glucose-low categories, including but not limited to glucose and insulin ering peptides include exendin-4, glucagon-like peptide 1. disorders, metabolic disorders, cardiovascular diseases, and amylin. However, the use of therapeutic peptides and/or coagulation/bleeding disorders, growth disorders or condi hormones, even when augmented by the use of Small mol tions, tumorigenic conditions, inflammatory conditions, 50 ecule drugs, has met with limited Success in the management autoimmune conditions, etc. of Such diseases and disorders. In particular, dose optimiza (a) Glucose-Regulating Peptides tion is important for drugs and biologics used in the treatment Endocrine and obesity-related diseases or disorders have of metabolic diseases, especially those with a narrow thera reached epidemic proportions in most developed nations, and peutic window. Hormones in general, and peptides involved represent a Substantial and increasing health care burden in 55 in glucose homeostasis often have a narrow therapeutic win most developed nations, which include a large variety of dow. The narrow therapeutic window, coupled with the fact conditions affecting the organs, tissues, and circulatory sys that such hormones and peptides typically have a short half tem of the body. Of particular concern are endocrine and life, which necessitates frequent dosing in order to achieve obesity-related diseases and disorders, which. Chief amongst clinical benefit, results in difficulties in the management of these is diabetes; one of the leading causes of death in the 60 such patients. 
While chemical modifications to a therapeutic United States. Diabetes is divided into two major sub-classes protein, Such as pegylation, can modify its in vivo clearance Type I, also known as juvenile diabetes, or Insulin-Dependent rate and Subsequent serum half-life, it requires additional Diabetes Mellitus (IDDM), and Type II, also known as adult manufacturing steps and results in a heterogeneous final onset diabetes, or Non-Insulin-Dependent Diabetes Mellitus product. In addition, unacceptable side effects from chronic (NIDDM). Type I Diabetes is a form of autoimmune disease 65 administration have been reported. Alternatively, genetic that completely or partially destroys the insulin producing modification by fusion of an Fc domain to the therapeutic cells of the pancreas in Such subjects, and requires use of protein or peptide increases the size of the therapeutic protein, US 9,371,369 B2 47 48 reducing the rate of clearance through the kidney, and pro cally active polypeptides that increase glucose-dependent motes recycling from lysosomes by the FcRn receptor. Unfor secretion of insulin by pancreatic beta-cells or potentiate the tunately, the Fc domain does not fold efficiently during action of insulin. Glucose-regulating peptides can also recombinant expression and tends to form insoluble precipi include all biologically active polypeptides that stimulate tates known as inclusion bodies. These inclusion bodies must pro-insulin gene transcription in the pancreatic beta-cells. be solubilized and functional protein must be renatured; a Furthermore, glucose-regulating peptides can also include all time-consuming, inefficient, and expensive process. biologically active polypeptides that slow down gastric emp Thus, one aspect of the present invention is the incorpora tying time and reduce food intake. 
Glucose-regulating pep tion of peptides involved in glucose homoestasis, insulin tides can also include all biologically active polypeptides that resistance and obesity (collectively, “glucose regulating pep 10 inhibit glucagon release from the alpha cells of the Islets of tides’) in BPXTEN fusion proteins to create compositions Langerhans. Table 3 provides a non-limiting list of sequences with utility in the treatment of glucose, insulin, and obesity of glucose regulating peptides that are encompassed by the disorders, disease and related conditions. Glucose regulating BPXTEN fusion proteins of the invention. Glucose regulating peptides can include any protein of biologic, therapeutic, or peptides of the inventive BPXTEN compositions can be a prophylactic interest or function that is useful for preventing, 15 peptide that exhibits at least about 80% sequence identity, or treating, mediating, or ameliorating a disease, disorder or alternatively 81%, 82%, 83%, 84%, 85%. 86%, 87%. 88%, condition of glucose homeostasis or insulin resistance or 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, obesity. Suitable glucose-regulating peptides that can be 99%, or 100% sequence identity to a protein sequence linked to the XTEN to create BPXTEN include all biologi selected from Tables 3. TABLE 3 Glucose regulating peptides and corresponding amino acid sequences

Name of Protein (Synonym) | SEQ ID NO: | Sequence

Adrenomedullin (ADM) | 1 | YROSMNNFOGLRSFGCRFGTCTVOKLAHOIYOFTDKDKDNVAPRSKISPOGY

Amylin, rat | 2 | KCNTATCATORLANFLVRSSNNLGPWLPPTNVGSNTY

Amylin, human | 3 | KCNTATCATORLANFLVHSSNNFGAILSSTNVGSNTY

Calcitonin (hCT) | 4 | CGNLSTCMLGTYTODFNKFHTFPOTAIGVGAP

Calcitonin, salmon | 5 | CSNLSTCVLGKLSOELHKLOTYPRTNTGSGTP

Calcitonin gene related peptide (h-CGRP α) | 6 | ACDTATCWTHRLAGLLSRSGGWWKNMWPTNWGSKAF

Calcitonin gene related peptide (h-CGRP β) | 7 | ACNTATCWTHRLAGLLSRSGGMWKSNFWPTNWGSKAF

Cholecystokinin (CCK) | 8 | MNSGWCLCVLMAVLAAGALTQPWPPADPAGSGLORAEEAPRROLRVSQRTDGESRAHLGALLARYIQQARKAPSGRMSIVKNLONLDPSHRISDRDYMGWMDFGRRSAEEYEYPS

CCK-33 | 9 | KAPSGRMSIWKNLONLDPSHRISDRDYMGWMDF

CCK-8 | 10 | DYMGWMDF

Exendin-3 | 11 | HSDGTFTSDLSKOMEEEAVRLFIEWLKNGGPSSGAPPPS

Exendin-4 | 12 | HGEGTFTSDLSKOMEEEAVRLFIEWLKNGGPSSGAPPPS

FGF-19 | 13 | MRSGCVWWHWWILAGLWLAWAGRPLAFSDAGPHWHYGWGDPIRLRHLYTSGPHGLSSCFLRIRADGVVDCARGOSAHSLLEIKAVALRTVAIKGVHSVRYLCMGADGKMOGLLOYSEEDCAFEEEIRPDGYNVYRSEKHRLPWSLSSAKOROLYKNRGFLPLSHFLPMLPMVPEEPEDLRGHLESDMFSSPLETDSMDPFGLWTGLEAWRSPSFEK

FGF-21 | 14 | MDSDETGFEHSGLWWSWLAGLLLGACQAHPIPDSSPLLOFGGOVRORYLYTDDAQOTEAHLEIREDGTWGGAADQSPESLLOLKALKPGVIOILGVKTSRFLCORPDGALYGSLHFDPEACSFRELLLEDGYNVYOSEAHGLPLHLPGNKSPHRDPAPRGPARFLPLPGLPPALPEPPGILAPOPPDWGSSDPLSMVGPSOGRSPSYAS

Gastrin | 15 | OLGPQGPPHLVADPSKKQGPWLEEEEEAYGWMDF

Gastrin-17 | 16 | DPSKKQGPWLEEEEEAYGWMDF

Gastric inhibitory polypeptide (GIP) | 17 | YAEGTFISDYSIAMDKIHOODFWNWLLAOKGKKNDWKHNITO

Ghrelin | 18 | GSSFLSPEHORVOORKESKKPPAKLQPR

Glucagon | 19 | HSQGTFTSDYSKYLDSRRAQDFWQWLMNT

Glucagon-like peptide-1 (hGLP-1) (GLP-1; 1-37) | 20 | HDEFERHAEGTFTSDVSSTLEGQAALEFIAWLVKGRG

GLP-1 (7-36), human | 21 | HAEGTFTSDWSSYLEGOAALEFIAWLWKGR

GLP-1 (7-37), human | 22 | HAEGTFTSDWSSTLEGOAALEFIAWLWKGRG

GLP-1, frog | 23 | HAEGTYNOWTEYLEEKAAKEFEWKGKPKKRYS

Glucagon-like peptide 2 (GLP-2), human | 24 | HADGSFSDEMINTILDNLAARDFINWLIETKITD

GLP-2, frog | 25 | HAEGTFTNDMTNYLEEKAAKEFWGWLIKGRP-OH

IGF-1 | 26 | GPETLCGAELVDALOFVCGDRGFYFNKPTGYGSSSRRAPOTGIVDECCFRSCDLRRLEMYCAPLKPAKSA

IGF-2 | 27 | AYRPSETLCGGELVDTLOFWCGDRGFYFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSE

INGAP peptide (islet neogenesis-associated protein) | 28 | EESOKKLPSSRITCPOGSWAYGSYCYSLILIPOTWSNAELSCOMHFSGHLAFLLSTGEITFWSSLWKNSLTAYOYIWIGLHDPSHGTLPNGSGWKWSSSNVLTFYNWERNPSIAADRGYCAVLSOKSGFOKWRDFNCENELPYICKFKV

Intermedin (AFP-6) | 29 | TOAOLLRVGCVLGTCOWONLSHRLWOLMGPAGRODSAPWDPSSPHSY

Leptin, human | 30 | WPIOKVODDTKTLIKTIVTRINDISHTOSVSSKOKVTGLDFIPGLHPILTLSKMDOTLAVYOOILTSMPSRNVIOISNDLENLRDLLHVLAFSKSCHLPWASGLETLDSLGGVLEASGYSTEVVALSRLQGSLODMLWOLDLSPGC

Neuromedin (U-8), porcine | 31 | YFLFRPRN

Neuromedin (U-9) | 32 | GYFLFRPRN

Neuromedin (U25), human | 33 | FRVDEEFOSPFASOSRGYFLFRPRN

Neuromedin (U25), pig | 34 | FKVDEEFOGPIVSONRRYFLFRPRN

Neuromedin S, human | 35 | ILORGSGTAAWDFTKKDHTATWGRPFFLFRPRN

Neuromedin U, rat | 36 | YKVNEYOGPWAPSGGFFLFRPRN

Oxyntomodulin (OXM) | 37 | HSOGTFTSDYSKYLDSRRAODFWOWLMNTKRNRNNIA

Peptide YY (PYY) | 38 | YPIKPEAPGEDASPEELNRYYASLRHYLNLVTRORY

Pramlintide | 39 | KCNTATCATNRLANFLWHSSNNFGPILPPTNWGSNTY-NH2

Urocortin (Ucn-1) | 40 | DNPSLSIDLTFHLLRTLLELARTOSQRERAEQNRIIFDSV

Urocortin (Ucn-2) | 41 | IVLSLDWPIGLLOILLEQARARAAREQATTNARILARVGHC

Urocortin (Ucn-3) | 42 | FTLSLDWPTNIMNLLFNIAKAKNLRAQAAANAHLMAQI
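
As a rough illustration of the sequence-identity thresholds that accompany Table 3, the sketch below expands the amylin mimetic pattern given in this section as SEQ ID NO: 44 and scores each member against pramlintide (SEQ ID NO: 43). Several things here are assumptions and not part of the specification: identity is computed as a simple position-by-position match over equal-length, ungapped sequences, which may differ from the alignment method the specification intends; the pattern is read as having one X1 position (N or Q) and three X2 positions (S, P or G), inferred from the two residue sets in the text and from SEQ ID NOs: 43 and 45; and the slot markers "1" and "2" and all function names are hypothetical.

```python
from itertools import product

# SEQ ID NO: 44 as read here (an assumption): "1" marks X1, "2" marks X2.
PATTERN = "KCNTATCAT1RLANFLVHSSNNFG2IL22TNVGSNTY"
CHOICES = {"1": "NQ", "2": "SPG"}  # X1 is N or Q; X2 is S, P or G

# Sequences quoted in the text for SEQ ID NO: 43 and SEQ ID NO: 45.
PRAMLINTIDE = "KCNTATCATNRLANFLVHSSNNFGPILPPTNVGSNTY"
SEQ_ID_45 = "KCNTATCATNRLANFLVHSSNNFGGILGGTNVGSNTY"


def expand(pattern, choices):
    """Return every concrete sequence obtained by filling the marked slots."""
    slots = [choices.get(ch, ch) for ch in pattern]
    return ["".join(residues) for residues in product(*slots)]


def percent_identity(a, b):
    """Ungapped, position-wise identity for two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this simplified measure needs equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


mimetics = expand(PATTERN, CHOICES)
identities = [percent_identity(PRAMLINTIDE, m) for m in mimetics]

print(len(mimetics), "candidate mimetics")
print("SEQ ID NO: 45 in family:", SEQ_ID_45 in mimetics)
print("identity of SEQ ID NO: 45 to pramlintide:",
      round(percent_identity(PRAMLINTIDE, SEQ_ID_45), 1))
print("all members at or above 80%:", min(identities) >= 80.0)
```

Under these assumptions the family has 2 x 3 x 3 x 3 = 54 members, and because at most four of the 37 positions vary, every member stays above the 80% floor relative to pramlintide. A gapped alignment would be needed before applying the same measure to peptides of different lengths.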

“Adrenomedullin” or “ADM” means the human adrenomedullin peptide hormone and species and sequence variants thereof having at least a portion of the biological activity of mature ADM. ADM is generated from a 185 amino acid preprohormone through consecutive enzymatic cleavage and amidation, resulting in a 52 amino acid bioactive peptide with a measured plasma half-life of 22 min. ADM-containing fusion proteins of the invention may find particular use in diabetes for stimulatory effects on insulin secretion from islet cells for glucose regulation or in subjects with sustained hypotension. The complete genomic infrastructure for human AM has been reported (Ishimitsu, et al., Biochem. Biophys. Res. Commun 203:631-639 (1994)), and analogs of ADM peptides have been cloned, as described in U.S. Pat. No. 6,320,022.
“Amylin” means the human peptide hormone referred to as amylin, pramlintide, and species variations thereof, as described in U.S. Pat. No. 5,234,906, having at least a portion of the biological activity of mature amylin. Amylin is a 37-amino acid polypeptide hormone co-secreted with insulin by pancreatic beta cells in response to nutrient intake (Koda et al., Lancet 339:1179-1180. 1992), and has been reported to modulate several key pathways of carbohydrate metabolism, including incorporation of glucose into glycogen. Amylin-containing fusion proteins of the invention may find particular use in diabetes and obesity for regulating gastric emptying, suppressing glucagon secretion and food intake, thereby affecting the rate of glucose appearance in the circulation. Thus, the fusion proteins may complement the action of insulin, which regulates the rate of glucose disappearance from the circulation and its uptake by peripheral tissues. Amylin analogues have been cloned, as described in U.S. Pat. Nos. 5,686,411 and 7,271,238. Amylin mimetics can be created that retain biologic activity. For example, pramlintide has the sequence KCNTATCATNRLANFLVHSSNNFGPILPPTNVGSNTY (SEQ ID NO: 43), wherein amino acids from the rat amylin sequence are substituted for amino acids in the human amylin sequence. In one embodiment, the invention contemplates fusion proteins comprising amylin mimetics of the sequence
(SEQ ID NO: 44) KCNTATCATX1RLANFLVHSSNNFGX2ILX2X2TNVGSNTY
wherein X1 is independently N or Q and X2 is independently S, P or G. In one embodiment, the amylin mimetic incorporated into a BPXTEN can have the sequence KCNTATCATNRLANFLVHSSNNFGGILGGTNVGSNTY (SEQ ID NO: 45). In another embodiment, wherein the amylin mimetic is used at the C-terminus of the BPXTEN, the mimetic can have the sequence KCNTATCATNRLANFLVHSSNNFGGILGGTNVGSNTY(NH2) (SEQ ID NO: 46).
“Calcitonin” (CT) means the human calcitonin protein and species and sequence variants thereof, including salmon calcitonin (“sCT”), having at least a portion of the biological activity of mature CT. CT is a 32 amino acid peptide cleaved from a larger prohormone of the thyroid that appears to function in the nervous and vascular systems, but has also been reported to be a potent hormonal mediator of the satiety reflex. CT is named for its secretion in response to induced hypercalcemia and its rapid hypocalcemic effect. It is produced in and secreted from neuroendocrine cells in the thyroid termed C cells. CT has effects on the osteoclast, and the inhibition of osteoclast functions by CT results in a decrease in bone resorption. In vitro effects of CT include the rapid loss of ruffled borders and decreased release of lysosomal enzymes. A major function of CT(1-32) is to combat acute hypercalcemia in emergency situations and/or protect the skeleton during periods of “calcium stress” such as growth, pregnancy, and lactation. (Reviewed in Becker, JCEM, 89(4): 1512-1525 (2004) and Sexton, Current Medicinal Chemistry 6: 1067-1093 (1999)). Calcitonin-containing fusion proteins of the invention may find particular use for the treatment of osteoporosis and as a therapy for Paget’s disease of bone. Synthetic calcitonin peptides have been created, as described in U.S. Pat. Nos. 5,175,146 and 5,364,840.
“Calcitonin gene related peptide” or “CGRP” means the human CGRP peptide and species and sequence variants thereof having at least a portion of the biological activity of mature CGRP. Calcitonin gene related peptide is a member of the calcitonin family of peptides, which in humans exists in two forms, α-CGRP (a 37 amino acid peptide) and β-CGRP. CGRP has 43-46% sequence identity with human amylin. CGRP-containing fusion proteins of the invention may find particular use in decreasing morbidity associated with diabetes, ameliorating hyperglycemia and insulin deficiency, inhibition of lymphocyte infiltration into the islets, and protection of beta cells against autoimmune destruction. Methods for making synthetic and recombinant CGRP are described in U.S. Pat. No. 5,374,618.
“Cholecystokinin” or “CCK” means the human CCK peptide and species and sequence variants thereof having at least a portion of the biological activity of mature CCK. CCK-58 is the mature sequence, while the CCK-33 amino acid sequence first identified in humans is the major circulating form of the peptide. The CCK family also includes an 8-amino acid in vivo C-terminal fragment (“CCK-8”), pentagastrin or CCK-5 being the C-terminal peptide CCK(29-33), and CCK-4 being the C-terminal tetrapeptide CCK(30-33). CCK is a peptide hormone of the gastrointestinal system responsible for stimulating the digestion of fat and protein. CCK-33 and CCK-8-containing fusion proteins of the invention may find particular use in reducing the increase in circulating glucose after meal ingestion and potentiating the increase in circulating insulin. Analogues of CCK-8 have been prepared, as described in U.S. Pat. No. 5,631,230.
“Exendin-3” means a glucose regulating peptide isolated from Heloderma horridum and sequence variants thereof having at least a portion of the biological activity of mature exendin-3. Exendin-3 amide is a specific exendin receptor antagonist from that mediates an increase in pancreatic cAMP, and release of insulin and amylase. Exendin-3-containing fusion proteins of the invention may find particular use in the treatment of diabetes and insulin resistance disorders. The sequence and methods for its assay are described in U.S. Pat. No. 5,424,286.
“Exendin-4” means a glucose regulating peptide found in the saliva of the Gila-monster Heloderma suspectum, as well as species and sequence variants thereof, and includes the native 39 amino acid sequence His-Gly-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Leu-Ser-Lys-Gln-Met-Glu-Glu-Glu-Ala-Val-Arg-Leu-Phe-Ile-Glu-Trp-Leu-Lys-Asn-Gly-Gly-Pro-Ser-Ser-Gly-Ala-Pro-Pro-Pro-Ser (SEQ ID NO: 47) and homologous sequences and peptide mimetics, and variants thereof; natural sequences, such as from primates, and non-natural, having at least a portion of the biological activity of mature exendin-4. Exendin-4 is an incretin polypeptide hormone that decreases blood glucose, promotes insulin secretion, slows gastric emptying and improves satiety, providing a marked improvement in postprandial hyperglycemia. The exendins have some sequence similarity to members of the glucagon-like peptide family, with the highest identity being to GLP-1 (Goke, et al., J. Biol. Chem., 268:19650-55 (1993)). A variety of homologous sequences can be functionally equivalent to native exendin-4 and GLP-1. Conservation of GLP-1 sequences from different species is presented in Regulatory Peptides 2001 98 p. 1-12. Table 4 shows the sequences from a wide variety of species, while Table 5 shows a list of synthetic GLP-1 analogs; all of which are contemplated for use in the BPXTEN described herein. Exendin-4 binds at GLP-1 receptors on insulin-secreting βTC1 cells, and also stimulates somatostatin release and inhibits gastrin release in isolated stomachs (Goke, et al., J. Biol. Chem. 268:19650-55, 1993). As a mimetic of GLP-1, exendin-4 displays a similar broad range of biological activities, yet has a longer half-life than GLP-1, with a mean terminal half-life of 2.4 h. Exenatide is a synthetic version of exendin-4, marketed as Byetta. However, due to its short half-life, exenatide is currently dosed twice daily, limiting its utility. Exendin-4-containing fusion proteins of the invention may find particular use in the treatment of diabetes and insulin resistance disorders.
Fibroblast growth factor 21, or “FGF-21” means the human protein encoded by the FGF21 gene, or species and sequence variants thereof having at least a portion of the biological activity of mature FGF21. FGF-21 stimulates glucose uptake in adipocytes but not in other cell types; the effect is additive to the activity of insulin. FGF-21 injection in ob/ob mice results in an increase in Glut 1 in adipose tissue. FGF21 also protects animals from diet-induced obesity when overexpressed in transgenic mice and lowers blood glucose and triglyceride levels when administered to diabetic rodents (Kharitonenkov A. et al., (2005). “FGF-21 as a novel metabolic regulator”. J. Clin. Invest. 115: 1627-35). FGF-21-containing fusion proteins of the invention may find particular use in treatment of diabetes, including causing increased energy expenditure, fat utilization and lipid excretion. FGF-21 has been cloned, as disclosed in U.S. Pat. No. 6,716,626.
“FGF-19”, or “fibroblast growth factor 19” means the human protein encoded by the FGF19 gene, or species and sequence variants thereof having at least a portion of the biological activity of mature FGF-19. FGF-19 is a protein member of the fibroblast growth factor (FGF) family. FGF family members possess broad mitogenic and cell survival activities, and are involved in a variety of biological processes. FGF-19 increases liver expression of the leptin receptor, metabolic rate, stimulates glucose uptake in adipocytes, and leads to loss of weight in an obese mouse model (Fu, L., et al.). FGF-19-containing fusion proteins of the invention may find particular use in increasing metabolic rate and reversal of dietary and leptin-deficient diabetes. FGF-19 has been cloned and expressed, as described in US Patent Application No. 20020042367.
“Gastrin” means the human gastrin peptide, truncated versions, and species and sequence variants thereof having at least a portion of the biological activity of mature gastrin. Gastrin is a linear peptide hormone produced by G cells of the duodenum and in the pyloric antrum of the stomach and is secreted into the bloodstream. Gastrin is found primarily in three forms: gastrin-34 (“big gastrin”); gastrin-17 (“little gastrin”); and gastrin-14 (“minigastrin”). It shares sequence homology with CCK. Gastrin-containing fusion proteins of the invention may find particular use in the treatment of obesity and diabetes for glucose regulation. Gastrin has been synthesized, as described in U.S. Pat. No. 5,843,446.
“Ghrelin” means the human hormone that induces satiation, or species and sequence variants thereof, including the native, processed 27 or 28 amino acid sequence and homologous sequences. Ghrelin is produced mainly by P/D1 cells
food intake and increase fat mass by an action exerted at the level of the hypothalamus. Ghrelin also stimulates the release of growth hormone. Ghrelin is acylated at a serine residue by n-octanoic acid; this acylation is essential for binding to the GHS1a receptor and for the GH-releasing capacity of ghrelin. Ghrelin-containing fusion proteins of the invention may find particular use as agonists; e.g., to selectively stimulate motility of the GI tract in gastrointestinal motility disorder, to accelerate gastric emptying, or to stimulate the release of growth hormone. Ghrelin analogs with sequence substitutions or truncated variants, such as described in U.S. Pat. No. 7,385,026, may find particular use as fusion partners to XTEN for use as antagonists for improved glucose homeostasis, treatment of insulin resistance and treatment of obesity. The isolation and characterization of ghrelin has been reported (Kojima M. et al., Ghrelin is a growth-hormone-releasing acylated peptide from stomach. Nature. 1999; 402(6762): 656-660) and synthetic analogs have been prepared by peptide synthesis, as described in U.S. Pat. No. 6,967,237.
“Glucagon” means the human glucagon glucose regulating peptide, or species and sequence variants thereof, including the native 29 amino acid sequence and homologous sequences; natural, such as from primates, and non-natural sequence variants having at least a portion of the biological activity of mature glucagon. The term “glucagon” as used herein also includes peptide mimetics of glucagon. Native glucagon is produced by the pancreas, released when blood glucose levels start to fall too low, causing the liver to convert stored glycogen into glucose and release it into the bloodstream. While the action of glucagon is opposite that of insulin, which signals the body's cells to take in glucose from the blood, glucagon also stimulates the release of insulin, so that newly-available glucose in the bloodstream can be taken up and used by insulin-dependent tissues. Glucagon-containing fusion proteins of the invention may find particular use in increasing blood glucose levels in individuals with extant hepatic glycogen stores and maintaining glucose homeostasis in diabetes. Glucagon has been cloned, as disclosed in U.S. Pat. No. 4,826,763.
“GLP-1” means human glucagon like peptide-1 and sequence variants thereof having at least a portion of the biological activity of mature GLP-1. The term “GLP-1” includes human GLP-1(1-37), GLP-1(7-37), and GLP-1(7-36)amide. GLP-1 stimulates insulin secretion, but only during periods of hyperglycemia. The safety of GLP-1 compared to insulin is enhanced by this property and by the observation that the amount of insulin secreted is proportional to the magnitude of the hyperglycemia. The biological half-life of GLP-1(7-37)OH is a mere 3 to 5 minutes (U.S. Pat. No. 5,118,666). GLP-1-containing fusion proteins of the invention may find particular use in the treatment of diabetes and insulin-resistance disorders for glucose regulation. GLP-1 has been cloned and derivatives prepared, as described in U.S. Pat. No. 5,118,666.
Non-limited examples of GLP-1 lining the fundus of the human stomach and epsilon cells of sequences from a wide variety of species are shown in Table the pancreas that stimulates hunger, and is considered the 55 4, while Table 5 shows the sequences of a number of synthetic counterpart hormone to leptin. Ghrelin levels increase before GLP-1 analogs; all of which are contemplated for use in the meals and decrease after meals, and can result in increased BPXTEN compositions described herein. TABLE 4 Naturally GLP-1. Homologs

Gene Name  SEQ ID NO:  Sequence

GLP-1 frog  48  HAEGTYTNDWTEYLEEKAAKEFIEWLIKGKPKKIRYS

GLP-1a Xenopus laevis  49  HAEGTFTSDWTOOLDEKAAKEFIDWLINGGPSKEIIS

GLP-1b Xenopus laevis  50  AEGT YTNDWTEYLEEKAAKE FIIEWLIKGKPK

GLP-1c Xenopus laevis  51  AEGT FTNDMTNYLEEKAAKE IKGRPK

Gastric Inhibitory Polypeptide Mus musculus  52  AEGT FISDYSIAMDKIROOD L

Glucose-dependent insulinotropic polypeptide Equus caballus  53  AEGT FISDYSIAMDKIROOD L

Glucagon-like peptide Petromyzon marinus  54  HADGT FTNDMTSYLDAKAARD FWSW ARSDKS

Glucagon-like peptide Anguilla rostrata  55  DWSSYLODQAAKE FWSW

Glucagon-like peptide Anguilla anguilla  56  AEGT YTSDWSSYLODOAAKE FWSW KTGR

Glucagon-like peptide Hydrolagus colliei  57  YTSDWASLTDYLKSKR FWES LSNYNKRONDRRM

Glucagon-like peptide Amia calva  58  YADAP YISDVYSYLODOVAKKWLKSGODRRE

GLUC_ICTPU/38-65  59  DWSSYLOEQAAKD ITW KS

GLUCL_ANGRO/1-28  60  DWSSYLODQAAKE FWSW

GLUC_BOVIN/98-125  61  AEGT FTSDWSSYLEGOAAKE FIAW WK

GLUC1_LOPAM/91-118  62  HADGT FTSDWSSYLKDQAIKD WDR KA

GLUCL_HYDCO/1-28  63  YTSDWASLTDYLKSKR FWES SN

GLUC_CAVPO/53-80  64  SQGT FTSDYSKYLDSRRAQQ FLKW LN

GLUC_CHIBR/1-28  65  FTSDYSKHLDSRYAOE FWOW

GLUC1_LOPAM/53-80  66  SEGT FSNDYSKYLEDRKAOE

GLUC_HYDCO/1-28  67  HTDG FSSDYSKYLDNRRTKD FWOW LLS

GLUC_CALMI/1-28  68  SEGT FSSDYSKYLDSRRAKD FWOW LMS

GIP_BOVIN/1-28  69  YAEGT S DYSIAMDKIROOD LLA

VIP_MELGA/89-116  70  FTTWYSHLLAKLAWKRYLHS IR

PACA_CHICK/131-158  71  HIDG FTDSYSRYRKOMAVKKY AAVIG

VIP_CAVPO/45-72  72  SDAL FTDTYTRLRKOMAMKKY NSWLN

VIP_DIDMA/1-28  73  SDAW FTDSYTRLLKOMAMRKY LDSILN

EXE1_HELSU/1-28  74  SDAT FTAEYSKLLAKLALOKY ESILG

SLIB_CAPHI/1-28  75  YADAI FTNSYRKVLGOLSARKL LODIMN

SLIB_RAT/31-58  76  HADAI FTSSYRRILGOLYARKL LHEIMN

SLIB_MOUSE/31-58  77  WDAI FTTNYRKLLSOLYARKVIODIMN

PACA_HUMAN/83-110  78  WAHGI LNEAYRKVLDOLSAGKH

PACA_SHEEP/83-110  79  WAHGI LDKAYRKVLDOLSARRY LOTLMA

PACA_ONCNE/82-109  80  HADGM FNKAYRKALGOLSARKY HSLMA

GLUC_BOVIN/146-173  81  HADGS FSDEMNTVLDSLATRDFINWLLO

SECR_CANFA/1-27  82  FTSELSRLRESARLORL

SECR_CHICK/1-27  83  HSDGLFTSEYSKMRGNAOWOKFIONLM

EXE3_HELHO/48-75  84  HSDGTFTSDLSKOMEEEAVRLFIEWLKN
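By way of illustration only, a per-position sequence motif of the bracketed kind used in this section for GLP native sequences can be checked programmatically. The following Python sketch is not part of the original disclosure. The names MOTIF_GROUPS, MOTIF_RE and find_motif are hypothetical, the residue sets are transcribed from the bracketed motif given in this section (the group boundaries are an assumption wherever the printed text is unclear), and the test sequence is human GLP-1(7-37) written in standard one-letter code.

```python
import re

# Acceptable residues at each of the 29 motif positions.
# Assumption: transcribed from the bracketed motif in this section;
# group boundaries were inferred where the printed text was unclear.
MOTIF_GROUPS = [
    "HVY", "AGISTV", "DEHQ", "AG", "ILMPSTV", "FLY", "DINST", "ADEKNST",
    "ADENSTV", "LMVY", "ANRSTY", "EHIKNQRST", "AHILMQVY", "LMRT", "ADEGKQS",
    "ADEGKNQSY", "AEIKLMQR", "AKQRSVY", "AILMQSTV", "GKQR", "DEKLQR",
    "FHLVWY", "ILV", "ADEGHIKNQRST", "ADEGNRSTW", "GILVW", "AIKLMQSV",
    "ADGIKNQRST", "GKRSY",
]

# One regular-expression character class per motif position.
MOTIF_RE = re.compile("".join(f"[{group}]" for group in MOTIF_GROUPS))


def find_motif(sequence: str):
    """Return the 0-based start of the first motif match, or None."""
    match = MOTIF_RE.search(sequence.upper())
    return match.start() if match else None


# Human GLP-1(7-37) in standard one-letter amino acid code.
GLP1_7_37 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"

if __name__ == "__main__":
    print(find_motif(GLP1_7_37))
```

A candidate sequence located by homology searching could be passed to find_motif in the same way; a return value of None would indicate that no 29-residue window satisfies every position of the motif as transcribed here.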

GLP native sequences may be described by several sequence motifs, which are presented below. Letters in brackets represent acceptable amino acids at each sequence position: [HVY] [AGISTV] [DEHQ] [AG] [ILMPSTV] [FLY] [DINST] [ADEKNST] [ADENSTV] [LMVY] [ANRSTY] [EHIKNQRST] [AHILMQVY] [LMRT] [ADEGKQS] [ADEGKNQSY] [AEIKLMQR] [AKQRSVY] [AILMQSTV] [GKQR] [DEKLQR] [FHLVWY] [ILV] [ADEGHIKNQRST] [ADEGNRSTW] [GILVW] [AIKLMQSV] [ADGIKNQRST] [GKRSY]. In addition, synthetic analogs of GLP-1 can be useful as fusion partners to XTEN to create BPXTEN with biological activity useful in treatment of glucose-related disorders. Further sequences homologous to Exendin-4 or GLP-1 may be found by standard homology searching techniques.

TABLE 5 GLP-1 synthetic analogs

SEQ ID NO:  Sequence

85  HAEGTFTSDWSSYLEGOAAREFIAWLVKGRG

86  HAEGTFTSDWSSYLEGOAAKEFIAWLVRGRG

87  HAEGTFTSDWSSYLEGOAAKEFIAWLVKGKG

88  HAEGTFTSDWSSYLEGOAAREFIAWLVRGKG

89  HAEGTFTSDWSSYLEGOAAREFIAWLVRGKGR

90  HAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

91  HAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

92  HAEGTFTSDWSSYLEGOAAREFIAWLVKGKG

93  HAEGTFTSDWSSYLEGOAAKEFIAWLVRGKG

94  HAEGTFTSDWSSYLEGOAAREFIAWLVKGRGRK

95  HAEGTFTSDWSSYLEGOAAKEFIAWLVRGRGRRK

96  HAEGTFTSDWSSYLEGOAAREFIAWLVRGKGRK

97  HAEGTFTSDWSSYLEGOAAREFIAWLVRGKGRRK

98  HGEGTFTSDWSSYLEGOAAREFIAWLVKGRG

99  HGEGTFTSDWSSYLEGOAAKEFIAWLVRGRG

100  HGEGTFTSDWSSYLEGOAAKEFIAWLVKGKG

101  HGEGTFTSDWSSYLEGOAAREFIAWLVRGKG

102  HGEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

103  HGEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

104  HGEGTFTSDWSSYLEGOAAREFIAWLVKGKG

105  HGEGTFTSDWSSYLEGOAAKEFIAWLVRGKG

106  HGEGTFTSDWSSYLEGOAAREFIAWLVKGRGRK

107  HGEGTFTSDWSSYLEGOAAKEFIAWLVRGRGRRK

108  HGEGTFTSDWSSYLEGOAAREFIAWLVRGKGRK

109  HGEGTFTSDWSSYLEGOAAREFIAWLVRGKGRRK

110  HAEGTFTSDWSSYLEGOAAREFIAWLWRGRGK

111  HAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRK

112  HAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

113  HAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREK

114  HAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFK

115  HAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFPK

116  HAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFPEK

117  HAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFPEEK

118  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGK

119  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

120  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

121  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREK

122  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFK

123  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPK

124  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEK

125  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEEK

126  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

127  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

128  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREK

129  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFK

130  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPK

131  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEK

132  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEEK

133  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGK

134  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

135  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

136  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREK

137  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFK

138  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPK

139  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEK

140  EFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEEK

141  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGK

142  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

143  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

144  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREK

145  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFK

146  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPK

147  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEK

148  FERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEEK

149  ERHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGK

150  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRK

151  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRRK

152  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREK

153  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFK

154  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFPK

155  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFPEK

156  ERHAEGTFTSDWSSYLEGOAAREFIAWLWRGRGRREFPEEK

157  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGK

158  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRK

159  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRRK

160  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREK

161  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFK

162  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPK

163  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEK

164  RHAEGTFTSDWSSYLEGOAAREFIAWLVRGRGRREFPEEK

165  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVKGRGK

166  HDEFERHAEGTFTSDWSSYLEGOAAKEFIAWLVRGRGK

167  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGKGK

168  HAEGTFTSDWSSYLEGOAAREFIAWLVKGRGK

169  HAEGTFTSDWSSYLEGOAAKEFIAWLVRGRGK

170  HAEGTFTSDWSSYLEGOAAREFIAWLVRGKGK

171  HAEGTFTSDWSSYLEGOAAREFIAWLVRGRGK

172  DEFERHAEGTFTSDWSSYLEGOAAREFIAWLWKGRGRK

173  DEFERHAEGTFTSDWSSYLEGOAAKEFIAWLWRGRGRK

174  HDEFERHAEGTFTSDWSSYLEGOAAREFIAWLVRGKGRK

175  HAEGTFTSDWSSYLEGOAAREFIAWLVKGRGRK

176  HAEGTFTSDWSSYLEGOAAKEFIAWLVRGRGRK

177  HAEGTFTSDWSSYLEGOAAREFIAWLVRGKGRK

178  HGEGTFTSDWSSYLEGOAAREFIAWLVKGRGK

179  HGEGTFTSDWSSYLEGOAAREFIAWLVRGKGK

“GLP-2” means human glucagon like peptide-2 and sequence variants thereof having at least a portion of the biological activity of mature GLP-2. More particularly, GLP-2 is a 33 amino acid peptide, co-secreted along with GLP-1 from intestinal endocrine cells in the small and large intestine.

“IGF-1” or “Insulin-like growth factor 1” means the human IGF-1 protein and species and sequence variants thereof having at least a portion of the biological activity of mature IGF-1. IGF-1, which was once called somatomedin C, is a polypeptide protein anabolic hormone similar in molecular structure to insulin, and that modulates the action of growth hormone. IGF-1 consists of 70 amino acids and is produced primarily by the liver as an endocrine hormone as well as in target tissues in a paracrine/autocrine fashion. IGF-1-containing fusion proteins of the invention may find particular use in the treatment of diabetes and insulin-resistance disorders for glucose regulation. IGF-1 has been cloned and expressed in E. coli and yeast, as described in U.S. Pat. No. 5,324,639.

“IGF-2” or “Insulin-like growth factor 2” means the human IGF-2 protein and species and sequence variants thereof having at least a portion of the biological activity of mature IGF-2. IGF-2 is a polypeptide protein hormone similar in molecular structure to insulin, with a primary role as a growth-promoting hormone during gestation. IGF-2 has been cloned, as described in Bell GI, et al. Isolation of the human insulin-like growth factor genes: insulin-like growth factor II and insulin genes are contiguous. Proc Natl Acad Sci USA. 1985. 82(19):6450-4.

“INGAP”, or “islet neogenesis-associated protein”, or “pancreatic beta cell growth factor” means the human INGAP peptide and species and sequence variants thereof having at least a portion of the biological activity of mature INGAP. INGAP is capable of initiating duct cell proliferation, a prerequisite for islet neogenesis. INGAP-containing fusion proteins of the invention may find particular use in the treatment or prevention of diabetes and insulin-resistance disorders. INGAP has been cloned and expressed, as described in R Rafaeloff R. et al., Cloning and sequencing of the pancreatic islet neogenesis associated protein (INGAP) gene and its expression in islet neogenesis in hamsters. J Clin Invest. 1997. 99(9): 2100-2109.

“Intermedin” or “AFP-6” means the human intermedin peptide and species and sequence variants thereof having at least a portion of the biological activity of mature intermedin. Intermedin is a ligand for the calcitonin receptor-like receptor. Intermedin treatment leads to blood pressure reduction both in normal and hypertensive subjects, as well as the suppression of gastric emptying activity, and is implicated in glucose homeostasis. Intermedin-containing fusion proteins of the invention may find particular use in the treatment of diabetes, insulin-resistance disorders, and obesity. Intermedin peptides and variants have been cloned, as described in U.S. Pat. No. 6,965,013.

“Leptin” means the naturally occurring leptin from any species, as well as biologically active D-isoforms, or fragments and sequence variants thereof. Leptin plays a key role in regulating energy intake and energy expenditure, including appetite and metabolism. Leptin-containing fusion proteins of the invention may find particular use in the treatment of diabetes for glucose regulation, insulin-resistance disorders, and obesity. Leptin is the polypeptide product of the ob gene as described in the International Patent Pub. No. WO 96/05309. Leptin has been cloned, as described in U.S. Pat. No. 7,112,659, and leptin analogs and fragments in U.S. Pat. No. 5,521,283, U.S. Pat. No. 5,532,336, PCT/US96/22308 and PCT/US96/01471.

“Neuromedin” means the neuromedin family of peptides including neuromedin U and S peptides, and sequence variants thereof. The native active human neuromedin U peptide hormone is neuromedin-U25, particularly its amide form. Of particular interest are their processed active peptide hormones and analogs, derivatives and fragments thereof. Included in the neuromedin U family are various truncated or splice variants, e.g., FLFHYSKTQKLGKSNVVEELQSPFASQSRGYFLFRPRN (SEQ ID NO: 180). Exemplary of the neuromedin S family is human neuromedin S with the sequence ILQRGSGTAAVDFTKKDHTATWGRPFFLFRPRN (SEQ ID NO: 181), particularly its amide form. Neuromedin fusion proteins of the invention may find particular use in treating obesity, diabetes, reducing food intake, and other related conditions and disorders as described herein. Of particular interest are neuromedin modules combined with an amylin family peptide, an exendin peptide family or a GLP-1 peptide family module.

“Oxyntomodulin”, or “OXM” means human oxyntomodulin and species and sequence variants thereof having at least a portion of the biological activity of mature OXM. OXM is a 37 amino acid peptide produced in the colon that contains the 29 amino acid sequence of glucagon followed by an 8 amino acid carboxyterminal extension. OXM has been found to suppress appetite. OXM-containing fusion proteins of the invention may find particular use in the treatment of diabetes for glucose regulation, insulin-resistance disorders, obesity, and can be used as a weight loss treatment.

“PYY” means human peptide YY polypeptide and species and sequence variants thereof having at least a portion of the biological activity of mature PYY. PYY includes both the human full length, 36 amino acid peptide, PYY(1-36) and PYY(3-36), which have the PP fold structural motif. PYY inhibits gastric motility and increases water and electrolyte absorption in the colon. PYY may also suppress pancreatic secretion. PYY-containing fusion proteins of the invention may find particular use in the treatment of diabetes for glucose regulation, insulin-resistance disorders, and obesity. Analogs of PYY have been prepared, as described in U.S. Pat. Nos. 5,604,203, 5,574,010 and 7,166,575.

“Urocortin” means a human urocortin peptide hormone and sequence variants thereof having at least a portion of the biological activity of mature urocortin. There are three human urocortins: Ucn-1, Ucn-2 and Ucn-3. Further urocortins and analogs have been described in U.S. Pat. No. 6,214,797. Urocortins Ucn-2 and Ucn-3 have food-intake suppression, antihypertensive, cardioprotective, and inotropic properties. Ucn-2 and Ucn-3 have the ability to suppress the chronic HPA activation following a stressful stimulus such as dieting/fasting, and are specific for the CRF type 2 receptor and do not activate CRF-R1 which mediates ACTH release. BPXTEN comprising urocortin, e.g., Ucn-2 or Ucn-3, may be useful for vasodilation and thus for cardiovascular uses such as chronic heart failure. Urocortin-containing fusion proteins of the invention may also find particular use in treating or preventing conditions associated with stimulating ACTH release, hypertension due to vasodilatory effects, inflammation mediated via other than ACTH elevation, hyperthermia, appetite disorder, congestive heart failure, stress, anxiety, and psoriasis. Urocortin-containing fusion proteins may also be combined with a natriuretic peptide module, amylin family, and exendin family, or a GLP1 family module to provide an enhanced cardiovascular benefit, e.g. treating CHF, as by providing a beneficial vasodilation effect.

(b) Metabolic Disease and Cardiovascular Proteins

Metabolic and cardiovascular diseases represent a substantial health care burden in most developed nations, with cardiovascular diseases remaining the number one cause of death and disability in the United States and most European countries. Metabolic diseases and disorders include a large variety of conditions affecting the organs, tissues, and circulatory system of the body. Chief amongst these is diabetes: one of the leading causes of death in the United States, as it results in pathology and metabolic dysfunction in both the vasculature, central nervous system, major organs, and peripheral tissues. Insulin resistance and hyperinsulinemia have also been linked with two other metabolic disorders that pose considerable health risks: impaired glucose tolerance and metabolic obesity. Impaired glucose tolerance is characterized by normal glucose levels before eating, with a tendency toward elevated levels (hyperglycemia) following a meal. These individuals are considered to be at higher risk for diabetes and coronary artery disease. Obesity is also a risk factor for the group of conditions called insulin resistance syndrome, or “Syndrome X,” as is hypertension, coronary artery disease (arteriosclerosis), and lactic acidosis, as well as related disease states. The pathogenesis of obesity is believed to be multifactorial but an underlying problem is that in the obese, nutrient availability and energy expenditure are not in balance until there is excess adipose tissue.

Dyslipidemia is a frequent occurrence among diabetics and subjects with cardiovascular disease; typically characterized by parameters such as elevated plasma triglycerides, low HDL (high density lipoprotein) cholesterol, normal to elevated levels of LDL (low density lipoprotein) cholesterol and increased levels of small dense, LDL particles in the blood. Dyslipidemia and hypertension is a main contributor to an increased incidence of coronary events, renal disease, and deaths among subjects with metabolic diseases like diabetes and cardiovascular disease.

Cardiovascular disease can be manifest by many disorders, symptoms and changes in clinical parameters involving the heart, vasculature and organ systems throughout the body, including aneurysms, angina, atherosclerosis, cerebrovascular accident (stroke), cerebrovascular disease, congestive heart failure, coronary artery disease, myocardial infarction, reduced cardiac output and peripheral vascular disease, hypertension, hypotension, blood markers (e.g., C-reactive protein, BNP, and enzymes such as CPK, LDH, SGPT, SGOT), amongst others.

Most metabolic processes and many cardiovascular parameters are regulated by multiple peptides and hormones (“metabolic proteins”), and many such peptides and hormones, as well as analogues thereof, have found utility in the treatment of such diseases and disorders. However, the use of therapeutic peptides and/or hormones, even when augmented by the use of small molecule drugs, has met with limited success in the management of such diseases and disorders. In particular, dose optimization is important for drugs and biologics used in the treatment of metabolic diseases, especially those with a narrow therapeutic window. Hormones in general, and peptides involved in glucose homeostasis often have a narrow therapeutic window. The narrow therapeutic window, coupled with the fact that such hormones and peptides typically have a short half-life which necessitates frequent dosing in order to achieve clinical benefit, results in difficulties in the management of such patients. Therefore, there remains a need for therapeutics with increased efficacy and safety in the treatment of metabolic diseases.

Thus, one aspect of the present invention is the incorporation of biologically active metabolic proteins and involved in or used in the treatment of metabolic and cardiovascular diseases and disorders into BPXTEN fusion proteins to create compositions with utility in the treatment of such disorders, disease and related conditions. The metabolic proteins can include any protein of biologic, therapeutic, or prophylactic interest or function that is useful for preventing, treating, mediating, or ameliorating a metabolic or cardiovascular disease, disorder or condition. Table 6 provides a non-limiting list of such sequences of metabolic BPs that are encompassed by the BPXTEN fusion proteins of the invention. Metabolic proteins of the inventive BPXTEN compositions can be a protein that exhibits at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a protein sequence selected from Tables 6.
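By way of illustration only, the percent sequence identity between a candidate protein and a reference sequence can be estimated from a pairwise global alignment. The following Python sketch is not part of the original disclosure and is offered under stated assumptions: the function names global_align and percent_identity are hypothetical, the scoring values (match +1, mismatch -1, gap -1) are arbitrary illustrative choices, and identity is computed here as identical aligned positions divided by alignment length, which is only one of several conventions in use. The two example sequences are human GLP-1(7-37) and a hypothetical single-substitution variant.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell, preferring diagonal moves on ties.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    identical = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


# Human GLP-1(7-37) and a hypothetical variant with one K-to-R substitution.
REFERENCE = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"
VARIANT = "HAEGTFTSDVSSYLEGQAAREFIAWLVKGRG"

if __name__ == "__main__":
    identity = percent_identity(REFERENCE, VARIANT)
    print(f"{identity:.1f}% identity; meets 80% threshold: {identity >= 80.0}")
```

In practice an established alignment tool and a substitution matrix would normally be used; the sketch is intended only to make the threshold comparison concrete.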
TABLE 6 Biologically active proteins for metabolic disorders and cardiology

Name of Protein (Synonym)  Sequence  SEQ ID NO.

Anti-CD3  See U.S. Pat. Nos. 5,885,573 and 6,491,916

IL-1ra, human full length  MEICRGLRSHILITLLLFLFHSETICRPSGRKSSKMOAFRIWDVNOKTFYLRNNOLVAGYLOGPNVNLEEKIDVWPIEPHALFLGIHGGKMCLSCVKSGDETRLOLEAVNITDLSENRKODKRFAFIRSDSGPTTSFESAACPGWFLCTAMEADOPWSLTNMPDEGVMVTKFYFOEDE  1723

IL-1ra, Dog  METCRCPLSYLISFLLFLPHSETACRLGKRPCRMOAFRIWDVNOKTFYLRNNOLVAGYLOGSNTKLEEKLDVWPWEPHAVFLGIHGGKLCLACVKSGDETRLOLEAVNITDLSKNKDODKRFTFILSDSGPTTSFESAACPGWFLCTALEADRPVSLTNRPEEAMMVTKFYFOKE  1724

IL-1ra, Rabbit  MRPSRSTRRHILISLLLFLFHSETACRPSGKRPCRMOAFRIWDVNOKTFYLRNNOLVAGYLQGPNAKLEERIDWWPLEPOLLFLGIORGKLCLSCWKSGDKMKLHLEAVNITDLGKNKEQDKRFTFIRSNSGPTTTFESASCPGWFLCTALEADOPVSLTNTPDDSIVVTKFYFOED  1725

IL-1ra Rat  MEICRGPYSHLISLLLILLFRSESAGHIPAGKRPCKMOAFRIWDTNOKTFYLRNNOLIAGYLOGPNTKLEEKIDMVPIDFRNVFLGIHGGKLCLSCVKSGDDTKLOLEEVNITDLNKNKEEDKRFTFIRSETGPTTSFESLACPGWFLCTTLEADHPVSLTNTPKEPCTVTKFYFOED  1726

IL-1ra Mouse  MEICWGPYSHLISLLLILLFHSEAACRPSGKRPCKMOAFRIWDTNOKTFYLRNNOLIAGYLOGPNIKLEEKIDMVPIDLHSVFLGIHGGKLCLSCAKSGDDIKLOLEEVNITDLSKNKEEDKRFTFIRSEKGPTTSFESAACPGWFLCTTLEADRPVSLTNTPEEPLIVTKFYFOEDO  1727

Anakinra  MRPSGRKSSKMOAFRIWDVNOKTFYLRNNOLVAGYLOGPNVNLEEKIDVWPIEPHALFLGIHGGKMCLSCVKSGDETRLOLEAVNITDLSENRKODKRFAFIRSDSGPTTSFESAACPGWFLCTAMEADOPVSLTNMPDEGVMVTKFYFOEDE  1728

α-natriuretic peptide (ANP)  SLRRSSCFGGRMDRIGAOSGLGCNSFRY  1729

β-natriuretic peptide, human (BNP human)  SPKMVOGSGGFGRKMDRISSSSGLGCKVLRRH  1730

Brain natriuretic peptide, Rat; (BNP Rat)  NSKMAHSSSCFGOKIDRIGAVSRLGCDGLRLF  1731

C-type natriuretic peptide (CNP, porcine)  GLSKGCFGLKLDRIGSMSGIGC  1732

Fibroblast growth factor 2 (FGF-2)  PALPEDGGSGAFPPGHFKDPKRLYCKNGGFFLRIHPDGRWDGWREKSDPHIKLOLOAEERGVVSIKGVCANRYLAMKEDGRLLASKCVTDECFFFERLESNNYNTYRSRKYTSWYVALKRTGOYKLGSKTGPGOKAILFLPMSAKS  1733

TNF receptor (TNFR)  LPAOVAFTPYAPEPGSTCRLREYYDOTAOMCCSKCSPGOHAKVFCTKTSDTVCDSCEDSTYTOLWNWVPECLSCGSRCSSDOVETOACTREONRICTCRPGWYCALSKOEGCRLCAPLRKCRPGFGVARPGTETSDVWCKPCAPGTFSNTTSSTDICRPHOICNVVAIPGNASMDAVCTSTSPTRSMAPGAVHLPOPVSTRSOHTOPTPEPSTAPSTSFLLPMGPSPPAEGSTGD  1734

“Anti-CD3” means the monoclonal antibody against the T cell surface protein CD3, species and sequence variants, and fragments thereof, including OKT3 (also called muromonab) and humanized anti-CD3 monoclonal antibody (hOKT3γ1(Ala-Ala)) (KC Herold et al., New England Journal of Medicine 346:1692-1698. 2002). Anti-CD3 prevents T-cell activation and proliferation by binding the T-cell receptor complex present on all differentiated T cells. Anti-CD3-containing fusion proteins of the invention may find particular use to slow new-onset Type 1 diabetes, including use of the anti-CD3 as a therapeutic effector as well as a targeting moiety for a second therapeutic BP in the BPXTEN composition. The sequences for the variable region and the creation of anti-CD3 have been described in U.S. Pat. Nos. 5,885,573 and 6,491,916.

“IL-1ra” means the human IL-1 receptor antagonist protein and species and sequence variants thereof, including the sequence variant anakinra (Kineret®), having at least a portion of the biological activity of mature IL-1ra. Human IL-1ra is a mature glycoprotein of 152 amino acid residues. The inhibitory action of IL-1ra results from its binding to the type I IL-1 receptor. The protein has a native molecular weight of 25 kDa, and the molecule shows limited sequence homology to IL-1α (19%) and IL-1β (26%). Anakinra is a nonglycosylated, recombinant human IL-1ra and differs from endogenous human IL-1ra by the addition of an N-terminal methionine. A commercialized version of anakinra is marketed as Kineret®. It binds with the same avidity to IL-1 receptor as native IL-1ra and IL-1b, but does not result in receptor activation (signal transduction), an effect attributed to the presence of only one receptor binding motif on IL-1ra versus two such motifs on IL-1α and IL-1β. Anakinra has 153 amino acids and 17.3 kD in size, and has a reported half-life of approximately 4-6 hours.

Increased IL-1 production has been reported in patients with various viral, bacterial, fungal, and parasitic infections; intravascular coagulation; high-dose IL-2 therapy; solid tumors; leukemias; Alzheimer's disease; HIV-1 infection; autoimmune disorders; trauma (surgery); hemodialysis; ischemic diseases (myocardial infarction); noninfectious hepatitis; asthma; UV radiation; closed head injury; pancreatitis; peritonitis; graft-versus-host disease; transplant rejection; and in healthy subjects after strenuous exercise. There is an association of increased IL-1b production in patients with Alzheimer's disease and a possible role for IL-1 in the release of the amyloid precursor protein. IL-1 has also been associated with diseases such as type 2 diabetes, obesity, hyperglycemia, hyperinsulinemia, type 1 diabetes, insulin resistance, retinal neurodegenerative processes, disease states and conditions characterized by insulin resistance, acute myocardial infarction (AMI), acute coronary syndrome (ACS), atherosclerosis, chronic inflammatory disorders, rheumatoid arthritis, degenerative intervertebral disc disease, sarcoidosis, Crohn's disease, ulcerative colitis, gestational diabetes, excessive appetite, insufficient satiety, metabolic disorders, glucagonomas, secretory disorders of the airway, osteoporosis, central nervous system disease, restenosis, neurodegenerative disease, renal failure, congestive heart failure, nephrotic syndrome, cirrhosis, pulmonary edema, hypertension, disorders wherein the reduction of food intake is desired, irritable bowel syndrome, myocardial infarction, stroke, post-surgical catabolic changes, hibernating myocardium, diabetic cardiomyopathy, insufficient urinary sodium excretion, excessive urinary potassium concentration, conditions or disorders associated with toxic hypervolemia, polycystic ovary syndrome, respiratory distress, chronic skin ulcers, nephropathy, left ventricular systolic dysfunction, gastrointestinal diarrhea, postoperative dumping syndrome, irritable bowel syndrome, critical illness polyneuropathy (CIPN), systemic inflammatory response syndrome (SIRS), dyslipidemia, reperfusion injury following ischemia, and coronary heart disease risk factor (CHDRF) syndrome. IL-1ra-containing fusion proteins of the invention may find particular use in the treatment of any of the foregoing diseases and disorders. IL-1ra has been cloned, as described in U.S. Pat. Nos. 5,075,222 and 6,858,409.

“Natriuretic peptides” means atrial natriuretic peptide (ANP), brain natriuretic peptide (BNP or B-type natriuretic peptide) and C-type natriuretic peptide (CNP); both human and non-human species and sequence variants thereof having at least a portion of the biological activity of the mature counterpart natriuretic peptides. Alpha atrial natriuretic peptide (aANP) or (ANP) and brain natriuretic peptide (BNP) and type C natriuretic peptide (CNP) are homologous polypeptide hormones involved in the regulation of fluid and electrolyte homeostasis. Sequences of useful forms of natriuretic peptides are disclosed in U.S. Patent Publication 20010027181. Examples of ANPs include human ANP (Kangawa et al., BBRC 118:131 (1984)) or that from various species, including pig and rat ANP (Kangawa et al., BBRC 121:585 (1984)). Sequence analysis reveals that preproBNP consists of 134 residues and is cleaved to a 108-amino acid ProBNP. Cleavage of a 32-amino acid sequence from the C-terminal end of ProBNP results in human BNP (77-108), which is the circulating, physiologically active form. The 32-amino acid human BNP involves the formation of a disulfide bond (Sudoh et al., BBRC 159: 1420 (1989)) and U.S. Pat. Nos. 5,114,923, 5,674,710, 5,674,710, and 5,948,761. BPXTEN-containing one or more natriuretic functions may be useful in treating hypertension, diuresis inducement, natriuresis inducement, vascular conduct dilatation or relaxation, natriuretic peptide receptors (such as NPR-A) binding, apida secretion suppression from the kidney, aldosterone secretion suppression from the adrenal gland, treatment of cardiovascular diseases and disorders, reducing, stopping or reversing cardiac remodeling after a cardiac event or as a result of congestive heart failure, treatment of renal diseases and disorders; treatment or prevention of ischemic stroke, and treatment of asthma.

“FGF-2” or heparin-binding growth factor 2, means the human FGF-2 protein, and species and sequence variants thereof having at least a portion of the biological activity of the mature counterpart. FGF-2 had been shown to stimulate proliferation of neural stem cells differentiated into striatal-like neurons and protect striatal neurons in toxin-induced models of Huntington Disease, and also may have utility in treatment of cardiac reperfusion injury, and may have endothelial cell growth, anti-angiogenic and tumor suppressive properties, wound healing, as well as promoting fracture healing in bones. FGF-2 has been cloned, as described in Burgess, W. H. and Maciag, T., Ann. Rev. Biochem., 58:575-606 (1989); Coulier, F., et al., 1994, Prog. Growth Factor Res. 5:1; and the PCT publication WO 87/01728.

“TNF receptor” means the human receptor for TNF, and species and sequence variants thereof having at least a portion of the biological receptor activity of mature TNFR. P75 TNF Receptor molecule is the extracellular domain of p75 TNF receptor, which is from a family of structurally homologous receptors which includes the p55 TNF receptor. TNF α and TNF β (TNF ligands) compete for binding to the p55 and p75 TNF receptors. The X-ray crystal structure of the complex formed by the extracellular domain of the human p55 TNF receptor and TNF β has been determined (Banner et al. Cell 73:431, 1993, incorporated herein by reference).

(c) Coagulation Factors

In hemophilia the clotting of blood is disturbed by a lack of certain plasma blood clotting factors. Human factor IX (FIX) is a zymogen of a serine protease that is an important component of the intrinsic pathway of the blood coagulation cascade. In individuals who do not have FIX deficiency, the average half-life of FIX is short; approximately 18-24 hours. A deficiency of functional FIX, due to an X-linked disorder that occurs in about one in 30,000 males, results in hemophilia B, also known as Christmas disease. Over 100 mutations of factor IX have been described; some cause no symptoms, but many lead to a significant bleeding disorder.

The physiological trigger of coagulation is the formation of a complex between tissue Factor (TF) and Factor VIIa (FVIIa) on the surface of TF expressing cells, which are normally located outside the vasculature. This leads to the activation of Factor IX and Factor X, ultimately generating some thrombin. In turn, thrombin activates Factor VIII and Factor IX, the so-called “intrinsic” arm of the blood coagulation cascade, thus amplifying the generation of Factor Xa, which is necessary for the generation of the full thrombin burst to achieve complete hemostasis. It was subsequently shown that by administering high concentrations of Factor VIIa, hemostasis can be achieved, bypassing the need for Factor VIIIa and Factor IXa in certain bleeding disorders. Coagulation and bleeding disorders can result in changes in such parameters as prothrombin time, partial prothrombin time, bleeding time, clotting time, platelet count, prothrombin fragment 1+2 (F1+2), thrombin-antithrombin III complex (TAT), D-dimer, incidence of bleeding episodes, erythrocyte sedimentation rate (ESR), C-reactive protein, and blood concentration of coagulation factors.

Thus, Factor VIIa (FVIIa) proteins have found utility for the treatment of bleeding episodes in hemophilia A or B patients with inhibitors to FVIII or FIX and in patients with acquired hemophilia, as well as prevention of bleeding in surgical interventions or invasive procedures in hemophilia A or B patients with inhibitors to FVIII or FIX. In addition, factor VIIa can be utilized in treatment of bleeding episodes in patients with congenital Factor VII deficiency and prevention of bleeding in surgical interventions or invasive procedures in patients with congenital FVII deficiency. However, the intravenous administration of products containing FVIIa can lead to side effects including thrombotic events, fever, and injection site reactions. In addition, the short half-life of Factor VIIa of approximately 2 hours limits its application, and can require repeated injections every 2-4 hours to achieve hemostasis. Thus, there remains a need for factor IX and factor VIIa compositions with extended half-life and retention of activity when administered as part of a preventive and/or therapeutic
When regimen for hemophilia B, as well as formulations that reduce untreated, hemophilia B is associated with uncontrolled side effects and can be administered by both intravenous and bleeding into muscles, joints, and body cavities following 40 Subcutaneous routes. injury, and may result in death. Previously, treatments for the The coagulation factors for inclusion in the BPXTEN of disease included administration of FIX prepared from human the invention can include proteins of biologic, therapeutic, or plasma derived from donor pools, which carried attendant prophylactic interest or function that are useful for prevent risks of infection with blood-borne viruses including human ing, treating, mediating, or ameliorating blood coagulation immunodeficiency virus (HIV) and hepatitis C virus (HCV). 45 disorders, diseases, or deficiencies. Suitable coagulation pro More recently, recombinant FIX products have become com teins include biologically active polypeptides that are mercially available. The in vivo activity of exogenously Sup involved in the coagulation cascade as Substrates, enzymes or plied factor IX is limited both by protein half-life and inhibi co-factors. tors of coagulation, including antithrombin III. Factor IX Table 7 provides a non-limiting list of sequences of coagul compositions typically have short half-lives requiring fre 50 lation factors that are encompassed by the BPXTEN fusion quent injections. Also, current FIX-based therapeutics proteins of the invention. Coagulation factors for inclusion in require intravenous administration due to poor bioavailabil the BPXTEN of the invention can be a protein that exhibits at ity. 
Thus, there is a need for factor IX compositions with least about 80% sequence identity, or alternatively 81%, 82%, extended half-life and retention of activity when administered 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, as part of a preventive and/or therapeutic regimen for hemo 55 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence philia B. identity to a protein sequence selected from Tables 7. TABLE 7 Coagulation factor polypeptide sequences

BPXTEN Name | SEQ ID NO: | Sequence

FIX precursor | 1735 | MQRVNMIMAESPGLITICLLGYLLSAECTWFLDHENANKILNRPKRYNSGKLEEFW OGNLERECMEEKCSFEEAREWFENTERTTEFWKOYWDGDOCESNPCLNGGSCKD DINSYECWCPFGFEGKNCELDWTCNIKNGRCEOFCKNSADNKVVCSCTEGYRLAE NOKSCEPAVPFPCGRWSVSOTSKLTRAETVFPDVDYVNSTEAETILDNITOSTOSFN


Sequence 3 from Patent US 2008/0167219 | 1743 | YNSGKLEEFVOGNLERECMEEKCSFEEAREVFENTERTTEFWKOYWDGDOCESNP CLNGGSCKDDINSYECWCPFGFEGKNCELDATCNIKNGRCEOFCKNSADNKVVCS CTEGYRLAENOKSCEPAVPFPCGRWSVSOTSKLTRAETVFPDVDYWNSTEAETILD NITOSTOSFNDFTRVWGGEDAKPGOFPWOVWLNGKVDAFCGGSIWNEKWIVTAAH CVETGVKITVVAGEHNIEETEHTEOKRNVIRIIPHHNYNAAINKYNHDIALLELDAP LWLNSYWTPICIADKEYTNIFLKFGSGYVSGWGRVFHKGRSALVLOYLRVPLVDRA TCLRSTKFTIYNNMFCAGFHEGGRDSCOGDSGGPHVTEVEGTSFLTGIISWGEECA MKGKYGIYTKVSRYWNWIKEKTKLT

Sequence 4 from Patent US 2008/0167219 | 1744 | YNSGKLEEFVOGNLERECMEEKCSFEEAREVFENTERTTEFWKOYWDGDOCESNP CLNGGSCKDDINSYECWCPFGFEGKNCELDATCNIKNGRCEOFCKNSADNKVVCS CTEGYRLAENOKSCEPAVPFPCGRWSVSOTSKLTRAETVFPDVDYWNSTEAETILD NITOSTOSFNDFTRVWGGEDAKPGOFPWOVWLNGKVDAFCGGSIWNEKWIVTAAH CVETGVKITVVAGEHNIEETEHTEOKRNVIRIIPHHNYNAAINKYNHDIALLELDEP LWLNSYWTPICIADKEYTNIFLKFGSGYVSGWGRVFHKGRSALVLOYLRVPLVDRA TCLASTKFTIYNNMFCAGFHEGGRDSCOGDSGGPHVTEVEGTSFLTGIISWGEECA MKGKYGIYTKVSRYWNWIKEKTKLT

Sequence 5 from Patent US 2008/0167219 | 1745 | YNSGKLEEFVOGNLERECMEEKCSFEEAREVFENTERTTEFWKOYWDGDOCESNP CLNGGSCKDDINSYECWCPFGFEGKNCELDWTCNIKNGRCEOFCKNSADNKVVCS CTEGYRLAENOKSCEPAVPFPCGRWSVSOTSKLTRAETVFPDVDYWNSTEAETILD NITOSTOSFNDFTRVWGGEDAKPGOFPWOVWLNGKVDAFCGGSIWNEKWIVTAAH CVETGVKITVVAGEHNIEETEHTEOKRNVIRIIPHHNYNAAINKYNHDIALLELDAP LWLNSYWTPICIADKEYTNIFLKFGSGYVSGWGRVFHKGRSALVLOYLRVPLVDRA TCLASTKFTIYNNMFCAGFHEGGRDSCOGDSGGPHVTEVEGTSFLTGIISWGEECA MKGKYGIYTKVSRYWNWIKEKTKLT

Sequence 6 from Patent US 2008/0167219 | 1746 | YNSGKLEEFVOGNLERECMEEKCSFEEAREVFENTERTTEFWKOYWDGDOCESNP CLNGGSCKDDINSYECWCPFGFEGKNCELDATCNIKNGRCEOFCKNSADNKVVCS CTEGYRLAENOKSCEPAVPFPCGRWSVSOTSKLTRAETVFPDVDYWNSTEAETILD NITOSTOSFNDFTRVWGGEDAKPGOFPWOVWLNGKVDAFCGGSIWNEKWIVTAAH CVETGVKITVVAGEHNIEETEHTEOKRNVIRIIPHHNYNAAINKYNHDIALLELDAP LWLNSYWTPICIADKEYTNIFLKFGSGYVSGWGRVFHKGRSALVLOYLRVPLVDRA TCLASTKFTIYNNMFCAGFHEGGRDSCOGDSGGPHVTEVEGTSFLTGIISWGEECA MKGKYGIYTKVSRYWNWIKEKTKLT

Factor VII/VIIa | 1747 | ANAFLEELRPGSLERECKEEOCSFEEAREIFKDAERTKLFWISYSDGDOCASSPCON GGSCKDOLOSYICFCLPAFEGRNCETHKDDOLICVNENGGCEOYCSDHTGTKRSC RCHEGYSLLADGVSCTPTVEYPCGKIPILEKRNASKPOGRIWGGKVCPKGECPWOW LLLVNGAOLCGGTLINTIWWWSAAHCFDKIKNWRNLIAWLGEHDLSEHDGDEQSR RVAOVIIPSTYWPGTTNHDIALLRLHOPVVLTDHVVPLCLPERTFSERTLAFVRFSL WSGWGOLLDRGATALELMWLNVPRLMTODCLOOSRKVGDSPNITEYMFCAGYSD GSKDSCKGDSGGPHATHYRGTWYLTGIWSWGOGCATWGHFGVYTRWSOYIEWLO KLMRSEPRPGWLLRAPFP
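
The "at least about 80% sequence identity" criterion applied to the sequences of Table 7 can be illustrated with a small calculation. The following is a minimal sketch only, not the method used by the inventors: it assumes a simple global (Needleman-Wunsch) alignment with hypothetical scoring values (match +1, mismatch -1, gap -1) and defines percent identity as identical aligned positions divided by the alignment length. Other definitions and alignment tools give somewhat different numbers. The function names `global_align`, `percent_identity` and `meets_identity_threshold` are illustrative, and the two short fragments in the usage lines are taken from the Table 7 entries as printed above.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences (Needleman-Wunsch); scoring values are assumptions."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned residues as a percentage of the alignment length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(candidate, reference, threshold=80.0):
    """True if the candidate reaches the stated identity threshold to the reference."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    # Fragments of SEQ ID NO: 1743 and 1744 as printed in Table 7 (one substitution).
    print(percent_identity("ALLELDAP", "ALLELDEP"))
    print(meets_identity_threshold("ALLELDAP", "ALLELDEP", threshold=80.0))
```

In practice, full-length comparisons would normally be run with an established alignment program and a substitution matrix; the sketch only shows how a threshold such as 80% or 90% turns into a pass/fail test on a pair of sequences.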

“Factor IX” (“FIX”) includes the human Factor IX protein and species and sequence variants thereof having at least a portion of the biological receptor activity of mature Factor IX. FIX shall be any form of factor IX molecule with the typical characteristics of blood coagulation factor IX. FIX shall include FIX from plasma and any form of recombinant FIX which is capable of curing bleeding disorders in a patient; e.g., caused by deficiencies in FIX (e.g., hemophilia B). In some embodiments, the FIX peptide is a structural analog or peptide mimetic of any of the FIX peptides described herein, including the sequences of Table 7. Minor deletions, additions and/or substitutions of amino acids of the polypeptide sequence of FIX that do not abolish the biological activity of the polypeptide (i.e. reducing the activity to below 10% or even below 5% of the wild type form (=100%)) are also included in the present application as biologically active derivatives, especially those with improved specific activity (above 100% activity of the wild-type form). The FIX according to the present invention may be derived from any vertebrate, e.g. a mammal. In one specific example of the present invention, the FIX is human FIX. In another embodiment, the FIX is a polypeptide sequence from Table 7.

Human Factor IX (FIX) is encoded by a single-copy gene residing on the X-chromosome at q27.1. The human FIX mRNA is composed of 205 bases for the 5' untranslated region, 1383 bases for the prepro factor IX, a stop codon and 1392 bases for the 3' untranslated region. The FIX polypeptide is 55 kDa, synthesized as a prepropolypeptide chain composed of three regions: a signal peptide of 28 amino acids, a propeptide of 18 amino acids, which is required for gamma carboxylation of glutamic acid residues, and a mature factor IX of 415 amino acids. The mature factor IX is composed of domains. The domains, in an N- to C-terminus configuration are a Gla domain, an EGF1 domain, an EGF2 domain, an activation peptide domain, and a protease (or catalytic) domain. The protease domain provides, upon activation of FIX to FIXa, the catalytic activity of FIX. Following activation, the single-chain FIX becomes a 2-chain molecule, in which the two chains are linked by a disulfide bond attaching the enzyme to the Gla domain. Activated factor VIII (FVIIIa) is the specific cofactor for the full expression of FIXa activity. As used herein “factor IX” and “FIX” are intended to encompass polypeptides that comprise the domains Gla, EGF1, EGF2, activation peptide, and protease, or synonyms of these domains known in the art.

FIX is expressed as a precursor polypeptide that requires posttranslational processing to yield active FIX. In particular, the precursor polypeptide of FIX requires gamma carboxylation of certain glutamic acid residues in the so-called gamma-carboxyglutamate domain and cleavage of propeptide. The propeptide is an 18-amino acid residue sequence N-terminal to the gamma-carboxyglutamate domain. The propeptide binds vitamin K-dependent gamma carboxylase and then is cleaved from the precursor polypeptide of FIX by an endogenous protease, most likely PACE (paired basic amino acid cleaving enzyme), also known as furin or PCSK3. Without the gamma carboxylation, the Gla domain is unable to bind calcium to assume the correct conformation necessary to anchor the protein to negatively charged phospholipid surfaces, thereby rendering Factor IX nonfunctional. Even if it is carboxylated, the Gla domain also depends on cleavage of the propeptide for proper function, since retained propeptide interferes with conformational changes of the Gla domain necessary for optimal binding to calcium and phospholipid. The resulting mature Factor IX is a single chain protein of 415 amino acid residues that contains approximately 17% carbohydrate by weight (Schmidt, A. E., et al. (2003) Trends Cardiovasc Med, 13:39).

Mature FIX must be activated by activated Factor XI to yield Factor IXa. In the intrinsic pathway of the coagulation cascade, FIX associates with a complex of activated Factor VIII, Factor X, calcium, and phospholipid. In the complex, FIX is activated by Factor XIa. The activation of Factor IX is achieved by a two-step removal of the activation peptide (Ala 146-Arg 180) from the molecule (Bajaj et al., “Human factor IX and factor IXa,” in METHODS IN ENZYMOLOGY. 1993). The first cleavage is made at the Arg 145-Ala 146 site by either Factor XIa or Factor VIIa/tissue factor. The second, and rate limiting, cleavage is made at Arg 180-Val 181. The activation removes 35 residues. Activated human Factor IX exists as a disulfide-linked heterodimer of the heavy chain and light chain. Factor IXa in turn activates Factor X in concert with activated Factor VIII. Alternatively, Factors IX and X can both be activated by Factor VIIa complexed with lipidated Tissue Factor, generated via the extrinsic pathway. Factor Xa then participates in the final common pathway whereby prothrombin is converted to thrombin, and thrombin, in turn, converts fibrinogen to fibrin to form the clot.

In some cases, the coagulation factor is Factor IX, a sequence variant of Factor IX, or a Factor IX moiety, such as the exemplary sequences of Table 7, as well as any protein or polypeptide substantially homologous thereto whose biological properties result in the activity of Factor IX. As used herein, the term “Factor IX moiety” includes proteins modified deliberately, as for example, by site directed mutagenesis or accidentally through mutations, that result in a factor IX sequence that retains at least some factor IX activity. The term “Factor IX moiety” also includes derivatives having at least one additional amino acid at the N- or carboxy terminal ends of the protein or internal to the Factor IX moiety sequence. Non-limiting examples of Factor IX moieties include the following: Factor IX; Factor IXa; truncated versions of Factor IX; hybrid proteins, and peptide mimetics having Factor IX activity. Biologically active fragments, deletion variants, substitution variants or addition variants of any of the foregoing that maintain at least some degree of Factor IX activity or the potential for activation can also serve as a Factor IX sequence.

“Factor VII” (FVII) means the human protein, and species and sequence variants thereof having at least a portion of the biological activity of activated Factor VII. Factor VII and recombinant human FVIIa have been introduced for use in uncontrollable bleeding in hemophilia patients (with Factor VIII or IX deficiency) who have developed inhibitors against replacement coagulation factor. Factor VII can be activated by thrombin, factor IXa, factor Xa or factor XIIa to FVIIa. FVII is converted to its active form Factor VIIa by proteolysis of the single peptide bond at Arg152-Ile153 leading to the formation of two polypeptide chains, an N-terminal light chain (24 kDa) and a C-terminal heavy chain (28 kDa), which are held together by one disulfide bridge. In contrast to other vitamin K-dependent coagulation factors, no activation peptide, which is cleaved off during activation of these other vitamin K-dependent coagulation factors, has been described for FVII. The Arg152-Ile153 cleavage site and some amino acids downstream show homology to the activation cleavage site of other vitamin K-dependent polypeptides. Recombinant human factor VIIa has utility in treatment of uncontrollable bleeding in hemophilia patients (with Factor VIII or IX deficiency), including those who have developed inhibitors against replacement coagulation factor. FVII shall be any form of factor VII molecule with the typical characteristics of blood coagulation factor VII. FVII shall include FVII from plasma and any form of recombinant FVII which is capable of ameliorating bleeding disorders in a patient; e.g., caused by deficiencies in FVII/FVIIa. In some embodiments, the FVII peptide is the activated form (FVIIa), a structural analog or peptide mimetic of any of the FVII peptides described herein, including sequences of Table 7. Factor VII and VIIa have been cloned, as described in U.S. Pat. No. 6,806,063 and US Patent Application No. 20080261886.

In one aspect, the invention provides monomeric BPXTEN fusion proteins of FIX comprising the full-length sequence, or active fragments, or sequence variants, or any biologically active derivative of FIX, including the FIX sequences of Table 7, covalently linked to an XTEN, to create a chimeric molecule. In some cases, the fusion proteins comprising a coagulation factor such as FIX further comprise one or more proteolytic cleavage site sequences. In another embodiment, the fusion protein comprises a first and a second, different cleavage sequence. In an embodiment of the foregoing, the cleavage sequence(s) are selected from Table 10. In those cases where the presence of the XTEN inhibits activation of FIX to the activated form of FIX (hereinafter “FIXa”) (e.g., wherein the XTEN is attached to the C-terminus of FIX), the one or more proteolytic cleavage sites permit the release of the XTEN sequence when acted on by a protease, as described more fully below. In a feature of the foregoing, the intact FIX-XTEN composition serves as a pro-drug, and the release of the XTEN from the FIX-XTEN molecule by proteolysis permits the released FIX sequence to be converted to FIXa. In one embodiment, the one or more cleavage sequences can be a sequence having at least about 80% sequence identity to a sequence from Table 10, or at least about 85%, or at least about 90%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, or 100% sequence identity to a sequence from Table 10.

FIX-XTEN fusion proteins can be designed in different configurations. In one embodiment, as illustrated in FIG. 37A, the fusion proteins comprise components in the following order (N- to C-terminus): FIX; and XTEN, wherein the XTEN may comprise a cleavage sequence internal to the XTEN sequence. In another embodiment, as illustrated in FIG. 37B, the fusion proteins comprise components in the following order (N- to C-terminus): FIX; cleavage sequence; and XTEN. In some cases, as illustrated in FIGS. 37C and 37D, the fusion proteins comprise components wherein the XTEN is located internal to the FIX polypeptide, inserted between FIX domains. In one embodiment, the N-terminus of the XTEN is linked to the C-terminus of the FIX Gla domain and the C-terminus of the XTEN is linked to the N-terminus of the EGF1 domain of FIX, resulting in an FIX-XTEN where no additional cleavage sequences are introduced, as shown in FIG. 37C. In another embodiment, the N-terminus of the XTEN is linked to the C-terminus of the FIX EGF1 domain and the C-terminus of the XTEN is linked to the N-terminus of the EGF2 domain of FIX, resulting in an FIX-XTEN where no additional cleavage sequences are introduced, as shown in FIG. 37C. In another embodiment, wherein the factor IX sequence comprises an activation peptide domain comprising a PDVDYVNSTEAETILDNITQSTQSFNDF (SEQ ID NO: 1748) sequence, the N- and C-termini of the XTEN sequence can be inserted between and linked to any two contiguous or any two discontiguous amino acids of the activation peptide domain sequence, and the XTEN optionally comprises both a first and a second cleavage sequence, which may be identical or different. In another embodiment of the foregoing, the XTEN can be inserted between the contiguous T and I amino acids of the foregoing sequence wherein the N-terminus of the XTEN is linked to the C-terminus of the T amino acid and the C-terminus of the XTEN is linked to the N-terminus of the I amino acid.

In another embodiment, as illustrated in FIG. 37E, the fusion proteins comprise two molecules of XTEN: a first located between two domains of FIX as described above, and a second XTEN wherein the N-terminus of the XTEN is linked to the C-terminus of FIX or wherein the N-terminus of the second XTEN is linked to the C-terminus of a cleavage sequence that is linked to the C-terminus of FIX.

In other cases, as illustrated in FIG. 37F, the FIX-XTEN fusion proteins comprise components wherein the XTEN is located within a sequence of a FIX domain, inserted as a part of an existing structural loop in the domain or creating a loop external to the domain structure. In one embodiment, wherein the FIX comprises an EGF2 domain wherein the EGF2 domain comprises a KNSADNK (SEQ ID NO: 1749) loop, the XTEN sequence can be inserted between the S and A amino acids of the loop sequence wherein the N-terminus of the XTEN is linked to the C-terminus of the S and the C-terminus of the XTEN is linked to the N-terminus of the A, resulting in a loop sequence with the XTEN polypeptide extending outside the globular coagulation protein, and wherein the XTEN can (optionally) comprise both a first and a second cleavage sequence, which may be identical or different. In another embodiment, the XTEN can be inserted between any two contiguous or discontiguous amino acids of the KNSADNK loop sequence. In another embodiment, wherein the FIX comprises an EGF2 domain wherein the EGF2 domain comprises a LAEN loop, the XTEN sequence can be inserted between the contiguous A and E amino acids of the LAEN loop, wherein the N-terminus of the XTEN is linked to the C-terminus of A and the C-terminus of the XTEN is linked to the N-terminus of the E, resulting in a loop sequence LA-XTEN-EN, and the XTEN can (optionally) comprise both a first and a second cleavage sequence, which may be identical or different. In another embodiment, wherein the FIX comprises a Gla domain, the XTEN can be inserted and linked between two contiguous amino acids of the Gla sequence wherein the XTEN forms a loop structure, wherein the loop structure is substantially external to the Gla structure, and the XTEN can (optionally) comprise both a first and a second cleavage sequence, which may be identical or different. In another embodiment, wherein the FIX comprises an EGF1 domain, the XTEN can be inserted and linked between two contiguous amino acids of the EGF1 sequence wherein the XTEN forms a loop structure, wherein the loop structure is substantially external to the EGF1 structure, and the XTEN can (optionally) comprise both a first and a second cleavage sequence, which may be identical or different.

In another embodiment, as illustrated in FIG. 37G, the fusion proteins comprise components in the order FIX; and XTEN, wherein the XTEN comprises multiple cleavage sequences near the N-terminus of the XTEN, preferably within the first 144 amino acid residues of the XTEN, more preferably within about the first 80 amino acids, more preferably within about the first 42 amino acids, more preferably within about the first 18 amino acids, and even more preferably within the first 12 amino acids of the N-terminus of the XTEN sequence.

In other embodiments, the FIX-XTEN can exist in the configuration (N- to C-terminal) XTEN-FIX, alternatively XTEN-FIX-XTEN, alternatively XTEN-CS-FIX, alternatively FIX-XTEN-FIX, alternatively FIX-CS-XTEN-CS-XTEN, or multimers of the foregoing.

In one embodiment, the FIX-XTEN is configured such that the FIX of the FIX-XTEN composition can be activated to FIXa by a coagulation protease without the release of some or all of an XTEN. In another embodiment, wherein FIX-XTEN comprises at least two XTEN, the FIX-XTEN is configured such that the FIX of the FIX-XTEN composition can be activated to FIXa by a coagulation protease without the release of one of the XTENs. In another embodiment, wherein the XTEN is to be released from the FIX-XTEN either prior to activation of the FIX to FIXa, or concomitant with the activation of the FIX to FIXa, the cleavage sequences are located sufficiently close to the ends of the XTEN linked to any portion of the FIX such that any remaining XTEN or cleavage sequence residues do not appreciably interfere with the activation of FIX, yet provide sufficient access to the protease to effect cleavage of the corresponding cleavage sequence. In one embodiment, wherein an XTEN is linked to the C-terminus of the FIX (as described above), the one or more cleavage sites can be located within about the first 100 amino acids of the N-terminus of the XTEN, more preferably within the first 80 amino acids, more preferably within the first 54 amino acids, more preferably within the first 42 amino acids, more preferably within the first 30 amino acids, more preferably within the first 18 amino acids, and most preferably within the first 6 amino acids of the N-terminus of the XTEN. In another embodiment, wherein an XTEN is linked internal to the FIX sequence (as described above), either between two FIX domains or within an external loop of a FIX domain or internal to a domain sequence, the XTEN can comprise two or more cleavage sequences in which at least one cleavage site can be located within about the first 100 amino acids of the N-terminus of the XTEN, within 80, within 54, within 42, within 30, within 18, or within 6 amino acids of the N-terminus and the XTEN can comprise at least a second cleavage site located within the last 100 amino acids of the C-terminus of the XTEN, within 80, within 54, within 42, within 30, within 18, or within 6 amino acids of the C-terminus of the XTEN.

In some cases, protease cleavage of the fusion protein releases the XTEN from the FIX sequence in a subject such that the FIX sequence can subsequently be activated. In other cases, protease cleavage of the fusion protein is a result of action of the proteases of the coagulation cascade, such that the XTEN is released concurrently with the processing and activation of FIX to FIXa. Thus, in a particular feature of some embodiments, the XTEN and associated cleavage site(s) of the FIX-XTEN fusion protein can be located such that the FIX sequence cannot be activated until XTEN is released from the fusion protein by proteolytic cleavage. In such embodiments, the FIX-XTEN can be used as a prodrug that can be activated once it is administered to the subject. In one embodiment, the released FIX polypeptides can contain all or part of an XTEN, particularly when the FIX-XTEN comprises two XTEN sequences. Alternatively, the released FIX polypeptides can be free from the XTEN or substantially all of the XTEN.

In other embodiments, the cleavage site is used during the synthesis procedure of the fusion protein. For example, the fusion protein can further contain an affinity tag for isolation of the fusion protein after recombinant production. A protease that recognizes the cleavage site in the fusion protein can remove the tag, while leaving a final product that is FIX linked to the XTEN.

Thus, the fusion protein can have a cleavage site, which can be cleaved before injection, after injection (in the blood circulation or tissues by proteases) and can be located such that the XTEN stays with the therapeutic product until it is released by an endogenous protease, either permitting the therapeutic product to be activated by the coagulation cascade, or the XTEN is released by a protease of the coagulation cascade.

In some embodiments, a fusion protein of FIX is converted from an inactive protein to an active protein (e.g., FIXa) by a site-specific protease, either in the circulation or within a body tissue or cavity. This cleavage may trigger an increase in potency of the pharmaceutically active domain (pro-drug activation) or it may enhance binding of the cleavage product to its target. So, for example, FIX-XTEN fusion proteins can be cleaved in the blood of a subject and the FIX sequence can become activated by the coagulation cascade or by another protease disclosed herein. The active form of the FIX fusion protein may or may not contain at least a fragment of XTEN.

In a feature of the FIX-XTEN pro-drug embodiments, a higher dosage of the FIX-XTEN composition may be administered, compared to conventional FIX therapeutics, because the release of the FIX capable of being activated can be controlled by the selection of cleavage sequences, or varying the numbers of cleavage sites required to be cleaved before a form of FIX is released that can be activated. As those with skill in the art will appreciate, the sequence of any of the cleavage sites disclosed herein can be modified by introducing amino acid variations into the cleavage sequences in order to modify the kinetics of proteolytic cleavage, thereby affecting the kinetics of FIX release and subsequent activation.

The invention provides BPXTEN fusion proteins comprising a coagulation protein and XTEN. In some cases, the BPXTEN comprises a coagulation protein that exhibits at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a coagulation protein selected from Table 7. In one embodiment of the foregoing, the BPXTEN further comprises an XTEN sequence with at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to an XTEN selected from Table 2. In another embodiment, the BPXTEN comprises a sequence with at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence selected from Table 43. The invention also contemplates substitution of any FIX sequence from Table 7, any XTEN from Table 2, and any cleavage sequence from Table 10 for the respective components of Table 43, or sequences with at least 90% sequence identity to the foregoing.

In another embodiment, the BPXTEN coagulation protein is the FVII sequence of Table 7 and the XTEN is selected from AE864 and AM875. In one embodiment of the foregoing, the BPXTEN comprises, in an N- to C-terminus configuration, FVII-AE864. In another embodiment, the BPXTEN is configured, N- to C-terminus, as FVII-AM875. In another embodiment, the configured BPXTEN comprises a FVII that is at least about 70%, or 80%, or 90%, or at least about 95% or greater in the activated FVIIa form. In another embodiment, the BPXTEN comprising FVII and an XTEN can further comprise a cleavage sequence, which may include a sequence selected from Table 10.

(d) Growth Hormone Proteins

“Growth Hormone” or “GH” means the human growth hormone protein and species and sequence variants thereof, and includes, but is not limited to, the 191 single-chain amino acid human sequence of GH. Thus, GH can be the native, full-length protein or can be a truncated fragment or a sequence variant that retains at least a portion of the biological activity of the native protein. Effects of GH on the tissues of the body can generally be described as anabolic. Like most other protein hormones, GH acts by interacting with a specific plasma membrane receptor, referred to as growth hormone receptor. There are two known types of human GH (hereinafter “hGH”) derived from the pituitary gland: one having a molecular weight of about 22,000 daltons (22 kD hGH) and the other having a molecular weight of about 20,000 daltons (20 kD hGH). The 20 kD hGH has an amino acid sequence that corresponds to that of 22 kD hGH consisting of 191 amino acids except that 15 amino acid residues from the 32nd to the 46th of 22 kD hGH are missing. Some reports have shown that the 20 kD hGH has been found to exhibit lower risks and higher activity than 22 kD hGH. The invention also contemplates use of the 20 kD hGH as being appropriate for use as a biologically active polypeptide for BPXTEN compositions herein.

The invention contemplates inclusion in the BPXTEN of any GH homologous sequences, sequence fragments that are natural, such as from primates, mammals (including domestic animals), and non-natural sequence variants which retain at least a portion of the biologic activity or biological function of GH and/or that are useful for preventing, treating, mediating, or ameliorating a GH-related disease, deficiency, disorder or condition. Non-mammalian GH sequences are well-described in the literature. For example, a sequence alignment of GHs can be found in Genetics and Molecular Biology 2003 26 p. 295-300. An analysis of the evolution of avian GH sequences is presented in Journal of Evolutionary Biology 2006 19 p. 844-854. In addition, native sequences homologous to human GH may be found by standard homology searching techniques, such as NCBI BLAST.

In one embodiment, the GH incorporated into the subject compositions can be a recombinant polypeptide with a

TABLE 8-continued
Growth hormone amino acid sequences from animal species

Species            GH Amino Acid Sequence                                     SEQ ID NO.
Monkey (rhesus)    FPTIPLSRLFDNAMLRAHRLHQLAFDTYQEFEEAYIPKEQKYSFLQNPQTSLCF     1777
                   SESIPTPSNREETQQKSNLELLRISLLLIQSWLEPVQFLRSWFANSLWYGTSYSD
                   WYDLLKDLEEGIQTLMGRLEDGSSRTGQIFKQTYSKFDTNSHNNDALLKNYG
                   LLYCFRKDMDKIETFLRIVQCR-SVEGSCGF
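
The relationship between the two hGH forms described in the GH discussion (a 191-residue 22 kD form, and a 20 kD form lacking residues 32 through 46) is simple arithmetic on residue positions. The sketch below is illustrative only: `delete_residues` is a hypothetical helper, and `placeholder_22kd` is an arbitrary 191-character stand-in, not the actual hGH sequence.

```python
def delete_residues(sequence: str, first: int, last: int) -> str:
    """Return `sequence` with residues first..last removed (1-based, inclusive)."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("residue range falls outside the sequence")
    return sequence[:first - 1] + sequence[last:]


# Arbitrary 191-residue placeholder; an assumption for illustration,
# NOT the real 22 kD hGH sequence.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
placeholder_22kd = "".join(ALPHABET[i % len(ALPHABET)] for i in range(191))

# Remove the 15 residues from position 32 to position 46, as the text describes.
variant_20kd = delete_residues(placeholder_22kd, 32, 46)

print(len(placeholder_22kd), len(variant_20kd))
```

Under these assumptions the shorter form has 191 - 15 = 176 residues; residues 1 to 31 and 47 to 191 of the longer form are carried over unchanged.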

The BP of the subject compositions are not limited to native, full-length polypeptides, but also include recombinant versions as well as biologically and/or pharmacologically active variants or fragments thereof. For example, it will be appreciated that various amino acid substitutions can be made in the GP to create variants without departing from the spirit of the invention with respect to the biological activity or pharmacologic properties of the BP. Examples of conservative substitutions for amino acids in polypeptide sequences are shown in Table 9. However, in embodiments of the BPXTEN in which the sequence identity of the BP is less than 100% compared to a specific sequence disclosed herein, the invention contemplates substitution of any of the other 19 natural L-amino acids for a given amino acid residue of the given BP, which may be at any position within the sequence of the BP, including adjacent amino acid residues. If any one substitution results in an undesirable change in biological activity, then one of the alternative amino acids can be employed and the construct evaluated by the methods described herein, or using any of the techniques and guidelines for conservative and non-conservative mutations set forth, for instance, in U.S. Pat. No. 5,364,934, the contents of which is incorporated by reference in its entirety, or using methods generally known to those of skill in the art. In addition, variants can also include, for instance, polypeptides wherein one or more amino acid residues are added or deleted at the N- or C-terminus of the full-length native amino acid sequence of a BP that retains at least a portion of the biological activity of the native peptide.

TABLE 9
Exemplary conservative amino acid substitutions

Original Residue    Exemplary Substitutions
Ala (A)             val; leu; ile
Arg (R)             lys; gln; asn
Asn (N)             gln; his; lys; arg
Asp (D)             glu
Cys (C)             ser
Gln (Q)             asn
Glu (E)             asp
Gly (G)             pro
His (H)             asn; gln; lys; arg
Ile (I)             leu; val; met; ala; phe; norleucine
Leu (L)             norleucine; ile; val; met; ala; phe
Lys (K)             arg; gln; asn
Met (M)             leu; phe; ile
Phe (F)             leu; val; ile; ala
Pro (P)             gly
Ser (S)             thr
Thr (T)             ser
Trp (W)             tyr
Tyr (Y)             trp; phe; thr; ser
Val (V)             ile; leu; met; phe; ala; norleucine

IV). BPXTEN Structural Configurations and Properties

(a) BPXTEN Fusion Protein Configurations

The invention provides BPXTEN fusion protein compositions comprising BP linked to one or more XTEN polypeptides useful for preventing, treating, mediating, or ameliorating a disease, disorder or condition related to glucose homeostasis, insulin resistance, or obesity. In some cases, the BPXTEN is a monomeric fusion protein with a BP linked to one or more XTEN polypeptides. In other cases, the BPXTEN composition can include two BP molecules linked to one or more XTEN polypeptides. The invention contemplates BPXTEN comprising, but not limited to BP selected from Tables 3-8 (or fragments or sequence variants thereof), and XTEN selected from Table 2 or sequence variants thereof. In some cases, at least a portion of the biological activity of the respective BP is retained by the intact BPXTEN. In other cases, the BP component either becomes biologically active or has an increase in activity upon its release from the XTEN by cleavage of an optional cleavage sequence incorporated within spacer sequences into the BPXTEN, described more fully below.

In one embodiment of the BPXTEN composition, the invention provides a fusion protein of formula I:

(BP)-(S)_x-(XTEN)    I

wherein independently for each occurrence, BP is a biologically active protein as described hereinabove; S is a spacer sequence having between 1 to about 50 amino acid residues that can optionally include a cleavage sequence (as described more fully below); x is either 0 or 1; and XTEN is an extended recombinant polypeptide as described hereinabove. The embodiment has particular utility where the BP requires a free N-terminus for desired biological activity, or where linking of the C-terminus of the BP to the fusion protein reduces biological activity and it is desired to reduce the biological activity and/or side effects of the administered BPXTEN.

In another embodiment of the BPXTEN composition, the invention provides a fusion protein of formula II (components as described above):

(XTEN)-(S)_x-(BP)    II

wherein independently for each occurrence, BP is a biologically active protein as described hereinabove; S is a spacer sequence having between 1 to about 50 amino acid residues that can optionally include a cleavage sequence (as described more fully below); x is either 0 or 1; and XTEN is an extended recombinant polypeptide as described hereinabove. The embodiment has particular utility where the BP requires a free C-terminus for desired biological activity, or where linking of the N-terminus of the BP to the fusion protein reduces biological activity and it is desired to reduce the biological activity and/or side effects of the administered BPXTEN.

Thus, the BPXTEN having a single BP and a single XTEN can have at least the following permutations of configurations, each listed in an N- to C-terminus orientation: BP-XTEN; XTEN-BP; BP-S-XTEN; or XTEN-S-BP.

In another embodiment, the invention provides an isolated fusion protein, wherein the fusion protein is of formula III:

(BP)-(S)_x-(XTEN)-(S)_y-(BP)-(S)_z-(XTEN)_z    III

wherein independently for each occurrence, BP is a biologically active protein as described hereinabove; S is a spacer sequence having between 1 to about 50 amino acid residues that can optionally include a cleavage sequence (as described more fully below); x is either 0 or 1; y is either 0 or 1; z is either 0 or 1; and XTEN is an extended recombinant polypeptide as described hereinabove.

In another embodiment, the invention provides an isolated fusion protein, wherein the fusion protein is of formula IV (components as described above):

[formula IV not legible in the source]

In another embodiment, the invention provides an isolated fusion protein, wherein the fusion protein is of formula V (components as described above):

(BP)-(S)_x-(BP)-(S)_y-(XTEN)    V

In another embodiment, the invention provides an isolated fusion protein, wherein the fusion protein is of formula VI (components as described above):

[formula VI not legible in the source]

In another embodiment, the invention provides an isolated fusion protein, wherein the fusion protein is of formula VII (components as described above):

(XTEN)-(S)_x-(BP)-(S)_y-(BP)-(XTEN)    VII

In the foregoing embodiments of fusion proteins of formulas I-VII, administration of a therapeutically effective dose of a fusion protein of an embodiment to a subject in need thereof can result in a gain in time of at least two-fold, or at least three-fold, or at least four-fold, or at least five-fold or more spent within a therapeutic window for the fusion protein compared to the corresponding BP not linked to the XTEN and administered at a comparable dose to a subject.

Any spacer sequence group is optional in the fusion proteins encompassed by the invention. The spacer may be provided to enhance expression of the fusion protein from a host cell or to decrease steric hindrance such that the BP component may assume its desired tertiary structure and/or interact appropriately with its target molecule. For spacers and methods of identifying desirable spacers, see, for example, George, et al. (2003) Protein Engineering 15:871-879, specifically incorporated by reference herein. In one embodiment, the spacer comprises one or more peptide sequences that are between 1-50 amino acid residues in length, or about 1-25 residues, or about 1-10 residues in length. Spacer sequences, exclusive of cleavage sites, can comprise any of the 20 natural L amino acids, and will preferably comprise hydrophilic amino acids that are sterically unhindered that can include, but not be limited to, glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P). In some cases, the spacer can be polyglycines or polyalanines, or is predominately a mixture of combinations of glycine and alanine residues. The spacer polypeptide exclusive of a cleavage sequence is largely to substantially devoid of secondary structure. In one embodiment, one or both spacer sequences in a BPXTEN fusion protein composition may each further contain a cleavage sequence, which may be identical or may be different, wherein the cleavage sequence may be acted on by a protease to release the BP from the fusion protein.

In some cases, the incorporation of the cleavage sequence into the BPXTEN is designed to permit release of a BP that becomes active or more active upon its release from the XTEN. The cleavage sequences are located sufficiently close to the BP sequences, generally within 18, or within 12, or within 6, or within 2 amino acids of the BP sequence terminus, such that any remaining residues attached to the BP after cleavage do not appreciably interfere with the activity (e.g., such as binding to a receptor) of the BP, yet provide sufficient access to the protease to be able to effect cleavage of the cleavage sequence. In some embodiments, the cleavage site is a sequence that can be cleaved by a protease endogenous to the mammalian subject such that the BPXTEN can be cleaved after administration to a subject. In such cases, the BPXTEN can serve as a prodrug or a circulating depot for the BP. Examples of cleavage sites contemplated by the invention include, but are not limited to, a polypeptide sequence cleavable by a mammalian endogenous protease selected from FXIa, FXIIa, kallikrein, FVIIa, FIXa, FXa, FIIa (thrombin), Elastase-2, granzyme B, MMP-12, MMP-13, MMP-17 or MMP-20, or by non-mammalian proteases such as TEV, enterokinase, PreScission™ protease (rhinovirus 3C protease), and Sortase A. Sequences known to be cleaved by the foregoing proteases are known in the art. Exemplary cleavage sequences and cut sites within the sequences are presented in Table 10, as well as sequence variants. For example, thrombin (activated clotting factor II) acts on the sequence LTPRSLLV (SEQ ID NO: 222) [Rawlings N. D., et al. (2008) Nucleic Acids Res., 36: D320], which would be cut after the arginine at position 4 in the sequence. Active FIIa is produced by cleavage of FII by FXa in the presence of phospholipids and calcium and is downstream from factor IX in the coagulation pathway. Once activated, its natural role in coagulation is to cleave fibrinogen, which then in turn begins clot formation. FIIa activity is tightly controlled and only occurs when coagulation is necessary for proper hemostasis. However, as coagulation is an on-going process in mammals, by incorporation of the LTPRSLLV (SEQ ID NO: 223) sequence into the BPXTEN between the BP and the XTEN, the XTEN domain would be removed from the adjoining BP concurrent with activation of either the extrinsic or intrinsic coagulation pathways when coagulation is required physiologically, thereby releasing BP over time. Similarly, incorporation of other sequences into BPXTEN that are acted upon by endogenous proteases would provide for sustained release of BP that may, in certain cases, provide a higher degree of activity for the BP from the “prodrug” form of the BPXTEN.

In some cases, only the two or three amino acids flanking both sides of the cut site (four to six amino acids total) would be incorporated into the cleavage sequence. In other cases, the known cleavage sequence can have one or more deletions or insertions or one or two or three amino acid substitutions for any one or two or three amino acids in the known sequence, wherein the deletions, insertions or substitutions result in reduced or enhanced susceptibility but not an absence of susceptibility to the protease, resulting in an ability to tailor the rate of release of the BP from the XTEN. Exemplary substitutions are shown in Table 10.
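
The cleavage behavior described above can be pictured with plain string operations. The following is a minimal sketch, not a model of protease kinetics: it assembles a BP-S-XTEN arrangement in which the spacer is the thrombin sequence LTPRSLLV, cuts after the arginine at position 4 as the text states, and extracts the two- or three-residue flanking windows. `TOY_BP` and `TOY_XTEN` are made-up placeholder strings, and all function names are hypothetical.

```python
from typing import Optional, Tuple

# From the text: thrombin acts on LTPRSLLV, cutting after the arginine at position 4.
THROMBIN_SITE = "LTPRSLLV"
CUT_AFTER = 4

# Hypothetical toy sequences for illustration only (not real BP or XTEN sequences).
TOY_BP = "MKWVTFISAA"
TOY_XTEN = "GSPAGSPTSTEEGTSESATPESGP"


def build_bp_s_xten(bp: str, spacer: str, xten: str) -> str:
    """Assemble a BP-S-XTEN string in N- to C-terminus order."""
    return bp + spacer + xten


def cleave(fusion: str, site: str = THROMBIN_SITE,
           cut_after: int = CUT_AFTER) -> Optional[Tuple[str, str]]:
    """Split the fusion at the first occurrence of `site`; return None if absent."""
    start = fusion.find(site)
    if start < 0:
        return None
    cut = start + cut_after
    return fusion[:cut], fusion[cut:]


def flanking_window(site: str, cut_after: int, flank: int) -> str:
    """Keep only `flank` residues on each side of the cut position."""
    if flank < 1 or flank > cut_after or cut_after + flank > len(site):
        raise ValueError("flank does not fit inside the site")
    return site[cut_after - flank:cut_after + flank]


fusion = build_bp_s_xten(TOY_BP, THROMBIN_SITE, TOY_XTEN)
released, remainder = cleave(fusion)
residual = released[len(TOY_BP):]  # spacer residues left attached to the BP

print(released, remainder, residual)
print(flanking_window(THROMBIN_SITE, CUT_AFTER, 2))
print(flanking_window(THROMBIN_SITE, CUT_AFTER, 3))
```

In this toy arrangement the released fragment carries four residual spacer residues (LTPR) on the C-terminus of the BP, which falls inside the "within 6" amino acid proximity range mentioned in the text, and the shortened windows contain four and six residues respectively.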