
(12) United States Patent
Skerra et al.
(10) Patent No.: US 9,221,882 B2
(45) Date of Patent: Dec. 29, 2015

(54) BIOSYNTHETIC PROLINE/ALANINE RANDOM COIL POLYPEPTIDES AND THEIR USES

(75) Inventors: Arne Skerra, Freising (DE); Uli Binder, Freising (DE); Martin Schlapschy, Freising (DE)

(73) Assignees: Technische Universität München, Munich (DE); XL- GmbH, Freising (DE)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 13/697,569

(22) PCT Filed: May 20, 2011

(86) PCT No.: PCT/EP2011/058307
§ 371 (c)(1), (2), (4) Date: Nov. 13, 2012

(87) PCT Pub. No.: WO2011/144756
PCT Pub. Date: Nov. 24, 2011

(65) Prior Publication Data
US 2013/0072420 A1 Mar. 21, 2013

Related U.S. Application Data
(60) Provisional application No. 61/428,016, filed on Dec. 29, 2010.

(30) Foreign Application Priority Data
May 21, 2010 (EP) ...... 10163564

(51) Int. Cl.
C07K 14/00 (2006.01)
C07K 7/08 (2006.01)

(52) U.S. Cl.
CPC: C07K 14/00 (2013.01); C07K 7/08 (2013.01); C07K 14/001 (2013.01)

(58) Field of Classification Search
CPC ...... C07K 7/08; C07K 14/00; C07K 14/001; C07K 17/00; C07K 147/08
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
5,677,172 A 10/1997 Makarow
2003/0190740 A1 10/2003 Altman
2006/0252120 A1* 11/2006 Kieliszewski ...... 435/69.1
2009/0192087 A1 7/2009 Glass et al.
2010/0292130 A1 11/2010 Skerra

FOREIGN PATENT DOCUMENTS
EP 1739167 A1 1/2007
EP 2173890 B1 3/2011
JP 2007-529522 10/2007
WO WO 01/78503 A2 10/2001
WO WO 02/02597 A2 1/2002
WO WO 2005/089796 A1 9/2005
WO WO 2007/103515 A2 9/2007
WO WO 2008/145400 A2 12/2008
WO WO 2010/091122 A1 8/2010

OTHER PUBLICATIONS
Rudinger, Peptide Hormones, JA Parsons, Ed., 1976, pp. 1-7.*
SIGMA, 2004, pp. 1-2.*
Berendsen, A Glimpse of the Holy Grail?, Science, 1998, 282, pp. 642-643.
Ngo et al., Computational Complexity, Protein Structure Prediction, and the Levinthal Paradox, 1994, pp. 491-494.*
Bradley et al., Limits of Cooperativity in a Structurally Modular Protein: Response of the Notch Ankyrin Domain to Analogous Alanine Substitutions in Each Repeat, J. Mol. Biol. (2002) 324, 373-386.
Voet et al., Biochemistry, John Wiley & Sons Inc., 1995, pp. 235-241.*
UniProt Protein Database, Very large tegument protein, UL36, protein accession Q7T5D9, accessed on Nov. 25, 2014.*
International Search Report received in the parent application PCT/EP2011/058307, dated Oct. 7, 2011.
" Therapeutics Inc.'s Polyglutamate (PG) Technology Highlighted at International Polymer Therapeutics Meeting; Novel Recombinant Technology Extends PG Platform to G-CSF," PR Newswire, 2002, pp. 1-2, http://www.cticseattle.com/.
Affranchino, J.L., et al., "Identification of a Trypanosoma cruzi antigen that is shed during the acute phase of Chagas' disease," Molecular and Biochemical Parasitology, 1989, pp. 221-228, vol. 34.
Alvarez, et al., Improving protein pharmacokinetics by genetic fusion to simple sequences. J Biol Chem, 2004, 279, 3375-81.
Axelsson, et al., Studies of the release and turnover of a human neutrophil lipocalin. Scand J Clin Lab Invest, 1995, 55, 577-88.
Bamford, et al., Synthetic polypeptides, preparation, structure, and properties, Academic Press Inc., 1956.
Benito, et al., "A Polymorphic tandem repeat potentially useful for typing in the chromosome of Yersinia enterocolitica," Microbiology, 2004, vol. 150, pp. 199-204.
Binz, et al., Engineering novel binding from nonimmunoglobulin domains. Nat Biotechnol, 2005, 23, 1257-68.
Bohm, et al., "Quantitative analysis of protein far UV circular dichroism spectra by neural networks," Protein Engineering, vol. 5, No. 3, 1992, pp. 191-195.
Brant, et al., Conformational energy estimates for statistically coiling polypeptide chains. J. Mol. Biol., 1967, 23, 47-65.
Breustedt, et al., The 1.8-A crystal structure of human tear lipocalin reveals an extended branched cavity with capacity for multiple ligands. J Biol Chem, 2005, 280, 484-93.
Breustedt, et al., Comparative ligand-binding analysis of ten human lipocalins. Biochim Biophys Acta, 2006, 1764, 161-73.
Buscaglia, C.A., et al., "Tandem Amino Acid Repeats From Trypanosoma cruzi Shed Antigens Increase the Half-Life of Proteins in Blood," Blood, The American Society of Hematology, 1999, pp. 2025-2032, vol. 93, No. 6.
Buscaglia, C.A., et al., "The Repetitive Domain of Trypanosoma cruzi trans-Sialidase Enhances the Immune Response against the Catalytic Domain," The Journal of Infectious Diseases, 1998, pp. 431-436, vol. 177.
(Continued)

Primary Examiner: James H Alstrum Acevedo
Assistant Examiner: Erinne Dabkowski
(74) Attorney, Agent, or Firm: Foley & Lardner LLP

(57) ABSTRACT
We discovered that a biosynthetic polypeptide, consisting of at least about 50 proline (Pro) and alanine (Ala) amino acid residues, forms a random coil, and increases the stability of other compounds to which it is conjugated, such as small molecules and polypeptides. We describe such biosynthetic polypeptides, constructs containing such polypeptides, ways of making them, and their use.

34 Claims, 24 Drawing Sheets
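The composition requirement stated in the abstract (a sequence of at least about 50 residues made up of proline and alanine) can be illustrated with a short check. This is only a sketch under stated assumptions: the function name, the one-letter-code input, the strict threshold of 50 (the text says "about 50"), and the requirement that both residue types occur are choices made for this illustration and are not taken from the patent.

```python
def is_pro_ala_sequence(seq: str, min_len: int = 50) -> bool:
    """Check that seq has at least min_len residues and only Pro (P) and Ala (A)."""
    residues = seq.upper()
    # Assumption: both residue types must be present and nothing else.
    return len(residues) >= min_len and set(residues) == {"P", "A"}


candidate = "AAP" * 20  # hypothetical 60-residue sequence, for illustration only
print(is_pro_ala_sequence(candidate))        # True
print(is_pro_ala_sequence(candidate + "G"))  # False: contains glycine
print(is_pro_ala_sequence("AAP" * 10))       # False: only 30 residues
```

A set comparison is used instead of a regular expression so that the two conditions (allowed residues, minimum length) stay separately visible.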

(56) References Cited

OTHER PUBLICATIONS
Caliceti, et al., Pharmacokinetic and biodistribution properties of poly(ethylene glycol)-protein conjugates. Adv Drug Deliv Rev, 2003, 55, 1261-77.
Cantor, et al., The Conformation of Biological Macromolecules, Biophysical Chemistry, 1996, Part I, pp. 425-428, Part II, pp. 1006-1010.
Carter, et al., Purification, cloning, expression and biological characterization of an interleukin-1 receptor antagonist protein. Nature, 1990, 344, 633-8.
Chou, et al., Prediction of protein conformation. Biochemistry, 1974, 13, 222-45.
Chu, et al., The hydrophobic pocket of 24p3 protein from mouse uterine luminal fluid: fatty acid and retinol binding activity and predicted structural similarity to lipocalins. J Pept Res, 1998, 52, 390-7.
Clark, et al., Long-acting growth hormones produced by conjugation with polyethylene glycol. J Biol Chem, 1996, 271, 21969-77.
Extended Search Report received in the related European Patent Application No. 10163564.7, dated Feb. 3, 2011.
Cowan, et al., Structure of poly-L-proline, Nature, 1955, pp. 501-503, vol. 176.
Creighton, Thomas E., "Proteins structures and molecular properties," European Molecular Biology Laboratory, Second Edition, pp. 190-191, 176-177.
Dennis, et al., "Albumin binding as a general strategy for improving the pharmacokinetics of proteins," J Biol Chem, 2002, 277, 35035-43.
Elliott, et al., Enhancement of therapeutic protein in vivo activities through glycoengineering. Nat Biotechnol, 2003, 21, 414-21.
Estevez, et al., "Characterization of Synthetic Hydroxyproline-Rich Proteoglycans with Arabinogalactan Protein and Extensin Motifs in Arabidopsis," Plant Physiology, 2006, 142, pp. 458-470.
Fandrich, et al., "The behaviour of polyamino acids reveals an inverse side chain effect in amyloid structure formation," Embo J, 2002, 21, 5682-90.
Flo, et al., "Lipocalin 2 mediates an innate immune response to bacterial infection by sequestrating iron," Nature, 2004, 432, 917-21.
Ghetie, et al., Transcytosis and catabolism of antibody. Immunol Res, 2002, 25, 97-113.
Gill, et al., Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem, 1989, 182, 319-26.
Goetz, et al., The neutrophil lipocalin NGAL is a bacteriostatic agent that interferes with siderophore-mediated iron acquisition. Mol Cell, 2002, 10, 1033-43.
Goldenberg, M. M., Etanercept, a novel drug for the treatment of patients with severe, active rheumatoid arthritis. Clin Ther, 1999, 21, 75-87; discussion 1-2.
Greenfield, et al., Computed circular dichroism spectra for the evaluation of protein conformation, Biochemistry, 1969, pp. 4108-4116, vol. 8, No. 10.
Harris, et al., Effect of pegylation on pharmaceuticals. Nat Rev Drug Discov, 2003, 2, 214-21.
Holmes, et al., Siderocalin (Lcn 2) also binds carboxymycobactins, potentially defending against mycobacterial infections through iron sequestration. Structure, 2005, 13, 29-41.
Hvidberg, et al., "The endocytic receptor megalin binds the iron transporting neutrophil-gelatinase-associated lipocalin with high affinity and mediates its cellular uptake," FEBS Lett, 2005, 579, 773-7.
Ivens, et al., "The Genome of the Kinetoplastid Parasite, Leishmania major," Science, 2005, 309(5733), pp. 436-442.
Johnson, et al., "SPAK, a STE20/SPS1-related kinase that activates the p38 pathway," Oncogene, 2000, vol. 19, pp. 4290-4297.
Klinke, et al., Physiologie, 5th Ed., Georg Thieme Verlag, Stuttgart, pp. 872-876, 1999.
Kojima, Y., et al., "Conjugation Cu,Zn-Superoxide Dismutase with Succinylated Gelatin: Pharmacological Activity and Cell-Lubricating Function," Bioconjugate Chem., American Chemical Society, 1993, pp. 490-498, vol. 4.
Macewan S.R., et al., Invited Review Elastin-Like Polypeptides: Biomedical Applications of Tunable Biopolymers, Peptide Science, 2010, pp. 60-77, vol. 94, No. 1.
Makrides, et al., Extended in vivo half-life of human soluble complement receptor type I fused to a serum albumin-binding receptor. J Pharmacol Exp Ther, 1996, 277, 534-42.
Miller, et al., Dimensions of protein random coils. Biochemistry, 1968, 7, 3925-35.
Mori, et al., Endocytic delivery of lipocalin-siderophore-iron complex rescues the kidney from ischemia-reperfusion injury. J Clin Invest, 2005, 115, 610-21.
Nguyen, et al., The pharmacokinetics of an albumin-binding Fab (AB.Fab) can be modulated as a function of affinity for albumin. Protein Eng Des Sel, 2006, 19, 291-7.
Osborn, et al., Pharmacokinetic and pharmacodynamic studies of a human serum albumin-interferon-alpha in cynomolgus monkeys. J Pharmacol Exp Ther, 2002, 303, 540-8.
Perelygina Ludmila, et al., "Complete sequence and comparative analysis of the genome of herpes B virus (Cercopithecine herpesvirus 1) from a rhesus monkey," Journal of Virology, vol. 77, No. 11, 2003, pp. 6167-6177. (XP002598446).
Perlman, et al., of an N-terminal extension prolongs the half-life and increases the in vivo activity of follicle stimulating hormone. J Clin Endocrinol Metab, 2003, 88, 3227-35.
Plückthun, A., IBC Conference on "Antibodies and Beyond Antibodies," Loews Coronado Bay Resort, Coronado, CA, Jun. 1-2, 2006. www.IBCLifeSciences.com/3198, pp. 1-6.
Quadrifoglio, et al., Ultraviolet rotatory properties of polypeptides in solution. II. Poly-L-. J Am Chem Soc, 1968, 90, 2760-5.
Radhakrishnan, et al., Zinc mediated dimer of human interferon alpha2b revealed by X-ray crystallography. Structure, 1996, 4, 1453-63.
Ray, et al., A simple procedure for removing contaminating aldehydes and peroxides from aqueous solutions of polyethylene glycols and of nonionic detergents that are based on the polyoxyethylene linkage. Anal Biochem, 1985, 146, 307-12.
Rosendahl, et al., Site-Specific Protein PEGylation. BioProcess International, 2005, 3, 52-61.
Schellenberger, V., "AMUNIX Engineering of Microproteins for Pharmaceutical Applications," Powerpoint presentation, pp. 1-35, Menlo Park, CA.
Schellenberger, V., et al., "A recombinant polypeptide extends the in vivo half-life of peptides and proteins in a tunable manner," Nature Biotechnology, 2009, pp. 1186-1190, vol. 27, No. 12.
Schimmel, et al., Conformational energy and configurational statistics of poly-L-proline. Proc Natl Acad Sci USA, 1967, 58, 52-9.
Schiweck, et al., Fermenter production of an artificial fab fragment, rationally designed for the antigen cystatin, and its optimized crystallization through constant domain shuffling. Proteins, 1995, 23, 561-5.
Schlapschy, et al., Fusion of a recombinant antibody fragment with a homo-amino-acid polymer: effects on biophysical properties and prolonged plasma half-life, Protein Engineering, Design & Selection: PEDS, 2007, pp. 273-284, vol. 20, No. 6.
Schlapschy, M., et al., "Fusion of a recombinant antibody fragment with a homo-amino-acid polymer: effects on biophysical properties and prolonged plasma half-life," Protein Engineering, Design & Selection, 2007, pp. 273-284, vol. 20, No. 6.
Schreuder, et al., A new cytokine-receptor binding mode revealed by the crystal structure of the IL-1 receptor with an antagonist. Nature, 1997, 386, 194-200.
Shamji M.F., et al., "Development and Characterization of a Fusion Protein Between Thermally Responsive Elastin-like Polypeptide and Interleukin-1 Receptor Antagonist," Arthritis & Rheumatism, 2007, pp. 3650-3661, vol. 56, No. 11.
Shental-Bechor, et al., Monte Carlo studies of folding, dynamics, and stability in alpha-helices. Biophys J, 2005, 88, 2391-402.
Singer, J.W., "Paclitaxel poliglumex (XYOTAX™, CT-2103): A macromolecular taxane," Journal of Controlled Release, 2005, pp. 120-126, vol. 109.
Skerra, A., Use of the tetracycline promoter for the tightly regulated production of a murine antibody fragment in Escherichia coli. Gene, 1994, 151, 131-5.

(56) References Cited

OTHER PUBLICATIONS
Skerra, A., Engineered protein scaffolds for molecular recognition. J Mol Recognit, 2000, 13, 167-87.
Smith, et al., The concept of a random coil. Residual structure in peptides and denatured proteins. Fold Des, 1996, 1, R95-106.
Squire, P. G., Calculation of hydrodynamic parameters of random coil polymers from size exclusion chromatography and comparison with parameters by conventional methods. Journal of Chromatography, 1981, 5, 433-442.
Walker, et al., Using protein-based motifs to stabilize peptides, Journal of Peptide Research, 2003, pp. 214-226, vol. 62, No. 5.
Walsh, G., Biopharmaceutical benchmarks - 2003. Nat Biotechnol, 2003, 21, 865-70.
Walsh, G., Second-generation biopharmaceuticals. Eur J Pharm Biopharm, 2004, 58, 185-96.
Wood, W. B., Host specificity of DNA produced by Escherichia coli: bacterial mutations affecting the restriction and modification of DNA. J Mol Biol, 1966, 16, 118-33.
Yang, et al., An iron delivery pathway mediated by a lipocalin. Mol Cell, 2002, 10, 1045-56.
Yanisch-Perron, et al., Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 1985, 33, 103-19.
International Preliminary Report on Patentability received in the corresponding International Application No. PCT/EP2011/058307, dated Nov. 27, 2012.
Iizuka, et al., "Synthesis and Properties of High Molecular Weight Polypeptides Containing Proline," Bull. Chem. Soc. of Japan, 1998, vol. 66, pp. 1269-1272.
Bullock et al., "XL1-Blue: A High Efficiency Plasmid Transforming recA Escherichia coli Strain with Beta-Galactosidase Selection," BioTechniques, vol. 5, No. 4, 1987, pp. 376-379.
Schlapschy et al., "A System for Concomitant Overexpression of Four Periplasmic Folding Catalysts to Improve Secretory Protein Production in Escherichia coli," Protein Engineering, Design & Selection, vol. 19, No. 8, 2006, pp. 385-390.
Skerra et al., "Use of the Strep-Tag and Streptavidin for Detection and Purification of Recombinant Proteins," Methods in Enzymology, vol. 326, 2000, pp. 271-304.
Besheer et al., "Enzymatically Catalyzed HES Conjugation Using Microbial : Proof of Feasibility," Journal of Pharmaceutical Sciences, vol. 98, No. 11, Nov. 2009, pp. 4420-4428.
Chan et al., "Review on Medusa®: a Polymer-Based Sustained Release Technology for Protein and Peptide Drugs," Expert Opinion Drug Delivery, vol. 4, No. 4, 2007, pp. 441-451.
Constancis et al., "Macromolecular Colloids of Diblock Poly(amino acids) That Bind Insulin," Journal of Colloid and Interface Science, vol. 217, 1999, pp. 357-368.
Fares et al., "Design of a Long-Acting Follitropin Agonist by Fusing the C-Terminal Sequence of the Chorionic Gonadotropin β Subunit to the Follitropin β Subunit," Proc. Natl. Acad. Sci. USA, vol. 89, May 1992, pp. 4304-4308.
Gregoriadis et al., "Polysialic Acids: Potential in Improving the Stability and Pharmacokinetics of Proteins and Other Therapeutics," Cell. Mol. Life Sci., vol. 57, 2000, pp. 1964-1969.
Huang et al., "Engineering a Pharmacologically Superior Form of Granulocyte-Colony-Stimulating Factor by Fusion with Gelatin-Like-Protein Polymer," European Journal of Pharmaceutics and Biopharmaceutics, vol. 74, 2010, pp. 435-441.
Homsi et al., "Phase I Trial of Poly-L-Glutamate Camptothecin (CT-2106) Administered Weekly in Patients with Advanced Solid Malignancies," Clinical Cancer Research, vol. 13, No. 19, Oct. 2007, pp. 5855-5861.
Klein et al., "Development and Characterization of a Long-Acting Recombinant hFSH Agonist," Human Reproduction, vol. 18, No. 1, 2003, pp. 50-56.
Na et al., "Effect of Molecular Size of PEGylated Recombinant Human Epidermal Growth Factor on the Biological Activity and Stability in Rat Wound Tissue," Pharmaceutical Development and Technology, vol. 11, 2006, pp. 513-519.
Qu et al., "Development of Compstatin Derivative-Albumin Binding Peptide Chimeras for Prolonged Plasma Half-Life," in Lebl (ed) Breaking Away. Proceedings of the 21st American Peptide Symposium, pp. 219-220, American Peptide Society, 2009.
Sung et al., "An IFN-β-Albumin Fusion Protein That Displays Improved Pharmacokinetic and Pharmacodynamic Properties in Nonhuman Primates," Journal of Interferon & Cytokine Research, vol. 23, 2003, pp. 25-36.
Tan et al., "Glycosylation Motifs That Direct Arabinogalactan Addition to Arabinogalactan-Proteins," Plant Physiology, vol. 132, 2003, pp. 1362-1369.
Xu et al., "Human Growth Hormone Expressed in Tobacco Cells as an Arabinogalactan-Protein Fusion Glycoprotein has a Prolonged Serum Life," Transgenic Res, vol. 19, 2010, pp. 849-867.
Bhandari, D. et al., "H-NMR study of mobility and conformational constraints within the proline-rich N-terminal of the LC1 alkali light chain of skeletal myosin," Eur. J. Biochem, vol. 160, 1986, pp. 349-356.

* cited by examiner

U.S. Patent Dec. 29, 2015 Sheet 2 of 24 US 9,221,882 B2

Figure 2 (cont.), panel C: plasmid map showing XbaI, BstE, XhoI and Hind restriction sites and the elements ompA-VH-huCH1-His6, phoA-VL-huCL and PA#1(200).

Figure 2 (cont.), panel F: plasmid map showing XbaI, Sap and Hind restriction sites and the elements strep II-PA#1(200) and lpp; the construct label ends in PA#1(200)-IFNa2b.

Figure 3, panels A and B: gel images with molecular size markers (116.0, 66.2, 45.0, 35.0, 25.0, 18.4 and 14.4 kDa); samples shown reduced and not reduced.

Figure 4, panel A: elution profiles of Fab and Fab-PA#1(200); x-axis: elution volume (ml).

Figure 4, panel B: calibration plot against elution volume (ml) with size markers including 440 kDa and 150 kDa; Fab-PA#1(200) (app. MW: 237 kDa, cal. MW: 64.3 kDa).

Figure 4 (cont.), panel C: elution profiles of PA#1(200)-IFNa2b and IFNa2b; x-axis: elution volume (ml).

Figure 4 (cont.), panel D: calibration plot against elution volume (ml) with size markers including 150 kDa and 66.3 kDa; PA#1(200)-IFNa2b (app. MW: 229 kDa, cal. MW: 37.0 kDa); IFNa2b (app. MW: 19.8 kDa, cal. MW: 21.0 kDa).

Figure 5: spectra plotted against wavelength (190 to 250 nm) for Fab and Fab-PA#1(200), and for PA#1(200).

Figure 5 (cont.), panels C and D: spectra plotted against wavelength (190 to 250 nm) for IFNa2b and PA#1(200)-IFNa2b (panel C), and for PA#1(200) (panel D).

Figure 6, panel A: nucleotide sequence with Nhe and SapI restriction sites and its encoded amino acid sequence (Ala Ala Ser His His His His His His Gly Ala Ser Ser Ser Ala Phe Pro Thr ...).

Figure 6 (cont.), panel C: plasmid map of pASK75-His6-PA#1(200)-hGH showing XbaI, Nhe, Sap and Hind restriction sites and the elements tet p/o, OmpA, His6-PA#1(200), lpp and f1.

Figure 6 (cont.), panel D: plasmid map of pCHO-PA#1(200)-hGH showing Nhe and Hind restriction sites and the elements CMV, Sp, His-tag, PA#1(200), bGH pA, ColE1-ori, f1-ori and SV40.

Figure 6 (cont.): gel image with molecular size markers (100, 70, 55, 40, 35, 25 and 15 kDa).

Figure 8: plot against time (h, 0 to 50) for Fab-PA#1(600), Fab-PA#1(200) and Fab.

Fig. 9: gel image with molecular size markers (116.0, 66.2, 45.0, 35.0, 25.0 and 18.4 kDa); samples shown reduced and not reduced.

Elution profiles of Fab-P1A1(200) and Fab-P1A3(200); x-axis: elution volume (ml).

Fig. 11: spectra plotted against wavelength (190 to 250 nm) for Fab-P1A1(200) and Fab-P1A3(200), and for P1A1(200) and P1A3(200).

Fig. 12: plasmid map of pSUMO-PA#1(200) with the elements His-SUMO, PA#1(200), bla and ori; gel image with molecular size markers (116.0, 66.3, 45.0, 35.0, 25.0, 18.4 and 14.4 kDa) and bands labeled SUMO-PA#1(200) and SUMO. Lane 1: SUMO-PA#1(200) prior to cleavage. Lane 2: SUMO-PA#1(200) cleaved with SUMO protease.

Fig. 13, panels A to D: elution profiles (traces A280 and A225; x-axis: elution volume, ml). Panel B: SUMO-PA#1(200) (cleaved). Panel C: SUMO-PA#1(200) (cleaved, coupled). Panel D: IMAC purified Fluorescein-PA#1(200).

Fig. 13 (cont.), panels E to G: elution profiles (x-axis: elution volume, ml). Panel E: PA#1(200). Panel F: Fluorescein. Panel G: Fluorescein-PA#1(200).

Fig. 13 (cont.), panels H to K: spectra plotted against wavelength (250 to 600 nm) for SUMO-PA#1(200), PA#1(200), Fluorescein and Fluorescein-PA#1(200).

Fig. 13 (cont.), panel L: logarithmic plot against elution volume (ml) for SUMO-PA#1(200), PA#1(200), Fluorescein-PA#1(200) and Fluorescein.

Fig. 13 (cont.), panel M: spectrum of Digoxigenin-PA#1(200) with a peak labeled 16671.4.

Fig. 14: chemical structure drawing labeled ProAla-Polymer.

BIOSYNTHETIC PROLINE/ALANINE RANDOM COIL POLYPEPTIDES AND THEIR USES

The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety.

The present invention relates to a biosynthetic random coil polypeptide or a biosynthetic random coil polypeptide segment or a conjugate, said biosynthetic random coil polypeptide or a biosynthetic random coil polypeptide segment or a conjugate comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least about 50 proline (Pro) and alanine (Ala) amino acid residues. Said at least about 50 proline (Pro) and alanine (Ala) amino acid residues may be (a) constituent(s) of a heterologous polypeptide or a heterologous polypeptide construct. Also uses and methods of use of these biosynthetic random coil polypeptides, said polypeptide segments or said conjugates are described. The uses may, inter alia, comprise medical uses, diagnostic uses or uses in the food industry as well as other industrial applications, like in the paper industry, in oil recovery and the like.

The present invention relates, also, to (a) specific use(s) of the herein provided biosynthetic random coil polypeptide or biosynthetic random coil polypeptide segment or conjugates, said biosynthetic random coil polypeptide or biosynthetic random coil polypeptide segment or conjugates comprising an amino acid sequence consisting solely of proline and alanine amino acid residues. The amino acid sequence of the herein provided biosynthetic random coil polypeptide or biosynthetic random coil polypeptide segment consists of at least about 50, of at least about 100, of at least about 150, of at least about 200, of at least about 250, of at least about 300, of at least about 350 or of at least about 400 proline (Pro) and alanine (Ala) amino acid residues. Said at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350 or at least about 400 proline (Pro) and alanine (Ala) amino acid residues are preferably (a) a constituent of a heterologous polypeptide or a heterologous polypeptide construct or are preferably (b) a constituent of a conjugate, like a drug conjugate, like a conjugate with a food or cosmetic ingredient or additive, like a conjugate with a biologically active compound or like a conjugate with a spectroscopically active compound. In particular, heterologous proteins are provided herein whereby these proteins comprise at least two domains, wherein a first domain of said at least two domains comprises an amino acid sequence having and/or mediating an activity, like a biological activity, and a second domain of said at least two domains comprising the biosynthetic random coil proline/alanine polypeptide or proline/alanine polypeptide segment of the present invention. The present invention relates in particular to a drug conjugate comprising (i) a biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least 50 proline (Pro) and alanine (Ala) amino acid residues, and (ii) a drug selected from the group consisting of (a) a biologically active protein or a polypeptide that comprises or that is an amino acid sequence that has or mediates a biological activity and (b) a small molecule drug. A further subject of the present invention is a drug conjugate comprising the biosynthetic random coil proline/alanine polypeptide or proline/alanine polypeptide segment as provided herein and, additionally, (a) pharmaceutically or medically useful molecule(s), like small molecules, peptides or biomacromolecules (such as proteins, nucleic acids, carbohydrates, lipid vesicles) and the like, linked and/or coupled to said biosynthetic random coil proline/alanine polypeptide or proline/alanine polypeptide segment. Furthermore, nucleic acid molecules encoding the biosynthetic random coil polypeptide or polypeptide segment and/or the biologically active, heterologous proteins as well as vectors and cells comprising said nucleic acid molecules are disclosed. Furthermore, methods for the production of the herein described inventive biosynthetic random coil polypeptides or polypeptide segments and corresponding drug or food conjugates, i.e. conjugates comprising the herein defined biosynthetic random coil polypeptides or polypeptide segments and a food ingredient or a food additive, are disclosed. Also disclosed are corresponding conjugates (comprising as one constituent the herein disclosed biosynthetic random coil polypeptide or polypeptide segment) which comprise, inter alia, a cosmetic ingredient or additive or a biologically or spectroscopically active compound. In addition, the present invention provides compositions comprising the compounds of the invention (i.e. the herein disclosed random coil polypeptides or random coil polypeptide segments comprising an amino acid sequence consisting solely of proline and alanine amino acid residues and nucleic acid molecules encoding the same) as well as specific uses of said random coil polypeptide or polypeptide segment, of the biologically active proteins comprising said random coil polypeptides or random coil polypeptide segments, the drug conjugates, the food conjugates or the nucleic acid molecules, vectors and cells of the invention. Also methods of producing and/or obtaining the inventive biosynthetic random coil polypeptides or polypeptide segments as well as of producing and/or obtaining the inventive biologically active, heterologous proteins, and/or polypeptide constructs or drug conjugates are provided. In addition, medical, pharmaceutical as well as diagnostic uses are provided herein for the biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues (or for molecules and conjugates comprising the same) as defined herein. Such a medical or pharmaceutical use can comprise the use of said biosynthetic random coil polypeptide or polypeptide segment as plasma expander and the like. However, the means and methods provided herein are not limited to pharmaceutical, medical and biological uses but can also be employed in other industrial areas, like in the paper industry, in oil recovery, etc.

Rapid clearance from blood circulation by renal filtration is a typical property of small molecules (including small proteins and peptides). However, by expanding the apparent molecular dimensions beyond the pore size of the kidney glomeruli, plasma half-life of therapeutic proteins can be extended to a medically useful range of several days. One strategy to achieve such an effect is chemical conjugation of the biologic with the synthetic polymer poly-ethylene glycol (PEG). This has led to several approved drugs, for example PEG-interferon alpha2a (Pegasys®), PEG-G-CSF (Neulasta®) and, recently, a PEGylated alphaTNF-Fab fragment (Cimzia®). Nevertheless, the "PEGylation" technology has several drawbacks: clinical grade PEG derivatives are expensive and their covalent coupling to a recombinant protein requires additional downstream processing and purification steps, thus lowering yield and raising the costs. Furthermore, PEG is not biodegradable, which can cause side effects such as vacuolation of kidney epithelium upon continuous treatment; see, e.g., Gaberc-Porekar (2008) Curr Opin Drug Discov Devel 11:242-50; Knop (2010) Angew Chem Int Ed Engl 49:6288-308 or Armstrong in: Veronese (Ed.), "PEGylated Protein Drugs: Basic Science and Clinical Applications"; Birkhäuser Verlag, Basel 2009.

In order to overcome some of the drawbacks of PEG technology, certain recombinant polypeptide mimetics have been provided in the art, some of which are based on naturally occurring amino acid sequences or synthetic amino acid stretches.

Most natural amino acid sequences do not behave like an ideal random chain in physiological solution because they either tend to adopt a folded conformation (secondary structure) or, if unfolded, they usually are insoluble and form aggregates. In fact, most of the classical experiments to investigate the random chain behaviour of polypeptides were conducted under denaturing conditions, i.e. in the presence of chemical denaturants like urea or guanidinium chloride (see, e.g., Cantor (1980) Biophysical Chemistry. W.H. Freeman and Company, New York). Hence, such technologies generally rest upon peculiar amino acid sequences that resist folding, aggregation as well as unspecific adsorption and, thus, provide stable random chains under physiological buffer conditions and temperature even if genetically fused to a folded therapeutic protein domain. Under these circumstances, such recombinant PEG mimetics can confer a size increase much larger than one would normally expect on the basis of their molecular mass alone, eventually retarding kidney filtration and effectively extending plasma half-life of the attached biologic by considerable factors.

A lot of these technologies have, however, further caveats and disadvantages.

For example, naturally occurring repetitive amino acid sequences have been tested for their usefulness in medical sciences and in pharmaceutical approaches. One of these approaches relates to the trans-sialidase of Trypanosoma cruzi. It contains a 680 amino acid residue catalytic domain followed by a C-terminal repetitive domain, dubbed "shed acute phase antigen" (SAPA), which comprises a variable number of 12mer amino acid repeats. Pharmacokinetic (PK) studies in mice of the trans-sialidase containing 13 hydrophilic and (at physiological pH) negatively charged corresponding amino acid repeats having the natural sequence DSSAHSTPSTPA revealed a five-fold longer plasma half-life compared to the recombinant enzyme from which the C-terminal repetitive sequence had been deleted (Buscaglia (1999) Blood 93: 2025-32). A similar half-life extending effect was observed after fusion of the same trans-sialidase, i.e. its 76 kDa catalytic domain, with 13 charged amino acid repeats of the sequence EPKSA that were found in the Try

negatively charged carboxylate groups of the modified side chains supposedly spreads out the molecule into a more or less extended conformation. The resulting expanded volume makes succinylated gelatin a macromolecule for use as plasma expander in humans and is, inter alia, marketed as Volplex® (Beacon Pharmaceuticals Ltd) or Gelofusine® (B. Braun Melsungen AG). Furthermore, a half-life extending effect was achieved by genetic fusion of granulocyte-colony stimulating factor (G-CSF) to an artificial gelatin-like polypeptide (Huang (2010) Eur J Pharm Biopharm 74:435-41). To this end, all hydrophobic side chains in a natural gelatin were exchanged by hydrophilic residues, resulting in a 116 amino acid gelatin-like protein (GLK) comprising the amino acids G, P, E, Q, N, S, and K in varying order. G-CSF was fused at its N-terminus with 4 copies of this GLK sequence and secreted in Pichia pastoris. Pichia pastoris appeared as a favourable production organism for GLK fusion proteins; yet, whether GLKs can also be produced in other organisms remains to be determined, as it is known that recombinant gelatin fragments can be expressed with only low yield in E. coli, for example, as illustrated in Olsen (2003), Adv. Drug Deliv Rev 55:1547-67.

Elastin is a component of the extracellular matrix in many tissues. It is formed from the soluble precursor tropoelastin, which consists of a hydrophilic Lys/Ala-rich domain and a hydrophobic, elastomeric domain with repetitive sequence. Enzymatic crosslinking of lysine side chains within the hydrophilic domain leads to insoluble elastin formation. Elastin-like polypeptides (ELPs) are artificially designed, repetitive amino acid sequences derived from the hydrophobic domain of tropoelastin. The most common repeat sequence motif of ELPs is V-P-G-X-G, wherein "X" can be any amino acid except Pro (MacEwan (2010) Biopolymers 94:60-77; Kim (2010) Adv. Drug Deliv Rev 62:1468-78). Suitable ELPs can be fused with therapeutic proteins and produced in E. coli. Consequently, the ability of ELPs to form gel-like depots after injection can significantly prolong the in vivo half-life of an attached biologic, albeit by a mechanism different from the other unstructured polypeptides. Yet, ELP attachment can hamper the bioactivity of the fusion partner as demonstrated for the interleukin-1 receptor antagonist in an IL-1-induced lymphocyte proliferation bioassay (Shamji (2007) Arthritis Rheum. 11:3650-3661). In addition, ELPs are subject to degradation by endogenous proteases such as collagenase.
Also, aggregated proteins are generally more panosoma Cruzi protein antigen 13. Both the repeats from Susceptible to immunogenicity. SAPA and from the antigen 13 were able to prolong the Further approaches relate to the use of polyanionic poly plasma half-life of the heterologous protein gluthatione 50 mers. For example, polyglutamate (PG) has been chemically S-transferase (GST) from Schistosoma japonicum by a factor coupled to poorly soluble cytotoxic Small molecule drugs for 7-8 after genetic fusion to both C-termini of this homo cancer treatment. A corresponding product would be dimeric enzyme (see Buscaglia, loc.cit.). Yet, while these OpaxioTM, a paclitaxel drug conjugate currently in clinical naturally occurring repetitive amino acid sequences from phase III studies. Half-life of a paclitaxel PG conjugate was human pathogens in principle may appear attractive to opti 55 prolonged by a factor 3 to 14 in comparison with the unmodi mize the pharmacokinetics of therapeutic proteins they were fied compound (Singer (2005) J Control Release 109:120-6). found to be highly immunogenic (see Affranchino (1989), Further fusion proteins, for example G-CSF fused at its N-ter Mol Biochem Parasitol 34:221-8 or Buscaglia (1998), J Infect minus with a stretch of 175 consecutive Glu residues or IFN Dis 1998; 177:431-6). alpha2 carrying at its C-terminus a PG tail of 84 residues, Another approach relates to the use of gelatin. Gelatin, 60 were produced in a soluble state in the cytoplasm of E. coli hydrolyzed and denatured animal collagen, contains long (see WO2002/077036). For efficient translation, the N-termi stretches of Gly-Xaa-Yaa repeats, wherein Xaa and Yaa nal fusion required a leader peptide, which was later removed mostly constitute proline and 4-hydroxyproline, respectively. by Tobacco Etch Virus (TEV) protease cleavage. 
Poly of gelatin, primarily via the e-amino groups of glutamate fusions of G-CSF and INFO2 showed bioactivity in naturally interspersed lysine side chains, increases the hydro 65 cell culture assays. However, to date no pharmacokinetic data philicity of this biopolymer and lowers its isoelectric point of these PG fusions have been reported. Also, the highly (pl). The intramolecular electrostatic repulsion between the negative charge of PG fusions is a general disadvantage with US 9,221,882 B2 5 6 respect to biomolecular interactions (e.g. binding of the target receptor affinity (17-fold increased ECs) was described for receptor or soluble factor) due to artificial electrostatic attrac an XTEN fusion of human growth hormone (hGH); see WO tion or repulsion effects. 2010/1445O2 WO 2006/081249 describes a polypeptide sequence with Also , as the Smallest and structurally simplest about 2 to 500 repeat units of 3 to 6 amino acids, wherein G, amino acid, has been considered as the conformationally Nor Q represent the major constituents while minor constitu most flexible amino acid based on theoretical grounds; see, ents can be A, S, T, D or E. This amino acid composition e.g. Schulz. G. E. Schirmer R H. Principles of Protein Struc allows integration of the glycosylation sequon Asn-Xaa-Ser/ ture. Springer, N.Y. 1979. Furthermore, computer simula Thr (where Xaa is any amino acid except Pro) for N-linked tions have indicated that Gly polymers lack secondary struc glycosylation of the ASn side chain in eukaryotic expression 10 ture and are likely to form a random coil in Solution; see Shental-Bechor (2005) Biophys J 88:2391-402. From a systems. 
The increased macromolecular size of a resulting chemical perspective, polyglycine is a linear unbranched fusion protein, including posttranslational modification with polyimide that shows certain resemblance to the polyether bulky solvated carbohydrate structures, can extend the phar PEG in so far as both are essentially one-dimensional mac macokinetics of the genetically conjugated protein. Such oli 15 romolecules with many rotational degrees of freedom along gosaccharide attachments (glycoengineering’) in general the chain, which are made of repeated short hydrocarbon can both reduce Susceptibility to and increase the units that are regularly interrupted by hydrogen-bonding and hydrodynamic volume (Sinclair (2005).JPharm Sci94:1626 highly Solvated polar groups. Consequently, polyglycine 35). A disadvantage is the intrinsic molecular heterogeneity should constitute the simplest genetically encodable PEG of the glycosylated biomacromolecule, which causes addi mimetic with prospects for extending the plasma half-life of tional effort during biotechnological production and quality therapeutic proteins. This concept was employed in form of control. “homo-amino-acid polymer (HAP) or as glycine rich WO 2010/091122 (and WO 2007/103515) and Schellen sequence (GRS), respectively; see, Schlapschy (2007) Pro berger (2009) Nat Biotechnol 27:1186-90 disclose unstruc tein Eng Des Sel 20:273-84: WO 2007/103515. However, it tured non-repetitive amino acid polymers encompassing and 25 has long been known that chemically synthesized pure poly comprising the residues P. E. S.T.A and G. This set of amino mers of Gly show poor solubility in water; see, inter alia, in acids, which shows a composition not unlike the PSTAD Bamford C H et al. Synthetic Polypeptides—Preparation, repeat described further above, was systematically screened Structure, and Properties. Academic Press, New York 1956. 
for sequences to yield a solvated polypeptide with large Hence, different attempts were made to increase hydrophilic molecular size, Suitable for biopharmaceutical development, 30 ity, either by introducing hydrogen-bonding serine alcohol by avoiding hydrophobic side chains—in particular F.I., L. M. side chains (WO 2007/103515 as well as Schlapschy (2007) J and W that can give rise to aggregation and may cause an loc. cit.) or, in addition, negatively charged glutamate resi HLA/MHC-ii mediated immune response. Also, potentially dues (WO 2007/103515). It is of note that peptide spacers crosslinking Cys residues, the cationic amino acids K. Rand with the composition (Gly Ser), have already been described H, which could interact with negatively charged cell mem 35 in the art in order to link domains in fusion proteins in a branes, and the side chains of N and Q, which are flexible manner. A significantly increased hydrodynamic Vol potentially prone to hydrolysis, were excluded (see Schellen ume was detected for these fusion proteins in analytical SEC. berger (2009) loc. cit.). Synthetic gene libraries encoding In the case of the 200 residue HAP version the apparent size non-repetitive sequences comprising the PESTAG set of resi increase was 120% compared with the unfused Fab fragment, dues, which were fused to the green fluorescent protein 40 while the true mass was only bigger by 29%, hence revealing (GFP), were screened with respect to soluble expression lev the effect of an enhanced hydrodynamic volume due to the els in E. coli, and a resulting Subset was further investigated Solvated random coil structure of the polyglycine tag. Fur for genetic stability, protein solubility, thermostability, aggre thermore, CD difference spectra were characteristic for dis gation tendency, and contaminant profile. Eventually, an 864 ordered secondary structure for the HAP moiety. 
Finally, amino acid sequence containing 216 Ser residues (25.0 45 terminal plasma half-life of the Fab fragment carrying the 200 mole %), 72 Ala residues (8.3 mole %) and 144 amino acids residue HAP in mice was prolonged by approximately a (16.7 mole %) of each Pro, Thr, Glu, and Gly was further factor 3. Though moderate, this effect could be appropriate tested for fusion to the GLP-1 receptor agonist Exendin-4 for certain (specialized) diagnostic applications, such as in (E-XTEN) and a few other biologics. The fusion proteins— vivo imaging; see Schlapschy (2007); loc. cit. Unfortunately, typically carrying a cellulose binding domain, which was 50 the production of fusion proteins with longer (Gly Ser), later cleaved off were produced in a soluble state in the repeat sequences appeared less feasible due to an increasing cytoplasm of E. coli and isolated. Investigation of E-XTEN tendency to form aggregates, thus posing a natural limitation by circular dichroism (CD) spectroscopy revealed lack of to the use of more or less pure glycine polymers as PEG secondary structure while during size exclusion chromatog mimetics. raphy (SEC) the fusion protein showed substantially less 55 WO 2008/155134 discloses that sequences with an appro retention than expected for a 84 kDa protein, thus demon priate mixture of Pro, Ala, and Ser (i.e. PAS) residues lead to strating an increased hydrodynamic Volume (Schellenberg mutual cancellation of their distinct secondary structure pref (2009) loc. cit.). The disordered structure of the PESTAG erences and, thus, result in a stably disordered polypeptide. polypeptide and the associated increase in hydrodynamic However, WO 2008/155134 also documents that fusion pro radius may be favoured by the electrostatic repulsion between 60 teins with a domain composed only of serine and alanine (SA) amino acids that carry a high net negative charge which are residues, i.e. 
a domain comprising only two types of amino distributed across the XTEN sequence (see WO 2010/ acids, do not form a random coil, but a B-sheet structure 0.91122). However, a further study Geething (2010) PLoS instead. One 2010; 5:e10175 demonstrated that XTEN decreases The chemical synthesis of polypeptides is well known and potency of its therapeutic fusion partner. In a cell culture 65 has been described in the art. Izuka discloses the chemical assay, a glucagon XTEN fusion showed merely 15% bioac synthesis of polypeptides containing proline (see IZuka tivity of the non-modified peptide. An even stronger loss in (1993), Bull. Chem. Soc. Jpn. 66, 1269-1272). These US 9,221,882 B2 7 8 copolypeptides contain random sequences of proline and about 50, of at least about 100, of at least about 150, of at least either glycine, L-alanine, L-O-aminobutyric acid (Abu), about 200, of at least about 250, of at least about 300, of at L-norvaline (Nva) or L-leucine, respectively, and are synthe least about 350, of at least about 400 proline (P) and alanine sized by chemical copolymerization. Izuka discloses that (A) amino acid residues is, interalia, to be used in a heter Such copolypeptides mostly have a defined collagen-like con ologous context, i.e. in a biologically active heterologous formation. Further, it is described in this publication that protein, protein construct and/or in a drug conjugate compris copolypeptides of proline and alanine (or proline and L-O- ing said biosynthetic random coil polypeptide or polypeptide aminobutyric acid) are partially soluble in water, while other segment and pharmaceutically or medically useful mol copolypeptides were completely insoluble. It is speculated in ecules, like Small molecules, peptides or biomacromolecules IZuka that proline/alanine copolypeptides may have a partial 10 Such as proteins, nucleic acids, carbohydrates, lipid vesicles disordered conformation. IZuka emphasizes that chemically and the like. 
As illustrated in the appended examples, the synthesized polypeptides with a random proline/alanine inventors could successfully provide for drug conjugates sequence occur predominantly in a collagen-like conforma which consist of the true random coil polypeptides as defined tion, i.e. in a structured conformation. herein and biologically active proteins or protein stretches as Thus, the technical problem underlying the present inven 15 well as drug conjugates that consist of Small molecules or tion is the provision of large polypeptides with true random Small molecule drugs that comprise and/or are linked to the coil conformation. The technical problem is solved by provi herein described random coil polypeptides, consisting solely sion of the embodiments characterized in the claims and as of proline and alanine amino acid residues (i.e. of both amino provided herein. acids P and A) Accordingly, the invention relates to the provision and use Accordingly, the present invention provides, interalia, for of a biosynthetic random coil polypeptide or polypeptide a biologically active, heterologous protein comprising at least segment comprising an amino acid sequence consisting of at two domains wherein (a) a first domain of said at least two least about 50, in particular of at least about 100, in particular domains comprises an amino acid sequence having and/or of at least about 150, in particular of at least about 200, in mediating said biological activity; and (b) a second domain of particular of at least about 250, in particular of at least about 25 said at least two domains comprises the biosynthetic random 300, in particular of at least about 350, in particular of at least coil polypeptide or polypeptide segment consisting of an about 400 proline and alanine amino acid residues. 
The inven amino acid sequence consisting of at least about 50, of at least tion therefore relates to the provision of biosynthetic random about 100, of at least about 150, of at least about 200, of at coil polypeptides or polypeptide segments comprising an least about 250, of at least about 300, of at least about 350, of amino acid sequence of at least 50 amino acid residues, said 30 at least about 400 proline and alanine amino acid residues. In amino acid sequence consisting solely of proline and alanine accordance with this invention, said “first domain and said amino acid residues and comprising at least one proline and at “second domain” are not comprised in either a natural (i.e. least one alanine. The invention also provides for a drug occurring in nature) protein or a hypothetical protein as conjugate comprising (i) a biosynthetic random coil polypep deduced from naturally occurring coding nucleic acid tide or polypeptide segment comprising an amino acid 35 sequences, like open reading frames etc. sequence consisting solely of proline and alanine amino acid Furthermore, this invention provides for a drug conjugate residues, wherein said amino acid sequence consists of at consisting of the biosynthetic random coil polypeptide or least 50 proline (Pro) and alanine (Ala) amino acid residues, polypeptide segment comprising an amino acid sequence and (ii) a drug selected from the group consisting of (a) a consisting of at least about 50, of at least about 100, of at least biologically active protein or a polypeptide that comprises or 40 about 150, of at least about 200, of at least about 250, of at that is an amino acid sequence that has or mediates a biologi least about 300, of at least about 350, of at least about 400 cal activity and (b) a small molecule drug. 
The polypeptides proline and alanine amino acid residues and (a) pharmaceu with true random coil conformation and polypeptide seg tically, therapeutically and/or medically useful molecule(s), ments with true random coil conformation as provided herein like (a) Small molecule(s), (a) peptide(s) or (a) biomacromol are also useful in the context of cosmetic uses as well as uses 45 ecule(s) Such as protein(s), a nucleic acid(s), (a) in food industry and the production of beverages. The large carbohydrate(s), (a) lipid vesicle(s) and the like, that is/are polypeptides provided herein which show true random coil conjugated to said biosynthetic random coil polypeptide or confirmation consist soley and merely of proline (P. Pro) and polypeptide segment. Again, it is of note that the term “bio alanine (A, Ala) residues and comprise more than at least 50 logically active' in context of herein disclosed conjugates is amino acids, in particular of at least about 100, in particular of 50 not limited to pure biological molecules but also comprise at least about 150, in particular of at least about 200, in medically active, therapeutically active, pharmaceutically particular of at least about 250, in particular of at least about active molecules and the like. It is evident for the skilled 300, in particular of at least about 350, in particular of at least artisan that the means and methods provided herein are not about 400 proline and alanine amino acid residues. Both limited to pharmaceutical and medical uses, but can be amino acids, P and A, need to be present in the herein pro 55 employed in a wide variety of technologies, including, but not vided large polypeptides with true random coil conformation limited to cosmetic, food, beverage and nutrition technolo and polypeptide segments with true random coil conforma gies, oil industry, paper industry and the like. tion. 
Also provided herein are nucleic acid molecules that In contrast to chemically synthesized copolypeptides (like encode for the herein disclosed biosynthetic random coil in IZuka, loc. cit.), the random coil polypeptides provided polypeptides or polypeptide segments as well as for drug or 60 herein are biosynthetically produced. The term “biosyn food conjugates that comprise said biosynthetic random coil thetic” as used herein refers to the synthesis by means of polypeptides or polypeptide segments and a (covalently biotechnological methods (in contrast to chemical synthesis). linked) protein of interest, like a biologically active protein. Such biotechnological methods are well known in the art and The biosynthetic random coil polypeptide or biosynthetic also described herein further below. The biosynthesis of the random coil polypeptide segment as described herein and to 65 random coil polypeptides of the present invention allows the be used in drug or food conjugates as provided herein and production of polypeptides with a defined sequence of proline comprising an amino acid sequence consisting of at least and alanine residues, a defined length and/or a defined ratio of US 9,221,882 B2 10 proline and alanine residues. Further, the polypeptides pro ondary structure of polymers/polypeptides (or segments vided in accordance with the present invention are Substan thereof) composed of proline and alanine, as shown in FIG. 7. tially pure, i.e. the produced polypeptides are essentially uni Yet, herein it has been Surprisingly found and experimentally form and share the above characteristics (i.e. defined shown that proline-alanine polymers/polypeptides with a uni sequence, defined length and/or defined amino acid ratio). form composition form a stable random coil conformation. 
The random coil polypeptides consisting of at least about 50, This is also demonstrated in the appended examples, where in particular of at least about 100, in particular of at least about random coil structure of proline/alanine (co)-polymers/ 150, in particular of at least about 200, in particular of at least polypeptides is confirmed by experimental techniques like about 250, in particular of at least about 300, in particular of circular dichroism (CD) spectroscopy and size exclusion at least about 350, in particular of at least about 400 proline 10 chromatography (SEC). and alanine amino acid residues are, in accordance with this In contrast to the polypeptides/polymers of the present invention for example comprised in biologically active, het invention, the chemically synthesized polypeptides erologous polypeptides/polypeptide constructs and/or in described, for example, in Izuka (1993), loc. cit. have an drug or food conjugates as well as in other conjugates useful arbitrary/undefined and stochastic sequence and a diverse in further industrial areas, like, but not limited to paper indus 15 length. Thus, the chemically synthesized polypeptides com try, oil industry and the like. prise a mixture of completely different peptides with various Overall, the above features of the polypeptides of the proline/alanine ratios, lengths, and so on. As mentioned in present invention permit the formation of a stable random coil IZuka, the chemically synthesized polypeptides of Such a of the polypeptides and these random coil polypeptides have mixture do not (or only partially) form a random coil and, Surprising and advantageous properties. For example, the accordingly, do not have any of the advantageous properties polypeptides of the present invention are completely soluble of the biosynthetic polypeptides provided and described in aqueous solution and have an increased hydrodynamic herein below. Accordingly, the present invention comprises Volume. 
Unexpectedly, the random coil polypeptides as and relates to compositions comprising the inventive biosyn defined herein are also capable of conferring an increased in thetic random coil polypeptides/polymers as disclosed vivo/invitro stability. This is particularly important for medi 25 herein, whereby said biosynthetic random coil polypeptides/ cal applications, for example, for biologically active proteins polymers are defined, interalfa, by their sequence comprising or drug conjugates comprising the random coil polypeptide of solely proline and alanine residues. In one particular embodi this invention. However, the numerous advantageous proper ment, the present invention relates to conjugates, like drug or ties of the random coil polypeptides of the present invention food conjugates comprising, as one constituent, these random not only permit their use in the medical field but also in other 30 coil polypeptides/polymers disclosed herein. These inventive fields, like in cosmetics/cosmetic treatments or in the fields of biosynthetic random coil polypeptides/polymers comprised nutrition and food technology, like in the dairy industry or in in said compositions are, in one embodiment, of uniform meat processing. Examples of conjugates useful in food length. industry and the like are conjugates that comprise the herein As mentioned above, the biosynthetic random coil disclosed random coil polypeptide or polypeptide segment 35 polypeptides (or random coil polypeptide segments) of this comprising an amino acid sequence consisting Solely of pro invention consisting solely of proline and alanine residues line and alanine amino acid residues and compounds that are unexpectedly form a stable random coil conformation. The useful in these technologies, like, e.g. 
polyoxypropylene or term "random coil” as used herein relates generally to any polyoxyethylene polymers, which are non-ionic Surfactants conformation of a polymeric molecule, including amino acid used as emulsifiers. Also envisaged herein is the use of the 40 polymers/amino acid sequences/polypeptides, in which the biosynthetic random coil polypeptide as defined herein in individual monomeric elements that form said polymeric biochemical methods and in technical processes, such as structure are essentially randomly oriented towards the adja paper production, oil recovery and the like. The Surprising cent monomeric elements while still being chemically bound and advantageous characteristics of the biosynthetic random to said adjacent monomeric elements. In particular, a coil polypeptides consisting merely of proline and alanine 45 polypeptide, amino acid sequence or amino acid polymer residues as provided herein (and as also of the herein dis adopting/having/forming "random coil conformation” Sub closed conjugates and constructs, like drug or food conju stantially lacks a defined secondary and tertiary structure. In gates/constructs, comprising said biosynthetic, true random context of the polypeptides of the present invention, the coil polypeptides) are described below in greater detail. Fur monomeric elements forming the polymeric structure (i.e. the thermore, illustrative uses and means and methods employing 50 polypeptide?amino acid sequence) are either single amino these inventive biosynthetic random coil polypeptides are acids such as proline and alanine perse or peptide stretches provided below. Also means and methods for the production such as the “amino acid repeats'/'amino acid cassettes'/ of Such biosynthetic random coil polypeptides as well as “cassette repeats'/'building blocks/modules” (or frag biologically active, heterologous polypeptides or polypeptide ments thereof) which are described and defined further below. 
constructs and of the herein disclosed conjugates and con 55 The nature of polypeptide random coils and their methods structs, like drug constructs, comprising said random coil of experimental identification are known to the person skilled polypeptides are provided herein. in the art and have been described in the scientific literature In context of this invention, it has been Surprisingly found (Cantor (1980) Biophysical Chemistry, 2nd ed., W.H. Free that proline-alanine polymers/polypeptides with a uniform man and Company, New York: Creighton (1993) Proteins— composition form stable random coil conformation. This is 60 Structures and Molecular Properties, 2nd ed., W.H. Freeman also demonstrated in the appended examples, where random and Company, New York; Smith (1996) Fold Des 1:R95 coil structure of biosynthetic proline/alanine (co)-polymers/ R106). The term “segment” as used herein refers to a part of polypeptides is confirmed by circular dichroism (CD) spec the herein defined biosynthetic random coil polypeptide, troscopy. Obtaining and employing Such biosynthetic, truly whereby Such a part may be an internal part of the biosyn random coil polypeptides/polymers was surprising since the 65 thetic random coil polypeptide described herein. Such a “seg established Chou-Fasman method (Chou and Fasman (1974), ment may be, for example, a biosynthetic random coil Biochemistry 13, 223-245) predicts a 100% C-helical sec polypeptide as defined herein where one (or more) amino US 9,221,882 B2 11 12 acid(s) has/have been deleted, e.g. from the start and/or from 8.5, preferably in a range from 7.0 to 8.0, most preferably in the end of the polypeptide of the invention. 
Furthermore, such a "segment" may be used as or may form part of a larger protein or polypeptide, for example, of a fusion protein with a biologically active protein. Such a "fusion protein" would also be an example of a heterologous, biologically active polypeptide/protein/polypeptide construct of the present invention. The term "heterologous" as used herein is defined herein below.

The random coil polypeptide (or random coil segment thereof), as provided in the present invention and to be employed in context of this invention, adopts/forms random coil conformation, for example, in aqueous solution or at physiological conditions. The term "physiological conditions" is known in the art and relates to those conditions in which proteins usually adopt their native, folded conformation. More specifically, the term "physiological conditions" relates to the biophysical parameters as they are typically valid for higher forms of life and, particularly, in mammals, most preferably human beings. The term "physiological conditions" may relate to the biochemical and biophysical parameters as they are normally found in the body (in particular in body fluids) of mammals and in particular in humans. Said "physiological conditions" may relate to the corresponding parameters found in the healthy body as well as the parameters found under disease conditions or in human patients. For example, a sick mammal or human patient may have a higher, yet "physiological", temperature condition when said mammal or said human suffers from fever. With respect to "physiological conditions" at which proteins adopt their native conformation/state, the most important parameters are temperature (37° C. for the human body), pH (7.35-7.45 for human blood), osmolarity (280-300 mmol/kg H2O), and, if necessary, protein content (66-85 g/l serum). Yet, the person skilled in the art is aware that at physiological conditions these parameters may vary, e.g. the temperature, pH, osmolarity, and protein content may be different in given body or tissue fluids such as blood, liquor cerebrospinalis, peritoneal fluid and lymph (Klinke (2005) Physiologie, 5th ed., Georg Thieme Verlag, Stuttgart). For example, in the liquor cerebrospinalis the osmolarity may be around 290 mmol/kg H2O and the protein concentration may be between 0.15 g/l and 0.45 g/l while in the lymph the pH may be around 7.4 and the protein content may be between 3 g/l and 5 g/l.

When determining whether a polypeptide (or segment thereof)/amino acid sequence forms/adopts random coil conformation under experimental conditions using the methods as described herein below, the biophysical parameters such as temperature, pH, osmolarity and protein content may be different to the physiological conditions normally found in vivo. Temperatures between 1° C. and 42° C. or preferably 4° C. to 25° C. may be considered useful to test and/or verify the biophysical properties and biological activity of a protein under physiological conditions in vitro.

Several buffers, in particular in experimental settings (for example in the determination of protein structures, in particular in CD measurements and other methods that allow the person skilled in the art to determine the structural properties of a protein/amino acid stretch) or in buffers, solvents and/or excipients for pharmaceutical compositions, are considered to represent "physiological solutions/physiological conditions" in vitro. Examples of such buffers are, e.g. phosphate buffered saline (PBS: 115 mM NaCl, 4 mM KH2PO4, 16 mM Na2HPO4 pH 7.4), Tris buffers, acetate buffers, citrate buffers or similar buffers such as those used in the appended examples. Generally, the pH of a buffer representing "physiological solution conditions" should lie in a range from 6.5 to [...] range from 7.2 to 7.7 and the osmolarity should lie in a range from 10 to 1000 mmol/kg H2O, more preferably in a range from 50 to 500 mmol/kg H2O and most preferably in a range from 200 to 350 mmol/kg H2O. Optionally, the protein content of a buffer representing physiological solution conditions may lie in a range from 0 to 100 g/l, neglecting the protein with biological activity itself, whereby typical stabilizing proteins may be used, for example human or bovine serum albumin.

It has been found herein that the polypeptides (or segments thereof) not only form random coil conformation under physiological conditions but, more generally, in aqueous solution. The term "aqueous solution" is well known in the art. An "aqueous solution" may be a solution with a water (H2O) content of at least about 20%, of at least about 30%, of at least about 40%, of at least about 50%, of at least about 60%, of at least about 70%, of at least about 80% or of at least about 90% H2O (weight/weight). Accordingly, the polypeptide (or segment thereof) of the present invention may form random coil conformation in aqueous solution, possibly containing other miscible solvents, or in aqueous dispersions with a wider range of temperatures, pH values, osmolarities or protein content. This is particularly relevant for applications of the random coil polypeptide (or segment thereof) outside medical therapy or in vivo diagnostics, for example in cosmetics, nutrition or food technology.

Accordingly, it is also envisaged in the context of this invention that the random coil conformation of the proline/alanine biosynthetic polypeptide (or segment thereof) of the present invention is maintained in and/or is used in context of pharmaceutical compositions, like liquid pharmaceuticals/biologicals or lyophilized pharmaceutical compositions. This is particularly important in context of the herein provided biologically active, heterologous proteins or the drug conjugates comprising, inter alia, the inventive random coil polypeptide (or polypeptide segment). Preferably, "physiological conditions" are to be used in corresponding buffer systems, solvents and/or excipients. Yet, for example in lyophilized or dried compositions (like, e.g. pharmaceutical compositions/biologicals), it is envisaged that the random coil conformation of the herein provided random coil polypeptide (or polypeptide segment) is transiently not present and/or cannot be detected. However, said random coil polypeptide (or polypeptide segment) will adopt/form again its random coil after reconstitution in corresponding buffers/solutions/excipients/solvents or after administration to the body. Methods for determining whether a polypeptide (or segment thereof) forms/adopts random coil conformation are known in the art (Cantor (1980) loc. cit.; Creighton (1993) loc. cit.; Smith (1996) loc. cit.). Such methods include circular dichroism (CD) spectroscopy as exemplified herein below. CD spectroscopy represents a light absorption spectroscopy method in which the difference in absorbance of right- and left-circularly polarized light by a substance is measured. The secondary structure of a protein can be determined by CD spectroscopy using far-ultraviolet spectra with a wavelength between approximately 190 and 250 nm. At these wavelengths, the different secondary structures commonly found in polypeptides can be analyzed, since α-helix, parallel and anti-parallel β-sheet, and random coil conformations each give rise to a characteristic shape and magnitude of the CD spectrum. Accordingly, by using CD spectrometry the skilled artisan is readily capable of determining whether a polypeptide (or segment thereof) forms/adopts random coil conformation in aqueous solution or at physiological conditions. Other established biophysical methods include nuclear magnetic resonance (NMR) spectroscopy, absorption spectrometry, infrared and Raman spectroscopy, measurement of the hydrodynamic volume via size exclusion chromatography, analytical ultracentrifugation or dynamic/static light scattering as well as measurements of the frictional coefficient or intrinsic viscosity (Cantor (1980) loc. cit.; Creighton (1993) loc. cit.; Smith (1996) loc. cit.).

In addition to the experimental methods above, theoretical methods for the prediction of secondary structures in proteins have been described. One example of such a theoretical method is the Chou-Fasman method (Chou and Fasman, loc. cit.) which is based on an analysis of the relative frequencies of each amino acid in α-helices, β-sheets, and turns based on known protein structures solved, for example, with X-ray crystallography. However, theoretical prediction of protein secondary structure is known to be unreliable. As exemplified herein below, amino acid sequences expected to adopt an α-helical secondary structure according to the Chou-Fasman method were experimentally found to form a random coil. Accordingly, theoretical methods such as the Chou-Fasman algorithm may only have limited predictive value whether a given polypeptide adopts random coil conformation, as also illustrated in the appended examples and figures. Nonetheless, the above described theoretical prediction is often the first approach in the evaluation of a putative secondary structure of a given polypeptide/amino acid sequence. A theoretical prediction of a random coil structure also often indicates that it might be worthwhile verifying by the above experimental means whether a given polypeptide/amino acid sequence has indeed a random coil conformation.

Homo-polymers of most amino acids, in particular the hydrophobic amino acids, are usually insoluble in aqueous solution (Bamford (1956) Synthetic Polypeptides: Preparation, Structure, and Properties, 2nd ed., Academic Press, New York). Homo-polymers of several hydrophilic amino acids are known to form secondary structures, for example α-helix in the case of Ala (Shental-Bechor (2005) Biophys J 88:2391-2402) and β-sheet in the case of Ser (Quadrifoglio (1968) J Am Chem Soc 90:2760-2765) while poly-proline, the stiffest homooligopeptide (Schimmel (1967) Proc Natl Acad Sci USA 58:52-59), forms a type II trans helix in aqueous solution (Cowan (1955) Nature 176:501-503).

Using the theoretical principles of polymer biophysics the random coil diameter of a chain of 200 amino acid residues would amount in the case of poly-glycine, for example, to ca. 75 Å, calculated as the average root mean square end-to-end distance of $\sqrt{\langle r^2 \rangle_0} = l \cdot \sqrt{n \cdot C_\infty}$, with $n$ = 200 rotatable bonds of length $l$ = 3.8 Å for each Cα-Cα distance and the "characteristic ratio" $C_\infty \approx 2.0$ for poly(Gly) (Brant (1967) J Mol Biol 23:47-65; Creighton (1993) loc. cit.). This relation shows that the person skilled in the art would expect that the hydrodynamic volume of a random chain amino acid polymer can be either extended by (a) using a longer chain length $l$ or by (b) using amino acids that exhibit a larger characteristic ratio, $C_\infty$. $C_\infty$ is a measure for the inherent stiffness of the molecular random chain and has a general value of 9 for most amino acids (Brant (1967) loc. cit.). Only Gly, which lacks a side chain, and also the imino acid Pro exhibit significantly smaller values. Hence, Gly and Pro (under denaturing conditions) are expected to contribute to reducing the dimensions of random coil proteins (Miller (1968) Biochemistry 7:3925-3935). Amino acid sequences comprising proline residues, accordingly, are expected to have a relatively compact hydrodynamic volume. In contrast to this teaching, however, it is shown herein that the hydrodynamic volume of the amino acid polymers/polypeptides of the invention that comprise a mixture of proline and alanine residues have a dramatically increased hydrodynamic volume as determined by analytical gel permeation/size exclusion chromatography when compared to the expected hydrodynamic volume. In fact, it is surprising that polypeptides comprising mixtures of these two amino acids (proline and alanine), of which each alone tends to form a homooligopeptide with defined secondary structure, adopt random coil conformation under physiological conditions. Such inventive proline/alanine polypeptides have a larger hydrodynamic radius than homo-polymers comprising the same number of Gly residues, for example, and they confer better solubility to the biologically active proteins or constructs, i.e. biologically active heterologous proteins or drug conjugates, according to the invention.

As mentioned above, the biosynthetic random coil proline/alanine polypeptides of the present invention differ from chemically synthesized polypeptides in that they can adopt a defined, uniform length by easy means and methods. Whereas the prior art provides mixtures/compositions of polypeptides with enormous variations in terms of the length of the peptides, the present invention can provide mixtures/compositions of biosynthetic random coil polypeptides with a defined length. Preferably, essentially all polypeptides of the invention comprised in such a mixture/composition have the same defined length, and, hence, share the same biochemical characteristics. Such a uniform composition is more advantageous in the various medical, cosmetic, nutritional applications, wherein the biosynthetic random coil polypeptides can be employed. Furthermore, in particular in a medical or pharmaceutical context, the herein defined biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting of at least about 50, in particular of at least about 100, in particular of at least about 150, in particular of at least about 200, in particular of at least about 250, in particular of at least about 300, in particular of at least about 350, in particular of at least about 400 proline and alanine amino acid residues can also be used in the prevention, amelioration and/or treatment of disorders linked and/or affiliated with an impaired blood plasma situation, for example after injuries, burns, surgery and the like. One medical use of said biosynthetic random coil polypeptides or polypeptide segments is, accordingly, the use as plasma expander. However, it is of note that in accordance with this invention also the herein described drug conjugates and heterologous polypeptides or heterologous polypeptide constructs may be employed in context of the medical or pharmaceutical intervention of a disorder related to an impaired blood plasma amount or blood plasma content or of a disorder related to an impaired blood volume.

Accordingly, the present invention relates in one embodiment to a biosynthetic random polypeptide (or segment thereof) which comprises an amino acid sequence consisting solely of at least about 50 proline and alanine amino acid residues, of at least about 100 proline and alanine amino acid residues, of at least about 150 proline and alanine amino acid residues or of at least about 200 proline and alanine residues, in particular when comprised in a heterologous protein/polypeptide/polypeptide construct or in a drug conjugate. The present invention also relates to biosynthetic random coil polypeptides which comprise an amino acid sequence consisting solely of at least about 200 proline and alanine amino acid residues, even more preferably of at least about 300 proline and alanine amino acid residues, particularly preferably of at least about 400 proline and alanine amino acid residues, more particularly preferably of at least about 500 proline and alanine amino acid residues and most preferably of at least about 600 proline and alanine amino acid residues. The amino acid sequence forming random coil conformation may consist of maximally about 3000 proline and alanine amino acid residues, of maximally about 2000 proline and alanine amino acid residues, of maximally about 1500 proline and alanine amino acid residues, of maximally about 1200 proline and alanine amino acid residues, of maximally about 800 proline and alanine amino acid residues. Accordingly, the proline/alanine amino acid sequence stretch may consist of about 50, of about 100, of about 150, of about 200, of about 250, of about 300, of about 350, of about 400, of about 500, of about 600, of about 700, of about 800, of about 900 to about 3000 proline and alanine amino acid residues. In certain embodiments, the inventive biosynthetic amino acid sequence comprises about 200 to about 3000 proline and alanine residues, about 200 to about 2500 proline and alanine residues, about 200 to about 2000 proline and alanine residues, about 200 to about 1500 proline and alanine residues, about 200 to about 1000 proline and alanine residues, about 300 to about 3000 proline and alanine residues, about 300 to about 2500 proline and alanine residues, about 300 to about 2000 proline and alanine residues, about 300 to about 1500 proline and alanine residues, about 300 to about 1000 proline and alanine residues, about 400 to about 3000 proline and alanine residues, about 400 to about 2500 proline and alanine residues, about 400 to about 2000 proline and alanine residues, about 400 to about 1500 proline and alanine residues, about 400 to about 1000 proline and alanine residues, about 500 to about 3000 proline and alanine residues, about 500 to about 2500 proline and alanine residues, about 500 to about 2000 proline and alanine residues, about 500 to about 1500 proline and alanine residues, about 500 to about 1000 proline and alanine residues, about 600 to about 3000 proline and alanine residues, about 600 to about 2500 proline and alanine residues, about 600 to about 2000 proline and alanine residues, about 600 to about 1500 proline and alanine residues, about 600 to about 1000 proline and alanine residues, about 700 to about 3000 proline and alanine residues, about 700 to about 2500 proline and alanine residues, about 700 to about 2000 proline and alanine residues, about 700 to about 1500 proline and alanine residues, about 700 to about 1000 proline and alanine residues, about 800 to about 3000 proline and alanine residues, about 800 to about 2500 proline and alanine residues, about 800 to about 2000 proline and alanine residues, about 800 to about 1500 proline and alanine residues, about 800 to about 1000 proline and alanine residues. As is evident from the content of this invention, also larger biosynthetic amino acid sequences (consisting essentially of proline and alanine) are within the scope of this invention and can readily be employed in the herein defined biologically active proteins or protein constructs which comprise as one domain of at least two domains an amino acid sequence having and/or mediating said biological activity and as another domain of at least two domains the biosynthetic random coil polypeptide or polypeptide segment consisting of at least about 50 proline and alanine amino acid residues, of at least about 100 proline and alanine amino acid residues, of at least about 150 proline and alanine amino acid residues, of at least about 200, of at least about 250, of at least about 300, of at least about 350, of at least about 400 proline and alanine amino acid residues. Such a biosynthetic random coil polypeptide or polypeptide segment corresponds to the biosynthetic random coil part of a heterologous protein/protein construct. These biosynthetic proline/alanine stretches consist of maximally about 3000 proline and alanine amino acid residues. These amino acid sequences (proline/alanine stretches) comprise proline and alanine as main or unique residues as explained further below.

It is envisaged that it is the herein defined biosynthetic amino acid sequence consisting solely of proline (P) and alanine (A) amino acid residues, which forms/adopts/has a random coil conformation. In the simplest case, the biosynthetic polypeptide or polypeptide segment consists of the amino acid sequence having a random coil conformation as defined herein.

However, the biosynthetic polypeptide (or segment thereof) may, in addition to the herein described amino acid sequence forming/adopting/having a random coil conformation, comprise further amino acid sequences/amino acid residues which do not contribute to the formation of the random coil conformation or which are not capable of forming/adopting/having a random coil conformation on their own. Without departing from the gist of the invention, also such biosynthetic polypeptides (or segments thereof) are biosynthetic "random coil" polypeptides or polypeptide segments. The further amino acid sequences/amino acid residues may, for example, be useful as linkers. Inter alia, dimers, trimers, i.e. in general multimers of the biosynthetic random coil polypeptide are also envisaged in context of the present invention and such multimers may be linked by amino acid sequences/residues which do not form random coil conformation. An example of a protein which may comprise such a random coil polypeptide is the herein provided biologically active protein, which may, in addition to the random coil polypeptide consisting of proline and alanine amino acid residues as defined herein, further comprise another polypeptide having/mediating biological activity. Again, such a construct may be a heterologous, biologically active protein or polypeptide construct as described herein.

The term "at least about 50/100/150/200/300/400/500/600/700/800/etc. amino acid residues" is not limited to the precise number of amino acid residues but also comprises amino acid stretches that comprise either additional about 1-20%, like 10% to 20%, residues or about 1-20%, like about 10% to 20%, fewer residues. For example "at least about 100 amino acid residues" may also comprise about 80 to 100 and about 100 to 120 amino acid residues without departing from the gist of the present invention. For example "at least about 200 amino acid residues" may also comprise about 160 to 200 and about 200 to 240 amino acid residues without departing from the gist of this invention. The definition and explanations given herein above apply, mutatis mutandis, also to the term "maximally about 3000/2000/1500/1200/800 amino acid residues" etc. Accordingly, the term "about" is not limited or restricted to the precise number of amino acid residues in context of longer amino acid sequences (e.g. amino acid sequences comprising or consisting of maximally 3000 amino acid residues). Therefore, the term "maximally about 3000/2000/1500/1200/800 amino acid residues" may also comprise amino acid stretches that comprise additional 10% to 20% or 10% to 20% fewer residues without departing from this invention.

Furthermore, the biosynthetic random coil polypeptides (or segments thereof) are characterised by a defined content or ratio of amino acid residues, in particular of the main constituents proline and alanine. As mentioned above, the present invention relates to a biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least about 50, of at least about 100, of at least about 150, of at least about 200, of at least about 250, of at least about 300, of at least about 350, of at least about 400 proline (Pro) and alanine (Ala) amino acid residues in particular when comprised in a heterologous biologically active protein/protein construct/polypeptide or drug conjugate. The term "solely" as used herein means that preferably at least about 90% or at least about 95% of the amino acids are proline and alanine, whereby proline and alanine constitute the majority but may not be the only amino acid residues, i.e. these inventive amino acid sequences are not necessarily 100% proline and alanine amino acid stretches. Hence, the biosynthetic polypeptides/amino acid sequences of the present invention may also comprise other amino acids than proline and alanine as minor constituents as long as the amino acid sequence forms/adopts/has random coil conformation. Such a random coil conformation can be easily determined by herein provided means and methods. Accordingly, also in context of the term "solely", a minor amount (less than about 10% or less than about 5%) of other amino acid residues may be comprised. Said "other", minor amino acid residues are defined herein below.

Accordingly, the present invention relates in one embodiment to a biosynthetic random coil polypeptide (or segment thereof) whereby the amino acid sequence consists mainly of proline and alanine, and wherein the proline residues constitute more than about 10% and less than 75% of the amino acid sequence. The alanine residues comprise the remaining at least 25% to 90% of said amino acid sequence (or the random coil polypeptide or polypeptide segment if it consists of the amino acid sequence).

Preferably, the amino acid sequence comprises more than about 10%, preferably more than about 12%, even more preferably more than about 14%, particularly preferably more than about 18%, more particularly preferably more than about 20%, even more particularly preferably more than about 22%, 23% or 24% and most preferably more than about 25% proline residues. The amino acid sequence preferably comprises less than about 75%, more preferably less than 70%, 65%, 60%, 55% or 50% proline residues, wherein the lower values are preferred. Even more preferably, the amino acid sequence comprises less than about 48%, 46%, 44%, 42% proline residues. Particularly preferred are amino acid sequences comprising less than about 41%, 40%, 39%, 38%, 37% or 36% proline residues, whereby lower values are preferred. Most preferably, the amino acid sequence comprises less than about 35% proline residues; see also the herein below provided PA constructs.

Vice versa, the amino acid sequence preferably comprises less than about 90%, more preferably less than 88%, 86%, 84%, 82% or 80% alanine residues, wherein the lower values are preferred. Even more preferably, the amino acid sequence comprises less than about 79%, 78%, 77%, 76% alanine residues, whereby lower values are preferred. Most preferably, the amino acid sequence comprises less than about 75% alanine residues.

Also preferred herein is an amino acid sequence comprising more than about 25%, preferably more than about 30%, even more preferably more than about 35%, particularly preferably more than about 40%, more particularly preferably more than about 45% or 50%, even more particularly preferably more than about 52%, 54%, 56%, 58% or 59% alanine residues, wherein the higher values are preferred. Even more preferably, the amino acid sequence comprises more than about 60%, 61%, 62%, 63% or 64% alanine residues and most preferably more than about 65% alanine residues.

Accordingly, the random coil polypeptide (or segment thereof) may comprise an amino acid sequence consisting of about 25% proline residues and about 75% alanine residues. Alternatively, the random coil polypeptide (or segment thereof) may comprise an amino acid sequence consisting of about 35% proline residues and about 65% alanine residues.

The term "about X %" as used herein above is not limited to the precise number of the percentage, but also comprises values of additional 10% to 20% or 10% to 20% fewer residues. For example the term 10% may also relate to 11% or 12% and to 9% and 8%, respectively.

However, as mentioned above and further detailed herein below, said random coil polypeptide (or polypeptide segment), and, in particular the amino acid sequence, may also comprise additional amino acids differing from proline and alanine as minor constituents. As already discussed herein above, said minor constituent(s), i.e. (an)other amino acid(s) than proline or alanine, may comprise less than about 10%, less than about 9%, less than about 8%, less than about 7%, less than about 6%, less than about 5%, less than about 4%, less than about 3% or less than about 2% of the biosynthetic random coil polypeptide/polymer of this invention.

The skilled person is aware that an amino acid sequence/polypeptide (or segment thereof) may also form random coil conformation when other residues than proline and alanine are comprised as a minor constituent in said amino acid sequence/polypeptide (polypeptide segment). The term "minor constituent" as used herein means that maximally 5% or maximally 10% amino acid residues are different from proline or alanine in the inventive biosynthetic random coil polypeptides/polymers of this invention. This means that maximally 10 of 100 amino acids may be different from proline and alanine, preferably maximally 8%, i.e. maximally 8 of 100 amino acids may be different from proline and alanine, more preferably maximally 6%, i.e. maximally 6 of 100 amino acids may be different from proline and alanine, even more preferably maximally 5%, i.e. maximally 5 of 100 amino acids may be different from proline and alanine, particularly preferably maximally 4%, i.e. maximally 4 of 100 amino acids may be different from proline and alanine, more particularly preferably maximally 3%, i.e. maximally 3 of 100 amino acids may be different from proline and alanine, even more particularly preferably maximally 2%, i.e. maximally 2 of 100 amino acids may be different from proline and alanine and most preferably maximally 1%, i.e. maximally 1 of 100 of the amino acids that are comprised in the random coil polypeptide (or segment thereof) may be different from proline and alanine. Said amino acids different from proline and alanine may be selected from the group consisting of Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Tyr, and Val, including posttranslationally modified amino acids or non-natural amino acids (see, e.g., Budisa (2004) Angew Chem Int Ed Engl 43:6426-6463 or Young (2010) J Biol Chem 285:11039-11044). In case that the "minor constituent" (i.e. an amino acid other than proline and alanine) of the biosynthetic random coil polypeptide/construct/polymer (or a fragment thereof) comprises as "other amino acid"/"different amino acid" (a) Ser(s), said Ser amino acid/Ser amino acids constitute preferably less than 50%, more preferably less than 40%, less than 30%, less than 20% or less than 10% of these (minor) amino acid residues. In a most preferred embodiment, the biosynthetic random coil polypeptide/construct/polymer as described herein or the random coil polypeptide part of a (e.g.) fusion protein as described herein does not comprise (a) serine residue(s). It is, generally, preferred herein that these "minor" amino acids (other than proline and alanine) are not present in the herein provided biosynthetic random coil polypeptide/construct/polymer as described herein or the random coil polypeptide part of a (e.g.) fusion protein. In accordance with the invention, a biosynthetic random coil polypeptide (or segment thereof)/the amino acid sequence may, in particular, consist exclusively of proline and alanine amino acid residues (i.e. no other amino acid residues are present in the random coil polypeptide or in the amino acid sequence).

Whereas the above relates to the overall length and proline/alanine content of the amino acid sequence comprised in the random coil polypeptide (or segment thereof), the following relates in greater detail to the specific, exemplary amino acid sequences (or fragments thereof).

In one embodiment, the amino acid sequences/polypeptides adopting random coil conformation (the random coil polypeptide or segment thereof as defined herein), for example, in aqueous solution or under physiological conditions may comprise a plurality of "amino acid repeats"/"amino acid cassettes"/"cassette repeats", wherein said "amino acid repeats"/"amino acid cassettes"/"cassette repeats"/"building block"/"modules" (these terms are used herein interchangeably) mainly or exclusively consist of proline (Pro, P) and alanine (Ala, A) amino acid residues (depicted herein as "PA", or as "AP"), wherein no more than 6 consecutive amino acid residues are identical. An illustrative "building block" is, e.g. AP and this has also been provided in the appended illustrative examples as functional biosynthetic random coil domain of the present invention. This illustrative example is the sequence "P1A1" as also provided in form of APAPAPAPAPAPAPAPAPAP (SEQ ID NO: 51), i.e. a "poly PA" "amino acid repeat"/"amino acid cassette"/"cassette repeat". In a preferred embodiment, the amino acid sequence/polypeptide comprising the above defined "amino acid repeats"/"amino acid cassettes"/"cassette repeats" and the like comprises no more than 5 identical consecutive amino acid residues. Other alternative embodiments are provided herein below in context of exemplified, individual building blocks.

Within a random coil polypeptide (or segment thereof) according to this invention the amino acid repeats may be identical or non-identical. Non-limiting examples of "amino acid repeats", "building blocks", "modules", "repeats", "amino acid cassettes" etc. consisting of proline and alanine residues are provided herein below; see, e.g. SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 51 (The enclosed sequence listing also comprises illustrative nucleic acid sequences which encode such "repeats"/"modules", etc. The appended sequences in said sequence listing as filed herewith constitute part of this specification and description). Also the use of (identical and/or non-identical) fragments of these sequences is envisaged herein, whereby a "fragment" comprises at least 2 amino acids and comprises at least one proline and/or alanine, preferably at least one proline and one alanine. "Fragments" of these sequences to be employed in accordance with this invention for the generation of the random coil polypeptide (or segment thereof) may consist of at least 3, preferably of at least 4, more preferably of at least 5, even more preferably of at least 6, still more preferably of at least 8, particularly preferably of at least 10, more particularly preferably of at least 12, even more particularly preferably of at least 14, still more particularly preferably of at least 16, and most preferably of at least 18 consecutive amino acids of the amino acid sequence selected from the group consisting of said SEQ ID NOs: 1, 2, 3, 4, 5, 6 and 51 (here it is of note that SEQ ID NO: 51 consists of an illustrative "AP" or "PA" repeat).

Based on the teaching given herein, the person skilled in the art is readily in a position to generate further amino acid sequences/polypeptides that form random coil conformation for example under aqueous or under physiological conditions and are constituted of mainly proline and alanine as defined herein. Further examples of random coil conformation forming amino acid sequences/polypeptides to be used as building blocks or modules of the herein defined random coil polypeptide (or segment thereof) may, inter alia, comprise combinations and/or fragments or circularly permuted versions of the specific "building blocks", "polymer cassettes", or "polymer repeats" shown above. Accordingly, the exemplified modules/sequence units/polymer repeats/polymer cassettes of the random coil polypeptide/amino acid sequence may also provide for individual fragments which may be newly combined to form further modules/sequence units/polymer repeats/polymer cassettes in accordance with this invention.

The terms "module(s)", "sequence unit(s)", "polymer repeat(s)", "polymer cassette(s)" and "building block(s)" are used as synonyms herein and relate to individual amino acid stretches which may be used to form the herein defined random coil polypeptide (or segment thereof)/amino acid sequence.

An amino acid repeat (used as "building block" etc. of a biosynthetic random coil polypeptide of the present invention) may consist of at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more amino acid residues, wherein each repeat comprises (an) proline and alanine residue(s). However, as illustrated in appended SEQ ID NO: 51, said "building block" can also merely consist of the 2 herein provided amino acid residues P and A, namely in form of "PA" or "AP". In one embodiment, the amino acid repeat according to the present invention does not comprise more than 50 amino acid residues. However, it is evident for the skilled artisan that such a "repeat" may comprise even more than 50 amino acid residues, for example in cases wherein said inventive biosynthetic random coil polypeptide/polymer comprises more than about, e.g., 100 amino acids, more than about 150 amino acids, more than about 200 amino acids, etc. Accordingly, the maximal amount of amino acid residues comprised in such a "repeat" is conditioned by the over-all length of the biosynthetic polypeptide (or segment thereof)/polymer as provided herein.

Yet, it is of note that the biosynthetic random coil polypeptides/amino acid sequences comprising the above repeats etc. should preferably have the overall length and/or proline/alanine content as defined and explained herein above, i.e. consist of about 50, of about 100, of about 150, of about 200, of about 250, of about 300, of about 350, of about 400 to about 3000 amino acids and/or comprise more than about 10% and less than about 75% proline residues. All the definitions given herein above in this context also apply here, mutatis mutandis.

As discussed in detail herein and as provided herein above, the present invention provides for (a) biologically active, heterologous protein(s) or (a) protein construct(s) that is/are particularly useful in a pharmaceutical, medical and/or medicinal setting. These biologically active, heterologous proteins/protein constructs comprise as at least one domain of said at least two domains the random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine residues, wherein said amino acid sequence consists of about 50, of about 100, of about 150, of about 200, of about 250, of about 300, of about 350, of about 400 to about 3000 proline (Pro) and alanine (Ala) residues.

In context of the biologically active, heterologous proteins, polypeptides or protein constructs as disclosed herein, the term "heterologous" relates to at least two domains within said proteins, polypeptides or protein constructs wherein a first of said at least two domains confers, has and/or mediates a defined biological activity and wherein a second of said at least two domains comprises the biosynthetic random coil polypeptide consisting solely of proline and alanine amino acid residues and whereby said at least two domains are not found operationally linked to each other in nature or are not encoded by a single coding nucleic acid sequence (like an open reading frame) existing in nature. The biosynthetic random coil polypeptide/polypeptide segment consisting solely of proline and alanine amino acid residues as provided herein [...]

[...] solely of proline and alanine residues" relate to proteins or protein constructs that do not normally occur in nature and, thus, are "heterologous". Furthermore, and in contrast to proline-rich sequences described in the plant kingdom, the biosynthetic random coil polypeptides/polypeptide segments described herein are preferably not chemically modified, i.e. they are preferably not glycosylated or hydroxylated.
A particular advantage of the biosynthetic random coil polypeptides or polypeptide segments of this invention is their intrinsically hydrophilic but uncharged character. Accordingly, as “minor” amino acids (other than proline and alanine) in the herein described biosynthetic random coil polypeptide or polypeptide stretch such amino acids are preferred that do not have hydrophobic side chains, like Val, Ile, Leu, Met, Phe, Tyr or Trp, and/or that do not have charged side chains, like Lys, Arg, Asp or Glu. In accordance with this invention, it is envisaged that (in cases where such individual amino acids are nevertheless comprised in the inventive biosynthetic random coil polypeptide/polypeptide segment) the overall content of each individual amino acid having a hydrophobic side chain, like Val, Ile, Leu, Met, Phe, Tyr or Trp, and/or having a charged side chain, like Lys, Arg, Asp, or Glu,
and as employed in the biologically active, heterologous proteins/protein constructs of this invention are preferably not further (chemically) modified, for example they are preferably neither glycosylated nor hydroxylated.
It is of note that certain naturally occurring proteins or hypothetical proteins as deduced from sequenced nucleic acid stretches found in nature are described as comprising a relatively high (i.e. above average) content of proline and alanine. For example, a homologous hypothetical protein has been described for Leishmania major strain Friedlin (Ivens (2005) Science 309, 436-442). The disclosed reading frame comprising 1514 codon triplets includes a stretch of 412 triplets composed of 240 Ala, 132 Pro, 34 Lys and 4 Val codons. The Lys residues, which are positively charged under physiological buffer conditions, are almost evenly distributed among this sequence, suggesting a solubilizing effect.
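The preference against hydrophobic and charged residues, together with the per-residue ceilings of 8% down to 1% that the description envisages for them, can be expressed as a simple composition check. The sketch below is illustrative only: the function names are hypothetical, and the test string is a composition-only stand-in built from the codon counts quoted for the Leishmania major stretch, not the actual sequence.

```python
from collections import Counter

HYDROPHOBIC = set("VILMFYW")   # Val, Ile, Leu, Met, Phe, Tyr, Trp
CHARGED = set("KRDE")          # Lys, Arg, Asp, Glu


def disfavoured_content(seq: str) -> dict:
    """Fraction of each hydrophobic or charged amino acid that occurs in seq."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq)
            for aa in sorted(HYDROPHOBIC | CHARGED) if counts[aa]}


def within_limit(seq: str, limit: float = 0.08) -> bool:
    """True if no individual disfavoured residue exceeds the given fraction."""
    return all(fraction <= limit for fraction in disfavoured_content(seq).values())


# Stand-in with the quoted counts only (240 Ala, 132 Pro, 34 Lys, 4 Val);
# the residue order is arbitrary and does not reproduce the natural sequence.
stand_in = "A" * 240 + "P" * 132 + "K" * 34 + "V" * 4
```

With these counts the lysine content of the stand-in lies slightly above 8%, so `within_limit(stand_in)` is false at the default limit, while a pure Pro/Ala sequence passes at any limit.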
within the herein defined biosynthetic random coil polypeptide (or segment thereof) does not exceed 8%, 7%, 6%, 5%, 4%, 3%, 2% or 1%.
The biosynthetic random coil polypeptide/amino acid sequences of the present invention may comprise concatemers of individual blocks comprising combined proline/alanine stretches of the sequence (Pro)_x-(Ala)_y, whereby x can have an integer value from 1 to preferably 15, more preferably 1 to 10, even more preferably 1 to 5, and y can have an integer value from 1 to preferably 15, more preferably 1 to 10, even more preferably 1 to 5, and x and y can vary between subsequent blocks. Said x and y can also be an integer of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15.
However, as is evident from the disclosure herein, such a homologous hypothetical protein as deduced from a naturally occurring nucleic acid molecule or open reading frame, comprising a high proline and alanine content above average, is not part of this invention. The invention is based on the fact that a rather large random coil polypeptide or polypeptide segment that does not occur in nature in an isolated manner and that comprises an amino acid sequence consisting solely of proline and alanine residues, wherein said amino acid sequence consists of about 50, of about 100, of about 150, of about 200, of about 250, of about 300, of about 350, of about 400 to about 3000 proline (Pro) and alanine (Ala) residues, is provided that is particularly useful in medical/pharmaceutical context.
The amino acid sequences/polypeptides forming random coil conformation in aqueous solution or under physiological conditions may have the formula (I):
[Pro_x Ala_y]_n
wherein x is independently selected from integer 1 to 5. Furthermore, for each n, y is independently selected from integer 1 to 5. n, finally, is any integer provided that random coil polypeptide (or segment thereof)/amino acid sequence consists preferably of at least about 50, of at least about 100, of at least about 150, of at least about 200, of at least about 250, of at least about 300, of at least about 350, of at least about 400 amino acid residues and up to about 3000 amino acid residues. Also in this context it is of note that the polypeptides/amino acid sequences comprising the above concatemers or having the above formula (I) should preferably have the overall length and/or proline/alanine content as defined and explained herein above, i.e. consist of about 50,
The herein described isolated biosynthetic random coil polypeptides or polypeptide segments that do not occur in nature in an isolated manner are also comprised in the herein disclosed (a) biologically active, heterologous protein(s) or (a) protein construct(s) that is/are particularly useful in a pharmaceutical, medical and/or medicinal setting. These biologically active, heterologous proteins/protein constructs comprise as at least one domain of said at least two domains the random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine residues, wherein said amino acid sequence consists of about 50, of about 100, of about 150, of about 200, of about 250, of about 300, of about 350, of about 400 to about 3000 proline (Pro) and alanine (Ala) residues.
Also, arabinogalactan proteins (AGPs), Pro-rich proteins, and extensins belong to a large group of glycoproteins, known as hydroxyproline (Hyp)-rich glycoproteins (HRGPs), which are expressed throughout the plant kingdom.
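Formula (I), read here as n consecutive blocks of x prolines followed by y alanines with x and y chosen independently for each block, can be illustrated with a short sequence builder. This is a sketch under that reading, not the applicants' method; the helper names and the `max_xy` parameter are hypothetical, with `max_xy=5` reflecting the range given for formula (I) and `max_xy=15` the wider range given for the concatemers.

```python
def build_concatemer(blocks, max_xy=5):
    """Concatenate (Pro)x-(Ala)y blocks; `blocks` is an iterable of (x, y) pairs."""
    parts = []
    for x, y in blocks:
        if not (1 <= x <= max_xy and 1 <= y <= max_xy):
            raise ValueError(f"x and y must lie in 1..{max_xy}, got ({x}, {y})")
        parts.append("P" * x + "A" * y)   # x prolines followed by y alanines
    return "".join(parts)


def has_permitted_length(seq, min_len=50, max_len=3000):
    """Overall length window; the 'about' limits are treated as hard bounds (assumption)."""
    return min_len <= len(seq) <= max_len


uniform = build_concatemer([(1, 1)] * 100)        # x = y = 1 in every block, n = 100
mixed = build_concatemer([(1, 2), (2, 1)] * 20)   # x and y vary between subsequent blocks
```

Here `uniform` is a 200-residue alternating Pro/Ala sequence and `mixed` a 120-residue sequence whose blocks alternate between Pro1Ala2 and Pro2Ala1; both fall inside the length window.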
One such AGP motif comprising an Ala-Pro repeat (AP)51 was expressed as a synthetic glycomodule peptide with N-terminal signal sequence and C-terminal green fluorescent protein in transgenic Arabidopsis thaliana and investigated as a substrate for prolyl hydroxylases and subsequent O-glycosylation of the hydroxyproline residues (Estévez (2006) Plant Physiol. 142, 458-470). Again, the disclosed hydroxylated and/or glycosylated Pro side chains, which can form hydrogen bonds to water molecules, appear to have a solubilizing effect.
It is of note that the herein described “biologically active proteins or protein constructs comprising as (at least) one domain a biosynthetic random coil polypeptide or peptide segment comprising an amino acid sequence consisting
It is evident for the person skilled in the art that also “modules” and (shorter) fragments or circularly permuted versions of the herein provided amino acid stretches may be used as “modules”, “repeats” and/or building blocks for the
of about 100, of about 150, of about 200, of about 250, of about 300, of about 350, of about 400 to about 3000 amino acids and/or comprise more than about 10% and less than about 75% proline residues. Again, all the definitions given herein above in this context also apply here, mutatis mutandis.
The present invention also relates to random coil polypeptides ((a) polypeptide segment(s))/amino acid sequences comprising an amino acid stretch selected from the group consisting of AAPAAPAPAAPAAPAPAAPA (SEQ ID NO: 1); AAPAAAPAPAAPAAPAPAAP (SEQ ID NO: 2); AAAPAAAPAAAPAAAPAAAP (SEQ ID NO: 3 being an example for [Pro_1 Ala_3]_5); AAPAAPAAPAAPAAPAAPAAPAAP (SEQ ID NO: 4); APAAAPAPAAAPAPAAAPAPAAAP (SEQ ID NO: 5); AAAPAAPAAPPAAAAPAAPAAPPA (SEQ ID NO: 6) and APAPAPAPAPAPAPAPAPAP (SEQ ID NO: 51 being an example for [Pro_1 Ala_1]_10) or circular permuted versions or (a) multimer(s) of these sequences as a whole or parts of these sequences.
Accord herein defined random coil polypeptide (or segment thereof)/ ingly, the random coil polypeptide ((a) polypeptide amino acid sequence. segment(s) thereof)/amino acid sequence may comprise the In accordance with the above, the random coil polypeptide? amino acid stretch AAPAAPAPAAPAAPAPAAPA (SEQ ID amino acid sequence forming random coil conformation may NO: 1), AAPAAPAPAAPAAPAPAAPA (SEQ ID NO: 1); comprise a multimer of any of the above amino acid stretches 10 (or circular permuted versions or fragments thereof), prefer AAPAAAPAPAAPAAPAPAAP (SEQ ID NO: 2); AAA ably those shown in SEQID NO: 1, SEQID NO: 2, SEQ ID PAAAPAAAPAAAPAAAP (SEQID NO:3); AAPAAPAA NO:3, SEQID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and PAAPAAPAAPAAPAAP (SEQ ID NO: 4); APAAAPA SEQID NO. 51. It is to note that these sequences are by no PAAAPAPAAAPAPAAAP (SEQ ID NO: 5); means limiting in context of this invention. AAAPAAPAAPPAAAAPAAPAAPPA (SEQID NO: 6) and 15 Again, the polypeptides/amino acid sequences comprising APAPAPAPAPAPAPAPAPAP (SEQ ID NO. 51), as well as the above amino acid stretches (or fragments thereof), circu combinations of these motifs or combinations of fragments lar mutated versions (or fragments thereof) should preferably and parts of this motifs as long as the resulting biosynthetic have the overall length and/or proline/alanine content as random coil polypeptide consists solely of proline and ala defined and explained herein above, i.e. consist of about 50, nine amino acid residues, wherein said amino acid sequence of about 100, of about 150, of about 200, of about 250, of consists of at least 50 proline (Pro) and alanine (Ala) amino about 300, of about 350, of about 400 to about 3000 amino acid residues. acids and/or comprise more than about 10% and less than Also circular permuted versions of the above amino acid about 75% proline residues. 
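Circular permuted versions and multimers of the listed amino acid stretches can be generated programmatically. The following is a minimal sketch with hypothetical helper names; it treats a circular permutation as a rotation of the sequence, i.e. moving one or more residues from the start to the end, which is an assumption about the intended meaning.

```python
SEQ_ID_NO_1 = "AAPAAPAPAAPAAPAPAAPA"
SEQ_ID_NO_51 = "APAPAPAPAPAPAPAPAPAP"


def circular_permutations(seq):
    """Distinct rotations of seq, excluding seq itself, ordered by shift."""
    seen = {seq}
    rotations = []
    for shift in range(1, len(seq)):
        rotated = seq[shift:] + seq[:shift]   # move the first `shift` residues to the end
        if rotated not in seen:
            seen.add(rotated)
            rotations.append(rotated)
    return rotations


def multimer(seq, copies):
    """Head-to-tail repetition of a building block."""
    return seq * copies
```

Applied to SEQ ID NO: 1, the first rotation moves the leading alanine to the end of the sequence. Because SEQ ID NO: 51 is strictly alternating, it has only one distinct rotation, the form beginning with proline. Ten copies of either 20-residue stretch give a 200-residue polypeptide.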
All the definitions given herein above in this context also apply here, mutatis mutandis. Also the term “fragment” has been defined above.
As mentioned above, in context of this invention it was surprisingly found that the biosynthetic random coil polypeptides (or polypeptide segment)/polymers as provided herein are characterized by a relatively large hydrodynamic volume. This hydrodynamic volume, also called apparent size, can easily be determined by analytical gel filtration (also known as size exclusion chromatography, SEC). Preferably, the random coil polypeptide (or segment thereof) has an apparent size of at least 10 kDa, preferably of at least 25 kDa, more preferably of at least 50 kDa, even more preferably of at least 100 kDa, particularly preferably of at least 200 kDa and most preferably of at least 400 kDa. The person skilled in the art is readily capable of determining the hydrodynamic volume of specific proteins. Such methods may include dynamic/static light scattering, analytical ultracentrifugation or analytical gel filtration as exemplified herein below. Analytical gel filtration represents a known method in the art for measuring the hydrodynamic volume of macromolecules. Alternatively, the hydrodynamic volume of a globular polypeptide can be estimated by its molecular weight (Creighton (1993) loc. cit.). As described herein the hydrodynamic volume of the polypeptides of the invention consisting preferably of at least about 50, of at least about 100, of at least about 150, of at least about 200, of at least about 250, of at least about 300, of at least about 350, of at least about 400 to about 3000 proline and alanine amino acid residues and having random coil conformation shows unexpectedly high values in relation to the hydrodynamic volume that would be estimated for a corresponding folded, globular protein based on the molecular weight. The following relates to biologically active, heterologous proteins or protein constructs comprising, inter alia, the biosynthetic random coil polypeptide (or segment thereof) as described and defined herein above which represent a preferred embodiment of the present invention. Without being bound by theory, it was surprisingly found in context of the present invention that the biosynthetic random coil polypeptide stretches as provided herein and consisting solely of proline and alanine can even provide for a higher hydrodynamic volume than a corresponding biosynthetic random coil
sequences may be used in accordance with the present invention. Exemplary circular permuted versions of e.g. AAPAAPAPAAPAAPAPAAPA (SEQ ID NO: 1) can be easily generated, for example by removing the first alanine and adding another alanine at the end of the above sequence. Such a circular permuted version of SEQ ID NO: 1 would then be APAAPAPAAPAAPAPAAPAA (SEQ ID NO: 7). Further, non-limiting examples of circular permuted versions of SEQ ID NO: 1 are:
PAAPAPAAPAAPAPAAPAAA (SEQ ID NO: 8),
AAPAPAAPAAPAPAAPAAAP (SEQ ID NO: 9),
APAPAAPAAPAPAAPAAAPA (SEQ ID NO: 10),
PAPAAPAAPAPAAPAAAPAA (SEQ ID NO: 11),
APAAPAAPAPAAPAAAPAAP (SEQ ID NO: 12),
PAAPAAPAPAAPAAAPAAPA (SEQ ID NO: 13),
AAPAAPAPAAPAAAPAAPAP (SEQ ID NO: 14),
APAAPAPAAPAAAPAAPAPA (SEQ ID NO: 15),
PAAPAPAAPAAAPAAPAPAA (SEQ ID NO: 16),
and the like. Based on the teaching of the present invention, a skilled person is easily in the position to generate corresponding circular permuted versions of the amino acid stretches as shown in SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 51 (said SEQ ID NO: 51 being entirely based on AP repeats and a circular permuted version could be based entirely on “PA” or “AP” repeats/building blocks).
Such circular permuted versions may also be considered as examples of a further “module”/“building block” etc.
of the 65 stretch having the same total number of amino acid residues herein provided polypeptides/amino acid sequences which but consisting solely of proline, alanine and serine (as pro can be used accordingly herein. vided in WO 2008/155134). US 9,221,882 B2 25 26 Common human plasma proteins such as serum albumin modified by the PEGylation process. Furthermore, chemical (HSA) and immunoglobulins (Igs), including humanized coupling with synthetic polymers usually results in a hetero antibodies, show long half-lifes, typically of 2 to 3 weeks, geneous mixture of molecules which may show a substantial which is attributable to their specific interaction with the variance of in vivo activity. neonatal Fc receptor (FcRn), leading to endosomal recycling 5 Third, the use of glycosylation analogs of biologically (Ghetie (2002) Immunol Res, 25:97-113). In contrast, most active proteins in which new N-linked glycosylation consen other proteins of pharmaceutical interest, in particular recom Sus sequences are introduced has been proposed to prolong binant antibody fragments, hormones, interferons, etc. Suffer plasma half-life; see WO 02/02597; Perlman (2003) J Clin from rapid (blood) clearance. This is particularly true for Endocrinol Metab 88:2327-2335; or Elliott (2003) Nat Bio proteins whose size is below the threshold value for kidney 10 technol 21:414-420). The described glycoengineered pro filtration of about 70 kDa (Caliceti (2003) Adv Drug Deliv teins, however, displayed an altered in vivo activity, which Rev 55:1261-1277). In these cases the plasma half-life of an indicates that the new carbohydrate side chains influence the unmodified pharmaceutical protein may be considerably less biological activity of the engineered protein. Moreover, the than one hour, thus rendering it essentially useless for most additional carbohydrate side chains are likely to increase the therapeutic applications. 
In order to achieve Sustained phar 15 antigenicity of the resulting biological active molecules, macological action and also improved patient compliance— which raises Substantial safety concerns. Furthermore, fusion with required dosing intervals extending to days or even proteins comprising the Trypanosoma Cruzi derived artificial weeks—several strategies were previously established for repetitive sequence PSTAD have been reported to induce a purposes of biopharmaceutical drug development. prolonged plasma half-life of trans-Sialidase (Alvarez (2004) First, the recycling mechanism of natural plasma proteins JBC 279:3375-3381). Yet, such Trypanosoma cruzi derived has been employed by producing fusion proteins with the Fc repeats have been reported to induce a humoral immune portion of Igs, for example Enbrel(R), a hybrid between the response (Alvarez (2004) loc. cit.). Accordingly, alternative extracellular domain of TNFC. receptor and human IgG1 strategies to prolong the action of biologically active proteins (Goldenberg (1999) Clin Ther 21:75-87) or with serum albu are desired. min, for example Albuferon(R) (albinterferon alfa-2b, ZAL 25 The biosynthetic amino acid sequences/polypeptides as BINTM, JOULFERONR), a corresponding fusion of IFNal disclosed herein and consisting solely of proline and alanine pha with HSA (Osborn (2002) Pharmacol Exp Ther303:540 according to the invention were Surprisingly found to adopt 548). Albumin with its high plasma concentration of 600 uM random coil conformation in particular under physiological has also been utilized in an indirect manner, serving as carrier conditions. Therefore, they are advantageous molecules to vehicle for biopharmaceuticals that are equipped with an 30 provide for the herein below defined “second domain of the albumin-binding function, for example via fusion with a bac biologically active protein(s)/polypeptide(s), i.e. 
comprising terial albumin-binding domain (ABD) from Streptococcal a polypeptide stretch that forms under physiological condi protein G (Makrides (1996) J Pharmacol Exp Ther 277:534 tions a random coil conformation and thereby mediates an 542) or with a peptide selected against HSA from a phage increased in vivo and/or invitro stability to biologically active display library (Dennis (2002) J Biol Chem, 277:35035 35 (“functional) protein(s) or polypeptide(s), in particular, an 35043; Nguyen (2006) Protein Eng Des Sel 19:291-297). increased plasma half-life. The hydrodynamic volume of a Second, a fundamentally different methodology for pro functional protein that is fused to said random coil domain is longing the plasma half-life of biopharmaceuticals is the con dramatically increased as can be estimated by using standard jugation with highly solvated and physiologically inert methods mentioned herein. Since the random coil domain is chemical polymers, thus effectively enlarging the hydrody 40 thought not to interfere with the biological activity of the first namic radius of the therapeutic proteinbeyond the glomerular domain of the biologically active protein, the biological activ pore size of approximately 3-5 nm (Caliceti (2003) loc. cit.). ity mediated by the functional protein of interest to which it is Covalent coupling under biochemically mild conditions with fused is essentially preserved. 
Moreover, the amino acid poly activated derivatives of poly-ethylene glycol (PEG), either mers/polypeptides that form random coil domain as disclosed randomly via Lys side chains (Clark (1996) J Biol Chem 45 herein are thought to be biologically largely inert, especially 271:21969-21977) or by means of specifically introduced with respect to proteolysis in blood plasma, immunogenicity, Cys residues (Rosendahl (2005) BioProcess International: isoelectric point/electrostatic behaviour, binding to cell sur 52-60) has been moderately successful and is currently being face receptors as well as internalisation, but still biodegrad applied in several approved drugs. Corresponding advantages able, which provides clear advantages over synthetic poly have been achieved especially in conjunction with Small pro 50 mers such as PEG. teins possessing specific pharmacological activity, for In accordance with the above, the present invention relates example Pegasys(R), a chemically PEGylated recombinant to a biologically active protein comprising the herein IFNa-2a (Harris (2003) Nat Rev Drug Discov, 2:214-221; described biosynthetic random coil polypeptide. Such a bio Walsh (2003) Nat Biotechnol 21:865-870). logically active protein/protein construct comprising the bio However, the chemical coupling of a biologically active 55 synthetic random coil polypeptide described herein is a het protein with synthetic polymers has disadvantages with erologous biological active protein/protein construct. In respect to biopharmaceutical development and production. 
particular, herein disclosed is/are also (a) biologically active, Suitable PEG derivatives are expensive, especially as high heterologous protein(s) comprising or consisting of at least purity is needed, and their conjugation with a recombinant two domains wherein protein requires additional in vitro processing and purifica 60 (a) a first domain of said at least two domains comprises or tion steps, which lower the yield and raise the manufacturing consists of an amino acid sequence having and/or mediat costs. In fact, PEG is often contaminated with aldehydes and ing said biological activity; and peroxides (Ray (1985) Anal Biochem 146:307-312) and it is (b) a second domain of said at least two domains comprises or intrinsically prone to chemical degradation upon storage in consists of the herein described and defined random coil the presence of oxygen. Also, the pharmaceutical function of 65 polypeptide or polypeptide segment. a therapeutic protein may be hampered if amino acid side It is of note that in accordance with the present invention chains in the vicinity of its biochemical active site become that “first domain” and said “second domain relate to protein US 9,221,882 B2 27 28 stretches that are not naturally occurring within the same amount of any otheramino acid, also not a Substantial amount protein or that are not expected to be part of the same hypo of serine or no serine at all) do also form a useful random coil thetical protein as encoded by a coding nucleic acid sequence structure. This is particularly unexpected given the disclosure (like an open reading frame) as found in nature. in WO 2008/155134 of fusion proteins with a domain com The definitions and explanations given herein above in posed only of serine and alanine (SA) residues, i.e. 
where context of the random coil polypeptide or polypeptide seg proline residues were omitted, demonstrating that such a ment thereof apply, mutatis mutandis, in the context of bio domain comprising only two types of amino acids did not logically active proteins comprising said random coil form a random coil, but a B-sheet structure. These serine polypeptide (or (a) polypeptide segment(s) thereof). alanine domains did also not show Such an increased hydro Preferably, said random coil conformation mediates an 10 dynamic volume as observed with “PAS' or, in particular, increased in vivo and/or in vitro stability of said biologically with the “P/A sequences as provided herein. active protein, like the in vivo and/or in vitro stability in As used herein, the term “biological activity” describes the biological samples or in physiological environments. biological effect of a Substance on living matter. Accordingly, For example, it is envisaged herein that proteins compris the terms “biologically active protein’ as used herein relate to ing a herein defined, additional 'second domain adopting a 15 proteins that are capable of inducing a biological effect in random coil conformation in aqueous solution or under living cells/organisms that are exposed to said protein or physiological conditions (for example polymers consisting of polypeptide. Yet, it is of note that in the context of the present about 200 or about 400 or about 600 amino acid residues and invention, the term “biologically active protein relates to the comprising PA#1/SEQ ID NO. 1, PA#2/SEQ ID NO. 2, whole protein of the invention which both comprises an PAH3/SEQ ID NO. 3, PAH4/SEQ ID NO. 4, PAH5/SEQ ID amino acid sequence having and/or mediating said biological NO. 5, PA#6/SEQID NO. 6 and/or P1A1/SEQID NO. 
51 as activity (said first domain) and the inventive amino acid “building blocks”) have an advantageous serum stability or sequence adopting/forming random coil conformation and plasma half-life, even in Vivo, (in particular if intravenously consisting Solely of proline and alanine (said second domain). administered) as compared to a control lacking said random Accordingly, the terms "amino acid sequence having and/ coil conformation. 25 or mediating biological activity” or "amino acid sequence In WO 2008/155134 (as discussed herein above) it has with biological activity” as used herein to the above-defined been shown that biologically active proteins which comprise “first domain' of the biologically active protein of the inven a domain with an amino acid sequence adopting a random coil tion, which mediates or has or is capable of mediating or conformation have an increased in vivo and/or in vitro stabil having the above defined “biological activity”. Also com ity. The random coil domains disclosed in WO 2008/155134 30 prised in the terms "amino acid sequence having and/or medi consist, in particular, of proline, alanine, and serine (PAS) ating biological activity” or "amino acid sequence with bio residues. The presence of these three residues is described in logical activity are any proteins of interest (and functional this prior art document as an essential requirement for the fragments thereof. Such as antibody fragments, fragments formation of a stable and soluble random coil in aqueous comprising extracellular or intracellular domain(s) of a mem Solution. 35 brane receptor, truncated forms of a growth factor or cytokine As discussed in the introduction herein above, WO 2007/ and the like), the half-life of which, either in vivo or in vitro, 103515 describes unstructured recombinant polymers which needs to be prolonged. 
comprise as main constituents a large variety of amino acids, inter alia, glycine, aspartate, alanine, serine, threonine, glutamate and proline. However, the term “unstructured recombinant polymer” has, in contrast to the terms “biosynthetic” and “random coil”, no recognized, clear meaning.

Also mentioned herein above was WO 2006/081249. This document describes protein conjugates comprising a biologically active protein coupled to a polypeptide comprising 2 to 500 units of an amino acid repeat having Gly, Asn, and Gln as a major constituent and Ser, Thr, Asp, Gln, Glu, His, and Asn as a minor constituent. Said protein conjugates are described to have either an increased or a decreased plasma half-life when compared to the unconjugated biologically active protein. WO 2006/081249, however, does not provide any teaching to predict whether a specific amino acid repeat reduces or augments the plasma half-life of the conjugate. Moreover, WO 2006/081249 does not teach or suggest that the plasma half-life of proteins can be increased when the conjugated protein comprises an amino acid repeat that forms random coil conformation as shown in the present invention. Furthermore, the amino acid repeats disclosed in WO 2006/081249 comprise at least two residues selected from Gly, Asn, and Gln, which is in clear contrast with the biosynthetic random coil polypeptides of the present invention which comprise an amino acid sequence that solely consists of proline and alanine amino acid residues.

Surprisingly, it has been found herein that biosynthetic random coil amino acid sequences as provided herein which, in contrast to the prior art, solely comprise proline and alanine residues (i.e. which preferably do not comprise a substantial

In one embodiment of this invention, the amino acid sequence having and/or mediating biological activity in accordance with the present invention may be deduced from any “protein of interest”, i.e. any protein of pharmaceutical or biological interest or any protein that is useful as a therapeutic/diagnostic agent.

Accordingly, the biologically active proteins may comprise a first domain comprising a biologically active amino acid sequence which is derived from naturally produced polypeptides or polypeptides produced by recombinant DNA technology. In a preferred embodiment, the protein of interest may be selected from the group consisting of binding proteins/binding molecules, immunoglobulins, antibody fragments, transport proteins, membrane receptors, signaling proteins/peptides such as cytokines, growth factors, hormones or enzymes and the like.

As explained herein above, the random coil polypeptide (or polypeptide segment) comprised in the second domain of the biologically active protein forms the random coil conformation in particular under physiological conditions. This is particularly relevant in context of biologically active proteins which may form part of a pharmaceutical composition that is to be administered to a subject or patient.

It is of note that the inventive biosynthetic random coil domain (said “second domain”) of the biologically active protein natively (i.e. under physiologic conditions) adopts/forms/has random coil conformation, in particular in vivo and when administered to mammals or human patients in need of medical intervention. In contrast, it is known in the art that proteins having a non-random secondary and/or tertiary structure as native conformation tend to adopt a random coil conformation under non-physiological conditions (i.e. under denaturing conditions). However, such denatured proteins have completely different characteristics compared to the biologically active protein comprising the random coil polypeptide of the present invention. Hence, it is the gist of this invention that the “biologically active protein” and the biologically active part of the fusion proteins/fusion constructs as provided herein maintain their biological function also when combined and/or linked with the biosynthetic random coil polypeptide (or polypeptide segment) of this invention.

Furthermore, the random coil polypeptide (or polypeptide segment) retains solubility under physiological conditions. Accordingly, it is also envisaged that the protein construct of the present invention (comprising the above defined “first” and “second domain”) may comprise the “second”, random coil forming/adopting domain transiently or temporarily not in random coil conformation, for example, when in form of a specific composition, like a lyophilisate or dried composition. Yet, it is important that such a “second domain” of the inventive protein construct again adopts after, e.g., reconstitution in corresponding buffers (preferably “physiological” buffers/excipients and/or solvents), the herein defined random coil conformation. Said “second domain” is (if necessary, after corresponding reconstitution) capable of mediating an increased in vivo and/or in vitro stability of the inventive biologically active protein. It is preferred herein that the “second domain” as defined herein consists of the random coil polypeptide (or polypeptide segment) of the present invention.

As used herein, the term “domain” relates to any region/part of an amino acid sequence that is capable of autonomously adopting a specific structure and/or function. In the context of the present invention, accordingly, a “domain” may represent a functional domain or a structural domain. As described herein, the proteins of the present invention comprise at least one domain/part having and/or mediating biological activity and at least one domain/part forming random coil conformation. Yet, the proteins of the invention also may consist of more than two domains and may comprise e.g. an additional linker or spacer structure between the herein defined two domains/parts or another domain/part like, e.g. a protease sensitive cleavage site, an affinity tag such as the His-tag or the Strep-tag, a signal peptide, retention peptide, a targeting peptide like a membrane translocation peptide or additional effector domains like antibody fragments for tumour targeting associated with an anti-tumour toxin or an enzyme for prodrug-activation etc.

In another embodiment, the biologically active protein of the invention has a hydrodynamic volume as determined by analytical gel filtration (also known as size exclusion chromatography, SEC) of at least 50 kDa, preferably of at least 70 kDa, more preferably of at least 80 kDa, even more preferably of at least 100 kDa, particularly preferably of at least 125 kDa and most preferably of at least 150 kDa. The person skilled in the art is readily capable of determining the hydrodynamic volume of specific proteins. Exemplary methods have been described herein above in context of the random coil polypeptide. A skilled person is easily in the position to adapt such methods also in context of the biologically active protein of the present invention. As described herein below, the biologically active proteins of the invention that comprise the above defined second domain, i.e. the domain comprising or consisting of the herein provided random coil polypeptide (or segment thereof), are shown to have an unexpectedly large hydrodynamic volume in relation to the estimated hydrodynamic volume for a corresponding folded, globular protein based on their molecular weight or number/composition of amino acid residues.

It should be noted that the first domain comprising an “amino acid sequence having and/or mediating biological activity” may also adopt its biological activity in the context of or after association with another polypeptide or amino acid sequence. For example, the Fab fragment of an antibody such as the one of the anti-tumour antibody Herceptin (Eigenbrot (1993) J. Mol. Biol. 229:969-995) consists of two different polypeptide chains, the immunoglobulin light chain and a fragment of the immunoglobulin heavy chain, which may furthermore be linked via (an) interchain bond(s). According to the present invention, it may be sufficient to link one of those chains (e.g. via gene fusion) to the random coil polypeptide (or polypeptide segment) while the full biologically active protein is reconstituted by means of association with the other chain. Such reconstitution may be achieved, for example, by co-expression of the different polypeptides (on the one hand a fusion protein of one chain and the random coil polypeptide, on the other hand the other chain) in the same host cell, as described in the appended examples, or by reconstitution in vitro, for example, as part of a refolding protocol. Accordingly, also such proteins (comprising two separate polypeptide chains) are considered as biologically active proteins in accordance with the present invention. In such a case, the first domain as defined herein may comprise two separate polypeptide chains which are linked only non-covalently. Furthermore, the independent chains of the biologically active protein/domain may each be linked to the random coil polypeptide (or polypeptide segment). Besides antibody fragments there are many other homo- or hetero-oligomeric proteins of interest (for example, insulin, hemoglobin and the like) that are composed of several associated polypeptide chains and which are subject to this invention.

As used herein, the term “binding protein” relates to a molecule that is able to specifically interact with (a) potential binding partner(s) so that it is able to discriminate between said potential binding partner(s) and a plurality of different molecules as said potential binding partner(s) to such an extent that, from a pool of said plurality of different molecules as potential binding partner(s), only said potential binding partner(s) is/are bound, or is/are significantly bound. Methods for the measurement of binding between a binding protein and a potential binding partner are known in the art and can be routinely performed, e.g., by using ELISA, isothermal titration calorimetry, equilibrium dialysis, pull down assays, surface plasmon resonance or a Biacore apparatus. Exemplary binding proteins/binding molecules which are useful in the context of the present invention include, but are not limited to antibodies, antibody fragments such as Fab fragments, F(ab')2 fragments, single chain variable fragments (scFv), isolated variable regions of antibodies (VL and/or VH regions), CDRs, single domain antibodies/immunoglobulins, CDR-derived peptidomimetics, lectins, immunoglobulin domains, fibronectin domains, protein A domains, SH3 domains, ankyrin repeat domains, lipocalins or various types of scaffold-derived binding proteins as described, for example, in Skerra (2000) J Mol Recognit 13:167-187, Gebauer (2009) Curr Opin Chem Biol 13:245-255, Binz (2005) Nat Biotechnol 23:1257-1268 or Nelson (2009) Nat Biotechnol 27:331-337.

Other exemplary biologically active proteins of interest (in particular proteins comprised in the first domain or constituting/being the first domain of the biologically active protein) which are useful in the context of the present invention include, but are not limited to, granulocyte colony stimulating factor, human growth hormone, alpha-interferon, beta-interferon, gamma-interferon, lambda-interferon, tumor necrosis factor, erythropoietin, coagulation factors such as coagulation factor VIII, coagulation factor VIIa, coagulation factor IX, gp120/gp160, soluble tumor necrosis factor I and II receptor, thrombolytics such as reteplase, peptides with metabolic effects such as GLP-1 or exendin-4, immunosuppressive/immunoregulatory proteins like interleukin-1 receptor antagonists or anakinra, interleukin-2 and neutrophil gelatinase-associated lipocalin or other natural or engineered lipocalins or those proteins or compounds listed, for example, in Walsh (2003) Nat Biotechnol 21:865-870 or Walsh (2004) Eur J Pharm Biopharm 58:185-196 or listed in online databases such as biopharmadot.com/approvalsdothtml or drugbank dotca. Further biologically active proteins (in particular proteins comprised in the first domain or constituting/being the first domain of the biologically active protein) which may be employed in context of the present invention are, inter alia, follicle-stimulating hormone, glucocerebrosidase, thymosin alpha 1, glucagon, somatostatin, adenosine deaminase, interleukin 11, hematide, leptin, interleukin-20, interleukin-22 receptor subunit alpha (IL-22ra), interleukin 22, hyaluronidase, fibroblast growth factor 18, fibroblast growth factor 21, glucagon-like peptide 1, osteoprotegerin, IL-18 binding protein, growth hormone releasing factor, soluble TACI receptor, thrombospondin-1, soluble VEGF receptor Flt-1, α-galactosidase A, myostatin antagonist, gastric inhibitory polypeptide, alpha-1 antitrypsin, IL-4 mutein, and the like.

As will be evident from the disclosure herein, the present invention also relates to drug constructs comprising the biosynthetic random coil proline/alanine polypeptide or proline/alanine polypeptide segment and pharmaceutically or medically useful molecules, like small molecules, peptides or biomacromolecules such as proteins, nucleic acids, carbohydrates, lipid vesicles and the like. In particular, pharmaceutically or medically useful proteins, like (but not limited to) binding proteins/binding molecules, immunoglobulins, antibody fragments, transport proteins, membrane receptors, signaling proteins/peptides, cytokines, growth factors, hormones or enzymes and the like, may be comprised in the herein defined drug constructs, but they may also be part of the herein defined biologically active, heterologous protein comprising or consisting of said defined at least two domains. In such a case, said particular pharmaceutically or medically useful proteins (or functional fragments thereof) may be the “first domain” of said at least two domains comprising or consisting of an amino acid sequence having and/or mediating said biological activity. Functional fragments, in this context, are fragments of said pharmaceutically or medically useful proteins that are still capable of eliciting the desired biological or pharmaceutical response in vivo and/or in vitro and/or still have or mediate the desired biological activity.

The above-mentioned polypeptide linker/spacer, inserted between said first and said second domains, preferably comprises plural hydrophilic, peptide-bonded amino acids that are covalently linked to both domains. In yet another embodiment said polypeptide linker/spacer comprises a plasma protease cleavage site which allows the controlled release of said first domain comprising a polypeptide having and/or mediating a biological activity. Linkers of different types or lengths may be identified without undue burden to obtain optimal biological activity of specific proteins.

In a preferred embodiment, the biologically active proteins of the present invention are fusion proteins. A fusion protein as described herein may comprise at least one domain which can mediate a biological activity and at least one other domain which comprises the biosynthetic random coil polypeptide (or polypeptide segment) as described herein in a single multi-domain polypeptide. Again, it is of note that the present invention is not limited to fusion proteins wherein one domain mediates a biological activity. Also other “fusion proteins”/“fusion constructs” are provided herein wherein one part/domain is or comprises the inventive random coil polypeptide/polymer of proline/alanine and the other part/domain comprises another protein stretch/structure.

In particular, in the case of fusion proteins the random coil polypeptide (or polypeptide segment) according to this invention does not necessarily carry Pro or Ala residues at its amino or carboxyl terminus. In an alternative embodiment, the biologically active protein in accordance with the present invention may represent a protein conjugate wherein a protein of interest or a polypeptide/polypeptide stretch/peptide/amino acid sequence having and/or mediating biological activity is conjugated via a non-peptide bond to an amino acid sequence which forms/adopts random coil conformation, in particular, the random coil polypeptide (or polypeptide segment) as provided herein and consisting solely of proline and alanine residues. Non-peptide bonds that are useful for cross-linking proteins with the biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least 50 proline (Pro) and alanine (Ala) amino acid residues as provided herein, are known in the art. Such non-peptide bonds may include disulfide bonds, e.g. between Cys side chains, thioether bonds or non-peptide covalent bonds induced by chemical cross-linkers, such as disuccinimidyl suberate (DSS) or sulfosuccinimidyl 4-p-maleimidophenylbutyrate (Sulfo-SMPB), metal-chelating/complexing groups, as well as non-covalent protein-protein interactions.

It is of note that the “biologically active protein” of the present invention may also comprise more than one “amino acid sequence having and/or mediating a biological activity”. Furthermore, the biologically active protein may also comprise more than one biosynthetic random coil polypeptide (or segment thereof). In the simplest case, the biologically active protein consists of two domains, i.e. a first domain comprising an amino acid sequence having and/or mediating a biological activity and a second domain comprising the biosynthetic polypeptide (or segment thereof). It is of note that the present invention is not limited to “biologically or therapeutically active proteins” linked to the herein disclosed biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least 50 proline (Pro) and alanine (Ala) amino acid residues. Also other proteins or molecules of interest, as relevant for other industries, like food or beverage industry, cosmetic industry and the like, may be manufactured by the means and methods provided herein.

The person skilled in the art is aware that the “domain comprising an amino acid sequence having and/or mediating a biological activity” and the “second domain comprising the random coil polypeptide (or segment thereof)” as comprised in the biologically active proteins of the invention may be organized in a specific order.

Accordingly, and in the context of the invention, the order of the herein defined “first” and “second domain” of the inventive biologically active polypeptide may be arranged in an order, whereby said “first domain” (i.e. protein of interest; “amino acid sequence having and/or mediating said biological activity”) is located at the amino (N-) terminus and said “second domain” (i.e. the domain that comprises the herein provided random coil polypeptide (or segment thereof)) is located at the carboxy (C-) terminus of the biologically active protein. However, this order may also be reversed, e.g. said “first domain” (i.e. protein of interest; “amino acid sequence having and/or mediating said biological activity”) is located at the carboxy (C-) terminus and said “second domain” (i.e. the domain that comprises the herein provided random coil polypeptide (or segment thereof)) is located at the amino (N-) terminus of the biologically active protein. If the biologically active protein consists only of one first domain and one second domain, the domain order may, accordingly, be (from N-terminus to C-terminus): first domain (amino acid sequence having and/or mediating a biological activity) - second domain (random coil polypeptide (or segment thereof)). Vice versa, the domain order may be (from N-terminus to C-terminus): second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a biological activity).

It is also envisaged that more than one domain comprising or consisting of an amino acid sequence having and/or mediating said biological activity are to be used in context of the inventive protein construct. For example, the biologically active protein may comprise two “first domains”, i.e. two specific amino acid sequences having and/or mediating a biological activity, whereby this biological activity may be the same or a different activity. If the biologically active protein consists of two such “first domains”, i.e. two specific amino acid sequences having and/or mediating a biological activity, and one “second domain” (comprising the biosynthetic random coil polypeptide (or segment thereof)), the domain order may be (from N-terminus to C-terminus): first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific (optionally different) biological activity).

The same explanations apply in cases where the biologically active protein comprises more than one “second domain” (i.e. the biologically active protein comprises more than one random coil polypeptide (or segment thereof)). If the biologically active protein consists of two such “second domains”, i.e. two domains comprising the biosynthetic random coil polypeptide (or segment thereof), and one “first domain” (comprising an amino acid sequence having and/or mediating a biological activity), the domain order may be (from N-terminus to C-terminus): second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof)). If the biologically active protein comprises more than one “second domain” it is envisaged herein that these “second domains” may be identical or may be different.

As mentioned above, the biologically active protein may comprise more than one “first domain”, i.e. more than one specific amino acid sequence having and/or mediating a biological activity and more than one “second domain” (biosynthetic random coil polypeptide (or segment thereof)) whereby these “first domains” may be identical or different and/or whereby said “second domains” may be identical or different. In such cases the following, exemplary domain orders are conceivable (from N-terminus to C-terminus):

first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof));

second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity) - first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof));

first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof)) - second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity);

second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity);

second domain (random coil polypeptide (or segment thereof)) - second domain (random coil polypeptide (or segment thereof)) - first domain (amino acid sequence having and/or mediating a specific biological activity) - first domain (amino acid sequence having and/or mediating a specific biological activity); or

first domain (amino acid sequence having and/or mediating a specific biological activity) - first domain (amino acid sequence having and/or mediating a specific biological activity) - second domain (random coil polypeptide (or segment thereof)) - second domain (random coil polypeptide (or segment thereof)).

For a person skilled in the art further corresponding domain orders (in particular in cases where more than two “first domains” or more than two “second domains” are comprised in the biologically active protein) are easily conceivable.

As with all embodiments of the present inventive polypeptide/biologically active protein, said domain(s) comprising an amino acid sequence having and/or mediating the said biological activity may also be a biologically active fragment of a given protein with a desired biological function. Therefore, the herein defined “second domain” (preferably comprising the herein provided random coil polypeptide (or segment thereof)) may also be located between two biologically active fragments of a protein of interest or between biologically active fragments of two proteins of interest. All the explanations and definitions given herein above in context of “full length” proteins/polypeptides of interest (i.e. when the amino acid sequence has/mediates a certain biological activity on its own) apply, mutatis mutandis, in context of such fragments.

Again the above invention is not limited to the constructs that comprise a “domain” with a “biological active function”. The constructs of the present invention may also comprise domains with other functions and are not limited to biological activities. These are merely embodiments of the present invention and it is evident for the skilled artisan that other constructs can easily be made and used without departing from the gist of the present invention. Accordingly, the herein said in context of “amino acid sequence having and/or mediating a specific biological activity” applies, mutatis mutandis, for other constructs, for example constructs to be used in other technical fields, like in cosmetics, food processing, dairy products, paper production, etc. As mentioned herein above, the biosynthetic polypeptides/polymers of the present invention can also be used to be linked with e.g. small molecules and the like.

Again, it has to be pointed out that the term “amino acid sequence having and/or mediating first biological activity” is not limited to full-length polypeptides that have and/or mediate said biological activity or function, but also relates to biologically and/or pharmacologically active fragments thereof. Especially, but not only, in a context wherein two or more “first domains” as defined herein are comprised in the inventive “biologically active protein” it is also envisaged that these “first domains” are or represent different parts of a protein complex or fragments of such parts of protein complex.

As exemplified herein below, the biologically active proteins of the invention which are modified to comprise a random coil polypeptide surprisingly exhibit an increased in vivo and/or in vitro stability when compared to unmodified biologically active proteins that lack said random coil domain. As used herein, the term “in vivo stability” relates to the capacity of a specific substance that is administered to the living body to remain biologically available and biologically active. In vivo, a substance may be removed and/or inactivated due to excretion, kidney filtration, liver uptake, aggregation, degradation and/or other metabolic processes. Accordingly, in the context of the present invention biologically active proteins that have an increased in vivo stability may be less rapidly excreted through the kidneys (urine) or via the feces and/or may be more stable against proteolysis, in particular against in vivo proteolysis in biological fluids, like blood, liquor cerebrospinalis, peritoneal fluid, and lymph. In one embodiment, the increased in vivo stability of a biologically active protein manifests in a prolonged plasma half-life of said biologically active protein. In particular, the increased in vivo stability of the biologically active protein is a prolonged plasma half-life of said biologically active protein comprising said second domain when compared to the biologically active protein lacking the second domain.

Methods for measuring the in vivo stability of biologically active proteins are known in the art. As exemplified herein below, biologically active proteins may be specifically detected in the blood plasma using Western blotting techniques or enzyme linked immunosorbent assay (ELISA). Yet, the person skilled in the art is aware that other methods may be employed to specifically measure the plasma half-life of a protein of interest. Such methods include, but are not limited to the physical detection of a radioactively labelled protein of interest. Methods for radioactive labelling of proteins e.g. by radioiodination are known in the art.

The term “increased in vitro stability” as used herein relates to the capacity of a biologically active protein to resist degradation and/or aggregation and to maintain its original biological activity in an in vitro environment. Methods for measuring the biological activity of biologically active proteins are well known in the art.

Furthermore, a drug conjugate is provided which comprises the herein described and defined random coil polypeptide or polypeptide segment and a small molecule drug that is conjugated to said random coil polypeptide or polypeptide segment. Non-limiting examples of the small molecules are digoxigenin, fluorescein, doxorubicin, calicheamicin, camptothecin, fumagillin, dexamethasone, geldanamycin, paclitaxel, docetaxel, irinotecan, cyclosporine, buprenorphine, naltrexone, naloxone, vindesine, vancomycin, risperidone, aripiprazole, palonosetron, granisetron, cytarabine, NX1838, leuprolide, goserelin, buserelin, octreotide, teduglutide, cilengitide, abarelix, enfuvirtide, ghrelin and derivatives, tubulysins, platin derivatives, alpha 4 integrin inhibitors, antisense nucleic acids, small interference RNAs, microRNAs, steroids, DNA or RNA aptamers, peptides, peptidomimetics.

In general, the present invention also relates to drug constructs comprising the herein defined random coil polypeptide or polypeptide segment and in particular pharmaceutically or medically useful molecules, like small molecules, peptides or biomacromolecules such as proteins, nucleic acids, carbohydrates, lipid vesicles and the like. In the appended illustrative experimental part (see, e.g. Example 22) the successful generation of constructs/conjugates of the present invention is provided, also constructs wherein “small chemical molecules” have been conjugated to the herein disclosed random coil polypeptide. Therefore, the present Figures and experimental information in the corresponding figure legends provide for illustrative examples, wherein the herein disclosed drug conjugates comprise (i) a biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least 50 proline (Pro) and alanine (Ala) amino acid residues, and (ii) a small molecule, which is, as illustration, selected from digoxigenin and fluorescein. It is of note that these are not only academic examples. Fluorescein or fluorescein derivatives are commonly used as diagnostics, and medical fluorescein solutions are sold under the trade names Fluorescite®, AK-FLUOR® or Fluress®. Such compounds can certainly profit from the means and methods provided herein. Digoxigenin forms the steroid part of digoxin, a well known secondary plant metabolite with cardioactive function which furthermore contains three digitoxose sugars. Digoxin, and to a lesser extent the closely related compound digitoxin, are widely used for the treatment of ventricular tachyarrhythmias and congestive heart failure (Hauptman (1999) Circulation 99:1265-1270). All cardioactive steroids are potent and highly specific inhibitors of the Na+/K+-ATPase located in the cellular plasma membrane, thereby exerting sympatholytic or positive inotropic effects.

The definitions and explanations given herein above in context of the random coil polypeptide or polypeptide segment thereof apply, mutatis mutandis, in context of drug conjugate comprising the random coil polypeptide (or polypeptide segment thereof) and a drug selected from the group consisting of (a) a biologically active protein or a polypeptide that comprises or that is an amino acid sequence that has or mediates a biological activity and (b) a small molecule drug.

The amino acid polymer forming random coil conformation/the random coil polypeptide (or segment thereof) as defined and provided herein can be conjugated to a small molecule/small molecule drug. By this means, plasma half-life and/or solubility of the small molecule/small molecule drug may be increased, unspecific toxicity may be decreased, and the prolonged exposure of active drug to target cells or structures in the body may result in enhanced pharmacodynamics.

A site-specific conjugation of the N-terminus of the random coil polypeptide with an activated drug derivative, e.g. as N-hydroxysuccinimide (NHS) ester derivative (Hermanson (1996) Bioconjugate Techniques, Academic Press, San Diego, Calif.), is possible. Generally, the N-terminal amino group can be chemically coupled with a wide variety of functional groups such as aldehydes and ketones (to form Schiff bases, which may be reduced to amines using sodium borohydride or sodium cyanoborohydride, for example) or to activated carbonic acid derivatives (anhydrides, chlorides, esters and the like, to form amides) or to other reactive chemicals such as isocyanates, isothiocyanates, sulfonyl chlorides etc. Also, the N-terminus of the amino acid polymer/polypeptide can first be modified with a suitable protective group, for example an acetyl group, a BOC group or an FMOC group (Jakubke (1996) Peptide. Spektrum Akademischer Verlag, Heidelberg, Germany). Furthermore, the amino terminus may be protected by a pyroglutamyl group, which can form from an encoded Gln amino acid residue preceding the Pro/Ala polypeptide or polypeptide segment. After activation of the C-terminal carboxylate group, e.g. using the common reagents EDC (N-(3-dimethylaminopropyl)-N-ethylcarbodiimide) and NHS, site-specific coupling to the C-terminus of the protected random coil polypeptide can be achieved if the drug carries a free amino group, for example.

Alternatively, the N-terminus or the C-terminus of the amino acid polymer forming random coil conformation/the random coil polypeptide can be modified with a commercially available linker reagent providing a maleimide group, thus allowing chemical coupling to a thiol group as part of the drug molecule. In this manner uniform drug conjugates can be easily obtained. Similar techniques, which are well known in the art (Hermanson (1996) loc. cit.), can be used to couple the random coil polypeptide to a peptide or even to a protein drug. Such peptides or proteins can easily be prepared carrying a Lys or Cys side chain, which allows their in vitro coupling to the amino acid polymer forming random coil conformation via NHS ester or maleimide active groups. Generally, similar drug conjugates can be prepared with fusion proteins comprising the random coil polypeptide (or segment thereof). Yet, and as illustrated in the appended Examples and Figures the present invention also provides for the preparation of random coil polypeptides or random coil polypeptide segments as comprised in the innovative conjugates of the present invention.

As an alternative to a single site-specific conjugation the random coil polypeptide may be equipped with additional side chains, at the N- or C-terminus or internally, suitable for chemical modification such as lysine residues with their ε-amino groups, cysteine residues with their thiol groups, or even non-natural amino acids, allowing the conjugation of multiple small molecules using, for example, NHS ester or maleimide active groups.

Apart from stable conjugation, a prodrug may be linked transiently to the random coil polypeptide. The linkage can be designed to be cleaved in vivo, in a predictable fashion, either via an enzymatic mechanism or by slow hydrolysis initiated at physiological pH similarly as, for example, the poorly soluble antitumor agent camptothecin was conjugated to a PEG polymer, thus achieving increased biodistribution, decreased toxicity, enhanced efficacy and tumor accumulation (Conover (1998) Cancer Chemother Pharmacol, 42:407-414).

In another embodiment, the present invention relates to nucleic acid molecules encoding the random coil polypeptides (or segments thereof) or biologically active proteins as described herein. Accordingly, said nucleic acid molecule may comprise a nucleic acid sequence encoding a polypeptide having biological activity and a nucleic acid sequence encoding the random coil polypeptide (or segment thereof). The term “nucleic acid molecule”, as used herein, is intended to include nucleic acid molecules such as DNA molecules and RNA molecules. Said nucleic acid molecule may be single-stranded or double-stranded, but preferably is double-stranded DNA. Preferably, said nucleic acid molecule may be comprised in a vector.

Accordingly, the present invention also relates to a nucleic acid molecule encoding the random coil polypeptide or polypeptide segment as comprised in the conjugates provided herein, like a drug conjugate as defined herein, or a nucleic acid molecule encoding a protein conjugate that comprises a biologically active protein as defined above and that comprises, additionally, a biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least 50 proline (Pro) and alanine (Ala) amino acid residues.

In one embodiment a nucleic acid molecule is provided that encodes a conjugate, like a drug conjugate or food conjugate as defined above, said nucleic acid molecule comprising
(i) a nucleic acid sequence encoding a translated amino acid and/or a leader sequence;
(ii) a nucleic acid sequence encoding a biosynthetic random coil polypeptide or polypeptide segment comprising an amino acid sequence consisting solely of proline and alanine amino acid residues, wherein said amino acid sequence consists of at least 50 proline (Pro) and alanine (Ala) amino acid residues;
(iii) a nucleic acid sequence encoding a biologically active protein or a polypeptide that comprises or that is an amino acid sequence that has or that mediates a biological activity or a protein of interest, like a protein to be employed in other industrial areas like the food industry; and
(iv) a nucleic acid sequence that represents or is a translational stop codon.

The above mentioned “translated amino acid and/or a
Examples for further prodrugs are chemotherapeutic 45 leader sequence’ under (i) may for example be the starting agents like docetaxel (Liu (2008) J Pharm Sci. 97:3274 “M”, i.e. a derived from a corresponding starting 3290), doxorubicin (Veronese (2005) Bioconjugate Chem. codon, it may also comprise non-translated sequences of an 16: 775-784) or paclitaxel (Greenwald (2001) J Control mRNA like the 5' sequence up to a start codon which com Release 74: 159-171). prises for example a ribosome binding site. Such a sequence Furthermore, the small molecule may be coupled to a 50 may however also comprise classical leader and/or signal fusion protein comprising the amino acid polymer/polypep sequences for example for secretion of an expressed protein tide genetically fused to a targeting domain, e.g. an antibody into the periplasm or in a culture medium. Prokaryotic signal fragment, thus resulting in a specific delivery of the Small peptides are for example Omp.A, MalE, PhoA, DsbA, pelB. molecule drug. In the latter case immunotoxins can be easily Afa, npr, STII. Eukaryotic signal peptides are for example generated by conjugation with a cytotoxic Small molecule if 55 Honeybee melittin signal sequence, acidic glycoprotein gp67 the targeting domain is directed against a cell-surface recep signal sequence, mouse IgM signal sequence, hCH signal tor which undergoes internalization, for example. Sequence. In accordance with the above, the present invention also Biologically active proteins or polypeptides that comprises relates, therefore, to the provision of the herein disclosed or that is an amino acid sequence that has or that mediates a biosynthetic random coil polypeptide or polypeptide segment 60 biological activity as well as other proteins of interest, like a comprising an amino acid sequence consisting Solely of pro protein to be employed in other industrial areas, have been line and alanine amino acid residues for further and additional provided herein above. 
Said embodiments apply, mutatis coupling with other compounds of choice. Said further and/or mutantis, for the nucleic acid molece (part/segments (iii)) as additional coupling may be and/or may comprise the first illustrated herein above. coupling of said biosynthetic random coil polypeptide or 65 Translational stop codons to be employed in the nucleic biosynthetic random coil polypeptide segment with or to acid molecule provided herein are well known in the art and another compound. are, e.g. codons UAA, UAG or UGA. US 9,221,882 B2 39 40 In one embodiment of the nucleic acid molecule as pro diagnostically useful polypeptides or Small molecules or, vided herein above said nucleic acid molecule parts/segments inter alia, other useful proteins or small molecules of other (ii) and (iii) are interchanged in their position on said nucleic industrial areas, like food or paper industry or in oil recovery. acid molecule encoding for a conjugate, like a drug conjugate Therefore, the present invention also provide for a nucleic or a food conjugate. 
Such a nucleic acid molecule would acid encoding for a biosynthetic random coil polypeptide or comprise the following order of parts/segments: polypeptide segment comprising an amino acid sequence (i) a nucleic acid sequence encoding a translated amino acid consisting solely of proline and alanine amino acid residues and/or a leader sequence; wherein said amino acid sequence consists of at least 50 (ii) a nucleic acid sequence encoding biologically active pro proline (Pro) and alanine (Ala) amino acid residues, said tein or said a polypeptide that comprises or that is an amino 10 nucleic acid molecule comprising acid sequence that has or that mediates a biological activity (i) a nucleic acid sequence encoding for a translated amino or a protein of interest, like a protein to be employed in acid and/or leader sequence; other industrial areas, like the food industry; (ii) a nucleic acid sequence encoding for said a biosynthetic (iii) a nucleic acid sequence encoding a biosynthetic random random coil polypeptide or polypeptide segment compris coil polypeptide or polypeptide segment comprising an 15 inganamino acid sequence consisting solely of proline and amino acid sequence consisting solely of proline and ala alanine amino acid residues; and nine amino acid residues, wherein said amino acid (iii) a nucleic acid sequence that represents or is a transla sequence consists of at least 50 proline (Pro) and alanine tional stop codon. (Ala) amino acid residues; and Such a nucleic acid molecule may, optionally, comprise, (iv) a nucleic acid sequence that represents or is a transla between (i) and (ii) a protease and/or a chemical cleavage site tional stop codon. and/or a recognition site). 
The nucleic acid molecules as provided herein above may Also for this nucleic acid molecule, the embodiments pro also, optionally, comprise, between parts/segments (i) and (ii) vided herein above in context of the first two described and/or between parts/segments (ii) and (iii), a protease and/or nucleic acid molecules (i.e. a protease and/or a chemical a chemical cleavage site and/or a recognition site. Such 25 cleavage site and/or a recognition), apply here mutatis mutan chemical cleavage sites are well known in the art, and com tis. prise for example specific, individual amino acid sequences Useful and illustrative signal sequences to be employed in (see, e.g. Lottspeich and Engels (Hrsg.) (2006) Bioanalytik. context of this invention comprise, but are not limited, 2. Auflage. Spektrum Akademischer Verlag, Elsevier, prokaryotic sequences like: Omp.A, MalE, PhoA, DsbA, Munchen, Germany). For example, cyanogen bromide or 30 pelB, Afa, npr., STII or eukaryotic sequences like: Honeybee cyanogen chloride cleaves the peptide bond following a Met melittin signal sequence, acidic glycoprotein gp67 signal residue; hydroxylamine cleaves the asparaginyl-glycyl bond; sequence, mouse IgM signal sequence, hCH signal sequence formic acid cleaves Asp-Pro: 2-(2-nitrophenylsulfenyl)-3- In particular the nucleic acid molecule encoding the bio methyl-3-bromoinolenine, 2-iodosobenzoic acid or N-chlo synthetic random coil polypeptide or polypeptide segment roSuccinimide after Trp, 2-nitro-5-thiocyanatobenzoic acid 35 comprising an amino acid sequence consisting Solely of pro after Cys. It is also envisaged and possible that the residue line and alanine amino acid residues of the present invention preceding the Pro/Ala polypeptide or polypeptide segment is useful in methods as also provided herein below and as may be substituted to Met via site-directed mutagenesis and illustrated in the appended examples and figures. Such an the resulting fusion protein can then be cleaved by BrCN. 
expressed random coil polypeptide or random coil polypep Similarly, other amino acid sequences comprising cleavage 40 tide segment can be isolated form, e.g. host cells expressing site can be introduced into the recombinant fusion protein or Such a random coil polypeptide or random coil polypeptide its encoding nucleic acid by way of site-directed mutagenesis. segment. Such host cells may be transfected cells, for Also useful protease recognition/cleavage sites are known example with an vector as provided herein. in the art. These comprise, but are not limited to: trypsin, Therefore, it is envisaged to transfect cells with the nucleic chymotrypsin, enterokinase, Tobacco Etch Virus (TEV) pro 45 acid molecule or vectors as described herein. In a further tease, PreScission protease, HRV 3C Protease, SUMO Pro embodiment, the present invention relates to nucleic acid tease, Sortase A, granzyme B, furin, thrombin, factor Xa or molecules which upon expression encode the random coil self cleavable inteins. Factor Xa hydrolyses the peptide bond polypeptide (or segment thereof) or biologically active pro at the C-terminal end of the amino acid sequence IleGluGl teins of the invention. Yet, in a further embodiment, the y Arg, which may be inserted between the N-terminal fusion 50 present invention relates to nucleic acid molecules which partner and the Pro/Ala polypeptide or polypeptide segment. upon expression encode the herein disclosed polypeptides A particularly simple method to achieve proteolytic cleavage that, entirely or in part, form/adopt random coil conformation would be by insertion or substitution of a Lys or Arg side in aqueous solution or under physiological conditions. 
Said chain at the N-terminal end of the Pro/Ala polypeptide or nucleic acid molecules may be fused to Suitable expression polypeptide segment followed by digest with trypsin, which 55 control sequences known in the art to ensure propertranscrip does not cleave within the Pro/Ala polypeptide or polypeptide tion and translation of the polypeptide as well as signal segment as longas internal Lys or Arg side chains are avoided. sequences to ensure cellular secretion or targeting to Illustrative recognition sites are, without being limited, D-D- organelles. Such vectors may comprise further genes such as D-D-K (enterokinase), ENLYFQ(G/S) (TEV protease), I-(E/ marker genes which allow for the selection of said vector in a D)-G-R (Factor Xa), L-E-V-L-F-Q-G-P(HRV 3C), R-X-(K/ 60 suitable host cell and under suitable conditions. R)-R (Furin), LPXTG (Sortase A), L-V-P-R-G (Thrombin) or Preferably, the nucleic acid molecule of the invention is I-E-X-D-X-G (Granzyme B). comprised in a recombinant vector in which a nucleic acid As is evident form the disclosure herein above, the present molecule encoding the herein described biologically active invention provides for recombinantly produced biosynthetic protein is operatively linked to expression control sequences random coil polypeptides and polypeptide segments that can 65 allowing expression in prokaryotic or eukaryotic cells. be conjugated with molecules of choice, like useful proteins, Expression of said nucleic acid molecule comprises tran pharmaceutically active polypeptides or Small molecules, Scription of the nucleic acid molecule into a translatable US 9,221,882 B2 41 42 mRNA. Regulatory elements permitting expression in cells, invertebrate cells, CHO cells, CHO-K1 cells, HEK 293 prokaryotic host cells comprise, e.g., the lambda PL, lac, trp, cells, Hela cells, COS-1 monkey cells, melanoma cells such tac, ara, phoA, tet or T7 promoters in E. coli. 
Possible regu as Bowes cells, mouse L-929 cells, 3T3 cell lines derived latory elements ensuring expression in eukaryotic cells, pref from Swiss, Balb-c or NIH mice, BHK or HaKhamster cell erably mammalian cells or , are well known to those lines and the like. skilled in the art. They usually comprise regulatory sequences In a further aspect, the present invention comprises meth ensuring initiation of and optionally poly-A sig ods for the preparation of the conjugates of the present inven nals effecting termination of transcription and stabilization of tion as well as the biosynthetic random coil polypeptide (or the transcript. Additional regulatory elements may include segment thereof) or biologically active proteins provided transcriptional as well as translational enhancers, and/or 10 herein and comprising culturing the (host) cell of this inven naturally associated or heterologous promoter regions. tion and isolating said random coil polypeptide (or segment Examples for regulatory elements permitting expression in thereof) or the conjugate or a biologically active protein from eukaryotic host cells are the AOX1 or GAL1 promoters in the culture as described herein. In general, the inventive ran yeast or the CMV, SV40, RSV (Rous sarcoma virus) promot dom coil polypeptide (or segment thereof), the conjugate or ers, CMV enhancer, SV40 enhancer or a globin intron in 15 biologically active protein comprising a random coil domain mammalian and other animal cells. Apart from elements may be produced by recombinant DNA technology, e.g. by which are responsible for the initiation of transcription such cultivating a cell comprising the described nucleic acid mol regulatory elements may also comprise transcription termi ecule or vectors which encode the inventive biologically nation signals. Such as the SV40-poly-A site or the tk-poly-A active protein or random coil polypeptide (or segment site, downstream of the coding region. 
thereof) and isolating said protein/polypeptide from the cul Methods which are well known to those skilled in the art ture. The inventive biologically active protein or random coil can be used to construct recombinant vectors (see, for polypeptide (or segment thereof) may be produced in any example, the techniques described in Sambrook (1989), Suitable cell culture system including prokaryotic cells, e.g. Molecular Cloning: A Laboratory Manual, Cold Spring Har E. coli BL21, KS272 or JM83, or eukaryotic cells, e.g. Pichia bor Laboratory NY and Ausubel (1989), Current Protocols in 25 pastoris, yeast strain X-33 or CHO cells. Further suitable cell Molecular Biology, Green Publishing Associates and Wiley lines known in the art are obtainable from cell line deposito Interscience, NY). In this context, suitable expression vectors ries like the American Type Culture Collection (ATCC). are known in the art Such as Okayama-Berg cDNA expression The term “prokaryotic' is meant to include bacterial cells vector pcDV1 (Pharmacia), pCDM8, pRc/CMV, pcDNA1, while the term “eukaryotic' is meant to include yeast, higher pcDNA3, pPICZalpha A (Invitrogen), or pSPORT1 (GIBCO 30 plant, insect and mammaliancells. The transformed hosts can BRL). Furthermore, depending on the expression system that be grown in fermentors and cultured according to techniques is used, leader sequences capable of directing the polypeptide known in the art to achieve optimal cell growth. In a further to a cellular compartment or secreting it into the culture embodiment, the present invention relates to a process for the medium may be added to the coding sequence of the nucleic preparation of a random coil polypeptide (or segment thereof) acid molecule of the invention. 
35 or a biologically active protein described above comprising The present invention also relates to Vectors, particularly cultivating a cell of the invention under conditions suitable for plasmids, cosmids, viruses, and bacteriophages that are con expression of the biologically active protein or random coil ventionally employed in genetic engineering, comprising a polypeptide (or segment thereof) and isolating said protein/ nucleic acid molecule encoding the random coil polypeptide polypeptide from the cell or the culture medium. (or segment thereof) or the biologically active protein of the 40 The random coil polypeptide (or segment thereof) perse of invention. Preferably, said vector is an expression vector and/ the present invention does, preferably not comprises any or a gene transfer or targeting vector. Expression vectors chemically reactive group, except for, possibly, one N-termi derived from viruses such as retroviruses, vaccinia virus, nal primary (or, in the case of proline, secondary) amino adeno-associated virus, herpes viruses or bovine papilloma group and one carboxylate group at the C-terminus of the virus may be used for delivery of the polynucleotides or 45 polymer. However, it is evident for the skilled artisan that the vector of the invention into targeted cell populations. biosynthetic random coil polypeptide/polymer as provided The vectors containing the nucleic acid molecules of the herein may comprise a chemically reactive group, for invention can be transfected into the host cell by well known example when said random coil polypeptide/polymer is part methods, which vary depending on the type of cell. Accord of a “fusion protein'/fusion construct”. As also described ingly, the invention further relates to a cell comprising said 50 above, the biosynthetic random coil polypeptide (or segment nucleic acid molecule or said vector. 
Such methods, for thereof) can be prepared by recombinant expression in a example, include the techniques described in Sambrook transformed cell in several ways according to methods well (1989), loc. cit. and Ausubel (1989), loc. cit. Accordingly, known to the person skilled in the art, for example: (i) direct calcium chloride transfection or electroporation is commonly expression in the cytoplasm with the help of an N-terminal utilized for prokaryotic cells, whereas calcium phosphate 55 Met residue/start codon; (ii) secretion via an N-terminal sig treatment or electroporation may be used for other cellular nal peptide, for example Omp.A, PhoA (Monteilhet (1993) hosts (Sambrook (1989), loc. cit.). As a further alternative, the Gene. 1993 125:223-228), mellitin (Tessier (1991) Gene 98: nucleic acid molecules and vectors of the invention can be 177-183), interleukin 2 (Zhang (2005) J Gene Med 7: 354 reconstituted into liposomes for delivery to target cells. The 365), hCH (Pecceu (1991) Gene 97(2):253–258) and the like, nucleic acid molecule or vector of the invention which is 60 followed by intracellular cleavage resulting in the mature present in the host cell may either be integrated into the N-terminus, such as Ala or Pro; (iii) expression as a fusion genome of the host cell or it may be maintained extra-chro protein with another soluble protein, e.g., maltose-binding mosomally. Accordingly, the present invention also relates to protein at the N-terminus and with a protease cleavage site a host cell comprising the nucleic acid molecule and/or the interspersed (Kapust and Waugh (2000) Protein Expr. Purif. vector of this invention. Host cells for the expression of 65 19:312-318), followed by specific protease cleavage in vitro polypeptides are well known in the art and comprise prokary or in Vivo, thus releasing the amino acid polymer/polypeptide otic cells as well as eukaryotic cells, e.g. E. coli cells, yeast with its mature N-terminus such, as Ala or Pro. 
Another US 9,221,882 B2 43 44 suitable fusion partner is the SUMO protein, which can be coil polypeptide or polypeptide segment comprising an cleaved by SUMO protease, as described in Examples 20 and amino acid sequence consisting solely of proline and alanine 21. Further fusion partners include, without limitation, glu amino acid residues as well as the isolated conjugate may than tathion-5-transferase, thioredoxin, a cellulose-binding be further processed. For example, said biosynthetic random domain, an albumin-binding domain, a fluorescent protein 5 coil polypeptide or polypeptide segment comprising an (such as GFP), protein A, protein G, an intein and the like amino acid sequence consisting solely of proline and alanine (Malhotra (2009) Methods Enzymol. 463:239-258). amino acid residues may be chemically linked or coupled to As explained above, the random coil polypeptides (or a molecule of interest, as also shown in the appended polypeptide segments)/polymers described consist predomi examples. Furthermore and as an alternative, the molecule of nantly of alanine and proline residues, whereas serine, threo 10 interest may be enzymatically conjugated e.g. via trans nine or , which are required for O- or N-glycosy glutaminase (Besheer (2009) J Pharm Sci. 98.4420-8) or lation, are preferably absent. Thus, the production of the other enzymes (Subul (2009) Org. Biomol. Chem. 7:3361 polypeptide itself or of a biologically active protein compris 3371) to the said biosynthetic random coil polypeptide or ing the random coil polypeptide (or polypeptide segment polypeptide segment comprising an amino acid sequence thereof) or, generally, a fusion protein comprising the random 15 consisting of proline and alanine amino acid residues. 
coil polypeptide (or polypeptide segment thereof) Surpris The random coil polypeptide (or segment thereof) and/or a ingly can result in a monodisperse product preferably devoid protein conjugate comprising random coil polypeptide (or of post-translational modifications within the Pro-Ala segment thereof) and a protein of interest, like a biologically sequence This is an advantage for recombinant protein pro ortherapeutically active protein or a protein to be used in, e.g. duction in eukaryotic cells, like chinese hamster ovarian cells diagnostic methods, can be isolated (inter alia) from the (CHO) or yeast, which are often chosen for the biosynthesis growth medium, cellular lysates, periplasm or cellular mem of complex proteins. For example, yeast has been used for the brane fractions. (Again, the present invention is not limited to production of approved therapeutic proteins such as insulin, (protein) conjugates that are useful in a medical or pharma granulocyte-macrophage colony Stimulating factor, platelet ceutical setting. The means and methods provided herein are derived growth factor or hirudin (Gerngross (2004) Nat. Bio 25 also of use in other industrial areas, like, but not limited to technol. 22:1409-1414). CHO cells have served for the pro food and beverage industry, nutrient industry, paper industry, duction of therapeutic proteins such as coagulation factor IX, bioreagent industry, research tool and reagent industry, indus interferone B-1a, tenecteplase (Chu (2001) Curr. Opin. Bio tries where enzymes are to be used, cosmetic industry, oil technol. 12:180-187) or gonadotropins, where the glycocom processing and oil recovery, and the like). 
The isolation and ponent may positively influence several aspects like func 30 purification of the expressed polypeptides of the invention tional activity, folding, dimerization, Secretion as well as may be performed by any conventional means (Scopes (1982) receptor interaction, signal transduction, and metabolic clear “Protein Purification”, Springer, New York, N.Y.), including ance (Walsh (2006) Nat. Biotechnol. 24:-1241-1252). ammonium Sulphate precipitation, affinity purification, col Accordingly, the preparation of the inventive constructs, ran umn chromatography, gel electrophoresis and the like and dom coil polypeptides and conjugates in eukaryotic expres 35 may involve the use of monoclonal or polyclonal antibodies sion systems is also disclosed in context of the present inven directed, e.g., against a tag fused with the biologically active tion. protein of the invention. For example, the protein can be With the means and methods provided herein it is now purified via the Strep-tag II using streptavidin affinity chro possible to manufacture and provide for the herein disclosed matography (Skerra (2000) Methods Enzymol. 326:271-304) conjugates and molecules comprising (i) a biosynthetic ran 40 as described in the appended examples. Substantially pure dom coil polypeptide or polypeptide segment comprising an polypeptides of at least about 90 to 95% homogeneity (on the amino acid sequence consisting solely of proline and alanine protein level) are preferred, and 98 to 99% or more homoge amino acid residues and (ii) a further molecule of interest, like neity are most preferred, in particular for pharmaceutical a useful protein, a protein segment or a small molecule. The use/applications. 
Depending upon the host cell/organism present invention, therefore, also provides for methods for the 45 employed in the production procedure, the polypeptides of preparation or manufacture of Such conjugates as well as of the present invention may be glycosylated or may be non biosynthetic random coil polypeptides and/or molecules or glycosylated. conjugates comprising the same. Accordingly, the present The invention further relates to the use of the biologically invention also provides for a method for the preparation and/ active protein, the random coil polypeptide (or segment or manufacture of a random coil polypeptide or a random coil 50 thereof) or the conjugates, like the drug conjugates of the polypeptide segment as comprised in the conjugates, like invention, the nucleic acid molecule of the invention, the drug conjugates, food conjugates, diagnostic conjugates and vector of the invention or the (host) cell of the invention for the like. Also methods for the preparation and/or manufacture the preparation of a medicament, wherein said biologically of the biologically active protein or conjugate comprising the active protein or drug (or any other Small molecule or protein random coil polypeptide or the random coil polypeptide seg 55 of interest) has an increased in vivo and/or in vitro stability as ment are provided. Furthermore, methods for the preparation compared to a control molecule that does not comprise or that and/or manufacture and/or for the preparation and/or manu is not linked to a biosynthetic random coil polypeptide or facture of a polypeptide that comprises or that is an amino polypeptide segment comprising an amino acid sequence acid sequence that has or that mediates a biological activity consisting solely of proline and alanine amino acid residues, and that additionally comprises said random coil polypeptide 60 wherein said amino acid sequence consists of at least 50 or random coil polypeptide segment are provided. 
These proline (Pro) and alanine (Ala) amino acid residues. methods, in particular comprise (as one step) the cultivation In yet another embodiment, the present invention relates to of the (host) cell as provided herein above and (as a further a method for the treatment of diseases and/or disorders that step) the isolation of said random coil polypeptide or biologi benefit from the improved stability of said biologically active cally active protein and/or said biologically active protein 65 protein or drug, comprising administering the biologically and/or said polypeptide conjugate from the culture or from active protein or drug conjugate as described herein to a said cell. This isolated random coil, a biosynthetic random mammal in need of Such treatment. Depending on the bio US 9,221,882 B2 45 46 logical activity of the inventive protein or drug conjugate, the -continued skilled person is readily capable of determining which dis ease/disorder is to be treated with a specific biologically Biologically active protein (or a biologically active active protein or drug conjugate of the invention. Some non componentffragment limiting examples are listed in the following Table: 5 hereof) or drug Disorder disease to be treated glucagon-like peptide 1 diabetes Osteoprotegerin cancer, osteoporosis, rheumatoid arthritis Biologically active protein L-18 binding protein rheumatoid arthritis (or a biologically active growth hormone-releasing HIV-associated lipodystrophy componentffragment 10 actor thereof) or drug Disorder disease to be treated soluble TACI receptor systemic lupus erythematosus, multiple Sclerosis, rheumatoid arthritis granulocyte colony cancer and/or chemotherapy related hrombospondin-1 C8C stimulating factor neutropenia soluble VEGF receptor C8C human growth hormone growth hormone deficiency related Flt-1 hypoglycaemia and/or growth failure 15 L-4 mutein (IL-4 receptor asthma interferon C. 
cancer, viral infection, hepatitis C antagonist) interferon B auto-immune disease, multiple Sclerosis cyclosporine organ rejection interferon Y viral infection lumagillin C8C tumor necrosis factor maltrexone alcohol dependence interleukin-20 psoriasis octreotide acromegaly, carcinoid tumors C-galactosidase A Fabry disease eduglutide short bowel syndrome, Crohn's disease myostatin antagonist sarcopenia goserelin advanced prostate cancer, breast cancer gastric inhibitory type 2 diabetes camptothecin C8C polypeptide vancomycin Gram-positive pneumonias alpha-1 antitrypsin enzyme replacement therapy, cystic fibrosis, chronic obstructive pulmonary diseases, acute respiratory syndrome, severe asthma. 25 In accordance with the above, the biologically active pro erythropoietin anaemia tein, the random coil polypeptide (or segment thereof), the coagulation factor VIII haemophilia drug conjugate, the nucleic acid, the vector or the cell may be gp120 gp160 HIV used for the preparation of a medicament which preferably Soluble tumor necrosis inflammatory disease factor I and II receptor has or confers an increased in vivo and/or in vitro stability, in reteplase thrombosis, myocardial infarction 30 particular for the biologically active protein and/or drug com exendin-4 diabetes ponent, for the treatment of hormone deficiencies or related interleukin-1 receptor auto-immune disease, rheumatoid arthritis disorders, auto-immune disease, cancer, anaemia, neovascu antagonist (IL-1 ra: lar diseases, infectious/inflammatory diseases, thrombosis, anakinra) interleukin-2 C8CC myocardial infarction, diabetes, infertility, Gaucher's dis insulin diabetes 35 ease, hepatitis, hypoglycaemia, acromegaly, adenosine asparaginase acute lymphoblastic leukemia, deaminase deficiency, thrombocytopenia, haemophilia, ane non-Hodgkin's lymphoma mia, obesity, Alzheimer's disease, lipodistrophy, psoriasis, OCO8Se. 
malignant mesothelioma and other types of metastatic melanoma, osteoarthritis, dyslipidemia, rheuma C8CC streptokinase thrombotic disorders toid arthritis, systemic lupus erythromatosis, multiple Sclero neutrophilgelatinase microbial infection, kidney reperfusion 40 sis, asthma, osteoporosis, and reperfusion injury or other associated lipocalin injury kidney diseases, for example. In one embodiment, the bio antibodies and their immunological, oncological, neovascular, logically active protein, the drug conjugate the nucleic acid, ragments, including single and infectious diseases etc. the vector or the cell is for the use as a medicament which has domain antibodies, single chain and other engineered an increased in vivo and/or in vitro stability of said biologi ragments including CDR 45 cally active protein/drug conjugate. Similarly, the biologi mimetic peptides and CDRs cally active protein, the random coil polypeptide (or segment granulocyte-macrophage chemotherapy related neutropenia thereof), the drug conjugate, the nucleic acid, the vector or the colony-stimulating factor cell are for use in the treatment of for the treatment of hor ollicle-stimulating infertility mone deficiencies or related disorders, auto-immune disease, OOle 50 glucocerebrosidase Gaucher's disease proliferative disorders, like cancer, anaemia, neovascular dis hymosin alpha 1 chronic hepatitis B, cancer eases, infectious and/or inflammatory diseases, thrombosis, glucagon hypoglycemia myocardial infarction stroke, diabetes, infertility, penile dys Somatostatin acromegaly function, Gaucher's disease, Fabry disease, sarcopenia, cys adenosine deaminase adenosine deaminase deficiency tic fibrosis, obstructive pulmonary diseases, acute respiratory interleukin-11 thrombocytopenia 55 syndrome, hepatitis, hypoglycaemia, acromegaly, adenosine coagulation factor VIIa haemophilia coagulation factor IX hemophilia deaminase deficiency, thrombocytopenia, haemophilia, ane hematide anemia mia, obesity, 
Alzheimer's disease, lipodistrophy, psoriasis, interferone, hepatitis C metastatic melanoma, osteoarthritis, dyslipidemia, rheuma eptin lipodystrophy, obesity, Alzheimer's disease, toid arthritis, systemic lupus erythromatosis, multiple Sclero type I diabetes 60 sis, asthma, osteoporosis, and reperfusion injury or other interleukin-22 receptor psoriasis kidney diseases, for example. Subunit alpha (IL-22ra) interleukin 22 metastatic melanoma The present invention also relates to the use of the nucleic hyaluronidase solid tumors acid molecules, vectors as well as transfected cells as pro fibroblast growth factor 18 osteoarthritis vided herein and comprising the nucleic acid molecules or fibroblast growth factor 21 diabetes type II, obesity, dyslipidemia, 65 vectors of the present invention in medical approaches, like, metabolic disorders e.g. cell based gene therapy approaches or nucleic acid based gene therapy approaches. US 9,221,882 B2 47 48 In a further embodiment, the random coil polypeptide (or water, emulsions, such as oil/water emulsions, various types polypeptide segment thereof) as provided herein, the biologi of wetting agents, sterile solutions etc. cally active, heterologous protein/protein construct or the Compositions comprising Such carriers can be formulated drug or food conjugate or other conjugates that comprise the by well known conventional methods. Suitable carriers may biosynthetic random coil polypeptide (or polypeptide seg comprise any material which, when combined with the bio ment thereof) and/or the nucleic acid molecule or the vector logically active protein/drug conjugate of the invention, or the host cell of the present invention) is part of a compo retains its biological and/or pharmaceutical activity (see sition. Said composition may comprise one or more of the Remington’s Pharmaceutical Sciences (1980) 16th edition, inventive random coil polypeptides (or polypeptide segments Osol, A. Ed, Mack Publishing Company, Easton, Pa.). 
Prepa 10 rations for parenteral administration may include sterile thereof), biologically active proteins, food conjugates, con aqueous or non-aqueous solutions, Suspensions, and emul jugates of interest, drug conjugates or nucleic acid molecules, sions. The buffers, Solvents and/or excipients as employed in vectors and/or host cells encoding and/or expressing the context of the pharmaceutical composition are preferably same. Said composition may be a pharmaceutical composi “physiological as defined herein above. Examples of non tion, optionally further comprising a pharmaceutically 15 aqueous solvents are propylene glycol, polyethylene glycol, acceptable carrier and/or diluent. In a further embodiment, Vegetable oils such as olive oil, and injectable organic esters the present invention relates to the use of the herein described Such as ethyl oleate. Aqueous carriers include water, alco biologically active protein, the random coil polypeptide (or holic/aqueous solutions, emulsions or Suspensions, including segment thereof) or the drug conjugate for the preparation of saline and buffered media. Parenteral vehicles may include a pharmaceutical composition for the prevention, treatment Sodium chloride solution, Ringer's dextrose, dextrose and or amelioration of diseases which require the uptake of Such Sodium chloride, lactated Ringers, or fixed oils. Intravenous a pharmaceutical composition. vehicles may include fluid and nutrient replenishes, electro As mentioned herein above, not only the herein disclosed lyte replenishers (such as those based on Ringer's dextrose), conjugates, like drug conjugates or diagnostic conjugates, and the like. Preservatives and other additives may also be and/or biologically active, heterologous proteins/protein con 25 present, including antimicrobials, anti-oxidants, chelating structs (comprising the inventive random coil polypeptide or agents and/or inert gases and the like. 
In addition, a pharma polypeptide segment thereof) are in particular of medical or ceutical composition of the present invention might comprise pharmaceutical use. Also said random coil polypeptide or proteinaceous carriers, like, e.g., serum albuminor immuno polypeptide segment may be perse employed in Such a medi globulin, preferably of human origin. cal context, for example as "plasma expander or as blood 30 These pharmaceutical compositions can be administered to Surrogate, in the amelioration, prevention and/or treatment of the Subject at a Suitable dose. The dosage regimen will be a disorder related to an impaired blood plasma amount or determined by the attending physician and clinical factors. As blood plasma content or in the amelioration, prevention and/ is well known in the medical arts, dosages for any one patient or treatment of a disorder related to an impaired blood vol depend upon many factors, including the patient's size, body ume. Disorders that are treated with plasma expanders are, 35 Surface area, age, the particular compound to be adminis but not limited to, disorders affiliated with blood loss, like tered, sex, time and route of administration, general health, injuries, Surgeries, burns, trauma, or abdominal emergencies, and other drugs being administered concurrently. Pharma infections, dehydratations etc. Yet, such a medical use is not ceutically active matter may be present in amounts between 1 limited to the random coil polypeptide or polypeptide seg ug and 20 mg/kg body weight per dose, e.g. between 0.1 mg ment of this invention but can also be extended to certain drug 40 to 10 mg/kg body weight, e.g. between 0.5 mg to 5 mg/kg conjugates as disclosed herein or even to certain biologically body weight. If the regimen is a continuous infusion, it should active, heterologous proteins/protein constructs. also be in the range of 1 Jug to 10 mg per kilogram of body In one embodiment, the composition as described herein weight per minute. 
Yet, doses below or above the indicated may be a diagnostic composition, for example an imaging exemplary ranges also are envisioned, especially considering reagent, optionally further comprising Suitable means for 45 the aforementioned factors. detection, wherein said diagnostic composition has an Furthermore, it is envisaged that the pharmaceutical com increased in vivo and/or in vitro stability. position of the invention might comprise further biologically The compositions of the invention may be in solid or liquid orpharmaceutically active agents, depending on the intended form and may be, inter alia, in a form of (a) powder(s), (a) use of the pharmaceutical composition. These further biologi tablet(s), (a) solution(s) or (an) aerosol(s). Furthermore, it is 50 cally or pharmaceutically active agents may be e.g. antibod envisaged that the medicament of the invention might com ies, antibody fragments, hormones, growth factors, enzymes, prise further biologically active agents, depending on the binding molecules, cytokines, chemokines, nucleic acid mol intended use of the pharmaceutical composition. ecules and drugs. Administration of the Suitable (pharmaceutical) composi It is of note that the present invention is not limited to tions may be effected by different ways, e.g., by parenteral, 55 pharmaceutical compositions. Also compositions to be used Subcutaneous, intravenous, intraarterial, intraperitoneal, topi in research or as diagnostic(s) are envisaged. It is, for cal, intrabronchial, intrapulmonary and intranasal adminis example, envisaged that the biologically active proteins or tration and, if desired for local treatment, intralesional admin drug conjugates comprising a random coil domain or com istration. Parenteral administrations include intraperitoneal, ponent as defined herein, are used in a diagnostic setting. For intramuscular, intradermal. 
Subcutaneous, intravenous or 60 Such a purpose, the inventive biologically active protein or intraarterial administration. The compositions of the inven drug conjugate of this invention may be labelled in order to tion may also be administered directly to the target site, e.g., allow detection. Such labels comprise, but are not limited to, by biolistic delivery to an external or internal target site, like radioactive labels (like Hhydrogen ''Iliodide or 'I a specifically effected organ. iodide), fluorescent labels (including fluorescent proteins, Examples of Suitable pharmaceutical carriers, excipients 65 like green fluorescent protein (GFP) or fluorophores, like and/or diluents are well known in the art and include phos fluorescein isothiocyanate (FITC)) or NMR labels (like gado phate buffered saline solutions or other buffer solutions, linium chelates). The here defined labels or markers are in no US 9,221,882 B2 49 50 way limiting and merely represent illustrative examples. The The kit of the present invention may be advantageously diagnostic compositions of this invention are particularly use used, interalia, for carrying out the method of the invention ful in tracing or imaging experiments or in a diagnostic medi and could be employed in a variety of applications referred cals setting. In the appended Examples and Figures, the herein, e.g., as diagnostic kits, as research tools or as medical preparation of a corresponding construct is provided that tools. Additionally, the kit of the invention may contain comprises conjugates comprising (i) a biosynthetic random means for detection suitable for scientific, medical and/or coil polypeptide or polypeptide segment comprising an diagnostic purposes. The manufacture of the kits preferably amino acid sequence consisting solely of proline and alanine follows standard procedures which are known to the person amino acid residues, wherein said amino acid sequence con skilled in the art. 
sists of at least 50 proline (Pro) and alanine (Ala) amino acid 10 The invention is further illustrated by the following, non residues, and (ii) fluorescein or digoxigenin; see appended limiting Figures and Examples. FIGS. 13 and 14 and the corresponding figure legend as well FIG. 1. Gene design for the PA#1 Pro/Ala polymer/ as illustrative Example 22. polypeptide sequence. But not only pharmaceutical or diagnostic uses of the 15 Nucleotide and encoded amino acid sequence of a building means and methods provided herein are within the gist of the block for PA#1 (SEQID NO: 1) obtained by hybridization of present invention. The compounds/conjugates provided two complementary oligodeoxynucleotides (upper/coding herein are also useful in certain other industrial areas, like in strand oligodeoxynucleotide SEQ ID NO: 17, lower/non the food industry, the beverage industry, the cosmetic indus coding strand oligodeoxynucleotide SEQ ID NO: 18). The try, the oil industry, the paper industry and the like. Therefore, resulting nucleic acid has two sticky ends (shown in lower the present invention also provides for uses of the biosyn case letters), corresponding to an Ala codon and anti-codon, thetic random coil polypeptide as provided herein in Such respectively, and are mutually compatible. Upon repeated industrial areas. Also part of this invention is, accordingly, a ligation of such a building block, concatamers encoding Pro method for the production of a cosmetic, of a compound to be Ala polypeptides of varying lengths can be obtained and used in cosmetic treatments, of a food or of a beverage, said 25 Subsequently cloned, for example, via (a) Sap restriction method comprising the culture of the cell comprising a site(s). nucleic acid molecule (or a vector) encoding a random coil FIG. 2. 
Cloning strategies for a Pro/Ala polymer/polypep polypeptide as defined herein or encoding a biologically tide sequence as fusion to a Fab fragment or to human active protein and/or a biologically active protein and/or a IFNa2b. polypeptide that comprises or that is an amino acid sequence 30 (A) Nucleotide and encoded amino acid sequence stretch that has or that mediates an activity. Such a method also (upper/coding strand SEQ ID NO: 19, lower/non-coding includes the isolation of said random coil polypeptide, said strand SEQIDNO:20, encoded amino acid sequence SEQID biologically active protein and/or said biologically active pro NO: 21) around the C-terminus of the immunoglobulin light tein or said polypeptide that comprises or that is an amino acid chain of an antibody Fab fragment as encoded on pASK88 sequence that has or that mediates an activity, like a biological 35 Fab-2XSapI (SEQID NO: 22), a derivative of p ASK75, used activity, and that additionally comprises said random coil for Subcloning of Pro/Ala polymer/polypeptide sequences polypeptide or random coil polypeptide segment from the and expression of corresponding biologically active proteins. culture or from said cell. In the same context, other conjugates The nucleotide sequence carries two Sap recognition sites in of interest can be produced, for example conjugates which are mutually reverse orientation, which leads upon digest to pro useful in different areas of industry, like in the oil or paper 40 truding DNA ends that are compatible with the synthetic gene industry. The person skilled in the art is readily in a position cassette shown in FIG.1. The recognition sequences and the to adapt the herein provided means and methods for the C-terminal amino acids of the light chain are underlined. 
generation of corresponding molecular/recombinant con (B) Nucleotide sequence and encoded amino acid structs as well as for the generation of conjugates that com sequence (upper/coding strand SEQID NO: 23, lower/non prise a biosynthetic random coil polypeptide or polypeptide 45 coding strand SEQID NO: 24, encoded amino acid sequence segment comprising an amino acid sequence consisting SEQ ID NO: 25) of a PA#1 polymer/polypeptide with 20 solely of proline and alanine amino acid residues, wherein residues after insertion of a single cassette as shown in FIG. 1 said amino acid sequence consists of at least 50 proline (Pro) into the paSK88-Fab-2XSapI plasmid. Similar ligation/inser and alanine (Ala) amino acid residues, and a small molecule tion of 10 such repeated cassettes resulted in the plasmid or a polypeptide of interest. 50 vector pFab-PA#1 (200) (Seq ID NO: 28) coding for a poly In yet another embodiment, the present invention provides mer/polypeptide with 200 residues (SEQID NO: 26 and 27). for a kit comprising the random coil polypeptide (or polypep The Sap restriction sites flanking the Pro/Ala polymer-en tide segment thereof), the biologically active protein, the drug coding sequence are labelled (recognition sequences are conjugate, the nucleic acid molecule encoding said biologi underlined). cally active protein encoding said biologically active protein 55 (C) Plasmid map of pFab-PA#1 (200) (SEQ ID NO: 28). and/or encoding said biologically active protein and/or The structural genes for the heavy chain (HC) and light chain encoding said polypeptide that comprises or that is an amino (LC) of the Fab-PA#1 (200) are under transcriptional control acid sequence that has or that mediates an activity (for of the tetracycline promoter/operator)(tet") and the operon example a biological activity), the vector comprising said ends with the lipoprotein terminator (t). 
HC comprises the nucleic acid molecule or the cell comprising said nucleic acid 60 bacterial Omp A signal peptide, the variable (VH) and the first or said vector as described herein. The kit of the present human IgG1 heavy chain constant C domain (CH) as well as invention may further comprise, (a) buffer(s), storage solu the His-tag. LC comprises the bacterial PhoA signal peptide, tions and/or additional reagents or materials required for the the variable (VL) and human light chain constant (CL) conduct of medical, scientific or diagnostic assays and pur domain, the PAit 1 polymer/polypeptide with 200 residues. poses. Furthermore, parts of the kit of the invention can be 65 The plasmid backbone of pFab-PA#1 (200) outside the packaged individually in vials or bottles or in combination in expression cassette flanked by the Xbal and HindIII restric containers or multicontainer units. tion sites is identical with that of the generic cloning and US 9,221,882 B2 51 52 expression vector pASK75 (Skerra (1994) Gene 151:131 of the Pro/Ala polymer/polypeptide segment because the Fab 135). Singular restriction sites are indicated. fragment itself, with a calculated mass of 48.0 kDa, or its (D) Nucleotide and amino acid sequence stretch (upper/ unfused light chain exhibit essentially normal electrophoretic coding strand SEQID NO: 29, lower/non-coding strand SEQ mobility. ID NO:30, encoded amino acid sequence SEQID NO: 31) (B) Analysis of the purified recombinant IFNa2b and its around the N-terminus of human IFNa2b as cloned on p ASK PA#1 fusion protein with 200 residues by 12% SDS-PAGE. IFNa2b (SEQID NO:32). The single restriction site SapI that The gel shows 2 ug protein samples each of IFNa2b and of can be used for insertion of the Pro/Ala polymer-encoding PA#1 (200)-IFNa2b. Samples on the left were reduced with sequence is labelled (recognition sequence is underlined). 
2-mercaptoethanol whereas corresponding samples on the The two C-terminal amino acids of the Strep-tag II are under 10 right were left unreduced. Sizes of protein markers—applied lined. The first amino acid of the mature IFNa2b is labelled under reducing conditions—are indicated on the left margin. with +1. The two proteins appear as single homogeneous bands with (E) Nucleotide and encoded amino acid sequence stretch apparent molecular sizes of ca. 20 kDa and ca. 80 kDa in the (upper/coding strand SEQ ID NO: 33, lower/non-coding reduced form. The latter value is significantly larger than the strand SEQID NO:34, encoded amino acid sequence SEQID 15 calculated mass of 37.0 kDa for PAii.1(200)-IFNa2b. This NO: 35) of the N-terminus of IFNa2b after insertion of one effect is due to the addition of the Pro/Ala polymer/polypep PA#1 polymer sequence cassette as shown in FIG. 1. The tide segment because the IFNa2b itself, with a calculated single restriction site Sap, that remains after insertion of the mass of 20.9 kDa, exhibits essentially normal electrophoretic Prof Ala polymer-encoding sequence, is labelled (recognition mobility. IFNa2b in the non-reduced state has a slightly sequences are underlined). The first amino acid of IFNa2b as higher electrophoretic mobility, indicating a more compact part of the fusion protein is labelled (1) and the two C-termi form as a result of its two intramolecular disulfide bridges. nal amino acids of the Strep-tag II are underlined. Similar FIG. 4. Quantitative analysis of the hydrodynamic volumes ligation/insertion of 10 repeated PAH1 polymersequence cas of the purified recombinant Fab and IFNa2b as well as their settes resulted in the plasmid vector pPA#1 (200)-IFNa2b PA#1 (200) fusions. coding for a polymer/polypeptide with 200 residues (SEQID 25 (A) Analytical size exclusion chromatography (SEC) of NO:36) Fab and Fab-PAii.1(200). 
250 ul of the purified protein at a (F) Plasmid map of pPA#1 (200)-IFNa2b (SEQ ID NO: concentration of 0.25 mg/ml was applied to a Superdex S200 37). The structural gene for biologically active protein PAi1 10/300 GL column equilibrated with PBS buffer. Absorption (200)-IFNa2b (comprising the bacterial Omp A signal pep at 280 nm was monitored and the peak of each chromatogra tide, the Strep-tag II, the PA#1 polymer/polypeptide segment 30 phy run was normalized to a value of 1. The arrow indicates with 200 residues, and human IFNa2b) is under transcrip the void volume of the column (7.8 ml). tional control of the tetracycline promoter/operator)(tet") (B) Calibration curve for the chromatograms from (A) and ends with the lipoprotein terminator (t). The plasmid using a Superdex 5200 10/300 GL column. The logarithm of backbone outside the expression cassette flanked by the Xbal the molecular weight (MW) of marker proteins (cytochrome and HindIII restriction sites is identical with that of the 35 c. 12.4 kDa; carbonic anhydrase, 29.0 kDa, ovalbumin, 43.0 generic cloning and expression vector pASK75 (Skerra kDa; bovine serum albumin, 66.3 kDa, alcohol dehydroge (1994) loc. cit.). Singular restriction sites are indicated. nase, 150 kDa, B-amylase, 200 kDa, apo-ferritin, 440 kDa) FIG. 3. Analysis of the purified recombinant Fab fragment was plotted vs. their elution volumes (black circles) and fitted and the purified recombinant IFNa2b as well as their Pro/Ala by a straight line. From the observed elution volumes of the polypeptide/polymer fusions by SDS-PAGE. 40 Fab fragment and its PAH1 fusion protein (black squares) The recombinant proteins were produced in E. coli KS272 their apparent molecular sizes were determined as follows. (Strauch (1988) Proc. Natl. Acad. Sci. USA 85:1576–80) via Fab: 31 kDa (true mass: 48.0 kDa); Fab-PAii.1(200): 237kDa periplasmic secretion and purified by means of the His-tag (true mass: 64.3 kDa). 
These data show that fusion with the (Fab) or the Strep-tag II (IFNa2b) using immobilized metal or PAH1 polypeptide confers a much enlarged hydrodynamic streptavidin affinity chromatography, respectively. 45 Volume. (A) Analysis of the purified recombinant Fab and its PAH1 (C) Analytical size exclusion chromatography of IFNa2b fusion with 200 residues by 12% SDS-PAGE. The gel shows and PA#1 (200)-IFNa2b. 250 ul of each purified protein at a 2 ug protein samples each of Fab and Fab-PA#1 (200). concentration of 0.25 mg/ml was applied to a Superdex 5200 Samples on the left were reduced with 2-mercaptoethanol 10/300 GL column equilibrated with phosphate-buffered whereas repeated samples on the right were left unreduced. 50 saline, PBS. Absorption at 280 nm was monitored and the Sizes of protein markers—applied under reducing condi peak of each chromatography run was normalized to a value tions—are indicated on the left margin. Upon reduction of the of 1. The arrow indicates the void volume of the column (7.8 interchain disulfide bridge the Fab fragment and its 200 resi ml). due PAH1 fusion appear as two homogenous bands. In the (D) Calibration curve for the chromatogram from (C) using case of the reduced Fab fragment, the two bands with molecu 55 a Superdex S200 10/300 GL column. The logarithm of the lar sizes of ca. 24 and 26 kDa, respectively, correspond to the molecular weight (MW) of marker proteins (see B) was plot separated LC and HC. In the case of the reduced Fab-PA#1 ted vs. their elution volumes (black circles) and fitted by a (200) fusion protein the band at 24 kDa corresponds to the straight line. From the observed elution volumes of IFNa2b HC, whereas the band at ca. 90 kDa corresponds to the LC and its PAH1 fusion protein (black Squares) their apparent fused with the PAi1 (200) polypeptide segment. Under non 60 molecular sizes were determined as follows. 
IFNa2b: 22.5 reducing conditions, the Fab fragment and its PAi1 (200) kDa (true mass: 20.9 kDa); PA#1 (200)-IFNa2b: 229.0 kDa fusion appear as single homogeneous bands with apparent (true mass: 37.0 kDa). These data show that fusion with the molecular sizes of ca. 45 kDa and 100 kDa, respectively. The PAH1 polypeptide confers a much enlarged hydrodynamic two apparent size values for the Fab-PA#1 (200) fusion pro Volume. tein are significantly larger than the calculated masses of 64.3 65 FIG. 5. Experimental secondary structure analysis of kDa for the non-reduced Fab-PA#1 (200) and of 39.1 kDa for recombinant proteins and their PAit 1 polymer/polypeptide the isolated LC-PAi1 (200). This effect is due to the addition fusions by circular dichroism (CD) spectroscopy. US 9,221,882 B2 53 54 Spectra were recorded at room temperature in 50 mM sequence, is labelled (recognition sequences are underlined). KSO. 20 mM K-phosphate pH 7.5 and normalized to the The first amino acid of hCH as part of the fusion protein is molar ellipticity, 0, for each protein. labelled (1) and the amino acids of the His6-tag are under (A) CD spectra of the purified recombinant Fab and Fab lined. Similar ligation/insertion of 10 repeated PA#1 polymer PA#1 (200). The CD spectrum for the Fab fragment shows the 5 sequence cassettes resulted in the plasmid vector pASK75 typical features of a predominant B-sheet protein with a broad His6-PA#1 (200)-hGH coding for the mature fusion protein negative maximum around 216 nm (Sreerama in: Circular SEQID NO: 45.) Dichroism Principles and Applications (2000) Berova, (C) Plasmid map of pASK75-His6-PA#1 (200)-hGH (SEQ Nakanishi and Woody (Eds.) Wiley, New York, N.Y., pp. ID NO: 46). The structural gene for biologically active pro 601-620), which indicates the correct folding of the bacteri 10 tein His6-PA#1 (200)-hGH (comprising the bacterial Omp A ally produced Fab fragment. 
The spectrum of its fusion pro signal peptide, the His6-tag, the PAF1 polymer/polypeptide tein with the Pro/Ala polymer/polypeptide reveals a domi segment with 200 residues, and human GH) is under tran nant negative band below 200 nm, which is indicative of Scriptional control of the tetracycline promoter/operator random coil conformation. In addition, there is a shoulder (tet”) and ends with the lipoprotein terminator (t). The around 220 nm, which results from the B-sheet contribution 15 plasmid backbone outside the expression cassette flanked by of the Fab fragment and indicates its correct folding even as the Xbaland HindIII restriction sites is identical with that of part of the fusion protein. the generic cloning and expression vector pASK75 (Skerra (B) Molar difference CD spectrum for Fab-PA#1 (200) (1994) loc. cit.). Singular restriction sites are indicated. obtained by subtraction of the spectrum for the Fab fragment. (D) Plasmid map of pCHO-PA#1 (200)-hGH, which The difference CD spectrum represents the secondary struc encodes a His0-PA#1 (200)-hGH fusion protein (SEQID NO: ture of the 200 residue PA#1 polymer/polypeptide segment 47). The structural gene, comprising the human growth hor and reveals a strong minimum around 200 nm, which is a mone signal peptide (Sp), the His-tag, the PAH1 polymer/ clear indication of random coil conformation in the buffered polypeptide sequence with 200 residues (PAit 1 (200)), the aqueous solution (Greenfield (1969) Biochemistry 8: 4108 human growth hormone (hGH), and containing the bovine 41 16: Sreerama (2000) loc. cit.; Fändrich (2002) EMBO J. 25 growth hormone polyadenylation signal (bCH pA), is under 21:5682-5690). transcriptional control of the cytomegalus virus promoter (C) CD spectra of the purified recombinant IFNa2b and (CMV). The singular restriction sites Nhe and HindIII are PA#1 (200)-IFNa2b. The CD spectrum for IFNa2b shows the indicated. 
typical features of a predominant α-helix protein with two negative bands around 208 nm and 220 nm (Sreerama (2000) loc. cit.), which indicates the correct folding of the bacterially produced human IFNa2b. The spectrum of its fusion protein with the Pro/Ala polymer/polypeptide reveals characteristic deviations with a dominant minimum around 200 nm, which is indicative of random coil conformation. In addition, there is a shoulder around 220 nm, which results from the α-helical contribution of IFNa2b and indicates the correct folding of the IFNa2b even as part of the fusion protein.

(D) Molar difference CD spectrum for PA#1(200)-IFNa2b obtained by subtraction of the spectrum for IFNa2b. The difference CD spectrum represents the secondary structure of the 200 residue PA#1 polymer/polypeptide segment and reveals a strong minimum around 200 nm, essentially identical to the one shown in (B). This is again a clear indication of random coil conformation in buffered aqueous solution for a biological polymer comprising Pro and Ala residues according to the invention.

FIG. 6. Secretory production of a fusion protein between human growth hormone (hGH) and the genetically encoded PA#1 polymer in CHO cells.

(A) Nucleotide and amino acid sequence stretch (upper/coding strand SEQ ID NO: 38, lower/non-coding strand SEQ ID NO: 39, encoded amino acid sequence SEQ ID NO: 40) around the N-terminus of hGH as cloned on pASK75-His6-hGH (SEQ ID NO: 41). The single restriction sites NheI, that can be used together with HindIII (not shown) for subcloning, and SapI, that can be used for insertion of the Pro/Ala polymer-encoding sequence, are labelled (recognition sequence is underlined). The six amino acids of the His6-tag are underlined. The first amino acid of the hGH is labelled with +1.

(B) Nucleotide and encoded amino acid sequence (upper/coding strand SEQ ID NO: 42, lower/non-coding strand SEQ ID NO: 43, encoded amino acid sequence SEQ ID NO: 44) of the N-terminus of hGH after insertion of one PA#1 polymer sequence cassette as shown in FIG. 1. The single restriction sites NheI, that can be used for subcloning, and SapI, that remains after insertion of the Pro/Ala polymer-encoding

The resistance gene for neomycinphosphotransferase (neo) is under control of the SV40 promoter (SV40) and followed by a SV40 polyadenylation signal (SV40 pA). In addition, the plasmid contains the bacterial ColE1 origin of replication (ColE1-ori), the bacteriophage f1 origin of replication (f1-ori), and the β-lactamase gene (bla) to allow propagation and selection in E. coli.

(E) Western blot analysis of a fusion protein between hGH and the genetically encoded PA#1 polymer of 200 residues produced in CHO cells compared with recombinant hGH. CHO-K1 cells were transfected either with pCHO-PA#1(200)-hGH (SEQ ID NO: 48) or with pCHO-hGH (SEQ ID NO: 49), a similar plasmid encoding hGH without the PA#1(200) sequence (but also carrying the His6-tag). Two days after transfection, a sample of the cell culture supernatant was subjected to SDS-PAGE and Western blotting with an anti-hGH antibody conjugated with horseradish peroxidase. The two proteins appear as single bands indicated by arrows, with apparent molecular sizes of ca. 23 kDa (His6-hGH) and ca. 90 kDa (His6-PA#1-hGH). There is also a weak band around 60 kDa arising from serum proteins in the culture medium. Whereas the His6-tagged hGH appears at the calculated mass of 23.5 kDa, the apparent molecular size of His6-PA#1-hGH is significantly larger than its calculated mass of 39.5 kDa. This effect is due to the hydrophilic random coil nature of the Pro/Ala polymer.

FIG. 7. Theoretical prediction of secondary structure for the PA#1 Pro/Ala polypeptide/polymer sequence.

This illustration shows the output from the CHOFAS computer algorithm according to the Chou-Fasman method (Chou and Fasman (1974) Biochemistry 13: 222-245) as implemented on the Sequence Comparison and Secondary Structure prediction server at the University of Virginia (URL: http://fasta.bioch.virginia.edu/fasta_www2). To avoid boundary effects at the amino and carboxy termini, the 20mer amino acid repeat according to FIG. 1 was pasted in three consecutive copies (resulting in a concatamer similar to that encoded after repeated ligation/insertion of the synthetic gene cassette) and only the output for the central 20mer sequence block (boxed) was considered. In the case of the PA#1 polypeptide sequence/segment (SEQ ID NO: 1) the Chou-Fasman algorithm predicts 100% α-helical secondary structure. This is in contrast with the experimentally observed predominant random coil conformation for the PA#1 polypeptide/polypeptide segment as part of a fusion protein (see FIG. 5B/D).

FIG. 8: Quantitative analysis of the pharmacokinetics of the purified recombinant Fab fragment and its PA#1 polymer fusions with 200 and 600 residues in BALB/c mice.

Plasma samples from Example 16 were assayed for Fab, Fab-PA#1(200), and Fab-PA#1(600) concentrations using a sandwich ELISA. To estimate the plasma half-life of Fab, Fab-PA#1(200), and Fab-PA#1(600), the measured concentration values were plotted against time post intravenous injection and numerically fitted assuming a bi-exponential decay. The unfused Fab fragment exhibited a very fast clearance with an elimination half-life of 1.3 ± 0.1 h. In contrast, the elimination phase determined for Fab-PA#1(200) and Fab-PA#1(600) was significantly slower, with terminal half-lives of 4.1 ± 1.8 h and 38.8 ± 11.2 h, respectively, thus demonstrating a ca. 3-fold and a ca. 30-fold prolonged circulation due to the Pro/Ala polymer fusion with 200 or 600 residues compared with the unfused Fab fragment.

FIG. 9: Analysis of the purified recombinant Fab fragment as fusion with the P1A1 or P1A3 polymer having 200 residues.

The recombinant proteins were produced in E. coli KS272 via periplasmic secretion and purified by means of the His-tag using immobilized metal affinity chromatography. The purified proteins were analyzed by 12% SDS-PAGE.

Analytical size exclusion chromatography (SEC) of Fab-P1A1(200) and Fab-P1A3(200). 250 µl of the purified protein at a concentration of 0.25 mg/ml was applied to a Superdex S200 10/300 GL column equilibrated with PBS. Absorption at 280 nm was monitored and the peak of each chromatography run was normalized to a value of 1. The arrow indicates the void volume of the column (7.8 ml). From the observed elution volumes of the fusion proteins their apparent molecular sizes were determined using a calibration curve similar to the one shown in FIG. 4B as follows. Fab-P1A1(200): 180.7 kDa (true mass: 65.3 kDa); Fab-P1A3(200): 160.2 kDa (true mass: 64.0 kDa). These data show that fusion of a protein with the P1A1 and/or P1A3 polypeptide confers a much enlarged hydrodynamic volume.

FIG. 11. Experimental secondary structure analysis of Fab-P1A1(200) and Fab-P1A3(200) fusions by circular dichroism (CD) spectroscopy.

Spectra were recorded at room temperature in 50 mM K2SO4, 20 mM K-phosphate pH 7.5 and normalized to the molar ellipticity, Θ_M, for each protein.

(A) CD spectra of the purified recombinant Fab-P1A1(200) and Fab-P1A3(200). The CD spectra of the Fab fusion proteins with both Pro/Ala polymers/polypeptides each reveal a dominant negative band below 200 nm, which is indicative of random coil conformation. In addition, there is a shoulder around 220 nm, which arises from the β-sheet contribution of the Fab fragment and indicates its correct folding even as part of the fusion protein.
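The fold prolongation quoted for FIG. 8 follows from the ratio of the terminal half-lives. The following Python sketch illustrates that arithmetic, assuming the common two-phase model C(t) = A·exp(−αt) + B·exp(−βt) for a bi-exponential decay. The function and variable names are illustrative and do not appear in the patent; only the three half-life values are taken from the text.

```python
import math

# Terminal (elimination) half-lives in hours, as quoted for FIG. 8.
HALF_LIVES_H = {
    "Fab": 1.3,
    "Fab-PA#1(200)": 4.1,
    "Fab-PA#1(600)": 38.8,
}


def biexponential(t, a, alpha, b, beta):
    """Concentration at time t for a two-phase (bi-exponential) decay.

    a, alpha describe the fast distribution phase and b, beta the slow
    elimination phase (assumed model form, not spelled out in the patent).
    """
    return a * math.exp(-alpha * t) + b * math.exp(-beta * t)


def terminal_half_life(beta):
    """Half-life of the elimination phase for rate constant beta (1/h)."""
    return math.log(2) / beta


def fold_prolongation(name, reference="Fab"):
    """Ratio of a fusion protein's terminal half-life to the reference."""
    return HALF_LIVES_H[name] / HALF_LIVES_H[reference]


for construct in ("Fab-PA#1(200)", "Fab-PA#1(600)"):
    print(construct, round(fold_prolongation(construct), 1))
```

The two ratios round to about 3 and about 30, matching the "ca. 3-fold" and "ca. 30-fold" statements in the legend.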
The gel shows 2 µg protein samples each of Fab-P1A1(200) and Fab-P1A3(200) as well as, for comparison, of the unfused Fab fragment (cf. FIG. 3A). Samples on the left were reduced with 2-mercaptoethanol whereas analogous samples on the right were left unreduced. Sizes of protein markers, applied under reducing conditions, are indicated on the left margin. After reduction of the interchain disulfide bridges the Fab fragment and its 200 residue Pro/Ala fusions appear as two homogeneous bands. In the case of the reduced Fab fragment, the two bands with molecular sizes of ca. 24 and 26 kDa, respectively, correspond to the separated light chain (LC) and heavy chain fragment (HC). In the case of the reduced Fab-P1A1(200) fusion protein the band at 24 kDa corresponds to the HC, whereas the band at ca. 90 kDa corresponds to the LC fused with the P1A1(200) polypeptide. In the case of the reduced Fab-P1A3(200) fusion protein the band at 24 kDa corresponds to the HC, whereas the band at ca. 75 kDa corresponds to the LC fused with the P1A3(200) polypeptide. Under non-reducing conditions, the Fab fragment, its P1A1(200) and its P1A3(200) fusion appear as single prominent bands with apparent molecular sizes of ca. 45 kDa, 110 kDa, and 90 kDa, respectively. The apparent sizes for the Fab-P1A1(200) and Fab-P1A3(200) fusion proteins are significantly larger than the calculated masses of 65.3 kDa for the non-reduced Fab-P1A1(200) and of 64.0 kDa for the non-reduced Fab-P1A3(200). Also, the apparent sizes for the corresponding reduced light chains are significantly larger than the calculated masses of 40.7 kDa for the P1A1(200) LC and of 39.4 kDa for the P1A3(200) LC. This effect is due to the addition of the Pro/Ala polymer/polypeptide segment because the Fab fragment itself, with a calculated mass of 48.0 kDa, or its unfused light chain, with a calculated mass of 23.4 kDa, exhibit essentially normal electrophoretic mobility.

FIG. 10. Quantitative analysis of the hydrodynamic volumes of the purified recombinant Fab-P1A1(200) and Fab-P1A3(200) fusion proteins.

(B) Molar difference CD spectra for Fab-P1A1(200) and Fab-P1A3(200) obtained by subtraction of the spectrum for the unfused Fab fragment (see FIG. 5A). The difference CD spectra represent the secondary structures of the 200 residue P1A1 (SEQ ID NO: 51) and P1A3 (SEQ ID NO: 3) polymers/polypeptide segments, respectively, and reveal a strong minimum around 200 nm, which is a clear indication of their random coil conformation in the buffered aqueous solution (Greenfield (1969) Biochemistry 8: 4108-4116; Sreerama (2000) loc. cit.; Fändrich (2002) EMBO J. 21: 5682-5690).

FIG. 12: Preparation of an isolated biosynthetic Pro/Ala polymer/polypeptide.

(A) Plasmid map of pSUMO-PA#1(200) (SEQ ID NO: 60). The structural gene for the fusion protein MK-His(6)-SUMO-PA#1(200) comprising a start methionine codon followed by a lysine codon, an N-terminal affinity tag of six consecutive His residues, the cleavable small ubiquitin-like modifier (SUMO) protein Smt3p (Panavas (2009) Methods Mol. Biol. 497: 303-17), and the PA#1 polymer/polypeptide segment with 200 residues (SEQ ID NO: 60) is under transcriptional control of the gene 10 promoter of the bacteriophage T7 and ends with the td terminator. Additional plasmid elements comprise the origin of replication (ori), the ampicillin resistance gene (bla), and the f1 origin of replication. The plasmid backbone outside the expression cassette flanked by the NdeI and HindIII restriction sites is, except for a SapI restriction site that was eliminated by silent mutation, identical with that of the generic cloning and expression vector pRSET5a (Schoepfer (1993) 124: 83-85).

SEQ ID NO: 60 is provided in the enclosed sequence listing (which is also part of this description and specification) and is reproduced herein below.

US 9,221,882 B2 59 60 - Continued tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc 252O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc. 258 O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc. 264 O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc 2700 tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc 276 O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc. 282O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc. 288 O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctg.ccgc. 294 O tccagotgca cct gct coag cagcacctgc tigcaccagot ccggctgctic ctgctgcctg. 3 OOO alaga.gcaa.gc titgatc.cggc tigctaacaag ccc.gaaagga agctgagttg gctgctgcc 3060 accoctago: aataactago at aaccocitt gggg.ccticta aacgggt Ctt gaggggittitt 312O ttgctgaaag gaggaact at atc.cggat ct ggcgtaatag cgalagaggcc cgcaccgat c 318O gccCtt CC ca acagttgcgc agcctgaatg gcgaatggga cgc.gc.cctgt agcggcgcat 324 O taag.cgcggc gggtgtggtg gttacgc.gca gcgtgaccgc tacacttgcc agcgc.cc tag 33 OO cgc.ccgctico titt cqcttitc titccct tcct ttctogccac gttcgc.cggc titt coccgtc. 3360 aagctictaaa tCgggggctic c ctittagggit toccatttag tgctttacgg cacct cq acc 342O cCaaaaaact tattagggit gatggttcac gtag tiggcc atcgc cctoga tagacggittt 3480 titcgcc ctitt gacgttggag tccacgttct ttaatagtgg act cittgttc caaactggaa 354 O caac acticaa cccitat ct cq gtctattott ttgatttata agggattittg ccgattitcgg 3600 cctattggitt aaaaaatgag citgatttaac aaaaatttaa cgc gaattitt aacaaaatat 366 O taacgcttac aatttaggtg 3 680

(B) Analysis of the bacterially produced His(6)-SUMO-PA#1(200) fusion protein and its cleavage by 12% SDS-PAGE. The gel shows the SUMO-PA#1(200) fusion protein extracted from E. coli and purified via immobilized metal affinity chromatography (IMAC) and size exclusion chromatography (SEC) before (lane 1) and after proteolytic cleavage with Ubl-specific protease 1 (SUMO protease) (lane 2) as described in Example 21. All samples were reduced with 2-mercaptoethanol. Sizes of protein markers (M), applied under reducing conditions, are indicated on the left margin. The His(6)-SUMO-PA#1(200) fusion protein appears as a single homogeneous band with an apparent molecular size of ca. 100 kDa. Thus, the apparent size for the His(6)-SUMO-PA#1(200) fusion protein observed in SDS-PAGE is significantly larger than the calculated mass of 28.3 kDa, which is due to the presence of the Pro/Ala polymer/polypeptide. After cleavage, the hydrophilic PA#1(200) polypeptide is not detectably stained by Coomassie blue; hence, only a small residual fraction of the fusion protein and the cleaved His(6)-SUMO protein are visible on the SDS polyacrylamide gel (lane 2). The His(6)-SUMO protein shows a homogeneous band with apparent molecular size of ca. 16 kDa (lane 2) which is well in agreement with its calculated molecular mass of 12.2 kDa.

FIG. 13: Conjugation of a biosynthetic Pro/Ala polymer/polypeptide with chemical compounds and/or drugs.

(A-D) Production of a fluorescein conjugate with a biosynthetic PA#1(200) polymer/polypeptide (SEQ ID NO: 61) monitored via analytical size exclusion chromatography (SEC). The panels show (from top to bottom) SEC runs of purified His(6)-SUMO-PA#1(200) (A), His(6)-SUMO-PA#1(200) after cleavage reaction in the presence of SUMO protease (B), the cleaved His(6)-SUMO-PA#1(200) batch after chemical coupling with a fluorescein NHS ester (C), and the fluorescein-PA#1(200) conjugate after IMAC purification (D). 250 µl of protein/polypeptide at a concentration of ca. 0.5 mg/ml was applied to a Superdex S200 10/300 GL column equilibrated with PBS on an Äkta purifier system. Absorption at 225 nm, 280 nm, and 494 nm was monitored using a UV-900 UV/VIS detector (GE Healthcare) and a prominent peak of each chromatogram was normalized to a value of 1. The arrow indicates the void volume of the column (7.3 ml).

(E-K) Characterization of free fluorescein, the biosynthetic PA#1(200) polymer/polypeptide, and its fluorescein conjugate via SEC and UV/VIS spectroscopy. The three chromatograms show (from top to bottom) purified PA#1(200) (E), the chemical compound fluorescein (F) and the purified fluorescein-PA#1(200) conjugate (G). The four UV/VIS spectra show the purified His(6)-SUMO-PA#1(200) fusion protein (H), the purified PA#1(200) polymer/polypeptide (I), free fluorescein (J), and the purified fluorescein-PA#1(200) conjugate (K) (all in PBS). The arrows indicate characteristic absorption bands/shoulders of SUMO (280 nm), PA#1(200) (225 nm), and fluorescein (494 nm).

(L) Calibration curve for the chromatograms from (A-G) using a Superdex S200 10/300 GL column. The logarithm of the molecular weight (MW) of marker proteins (aprotinin, 6.5 kDa; cytochrome c, 12.4 kDa; carbonic anhydrase, 29.0 kDa; bovine serum albumin, 66.3 kDa; alcohol dehydrogenase, 150 kDa; β-amylase, 200 kDa; apo-ferritin, 440 kDa) was plotted vs. their elution volumes (x) and fitted by a straight line. From the observed elution volumes of His(6)-SUMO-PA#1(200) (10.81 ml), PA#1(200) (11.51 ml), fluorescein-PA#1(200) (11.49 ml) and fluorescein (27.57 ml), their apparent molecular sizes were determined as follows. His(6)-SUMO-PA#1(200): 215.6 kDa; PA#1(200): 154.1 kDa (true mass: 16.1 kDa); fluorescein-PA#1(200): 155.6 kDa (true mass: 16.6 kDa); SUMO: 25.7 kDa (true mass: 12.2 kDa); fluorescein: 0.09 kDa (true mass: 0.33 kDa). These data show that fusion with the Pro/Ala polypeptide/polymer confers a much enlarged hydrodynamic volume to the conjugated drug compared with the unmodified compound.

(M) Characterization of the chemical conjugate between the biosynthetic PA#1(200) polypeptide/polymer and the steroid compound digoxigenin via Electro Spray Ionisation Mass Spectrometry (ESI-MS). A deconvoluted ESI-MS spectrum of digoxigenin-PA#1(200) reveals a mass of 16671.4 Da, which essentially coincides with the calculated mass for the digoxigenin-PA#1(200) conjugate (16670.6 Da).

FIG. 14: Illustration of chemical conjugates between the biosynthetic PA#1(200) polypeptide/polymer and small molecule drugs.

(A) Fluorescein coupled to the N-terminus of biosynthetic PA#1(200).

(B) Digoxigenin coupled to the N-terminus of biosynthetic PA#1(200).

EXAMPLES

The present invention is additionally described by way of the following illustrative non-limiting examples that provide a better understanding of the present invention and of its many advantages.

Unless otherwise indicated, established methods of recombinant gene technology were used as described, for example, in Sambrook (2001) loc. cit.

Example 1

Gene Synthesis for Pro/Ala Amino Acid Polymers/Polypeptides

As described herein above, amino acid repeats consisting of Pro and Ala residues are depicted herein as Pro/Ala or "PA". Gene fragments encoding a repetitive polymer sequence comprising Pro and Ala (PA#1, which corresponds to SEQ ID NO: 1) were obtained by hybridisation of the two complementary oligodeoxynucleotides (SEQ ID NO: 17 and SEQ ID NO: 18) shown in FIG. 1, followed by concatamer formation in a directed manner via DNA ligation of their mutually compatible but non-palindromic sticky ends. Oligodeoxynucleotides were purchased from ThermoScientific (Ulm, Germany) and purified by preparative urea polyacrylamide gel electrophoresis. The nucleic acid sequences of the oligodeoxynucleotides are depicted in FIG. 1 (SEQ ID NOS 17 and 18 comprising an additional GCC codon for alanine, which becomes part of the following PA#1 sequence repeat upon ligation of the corresponding sticky ends). Enzymatic phosphorylation was performed by mixing 200 pmol of both oligodeoxynucleotides in 100 µl 50 mM Tris/HCl pH 7.6, 10 mM MgCl2, 5 mM DTT, 1 mM ATP and incubation for 30 min at 37°C in the presence of 10 u polynucleotide kinase (MBI Fermentas, St. Leon-Rot, Germany). After denaturation for 10 min at 80°C, the mixture was cooled to room temperature overnight to achieve hybridization. Then 50 µl of this solution was ligated by adding 1 u T4 DNA ligase (MBI Fermentas) and 10 µl 100 mM Tris/HCl pH 7.4, 50 mM MgCl2, 20 mM DTT, 10 mM ATP, and in some cases 5 mM of each dATP, dCTP, dGTP, and dTTP, in a total volume of 100 µl and incubation for 55 min on ice. After 10 min heat inactivation at 70°C the ligation products were separated by 1.5% (w/v) agarose gel electrophoresis in the presence of TAE buffer (40 mM Tris, 20 mM acetic acid, 1 mM EDTA). After staining with ethidium bromide the band corresponding to the assembled gene segment of 300 bp length was excised and isolated.

Example 2

Construction of pFab-PA#1(200) as Expression Vector for a Fab-PA#1 Fusion Protein

For cloning of a 10mer repeat of the synthetic gene fragment coding for the 20 amino acid sequence of PA#1 from Example 1 the plasmid vector pASK88-Fab-2xSapI (SEQ ID NO: 22), an expression plasmid for an Fab fragment (Schlapschy (2007) Protein Eng. Des. Sel. 20: 273-284) harboring a nucleotide sequence with two SapI restriction sites in reverse complementary orientation at the 3'-end of the light chain (FIG. 2A), was employed. This vector, which is a derivative of pASK75 (Skerra, A. (1994) Gene 151: 131-135), was cut with SapI, dephosphorylated with shrimp alkaline phosphatase (USB, Cleveland, Ohio), and ligated with a 300 bp cassette of the synthetic DNA fragment obtained from Example 1. The resulting intermediate plasmid pFab-PA#1(100) was again cut with SapI, dephosphorylated with shrimp alkaline phosphatase, and ligated with a 300 bp cassette of the synthetic DNA fragment obtained from Example 1 (as exemplified in FIG. 2B, however with only a PA#1(20) polymer/polypeptide cassette). The resulting plasmid was designated pFab-PA#1(200) (SEQ ID NO: 28) (FIG. 2C). It should be noted that on this plasmid the coding region for the 200 residue PA#1 sequence repeat was flanked by two SapI restriction sites, which enables precise excision and further subcloning of the entire sequence cassette, carrying 5'-GCC nucleotide overhangs. After transformation of E. coli XL1-Blue (Bullock (1987) Biotechniques 5: 376-378), plasmid was prepared and the sequence of the cloned synthetic nucleic acid insert was confirmed by restriction analysis and double-stranded DNA sequencing (ABI-Prism™ 310 Genetic Analyzer, Perkin-Elmer Applied Biosystems, Weiterstadt, Germany) using the BigDye™ terminator kit as well as oligodeoxynucleotide primers that enabled sequencing from both sides.

Example 3

Construction of pASK-PA#1(200)-IFNa2b as an Expression Vector for a PA#1(200)-IFNa2b Fusion Protein

For the construction of an expression plasmid encoding IFNa2b as fusion with a 200 residue PA#1 sequence repeat, PA#1(200), pASK-IFNa2b (SEQ ID NO: 32) (FIG. 2D) was cut with SapI, dephosphorylated with shrimp alkaline phosphatase, and ligated with the gene fragment encoding the 200 residue PA#1 polypeptide excised from the previously constructed plasmid pFab-PA#1(200) (Example 2) by restriction digest with SapI (as exemplified in FIG. 2E, however with only a PA#1(20) polymer/polypeptide cassette). After transformation of E. coli JM83 (Yanisch-Perron (1985) Gene 33: 103-119), plasmid was prepared and the presence of the correct insert was confirmed by restriction analysis. The resulting plasmid was designated pPA#1(200)-IFNa2b (SEQ ID NO: 37) (FIG. 2F).

Example 4

Bacterial Production and Purification of Fusion Proteins Between an Fab Fragment and a Genetically Encoded PA#1 Polymer/Polypeptide

The Fab fragment (calculated mass: 48.0 kDa) and the Fab-PA#1(200) fusion (calculated mass: 64.3 kDa) were produced at 22°C in E. coli KS272 harboring the corresponding expression plasmids from Example 3, together with the folding helper plasmid pTUM4 (Schlapschy (2006) Protein Eng. Des. Sel. 19: 385-390), using shaker flask cultures with 2 L LB medium containing 100 mg/l ampicillin and 30 mg/l chloramphenicol. Induction of recombinant gene expression was performed by addition of 0.4 mg anhydrotetracycline at OD550 = 0.5 over night (typically resulting in OD550 of ca. 1.0 at harvest). Periplasmic extraction in the presence of 500 mM sucrose, 1 mM EDTA, 100 mM Tris/HCl pH 8.0 containing 50 µg/ml lysozyme was performed as described elsewhere (Breustedt (2005) Biochim. Biophys. Acta 1764: 161-173) and followed by purification by means of the His-tag using immobilized metal affinity chromatography (Skerra (1994) Gene 141: 79-84) with an imidazole gradient from 0 to 200 mM in 500 mM betaine, 50 mM Na-phosphate pH 7.5.

Homogeneous protein preparations were obtained for both recombinant Fab fragments (FIG. 3A) with yields of 0.2 mg L⁻¹ OD⁻¹ for the unfused Fab and 0.1 mg L⁻¹ OD⁻¹ for Fab-PA#1(200). SDS-PAGE was performed using a high molarity Tris buffer system (Fling (1986) Anal. Biochem. 155: 83-88). Protein concentrations were determined according to the absorption at 280 nm using calculated extinction coefficients (Gill (1989) Anal. Biochem. 182: 319-326) of 68290 M⁻¹ cm⁻¹ both for the unfused Fab and its PA#1 polymer fusion as the Pro/Ala polymer did not contribute to UV absorption because of its lack of aromatic amino acids.

Example 5

Bacterial Production and Purification of Fusion Proteins Between IFNa2b and a Genetically Encoded PA#1 Polymer/Polypeptide

IFNa2b (calculated mass: 20.9 kDa) and PA#1(200)-IFNa2b (calculated mass: 37.0 kDa) were produced at 22°C in E. coli KS272 harboring the corresponding expression plasmids from Example 3, together with the folding helper plasmid pTUM4 (Schlapschy (2006) loc. cit.), using shaker flask cultures with 2 L LB medium containing 100 mg/l ampicillin and 30 mg/l chloramphenicol. Induction of recombinant gene expression was performed by addition of 0.4 mg anhydrotetracycline at OD550 = 0.5 over night (typically resulting in OD550 of ca. 1.0 at harvest). Periplasmic extraction in the presence of 500 mM sucrose, 1 mM EDTA, 100 mM Tris/HCl pH 8.0 containing 50 µg/ml lysozyme was performed as described elsewhere (Breustedt (2005) loc. cit.) and followed by purification via the Strep-tag II using streptavidin affinity chromatography (Schmidt (2007) Nat. Protoc. 2: 1528-1535) in the presence of 150 mM NaCl, 1 mM EDTA, 100 mM Tris/HCl, pH 8.0.

Homogeneous protein preparations were obtained for both recombinant IFNa2b proteins (FIG. 3B) with yields of 0.15 mg L⁻¹ OD⁻¹ for IFNa2b and 0.1 mg L⁻¹ OD⁻¹ for PA#1(200)-IFNa2b. SDS-PAGE was performed using a high molarity Tris buffer system (Fling (1986) loc. cit.). Protein concentrations were determined according to the absorption at 280 nm using calculated extinction coefficients (Gill (1989) loc. cit.) of 23590 M⁻¹ cm⁻¹ both for the unfused IFNa2b and its PA#1 polymer fusion.

Example 6

Measurement of the Hydrodynamic Volume for the Recombinant Fusion Protein Between a Fab Fragment and a Genetically Encoded PA#1 Polymer of 200 Residues by Analytical Gel Filtration

Size exclusion chromatography (SEC) was carried out on a Superdex S200 HR 10/300 GL column (GE Healthcare Europe, Freiburg, Germany) at a flow rate of 1 ml/min using an Äkta Purifier 10 system (GE Healthcare) with PBS (115 mM NaCl, 4 mM KH2PO4, 16 mM Na2HPO4, pH 7.4) as running buffer. 250 µl samples of the purified Fab fragment and its 200 residue PA#1 fusion, obtained from the metal affinity chromatography as described in Example 4, were individually applied at a concentration of 0.25 mg/ml in PBS. Both proteins eluted in a single homogeneous peak as shown in FIG. 4A.

For column calibration (FIG. 4B), 250 µl of an appropriate mixture of the following globular proteins (Sigma, Deisenhofen, Germany) were applied in PBS at protein concentrations between 0.2 mg/ml and 0.5 mg/ml: cytochrome c, 12.4 kDa; carbonic anhydrase, 29.0 kDa; ovalbumin, 43.0 kDa; bovine serum albumin, 66.3 kDa; alcohol dehydrogenase, 150 kDa; β-amylase, 200 kDa; apo-ferritin, 440 kDa.

As a result, the fusion protein with the 200 residue PA#1 polymer/polypeptide exhibited a significantly larger size than corresponding globular proteins with the same molecular weight. The apparent size increase for Fab-PA#1(200) was 7.4-fold compared with the unfused Fab fragment whereas the true mass was only larger by 1.3-fold. This observation clearly indicates a much increased hydrodynamic volume conferred to the biologically active Fab fragment by the Pro/Ala polypeptide segment according to this invention.

Example 7

Measurement of the Hydrodynamic Volume for the Recombinant Fusion Protein Between IFNa2b and a Genetically Encoded PA#1 Polymer of 200 Residues by Analytical Gel Filtration

Size exclusion chromatography was carried out with IFNa2b and PA#1(200)-IFNa2b on a Superdex S200 HR 10/300 GL column (GE Healthcare) at a flow rate of 1 ml/min using an Äkta Purifier 10 system (GE Healthcare) similarly as described in Example 6. Both proteins eluted in a single homogeneous peak as shown in FIG. 4C.

As a result, the fusion protein with the 200 residue PA#1 polymer/polypeptide exhibited a significantly larger size than corresponding globular proteins with the same molecular weight (FIG. 4D). The apparent size increase for PA#1(200)-IFNa2b was 10.2-fold compared with the unfused IFNa2b protein whereas the true mass was only larger by 1.8-fold. This observation clearly indicates a much increased hydrodynamic volume conferred to the biologically active interferon by the Pro/Ala polymer/polypeptide according to this invention.

Example 8

Detection of Random Coil Conformation for the Biosynthetic PA#1 Polymer Fused to a Fab Fragment Via Circular Dichroism Spectroscopy

Secondary structure was analysed using a J-810 spectropolarimeter (Jasco, Groß-Umstadt, Germany) equipped with a quartz cuvette 106-QS (0.1 mm path length; Hellma, Müllheim, Germany). Spectra were recorded from 190 to 250 nm at room temperature by accumulating 16 runs (bandwidth 1 nm, scan speed 100 nm/min, response 4 s) using 3.12 to 15.4 µM protein solutions obtained from Example 4 in 50 mM K2SO4, 20 mM K-phosphate pH 7.5.

coil conformation was observed. Thus, the Pro/Ala polypeptide segment as part of the recombinant fusion protein appears to be present as a random coil polymer under aqueous buffer conditions.
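The size comparison in Examples 6 and 7 rests on a calibration line of log10(molecular weight) against elution volume. The Python sketch below illustrates that arithmetic. The marker masses and the calculated protein masses are those given in the text. The marker elution volumes are hypothetical placeholders, invented here because the text gives no numerical values for them, and the helper names are illustrative only.

```python
import math

# Marker masses in kDa, as listed for the column calibration in Example 6.
MARKERS_KDA = [12.4, 29.0, 43.0, 66.3, 150.0, 200.0, 440.0]

# HYPOTHETICAL elution volumes in ml (same order as MARKERS_KDA).
# These numbers are invented for illustration and are not from the patent.
ELUTION_ML = [17.5, 16.0, 15.2, 14.3, 12.8, 12.2, 10.7]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mass_kda(volume_ml, slope, intercept):
    """Apparent molecular size read off the log10(MW) calibration line."""
    return 10 ** (slope * volume_ml + intercept)


def mass_ratio(fusion_kda, unfused_kda):
    """Fold increase in true (calculated) mass caused by the fusion."""
    return fusion_kda / unfused_kda


slope, intercept = fit_line(ELUTION_ML, [math.log10(m) for m in MARKERS_KDA])

# True-mass ratios from the calculated masses in Examples 4 and 5.
print(round(mass_ratio(64.3, 48.0), 1))  # Fab-PA#1(200) vs. Fab
print(round(mass_ratio(37.0, 20.9), 1))  # PA#1(200)-IFNa2b vs. IFNa2b
```

The two printed ratios reproduce the 1.3-fold and 1.8-fold true-mass increases stated in the Examples. The apparent sizes (7.4-fold and 10.2-fold) would come from `apparent_mass_kda` applied to the measured elution volumes, which the text does not list for these runs.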
After correction for solution blanks, spectra were smoothed using the instrument software, and the molar ellipticity Θ_M was calculated according to the equation:

$$\Theta_M = \frac{\Theta_{obs}}{c \cdot d}$$

whereby Θ_obs denotes the measured ellipticity, c the protein concentration [mol/l], d the path length of the quartz cuvette [cm]. The Θ_M values were plotted against the wavelength using Kaleidagraph (Synergy Software, Reading, Pa.).

The measured circular dichroism (CD) spectrum for the recombinant Fab was in accordance with the β-sheet dominated immunoglobulin fold, whereas the spectrum for the Fab-PA#1(200) fusion protein revealed a significant contribution of random coil conformation (FIG. 5A). To analyze the spectroscopic contribution by the Pro/Ala polypeptide segment in greater detail the molar difference CD spectrum with respect to the unfused Fab fragment was calculated (FIG. 5B) by subtraction of the latter spectrum from the one for Fab-PA#1(200). As a result, a strong minimum around 200 nm, which is characteristic of random coil conformation, was observed. Thus, the Pro/Ala sequence as part of the recombinant fusion protein appears to be present as a random coil polymer under physiological buffer conditions.

Example 10

Quantitative Analysis of the Secondary Structure of the Fab Fragment, of IFNa2b and of Their 200 Residue PA#1 Polymer Fusions

The secondary structure content of the Fab fragment, Fab-PA#1(200), IFNa2b, and PA#1(200)-IFNa2b was individually quantified from the corresponding CD spectra measured in Examples 8 and 9 using the secondary structure deconvolution program CDNN ver. 2.1 (Böhm (1992) Protein Eng. 5: 191-195) with a set of 33 base spectra for the deconvolution of complex CD spectra. The results of this analysis are provided in the following Table:

                           Fab      Fab-PA#1(200)   Diff PA#1(200)   IFNa2b   PA#1(200)-IFNa2b   Diff PA#1(200)
α-helix                    9.5%     7.5%            2.1%             38.2%    31.0%              0.7%
anti-parallel β-sheet      40.4%    3.1%            0%               1.8%     0.2%               4.6%
parallel β-sheet           6.9%     1.3%            0.3%             8.4%     0.7%               0.6%
β-turn                     6.2%     50.4%           78.6%            19.2%    75.2%              69.7%
random coil                37.2%    63.4%           94.8%            35.9%    64.4%              97.5%
Σ total                    100.2%   125.8%          175.8%           103.5%   171.4%             170.0%
Σ β-turn and random coil   43.4%    113.8%          173.4%           55.1%    139.6%             169.1%

Compared with the predominantly β-sheet secondary structure content of the recombinant Fab fragment, which is in accordance with its known immunoglobulin fold (see Eigenbrot (1993) J. Mol. Biol. 229: 969-995), the fraction of unstructured conformation (comprising random coil and β-turn) clearly increases if the PA#1 polymer is fused to the Fab fragment. The difference CD spectrum for the Pro/Ala polypeptide segment reveals a clear random coil conformation. Analysis of the secondary structure shows the presence of a high fraction of unstructured conformations (comprising random coil and β-turn) which nearly comprise 100% of the total secondary structure.
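The normalization to molar ellipticity used in Example 8 and the summed "β-turn and random coil" row of the Table in Example 10 can be sketched in a few lines of Python. The observed ellipticity and the concentration in the usage line are hypothetical values chosen for illustration. Only the 0.1 mm path length and the table percentages are taken from the text, and the function names are illustrative.

```python
def molar_ellipticity(theta_obs, conc_mol_per_l, path_cm):
    """Molar ellipticity: theta_obs / (c * d).

    theta_obs      measured ellipticity (instrument units)
    conc_mol_per_l protein concentration c in mol/l
    path_cm        cuvette path length d in cm
    """
    if conc_mol_per_l <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return theta_obs / (conc_mol_per_l * path_cm)


def unstructured_fraction(beta_turn_pct, random_coil_pct):
    """Sum used for the 'β-turn and random coil' row of the Table."""
    return beta_turn_pct + random_coil_pct


# Hypothetical reading: -10 units at 5 µM protein in the 0.1 mm (0.01 cm) cuvette.
example = molar_ellipticity(theta_obs=-10.0, conc_mol_per_l=5e-6, path_cm=0.01)

# Table values for the unfused Fab and for the PA#1(200) difference spectrum.
fab_unstructured = unstructured_fraction(6.2, 37.2)
diff_unstructured = unstructured_fraction(78.6, 94.8)
```

With the table values, the sums come out at 43.4% for the unfused Fab and 173.4% for the PA#1(200) difference spectrum, as listed in the last row of the Table.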
Example 9

Detection of Random Coil Conformation for the Genetically Encoded PA#1 Polymer Fused to IFNa2b Via Circular Dichroism Spectroscopy

Secondary structure was analysed by CD measurements for IFNa2b and PA#1(200)-IFNa2b (obtained from Example 5) as described in Example 8 using 3.6 to 38.7 µM protein solutions. The spectrum of PA#1(200)-IFNa2b revealed significant contributions of α-helical secondary structure, indicative of the known α-helix bundle fold of interferon, as well as of random coil conformation (FIG. 5C). To analyze the spectroscopic contributions by the Pro/Ala polymer fusion partner in greater detail the molar difference CD spectrum with respect to the unfused IFNa2b was calculated by subtraction of the two individual spectra (FIG. 5D). As a result, a strong minimum around 200 nm characteristic of random

Similarly, compared with the predominantly α-helical secondary structure content of the recombinant IFNa2b, which is in accordance with its known three-dimensional structure as an α-helix bundle protein (Radhakrishnan (1996) Structure 4: 1453-1463), the fraction of unstructured conformation for the whole protein clearly increases if the PA#1 polymer is fused to IFNa2b. The difference CD spectrum for the Pro/Ala polypeptide segment reveals a clear random coil conformation. Analysis of the secondary structure shows the presence of a high fraction of unstructured conformations (comprising random coil and β-turn) which nearly comprise 100% of the total secondary structure.

Different results were obtained when a theoretical analysis of the PA#1 polymer sequence was performed using the Chou-Fasman algorithm (Chou and Fasman (1974) Biochemistry 13: 222-245). The results of this analysis are illustrated in FIG. 7. This algorithm predicts 100% α-helical secondary structure, which is in clear contrast with the experimental data. Thus, this algorithm is not useful to confidently predict unstructured conformation for an amino acid polymer according to the invention.

Example 11

Construction of pASK75-His6-PA#1(200)-hGH as an Expression Vector for a His6-PA#1(200)-hGH Fusion Protein

For the construction of an expression plasmid encoding hGH as fusion with a 200 residue PA#1 sequence repeat, PA#1(200), pASK75-His6-hGH (SEQ ID NO: 41) (FIG. 6A) was cut with SapI, dephosphorylated with shrimp alkaline phosphatase, and ligated with the gene fragment encoding the 200 residue PA#1 polypeptide excised from the previously constructed plasmid pFab-PA#1(200) (Example 2) by restriction digest with SapI (as exemplified in FIG. 6B, with only a PA#1(20) polymer/polypeptide cassette). After transformation of E. coli JM83 (Yanisch-Perron (1985) loc. cit.), plasmid was prepared and the presence of the correct insert was confirmed by restriction analysis. The resulting plasmid was designated pASK75-His6-PA#1(200)-hGH (SEQ ID NO: 46) (FIG. 6C).

Example 12

Construction of an Expression Vector for the Secretory Production of Human Growth Hormone Fused with a 200 Residue PA#1 Polymer/Polypeptide in Chinese Hamster Ovary Cells

The vector pASK75-His6-PA#1(200)-hGH (SEQ ID NO: 46), a derivative of pASK75 (Skerra (1994) loc. cit.), allowing prokaryotic production of the hGH PA#1 fusion protein, was cut with NheI and HindIII. This fragment was purified via agarose gel electrophoresis and ligated with the correspondingly cut vector pCHO (SEQ ID NO: 50). After transformation of E. coli XL1-Blue (Bullock (1987) loc. cit.), plasmid was prepared and the correct insertion of the fragment was verified via restriction analysis. The resulting plasmid, which codes for the hGH signal peptide fused to the His tag, a PA#1(200) polypeptide segment, and the human growth hormone (hGH), was designated pCHO-PA#1(200)-hGH (SEQ ID NO: 48) and is depicted in FIG. 6D.

Example 13

Secretory Production of a Fusion Protein Between Human Growth Hormone (hGH) and the Genetically Encoded PA#1 Polymer in CHO Cells

CHO-K1 cells (ATCC No. CCL-61) were cultured in Quantum 263 medium (PAA Laboratories, Cölbe, Germany) in a 100 mm plastic dish until 50% confluency was reached. Cells were transfected with 8 µg pCHO-PA#1(200)-hGH (SEQ ID NO: 48) or, for control, pCHO-hGH (SEQ ID NO: 49), a similar plasmid encoding hGH without the PA#1(200) sequence, using the Nanofectin Kit (PAA Laboratories, Cölbe, Germany). After 6 h, cell culture medium was exchanged by 7 ml Opti-MEM®-I reduced serum medium (Invitrogen, Darmstadt, Germany) and cells were incubated at 37°C in a humidified atmosphere with 5% CO2.

was subjected to 12% SDS-PAGE. Following electro-transfer onto a nitrocellulose membrane (Schleicher & Schuell, Dassel, Germany) by means of a semi-dry blotting apparatus, the membrane was washed 3 times for 15 min with 10 ml PBST (PBS containing 0.1% v/v Tween 20). The membrane was incubated with 10 ml of a 1:1000 dilution of anti-human growth hormone antibody ab1956 conjugated with horseradish peroxidase (Abcam, Cambridge, UK). After incubation for 1 h and washing the membrane twice for 5 min with 20 ml PBST and twice for 5 min with PBS, the chromogenic reaction was performed in the presence of 15 ml of SIGMAFAST™ 3,3'-diaminobenzidine solution (Sigma-Aldrich Chemie, Munich, Germany). The reaction was stopped by washing with water and air-drying of the membrane. The blot revealed signals for both recombinant protein samples (FIG. 6E), thus proving secretory production of the hGH fusion protein with the PA#1 polypeptide in CHO cells.

Example 14

Bacterial Production and Purification of Fusion Proteins Between hGH and a Genetically Encoded PA#1 Polymer/Polypeptide

Human growth hormone (hGH) (calculated mass: 23.4 kDa), PA#1(200)-hGH (calculated mass: 39.6 kDa), PA#1(400)-hGH (calculated mass: 55.8 kDa) and PA#1(600)-hGH (calculated mass: 72.0 kDa) were produced in E. coli KS272 harboring the corresponding expression plasmids from Example 11 or their derivatives with a double (encoding 400 residues) or triple (600 residues) PA#1 sequence cassette, respectively. Bacterial production was performed at 22°C in shaker flask cultures with 2 L LB medium containing 2.5 g/L glucose, 0.5 g/L proline and 100 mg/l ampicillin. Induction of recombinant gene expression was performed by addition of 0.4 mg anhydrotetracycline at OD550 = 0.5 for 3 h. Periplasmic extraction in the presence of 500 mM sucrose, 1 mM EDTA, 100 mM Tris/HCl pH 8.0 containing 50 µg/ml lysozyme was carried out as described elsewhere (Breustedt (2005) loc. cit.) and followed by purification via the His-tag using the HisTrap High Performance affinity column (GE Healthcare) with 40 mM Na-phosphate pH 7.5, 0.5 M NaCl as buffer. Proteins were eluted using an imidazole concentration gradient from 0 to 150 mM (dissolved in the running buffer and adjusted with HCl to pH 7.5) and further purified via size exclusion chromatography using a Superdex 200-HR 10/30 column (GE Healthcare) equilibrated with PBS (115 mM NaCl, 4 mM KH2PO4, 16 mM Na2HPO4, pH 7.4).

After size exclusion chromatography homogeneous protein preparations were obtained for all recombinant hGH fusion proteins without signs of aggregation and with yields of 1 mg L⁻¹ OD⁻¹ for hGH, 0.3 mg L⁻¹ OD⁻¹ for PA#1(200)-hGH, 0.3 mg L⁻¹ OD⁻¹ for PA#1(400)-hGH and 0.2 mg L⁻¹ OD⁻¹ for PA#1(600)-hGH. SDS-PAGE was performed using a high molarity Tris buffer system (Fling (1986) loc. cit.). Protein concentrations were determined according to the absorption at 280 nm using calculated extinction coefficients (Gill (1989) loc. cit.) of 16050 M⁻¹ cm⁻¹ for the unfused hGH and all its PA#1 polypeptide fusions.

Example 15

Measurement of Binding Affinity of Human Growth Hormone and its PA#1 Polymer Fusions Towards the Extracellular Domain of Human Growth Hormone
After two Receptor Using Surface Plasmon Resonance days, 20 of the cell culture supernatant was taken and diluted 65 with 5ul SDS-PAGE loading buffer containing B-mercapto The affinity of hCGH and its PAH 1 polypeptide fusions to a ethanol. After 5 min heating at 95°C., 15ul of each sample human growth hormone receptor Fc fusion protein (hGHR US 9,221,882 B2 69 70 Fc; R&D Systems) was determined via surface plasmon reso nance (SPR) real time measurements on a Biacore 2000 sys Group A. B D tem (GE Healthcare). First, 15 ul mouse anti-human IgG-Fc capture antibody (Jackson Immuno Research) at a concentra Test item Fab Fab- Fab tion of 100 ug/ml in 10 mM Na-acetate pH 5.0 was immobi- 5 PAlt1 (200) PAh 1(600) lized to the surface of two flow channels of a CMDP chip Administration route intravenous (XanTec bioanalytics) using an amine coupling kit (GE Dose mg/kg b.w. S.O S.O S.O Healthcare). This resulted in ca. 2700 response units (RU). Concentration mg/ml 1.O 1.O 1.O After equilibration with PBS/T (PBS containing 0.05% (v/v) Application volume ml/kg b.w. S.O Tween 20) as flow buffer, one channel of the chip was charged 10 No. of animals, group 9 9 9 with 2 ug/ml hCGHR-Fc at a flow rate of 5 ul/min until an No. of blood sampling time points 12 12 12 additional signal of ca. 300 RU was reached. Then, 75 ul of No. of animals sampling time point 3 3 3 hGH or its PAi1 polypeptide fusions in PBS/T was injected at No. of blood samplings animal 41 41 4f1 varying concentrations and the association and dissociation phases were measured under continuous buffer flow of 20 15 ul/min. For regeneration, three 6 ul pulses of 10 mM glycine/ The total volume of intravenously administered test item HCl pH 2.7 were applied. The sensograms were corrected by was calculated according to the individual body weight (b.w.) double Subtraction of the corresponding signals measured for recorded on the day of administration (e.g. 
an animal with 20 the channel without immobilized receptor and an averaged g body weight received 100 ul of 1 mg/ml test item). Blood baseline determined from several buffer blank injections sampling was performed according to the following Table:

Time points for blood Sampling after iniection Test item Subgroup 10 min 30 min 1h 2h 3h 4h 6h 8h 12 h 24h 36 h 48 h.

Fab 1 X X X X Fab-PA#1 (200) X X X X Fab-PA#1 (600) X X X X 2 X X X X X X X X X X X X 3 X X X X X X X X X X X X

(Myszka (1999) Mol. Recognit. 12: 279-284). Kinetic data 35 For each Substance (Test item) altogether nine animals— evaluation was performed by a global fit of the traces from at divided into three subgroups 1-3 with each three animals— least seven different sample injections according to the 1:1 were injected, each providing four samples at different time Langmuir binding model using BIAevaluation Software ver points. Blood samples (approximately 50 ul) were taken from the tail vene and stored at 4°C. for 30 min. After centrifuga sion 3.1 (GE Healthcare). The values obtained from SPR tion for 10 min at 10 000 g and 4°C. the Supernatant (plasma) measurements for the kinetic and derived equilibrium con- 40 was immediately frozen and stored at -20°C. stants of the complexes between hCGH or its PAH1 fusions and For quantitative detection of the Fab fusion protein in an the human growth hormone receptor are Summarized in the ELISA, the wells of a 96 well microtiter plate (Maxisorb, following Table: NUNC, Denmark) were coated overnight at 4°C. with 50 ul 45 of a 10 g/ml solution of recombinant Her2/ErbB2 ectodomain antigen in 50 mM NaHCO pH 9.6. Then, the hCH variant k10M's k-10's KopM) wells were blocked with 200ul of 3% (w/v) BSA in PBS for GH 10.2 10.6 10.4 1 hand washed three times with PBS/T (PBS containing 0.1% PAlt1 (200)-hCGH 4.75 9.18 19.3 (v/v) Tween 20). The plasma samples were applied in dilution PAlt1 (400)-hCGH 3.26 14.0 42.9 50 series in PBS/T containing 0.5% (v/v) mouse plasma from an PAlt1 (600)-hCGH 3.29 12.5 38.0 untreated animal and incubated for 1 h. The wells were then washed three times with PBS/T and incubated for 1 h with 50 These data show that the fusion of hCGH with PA#1 ul of a 1:1000 diluted solution of an anti-human CK antibody polypeptides of different lengths does not significantly inter alkaline phosphatase conjugate in PBS/T. After washing fere with receptor binding. 
All hCGH PAi1 polypeptide 55 twice with PBS/T and twice with PBS the chromogenic reac fusions retain receptor binding activity within a factor 5 com tion was started by adding 50 ul of 0.5 lug/ml p-nitrophenyl pared with the recombinanthCH lacking a PA#1 polypeptide. phosphate in 100 mM Tris/HCl pH 8.8, 100 mMNaCl, 5 mM MgCl, as substrate, and after 15 minat 25°C. the absorbance Example 16 at 405 nm was measured. Concentrations of Fab, Fab-PA#1 60 (200), and Fab-PA#1 (600) in the plasma samples were quan Detection of Prolonged Plasma Half-Life In Vivo for tified by comparison of the measured signals with standard the Recombinant Fusion Proteins Between a Fab curves which were determined for dilution series for the Fragment and Genetically Encoded PA#1 Polymers corresponding purified proteins at defined concentrations in PBS/T containing 0.5% (v/v) mouse plasma from untreated Adult BALB/c mice (SPF stock breeding: TU Munchen, 65 animals. Freising, Germany) were intravenously injected according to To estimate the plasma half-life of Fab, Fab-PA#1 (200), the following Table: and Fab-PA#1 (600), the concentration values, c(t), were US 9,221,882 B2 71 72 determined for each time point from the ELISA measure Example 17 ments and plotted against time post intravenous injection, t. These data were numerically fitted using KaleidaGraph soft Gene Synthesis for P1A1 and P1A3 Amino Acid ware assuming a bi-exponential decay according to the equa Polymers/Polypeptides and Construction of pFab tion 5 P1A1 (200) and pFab-P1A3(200) as Expression Vec tors for Fab-P1A1(200) and Fab-P1A3(200) Fusion Proteins

-In 2-- -In2- - Gene fragments encoding a repetitive polymer sequence C(i) = Coe If2 + (co - C)e 112 comprising the Pro/Ala polypeptides/polymers P1A1 (SEQ 10 IDNO:51) and P1A3, also designated PA#3, (SEQID NO:3) were obtained by hybridisation of pairs of complementary whereby t', and to are the half-life values of the distri oligodeoxynucleotides, respectively, SEQ ID NO: 52 and bution phase C. and the elimination phase?, respectively.co is SEQID NO: 53 for P1A1 and SEQ ID NO: 54 and SEQ ID the total blood concentration at time point Zero while c is the NO: 55 for P1A3 as described in Example 1. pFab-P1A1 concentration amplitude for the distribution phase. 15 (200) (Seq ID NO:58) and pFab-P1A3(200) (Seq ID NO. 59) coding for Fab fragments with the corresponding Pro/Ala FIG. 8 depicts the pharmacokinetics for the three test items polymers/polypeptide segments of 200 residues at the C-ter in BALB/c mice. While the recombinant Fab shows a rapid minus of the light chain (LC) (amino acid sequence of LC blood clearance with an elimination half-life of just ca. 1.3 h. Fab-P1A1(200): SEQID NO:56; amino acid sequence of LC the Fab-PAii.1(200) and Fab-PA#1 (600) fusion proteins have Fab-P1A3(200): SEQ ID NO: 57) were constructed in an a more than 3-fold and 29-fold extended half-life with corre analogous manner to pFab-PA#1 (200), which has been sponding values of ca. 4.1 h and 38.8 h, respectively. These described in Example 2. data prove that the in vivo plasma half-life of a Fab fragment In the following SEQID NOs: 56, 57, 58 and 59 are also is significantly prolonged due to fusion with a Pro/Ala poly reproduced. However, these sequences are also comprised in mer/polypeptide, whereby the half-life becomes longer with 25 the appended sequence listing which is a specific part of this increasing length of the amino acid polymer. disclosure and the description of the present invention.
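The bi-exponential model and the dosing arithmetic of Example 16 can be sketched in Python as follows. This is a minimal illustration and not the KaleidaGraph fit used in the text: the function names are illustrative, and the amplitudes and the distribution half-life in the usage lines are hypothetical placeholders. Only the elimination half-lives (ca. 1.3 h, 4.1 h and 38.8 h), the dose (5.0 mg/kg) and the concentration (1.0 mg/ml) are taken from the text.

```python
import math

# Elimination half-lives in hours, as reported in Example 16.
T_HALF_BETA_H = {"Fab": 1.3, "Fab-PA#1(200)": 4.1, "Fab-PA#1(600)": 38.8}


def biexp_concentration(t, c0, c_alpha, t_half_alpha, t_half_beta):
    """Bi-exponential decay: distribution phase (alpha) plus elimination phase (beta).

    c(t) = c_alpha * exp(-ln2 * t / t_half_alpha)
         + (c0 - c_alpha) * exp(-ln2 * t / t_half_beta)
    """
    distribution = c_alpha * math.exp(-math.log(2) * t / t_half_alpha)
    elimination = (c0 - c_alpha) * math.exp(-math.log(2) * t / t_half_beta)
    return distribution + elimination


def fold_extension(name, reference="Fab"):
    """Ratio of elimination half-lives relative to the unfused Fab."""
    return T_HALF_BETA_H[name] / T_HALF_BETA_H[reference]


def injection_volume_ul(body_weight_g, dose_mg_per_kg=5.0, conc_mg_per_ml=1.0):
    """Injection volume in microlitres for a given body weight in grams."""
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0
    return dose_mg / conc_mg_per_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical amplitudes and distribution half-life, for illustration only.
    for t in (0.0, 1.0, 4.0, 24.0):
        c = biexp_concentration(t, c0=100.0, c_alpha=60.0,
                                t_half_alpha=0.5,
                                t_half_beta=T_HALF_BETA_H["Fab-PA#1(600)"])
        print(f"t = {t:5.1f} h  c(t) = {c:7.2f}")
    for name in ("Fab-PA#1(200)", "Fab-PA#1(600)"):
        print(name, "fold extension:", round(fold_extension(name), 1))
    print("volume for a 20 g animal [ul]:", injection_volume_ul(20))
```

The form guarantees c(0) = c0, so the fit has four free parameters; the ratios of the reported elimination half-lives reproduce the "more than 3-fold and 29-fold" statement.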

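The relation between the kinetic and equilibrium constants in the SPR table of Example 15 can be checked with a short sketch. Because the power-of-ten scaling of the column headers is not legible in this copy, the sketch works only with ratios relative to unfused hGH, in which any common scaling cancels. The dictionary layout and function names are illustrative assumptions.

```python
# (k_on, k_off, K_D) as tabulated in Example 15.
# The columns share a common power-of-ten scaling that is not legible here,
# so only ratios between variants are evaluated.
SPR = {
    "hGH": (10.2, 10.6, 10.4),
    "PA#1(200)-hGH": (4.75, 9.18, 19.3),
    "PA#1(400)-hGH": (3.26, 14.0, 42.9),
    "PA#1(600)-hGH": (3.29, 12.5, 38.0),
}


def kd_ratio_from_rates(variant, reference="hGH"):
    """Fold change in K_D computed from K_D = k_off / k_on."""
    k_on, k_off, _ = SPR[variant]
    ref_on, ref_off, _ = SPR[reference]
    return (k_off / k_on) / (ref_off / ref_on)


def kd_ratio_tabulated(variant, reference="hGH"):
    """Fold change in K_D taken directly from the tabulated K_D column."""
    return SPR[variant][2] / SPR[reference][2]


if __name__ == "__main__":
    for name in SPR:
        print(f"{name:15s} from rates: {kd_ratio_from_rates(name):.2f}  "
              f"tabulated: {kd_ratio_tabulated(name):.2f}")
```

Both routes give the same fold changes to within rounding, and every fusion stays below a factor of 5 relative to hGH, in line with the statement in Example 15.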
SEQ ID NO: 56 Asp Ile Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly 15

Asp Val Thr Ile Thr Arg Ala Ser Gln Asp Val Asn Thr Ala 25 30

Val Ala Trp Gln Gln Lys Pro Gly Ala Pro Lys Leu Leu Ile 35 40 45

Ser Ala Ser Phe Leu Tyr Ser Gly Val Pro Ser Arg Phe Ser Gly 50 55 60

Ser Arg Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro 65 70

Glu Asp Phe Ala Thr Tyr Tyr Gln Gln His Thr Thr Pro Pro 85 90 95

Thr Phe Gly Gln Gly Thr Lys Leu Glu Ile Arg Thr Val Ala Ala 105 110

Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly 115 120 125

Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala 130 135 140

Lys Val Gln Trp Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln 145 150 155 160

Glu Ser Val Thr Glu Gln Asp Ser Asp Ser Thr Ser Leu Ser 165 170 175

Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu His Lys Val 180 185 190

Ala Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Ser 195 200 205

Phe Asn Arg Gly Glu Ser Ser Ala Pro Ala Pro Ala Pro Ala Pro 210 215 220

Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro 225 230 235 240

Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro Ala Pro 245 250 255

Pro

3 5

4. 5

SEQ ID NO: 57 Asp Ile Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly 10 15

Asp Arg Val Thr Ile Thr Arg Ala Ser Gln Asp Val Asn Thr Ala 25

Val Ala Trp Gln Gln Pro Gly Ala Pro Lys Leu Leu Ile 35 40 45

Ser Ala Ser Phe Leu Tyr Ser Gly Val Pro Ser Arg Phe Ser Gly 50 55 60

Ser Arg Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro 65 70

Glu Asp Phe Ala Thr Gln Gln His Thr Thr Pro Pro 85 90 95

Thr Phe Gly Gln Gly Thr Leu Glu Ile Lys Arg Thr Val Ala Ala 105 110

Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Ser Gly 115 120 125

Thr Ala Ser Val Val Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala 130 135 140

Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln 145 150 155 160

Glu Ser Val Thr Glu Gln Asp Ser Asp Ser Thr Ser Leu Ser 165 170 175

Ser Thr Leu Thr Leu Ser Ala Asp Glu His Lys Val Tyr 180 185 190

Ala Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Ser 195 205

Phe Asn Arg Gly Glu Ser Ser Ala Ala Ala Pro Ala Ala Ala Pro 210 215 220

Ala Ala Ala Pro Ala Ala Ala Pro Ala Ala Ala Pro Ala Ala Ala Pro 225 230 235 240

Ala Ala Ala Pro Ala Ala Ala Pro Ala Ala Ala Pro Ala Ala Ala Pro

US 9,221,882 B2 83 84 - Continued aagc catacc aaacgacgag cqtgacacca catgcctgt agcaatggca acaacgttgc 3480 gcaaactatt aactggcgaa citact tactic tagct tcc.cg gcaacaattg atagactgga 354 O tggaggcgga taaagttgca ggaccactitc. tcgct cqgc cct tccggct ggctggttta 36OO ttgctgataa atctggagcc tat cattgca gCactggggc 3660 cagatggtaa gcc ctic cc.gt atcgtagtta t ctacacgac ggggagt cag gcaactatogg 372 O atgaacgaaa tag acagatc gctgagatag gtgcct cact gattalagcat tgg taggaat 378 O taatgatgtc. tcgtttagat aaaagtaaag tdattaa.cag cgcattagag ctgcttaatg 384 O aggit cqgaat cgaaggttta acaac ccgta aactic gocca gaagctaggit gtagagcagc 39 OO ctacattgta ttggcatgta aaaaataagc gggctittgct cgacgcc tta gcc attgaga 396 O tgttagatag gcaccatact cacttittgcc ctittagaagg ggaaagctgg caagatttitt 4020 tacgtaataa cqctaaaagt tittagatgtg citt tactaag t catcgcgat ggagcaaaag 408O tacatttagg tacacggcct acagaaaaac agtatgaaac t ct cqaaaat caattagcct 414 O ttittatgcca acaaggtttt toactagaga atgcattata tgcacticago gcagtggggc 42OO attt tactitt aggttgcgta ttggaagat.c aagagcatca agt cqctaaa gaagaaaggg 426 O aaac acctac tactgatagt atgcc.gc.cat tattacgaca agctatcgaa ttatttgat c 432 O accalaggtgc agagc.cagcc ttct tatt cq gocttgaatt gat catatgc ggattagaaa 438O aacaacttaa atgtgaaagt gggtcttaaa agcagcatala cctttitt cog tgatggtaac 444 O ttcact agtt taaaaggat.c taggtgaaga t cotttittga taatcto atg accaaaatcc 45OO cittaacgtga gttitt.cgttc. cactgagcgt cagaccc.cgt. agaaaagat C aaaggat citt 4560 cittgagat co tttittittctg cqcgtaatct gctgcttgca aacaaaaaaa. ccaccgctac 462O Cagcggtggt ttgtttgc.cg gat calaga.gc tacca actict ttitt.ccgaag gtaactggct 468O t cagcagagc gcagatacca aatactgtcc ttctagtgta gcc.gtagtta ggccaccact 474 O t caagaactic tdtag.c accq cct acatacct cqct citgct aatcc tdtta c cagtggctg 4800 cgataagt cq ttct taccg ggttggactic alagacgatag ttaccggata 486 O aggcgcagcg gttcgggctga acggggggtt C9tgcacaca gcc cagottg gag.cgaacga 492 O

Cctacaccga actgagatac Ctacagogtg agctatgaga aag.cgc.cacg citt cocqaag 498 O ggagaaaggc gga Caggit at CC9gtaag.cg gCagggit C9g aac aggaga.g cgcacgaggg 504 O agct tccagg gggaaacgc.c tggitat ctitt atagt cotgt cgggttt CC cacct ctdac 51OO ttgagcgt.cg attitttgttga tigct cqtcag ggggg.cggag Cctatggaaa aacgc.ca.gca 516 O acgcggcctt tttacggttc ctdgccttitt gctggcc titt tgct cacatg 521 O

50 Example 18 Fab-PA#1 (200) in Example 4, were individually applied at a concentration of 0.25 mg/ml in PBS. Both proteins eluted in a single homogenous peak as shown in FIG. 10.

Measurement of the Hydrodynamic Volume for the 55 As result, the fusion proteins with the 200 residue P1A1 or Recombinant Fusion Protein Between a Fab P1A3 polymers/polypeptides exhibited significantly larger Fragment and a Genetically Encoded P1A1 or P1A3 sizes than the corresponding unfused Fab fragment. The Polypeptide/Polymer by Analytical Gel Filtration apparent size increase for Fab-P1A1 (200) and Fab-P1A3

60 (200) was 5.8-fold and 5.2-fold, respectively, compared with SEC was carried out on a Superdex S200 HR 10/300 GL the Fab fragment (cf. FIG. 4B) whereas the true mass was column (GE Healthcare Europe, Freiburg, Germany) at a flow only larger by 1.4-fold and 1.3-fold. This observation clearly rate of 1 ml/minusing an Akta Purifier 10 system (GE Health indicates a much increased hydrodynamic Volume conferred care) with PBS as running buffer. 250 ul samples of the 65 to the biologically active Fab fragment by the biosynthetic Fab-P1A1(200) and Fab-P1A3(200) fusion proteins, which P1A1 and P1A3 polypeptide segments according to this were similarly produced and purified (FIG.9) as described for invention. US 9,221,882 B2 85 86 Example 19 PA#1 (200) (described in Example 21) together with the plas mid pLysE (Studier (1991).J. Mol. Biol. 219:37-44), which Detection of Random Coil Conformation for the Suppresses the T7 promoter. Bacterial production was per Biosynthetic P1A1 and P1A3 Polymers/Polypeptides formed at 30°C. in shake flask cultures with 2 LLB medium Fused to a Fab Fragment Via Circular Dichroism 5 containing 2.5 g/L D-glucose, 0.5 g/L L-proline, 100 mg/1 (CD) Spectroscopy ampicillin, and 30 mg/l chloramphenicol. Recombinant gene expression was induced by addition ofisopropyl-B-D-thioga CD spectra for Fab-P1A1(200) and Fab-P1A3(200) were lactopyranoside (IPTG) to a final concentration of 0.5 mM. recorded as described in Example 8 using 4.2 and 6.5 uM Bacteria were harvested 3 h after induction, resuspended in protein solutions, respectively, prepared similarly as 10 100 mM NaCl, 40 mMNa-phosphate pH 7.5 and lysed using described in Example 4 using 50 mMKSO, 20 mMK-phos a French pressure cell (Thermo Scientific, Waltham, Mass. phate pH 7.5 as buffer. USA). After centrifugation (15 min, 15000 g) of the lysate no The spectra for the Fab-P1A1 (200) and Fab-P1A3(200) were observed. 
fusion proteins revealed a significant fraction of random coil The Supernatant containing the Soluble fusion protein was conformation (FIG. 11A). To analyze the spectroscopic con 15 incubated at 70° C. for 15 minand centrifuged (15min, 15000 tribution by the Pro/Ala polypeptide segment in greater detail g) to remove thermally unstable host cell proteins. The His the molar difference CD spectrum with respect to the unfused (6)-SUMO-PA#1 (200) fusion protein was purified from the Fab fragment (see Example 8) was calculated (FIG. 11B) by supernatant via IMAC (Skerra (1994) Gene 141:79-84) using subtracting the latter spectrum from the one for Fab-P1A1 a 12 ml Ni" charged HisTrap high performance column (GE (200) and Fab-P1A3 (200), respectively, after normalization Healthcare) connected to an Aktapurifier system (GE Health to the same molar concentration. As result, a strong minimum care) and eluted with animidazole gradient from 0 to 150 mM at a wavelength of approximately 200 nm, which is charac in 500 mM. NaCl, 40 mM Na-phosphate pH 7.5. After a teristic of random coil conformation, was observed. Thus, the Subsequent preparative SEC step a homogeneous preparation P1A1 and the P1A3 sequences as part of the recombinant of the His(6)-SUMO-PA#1 (200) fusion protein (FIG. 12B) fusion protein appear to be present in random coil conforma 25 with a yield of approximately 5 mg per 1 L bacterial culture tion under physiological buffer conditions. with OD550=1 was obtained. Protein concentration was determined according to the absorption at 280 nm using a Example 20 calculated extinction coefficient (Gill (1989) loc. cit) of 1280 M' cm' for the His(6)-SUMO-PA#1 (200) polypeptide Construction of pSUMO-PA#1 (200) as Expression 30 fusion. Note that the PAit 1 (200) polypeptide segment does Vector for a His(6)-SUMO-PA#1 (200) Fusion not contribute to the absorption at 280 nm due to its lack of Protein aromatic or sulfur-containing amino acid side chains. 
The biosynthetic PA#1 (200) polypeptide was liberated For the construction of an expression plasmid encoding a from the fusion protein by site specific proteolytic cleavage six-residue His-tag and the Small ubiquitin-like modifier 35 (downstream of a Gly-Gly motif preceding the Pro/Ala (SUMO) protein (Panavas (2009) Methods Mol. Biol. 497: polypeptide segment) with 2 U/mg Ubl-specific protease 1 303-17) fused to a 200 residue PAH1 sequence repeat, the from Saccharomyces cerevisiae (Invitrogen, Carlsbad, Calif., SUMO protein) from Saccharomyces cerevisiae also known USA) for 1 h at 30° C. in cleavage buffer (0.2 w/v % Igepal, as Smt3p; Uniprot: D12306 was amplified via polymerase 1 mM DTT, 150 mM NaCl, 50 mM Tris-HCl pH 8.0). The chain reaction (PCR) from a cloned cDNA. The 5'-primer 40 cleavage process was checked by SDS-PAGE (FIG. 12B) introduced an Ndel restriction site, containing a Met start using a high molarity Tris buffer system (Fling (1986) Anal. codon (ATG) and an additional Lys codon, as well as the Biochem. 155: 83-88). In order to remove the cleaved His(6)- His6-tag encoding sequence while the 3'-primer introduced a SUMO protein, residual uncleaved fusion protein, and also HindIII and SapI restriction site into the PCR product. The the SUMO protease, all carrying the His-tag, the reaction resulting DNA fragment was digested with Ndel and HindIII 45 mixture was subjected to another IMAC using a 5 ml Ni" and ligated with a correspondingly digested derivative of the charged HisTrap high performance column (GE Healthcare) plasmid pSA1 (Schmidt (1994) J. Chromatogr. 676: 337 and 500 MNaCl, 20 mM phosphate, pH 7.5 as running buffer. 345), wherein the SapI restriction site had been eliminated by This time the flow-through contained the pure biosynthetic silent mutation. The resulting plasmid was cut with Sap, PA#1 (200) polypeptide (FIG. 13 E). 
Note that the biosyn dephosphorylated with shrimp alkaline phosphatase, and 50 thetic PA#1 (200) polypeptide/polymer (SEQID NO: 61) pre ligated with the gene fragment encoding the 200 residue pared in this manner comprises altogether 201 amino acid PA#1 polypeptide segment excised from the plasmid pFab residues, which arise from the encoded combined gene prod PA#1 (200) (described in Example 2) by restriction digest uct of 10 ligated double-stranded oligodeoxynucleotide with SapI (in an analogous way as exemplified in FIG. 2E). building blocks, each encoding 20 amino acid residues, as The resulting plasmid was designated pSUMO-PAi1 (200) 55 shown in FIG. 1, and an additional Ala residue encoded by the (SEQ ID NO: 60) and is depicted in FIG. 12A. triplet DNA overhang of the downstream SapI restriction site that was used for cloning. Example 21 Example 22 Bacterial Expression and Isolation of a Genetically 60 Encoded PA#1 (200) Polymer/Polypeptide Preparation and Characterization of Small Molecule/Drug Conjugates with PAH1 (200) The PAi1 (200) polypeptide (calculated mass: 16.1 kDa) was initially produced as fusion protein with the Small ubiq The unpurified proteolytic cleavage reaction mixture of the uitin-like modifier (SUMO) protein (calculated mass: 12.2 65 His(6)-SUMO-PA#1 (200) fusion protein from Example 21 kDa) in the cytoplasm of E. coli BLR(DE3) (NEB, Ipswich, was twice dialysed at 4°C. against 50 mM NaHCO pH 8.3 Mass., USA) harboring the expression plasmid pSUMO and incubated at room temperature for 1 h after mixing with US 9,221,882 B2 87 88 a 10-fold molar excess of a solution of 6-fluorescein-5(6)- The present invention relates to and refers to the following carboxamidohexanoic acid N-hydroxysuccinimide ester exemplified sequences, whereby the appended sequence list (Fluorescein-NHS ester; Sigma-Aldrich) in dry dimethylfor ing is presented as part of the description and is, accordingly mamide (DMF). 
To this end, 200 ul of a 2.5 mg/ml solution of a part of this specification. the His(6)-SUMO-PAi1 (200) cleavage mixture was added to 5 SEQID NO: 1 shows the amino acid sequence of PAH 1. 17.6 ul of a 10 mM solution of Fluorescein-NHS ester dis SEQID NO: 2 shows the amino acid sequence of PAH2. solved in DMF. The resulting mixture was incubated at room SEQID NO: 3 shows the amino acid sequence of PAi3. temperature for 1 h and applied to IMAC as described in SEQID NO. 4 shows the amino acid sequence of PAH4. Example 21 to remove the cleaved His(6)-SUMO protein, SEQID NO: 5 shows the amino acid sequence of PAi5. residual uncleaved fusion protein, and the SUMO protease 10 and further purified by preparative SEC on a Superdex S200 SEQID NO: 6 shows the amino acid sequence of PAH6. 10/300 GL column equilibrated with PBS at a flow rate of 0.5 SEQID NO: 7 shows an amino acid sequence of a circular ml/min. permutated version of SEQID NO: 1 Samples from the different steps were then analysed via SEQID NO: 8 shows an amino acid sequence of a circular analytical SEC on a Superdex S200 10/300 GL column 15 permutated version of SEQID NO: 1. equilibrated with PBS at a flow rate of 0.5 ml/min. The SEQID NO: 9 shows an amino acid sequence of a circular SUMO protein was detected via its aromatic side chains at permutated version of SEQID NO: 1. 280 nm and the peptide bonds, including those of the Pro/Ala SEQID NO: 10 shows an amino acid sequence of a circular polypeptide or polypeptide segment, were detected at 225 nm permutated version of SEQID NO: 1. while fluorescein was detected at 494 nm (FIG. 13 A-G). For SEQID NO: 11 shows an amino acid sequence of a circular comparison, UV/VIS spectra of a solution of free fluorescein permutated version of SEQID NO: 1. (Sigma-Aldrich) and of fractions from each distinct peak SEQID NO: 12 shows an amino acid sequence of a circular detected in the SEC were measured using a Lambda 9 instru permutated version of SEQID NO: 1. 
ment (Perkin Elmer, Waltham, Mass., USA) (FIG. 13 H-K). SEQID NO: 13 shows an amino acid sequence of a circular For size calibration of the chromatography column (FIG. 13 25 permutated version of SEQID NO: 1. L), 250 ul of an appropriate mixture of the following globular SEQID NO: 14 shows an amino acid sequence of a circular proteins (Sigma-Aldrich) were applied in PBS at concentra permutated version of SEQID NO: 1. tions between 0.2 and 0.5 mg/ml: aprotinin, 6.5 kDa; cyto SEQID NO: 15 shows an amino acid sequence of a circular chrome c. 12.4 kDa; carbonic anhydrase, 29.0 kDa; bovine permutated version of SEQID NO: 1. serum albumin, 66.3 kDa; alcohol dehydrogenase, 150 kDa; 30 SEQID NO: 16 shows an amino acid sequence of a circular B-amylase, 200 kDa; apo-ferritin, 440 kDa. permutated version of SEQID NO: 1. As result, after coupling of the biosynthetic PA#1 (200) SEQ ID NO: 17 shows a nucleic acid sequence of the polypeptide/polymer with Fluorescein-NHS ester a macro upper/coding strand oligodeoxynucleotide used for the gen molecular conjugate was isolated via IMAC and SEC that eration of building block PAH 1. essentially exhibits the size properties of the PAi1 (200) 35 SEQ ID NO: 18 shows a nucleic acid sequence of lower/ polypeptide/polymer and the spectroscopic signature of the non-coding strand oligodeoxynucleotide used for the genera Small molecule, i.e. the fluorescein group. This demonstrates tion of the building block for PAiii 1. that the small molecule was successfully coupled to the bio SEQ ID NO: 19 shows a nucleic acid sequence stretch synthetic Pro/Ala polypeptide/polymer, which according to (upper/coding strand) around the C-terminus of the immuno this invention dramatically increases the hydrodynamic Vol 40 globulin light chain of an antibody Fab fragment as encoded ume of the conjugated Small molecule drug or compound. on pASK88-Fab-2XSapI. 
To prepare a similar conjugate between the biosynthetic Pro/Ala polypeptide/polymer and the plant steroid digoxigenin, 0.1 mg of the purified PA#1(200) polypeptide from Example 21 was dialysed against 50 mM NaHCO3 pH 8.3 as described above. The concentration of purified PA#1(200) polypeptide was determined according to the absorption at 205 nm (Gill (1989) loc. cit.). The PA#1(200) polypeptide was coupled with a 10-fold molar excess of digoxigenin-3-O-methylcarbonyl-ε-aminocaproic acid NHS ester (DIG-NHS ester; Roche Diagnostics, Mannheim, Germany). For this purpose, 100 µl of a 1 mg/ml solution of the purified PA#1(200) polypeptide in 50 mM NaHCO3, pH 8.3 was added to 2 µl of a 30 mM solution of DIG-NHS ester dissolved in dry DMF and the reaction mix was incubated for 1 h at room temperature. The resulting solution of the conjugate was purified using a Zeba™ spin desalting column with a cutoff of 7 kDa (Thermo Scientific), twice dialysed against 10 mM ammonium acetate buffer pH 6.8 and analysed via ESI mass spectrometry on a Q-Tof Ultima instrument (Waters, Eschbronn, Germany) using the positive ion mode. As a result, the spectrum of the Digoxigenin-PA#1(200) conjugate revealed a mass of 16671.4 Da, which essentially coincides with the calculated mass of 16670.6 Da (FIG. 13M). This clearly demonstrates that a biosynthetic Pro/Ala polypeptide/polymer, in particular PA#1(200), can be efficiently conjugated with a small molecule drug.

SEQ ID NO: 20 shows a nucleic acid sequence stretch (lower/non-coding strand) around the C-terminus of the immunoglobulin light chain of an antibody Fab fragment as encoded on pASK88-Fab-2xSapI.
SEQ ID NO: 21 shows an amino acid sequence of the C-terminus of the light chain of the Fab fragment as encoded on pASK88-Fab-2xSapI.
SEQ ID NO: 22 shows the nucleic acid sequence of pASK88-Fab-2xSapI.
SEQ ID NO: 23 shows a nucleic acid sequence stretch (upper/coding strand) encoding the amino acid sequence of the C-terminus of the Fab light chain after insertion of one PA#1(20) polymer.
SEQ ID NO: 24 shows a nucleic acid sequence (lower/non-coding strand) for an amino acid stretch of the C-terminus of an Fab light chain after insertion of one PA#1(20) polymer.
SEQ ID NO: 25 shows an amino acid sequence stretch of the C-terminus of an Fab light chain after insertion of one PA#1(20) polymer.
SEQ ID NO: 26 shows the amino acid sequence of the Fab heavy chain as encoded on pFab-PA#1(200).
SEQ ID NO: 27 shows the amino acid sequence of the Fab light chain fused with the PA#1(200) polymer as encoded on pFab-PA#1(200).
SEQ ID NO: 28 shows the nucleic acid sequence of pFab-PA#1(200).
SEQ ID NO: 29 shows the nucleic acid sequence (upper/coding strand) encoding the amino acid sequence of the N-terminus of IFNa2b and Strep-tag II (only the last two amino acids).
SEQ ID NO: 30 shows a nucleic acid sequence (lower/non-coding strand) encoding the amino acid sequence of the N-terminus of IFNa2b and Strep-tag II (only the last two amino acids).
SEQ ID NO: 31 shows the amino acid sequence of the C-terminus of Strep-tag II and the N-terminus of IFNa2b.
SEQ ID NO: 32 shows the nucleic acid sequence of pASK-IFNa2b.
SEQ ID NO: 33 shows a nucleic acid sequence stretch (upper/coding strand) encoding the C-terminus of Strep-tag II and the N-terminus of IFNa2b after insertion of one PA#1 polymer sequence cassette.
SEQ ID NO: 34 shows a nucleic acid sequence stretch (lower/non-coding strand) of the C-terminus of Strep-tag II and the N-terminus of IFNa2b after insertion of one PA#1 polymer sequence cassette.
SEQ ID NO: 35 shows an amino acid sequence stretch of the C-terminus of Strep-tag II and the N-terminus of IFNa2b after fusion with one PA#1 polymer cassette.
SEQ ID NO: 36 shows the amino acid sequence of IFNa2b and Strep-tag II fused with the PA#1(200) polymer as encoded on pPA#1(200)-IFNa2b.
SEQ ID NO: 37 shows the nucleic acid sequence of pPA#1(200)-IFNa2b.
SEQ ID NO: 38 shows a nucleic acid sequence stretch (upper/coding strand) on pASK75-His6-hGH encoding the amino acid sequence around the N-terminus of His6-hGH.
SEQ ID NO: 39 shows a nucleic acid sequence stretch (lower/non-coding strand) on pASK75-His6-hGH encoding the amino acid sequence around the N-terminus of hGH.
SEQ ID NO: 40 shows an amino acid sequence stretch of the N-terminus of His6-hGH as encoded on pASK75-His6-hGH.
SEQ ID NO: 41 shows the nucleic acid sequence of pASK75-His6-hGH.
SEQ ID NO: 42 shows a nucleic acid sequence stretch (upper/coding strand) encoding the amino acid sequence of the N-terminus of His6-hGH after insertion of the PA#1(20) polymer.
SEQ ID NO: 43 shows a nucleic acid sequence (lower/non-coding strand) encoding the N-terminus of hGH after insertion of one PA#1 polymer sequence cassette.
SEQ ID NO: 44 shows the amino acid sequence of the N-terminus of His6-hGH after insertion of the PA#1(20) polymer.
SEQ ID NO: 45 shows the amino acid sequence of mature His6-PA#1(200)-hGH as encoded on pASK75-His6-PA#1(200)-hGH.
SEQ ID NO: 46 shows the nucleic acid sequence of pASK75-His6-PA#1(200)-hGH.
SEQ ID NO: 47 shows the amino acid sequence of His6-PA#1(200)-hGH as encoded on pCHO-PA#1(200)-hGH.
SEQ ID NO: 48 shows the nucleic acid sequence of pCHO-PA#1(200)-hGH.
SEQ ID NO: 49 shows the nucleic acid sequence of pCHO-hGH.
SEQ ID NO: 50 shows the nucleic acid sequence of pCHO.
SEQ ID NO: 51 shows the amino acid sequence of P1A1.
SEQ ID NO: 52 shows the nucleic acid sequence of the upper/coding strand oligodeoxynucleotide used for the generation of the building block for P1A1.
SEQ ID NO: 53 shows the nucleic acid sequence of the lower/non-coding strand oligodeoxynucleotide used for the generation of the building block for P1A1.
SEQ ID NO: 54 shows the nucleic acid sequence of the upper/coding strand oligodeoxynucleotide used for the generation of the building block for P1A3.
SEQ ID NO: 55 shows the nucleic acid sequence of the lower/non-coding strand oligodeoxynucleotide used for the generation of the building block for P1A3.
SEQ ID NO: 56 shows the amino acid sequence of the Fab light chain fused with the P1A1(200) polymer as encoded on pFab-P1A1(200).
SEQ ID NO: 57 shows the amino acid sequence of the Fab light chain fused with the P1A3(200) polymer as encoded on pFab-P1A3(200).
SEQ ID NO: 58 shows the nucleic acid sequence of pFab-P1A1(200).
SEQ ID NO: 59 shows the nucleic acid sequence of pFab-P1A3(200).
SEQ ID NO: 60 shows the nucleic acid sequence of pSUMO-PA#1(200).
SEQ ID NO: 61 shows the PA#1(200) polypeptide/polymer used for the preparation of drug conjugates (made by ligation of 10 20mer encoding gene cassettes, including one additional C-terminal Ala residue resulting from the downstream ligation site).
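The composition implied by the description of SEQ ID NO: 61 (ten copies of the 20mer cassette plus one C-terminal Ala) and the agreement between the measured and calculated conjugate masses reported for the digoxigenin conjugate can be checked with a short calculation. The following Python sketch is illustrative only: it assumes that the polypeptide consists of exactly these 201 residues with free termini, and it uses standard average residue masses; the function names are hypothetical and not part of the disclosure.

```python
# Average residue masses in Da for the two residue types in the polymer,
# plus one water for the free N- and C-termini (standard reference values).
AVERAGE_RESIDUE_MASS = {"A": 71.0788, "P": 97.1167}
WATER_MASS = 18.0153

PA1_20MER = "AAPAAPAPAAPAAPAPAAPA"  # SEQ ID NO: 1 in one-letter code


def build_pa1_200_with_ala(repeats: int = 10) -> str:
    """Concatenate the 20mer cassette and append the extra C-terminal Ala.

    Assumption: nothing else is attached to the chain.
    """
    return PA1_20MER * repeats + "A"


def average_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified linear peptide in Da."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in sequence) + WATER_MASS


def ppm_deviation(measured: float, calculated: float) -> float:
    """Relative deviation of a measured mass from a calculated one, in ppm."""
    return (measured - calculated) / calculated * 1e6


polymer = build_pa1_200_with_ala()
print(len(polymer), polymer.count("A"), polymer.count("P"))  # 201 131 70
print(f"unconjugated polymer (assumed composition): {average_mass(polymer):.1f} Da")
# Measured and calculated conjugate masses as stated in the text (FIG. 13M).
print(f"conjugate deviation: {ppm_deviation(16671.4, 16670.6):.0f} ppm")
```

The difference of 0.8 Da between the two conjugate masses stated in the text corresponds to a deviation of a few tens of ppm, which is what the phrase "essentially coincides" refers to.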

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 61

<210> SEQ ID NO 1
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of PA#1"

<400> SEQUENCE: 1

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro
1               5                   10                  15

Ala Ala Pro Ala
            20

<210> SEQ ID NO 2
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of PA#2"

<400> SEQUENCE: 2

Ala Ala Pro Ala Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala
1               5                   10                  15

Pro Ala Ala Pro
            20

<210> SEQ ID NO 3
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of PA#3"

<400> SEQUENCE: 3

Ala Ala Ala Pro Ala Ala Ala Pro Ala Ala Ala Pro Ala Ala Ala Pro
1               5                   10                  15

Ala Ala Ala Pro
            20

<210> SEQ ID NO 4
<211> LENGTH: 24
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of PA#4"

<400> SEQUENCE: 4

Ala Ala Pro Ala Ala Pro Ala Ala Pro Ala Ala Pro Ala Ala Pro Ala
1               5                   10                  15

Ala Pro Ala Ala Pro Ala Ala Pro
            20

<210> SEQ ID NO 5
<211> LENGTH: 24
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of PA#5"

<400> SEQUENCE: 5

Ala Pro Ala Ala Ala Pro Ala Pro Ala Ala Ala Pro Ala Pro Ala Ala
1               5                   10                  15

Ala Pro Ala Pro Ala Ala Ala Pro
            20

<210> SEQ ID NO 6
<211> LENGTH: 24
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of PA#6"

<400> SEQUENCE: 6

Ala Ala Ala Pro Ala Ala Pro Ala Ala Pro Pro Ala Ala Ala Ala Pro
1               5                   10                  15

Ala Ala Pro Ala Ala Pro Pro Ala
            20

<210> SEQ ID NO 7
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 7

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala
1               5                   10                  15

Ala Pro Ala Ala
            20

<210> SEQ ID NO 8
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 8

Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala
1               5                   10                  15

Pro Ala Ala Ala
            20

<210> SEQ ID NO 9
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 9

Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro
1               5                   10                  15

Ala Ala Ala Pro
            20

<210> SEQ ID NO 10
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 10

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
1               5                   10                  15

Ala Ala Pro Ala
            20

<210> SEQ ID NO 11
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 11

Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala
1               5                   10                  15

Ala Pro Ala Ala
            20

<210> SEQ ID NO 12
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 12

Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala
1               5                   10                  15

Pro Ala Ala Pro
            20

<210> SEQ ID NO 13
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 13

Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro
1               5                   10                  15

Ala Ala Pro Ala
            20

<210> SEQ ID NO 14
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 14

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala
1               5                   10                  15

Ala Pro Ala Pro
            20

<210> SEQ ID NO 15
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 15

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala
1               5                   10                  15

Pro Ala Pro Ala
            20

<210> SEQ ID NO 16
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a circular permutated version of SEQ ID NO: 1"

<400> SEQUENCE: 16

Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro
1               5                   10                  15

Ala Pro Ala Ala
            20
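SEQ ID NOs: 7 to 16 are described as circular permutated versions of SEQ ID NO: 1. A minimal Python sketch of how such permutations may be generated and checked is given below; it assumes that a circular permutation means moving the first k residues of the 20mer to its C-terminal end, and the helper names are hypothetical. The sequences are written in one-letter code as read from the listing above.

```python
SEQ1 = "AAPAAPAPAAPAAPAPAAPA"  # PA#1, SEQ ID NO: 1

# SEQ ID NOs: 7-16 in one-letter code, transcribed from the listing.
LISTED = {
    7: "APAAPAPAAPAAPAPAAPAA",
    8: "PAAPAPAAPAAPAPAAPAAA",
    9: "AAPAPAAPAAPAPAAPAAAP",
    10: "APAPAAPAAPAPAAPAAAPA",
    11: "PAPAAPAAPAPAAPAAAPAA",
    12: "APAAPAAPAPAAPAAAPAAP",
    13: "PAAPAAPAPAAPAAAPAAPA",
    14: "AAPAAPAPAAPAAAPAAPAP",
    15: "APAAPAPAAPAAAPAAPAPA",
    16: "PAAPAPAAPAAAPAAPAPAA",
}


def rotate(sequence: str, k: int) -> str:
    """Move the first k residues to the C-terminal end."""
    k %= len(sequence)
    return sequence[k:] + sequence[:k]


def is_circular_permutation(a: str, b: str) -> bool:
    """True if b can be obtained from a by some rotation."""
    return len(a) == len(b) and b in a + a


for seq_id, sequence in LISTED.items():
    shift = seq_id - 6  # SEQ ID NO: 7 is shifted by one residue, and so on
    print(seq_id, shift, rotate(SEQ1, shift) == sequence)
```

Under this reading, each listed sequence corresponds to SEQ ID NO: 1 shifted by one further residue, from one residue (SEQ ID NO: 7) to ten residues (SEQ ID NO: 16).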

<210> SEQ ID NO 17
<211> LENGTH: 60
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of the upper/coding strand oligodeoxynucleotide used for the generation of building block PA#1"

<400> SEQUENCE: 17

gccgctccag ctgcacctgc tccagcagca cctgctgcac cagctccggc tgctcctgct       60

<210> SEQ ID NO 18
<211> LENGTH: 60
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of lower/non-coding strand oligodeoxynucleotide used for the generation of the building block for PA#1"

<400> SEQUENCE: 18

ggcagcagga gcagccggag ctggtgcagc aggtgctgct ggagcaggtg cagctggagc       60
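The relationship between the two oligodeoxynucleotides of SEQ ID NOs: 17 and 18 and the PA#1 building block can be illustrated as follows. This Python sketch is illustrative only: it translates the upper/coding strand with a reduced codon table (covering only the Ala and Pro codons that occur here) and checks how the two strands pair; the helper names are hypothetical, and the sequences are those of the listing with extraction artifacts removed.

```python
SEQ17 = "gccgctccagctgcacctgctccagcagcacctgctgcaccagctccggctgctcctgct"
SEQ18 = "ggcagcaggagcagccggagctggtgcagcaggtgctgctggagcaggtgcagctggagc"

# Reduced codon table: only Ala and Pro codons are needed for this cassette.
CODONS = {
    "gca": "A", "gcc": "A", "gcg": "A", "gct": "A",
    "cca": "P", "ccc": "P", "ccg": "P", "cct": "P",
}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a lower-case DNA string."""
    return dna.translate(str.maketrans("acgt", "tgca"))[::-1]


def translate(dna: str) -> str:
    """Translate in frame 1 using the reduced codon table."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(translate(SEQ17))  # AAPAAPAPAAPAAPAPAAPA
# The strands pair over 57 nucleotides; each keeps three unpaired 5' bases.
print(reverse_complement(SEQ18[3:]) == SEQ17[3:])  # True
print(SEQ17[:3], SEQ18[:3])  # gcc ggc
```

The two unpaired 5' ends are complementary to each other, which is consistent with the head-to-tail ligation of several 20mer encoding cassettes mentioned for SEQ ID NO: 61.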

<210> SEQ ID NO 19
<211> LENGTH: 49
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence stretch (upper/coding strand) around the C-terminus of the immunoglobulin light chain of an antibody Fab fragment as encoded on pASK88-Fab-2xSapI"

<400> SEQUENCE: 19

aagagcttca accgcggaga gtgctcttct gcctgaagag cttaagctt                   49

<210> SEQ ID NO 20
<211> LENGTH: 49
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence stretch (lower/non-coding strand) around the C-terminus of the immunoglobulin light chain of an antibody Fab fragment as encoded on pASK88-Fab-2xSapI"

<400> SEQUENCE: 20

aagcttaagc tcttcaggca gaagagcact ctccgcggtt gaagctctt                   49

<210> SEQ ID NO 21
<211> LENGTH: 11
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of the C-terminus of the light chain of the Fab fragment as encoded on pASK88-Fab-2xSapI"

<400> SEQUENCE: 21

Lys Ser Phe Asn Arg Gly Glu Cys Ser Ser Ala
1               5                   10
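The cloning site described by SEQ ID NOs: 19 to 21 can be inspected with a few lines of Python. The sketch below is illustrative only: it searches the upper/coding strand for the SapI recognition sequence (GCTCTTC) in both orientations, as the plasmid name pASK88-Fab-2xSapI suggests two such sites, and translates the reading frame up to the stop codon. The codon table is restricted to the codons present in this stretch, and the helper names are hypothetical.

```python
SEQ19 = "aagagcttcaaccgcggagagtgctcttctgcctgaagagcttaagctt"
SAPI_SITE = "gctcttc"  # SapI recognition sequence, lower case

# Only the codons occurring in this stretch; "*" marks the stop codon.
CODONS = {
    "aag": "K", "agc": "S", "ttc": "F", "aac": "N", "cgc": "R", "gga": "G",
    "gag": "E", "tgc": "C", "tct": "S", "gcc": "A", "tga": "*",
}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a lower-case DNA string."""
    return dna.translate(str.maketrans("acgt", "tgca"))[::-1]


def find_all(dna: str, motif: str) -> list:
    """Zero-based start positions of every occurrence of motif in dna."""
    hits, start = [], dna.find(motif)
    while start != -1:
        hits.append(start)
        start = dna.find(motif, start + 1)
    return hits


def translate_until_stop(dna: str) -> str:
    """Translate in frame 1 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODONS[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(find_all(SEQ19, SAPI_SITE))                      # [22]
print(find_all(SEQ19, reverse_complement(SAPI_SITE)))  # [34]
print(translate_until_stop(SEQ19))                     # KSFNRGECSSA
```

The translated stretch agrees with SEQ ID NO: 21, and the two recognition sequences face each other across the Ala codon that precedes the stop codon.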

<210> SEQ ID NO 22
<211> LENGTH: 4610
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of pASK88-Fab-2xSapI"

<4 OOs, SEQUENCE: 22 accc.gacacic atcgaatggc cagatgatta attcc taatt tttgttgaca citctato att 6 O gatagagitta ttttaccact c cctatoagt gatagagaaa agtgaaatga at agttcgac 12 O aaaaatctag atalacgaggg caaaaaatga aaaagacagc tat cqcgatt gcagtggcac 18O tggctggttt cqctaccgta gcgcaggc.cg aagttaaact gcaggaatcc ggtggtggit c 24 O tggttcagcc aggtggttcc ctg.cggct ct cqtgtgctgc titc.cggttt C alacatcaaag 3OO acacct acat C cactgggitt cqt caggctic cqggtaaagg cctggaatgg gttgctcgta 360 tctaccc.gac caacggittac accagg tatgcc.gatticagt taaaggit cqt titcac catct cgg.ccgacac titccaaaaac accqcttacct coagatgaa citc cctd.cgit gctgaagaca 48O

Cagctgttta t tattgct co cqttggggtg gtgacggittt Ctacgctatg gaCtactggg 54 O gtcaggg tac cct ggit cacc gtctic ct cag cct coaccaa goggcc catcg gttct tcc ccc tggcaccctic ctic caagagc acctctgggg gCacagoggc cctgggctgc Ctggit Caagg 660 actact tccc calaccggtg acggtgtcgt ggaacticagg cqc cctgacc agcggcgtgc 72 O

US 9,221,882 B2 103 104 - Continued atgaacgaaa tag acagatc gctgagatag gtgcct cact gattalagcat tgg taggaat 318O taatgatgtc. tcgtttagat aaaagtaaag tgattaa.cag cgcattagag ctgcttaatg 324 O aggit cqgaat calaggttta acaac ccgta aactic gocca gaagctaggit gtagagcagc 33 OO ctacattgta ttggcatgta aaaaataagc gggctittgct cgacgcc tta gcc attgaga 3360 tgttagatag gcaccatact cacttittgcc Ctttagaagg ggaaagctgg caagatttitt 342O tacgtaataa cqctaaaagt tittagatgtg citt tactaag t catcgcgat ggagcaaaag 3480 tacatttagg tacacggc ct acagaaaaac agtatgaaac t ct cqaaaat caattagcct 354 O ttittatgcca acaaggttitt t cact agaga atgcattata tgcacticago gCagtggggc 36OO attt tactitt aggttgcgta ttggaagatc aagagcatca agt cqctaaa gaagaaaggg 366 O aaac acctac tactgatagt atgcc.gc.cat tattacgaca agctatcgaa ttatttgat c 372 O accalaggtgc agagc.ca.gc.c ttct tatt cq gcc ttgaatt gat catatgc ggattagaaa 378 O aacaacttaa atgtgaaagt gggtcttaaa agcagcataa cctttitt cog tgatggtaac 384 O ttcact agtt taaaaggatc taggtgaaga tcc tttittga taatcto atg accaaaatcC 3900 cittaacgtga gttitt.cgttc Cactgagcgt. cagaccc.cgt. agaaaagat C aaaggat citt 396 O cittgagat co tttittittctg cgcgtaatct gctgcttgca aacaaaaaaa. ccaccgctac

Cagcggtggt ttgtttgc.cg gat calaga.gc taccalactict ttitt.ccgaag gta actggct t cagcagagc gcagatacca aatactgtc.c ttctagtgta gcc.gtagtta ggccaccact 414 O t caagaactic tdtag.c accq CCtaCatacc tcqct ctdct aatcc tdtta Ccagtggctg 42OO

Ctgc.ca.gtgg catalagt cq tgtc.ttaccg ggttggactic alagacgatag ttaccggata 426 O aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcc cagottg gag.cgaacga 432O cctacaccga act gagatac Ctacagcgtg agctatgaga aag.cgc.cacg citt cocqaag 438 O ggagaaaggc gga Caggit at ccggtaag.cg gCagggit C9g aac aggaga.g cgcacgaggg 4 44 O agct tccagg gggaaacgc.c tggitat ctitt atagt cctogt cgggttt CC cacct ctdac 4500 ttgagcgt.cg attitttgttga tgctcgtcag ggggg.cggag Cctatggaaa aacgc.ca.gca 456 O acgcggcctt tttacggttc ctggccttitt gctggcctitt tgct cacatg 461 O

<210> SEQ ID NO 23
<211> LENGTH: 85
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence stretch (upper/coding strand) encoding amino acid sequence of the C-terminus of the Fab light chain after insertion of one PA#1(20) polymer"

<400> SEQUENCE: 23

gagtgctctt ctgccgctcc agctgcacct gctccagcag cacctgctgc accagctccg       60
gctgctcctg ctgcctgaag agctt                                             85

<210> SEQ ID NO 24
<211> LENGTH: 85
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence (lower/non-coding strand) for an amino acid stretch of the C-terminus of an Fab light chain after insertion of one PA#1(20) polymer"

<400> SEQUENCE: 24

aagctcttca ggcagcagga gcagccggag ctggtgcagc aggtgctgct ggagcaggtg       60
cagctggagc ggcagaagag cactc                                             85

<210> SEQ ID NO 25
<211> LENGTH: 25
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence stretch of the C-terminus of an Fab light chain after insertion of one PA#1(20) polymer"

<400> SEQUENCE: 25

Glu Cys Ser Ser Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
1               5                   10                  15

Ala Pro Ala Pro Ala Ala Pro Ala Ala
            20                  25

<210> SEQ ID NO 26
<211> LENGTH: 229
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of a Fab heavy chain as encoded on pFab-PA#1(200)"

<400> SEQUENCE: 26

Glu Val Lys Leu Gln Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly
1               5                   10                  15

Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Phe Asn Ile Lys Asp Thr
            20                  25                  30

Tyr Ile His Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val
        35                  40                  45

Ala Arg Ile Tyr Pro Thr Asn Gly Tyr Thr Arg Tyr Ala Asp Ser Val
    50                  55                  60

Lys Gly Arg Phe Thr Ile Ser Ala Asp Thr Ser Lys Asn Thr Ala Tyr
65                  70                  75                  80

Leu Gln Met Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys
                85                  90                  95

Ser Arg Trp Gly Gly Asp Gly Phe Tyr Ala Met Asp Tyr Trp Gly Gln
            100                 105                 110

Gly Thr Leu Val Thr Val Ser Ser Ala Ser Thr Lys Gly Pro Ser Val
        115                 120                 125

Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala
    130                 135                 140

Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser
145                 150                 155                 160

Trp Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val
                165                 170                 175

Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro
            180                 185                 190

Ser Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys
        195                 200                 205

Pro Ser Asn Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys His
    210                 215                 220

His His His His His
225

<210> SEQ ID NO 27
<211> LENGTH: 417
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of the Fab light chain fused with the PA#1(200) polymer as encoded on pFab-PA#1(200)"

<400> SEQUENCE: 27

Asp Ile Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly
1               5                   10                  15

Asp Arg Val Thr Ile Thr Cys Arg Ala Ser Gln Asp Val Asn Thr Ala
            20                  25                  30

Val Ala Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile
        35                  40                  45

Tyr Ser Ala Ser Phe Leu Tyr Ser Gly Val Pro Ser Arg Phe Ser Gly
    50                  55                  60

Ser Arg Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro
65                  70                  75                  80

Glu Asp Phe Ala Thr Tyr Tyr Cys Gln Gln His Tyr Thr Thr Pro Pro
                85                  90                  95

Thr Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr Val Ala Ala
            100                 105                 110

Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly
        115                 120                 125

Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala
    130                 135                 140

Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln
145                 150                 155                 160

Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser
                165                 170                 175

Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr
            180                 185                 190

Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser
        195                 200                 205

Phe Asn Arg Gly Glu Cys Ser Ser Ala Ala Pro Ala Ala Pro Ala Pro
    210                 215                 220

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala
225                 230                 235                 240

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
                245                 250                 255

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro
            260                 265                 270

Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
        275                 280                 285

Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro
    290                 295                 300

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala
305                 310                 315                 320

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
                325                 330                 335

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro
            340                 345                 350

Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
        355                 360                 365

Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro
    370                 375                 380

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala
385                 390                 395                 400

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala
                405                 410                 415

Ala

<210> SEQ ID NO 28
<211> LENGTH: 5210
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of pFab-PA#1(200)"

<4 OOs, SEQUENCE: 28 accc.gacacic atcgaatggc cagatgatta att CCtaatt tttgttgaca cit citat catt 6 O gatagagitta ttttaccact c cctatoagt gatagagaaa agtgaaatga at agttcgac 12 O aaaaatctag atalacgaggg caaaaaatga aaaaga Cagc tat CCatt gCagtggCaC 18O tggctggttt cqctaccgta gcgcaggcc.g aagttaaact gCaggaatcC ggtggtggtC 24 O tggttcagcc aggtggttcc Ctgcggct ct titc.cggttt c aac atcaaag 3OO acacct acat coactgggitt cgt caggctic cgggtaaagg Cctggaatgg gttgctcgta 360 tctaccc.gac caacggittac accagg tatg ccgatticagt taalaggt cqt tt cac catct cgg.ccgacac titccaaaaac accoctitacc tccagatgaa ctic cctd.cgt. gctgaagaca cagotgttta ttattgct co cgttggggtg gtgacggittt citacgctatg gact actggg 54 O gtcagggitac Cctggt cacc gtc. tcct cag CCt CCaC cala gggcc catcg gtctt coccc tggcaccctic ctic caagagc acctctgggg gcacagoggc Cctgggctgc Ctggit Caagg 660 actact tccc cqaaccoggtg acggtgtcgt. ggaacticagg cgc cctdacc agcgg.cgtgc 72 O acacct tccc ggctgtc.cta cagtic ct cag gactic tactic cct cago agc gtggtgactg tgcc ct coag cagcttgggc acccagacct a catctgcaa cgittaat cac aaacccagoa 84 O acaccalaggt cacaagaaa gttgagcc.ca aatcttgc.ca to accaccat CaCCattaat 9 OO alaccatggag aaaataaagt gaaacaaagc act attgcac tggcact citt accottactg 96.O tttacc cctd togacaaaag.c cga catcgag Ctcac Coaat cc.ccgt.cctic cctgtc.cgct tcc.gttggcg accgtgttac cat cacgtgt agggcct cqc alagacgtaaa caccgc.cgta gcgtggitatic agcagaalacc cgggaaagct ccgaaactgc tgatctatag cgctt cottc 14 O

Ctgt attic.cg gagttc.cgag Caggttcagt ggttc.ccgtt ccggit accga citt caccctg 2OO acgatat cct c cct coagcc ggaag acttic gctacct act actgtcaa.ca gcact acacc 26 O acccc.gc.cga cct tcggtca ggg taccalaa citcgagat.ca aacggactgt ggctgcacca 32O tctgtc.tt catct tcc.cgcc atctgatgag cagttgaaat Ctggaactgc citctgttgttg tgcc togctgaataactitcta tcc.ca.gagag gcc aaagtac agtggalaggt ggatalacgc.c 44 O


<210> SEQ ID NO 31
<211> LENGTH: 19
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of the C-terminus of Strep-tag II and the N-terminus of IFNa2b"

<400> SEQUENCE: 31

Glu Lys Gly Ala Ser Ser Ser Ala Cys Asp Leu Pro Gln Thr His Ser
1               5                   10                  15

Leu Gly Ser

<210> SEQ ID NO 32
<211> LENGTH: 3721
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of pASK-IFNa2b"

<4 OOs, SEQUENCE: 32 c catcgaatg gccagatgat taatt CCtaa ttitttgttga CactCtatca ttgatagagt 6 O tattttacca CitcCCt at Ca gtgatagaga aaagtgaaat gaatagttcg acaaaaatct 12 O agataacgag ggcaaaaaat gaaaaagaca gct at cqcga ttgcagtggc actggctggit 18O titcgct accg tagcgcaggc cgctagotgg agccaccc.gc agttcgaaaa aggcgc.ca.gc 24 O tottctgcct gtgatctgcc toaaa.cccac agcctgggta gCaggaggac cittgatgctic 3OO

Ctggcacaga tigaggagaat ctic tottt to tcc tigcttga aggacagaca tgactittgga 360 titt CCC cagg aggagtttgg caaccagttc caaaaggctg aalaccat CCC tgtcct c cat gagatgat Co agcagatctt caat citct to agcacaaagg act catctgc tgcttgggat gaga.ccct co tag acaaatt ctacactgaa ctic taccago agctgaatga 54 O tgttgttgatac agggggtggg ggtgacagag act cocctga tgaaggagga ctic cattctg gctgtgagga aatact tcca aagaat cact ctictatotga aagagaagaa atacago cct 660 tgtgcctggg aggttgtcag agcagaaatc atgagat citt tttctttgtc aacaaacttg 72 O Caagaaagtt taagaagtaa ggaataagct tgacctgttga agtgaaaaat ggcgcacatt gtgcga catt tttitttgtct gcc.gtttacc gct actg.cgt. cacggat.ctic cacgc.gc.cct 84 O gtagcggcgc attaag.cgcg tggittacgcg Cagcgtgacc gctacacttg 9 OO

Ccagcgcc ct agcgc.ccgct cctitt cqctt tot to cotto citttctogcc acgttcgc.cg 96.O gctitt.ccc.cg tdaagcticta aatcgggggc tcc ctittagg gttc.cgattit agtgctttac ggcacct cqa ccc caaaaaa Cttgattagg gtgatggttc acgtag tigg c catcgc cct gatagacggit tttitcqcc ct ttgacgttgg agt ccacgtt ctittaatagt ggact Cttgt 14 O tccaaactgg aacaac actic aac Cotatict cggtctatt c ttittgattta taagggattt 2OO tgc.cgattitc ggcctattgg ttaaaaaatg agctgattta acaaaaattit aacgcgaatt 26 O ttaacaaaat attaacgctt acaattitcag gtggcactitt tcggggaaat gtgcgcggaa 32O cccctatttgtttatttittc taalataCatt caaatatgta tcc.gct catg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat gag tatt caa cattt cogtg 44 O


<210> SEQ ID NO 33
<211> LENGTH: 90
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence stretch (upper/coding strand) encoding the C-terminus of Strep-tag II and the N-terminus of IFNa2b after insertion of one PA#1 polymer sequence cassette"

<400> SEQUENCE: 33

gaaaaaggcg ccagctcttc tgccgctcca gctgcacctg ctccagcagc acctgctgca       60
ccagctccgg ctgctcctgc tgcctgtgat                                        90

<210> SEQ ID NO 34
<211> LENGTH: 90
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence stretch (lower/non-coding strand) of the C-terminus of Strep-tag II and the N-terminus of IFNa2b after insertion of one PA#1 polymer sequence cassette"

<400> SEQUENCE: 34

atcacaggca gcaggagcag ccggagctgg tgcagcaggt gctgctggag caggtgcagc       60
tggagcggca gaagagctgg cgcctttttc                                        90

<210> SEQ ID NO 35
<211> LENGTH: 30
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence stretch of the C-terminus of Strep-tag II and the N-terminus of IFNa2b after fusion with one PA#1 polymer cassette"

<400> SEQUENCE: 35

Glu Lys Gly Ala Ser Ser Ser Ala Ala Pro Ala Ala Pro Ala Pro Ala
1               5                   10                  15

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Cys Asp
            20                  25                  30

<210> SEQ ID NO 36
<211> LENGTH: 381
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of IFNa2b and Strep-tag II fused with the PA#1(200) polymer as encoded on pPA#1(200)-IFNa2b"

<400> SEQUENCE: 36

Ala Ser Trp Ser His Pro Gln Phe Glu Lys Gly Ala Ser Ser Ser Ala
1               5                   10                  15

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala
            20                  25                  30

Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala
        35                  40                  45

Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala
    50                  55                  60

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala
65                  70                  75                  80

Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala
                85                  90                  95

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala
            100                 105                 110

Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala
        115                 120                 125

Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala
    130                 135                 140

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala
145                 150                 155                 160

Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala
                165                 170                 175

Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala
            180                 185                 190

Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala
        195                 200                 205

Pro Ala Pro Ala Ala Pro Ala Ala Cys Asp Leu Pro Gln Thr His Ser
    210                 215                 220

Leu Gly Ser Arg Arg Thr Leu Met Leu Leu Ala Gln Met Arg Arg Ile
225                 230                 235                 240

Ser Leu Phe Ser Cys Leu Lys Asp Arg His Asp Phe Gly Phe Pro Gln
                245                 250                 255

Glu Glu Phe Gly Asn Gln Phe Gln Lys Ala Glu Thr Ile Pro Val Leu
            260                 265                 270

His Glu Met Ile Gln Gln Ile Phe Asn Leu Phe Ser Thr Lys Asp Ser
        275                 280                 285

Ser Ala Ala Trp Asp Glu Thr Leu Leu Asp Lys Phe Tyr Thr Glu Leu
    290                 295                 300

Tyr Gln Gln Leu Asn Asp Leu Glu Ala Cys Val Ile Gln Gly Val Gly
305                 310                 315                 320

Val Thr Glu Thr Pro Leu Met Lys Glu Asp Ser Ile Leu Ala Val Arg
                325                 330                 335

Lys Tyr Phe Gln Arg Ile Thr Leu Tyr Leu Lys Glu Lys Lys Tyr Ser
            340                 345                 350

Pro Cys Ala Trp Glu Val Val Arg Ala Glu Ile Met Arg Ser Phe Ser
        355                 360                 365

Leu Ser Thr Asn Leu Gln Glu Ser Leu Arg Ser Lys Glu
    370                 375                 380

<210> SEQ ID NO 37
<211> LENGTH: 4321
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of pPA#1(200)-IFNa2b"

< 4 OOs SEQUENCE: 37 c catcgaatg gccagatgat taatticcitaa tttttgttga cactictato a ttgatagagt tattittacca citc cctatica gtgatagaga aaagtgaaat gaatagttcg acaaaaatct 12 O agataacgag ggcaaaaaat gaaaaagaca gct atcgcga ttgcagtggc actggctggit 18O


<210> SEQ ID NO 39
<211> LENGTH: 54
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence stretch (lower/non-coding strand) on pASK75-His6-hGH encoding the amino acid sequence around the N-terminus of hGH"

<400> SEQUENCE: 39

ggttgggaag gcagaagagc tggcgccatg gtgatggtgg tgatggctag cggc             54

<210> SEQ ID NO 40
<211> LENGTH: 18
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence stretch of the N-terminus of His6-hGH as encoded on pASK75-His6-hGH"

<400> SEQUENCE: 40

Ala Ala Ser His His His His His His Gly Ala Ser Ser Ser Ala Phe
1               5                   10                  15

Pro Thr

<210> SEQ ID NO 41
<211> LENGTH: 3793
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence of pASK75-His6-hGH"

<4 OOs, SEQUENCE: 41 accc.gacacic atcgaatggc cagatgatta attcc taatt tttgttgaca citctato att 6 O gatagagitta ttttaccact c cctatoagt gatagagaaa agtgaaatga at agttcgac 12 O aaaaatctag atalacgaggg caaaaaatga aaaagacagc tat cqcgatt gcagtggcac 18O tggctggttt cqctaccgta gcgcaggc.cg ctago catca ccaccat cac catggcgc.ca 24 O gct cittctgc ct tcc.caacc attcc ctitat coaggcttitt togacaacgct atgct cog.cg 3OO cc catcgt.ct gcaccagotg gcc tittgaca cctaccagga gtttgaagaa goctatat co 360 caaaggaaca gaagtatt cattcctgcaga acccc.ca.gac ctic cct citgt ttct cagagt 42O ctatt cogac accctic caac agggaggaaa cacaacagaa atcca acct a gagctgcticc 48O gCatct coct gctgct catc cagtcgtggc tiggagc.ccgt gcagttcCt c aggagtgtct 54 O tcgc.ca acag cctggtgtac ggcgc.ctctg acagcaacgt. Ctatgacct C Ctalaaggacc 6OO tagaggaagg catccaaacg Ctgatgggga ggctggalaga tiggcagc.ccc cqgactgggc 660 agat cittcaa gcagacctac agcaagttcg acacaaactic acacaacgat gacgcactac 72 O t caagaacta C9ggctgctic tactgctt.ca ggaagga cat gga caaggt c gagacatt CC 78O tgcgcatcgt gcagtgcc.gc. tctgtggagg gcagctgtgg Cttctaagct tacctgttga 84 O agtgaaaaat gg.cgcacatt gtgcgacatt tttitttgtct gcc.gtttacc gct actg.cgt. 9 OO cacggat.ct c cacgc.gc.cct gtagcggcgc attaa.gc.gcg gcgggtgtgg togttacgc.g 96.O cagogtgacc gctacacttg ccagogcc ct agcgc.ccgct c ctitt cqct t t ct tcc.cttic 1 O2O ctitt ct cqcc acgttcgc.cg gctitt coccg tdaagcticta aatcgggggc ticcictittagg 108 O


Ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tigaacggggg gttcgtgcac 3480 acagcc.ca.gc titggagcgaa cacct acac caactgaga tacct acagc gtgagctatg 354 O agaaag.cgcc acgctt Cocg aagggaga aa ggcggaCagg tat cogg tala gcggCagggit 36OO cggaac agga gagcc acga gggagctt Co. agggggaaac gcctggt at C tittatagt cc 366 O tgtcgggttt cqccacct ct gaCttgagcg tcgattitttgttgatgct cqt caggggggcg 372 O gaggctatgg aaaaacgc.ca gcaacgcggc Ctttittacgg ttcCtggcct tttgctggCC 378 O ttittgct cac atg 3793

<210> SEQ ID NO 42
<211> LENGTH: 114
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence (upper/coding strand) stretch encoding amino acid sequence of the N-terminus of His6-hGH after insertion of the PA#1(20) polymer"

<400> SEQUENCE: 42

gccgctagcc atcaccacca tcaccatggc gccagctctt ctgccgctcc agctgcacct       60
gctccagcag cacctgctgc accagctccg gctgctcctg ctgccttccc aacc            114

<210> SEQ ID NO 43
<211> LENGTH: 114
<212> TYPE: DNA
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: nucleic acid sequence (lower/non-coding strand) encoding the N-terminus of hGH after insertion of one PA#1 polymer sequence cassette"

<400> SEQUENCE: 43

ggttgggaag gcagcaggag cagccggagc tggtgcagca ggtgctgctg gagcaggtgc       60
agctggagcg gcagaagagc tggcgccatg gtgatggtgg tgatggctag cggc            114

<210s, SEQ ID NO 44 &211s LENGTH: 38 212. TYPE: PRT <213> ORGANISM: artificial sequence 22 Os. FEATURE: <221 > NAMEAKEY: source <223> OTHER INFORMATION: /note=Description of artificial sequence: amino acid sequence of the N-terminus of His 6-hCGH after insertion of the PA1 (2O) polymer"

<400> SEQUENCE: 44
Ala Ala Ser His His His His His His Gly Ala Ser Ser Ser Ala Ala 1 5 10 15

Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala 20 25 30

Pro Ala Ala Phe Pro Thr 35
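SEQ ID NO 44 is stated to be the amino acid sequence encoded by the coding strand in SEQ ID NO 42. A minimal Python sketch of that translation, using the standard genetic code, is given below. The function name `translate` and the table construction are illustrative assumptions, and the nucleotide string is the listing text with extraction damage repaired.

```python
# Standard genetic code, built from the conventional t/c/a/g codon ordering.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes for the residues that occur here.
THREE_LETTER = {"A": "Ala", "S": "Ser", "H": "His", "G": "Gly",
                "P": "Pro", "F": "Phe", "T": "Thr"}


def translate(dna: str) -> str:
    """Translate a lowercase coding-strand sequence, reading frame 1."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# SEQ ID NO 42. Assumption: transcribed from the listing with OCR
# artifacts removed.
seq42 = "".join("""
gccgctagcc atcaccacca tcaccatggc gccagctctt ctgccgctcc agctgcacct
gctccagcag cacctgctgc accagctccg gctgctcctg ctgccttccc aacc
""".split())

protein = translate(seq42)
print(len(protein))
print(" ".join(THREE_LETTER[aa] for aa in protein))
```

The printed three-letter string can be compared residue by residue with the SEQ ID NO 44 entry: 114 nucleotides give 38 codons, which agrees with the stated length of 38.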

<210> SEQ ID NO 45
<211> LENGTH: 405
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of mature His6-PA#1(200)-hGH as encoded on pASK75-His6-PA#1(200)-hGH"

<400> SEQUENCE: 45
Ala Ser His His His His His His Gly Ala Ser Ser Ser Ala Ala Pro 1 5 10 15

Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro 20 25 30

Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala 35 40 45

Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro 50 55 60

Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala 65 70 75 80

Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro 85 90 95

Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro 100 105 110

Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala 115 120 125

Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro 130 135 140

Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala 145 150 155 160

Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro 165 170 175

Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro 180 185 190

Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala 195 200 205

Pro Ala Ala Pro Ala Ala Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe 210 215 220

Asp Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp 225 230 235 240

Thr Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr 245 250 255

Ser Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile 260 265 270

Pro Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu 275 280 285

Leu Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val 290 295 300

Gln Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser 305 310 315 320

Asp Ser Asn Val Tyr Asp Leu Leu Lys Asp Leu Glu Glu Gly Ile Gln 325 330 335

Thr Leu Met Gly Arg Leu Glu Asp Gly Ser Pro Arg Thr Gly Gln Ile 340 345 350

Phe Lys Gln Thr Tyr Ser Lys Phe Asp Thr Asn Ser His Asn Asp Asp 355 360 365

Ala Leu Leu Lys Asn Tyr Gly Leu Leu Tyr Cys Phe Arg Lys Asp Met 370 375 380

Asp Lys Val Glu Thr Phe Leu Arg Ile Val Gln Cys Arg Ser Val Glu


ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 4320
ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 4380
cttttgctca cat 4393

<210> SEQ ID NO 47
<211> LENGTH: 404
<212> TYPE: PRT
<213> ORGANISM: artificial sequence
<220> FEATURE:
<221> NAME/KEY: source
<223> OTHER INFORMATION: /note="Description of artificial sequence: amino acid sequence of His6-PA#1(200)-hGH as encoded on pCHO-PA#1(200)-hGH"

<400> SEQUENCE: 47

Ser His His His His His His Gly Ala Ser Ser Ser Ala Ala Pro Ala 1 5 10 15

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala 20 25 30

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro 35 40 45

Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala 50 55 60

Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro 65 70 75 80

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala 85 90 95

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala 100 105 110

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro 115 120 125

Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala 130 135 140

Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala Ala Pro Ala Pro 145 150 155 160

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Ala Pro Ala 165 170 175

Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala 180 185 190

Ala Ala Pro Ala Ala Pro Ala Pro Ala Ala Pro Ala Ala Pro Ala Pro 195 200 205

Ala Ala Pro Ala Ala Phe Pro Thr Ile Pro Leu Ser Arg Leu Phe Asp 210 215 220

Asn Ala Met Leu Arg Ala His Arg Leu His Gln Leu Ala Phe Asp Thr 225 230 235 240

Tyr Gln Glu Phe Glu Glu Ala Tyr Ile Pro Lys Glu Gln Lys Tyr Ser 245 250 255

Phe Leu Gln Asn Pro Gln Thr Ser Leu Cys Phe Ser Glu Ser Ile Pro 260 265 270

Thr Pro Ser Asn Arg Glu Glu Thr Gln Gln Lys Ser Asn Leu Glu Leu 275 280 285

Leu Arg Ile Ser Leu Leu Leu Ile Gln Ser Trp Leu Glu Pro Val Gln 290 295 300

Phe Leu Arg Ser Val Phe Ala Asn Ser Leu Val Tyr Gly Ala Ser Asp 305 310 315 320