<<

US 20160159877A1
(19) United States
(12) Patent Application Publication
(10) Pub. No.: US 2016/0159877 A1
Retallack et al.
(43) Pub. Date: Jun. 9, 2016

(54) FUSION PARTNERS FOR PEPTIDE PRODUCTION
(71) Applicant: Pfenex Inc., San Diego, CA (US)
(72) Inventors: Diane M. Retallack, Poway, CA (US); Adam Chapman, San Diego, CA (US); Torben R. Bruck, Lakeside, CA (US); Hongfan Jin, San Diego, CA (US)
(21) Appl. No.: 14/954,766
(22) Filed: Nov. 30, 2015

Related U.S. Application Data
(60) Provisional application No. 62/086,119, filed on Dec. 1, 2014.

Publication Classification
(51) Int. Cl.
C07K 14/635 (2006.01)
C07K 14/62 (2006.01)
C07K 14/535 (2006.01)
C12N 9/64 (2006.01)
(52) U.S. Cl.
CPC ...... C07K 14/635 (2013.01); C12N 9/6424 (2013.01); C07K 14/62 (2013.01); C07K 14/535 (2013.01); C12Y 304/21009 (2013.01); C07K 2319/35 (2013.01)

(57) ABSTRACT
The present invention relates to the field of medicine, in particular, to the production of large amounts of a soluble recombinant polypeptide as part of a fusion protein comprising an N-terminal fusion partner linked to the polypeptide of interest.

Patent Application Publication Jun. 9, 2016 Sheet 2 of 9 US 2016/0159877 A1

FIGURE 2A: DnaJ-like Protein - PTH 1-34 Fusion (SEQ ID NO: 45). Amino acid sequence not legible in this copy.

FIGURE 2B: FklB - PTH 1-34 Fusion (SEQ ID NO: 46). Amino acid sequence not legible in this copy.

FIGURE 2C: FrnE - PTH 1-34 Fusion (SEQ ID NO: 47). Amino acid sequence not legible in this copy.
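The domain boundaries stated for the three fusion proteins of FIG. 2A to 2C (paragraph 0015 of the brief description of the drawings) can be captured in a small table. The following Python sketch is illustrative only: the dictionary keys and the function names `domain_length`, `slice_domain`, and `residue_fraction` are hypothetical and do not come from the publication, and the residue-count fraction it computes is only a rough proxy for the molecular-weight fraction discussed in the specification.

```python
# Domain coordinates (1-based, inclusive) transcribed from the description of
# FIG. 2A to 2C. Keys and function names are illustrative assumptions.
FUSION_DOMAINS = {
    "DnaJ-like-PTH (SEQ ID NO: 45)": {
        "fusion_partner": (1, 77), "linker": (78, 98), "hPTH 1-34": (99, 132)},
    "FklB-PTH (SEQ ID NO: 46)": {
        "fusion_partner": (1, 205), "linker": (206, 226), "hPTH 1-34": (227, 260)},
    "FrnE-PTH (SEQ ID NO: 47)": {
        "fusion_partner": (1, 216), "linker": (217, 237), "hPTH 1-34": (238, 271)},
}


def domain_length(span):
    """Number of residues in a 1-based, inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def slice_domain(sequence, span):
    """Return the residues of `sequence` covered by a 1-based, inclusive span."""
    start, end = span
    return sequence[start - 1:end]


def residue_fraction(domains, name):
    """Fraction of the fusion protein's residues that fall in the named domain."""
    total = sum(domain_length(span) for span in domains.values())
    return domain_length(domains[name]) / total


if __name__ == "__main__":
    for label, domains in FUSION_DOMAINS.items():
        share = residue_fraction(domains, "hPTH 1-34")
        print(f"{label}: hPTH 1-34 is {share:.1%} of the residues")
```

By residue count, hPTH 1-34 accounts for 34 of 132, 260, and 271 residues in the three fusions, and the linker is 21 residues long in each.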

FIGURE 3

FIGURE 4

(Gel image. Panel labels: enterokinase 40 µg/ml; enterokinase 10 µg/ml.)

FIGURE 5

PTH 1-34

FIGURE 6

FIGURE 7A

PTH 1-34

FIGURE 7B

PTH 1-34

FIGURE 7C

PTH 1-34

FUSION PARTNERS FOR PEPTIDE PRODUCTION

CROSS-REFERENCE

0001. This application claims the benefit of U.S. Provisional Patent Application No. 62/086,119, filed Dec. 1, 2014, which is incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

0002. Heterologous recombinant polypeptides often are difficult to express at high yield in bacterial expression systems due to causes that include proteolysis, low expression level, improper protein folding, which can result in poor solubility, and poor secretion from the host cell.

SUMMARY OF THE INVENTION

0003. The present invention provides a recombinant fusion protein comprising a polypeptide of interest. Expression of a polypeptide of interest as part of the recombinant fusion protein as described allows production of high quality polypeptide in large amounts. Polypeptides of interest include small or rapidly-degraded peptides, e.g., parathyroid hormone N-terminal fragment (PTH 1-34), proteins having an N-terminus that is vulnerable to degradation, e.g., GCSF and P. falciparum circumsporozoite protein, and proteins that typically are produced in insoluble form in microbial expression systems, e.g., proinsulin that can be processed to insulin or an insulin analog, GCSF, or IFN-β. The recombinant fusion protein, shown schematically in FIG. 1, comprises an N-terminal bacterial fusion partner, e.g., a bacterial chaperone or folding modulator. The polypeptide of interest and N-terminal bacterial chaperone or folding modulator are connected by a flexible linker sequence that contains a cleavage site. When cleaved, the polypeptide of interest is released from the N-terminal fusion partner. The present invention further discloses a vector for expressing the recombinant fusion protein, and a method for producing the recombinant fusion protein in a bacterial host cell at high yield.

0004. The recombinant fusion constructs of the present invention are useful for producing a high yield of a recombinant polypeptide of interest that is difficult to overexpress in a bacterial expression system, due to, e.g., proteolysis, low expression level, poor folding, and/or poor secretion. In embodiments of the invention, a recombinant fusion protein of the invention is produced in a bacterial host cell at a titer of higher than 0.5 g/L. In embodiments, the bacterial host cell in which the recombinant polypeptide of interest is difficult to overexpress is E. coli.

0005. For example, the PTH 1-34 protein, previously reported as expressed as part of a fusion protein in inclusion bodies which require high concentrations of urea (e.g. 7 M) to solubilize, is described herein as produced as part of a soluble PTH 1-34 fusion protein at high titer expression (higher than 0.5 g/L). Furthermore, purification can be carried out under non-denaturing conditions, e.g. 4 M or lower concentrations of urea, or without the use of urea altogether. Also using the methods of the invention, a protein with an easily degraded N-terminus, e.g., N-met-GCSF or P. falciparum circumsporozoite protein, can be produced as part of the described fusion protein and separated from the N-terminal fusion partner by cleavage after host cell proteases have been removed from the fusion protein preparation. As also described herein, a proinsulin normally produced in insoluble form can be produced in significant amounts in soluble form in a recombinant fusion protein of the invention, eliminating the need for refolding.

0006. The present invention thus provides a recombinant fusion protein comprising: an N-terminal fusion partner, wherein the N-terminal fusion partner is a bacterial chaperone or folding modulator; a polypeptide of interest; and a linker comprising a cleavage site between the N-terminal fusion partner and the polypeptide of interest. In embodiments, the N-terminal fusion partner is selected from: a DnaJ-like protein; an FklB protein or a truncation thereof; an FrnE protein or a truncation thereof; an FkpB2 protein or a truncation thereof; an EcpD protein or a truncation thereof; or a Skp protein or a truncation thereof. In embodiments, the N-terminal fusion partner is selected from: P. fluorescens DnaJ-like protein; P. fluorescens FklB protein or a C-terminal truncation thereof; P. fluorescens FrnE protein or a truncation thereof; P. fluorescens FkpB2 protein or a C-terminal truncation thereof; or P. fluorescens EcpD protein or a C-terminal truncation thereof. In certain embodiments, the N-terminal fusion partner is P. fluorescens FklB protein, truncated to remove 1 to 200 amino acids from the C-terminus, P. fluorescens EcpD protein, truncated to remove 1 to 200 amino acids from the C-terminus, or P. fluorescens FrnE protein, truncated to remove 1 to 180 amino acids from the C-terminus. In embodiments, the polypeptide of interest is a difficult-to-express protein selected from: a small or rapidly-degraded peptide; a protein with an easily degraded N-terminus; and a protein typically expressed in a bacterial expression system in insoluble form. In embodiments, the polypeptide of interest is a small or rapidly-degraded peptide, wherein the polypeptide of interest is selected from: hPTH1-34, Glp1, Glp2, IGF-1, Exenatide (SEQ ID NO: 37), Teduglutide (SEQ ID NO: 38), Pramlintide (SEQ ID NO: 39), Ziconotide (SEQ ID NO: 40), Becaplermin (SEQ ID NO: 42), Enfuvirtide (SEQ ID NO: 43), Nesiritide (SEQ ID NO: 44). In embodiments, the polypeptide of interest is a protein with easily degraded N-terminus, wherein the polypeptide of interest is N-met-GCSF or P. falciparum circumsporozoite protein. In embodiments, the polypeptide of interest is a protein typically expressed in a bacterial expression system as insoluble protein, wherein the polypeptide of interest is a proinsulin that is processed to insulin or an insulin analog, GCSF, or IFN-β. In any of these embodiments, the proinsulin C-peptide has an amino acid sequence selected from: SEQ ID NO: 97; SEQ ID NO: 98; SEQ ID NO: 99; or SEQ ID NO: 100. In embodiments, the insulin analog is [...], [...], lispro, glulisine, detemir, or degludec. In certain embodiments, the N-terminal fusion partner is P. fluorescens DnaJ-like protein having the amino acid sequence set forth in SEQ ID NO: 2. In embodiments, the N-terminal fusion partner is P. fluorescens FklB protein having the amino acid sequence set forth in SEQ ID NO: 4, SEQ ID NO: 28, SEQ ID NO: 61, or SEQ ID NO: 62. In embodiments, the N-terminal fusion partner is P. fluorescens FrnE protein having the amino acid sequence set forth in SEQ ID NO: 3, SEQ ID NO: 63, or SEQ ID NO: 64. In embodiments, the N-terminal fusion partner is P. fluorescens EcpD protein having the amino acid sequence set forth in SEQ ID NO: 7, SEQ ID NO: 65, SEQ ID NO: 66, or SEQ ID NO: 67. In embodiments, the cleavage site in the recombinant fusion protein is recognized by a cleavage enzyme in the group consisting of: enterokinase; trypsin; Factor Xa; and furin. The recombinant fusion protein of any of claims 1 to 15, wherein the linker comprises an affinity tag. In certain embodiments, the affinity tag is selected from: polyhistidine;

a FLAG tag; a myc tag; a GST tag; a MBP tag; a calmodulin tag; an HA tag; an E-tag; an S-tag; an SBP tag; a Softag 3; a V5 tag; and a VSV tag. In embodiments, the linker has an amino acid sequence selected from: SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 11; SEQ ID NO: 12; and SEQ ID NO: 226. In embodiments, the polypeptide of interest is hPTH1-34, and the recombinant fusion protein comprises an amino acid sequence selected from: SEQ ID NO: 45; SEQ ID NO: 46; and SEQ ID NO: 47. In embodiments, the isoelectric point of the polypeptide of interest is at least about 1.5 times higher than the isoelectric point of the N-terminal fusion partner. In embodiments, the molecular weight of the polypeptide of interest constitutes about 10% to about 50% of the molecular weight of the recombinant fusion protein.

0007. The invention also provides an expression vector for expression of a recombinant fusion protein. In embodiments, the expression vector is for expression of a recombinant fusion protein in any of the embodiments described above. In embodiments, the expression vector comprises a nucleotide sequence encoding a recombinant fusion protein of any of the above embodiments.

0008. The invention further provides a method for producing a polypeptide of interest, comprising:
(i) culturing a microbial host cell transformed with an expression vector comprising an expression construct, wherein the expression construct comprises a nucleotide sequence encoding a recombinant fusion protein;
(ii) inducing the host cell of step (i) to express the recombinant fusion protein;
(iii) purifying the recombinant fusion protein expressed in the induced host cells of step (ii); and
(iv) cleaving the purified recombinant fusion protein of step (iii) by incubation with a cleavage enzyme that recognizes the cleavage site in the linker, to release the polypeptide of interest;
thereby obtaining the polypeptide of interest. In embodiments, the recombinant fusion protein of step (i) is that described in any of the embodiments described above. In embodiments, the method further comprises measuring the expression level of the fusion protein expressed in step (ii), measuring the amount of the recombinant fusion protein purified in step (iii), or measuring the amount of the polypeptide of interest obtained in step (iv) that has been properly released, or a combination thereof. In embodiments, the expression level of the fusion protein expressed in step (ii) is greater than 0.5 g/L. In embodiments, the expression level of the fusion protein expressed in step (ii) is about 0.5 g/L to about 25 g/L. In embodiments, the fusion protein expressed in step (ii) is directed to the cytoplasm. In embodiments, the fusion protein expressed in step (ii) is directed to the periplasm. In embodiments, the incubation of step (iv) is about one hour to about 16 hours, and the cleavage enzyme is enterokinase.

0009. In embodiments, the incubation of step (iv) is about one hour to about 16 hours, the cleavage enzyme is enterokinase, and wherein the amount of the recombinant fusion protein purified in step (iii) that is properly released in step (iv) is about 90% to about 100%. In embodiments, the amount of the recombinant fusion protein purified in step (iii) that is properly released in step (iv) is about 100%. In embodiments, the amount of the polypeptide of interest obtained in step (iii) or step (iv) is about 0.1 g/L to about 25 g/L. In embodiments, the properly released polypeptide of interest obtained is soluble, intact, or both. In embodiments, step (iii) is carried out under non-denaturing conditions. In embodiments, the recombinant fusion protein is solubilized without the use of urea. In embodiments, the non-denaturing conditions comprise lysing the induced cells of step (ii) with a buffer comprising a non-denaturing concentration of a chaotropic agent. In embodiments, the non-denaturing concentration of a chaotropic agent is less than 4 M urea.

0010. In embodiments, the microbial host cell is a Pseudomonad or E. coli host cell. In embodiments, the Pseudomonad host cell is a Pseudomonas host cell. In embodiments, the Pseudomonas host cell is Pseudomonas fluorescens.

0011. In specific embodiments, the host cell is deficient in at least one protease selected from the group consisting of: Lon (SEQ ID NO: 14); Lal (SEQ ID NO: 15); AprA (SEQ ID NO: 16); HtpX (SEQ ID NO: 17); DegP1 (SEQ ID NO: 18); DegP2 (SEQ ID NO: 19); Npr (SEQ ID NO: 20); Prc1 (SEQ ID NO: 21); Prc2 (SEQ ID NO: 22); M50 (SEQ ID NO: 24); PrlC (SEQ ID NO: 30); Serralysin (RXF04495) (SEQ ID NO: 227) and PrtB (SEQ ID NO: 23). In related embodiments, the host cell is deficient in proteases Lon (SEQ ID NO: 14), Lal (SEQ ID NO: 15), and AprA (SEQ ID NO: 16). In embodiments, the host cell is deficient in proteases AprA (SEQ ID NO: 16) and HtpX (SEQ ID NO: 17). In other embodiments, the host cell is deficient in proteases Lon (SEQ ID NO: 14), Lal (SEQ ID NO: 15) and DegP2 (SEQ ID NO: 19). In embodiments, the host cell is deficient in proteases Npr (SEQ ID NO: 20), DegP1 (SEQ ID NO: 18) and DegP2 (SEQ ID NO: 19). In related embodiments, the host cell is deficient in proteases Serralysin (SEQ ID NO: 227), and AprA (SEQ ID NO: 16).

INCORPORATION BY REFERENCE

0012. All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

0013. The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

0014. FIG. 1. Schematic Representation of a Recombinant Fusion Protein. Domain 1 corresponds to an N-terminal fusion partner, domain 2 corresponds to a linker, and domain 3 corresponds to a polypeptide of interest. Non-limiting examples of N-terminal fusion partners and polypeptides of interest are listed below each respective domain.

0015. FIG. 2A to 2C. Three Recombinant Fusion Protein Amino Acid Sequences. The amino acid sequences of three recombinant fusion proteins comprising hPTH 1-34 as the polypeptide of interest are shown. The hPTH 1-34 sequence is italicized in each, and the linker between the N-terminal fusion partner and PTH 1-34 is underlined. 2A. Recombinant fusion protein comprising a DnaJ-like protein N-terminal fusion partner. (DnaJ-like protein, aa 1-77; linker, aa 78-98; hPTH 1-34, aa 99-132.) (SEQ ID NO: 45) 2B. Recombinant fusion protein comprising an FklB N-terminal fusion partner. (FklB, aa 1-205; linker, aa 206-226; hPTH 1-34, aa 227-260.) (SEQ ID NO: 46) 2C. Recombinant fusion protein comprising an FrnE N-terminal fusion partner. (FrnE, aa 1-216; linker, aa 217-237; hPTH 1-34, aa 238-271.) (SEQ ID NO: 47)

0016. FIG. 3. SDS-CGE Analysis of Shake Flask Expression Samples. Samples are shown in three sets: whole cell broth (lanes 1-6); cell free broth (lanes 7-12); and soluble fraction (lanes 13-18), as indicated at the bottom of the figure. Molecular weight markers are shown on each side of the image (68, 48, 29, 21, 16 kD, from top to bottom). The lanes in each of the three sets represent, from left to right: DnaJ-like protein-PTH 1-34 fusion (STR35970); DnaJ-like protein-PTH 1-34 fusion (STR35984); FklB-PTH 1-34 fusion (STR36034); FklB-PTH 1-34 fusion (STR36085); FrnE-PTH 1-34 fusion (STR36150); and FrnE-PTH 1-34 fusion (STR36169), as indicated above the lanes. The DnaJ-like-PTH fusion protein bands are marked by a solid arrow and FklB-PTH and FrnE-PTH fusion protein bands are marked by a dashed arrow.

0017. FIG. 4. Enterokinase Cleavage of Purified Recombinant Fusion Proteins. Samples are shown in three sets: no enterokinase treatment (lanes 1-6); enterokinase treatment 40 µg/ml (lanes 7-12); and enterokinase treatment 10 µg/ml (lanes 13-18). The lanes in each of the three sets represent, from left to right: DnaJ-like protein-PTH 1-34 fusion (STR35970); DnaJ-like protein-PTH 1-34 fusion (STR35984); FklB-PTH 1-34 fusion (STR36034); FklB-PTH 1-34 fusion (STR36085); FrnE-PTH 1-34 fusion (STR36150); and FrnE-PTH 1-34 fusion (STR36169). The migration of DnaJ-like fusion protein is indicated by the solid arrow in the lower pair of arrows. The migration of cleaved DnaJ-like-protein N-terminal fusion partners is indicated by the dashed arrow in the lower pair of arrows. The migration of FklB and FrnE fusion proteins is indicated by the solid arrow in the upper pair of arrows. The migration of FklB and FrnE N-terminal fusion partners is indicated by the dashed arrow in the upper pair of arrows. Molecular weight markers are shown on the right side of the image (29, 20, and 16 kD, from top to bottom).

0018. FIG. 5. Intact Mass Analysis of Enterokinase Cleavage Products. Shown is the deconvoluted mass spectrum for DnaJ-like protein-PTH 1-34 fusion protein, purified from expression strain STR35970, digested with enterokinase for 1 hour. The peak corresponding to PTH 1-34 is indicated by a solid arrow.

0019. FIG. 6. Enterokinase Cleavage of DnaJ-like protein-PTH 1-34 Fusion Protein Purification Fractions. The DnaJ-like protein-PTH fusion protein was purified from expression strain STR36005 following growth in a conventional bioreactor. Purification fractions were incubated with enterokinase for 1 hour (lanes 2-4), 16 hours (lanes 6-8), without enterokinase (control) for 1 hour (lane 1), or without enterokinase (control) for 16 hours (lane 5). The fractions analyzed were as follows: fraction 1 (lanes 1, 2, 5, and 6); fraction 2 (lanes 3 and 7); and fraction 3 (lanes 4 and 8). The full-length DnaJ-like protein-PTH 1-34 recombinant fusion protein bands are indicated by the solid black arrow. The cleaved DnaJ-like protein-PTH 1-34 fusion partner bands are indicated by a dashed arrow. Molecular weight markers are shown on each side of the image (49, 29, 21, and 16 kD, from top to bottom).

0020. FIG. 7A to 7C. Intact Mass Analysis of PTH 1-34 enterokinase cleavage products derived from FklB-PTH 1-34 Fusion Proteins. The figures show the deconvoluted mass spectra for FklB-PTH 1-34 fusion protein purification fractions digested with enterokinase. The peaks corresponding to PTH 1-34 are indicated by a solid arrow. 7A. FklB-PTH fusion protein purified from STR36034. 7B. FklB-PTH fusion protein purified from STR36085. 7C. FklB-PTH fusion protein purified from STR36098.

SEQUENCES

0021. This application includes nucleotide sequences SEQ ID NO: 1-237, and these nucleotide sequences are listed in the Table of Sequences before the claims.

DETAILED DESCRIPTION OF THE INVENTION

Overview

0022. The present invention relates to recombinant fusion proteins for overexpressing recombinant polypeptides of interest in bacterial expression systems, constructs for expressing the recombinant fusion proteins, and methods for producing high yields of the recombinant fusion proteins and the recombinant polypeptide of interest in soluble form. In embodiments, the methods of the invention enable production of greater than 0.5 g/L of recombinant fusion proteins following purification. In embodiments, the methods of the invention produce high yields of recombinant fusion proteins without the use of a denaturing concentration of a chaotropic agent. In embodiments, the methods of the invention produce high yields of recombinant fusion proteins without the use of any chaotropic agent.

0023. As used herein, the term "comprise" or variations thereof such as "comprises" or "comprising" are to be read to indicate the inclusion of any recited feature but not the exclusion of any other features. Thus, as used herein, the term "comprising" is inclusive and does not exclude additional, unrecited features. In embodiments of any of the compositions and methods provided herein, "comprising" may be replaced with "consisting essentially of" or "consisting of." The phrase "consisting essentially of" is used herein to require the specified feature(s) as well as those which do not materially affect the character or function of the claimed invention. As used herein, the term "consisting" is used to indicate the presence of the recited feature (e.g. nucleobase sequence) alone (so that in the case of an antisense oligomer consisting of a specified nucleobase sequence, the presence of additional, unrecited nucleobases is excluded).

Recombinant Fusion Protein

0024. A recombinant fusion protein of the present invention comprises three domains, as generally illustrated in FIG. 1. From left, the fusion protein comprises an N-terminal fusion partner, a linker, and a polypeptide of interest, wherein the linker is between the N-terminal fusion partner and the polypeptide of interest, and the polypeptide of interest is C-terminal to the linker. In embodiments, the linker sequence comprises a protease cleavage site. In embodiments, the polypeptide of interest can be released from the recombinant fusion protein by cleavage at the protease cleavage site within the linker.

0025. In embodiments, the molecular weight of the recombinant fusion protein is about 2 kDa to about 1000 kDa. In embodiments, the molecular weight of the recombinant fusion protein is about 2 kDa, about 3 kDa, about 4 kDa, about 5 kDa, about 6 kDa, about 7 kDa, about 8 kDa, about 9 kDa, about 10 kDa, about 11 kDa, about 12 kDa, about 13 kDa, about 14 kDa, about 15 kDa, about 20 kDa, about 25 kDa, about 26 kDa, about 27 kDa, about 28 kDa, about 30 kDa,

about 35 kDa, about 40 kDa, about 45 kDa, about 50 kDa, about 55 kDa, about 60 kDa, about 65 kDa, about 70 kDa, about 75 kDa, about 80 kDa, about 85 kDa, about 90 kDa, about 95 kDa, about 100 kDa, about 200 kDa, about 300 kDa, about 400 kDa, about 500 kDa, about 550 kDa, about 600 kDa, about 700 kDa, about 800 kDa, about 900 kDa, about 1000 kDa, or greater. In embodiments, the molecular weight of the recombinant fusion protein is about 2 kDa to about 1000 kDa, about 2 kDa to about 500 kDa, about 2 kDa to about 250 kDa, about 2 kDa to about 100 kDa, about 2 kDa to about 50 kDa, about 2 kDa to about 25 kDa, about 2 kDa to about 30 kDa, about 2 kDa to about 1000 kDa, about 2 kDa to about 500 kDa, about 2 kDa to about 250 kDa, about 2 kDa to about 100 kDa, about 2 kDa to about 50 kDa, about 2 kDa to about 25 kDa, about 3 kDa to about 1000 kDa, about 3 kDa to about 500 kDa, about 3 kDa to about 250 kDa, about 3 kDa to about 100 kDa, about 3 kDa to about 50 kDa, about 3 kDa to about 25 kDa, about 3 kDa to about 30 kDa, about 4 kDa to about 1000 kDa, about 4 kDa to about 500 kDa, about 4 kDa to about 250 kDa, about 4 kDa to about 100 kDa, about 4 kDa to about 50 kDa, about 4 kDa to about 25 kDa, about 4 kDa to about 30 kDa, about 5 kDa to about 1000 kDa, about 5 kDa to about 500 kDa, about 5 kDa to about 250 kDa, about 5 kDa to about 100 kDa, about 5 kDa to about 50 kDa, about 5 kDa to about 25 kDa, about 5 kDa to about 30 kDa, about 10 kDa to about 1000 kDa, about 10 kDa to about 500 kDa, about 10 kDa to about 250 kDa, about 10 kDa to about 100 kDa, about 10 kDa to about 50 kDa, about 10 kDa to about 25 kDa, about 10 kDa to about 30 kDa, about 20 kDa to about 1000 kDa, about 20 kDa to about 500 kDa, about 20 kDa to about 250 kDa, about 20 kDa to about 100 kDa, about 20 kDa to about 50 kDa, about 20 kDa to about 25 kDa, about 20 kDa to about 30 kDa, about 25 kDa to about 1000 kDa, about 25 kDa to about 500 kDa, about 25 kDa to about 250 kDa, about 25 kDa to about 100 kDa, about 25 kDa to about 50 kDa, about 25 kDa to about 25 kDa, or about 25 kDa to about 30 kDa.

0026. In embodiments, the recombinant fusion protein is about 50, 100, 150, 200, 250, 300, 350, 400, 450, 470, 500, 530, 560, 590, 610, 640, 670, 700, 750, 800, 850, 900, 950, 1000, 1200, 1400, 1600, 1800, 2000, 2500, or more, amino acids in length. In embodiments, the recombinant fusion protein is about 50 to 2500, 100 to 2000, 150 to 1800, 200 to 1600, 250 to 1400, 300 to 1200, 350 to 1000, 400 to 950, 450 to 900, 470 to 850, 500 to 800, 530 to 750, 560 to 700, 590 to 670, or 610 to 640 amino acids in length.

0027. In embodiments, the recombinant fusion protein comprises an N-terminal fusion partner selected from:

0028. P. fluorescens DnaJ-like protein (e.g., SEQ ID NO: 2), FrnE (SEQ ID NO: 3), FrnE2 (SEQ ID NO: 63), FrnE3 (SEQ ID NO: 64), FklB (SEQ ID NO: 4), FklB3* (SEQ ID NO: 28), FklB2 (SEQ ID NO: 61), FklB3 (SEQ ID NO: 62), FkpB2 (SEQ ID NO: 5), SecB (SEQ ID NO: 6), a truncation of SecB, EcpD (SEQ ID NO: 7), EcpD (SEQ ID NO: 65), EcpD2 (SEQ ID NO: 66), and EcpD3 (SEQ ID NO: 67);

0029. a linker selected from: SEQ ID NO: 9, 10, 11, 12, and 226; and

0030. a polypeptide of interest selected from: hPTH 1-34 (SEQ ID NO: 1), Met-GCSF (SEQ ID NO: 69), rCSP, a Proinsulin (e.g., any of Human Proinsulin SEQ ID NO: 32, Insulin Glargine Proinsulin SEQ ID NO: 88, 89, 90, or 91), [...] SEQ ID NO: 33, [...] SEQ ID NO: 34), Insulin C-peptide (SEQ ID NO: 97); [...] (SEQ ID NO: 35), Glp-1 (SEQ ID NO: 36), Exenatide (SEQ ID NO: 37), Teduglutide (SEQ ID NO: 38), Pramlintide (SEQ ID NO: 39), Ziconotide (SEQ ID NO: 40), Becaplermin (SEQ ID NO: 42), Enfuvirtide (SEQ ID NO: 43), Nesiritide (SEQ ID NO: 44) or Enterokinase (e.g., SEQ ID NO: 31).

0031. In embodiments, the recombinant fusion protein comprises a P. fluorescens DnaJ-like protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 101. In embodiments, the nucleotide sequence encoding SEQ ID NO: 101 is SEQ ID NO: 202.

0032. In embodiments, the recombinant fusion protein comprises a P. fluorescens EcpD1 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 102 or 103. In embodiments, the nucleotide sequence encoding SEQ ID NO: 102 or 103 is SEQ ID NO: 202 or 228, respectively.

0033. In embodiments, the recombinant fusion protein comprises a P. fluorescens EcpD2 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 104. In embodiments, the nucleotide sequence encoding SEQ ID NO: 104 is SEQ ID NO: 204.

0034. In embodiments, the recombinant fusion protein comprises a P. fluorescens EcpD3 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 105. In embodiments, the nucleotide sequence encoding SEQ ID NO: 105 is SEQ ID NO: 205.

0035. In embodiments, the recombinant fusion protein comprises a P. fluorescens FklB1 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 106. In embodiments, the nucleotide sequence encoding SEQ ID NO: 106 is SEQ ID NO: 206.

0036. In embodiments, the recombinant fusion protein comprises a P. fluorescens FklB2 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 107. In embodiments, the nucleotide sequence encoding SEQ ID NO: 107 is SEQ ID NO: 207.

0037. In embodiments, the recombinant fusion protein comprises a P. fluorescens FklB3 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 108. In embodiments, the nucleotide sequence encoding SEQ ID NO: 108 is SEQ ID NO: 208.

0038. In embodiments, the recombinant fusion protein comprises a P. fluorescens FrnE1 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 109. In embodiments, the nucleotide sequence encoding SEQ ID NO: 109 is SEQ ID NO: 209.

0039. In embodiments, the recombinant fusion protein comprises a P. fluorescens FrnE2 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 110. In embodiments, the nucleotide sequence encoding SEQ ID NO: 110 is SEQ ID NO: 210.

0040. In embodiments, the recombinant fusion protein comprises a P. fluorescens FrnE3 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 111. In embodiments, the nucleotide sequence encoding SEQ ID NO: 111 is SEQ ID NO: 211.
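The pairings stated in paragraphs 0031 to 0040 (N-terminal fusion partner with a trypsin cleavage site linker, the amino acid SEQ ID NO of the combined partner and linker, and the nucleotide SEQ ID NO said to encode it) can be collected into a lookup table. The sketch below is a reader's aid only: the names `Construct`, `TRYPSIN_LINKER_CONSTRUCTS`, `nucleotide_seq_id`, and `partner_for` are hypothetical, and the numbers are transcribed as printed in those paragraphs, including SEQ ID NO: 202 appearing for both SEQ ID NO: 101 and SEQ ID NO: 102.

```python
from typing import NamedTuple, Tuple


class Construct(NamedTuple):
    """One fusion partner plus trypsin cleavage site linker (hypothetical record type)."""
    partner: str
    amino_acid_seq_ids: Tuple[int, ...]
    nucleotide_seq_ids: Tuple[int, ...]  # same order as amino_acid_seq_ids


# Values transcribed as printed in paragraphs 0031 to 0040.
TRYPSIN_LINKER_CONSTRUCTS = [
    Construct("DnaJ-like protein", (101,), (202,)),
    Construct("EcpD1", (102, 103), (202, 228)),
    Construct("EcpD2", (104,), (204,)),
    Construct("EcpD3", (105,), (205,)),
    Construct("FklB1", (106,), (206,)),
    Construct("FklB2", (107,), (207,)),
    Construct("FklB3", (108,), (208,)),
    Construct("FrnE1", (109,), (209,)),
    Construct("FrnE2", (110,), (210,)),
    Construct("FrnE3", (111,), (211,)),
]


def nucleotide_seq_id(amino_acid_seq_id: int) -> int:
    """Return the nucleotide SEQ ID NO paired with an amino acid SEQ ID NO."""
    for construct in TRYPSIN_LINKER_CONSTRUCTS:
        pairs = zip(construct.amino_acid_seq_ids, construct.nucleotide_seq_ids)
        for amino_acid_id, nucleotide_id in pairs:
            if amino_acid_id == amino_acid_seq_id:
                return nucleotide_id
    raise KeyError(f"SEQ ID NO: {amino_acid_seq_id} is not in the table")


def partner_for(amino_acid_seq_id: int) -> str:
    """Return the fusion partner name listed for an amino acid SEQ ID NO."""
    for construct in TRYPSIN_LINKER_CONSTRUCTS:
        if amino_acid_seq_id in construct.amino_acid_seq_ids:
            return construct.partner
    raise KeyError(f"SEQ ID NO: {amino_acid_seq_id} is not in the table")
```

For instance, `nucleotide_seq_id(103)` looks up the second EcpD1 entry, and `partner_for(106)` returns the partner name listed for SEQ ID NO: 106.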

[0041] In embodiments, the recombinant fusion protein comprises a P. fluorescens DnaJ-like protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 112. In embodiments, the nucleotide sequence encoding SEQ ID NO: 112 is SEQ ID NO: 212.

[0042] In embodiments, the recombinant fusion protein comprises a P. fluorescens EcpD1 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 113. In embodiments, the nucleotide sequence encoding SEQ ID NO: 113 is SEQ ID NO: 213, respectively.

[0043] In embodiments, the recombinant fusion protein comprises a P. fluorescens EcpD2 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 114. In embodiments, the nucleotide sequence encoding SEQ ID NO: 114 is SEQ ID NO: 214.

[0044] In embodiments, the recombinant fusion protein comprises a P. fluorescens EcpD3 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 115. In embodiments, the nucleotide sequence encoding SEQ ID NO: 115 is SEQ ID NO: 215.

[0045] In embodiments, the recombinant fusion protein comprises a P. fluorescens FklB1 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 216. In embodiments, the nucleotide sequence encoding SEQ ID NO: 116 is SEQ ID NO: 216.

[0046] In embodiments, the recombinant fusion protein comprises a P. fluorescens FklB2 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 217. In embodiments, the nucleotide sequence encoding SEQ ID NO: 117 is SEQ ID NO: 217.

[0047] In embodiments, the recombinant fusion protein comprises a P. fluorescens FklB3 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 118. In embodiments, the nucleotide sequence encoding SEQ ID NO: 118 is SEQ ID NO: 218.

[0048] In embodiments, the recombinant fusion protein comprises a P. fluorescens FrnE1 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 119. In embodiments, the nucleotide sequence encoding SEQ ID NO: 119 is SEQ ID NO: 219.

[0049] In embodiments, the recombinant fusion protein comprises a P. fluorescens FrnE2 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 120. In embodiments, the nucleotide sequence encoding SEQ ID NO: 120 is SEQ ID NO: 220.

[0050] In embodiments, the recombinant fusion protein comprises a P. fluorescens FrnE3 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 121. In embodiments, the nucleotide sequence encoding SEQ ID NO: 121 is SEQ ID NO: 221.

[0051] In embodiments, the N-terminal fusion partner, linker, and polypeptide of interest of the recombinant fusion protein are, respectively: P. fluorescens folding modulator DnaJ-like protein (SEQ ID NO: 2), the linker set forth as SEQ ID NO: 9, and human parathyroid hormone amino acids 1-34 (hPTH 1-34) (SEQ ID NO: 1). In embodiments, the N-terminal fusion partner, linker, and polypeptide of interest of the recombinant fusion protein are, respectively: P. fluorescens folding modulator FrnE (SEQ ID NO: 3), the linker set forth as SEQ ID NO: 9, and hPTH 1-34 (SEQ ID NO: 1). In embodiments, the N-terminal fusion partner, linker, and polypeptide of interest of the recombinant fusion protein are, respectively: P. fluorescens folding modulator FklB (SEQ ID NO: 4), the linker set forth as SEQ ID NO: 9, and hPTH 1-34 (SEQ ID NO: 1). In embodiments, the recombinant hPTH fusion protein has the amino acid sequence as set forth in one of SEQ ID NOS: 45, 46, and 47.

[0052] In embodiments, the recombinant fusion protein is an insulin fusion protein having the following elements:

[0053] an N-terminal fusion partner selected from P. fluorescens: DnaJ-like protein (e.g., SEQ ID NO: 2), FrnE (SEQ ID NO: 3), FrnE2 (SEQ ID NO: 63), FrnE3 (SEQ ID NO: 64), FklB (SEQ ID NO: 4), FklB3* (SEQ ID NO: 28), FklB2 (SEQ ID NO: 61), FklB3 (SEQ ID NO: 62), FkpB2 (SEQ ID NO: 5), EcpD (SEQ ID NO: 65), EcpD2 (SEQ ID NO: 66), or EcpD3 (SEQ ID NO: 67);

[0054] a linker having the sequence set forth as SEQ ID NO: 226; and

[0055] a polypeptide of interest selected from: Glargine Proinsulin SEQ ID NO: 88, 89, 90, or 91.

[0056] In embodiments, the polypeptide of interest is the Glargine Proinsulin set forth as SEQ ID NO: 88, encoded by the nucleotide sequence set forth as SEQ ID NO: 80 or 84. In embodiments, the polypeptide of interest is the Glargine Proinsulin set forth as SEQ ID NO: 89, encoded by the nucleotide sequence set forth as SEQ ID NO: 81 or 85. In embodiments, the polypeptide of interest is the Glargine Proinsulin set forth as SEQ ID NO: 90, encoded by the nucleotide sequence set forth as SEQ ID NO: 82 or 86. In embodiments, the polypeptide of interest is the Insulin Glargine Proinsulin set forth as SEQ ID NO: 91, encoded by the nucleotide sequence set forth as SEQ ID NO: 83 or 87.

[0057] In embodiments, the insulin fusion protein comprises a P. fluorescens DnaJ-like protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 101. In embodiments, the nucleotide sequence encoding SEQ ID NO: 101 is SEQ ID NO: 202.

[0058] In embodiments, the insulin fusion protein comprises a P. fluorescens EcpD1 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 102 or 103. In embodiments, the nucleotide sequence encoding SEQ ID NO: 102 or 103 is SEQ ID NO: 202 or 228, respectively.

[0059] In embodiments, the insulin fusion protein comprises a P. fluorescens EcpD2 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 104. In embodiments, the nucleotide sequence encoding SEQ ID NO: 104 is SEQ ID NO: 204.

[0060] In embodiments, the insulin fusion protein comprises a P. fluorescens EcpD3 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 105. In embodiments, the nucleotide sequence encoding SEQ ID NO: 105 is SEQ ID NO: 205.

[0061] In embodiments, the insulin fusion protein comprises a P. fluorescens FklB1 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 106. In embodiments, the nucleotide sequence encoding SEQ ID NO: 106 is SEQ ID NO: 206.

[0062] In embodiments, the insulin fusion protein comprises a P. fluorescens FklB2 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 107. In embodiments, the nucleotide sequence encoding SEQ ID NO: 107 is SEQ ID NO: 207.

[0063] In embodiments, the insulin fusion protein comprises a P. fluorescens FklB3 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 108. In embodiments, the nucleotide sequence encoding SEQ ID NO: 108 is SEQ ID NO: 208.

[0064] In embodiments, the insulin fusion protein comprises a P. fluorescens FrnE1 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 109. In embodiments, the nucleotide sequence encoding SEQ ID NO: 109 is SEQ ID NO: 209.

[0065] In embodiments, the insulin fusion protein comprises a P. fluorescens FrnE2 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 110. In embodiments, the nucleotide sequence encoding SEQ ID NO: 110 is SEQ ID NO: 210.

[0066] In embodiments, the insulin fusion protein comprises a P. fluorescens FrnE3 protein N-terminal fusion partner and a trypsin cleavage site linker, together having the amino acid sequence of SEQ ID NO: 111. In embodiments, the nucleotide sequence encoding SEQ ID NO: 111 is SEQ ID NO: 211.

[0067] In embodiments, the insulin fusion protein comprises a P. fluorescens DnaJ-like protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 112. In embodiments, the nucleotide sequence encoding SEQ ID NO: 112 is SEQ ID NO: 212.

[0068] In embodiments, the insulin fusion protein comprises a P. fluorescens EcpD1 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 113. In embodiments, the nucleotide sequence encoding SEQ ID NO: 113 is SEQ ID NO: 213, respectively.

[0069] In embodiments, the insulin fusion protein comprises a P. fluorescens EcpD2 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 114. In embodiments, the nucleotide sequence encoding SEQ ID NO: 114 is SEQ ID NO: 214.

[0070] In embodiments, the insulin fusion protein comprises a P. fluorescens EcpD3 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 115. In embodiments, the nucleotide sequence encoding SEQ ID NO: 115 is SEQ ID NO: 215.

[0071] In embodiments, the insulin fusion protein comprises a P. fluorescens FklB1 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 216. In embodiments, the nucleotide sequence encoding SEQ ID NO: 116 is SEQ ID NO: 216.

[0072] In embodiments, the insulin fusion protein comprises a P. fluorescens FklB2 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 217. In embodiments, the nucleotide sequence encoding SEQ ID NO: 117 is SEQ ID NO: 217.

[0073] In embodiments, the insulin fusion protein comprises a P. fluorescens FklB3 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 118. In embodiments, the nucleotide sequence encoding SEQ ID NO: 118 is SEQ ID NO: 218.

[0074] In embodiments, the insulin fusion protein comprises a P. fluorescens FrnE1 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 119. In embodiments, the nucleotide sequence encoding SEQ ID NO: 119 is SEQ ID NO: 219.

[0075] In embodiments, the insulin fusion protein comprises a P. fluorescens FrnE2 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 120. In embodiments, the nucleotide sequence encoding SEQ ID NO: 120 is SEQ ID NO: 220.

[0076] In embodiments, the insulin fusion protein comprises a P. fluorescens FrnE3 protein N-terminal fusion partner and an enterokinase cleavage site linker, together having the amino acid sequence of SEQ ID NO: 121. In embodiments, the nucleotide sequence encoding SEQ ID NO: 121 is SEQ ID NO: 221.

[0077] In embodiments, the recombinant insulin fusion protein has the amino acid sequence as set forth in one of SEQ ID NOS: 122 to 201.

[0078] In embodiments, the recombinant fusion protein is a GCSF fusion protein having the following elements:

[0079] an N-terminal fusion partner selected from: P. fluorescens DnaJ-like protein (e.g., SEQ ID NO: 2), FrnE (SEQ ID NO: 3), FrnE2 (SEQ ID NO: 63), FrnE3 (SEQ ID NO: 64), FklB (SEQ ID NO: 4), FklB3* (SEQ ID NO: 28), FklB2 (SEQ ID NO: 61), FklB3 (SEQ ID NO: 62), FkpB2 (SEQ ID NO: 5), EcpD (SEQ ID NO: 65), EcpD2 (SEQ ID NO: 66), or EcpD3 (SEQ ID NO: 67);

[0080] a linker having the sequence set forth as SEQ ID NO: 9; and

[0081] a polypeptide of interest having the sequence set forth as SEQ ID NO: 68.

Polypeptide of Interest

[0082] The protein or polypeptide of interest of the recombinant fusion protein, also referred to as the C-terminal polypeptide of interest, recombinant polypeptide of interest, and C-terminal fusion partner, is a polypeptide desired to be expressed in soluble form and at high yield. In embodiments, the polypeptide of interest is a heterologous polypeptide that has been found not to be expressed at high yield in a bacterial expression system due to, e.g., proteolysis, low expression level, improper protein folding, and/or poor secretion from the host cell. Polypeptides of interest include small or rapidly degraded peptides, proteins having an N-terminus that is vulnerable to degradation, and proteins that typically are produced in insoluble form in microbial or bacterial expression systems. In embodiments, the N-terminus of the polypeptide of interest is protected from degradation while fused to the N-terminal fusion partner, resulting in a greater yield of

N-terminally intact protein. In embodiments, the heterologous polypeptide has been described as not expressed in soluble form at high yield in a microbial or bacterial expression system. For example, in embodiments, the heterologous polypeptide has been described as not expressed in soluble form at high yield in an E. coli, B. subtilis, or L. plantarum, L. casei, L. fermentum or Corynebacterium glutamicum host cell. In embodiments, the polypeptide of interest is a eukaryotic polypeptide or derived from (e.g., is an analog of) a eukaryotic polypeptide. In embodiments, the polypeptide of interest is a mammalian polypeptide or derived from a mammalian polypeptide. In embodiments, the polypeptide of interest is a human polypeptide or derived from a human polypeptide. In embodiments, the polypeptide of interest is a prokaryotic polypeptide or derived from a prokaryotic polypeptide. In embodiments, the polypeptide of interest is a microbial polypeptide or derived from a microbial polypeptide. In embodiments, the polypeptide of interest is a bacterial polypeptide or derived from a bacterial polypeptide. By "heterologous" it is meant that the polypeptide of interest is derived from an organism other than the expression host cell. In embodiments, the fusion protein and/or polypeptide of interest is produced in a Pseudomonad host cell (i.e., a host cell of the order Pseudomonadales) according to the methods of the present invention at higher yield than in another microbial expression system. In embodiments, the fusion protein or polypeptide of interest is produced in a Pseudomonad, Pseudomonas, or Pseudomonas fluorescens expression system according to the methods of the present invention at higher yield, e.g., about 1.5-fold to about 10-fold, about 1.5-fold, about 2-fold, about 2.5-fold, about 3-fold, about 5-fold, or about 10-fold higher, than in an E. coli or other microbial or bacterial expression system, e.g., those listed above, under substantially comparable conditions. In embodiments, the fusion protein or C-terminal polypeptide is produced in an E. coli expression system at a yield of less than 0.5, less than 0.4, less than 0.3, less than 0.2, or less than 0.1 grams/liter.

[0083] In embodiments, the polypeptide of interest is a small and/or rapidly degraded peptide. In embodiments, the small and/or rapidly degraded peptide is parathyroid hormone (PTH). In embodiments, the polypeptide of interest is human hPTH 1-34 (SEQ ID NO: 1). PTH is an 84 amino acid (aa) peptide derived from a 115 aa pre-pro-peptide, secreted by the parathyroid gland, that acts to increase calcium concentration in the blood and is known to stimulate bone formation. The N-terminal 34 aa peptide is approved to treat osteoporosis (Forteo®, Eli Lilly and Company; see package insert). The active ingredient in Forteo®, PTH 1-34, is produced in E. coli as part of a C-terminal fusion protein (NDA 21-319 for Forteo®; see Chemistry Review, Center for Drug Evaluation and Research, 2000-2001; see also Clinical Pharmacology and Biopharmaceutics review, Center for Drug Evaluation and Research, 2000-2001). Purification of Forteo® (Eli Lilly's LY333334) is described by, e.g., Jin, et al. ("Crystal Structure of Human Parathyroid Hormone 1-34 at 0.9 Å Resolution," J. Biol. Chem. 275(35):27238-44, 2000), incorporated herein by reference. This report describes expression of the protein as inclusion bodies, and subsequent solubilization in 7 M urea.

[0084] In embodiments, the polypeptide of interest typically is produced in insoluble form when overexpressed in a bacterial expression system. In embodiments, the polypeptide of interest typically produced in insoluble form when overexpressed in a bacterial expression system is a eukaryotic polypeptide or derivative or analog thereof. In embodiments, the polypeptide of interest typically produced in insoluble form when overexpressed in a bacterial expression system is a proinsulin (a precursor of insulin). Proinsulin is comprised of three designated segments (from N to C terminus: B-C-A). Proinsulin is processed to insulin (or an insulin analog, depending on the proinsulin) when the internal C-peptide is removed by protease cleavage. Disulfide bonding between the A and B-peptides maintains their association following excision of the C-peptide. In reference to insulin and insulin analogs here, "A-peptide" and "A-chain" are used interchangeably, and "B-peptide" and "B-chain" are used interchangeably. Positions within these chains are referred to by the chain and amino acid number from the amino terminus of the chain, for example, "B30" refers to the thirtieth amino acid in the B-peptide, i.e., the B-chain. In embodiments, the polypeptide of interest is a proinsulin that is processed to form a long-acting insulin analog or a rapid-acting insulin analog.

[0085] In embodiments, the polypeptide of interest is a proinsulin that is processed to form a long-acting insulin analog. Long-acting insulin analogs include, e.g., insulin glargine, a 43-amino acid (6050.41 Da), long-acting insulin analog marketed as Lantus®, insulin degludec, marketed as Tresiba®, and insulin detemir, marketed as Levemir®. In insulin glargine the asparagine at N21 (Asn21) is substituted with glycine, and two arginines are present at the C-terminus of the B-peptide. In insulin, these two arginines are present in proinsulin but not in the processed mature molecule. In embodiments, the polypeptide of interest is processed to glargine, and the polypeptide of interest is the 87-amino acid proinsulin as set forth in SEQ ID NOS: 88, 89, 90, or 91. In nonlimiting embodiments, the coding sequence for SEQ ID NO: 88 is the nucleotide sequence set forth in SEQ ID NO: 80 or 84. In nonlimiting embodiments, the coding sequence for SEQ ID NO: 89 is the nucleotide sequence set forth in SEQ ID NO: 81 or 85. In nonlimiting embodiments, the coding sequence for SEQ ID NO: 90 is the nucleotide sequence set forth in SEQ ID NO: 82 or 86. In nonlimiting embodiments, the coding sequence for SEQ ID NO: 91 is the nucleotide sequence set forth in SEQ ID NO: 83 or 87. Each of SEQ ID NOS: 80-87 includes an initial 15 bp cloning site at the 5' end; therefore, in these embodiments the proinsulin coding sequences referred to are the sequences starting at the first Phe codon, TTT (in SEQ ID NO: 80), or TTC (in SEQ ID NOS: 81-87). Insulin degludec has a deletion of threonine at position B30 and is conjugated to hexadecanedioic acid via a gamma-L-glutamyl spacer at the lysine at position B29. Insulin detemir has a fatty acid (myristic acid) bound to the lysine at position B29.

[0086] In embodiments, the polypeptide of interest is proinsulin that is processed to form a rapid-acting insulin analog. Rapid-acting (or fast-acting) insulin analogs include, e.g., insulin aspart (NovoLog/NovoRapid®) (SEQ ID NO: 94), where the proline at position B28 is replaced with aspartic acid, insulin lispro (Humalog®) (lispro proinsulin, SEQ ID NO: 33), where the last lysine and proline residues occurring at the C-terminal end of the B-chain are reversed, and insulin glulisine (Apidra®) (glulisine proinsulin, SEQ ID NO: 34), where the asparagine at position B3 is replaced with lysine and the lysine in position B29 is replaced with glutamic acid. At all other positions, these molecules have an identical amino acid sequence to regular insulin (proinsulin, SEQ ID NO: 32; insulin A-peptide, SEQ ID NO: 92; insulin B-peptide, SEQ ID NO: 93).
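The chain-and-position notation used for the insulin analogs (for example "B28" or "B30") can be made concrete with a short Python sketch that applies the substitutions described in the text. The sketch rests on stated assumptions: `HUMAN_A` and `HUMAN_B` are the commonly published human insulin A-chain and B-chain sequences, assumed (not verified) to match SEQ ID NOS: 92 and 93, and the glargine substitution is read as position 21 of the A-chain, which is where human insulin carries its C-terminal asparagine. All function names are hypothetical.

```python
# Commonly published human insulin chains; assumed, not verified, to match
# the A-peptide and B-peptide entries of the sequence listing.
HUMAN_A = "GIVEQCCTSICSLYQLENYCN"           # 21 residues
HUMAN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"  # 30 residues


def substitute(chain: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position, as in the 'B28' notation."""
    i = position - 1
    return chain[:i] + new_residue + chain[i + 1:]


def glargine(a: str, b: str) -> tuple[str, str]:
    # Asn at A-chain position 21 -> Gly; two Arg appended to the B-chain.
    return substitute(a, 21, "G"), b + "RR"


def aspart(a: str, b: str) -> tuple[str, str]:
    # Pro at B28 -> Asp.
    return a, substitute(b, 28, "D")


def lispro(a: str, b: str) -> tuple[str, str]:
    # Pro at B28 and Lys at B29 are swapped.
    swapped = substitute(substitute(b, 28, b[28]), 29, b[27])
    return a, swapped


def glulisine(a: str, b: str) -> tuple[str, str]:
    # Asn at B3 -> Lys; Lys at B29 -> Glu.
    return a, substitute(substitute(b, 3, "K"), 29, "E")


for name, fn in [("glargine", glargine), ("aspart", aspart),
                 ("lispro", lispro), ("glulisine", glulisine)]:
    a_chain, b_chain = fn(HUMAN_A, HUMAN_B)
    print(f"{name:10s} A={a_chain} B={b_chain}")
```

The sketch covers only the sequence-level changes. The fatty acid conjugation described for insulin degludec and insulin detemir is a chemical modification and is outside what a string substitution can represent.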

[0087] In embodiments, the polypeptide of interest typically produced in insoluble form when overexpressed in a bacterial expression system is GCSF, e.g., Met-GCSF. In embodiments, the polypeptide of interest typically produced in insoluble form when overexpressed in a bacterial expression system is IFN-β, e.g., IFN-β-1b. In embodiments, the bacterial expression system in which the recombinant polypeptide of interest is difficult to overexpress is an E. coli expression system.

[0088] In embodiments, the polypeptide of interest is a protein that has an easily-degraded N terminus. Because a fusion protein produced according to the methods of the present invention is separated from host proteases before cleavage to release the polypeptide of interest, the N-terminus of the polypeptide of interest is protected throughout the purification process. This allows the production of a preparation of up to 100% N-terminally intact polypeptide of interest.

[0089] In embodiments, the polypeptide of interest having an easily-degraded N-terminus is filgrastim, an analog of GCSF (granulocyte colony stimulating factor, or colony stimulating factor 3 (CSF 3)). GCSF is a 174 amino acid glycoprotein that stimulates the bone marrow to produce granulocytes and stem cells and release them into the bloodstream. Filgrastim, which is nonglycosylated and has an N-terminal methionine, is marketed as Neupogen®. The amino acid sequence of GCSF (filgrastim) is set forth in SEQ ID NO: 69. In embodiments, the methods of the invention are used to produce a high level of GCSF (filgrastim) with an intact N-terminus, including the N-terminal methionine. GCSF production in a protease-deficient host cell is described in U.S. Pat. No. 8,455,218, "Methods for G-CSF production in a Pseudomonas host cell," incorporated herein by reference in its entirety. In embodiments of the present invention intact GCSF, including the N-terminal methionine, is produced within a fusion protein at a high level in a bacterial host cell, e.g., a Pseudomonas host cell, which is not protease-deficient.

[0090] In embodiments, the polypeptide of interest having an easily-degraded N-terminus is recombinant P. falciparum circumsporozoite protein (rCSP), described in, e.g., U.S. Pat. No. 9,169,304, "Process for Purifying Recombinant Plasmodium Falciparum Circumsporozoite Protein," incorporated herein by reference in its entirety.

[0091] In embodiments, the polypeptide of interest is: a reagent protein; a therapeutic protein; an extracellular receptor or ligand; a protease; a kinase; a blood protein; a chemokine; a cytokine; an antibody; an antibody-based drug; an antibody fragment, e.g., a single-chain antibody, an antigen binding (ab) fragment, e.g., F(ab), F (ab), F(ab)', Fv, generated from the variable region of IgG or IgM, an Fc fragment generated from the heavy chain constant region of an antibody, a reduced IgG fragment (e.g., generated by reducing the hinge region disulfide bonds of IgG), an Fc fusion protein, e.g., comprising the Fc domain of IgG fused together with a protein or peptide of interest, or any other antibody fragment described in the art, e.g., in U.S. Pat. No. 5,648,237, "Expression of Functional Antibody Fragments," incorporated by reference herein in its entirety; an ; a blood factor, a bone morphogenetic protein; an engineered protein scaffold; an enzyme; a growth factor, an interferon; an interleukin; a thrombolytic agent; or a hormone. In embodiments, the polypeptide of interest is selected from: Human Antihemophilic Factor; Human Antihemophilic Factor-von Willebrand Factor Complex; Recombinant Antihemophilic Factor (Turoctocog Alfa); Ado-; Alglucosidase Alpha; Human Alpha-1 Proteinase Inhibitor; Botulinum Toxin Type B (Rimabotulinumtoxin B); Coagulation Factor IX Fc Fusion; Recombinant Coagulation factor IX; Recombinant Coagulation factor VIIa; Recombinant Coagulation factor XIII A-subunit; Human Coagulation Factor VIII-von Willebrand Factor Complex; Collagenase Clostridium Histolyticum; Human Platelet-derived (Becaplermin); Abatacept; ; Adalimumab; ; Agalsidase Beta; Aldesleukin; Alefacept; Alemtuzumab; Alglucosidase Alfa; ; Anakinra; Octocog Alfa; Recombinant Human Antithrombin; Azficel-T; Basiliximab; Belatacept; Belimumab; Bevacizumab; Botulinum Toxin Type A; Brentuximab Vedotin; Recombinant C1 Esterase Inhibitor; Canakinumab; Certolizumab Pegol; ; Nonacog Alfa; Daclizumab; Darbepoetin Alfa; Denosumab; Digoxin Immune Fab; Dornase Alfa; Ecallantide; Eculizumab; Etanercept; Fibrinogen; ; Galsulfase; Golimumab; Ibritumomab Tiuxetan; Idursulfase; Infliximab; ; Interferon Alfa-2b; Interferon Alfacon-1; Interferon Alfa-2a; Interferon Alfa-n3; Interferon Beta-1a; Interferon Beta-1b; 1b; Ipilimumab; Laronidase; Epoetin Alfa; Moroctocog Alfa; Muromonab-CD3; Natalizumab; Ocriplasmin; Ofatumumab; Omalizumab; ; ; Palivizumab; ; ; ; Human Papilloma Virus (HPV) Types 6, 11, 16, 18-L1 viral protein Virus like Particles (VLP); HPV Type 16 and 18 L1 protein VLPs; ; Rasburicase; Raxibacumab; Recombinant Factor IX; ; Rilonacept; Rituximab; Romiplostim; ; ; Tocilizumab; ; Ustekinumab; Abarelix; Cetrorelix; Desirudin; Enfluvirtide; Exenatide; Follitropin Beta; Ganirelix; Degarelix; Hyaluronidase; Insulin Aspart; Insulin Degludec; Insulin Detemir; Insulin Glargine rDNA Injection (long-acting human insulin analog); Recombinant Insulin Glulisine; Human Insulin; Insulin Lispro (rapid acting insulin analog); Recombinant Insulin Lispro Protamine; Recombinant Insulin Lispro; Lanreotide; ; Surfaxin (Lucinactant; Sinapultide); Mecasermin; Insulin like Growth Factor; Nesiritide; Pramlintide; Recombinant Teduglutide; Acetate; Ziconotide Acetate; 10.8 mg Goserelin Acetate Implant; Abobotulinumtoxin A; Agalsidase Alfa; Alipogene Tiparvovec; Ancestim; ; ; Avian TB Vaccine; Batroxobin; ; Buserelin (Gonadotropin-releasing Hormone Agonist); S-Malate; Carperitide; Catumaxomab; Ceruletide; Coagulation Factor VIII; Coccidiosis Vaccine; ; Deferiprone; Defibrotide; Dibotermin Alfa; ; Edotreotide; Efalizumab; ; Epoetin Delta; Eptifibatide; Eptotermin Alfa; Follitropin Alfa for Injection; Fomivirsen; Gemtuzumab ozogamicin; Gonadorelin; Recombinant Chorionic Human Gonadotropin; Histrelin Acetate (gonadotropin releasing hormone agonist); HVT IBD vaccine; Imiglucerase; Insulin Isophane; (Granulocyte-Colony Stimulating Factor); ; Leptospira Vaccine for Dogs; Leuproprelin; Linaclotide; Lipegfilgrastim; ; Lutropin Alfa (human leutinizing hormone); Mepolizumab; ; Mipomersen Sodium; Mirimostim (macrophage-colony stimulating factor); Mogamulizumab; (granulocyte macrophage-colony stimulating factor); Monteplase; Nadroparin calcium; Nafarelin; Nebacumab; ; Pamiteplase; Pancrelipase; Parnaparin sodium; daspartate; Peginesatide acetate; ; Pentetreotide; Poractant

alfa; ( releasing peptide); Protirelin; PTH 1-84; rhBMP-2; rhBMP-7; Eptortermin Alfa; Romurtide; ; ; ; Vasopressin; Desmopressin; Taliglucerase Alfa; Taltirelin (thyrotropin-releasing hormone analog); Tasonermin; Taspoglutide; Thromobomodulin Alfa; Thyrotropin Alfa; ; Triptorelin Pamoate; Urofollitropin for Injection; ; Velaglucerase Alfa; Cholera Toxin B; Recombinant Antihemophilic Factor (Efrailoctocog Alfa); Human Alpha-1 Proteinase Inhibitor; Asparaginase Erwinia Chrysanthemi; Capromab; Denileukin Diftitox; Ovine Digoxin Immune Fab; Elosulfase Alfa; Epoetin Alfa; Factor IX Complex; Factor XIII Concentrate; Technetium (Fanolesomab); Fibrinogen; ; Influenza Hemagglutinin and Neuraminidase; Glucarpidase; Hemin for Injection; Hep B Surface Antigen; Human Albumin; Incobotulinumtoxin; Nofetumomab; Obinutuzumab; L-asparaginase (from Escherichia coli, Erwinia sp., Pseudomonas sp., etc.); Pembrolizumab; Concentrate; ; Siltuximab; Tbo-Filgrastim; Pertussis Toxin Subunits A-E; Topical Bovine Thrombin; Topical Human Thrombin; Tositumomab; Vedolizumab; Ziv-Aflibercept; ; Somatropin; Plasmodium falciparum or a Plasmodium vivax Antigen (e.g., CSP, CelTOS, TRAP, Rhs, AMA-1, LSA-1, LSA-3, Pfs25, MSP-1, MSP-3, STARP, EXP1, pb9, GLURP). The sequences of these polypeptides, including variations, are available in the literature and known to those of skill in the art. Any known sequence of any of the polypeptides listed is contemplated for use in the methods of the present invention.

[0092] In embodiments, the polypeptide of interest is enterokinase (e.g., SEQ ID NO: 31 bovine), insulin, proinsulin (e.g., SEQ ID NO: 32), a long-acting insulin analog or a proinsulin that is processed to form a long-acting insulin analog (e.g., insulin glargine, SEQ ID NO: 88, insulin detemir, or insulin degludec), a rapid-acting insulin analog or a proinsulin that is processed to form a rapid-acting insulin analog (e.g., insulin lispro, insulin aspart, or insulin glulisine), insulin C-peptide (e.g., SEQ ID NO: 97), IGF-1 (e.g., Mecasermin, SEQ ID NO: 35), Glp-1 (e.g., SEQ ID NO: 36), a Glp-1 analog (e.g., Exenatide, SEQ ID NO: 37), Glp-2 (e.g., SEQ ID NO: 38), a Glp-2 analog (e.g., Teduglutide, SEQ ID NO: 39), Pramlintide (e.g., SEQ ID NO: 40), Ziconotide (e.g., SEQ ID NO: 41), Becaplermin (e.g., SEQ ID NO: 42), Enfluvirtide (e.g., SEQ ID NO: 43), or Nesiritide (e.g., SEQ ID NO: 44).

[0093] In embodiments, the molecular weight of the polypeptide of interest is about 1 kDa, about 2 kDa, about 3 kDa, about 4 kDa, about 5 kDa, about 6 kDa, about 7 kDa, about 8 kDa, about 9 kDa, about 10 kDa, about 11 kDa, about 12 kDa, about 13 kDa, about 14 kDa, about 15 kDa, about 16 kDa, about 17 kDa, about 18 kDa, about 19 kDa, about 20 kDa, about 30 kDa, about 40 kDa, about 50 kDa, about 60 kDa, about 70 kDa, about 80 kDa, about 90 kDa, about 100 kDa, about 150 kDa, about 200 kDa, about 250 kDa, about 300 kDa, about 350 kDa, about 400 kDa, about 450 kDa, about 500 kDa, or more. In embodiments, the molecular weight of the recombinant polypeptide is about 1 to about 10 kDa, about 1 to about 20 kDa, about 1 to about 30 kDa, about 1 to about 40 kDa, about 1 to about 50 kDa, about 1 to about 60 kDa, about 1 to about 70 kDa, about 1 to about 80 kDa, about 1 to about 90 kDa, about 1 to about 100 kDa, about 1 kDa to about 200 kDa, about 1 kDa to about 300 kDa, about 1 kDa to about 400 kDa, about 1 kDa to about 500 kDa, about 2 to about 10 kDa, about 2 to about 20 kDa, about 2 to about 30 kDa, about 2 to about 40 kDa, about 2 to about 50 kDa, about 2 to about 60 kDa, about 2 to about 70 kDa, about 2 to about 80 kDa, about 2 to about 90 kDa, about 2 to about 100 kDa, about 2 kDa to about 200 kDa, about 2 kDa to about 300 kDa, about 2 kDa to about 400 kDa, about 2 kDa to about 500 kDa, about 3 to about 10 kDa, about 3 to about 20 kDa, about 3 to about 30 kDa, about 3 to about 40 kDa, about 3 to about 50 kDa, about 3 to about 60 kDa, about 3 to about 70 kDa, about 3 to about 80 kDa, about 3 to about 90 kDa, about 3 to about 100 kDa, about 3 kDa to about 200 kDa, about 3 kDa to about 300 kDa, about 3 kDa to about 400 kDa, or about 3 kDa to about 500 kDa. In embodiments the molecular weight of the polypeptide of interest is about 4.1 kDa.

[0094] In embodiments, the polypeptide of interest is 25 or more amino acids in length. In embodiments, the polypeptide of interest is about 25 to about 2000 or more amino acids in length. In embodiments, the polypeptide of interest is about or at least about 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 475, 500, 525, 550, 575, 600, 625, 650, 700, 750, 800, 850, 900, 950, 1000, 1200, 1400, 1600, 1800, or 2000 amino acids in length. In embodiments, the polypeptide of interest is about: 25 to about 2000, 25 to about 1000, 25 to about 500, 25 to about 250, 25 to about 100, or 25 to about 50, amino acids in length. In embodiments, the polypeptide of interest is 32, 36, 39, 71, 109, or 110 amino acids in length. In embodiments, the polypeptide of interest is 34 amino acids in length.

N-Terminal Fusion Partner

[0095] The N-terminal fusion partner of the recombinant fusion protein is a bacterial protein that improves the yield of the recombinant fusion protein obtained using a bacterial expression system. In embodiments, the N-terminal fusion partner can be stably overexpressed from a recombinant construct in a bacterial host cell. In embodiments, the yield and/or solubility of the polypeptide of interest are increased or improved by the presence of the N-terminal fusion partner. In embodiments, the N-terminal fusion partner facilitates proper folding of the recombinant fusion protein. In embodiments, the N-terminal fusion partner is a bacterial folding modulator or chaperone protein.

[0096] In embodiments, the N-terminal fusion partner is a large-sized affinity tag protein, a folding modulator, a molecular chaperone, a ribosomal protein, a translation-related factor, an OB-fold protein (oligonucleotide binding fold protein), or another protein described in the literature, e.g. by Ahn, et al., 2011, "Expression screening of fusion partners from an E. coli genome for soluble expression of recombinant proteins in a cell-free protein synthesis system," PLoS One, 6(11): e26875, incorporated herein by reference. In embodiments, the N-terminal fusion partner is a large-sized affinity tag protein selected from MBP, GST, NusA, Ubiquitin, Domain 1 of IF-2, and the N-terminal domain of L9. In embodiments, the N-terminal fusion partner is a ribosomal protein from the 30S ribosomal subunit, or a ribosomal protein from the 50S ribosomal subunit. In embodiments, the N-terminal fusion partner is an E. coli or Pseudomonad chaperone or folding modulator protein. In embodiments, the N-terminal fusion partner is a P. fluorescens chaperone or folding modulator protein. In embodiments, the N-terminal fusion partner is a chaperone or folding modulator protein selected from Table 1.
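The embodiments in paragraphs 0093 and 0094 that give a molecular weight of about 4.1 kDa and a length of 34 amino acids are consistent with hPTH 1-34, and the arithmetic can be checked with a small Python sketch. Two things are assumptions here: `HPTH_1_34` is the commonly published teriparatide sequence, assumed (not verified) to match SEQ ID NO: 1, and the mass table holds standard average residue masses rather than values taken from the patent.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini

# Commonly published teriparatide sequence, assumed to match hPTH 1-34.
HPTH_1_34 = "SVSEIQLMHNLGKHLNSMERVEWLRKKLQDVHNF"


def average_mass_da(sequence: str) -> float:
    """Average molecular mass of an unmodified linear peptide, in daltons."""
    return sum(AVG_RESIDUE_MASS[residue] for residue in sequence) + WATER


mass_kda = average_mass_da(HPTH_1_34) / 1000.0
print(f"length = {len(HPTH_1_34)} aa, mass = {mass_kda:.2f} kDa")
```

The same function can be applied to an N-terminal fusion partner or to a complete fusion sequence, provided the sequence contains only the twenty standard residues and carries no post-translational modifications.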

[0097] In embodiments, the N-terminal fusion partner is P. fluorescens DnaJ-like protein (SEQ ID NO: 2), FrnE (SEQ ID NO: 3), FrnE2 (SEQ ID NO: 63), FrnE3 (SEQ ID NO: 64), FklB (SEQ ID NO: 4), FklB3* (SEQ ID NO: 28), FklB2 (SEQ ID NO: 61), FklB3 (SEQ ID NO: 62), FkpB2 (SEQ ID NO: 5), SecB (SEQ ID NO: 6), EcpD (RXF04553.1, SEQ ID NO: 7), EcpD (RXF04296.1, SEQ ID NO: 65, also referred to herein as EcpD1), EcpD2 (SEQ ID NO: 66), or EcpD3 (SEQ ID NO: 67). In embodiments, the N-terminal fusion partner is Escherichia coli protein Skp (SEQ ID NO: 8).

[0098] In embodiments, the N-terminal fusion partner is truncated relative to the full-length fusion partner polypeptide. In embodiments, the N-terminal fusion partner is truncated from the C-terminus, to remove at least one C-terminal amino acid. In embodiments, the N-terminal fusion partner is truncated to remove 1 to 300 amino acids from the C-terminus of the full-length polypeptide. In embodiments, the N-terminal fusion partner is truncated to remove 300, 290, 280, 270, 260, 250, 240, 230, 220, 210, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 1 to 300, 1 to 295, 1 to 290, 1 to 280, 1 to 270, 1 to 260, 1 to 250, 1 to 240, 1 to 230, 1 to 220, 1 to 210, 1 to 200, 1 to 190, 1 to 180, 1 to 170, 1 to 160, 1 to 150, 1 to 140, 1 to 130, 1 to 120, 1 to 110, 1 to 100, 1 to 90, 1 to 80, 1 to 70, 1 to 60, 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 15, 1 to 10, or 1 to 5 amino acids from the C-terminus of the polypeptide. In embodiments, the N-terminal fusion partner polypeptide is truncated from the C-terminus, to retain the first N-terminal 300, 290, 280, 270, 260, 250, 240, 230, 220, 210, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 150 to 40, the first 150 to 50, the first 150 to 75, the first 150 to 100, the first 100 to 40, the first 100 to 50, the first 100 to 75, the first 75 to 40, the first 75 to 50, the first 300, the first 250, the first 200, the first 150, the first 140, the first 130, the first 120, the first 110, the first 100, the first 90, the first 80, the first 75, the first 70, the first 65, the first 60, the first 55, the first 50, or the first 40 amino acids of the full-length polypeptide.

[0099] In embodiments, the N-terminal fusion partner that is truncated is FklB, FrnE, or EcpD1. In embodiments, the N-terminal fusion partner that is truncated is FklB, wherein the FklB is truncated from the C-terminus to remove 148, 198, 210, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 1, 1 to 210, 1 to 200, 1 to 190, 1 to 180, 1 to 170, 1 to 160, 1 to 150, 1 to 140, 1 to 130, 1 to 120, 1 to 110, 1 to 100, 1 to 90, 1 to 80, 1 to 70, 1 to 60, 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 15, 1 to 10, or 1 to 5 amino acids. In embodiments, the N-terminal fusion partner that is truncated is EcpD, wherein the EcpD is truncated from the C-terminus to remove 148, 198, 210, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 1, 1 to 210, 1 to 200, 1 to 190, 1 to 180, 1 to 170, 1 to 160, 1 to 150, 1 to 140, 1 to 130, 1 to 120, 1 to 110, 1 to 100, 1 to 90, 1 to 80, 1 to 70, 1 to 60, 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 15, 1 to 10, or 1 to 5 amino acids. In embodiments, the N-terminal fusion partner that is truncated is FrnE, wherein the FrnE is truncated from the C-terminus to remove 118, 168, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5, 1, 1 to 190, 1 to 180, 1 to 170, 1 to 160, 1 to 150, 1 to 140, 1 to 130, 1 to 120, 1 to 110, 1 to 100, 1 to 90, 1 to 80, 1 to 70, 1 to 60, 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 15, 1 to 10, or 1 to 5 amino acids.

[0100] In embodiments, the N-terminal fusion partner is not β-galactosidase. In embodiments, the N-terminal fusion partner is not thioredoxin. In embodiments, the N-terminal fusion partner is neither β-galactosidase nor thioredoxin.

[0101] In embodiments, the molecular weight of the N-terminal fusion partner is about 1 kDa, about 2 kDa, about 3 kDa, about 4 kDa, about 5 kDa, about 6 kDa, about 7 kDa, about 8 kDa, about 9 kDa, about 10 kDa, about 11 kDa, about 12 kDa, about 13 kDa, about 14 kDa, about 15 kDa, about 16 kDa, about 17 kDa, about 18 kDa, about 19 kDa, about 20 kDa, about 30 kDa, about 40 kDa, about 50 kDa, about 60 kDa, about 70 kDa, about 80 kDa, about 90 kDa, about 100 kDa, about 150 kDa, about 200 kDa, about 250 kDa, about 300 kDa, about 350 kDa, about 400 kDa, about 450 kDa, about 500 kDa, or more. In embodiments, the molecular weight of the N-terminal fusion partner is about 1 to about 10 kDa, about 1 to about 20 kDa, about 1 to about 30 kDa, about 1 to about 40 kDa, about 1 to about 50 kDa, about 1 to about 60 kDa, about 1 to about 70 kDa, about 1 to about 80 kDa, about 1 to about 90 kDa, about 1 to about 100 kDa, about 1 kDa to about 200 kDa, about 1 kDa to about 300 kDa, about 1 kDa to about 400 kDa, about 1 kDa to about 500 kDa, about 2 to about 10 kDa, about 2 to about 20 kDa, about 2 to about 30 kDa, about 2 to about 40 kDa, about 2 to about 50 kDa, about 2 to about 60 kDa, about 2 to about 70 kDa, about 2 to about 80 kDa, about 2 to about 90 kDa, about 2 to about 100 kDa, about 2 kDa to about 200 kDa, about 2 kDa to about 300 kDa, about 2 kDa to about 400 kDa, about 2 kDa to about 500 kDa, about 3 to about 10 kDa, about 3 to about 20 kDa, about 3 to about 30 kDa, about 3 to about 40 kDa, about 3 to about 50 kDa, about 3 to about 60 kDa, about 3 to about 70 kDa, about 3 to about 80 kDa, about 3 to about 90 kDa, about 3 to about 100 kDa, about 3 kDa to about 200 kDa, about 3 kDa to about 300 kDa, about 3 kDa to about 400 kDa, or about 3 kDa to about 500 kDa.

[0102] In embodiments, the N-terminal fusion partner or truncated N-terminal fusion partner is 25 or more amino acids in length. In embodiments, the N-terminal fusion partner is about 25 to about 2000 or more amino acids in length. In embodiments, the N-terminal fusion partner is about or at least about 25, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, 470, 500, 530, 560, 590, 610, 640, 670, 700, 750, 800, 850, 900, 950, 1000, 1200, 1400, 1600, 1800, or 2000 amino acids in length. In embodiments, the polypeptide of interest is about: 25 to about 2000, 25 to about 1000, 25 to about 500, 25 to about 250, 25 to about 100, or 25 to about 50 amino acids in length.

Relative Sizes of the Polypeptide of Interest and the Recombinant Fusion Protein

[0103] The yield of the polypeptide of interest is proportional to the yield of the full recombinant fusion protein. This proportion depends on the relative sizes (e.g., molecular weight and/or length in amino acids) of the polypeptide of interest and the recombinant fusion protein. For example, decreasing the size of the N-terminal fusion partner in the fusion protein would result in a greater proportion of the fusion protein produced being the polypeptide of interest. In embodiments, to maximize yield of the polypeptide of interest, the N-terminal fusion partner is selected based on its size relative to the polypeptide of interest. In embodiments, an N-terminal fusion partner is selected to be a certain minimal size (e.g., MW or length in amino acids) relative to the polypeptide of interest. In embodiments, the recombinant fusion protein is designed so that the molecular weight of the polypeptide of interest constitutes from about 10% to about

50% of the molecular weight of the recombinant fusion protein. In embodiments, the molecular weight of the polypeptide of interest constitutes about or at least about: 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 35%, 40%, 45%, or 50% of the molecular weight of the recombinant fusion protein. In embodiments, the molecular weight of the polypeptide of interest constitutes about or at least about: 10% to about 50%, 11% to about 50%, 12% to about 50%, 13% to about 50%, 14% to about 50%, 15% to about 50%, 20% to about 50%, 25% to about 50%, 30% to about 50%, 35% to about 50%, 40% to about 50%, 13% to about 40%, 14% to about 40%, 15% to about 40%, 20% to about 40%, 25% to about 40%, 30% to about 40%, 35% to about 40%, 13% to about 30%, 14% to about 30%, 15% to about 30%, 20% to about 30%, 25% to about 30%, 13% to about 25%, 14% to about 25%, 15% to about 25%, or 20% to about 25% of the molecular weight of the recombinant fusion protein. In embodiments, the polypeptide of interest is hPTH and the molecular weight of the polypeptide of interest constitutes about 14.6% of the molecular weight of the recombinant fusion protein. In embodiments, the polypeptide of interest is hPTH and the molecular weight of the polypeptide of interest constitutes about 13.6% of the molecular weight of the recombinant fusion protein. In embodiments, the polypeptide of interest is hPTH and the molecular weight of the polypeptide of interest constitutes about 27.3% of the molecular weight of the recombinant fusion protein. In embodiments, the polypeptide of interest is met-GCSF and the molecular weight of the polypeptide of interest constitutes about 39% to about 72% of the molecular weight of the recombinant fusion protein. In embodiments, the polypeptide of interest is a proinsulin and the molecular weight of the polypeptide of interest constitutes about 20% to about 57% of the molecular weight of the recombinant fusion protein.

[0104] In embodiments, the length of the polypeptide of interest constitutes between about 10% and about 50% of the total length of the recombinant fusion protein. In embodiments, the length of the polypeptide of interest constitutes about or at least about: 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 35%, 40%, 45%, or 50% of the total length of the recombinant fusion protein. In embodiments, the length of the polypeptide of interest constitutes about or at least about: 10% to about 50%, 11% to about 50%, 12% to about 50%, 13% to about 50%, 14% to about 50%, 15% to about 50%, 20% to about 50%, 25% to about 50%, 30% to about 50%, 35% to about 50%, 40% to about 50%, 13% to about 40%, 14% to about 40%, 15% to about 40%, 20% to about 40%, 25% to about 40%, 30% to about 40%, 35% to about 40%, 13% to about 30%, 14% to about 30%, 15% to about 30%, 20% to about 30%, 25% to about 30%, 13% to about 25%, 14% to about 25%, 15% to about 25%, or 20% to about 25% of the total length of the recombinant fusion protein. In embodiments, the polypeptide of interest is hPTH and the length of the polypeptide of interest constitutes about 13.1% of the total length of the recombinant fusion protein. In embodiments, the polypeptide of interest is hPTH and the length of the polypeptide of interest constitutes about 12.5% of the total length of the recombinant fusion protein. In embodiments, the polypeptide of interest is hPTH and the length of the polypeptide of interest constitutes about 25.7% of the total length of the recombinant fusion protein. In embodiments, the polypeptide of interest is met-GCSF and the length of the polypeptide of interest constitutes about 40% to about 72% of the total length of the recombinant fusion protein. In embodiments, the polypeptide of interest is a proinsulin and the length of the polypeptide of interest constitutes about 19% to about 56% of the total length of the recombinant fusion protein.

Difference in Polypeptide of Interest and N-Terminal Fusion Partner Isoelectric Points

[0105] The isoelectric point of a protein (pI) is defined as the pH at which the protein carries no net electrical charge. The pI value is known to affect the solubility of a protein at a given pH. At a pH below its pI, a protein carries a net positive charge, and at a pH above its pI, it carries a net negative charge. Proteins can be separated according to their isoelectric point (overall charge). In embodiments, the pI of the polypeptide of interest and that of the N-terminal fusion protein are substantially different. This can facilitate purification of the polypeptide of interest away from the N-terminal fusion protein. In embodiments, the pI of the polypeptide of interest is at least two times higher than that of the N-terminal fusion partner. In embodiments, the pI of the polypeptide of interest is 1.5 to 3 times higher than that of the N-terminal fusion partner. In embodiments, the pI of the polypeptide of interest is 1.5, 1.6, 1.7, 1.8, 1.9, 2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, or 3 times higher than that of the N-terminal fusion partner. In embodiments, the pI of the N-terminal fusion partner is about 4, about 4.1, about 4.2, about 4.3, about 4.4, about 4.5, about 4.6, about 4.7, about 4.8, about 4.9, or about 5. In embodiments, the pI of the N-terminal fusion partner is about 4 to about 5, about 4.1 to about 4.9, about 4.2 to about 4.8, about 4.3 to about 4.7, or about 4.4 to about 4.6.

[0106] In embodiments, the N-terminal fusion partner is one listed in Table 8 or 18, having the pI listed therein. In embodiments, the C-terminal polypeptide of interest is hPTH 1-34, having a pI of 8.52 and a molecular weight of 4117.65 daltons. In embodiments, the C-terminal polypeptide of interest is Met-GCSF, having a pI of 5.66 and a molecular weight of 18801.9 daltons. In embodiments, the C-terminal polypeptide of interest is proinsulin as set forth in SEQ ID NO: 88, having a pI of about 5.2 and a molecular weight of about 9.34 kDa. In embodiments, the C-terminal polypeptide of interest is proinsulin as set forth in SEQ ID NO: 89, having a pI of about 6.07 and a molecular weight of about 8.81 kDa. In embodiments, the C-terminal polypeptide of interest is proinsulin as set forth in SEQ ID NO: 90, having a pI of about 5.52 and a molecular weight of about 8.75 kDa. In embodiments, the C-terminal polypeptide of interest is proinsulin as set forth in SEQ ID NO: 91, having a pI of 6.07 and a molecular weight of about 7.3 kDa. The pI of a protein can be determined according to any method as described in the literature and known to those of skill in the art.

Chaperones and Protein Folding Modulators

[0107] An obstacle to the production of a heterologous protein at a high yield in a non-native host cell (a cell to which the heterologous protein is not native) is that the cell often is not adequately equipped to produce the heterologous protein in soluble and/or active form. While the primary structure of a protein is defined by its amino acid sequence, the secondary structure is defined by the presence of alpha helices or beta sheets, and the tertiary structure by amino acid sidechain interactions within the protein, e.g., between protein domains. When expressing heterologous proteins, particularly in large-scale production, the secondary and tertiary structure of the protein itself are of critical importance. Any significant change in protein structure can yield a functionally inactive molecule, or a protein with significantly reduced biological activity. In many cases, a host cell expresses chaperones or protein folding modulators (PFMs) that are necessary for proper production of active heterologous protein. However, at the high levels of expression generally required to produce usable, economically satisfactory biotechnology products, a cell often cannot produce enough native protein folding modulator or modulators to process the heterologously-expressed protein.

[0108] In certain expression systems, overproduction of heterologous proteins can be accompanied by their misfolding and segregation into insoluble aggregates. In bacterial cells these aggregates are known as inclusion bodies. Proteins processed to inclusion bodies can, in certain cases, be recovered through additional processing of the insoluble fraction. Proteins found in inclusion bodies typically have to be purified through multiple steps, including denaturation and renaturation. Typical renaturation processes for inclusion body proteins involve attempts to dissolve the aggregate in concentrated denaturant with subsequent removal of the denaturant by dilution. Aggregates are frequently formed again in this stage. The additional processing adds cost, there is no guarantee that the in vitro refolding will yield biologically active product, and the recovered proteins can include large amounts of fragment impurities.

[0109] In vivo protein folding is assisted by molecular chaperones, which promote the proper isomerization and cellular targeting of other polypeptides by transiently interacting with folding intermediates, and by foldases, which accelerate rate-limiting steps along the folding pathway. In certain cases, the overexpression of chaperones has been found to increase the soluble yields of aggregation-prone proteins (see Baneyx, F., 1999, Curr. Opin. Biotech. 10:411–421). The beneficial effect associated with an increase in the intracellular concentration of these chaperones appears highly dependent on the nature of the overproduced protein, and may not require overexpression of the same protein folding modulator(s) for all heterologous proteins. Protein folding modulators, including chaperones, disulfide bond isomerases, and peptidyl-prolyl cis-trans isomerases (PPIases), are a class of proteins present in all cells which aid in the folding, unfolding and degradation of nascent polypeptides.

[0110] Chaperones act by binding to nascent polypeptides, stabilizing them and allowing them to fold properly. Proteins possess both hydrophobic and hydrophilic residues; the former are usually buried within the structure, where they interact with other hydrophobic residues rather than the water which surrounds the molecule, while the latter are usually exposed on the surface. However, in folding polypeptide chains, the hydrophobic residues are often exposed for some period of time as the protein exists in a partially folded or misfolded state. It is during this time that the forming polypeptides can become permanently misfolded or interact with other misfolded proteins and form large aggregates or inclusion bodies within the cell. Chaperones generally act by binding to the hydrophobic regions of the partially folded chains and preventing them from misfolding completely or aggregating with other proteins. Chaperones can even bind to proteins in inclusion bodies and allow them to disaggregate. The GroES/EL, DnaKJ, Clp, Hsp90 and SecB families of folding modulators are all examples of proteins with chaperone-like activity.

[0111] The disulfide bond isomerases are another important type of folding modulator. These proteins catalyze a very specific set of reactions to help folding polypeptides form the proper intra-protein disulfide bonds. Any protein that has more than two cysteines is at risk of forming disulfide bonds between the wrong residues. The disulfide bond formation family consists of the Dsb proteins, which catalyze the formation of disulfide bonds in the non-reducing environment of the periplasm. When a periplasmic polypeptide misfolds, the disulfide bond isomerase DsbC is capable of rearranging the disulfide bonds and allowing the protein to reform with the correct linkages.

[0112] The FklB and FrnE proteins belong to the peptidyl-prolyl cis-trans isomerase family of folding modulators. This is a class of enzymes that catalyze the cis-trans isomerization of proline imidic peptide bonds in oligopeptides. The proline residue is unique among amino acids in that the peptidyl bond immediately preceding it can adopt either a cis or trans conformation. For all other amino acids this is not favored due to steric hindrance. Peptidyl-prolyl cis-trans isomerases (PPIases) catalyze the conversion of this bond from one form to the other. This isomerization may accelerate and/or aid protein folding, refolding, assembly of subunits and trafficking in the cell.

[0113] In addition to the general chaperones which seem to interact with proteins in a non-specific manner, there are also chaperones which aid in the folding of specific targets. These protein-specific chaperones form complexes with their targets, preventing aggregation and degradation and allowing time for them to assemble into multi-subunit structures. The PapD chaperone is an example (described in Lombardo et al., 1997, Escherichia coli PapD, in Guidebook to Molecular Chaperones and Protein-Folding Catalysts, Gething M-J Ed., Oxford University Press Inc., New York: 463-465), incorporated herein by reference.

[0114] Folding modulators include, for example, HSP70 proteins, HSP110/SSE proteins, HSP40 (DnaJ-related) proteins, GRPE-like proteins, HSP90 proteins, CPN60 and CPN10 proteins, cytosolic chaperonins, HSP100 proteins, small HSPs, calnexin and calreticulin, PDI and thioredoxin-related proteins, peptidyl-prolyl isomerases, cyclophilin PPIases, FK-506 binding proteins, parvulin PPIases, individual chaperonins, protein specific chaperones, or intramolecular chaperones. Folding modulators are generally described in “Guidebook to Molecular Chaperones and Protein-Folding Catalysts,” 1997, ed. M. Gething, Melbourne University, Australia, incorporated herein by reference.

[0115] The best characterized molecular chaperones in the cytoplasm of E. coli are the ATP-dependent DnaK-DnaJ-GrpE and GroEL-GroES systems. In E. coli, the network of folding modulators/chaperones includes the Hsp70 family. The major Hsp70 chaperone, DnaK, efficiently prevents protein aggregation and supports the refolding of damaged proteins. The incorporation of heat shock proteins into protein aggregates can facilitate disaggregation. Based on in vitro studies and homology considerations, a number of additional cytoplasmic proteins have been proposed to function as molecular chaperones in E. coli. These include ClpB, HtpG and IbpA/B, which, like DnaK-DnaJ-GrpE and GroEL-GroES, are heat-shock proteins (Hsps) belonging to the stress regulon.

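The relative-size and isoelectric-point embodiments in paragraphs 0103 to 0106 come down to simple arithmetic, which the following minimal Python sketch illustrates. It is not part of the disclosure: the function names are hypothetical, the hPTH 1-34 values (pI 8.52, 4117.65 daltons) and the 14.6% mass fraction are the figures quoted in those paragraphs, and the fusion partner pI of 4.5 is an assumed value chosen from inside the stated range of about 4 to about 5.

```python
"""Size and isoelectric-point arithmetic for a fusion design (illustrative only)."""


def mass_fraction(poi_mw_da: float, fusion_mw_da: float) -> float:
    """Fraction of the fusion protein's mass contributed by the polypeptide of interest."""
    if poi_mw_da <= 0 or fusion_mw_da < poi_mw_da:
        raise ValueError("expected 0 < poi_mw_da <= fusion_mw_da")
    return poi_mw_da / fusion_mw_da


def implied_fusion_mw(poi_mw_da: float, fraction: float) -> float:
    """Fusion-protein mass implied by a stated mass fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return poi_mw_da / fraction


def pi_fold_difference(poi_pi: float, partner_pi: float) -> float:
    """How many times higher the polypeptide's pI is than the fusion partner's."""
    return poi_pi / partner_pi


def net_charge_sign(pi: float, ph: float) -> int:
    """Sign of the net charge: +1 below the pI, -1 above it, 0 at the pI."""
    return (ph < pi) - (ph > pi)


# Values quoted in the text for hPTH 1-34.
HPTH_MW_DA = 4117.65
HPTH_PI = 8.52
# Assumption for illustration: a partner pI inside the stated 4-to-5 range.
PARTNER_PI = 4.5

if __name__ == "__main__":
    fusion_mw = implied_fusion_mw(HPTH_MW_DA, 0.146)
    print(f"implied fusion protein mass: {fusion_mw / 1000:.1f} kDa")
    print(f"pI fold difference: {pi_fold_difference(HPTH_PI, PARTNER_PI):.2f}")
    print("charge signs at pH 7:",
          net_charge_sign(HPTH_PI, 7.0), net_charge_sign(PARTNER_PI, 7.0))
```

Under these assumptions, a 14.6% mass fraction implies a fusion protein of roughly 28.2 kDa, and the pI ratio falls inside the 1.5-to-3-fold range. At pH 7 the two components carry opposite net charges, which is the property the text relies on for charge-based separation after cleavage.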
[0116] The P. fluorescens DnaJ-like protein is a molecular chaperone belonging to the DnaJ/Hsp40 family of proteins, characterized by their highly conserved J-domain. The J-domain, which is a region of 70 amino acids, is located at the C terminus of the DnaJ protein. The N terminus has a transmembrane (TM) domain that promotes insertion into the membrane. The A-domain separates the TM domain from the J-domain. Proteins in the DnaJ family play a critical role in protein folding, by interacting with another chaperone protein, DnaK (as a co-chaperone). The highly conserved J-domain is the site of interaction between DnaJ proteins and DnaK proteins. Type I DnaJ proteins are considered true DnaJ proteins, while types II and III are usually referred to as DnaJ-like proteins. The DnaJ-like protein is also known to participate actively in the response to hyperosmotic and heat shock by preventing the aggregation of stress-denatured proteins and by disaggregating proteins, in both DnaK-dependent and DnaK-independent manners.

[0117] The trans conformation of X-Pro bonds is energetically favored in nascent protein chains; however, approximately 5% of all prolyl peptide bonds are found in a cis conformation in native proteins. The trans to cis isomerization of X-Pro bonds is rate limiting in the folding of many polypeptides and is catalyzed in vivo by peptidyl prolyl cis/trans isomerases (PPIases). Three cytoplasmic PPIases, SlyD, SlpA and trigger factor (TF), have been identified to date in E. coli. TF is a 48 kDa protein associated with 50S ribosomal subunits that has been postulated to cooperate with chaperones in E. coli to guarantee proper folding of newly synthesized proteins. At least five proteins (thioredoxins 1 and 2, and glutaredoxins 1, 2 and 3, the products of the trxA, trxC, grxA, grxB and grxC genes, respectively) are involved in the reduction of disulfide bridges that transiently arise in cytoplasmic enzymes. Thus, the N-terminal fusion partner can be a disulfide bond forming protein or a chaperone that allows proper disulfide bond formation.

[0118] Examples of folding modulators useful in the methods of the present invention are shown in Table 1. RXF numbers refer to the open reading frame. U.S. Pat. App. Pub. Nos. 2008/0269070 and 2010/0137162, both titled “Method for Rapidly Screening Microbial Hosts to Identify Certain Strains with Improved Yield and/or Quality in the Expression of Heterologous Proteins,” incorporated by reference herein in their entirety, disclose the open reading frame sequences for the proteins listed in Table 1. Proteases and folding modulators also are provided in Tables A to F of U.S. Pat. No. 8,603,824, “Process for improved protein expression by strain engineering,” incorporated by reference herein in its entirety.

TABLE 1. P. fluorescens Folding Modulators

ORF ID | GENE | FUNCTION | FAMILY | LOCATION

GroES/EL
RXF02095.1 | groES | Chaperone | Hsp10 | Cytoplasmic
RXF06767.1: Rxf02090 | groEL | Chaperone | Hsp60 | Cytoplasmic
RXF01748.1 | ibpA | Small heat-shock protein (sHSP) IbpA PA3126; acts as a holder for GroESL folding | Hsp20 | Cytoplasmic
RXF03385.1 | hscB | Chaperone protein hscB | Hsp20 | Cytoplasmic

Hsp70 (DnaK/J)
RXF05399.1 | dnaK | Chaperone | Hsp70 | Periplasmic
RXF06954.1 | dnaK | Chaperone | Hsp70 | Cytoplasmic
RXF03376.1 | hscA | Chaperone | Hsp70 | Cytoplasmic
RXF03987.2 | cbpA | Curved dna-binding protein, dnaJ like activity | Hsp40 | Cytoplasmic
RXF05406.2 | dnaJ | Chaperone protein dnaJ | Hsp40 | Cytoplasmic
RXF03346.2 | dnaJ | Molecular chaperones (DnaJ family) | Hsp40 | Non-secretory
RXF05413.1 | grpE | heat shock protein GrpE PA4762 | GrpE | Cytoplasmic

Hsp100 (Clp/Hsl)
RXF04587.1 | clpA | atp-dependent clp protease atp-binding subunit | Hsp100 | Cytoplasmic
RXF08347.1 | clpB | ClpB protein | Hsp100 | Cytoplasmic
RXF04654.2 | clpX | atp-dependent clp protease atp-binding subunit | Hsp100 | Cytoplasmic
RXF04663.1 | clpP | atp-dependent Clp protease proteolytic subunit (ec 3.4.21.92) | MEROPS peptidase family S14 | Cytoplasmic
RXF01957.2 | hslU | atp-dependent hsl protease atp-binding subunit | Hsp100 | Cytoplasmic
RXF01961.2 | hslV | atp-dependent hsl protease proteolytic subunit | MEROPS peptidase subfamily B | Cytoplasmic

Hsp33
RXF04254.2 | yrfI | 33 kDa chaperonin (Heat shock protein 33 homolog) (HSP33) | Hsp33 | Cytoplasmic

Hsp90
RXF05455.2 | htpG | Chaperone protein htpG | Hsp90 | Cytoplasmic

SecB
RXF02231.1 | secB | secretion specific chaperone SecB | SecB | Non-secretory

Disulfide Bond Isomerases

RXF07017.2 | dsbA | disulfide isomerase | DSBA oxidoreductase | Cytoplasmic
RXF08657.2 | frnE | disulfide isomerase | DSBA oxidoreductase | Cytoplasmic
RXF01002.1 | dsbA homolog | disulfide isomerase | DSBA oxidoreductase/Thioredoxin | Periplasmic
RXF03307.1 | dsbC | disulfide isomerase | Glutaredoxin/Thioredoxin | Periplasmic
RXF04890.2 | dsbG | disulfide isomerase | Glutaredoxin/Thioredoxin | Periplasmic
 | dsbB | Disulfide bond formation protein B (Disulfide oxidoreductase) | DSBA oxidoreductase | Periplasmic
RXF04886.2 | dsbD | Thiol:disulfide interchange protein dsbD | DSBA oxidoreductase | Periplasmic

Peptidyl-prolyl Cis-trans Isomerases
RXF03768.1 | ppiA | Peptidyl-prolyl cis-trans isomerase A (ec 5.2.1.8) | PPIase: cyclophilin type | Periplasmic
RXF05345.2 | ppiB | Peptidyl-prolyl cis-trans isomerase B | PPIase: cyclophilin type | Cytoplasmic
RXF06034.2 | fklB | Peptidyl-prolyl cis-trans isomerase FklB | PPIase: FKBP type | OuterMembrane
RXF06591.1 | fklB, fkbP | fk506 binding protein Peptidyl-prolyl cis-trans isomerase (EC 5.2.1.8) | PPIase: FKBP type | Periplasmic
RXF05753.2 | fklB, fkbP | Peptidyl-prolyl cis-trans isomerase (ec 5.2.1.8) | PPIase: FKBP type | OuterMembrane
RXF01833.2 | slyD | Peptidyl-prolyl cis-trans isomerase SlyD | PPIase: FKBP type | Non-secretory
RXF04655.2 | tig | Trigger factor, ppiase (ec 5.2.1.8) | PPIase: FKBP type | Cytoplasmic
RXF05385. | yaad | Probable FKBP-type 16 kDa peptidyl-prolyl cis-trans isomerase (EC 5.2.1.8) (PPiase) (Rotamase) | PPIase: FKBP type | Non-secretory
RXF00271. | | Peptidyl-prolyl cis-trans isomerase (ec 5.2.1.8) | PPIase: FKBP type | Non-secretory

Pili Assembly Chaperones (papD-like)

FO6068. Cl Chaperone protein cup pili assembly Periplasmic bap) F05719. ecpD Chaperone protein ecpD pili assembly bap) F05319. ecpD Hnr protein pili assembly Periplasmic chaperone ecpD; Chaperone protein ecpD pili assembly Signal peptide cSuC 8. FO4296. ecpD; Chaperone protein ecpD pili1 assembly Periplasmic Cl 8. FO4553. ecpD; Chaperone protein ecpD pili assembly Periplasmic Cl 8. FO4554.2 ecpD; Chaperone protein ecpD pili assembly Periplasmic Cl bap) FOS3.10.2 ecpD; Chaperone protein ecpD pili assembly Periplasmic Cl 8. FOS3O4.1 ecpD; Chaperone protein ecpD pili assembly Periplasmic Cl 8. F05073.1 Gram-nega ive pili assembly chaperone pili assembly Signal peptide periplasmic function 8. Type II Secretion Complex FOS445.1 Yac Histidinol-phosphate aminotransferase Class-II pyridoxal Membrane (ec 2.6.1.9) phosphate-dependent aminotransferase amily. Histidinol US 2016/0159877 A1 Jun. 9, 2016 15

phosphate aminotransferase subfamily
RXF05426.1 | SecD | Protein translocase subunit secD | Type II secretion complex | Membrane
RXF05432.1 | SecF | protein translocase subunit secF | Type II secretion complex | Membrane

Disulfide Bond Reductases
RXF08122.2 | trxC | Thioredoxin 2 | Disulfide Bond Reductase | Cytoplasmic
RXF06751.1 | Gor | Glutathione reductase (EC 1.8.1.7) (GR) (GRase) PA2025 | Disulfide Bond Reductase | Cytoplasmic
RXF00922.1 | gshA | Glutamate-cysteine ligase (ec 6.3.2.2) PA5203 | Disulfide Bond Reductase | Cytoplasmic

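Table 1 groups each open reading frame by chaperone family and predicted cellular location, so it can be treated as a small lookup structure when screening candidate fusion partners. The following minimal Python sketch is not part of the disclosure. The class and function names are hypothetical, only five rows are transcribed, and the ORF identifiers assume that the letter O in the extracted text stands for the digit 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoldingModulator:
    """One row of Table 1, reduced to the fields used for filtering."""
    orf_id: str
    gene: str
    family: str
    location: str


# A few rows transcribed from Table 1 (identifiers normalized as noted above).
TABLE_1_SUBSET = (
    FoldingModulator("RXF02095.1", "groES", "Hsp10", "Cytoplasmic"),
    FoldingModulator("RXF05399.1", "dnaK", "Hsp70", "Periplasmic"),
    FoldingModulator("RXF05455.2", "htpG", "Hsp90", "Cytoplasmic"),
    FoldingModulator("RXF02231.1", "secB", "SecB", "Non-secretory"),
    FoldingModulator("RXF08122.2", "trxC", "Disulfide Bond Reductase", "Cytoplasmic"),
)


def by_location(rows, location):
    """Return the rows whose predicted location matches, ignoring case."""
    wanted = location.casefold()
    return [row for row in rows if row.location.casefold() == wanted]


def by_family(rows, family):
    """Return the rows belonging to the given family, ignoring case."""
    wanted = family.casefold()
    return [row for row in rows if row.family.casefold() == wanted]


if __name__ == "__main__":
    for row in by_location(TABLE_1_SUBSET, "cytoplasmic"):
        print(row.orf_id, row.gene, row.family)
```

Filtering by location is only one plausible use. The same records could be extended with molecular weight or pI columns if those values were taken from the other tables the text refers to.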
Linkers

[0119] The recombinant fusion proteins of the present invention contain a linker between the N-terminal fusion partner and the C-terminal polypeptide of interest. In embodiments, the linker comprises a cleavage site that is recognized by a cleavage enzyme, i.e., a proteolytic enzyme that cleaves a protein internally. In embodiments, cleavage of the linker at the cleavage site separates the polypeptide of interest from the N-terminal fusion partner. The proteolytic enzyme can be any protease known in the art or described in the literature, e.g., in PCT Pub. No. WO 2003/010204, “Process for Preparing Polypeptides of Interest from Fusion Polypeptides,” U.S. Pat. No. 5,750,374, “Process for Producing Hydrophobic Polypeptides and Proteins, and Fusion Proteins for Use in Producing Same,” and U.S. Pat. No. 5,935,824, each incorporated by reference herein in its entirety.

[0120] In embodiments, the linker comprises a cleavage site cleaved by, e.g., a serine protease, threonine protease, cysteine protease, aspartate protease, glutamic protease, metalloprotease, asparagine protease, mixed protease, or a protease of unknown catalytic type. In embodiments, the serine protease is, e.g., trypsin, chymotrypsin, endoproteinase Arg-C, endoproteinase Glu-C, endoproteinase Lys-C, elastase, proteinase K, subtilisin, carboxypeptidase P, carboxypeptidase Y, or acylaminoacid releasing enzyme. In embodiments, the metalloprotease is, e.g., endoproteinase Asp-N, thermolysin, carboxypeptidase A, or carboxypeptidase B. In embodiments, the cysteine protease is, e.g., papain, clostripain, cathepsin C, or pyroglutamate aminopeptidase. In embodiments, the aspartate protease is, e.g., pepsin, chymosin, or cathepsin D. In embodiments, the glutamic protease is, e.g., scytalidoglutamic peptidase. In embodiments, the asparagine protease is, e.g., nodavirus peptide lyase, intein-containing chloroplast ATP-dependent peptide lyase, intein-containing replicative DNA helicase precursor, or reovirus type 1 coat protein. In embodiments, the protease of unknown catalytic type is, e.g., collagenase, protein P5 murein endopeptidase, homomultimeric peptidase, microcin-processing peptidase 1, or Dop isopeptidase.

[0121] In embodiments, the linker comprises a cleavage site for Achromopeptidase, Aminopeptidase, Ancrod, Angiotensin Converting Enzyme, Bromelain, Calpain, Calpain I, Calpain II, Carboxypeptidase A, Carboxypeptidase B, Carboxypeptidase G, Carboxypeptidase P, Carboxypeptidase W, Carboxypeptidase Y, Caspases (general), Caspase 1, Caspase 2, Caspase 3, Caspase 4, Caspase 5, Caspase 6, Caspase 7, Caspase 8, Caspase 9, Caspase 10, Caspase 11, Caspase 12, Caspase 13, Cathepsin B, Cathepsin C, Cathepsin D, Cathepsin E, Cathepsin G, Cathepsin H, Cathepsin L, Chymopapain, Chymase, Chymotrypsin, a-Clostripain, Collagenase, Complement C1r, Complement C1s, Complement Factor D, Complement factor I, Cucumisin, Dipeptidyl Peptidase IV, Elastase, leukocyte, Elastase, Endoproteinase Arg-C, Endoproteinase Asp-N, Endoproteinase Glu-C, Endoproteinase Lys-C, Enterokinase, Factor Xa, Ficin, Furin, Granzyme A, Granzyme B, HIV Protease, IGase, Kallikrein tissue, Leucine Aminopeptidase (General), Leucine aminopeptidase, cytosol, Leucine aminopeptidase, microsomal, Matrix metalloprotease, Methionine Aminopeptidase, Neutrase, Papain, Pepsin, Plasmin, Prolidase, Pronase E, Prostate Specific Antigen, Protease, Alkalophilic from Streptomyces griseus, Protease from Aspergillus, Protease from Aspergillus saitoi, Protease from Aspergillus sojae, Protease (B. licheniformis) (Alkaline), Protease (B. licheniformis) (Alcalase), Protease from Bacillus polymyxa, Protease from Bacillus sp. (Esperase), Protease from Rhizopus sp., Protease S, Proteasomes, Proteinase from Aspergillus oryzae, Proteinase 3, Proteinase A, Proteinase K, Protein C, Pyroglutamate aminopeptidase, Renin, Rennin, Streptokinase, Subtilisin, Thermolysin, Thrombin, Tissue Plasminogen Activator, Trypsin, Tryptase, or Urokinase. In embodiments, the linker comprises a cleavage site recognized by Enterokinase, Factor Xa, or Furin. In embodiments, the linker comprises a cleavage site recognized by Enterokinase or trypsin. In embodiments, the linker comprises a cleavage site recognized by bovine Enterokinase. These and other proteases useful in the methods of the present invention, and their cleavage recognition sites, are known in the art and described in the literature, e.g., by Harlow and Lane, ANTIBODIES: A LABORATORY MANUAL, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1988); Walsh, PROTEINS: BIOCHEMISTRY AND BIOTECHNOLOGY, John Wiley & Sons, Ltd., West Sussex, England (2002), incorporated herein by reference.

[0122] In embodiments, the linker comprises an affinity tag. An affinity tag is a peptide sequence that can aid in protein purification. Affinity tags are fused to proteins to facilitate purification of the protein from a crude biological source, using an affinity technique. Any suitable affinity tag known in the art can be used as desired. In embodiments, an affinity tag

used in the present invention is, e.g., Chitin Binding Protein, Maltose Binding Protein, Glutathione-S-transferase Protein, Polyhistidine, FLAG tag (SEQ ID NO: 229), Calmodulin tag (SEQ ID NO: 230), Myc tag, BP tag, HA-tag (SEQ ID NO: 231), E-tag (SEQ ID NO: 232), S-tag (SEQ ID NO: 233), SBP tag (SEQ ID NO: 234), Softag 1, Softag 3 (SEQ ID NO: 235), V5 tag (SEQ ID NO: 236), Xpress tag, Green Fluorescent Protein, Nus tag, Strep tag, Thioredoxin tag, MBP tag, VSV tag (SEQ ID NO: 237), or Avi tag.

[0123] Affinity tags can be removed by chemical agents or by enzymatic means, such as proteolysis. Methods for using affinity tags in protein purification are described in the literature, e.g., by Lichty, et al., 2005, “Comparison of affinity tags for protein purification,” Protein Expression and Purification 41: 98-105. Other affinity tags useful in linkers of the invention are known in the art and described in the literature, e.g., by U.S. Pat. No. 5,750,374, referenced above, and Terpe K., 2003, “Overview of Tag Protein Fusions: from molecular and biochemical fundamentals to commercial systems,” Applied Microbiology and Biotechnology (60):523-533, both incorporated by reference herein in their entirety.

[0124] In embodiments, the linker is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more, amino acids in length. In embodiments, the linker is 4 to 50, 4 to 45, 4 to 40, 4 to 35, 4 to 30, 4 to 25, 4 to 20, 4 to 15, 4 to 10, 5 to 50, 5 to 45, 5 to 40, 5 to 35, 5 to 30, 5 to 25, 5 to 20, 5 to 15, 5 to 10, 10 to 50, 10 to 45, 10 to 40, 10 to 35, 10 to 30, 10 to 25, 10 to 20, 10 to 15, 15 to 50, 15 to 45, 15 to 40, 15 to 35, 15 to 30, 15 to 25, 15 to 20, 20 to 50, 20 to 45, 20 to 40, 20 to 35, 20 to 30, or 20 to 25 amino acids in length. In embodiments, the linker is 18 amino acids in length. In embodiments, the linker is 19 amino acids in length.

[0125] In embodiments the linker includes multiple glycine residues. In embodiments, the linker includes 1, 2, 3, 4, 5, 6, 7, 8, or more glycine residues. In embodiments, the linker includes 1 to 8, 1 to 7, 1 to 6, 1 to 5, or 1 to 4 glycine residues. In embodiments, the glycine residues are consecutive. In embodiments, the linker contains at least one serine residue. In embodiments, the glycine and/or serine residues comprise a spacer. In embodiments, the spacer is a (G4S)2 spacer having 10 amino acids, as set forth in SEQ ID NO: 59. In embodiments, the spacer is a (G4S)1, (G4S)2, (G4S)3, (G4S)4, or (G4S)5 spacer. In embodiments, the linker contains six histidine residues, or a His-tag. In embodiments the linker includes an enterokinase cleavage site, e.g., as set forth by SEQ ID NO: 13 (DDDDK). In embodiments, the recombinant fusion protein comprises a linker as set forth in any of SEQ ID NOS: 9 to 12, or 226, listed in Table 2. The enterokinase cleavage site in SEQ ID NO: 9 is underlined. The polyhistidine affinity tags are italicized in each of SEQ ID NOS: 9 to 12 and 226. In embodiments, the recombinant fusion protein comprises a linker corresponding to SEQ ID NO: 9.

TABLE 2
Linker Sequences

SEQ ID NO:    Amino Acid Sequence
9             GGGGSGGGGHHHHHHDDDDK
10            GGGGSGGGGHHHHHHRKR
11            GGGGSGGGGHHHHHHRRR
12            GGGGSGGGGHHHHHHLWPR
226           GGGGSGGGGSHHHHHHR

Expression Vector

[0126] In embodiments, gene fragments coding for recombinant fusion proteins are introduced into suitable expression plasmids to generate expression vectors for expressing recombinant fusion proteins. The expression vector can be, for example, a plasmid. In some embodiments, a plasmid encoding a recombinant fusion protein sequence can comprise a selection marker, and host cells maintaining the plasmid can be grown under selective conditions. In some embodiments, the plasmid does not comprise a selection marker. In some embodiments, the expression vector can be integrated into the host cell genome. In some embodiments, the expression vector encodes hPTH 1-34 fused to a linker and a protein that can direct the expressed fusion protein to the cytoplasm. In embodiments, the expression vector encodes hPTH 1-34 fused to a linker and a protein that can direct the expressed fusion protein to the periplasm. In some embodiments, the expression vector encodes hPTH 1-34 fused to a linker and P. fluorescens DnaJ-like protein. In some embodiments, the expression vector encodes hPTH 1-34 fused to a linker and P. fluorescens FklB protein.

[0127] Examples of nucleotide sequences encoding PTH 1-34 fusion proteins are provided in the Table of Sequences herein. Examples of nucleotide sequences that encode a fusion protein comprising a DnaJ-like protein N-terminal fusion partner are designated gene ID 126203 (SEQ ID NO: 52), corresponding to a coding sequence optimized for P. fluorescens. The sequence designated gene ID 126206 (SEQ ID NO: 53) corresponds to a native P. fluorescens DnaJ coding sequence fused to an optimized linker and PTH 1-34 coding sequence. The gene sequences 126203 and 126206 are those present in the expression plasmids p708-001 and p708-004, respectively. Examples of nucleotide sequences that encode a fusion protein comprising an FklB N-terminal fusion partner are designated gene ID 126204 (SEQ ID NO: 54), corresponding to a coding sequence optimized for P. fluorescens. The gene ID 126207 (SEQ ID NO: 55) corresponds to a native P. fluorescens FklB coding sequence fused to an optimized linker and PTH 1-34 coding sequence. The gene sequences 126204 and 126207 are those present in the expression plasmids p708-002 and p708-005, respectively. Examples of nucleotide sequences that encode a fusion protein comprising an FrnE N-terminal fusion partner are designated gene ID 126205 (SEQ ID NO: 56), corresponding to a coding sequence optimized for P. fluorescens. The sequence designated gene ID 126208 (SEQ ID NO: 57) corresponds to a native P. fluorescens FrnE coding sequence fused to an optimized linker and PTH 1-34 coding sequence. The gene sequences 126205 and 126208 are present in the expression plasmids p708-003 and p708-006, respectively.

Codon Optimization

[0128] The present invention contemplates the use of any appropriate coding sequence for the fusion protein and/or

each of its individual components, including any sequence that has been optimized for expression in the host cell being used. Methods for optimizing codons to improve expression in bacterial hosts are known in the art and described in the literature. For example, optimization of codons for expression in a Pseudomonas host strain is described, e.g., in U.S. Pat. App. Pub. No. 2007/0292918, “Codon Optimization Method,” incorporated herein by reference in its entirety. Codon optimization for expression in E. coli is described, e.g., by Welch, et al., 2009, PLoS One, “Design Parameters to Control Synthetic Gene Expression in Escherichia coli,” 4(9): e7002, incorporated by reference herein. Nonlimiting examples of coding sequences for fusion protein components are provided herein, however it is understood that any suitable sequence can be generated as desired according to methods well known by those of skill in the art.

Expression Systems

[0129] An appropriate bacterial expression system useful for producing the polypeptide of interest according to the present methods can be identified by one of skill in the art based on the teachings herein. In embodiments, an expression construct comprising a nucleotide sequence encoding a recombinant fusion protein comprising the polypeptide of interest is provided as part of an inducible expression vector. In embodiments, a host cell that has been transformed with the expression vector is cultured, and expression of the fusion protein from the expression vector is induced. The expression vector can be, for example, a plasmid. In embodiments, the expression vector is a plasmid encoding a recombinant fusion protein coding sequence further comprising a selection marker, and the host cells are grown under selective conditions that allow maintenance of the plasmid. In embodiments, the expression construct is integrated into the host cell genome. In embodiments, the expression construct encodes a recombinant fusion protein fused to a secretory signal that can direct the recombinant fusion protein to the periplasm.

[0130] Methods for expressing heterologous proteins, including useful regulatory sequences (e.g., promoters, secretion leaders, and ribosome binding sites), in host cells useful in the methods of the present invention, including Pseudomonas host cells, are described, e.g., in U.S. Pat. App. Pub. Nos. 2008/0269070 and 2010/0137162, U.S. Pat. App. Pub. No. 2006/0040352, “Expression of Mammalian Proteins in Pseudomonas fluorescens,” and U.S. Pat. No. 8,603,824, each incorporated herein by reference in its entirety. These publications also describe bacterial host strains useful in practicing the methods of the invention, that have been engineered to overexpress folding modulators or wherein protease mutations have been introduced, e.g., to eliminate, inactivate or decrease activity of the protease, in order to increase heterologous protein expression. Sequence leaders are described in detail in U.S. Pat. No. 7,618,799, “Bacterial leader sequences for increased expression,” and U.S. Pat. No. 7,985,564, “Expression systems with Sec-system Secretion,” both incorporated herein by reference in their entirety, as well as in U.S. Pat. App. Pub. No. 2010/0137162, previously referenced.

[0131] Promoters used in accordance with the present invention may be constitutive promoters or regulated promoters. Examples of inducible promoters include those of the family derived from the lac promoter (i.e., the lacZ promoter), e.g., the tac and trc promoters described in U.S. Pat. No. 4,551,433, “Microbial Hybrid Promoters,” incorporated herein by reference, as well as Ptac16, Ptac17, PtacII, PlacUV5, and the T7lac promoter. In embodiments, the promoter is not derived from the host cell organism. In embodiments, the promoter is derived from an E. coli organism. In embodiments, a lac promoter is used to regulate expression of a recombinant fusion protein from a plasmid. In the case of the lac promoter derivatives or family members, e.g., the tac promoter, an inducer is IPTG (isopropyl-β-D-1-thiogalactopyranoside, “isopropylthiogalactoside”). In embodiments, IPTG is added to the host cell culture to induce expression of the recombinant fusion protein from a lac promoter in a Pseudomonas host cell according to methods known in the art and described in the literature, e.g., in U.S. Pat. Pub. No. 2006/0040352.

[0132] Examples of non-lac promoters useful in expression systems according to the present invention include PR (induced by high temperature), PL (induced by high temperature), Pm (induced by alkyl- or halo-benzoates), Pu (induced by alkyl- or halo-toluenes), or Psal (induced by salicylates), described in, e.g., J. Sanchez-Romero & V. De Lorenzo (1999) Manual of Industrial Microbiology and Biotechnology (A. Demain & J. Davies, eds.) pp. 460-74 (ASM Press, Washington, D.C.); H. Schweizer (2001) Current Opinion in Biotechnology, 12:439-445; and R. Slater & R. Williams (2000) Molecular Biology and Biotechnology (J. Walker & R. Rapley, eds.) pp. 125-54 (The Royal Society of Chemistry, Cambridge, UK). A promoter having the nucleotide sequence of a promoter native to the selected bacterial host cell also may be used to control expression of the expression construct encoding the polypeptide of interest, e.g., a Pseudomonas anthranilate or benzoate operon promoter (Pant, Pben). Tandem promoters may also be used in which more than one promoter is covalently attached to another, whether the same or different in sequence, e.g., a Pant-Pben tandem promoter (interpromoter hybrid) or a Plac-Plac tandem promoter, derived from the same or different organisms. In embodiments, the promoter is Pmtl, as described in, e.g., U.S. Pat. Nos. 7,476,532 and 8,017,355, both titled “Mannitol induced promoter systems in bacterial host cells,” incorporated by reference herein in their entirety.

[0133] Regulated (inducible) promoters utilize promoter regulatory proteins in order to control transcription of the gene of which the promoter is a part. Where a regulated promoter is used herein, a corresponding promoter regulatory protein will also be part of an expression system according to the present invention. Examples of promoter regulatory proteins include: activator proteins, e.g., E. coli catabolite activator protein, MalT protein; AraC family transcriptional activators; repressor proteins, e.g., E. coli LacI proteins; and dual function regulatory proteins, e.g., E. coli NagC protein. Many regulated-promoter/promoter-regulatory-protein pairs are known in the art.

[0134] Promoter regulatory proteins interact with an effector compound, i.e., a compound that reversibly or irreversibly associates with the regulatory protein so as to enable the protein to either release or bind to at least one DNA transcription regulatory region of the gene that is under the control of the promoter, thereby permitting or blocking the action of a transcriptase enzyme in initiating transcription of the gene. Effector compounds are classified as either inducers or co-repressors, and these compounds include native effector compounds and gratuitous inducer compounds. Many regulated-promoter/promoter-regulatory-protein/effector-compound trios are known in the art. Although an effector compound can

be used throughout the cell culture or fermentation, in a preferred embodiment in which a regulated promoter is used, after growth of a desired quantity or density of host cell biomass, an appropriate effector compound is added to the culture to directly or indirectly result in expression of the desired gene(s) encoding the protein or polypeptide of interest.

[0135] In embodiments wherein a lac family promoter is utilized, a lacI gene can also be present in the system. The lacI gene, which is normally a constitutively expressed gene, encodes the Lac repressor protein LacI protein, which binds to the lac operator of lac family promoters. Thus, where a lac family promoter is utilized, the lacI gene can also be included and expressed in the expression system.

Other Regulatory Elements

[0136] In embodiments, other regulatory elements are present in the expression construct encoding the recombinant fusion protein. In embodiments, the soluble recombinant fusion protein is present in either the cytoplasm or periplasm of the cell during production. Secretion leaders useful for targeting the fusion proteins are described elsewhere herein. In embodiments, an expression construct of the present invention encodes a recombinant fusion protein fused to a secretion leader that can transport the recombinant fusion protein to the cytoplasm of a Pseudomonad cell. In embodiments, an expression construct encodes a recombinant fusion protein fused to a secretion leader that can transport a recombinant fusion protein to the periplasm of a Pseudomonad cell. In embodiments, the secretion leader is cleaved from the recombinant fusion protein.

[0137] Other elements include, but are not limited to, transcriptional enhancer sequences, translational enhancer sequences, other promoters, activators, translational start and stop signals, transcription terminators, cistronic regulators, polycistronic regulators, tag sequences, such as nucleotide sequence “tags” and “tag” polypeptide coding sequences, which facilitate identification, separation, purification, and/or isolation of an expressed polypeptide, as previously described. In embodiments, the expression construct includes, in addition to the protein coding sequence, any of the following regulatory elements operably linked thereto: a promoter, a ribosome binding site (RBS), a transcription terminator, and translational start and stop signals. Useful RBSs can be obtained from any of the species useful as host cells in expression systems according to, e.g., U.S. Pat. App. Pub. No. 2008/0269070 and 2010/0137162, previously referenced. Many specific and a variety of consensus RBSs are known, e.g., those described in and referenced by D. Frishman et al., Gene 234(2):257-65 (8 Jul. 1999); and B. E. Suzek et al., Bioinformatics 17(12): 1123-30 (December 2001), incorporated herein by reference. In addition, either native or synthetic RBSs may be used, e.g., those described in: EP 0207459 (synthetic RBSs); O. Ikehata et al., Eur. J. Biochem. 181(3):563-70 (1989). In embodiments, a “Hi” ribosome binding site, aggagg (SEQ ID NO: 60), is used in the construct. Ribosome binding sites, including the optimization of spacing between the RBS and translation initiation codon, are described in the literature, e.g., by Chen, et al., 1994, “Determination of the optimal aligned spacing between the Shine-Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs,” Nucleic Acids Research 22(23): 4953-4957, and Ma, et al., 2002, “Correlations between Shine-Dalgarno Sequences and Gene Features Such as Predicted Expression Levels and Operon Structures,” J. Bact. 184(20): 5733-45, incorporated herein by reference.

[0138] Further examples of methods, vectors, and translation and transcription elements, and other elements useful in the present invention are well known in the art and described in, e.g.: U.S. Pat. No. 5,055,294 to Gilroy and U.S. Pat. No. 5,128,130 to Gilroy et al.; U.S. Pat. No. 5,281,532 to Rammler et al.; U.S. Pat. Nos. 4,695,455 and 4,861,595 to Barnes et al.; U.S. Pat. No. 4,755,465 to Gray et al.; and U.S. Pat. No. 5,169,760 to Wilcox, all incorporated herein by reference, as well as in many of the other publications incorporated herein by reference.

Secretion Leader Sequences

[0139] In embodiments, a secretion signal or leader coding sequence is fused to the N-terminus of the sequence encoding the recombinant fusion protein. Use of secretion signal sequences can increase production of recombinant proteins in [...]. Additionally, many types of proteins require secondary modifications that are inefficiently achieved using known methods. Secretion leader utilization can increase the harvest of properly folded proteins by secreting the protein from the intracellular environment. In Gram-negative bacteria, a protein secreted from the cytoplasm can end up in the periplasmic space, attached to the outer membrane, or in the extracellular broth. These methods also avoid formation of inclusion bodies. Secretion of proteins into the periplasmic space also has the effect of facilitating proper disulfide bond formation (Bardwell et al., 1994, Phosphate Microorg, Chapter 45, 270-5, and Manoil, 2000, Methods in Enzymol. 326: 35-47). Other benefits of secretion of recombinant protein include more efficient isolation of the protein, proper folding and disulfide bond formation of the protein leading to an increase in yield represented by, e.g., the percentage of the protein in active form, reduced formation of inclusion bodies and reduced toxicity to the host cell, and an increased percentage of the recombinant protein in soluble form. The potential for excretion of the protein of interest into the culture medium can also potentially promote continuous, rather than batch, culture for protein production.

[0140] In embodiments, the recombinant fusion protein or polypeptide of interest is targeted to the periplasm of the host cell or into the extracellular space. In embodiments, the expression vector further comprises a nucleotide sequence encoding a secretion signal polypeptide operably linked to the nucleotide sequence encoding the recombinant fusion protein or polypeptide of interest.

[0141] Therefore, in one embodiment, the recombinant fusion protein comprises a secretion signal, an N-terminal fusion partner, a linker, and a polypeptide of interest, wherein the secretion signal is N-terminal to the fusion partner. The secretion signal can be cleaved from the recombinant fusion protein when the protein is targeted to the periplasm. In embodiments, the linkage between the secretion signal and the protein or polypeptide is modified to increase cleavage of the secretion signal from the fusion protein.

Host Cells and Strains

[0142] Bacterial host cells, including Pseudomonads (i.e., host cells in the order Pseudomonadales) and closely related bacterial organisms are contemplated for use in practicing the methods of the invention. In certain embodiments, the Pseudomonad host cell is Pseudomonas fluorescens. The host cell also can be E. coli.
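As a purely illustrative sketch, and not part of the disclosed methods, the N-to-C arrangement described in this section (optional secretion signal, N-terminal fusion partner, linker, polypeptide of interest) and the release of the polypeptide by cleavage within the linker can be modeled on plain amino acid strings. The function names and the partner and polypeptide sequences below are hypothetical placeholders. The linker string is SEQ ID NO: 9 as printed in Table 2, and DDDDK is the enterokinase site of SEQ ID NO: 13. The sketch assumes the cut falls immediately after the last residue of the site, which is the generally reported behavior of enterokinase rather than something stated in this section.

```python
# Illustrative sketch only. Function names and placeholder sequences are
# hypothetical; only the linker (SEQ ID NO: 9) and the DDDDK site
# (SEQ ID NO: 13) are taken from the text.

ENTEROKINASE_SITE = "DDDDK"
LINKER_SEQ_ID_NO_9 = "GGGGSGGGGHHHHHHDDDDK"


def build_fusion(fusion_partner, linker, polypeptide, secretion_signal=""):
    """Concatenate the components in N-to-C order."""
    return secretion_signal + fusion_partner + linker + polypeptide


def cleave_at_linker(fusion, linker, site=ENTEROKINASE_SITE):
    """Split the fusion immediately after the cleavage site inside the linker.

    Assumption: the protease cuts directly after the last residue of the site.
    Returns (N-terminal fragment, released C-terminal polypeptide).
    """
    linker_start = fusion.find(linker)
    if linker_start == -1:
        raise ValueError("linker not found in fusion sequence")
    site_offset = linker.find(site)
    if site_offset == -1:
        raise ValueError("linker has no cleavage site")
    cut = linker_start + site_offset + len(site)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    partner = "MKTAYIAKQR"     # hypothetical placeholder fusion partner
    polypeptide = "ACDEFGHIL"  # hypothetical placeholder polypeptide of interest
    fusion = build_fusion(partner, LINKER_SEQ_ID_NO_9, polypeptide)
    n_fragment, released = cleave_at_linker(fusion, LINKER_SEQ_ID_NO_9)
    print("N-terminal fragment:", n_fragment)
    print("Released polypeptide:", released)
```

Locating the site relative to the linker, rather than searching the whole fusion, avoids cutting at a chance occurrence of the same motif in the fusion partner. A real construct would need that possibility checked against the actual sequences.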

[0143] Host cells and constructs useful in practicing the methods of the invention can be identified or made using reagents and methods known in the art and described in the literature, e.g., in U.S. Pat. No. 8,288,127, “Protein Expression Systems,” incorporated herein by reference in its entirety. This patent describes production of a recombinant polypeptide by introduction of a nucleic acid construct into an auxotrophic Pseudomonas fluorescens host cell comprising a chromosomal lacI gene insert. The nucleic acid construct comprises a nucleotide sequence encoding the recombinant polypeptide operably linked to a promoter capable of directing expression of the nucleic acid in the host cell, and also comprises a nucleotide sequence encoding an auxotrophic selection marker. The auxotrophic selection marker is a polypeptide that restores prototrophy to the auxotrophic host cell. In embodiments, the cell is auxotrophic for proline, uracil, or combinations thereof. In embodiments, the host cell is derived from MB101 (ATCC deposit PTA-7841). U.S. Pat. No. 8,288,127, “Protein Expression Systems,” and Schneider, et al., 2005, “Auxotrophic markers pyrF and proC can replace antibiotic markers on protein production plasmids in high-cell-density Pseudomonas fluorescens fermentation,” Biotechnol. Progress 21(2): 343-8, both incorporated herein by reference in their entirety, describe a production host strain auxotrophic for uracil that was constructed by deleting the pyrF gene in strain MB101. The pyrF gene was cloned from strain MB214 (ATCC deposit PTA-7840) to generate a plasmid that can complement the pyrF deletion to restore prototrophy. In particular embodiments, a dual pyrF-proC dual auxotrophic selection marker system in a P. fluorescens host cell is used. Given the published literature, a PyrF production host strain as described can be produced by one of skill in the art according to standard recombinant methods and used as the background for introducing other desired genomic changes, including those described herein as useful in practicing the methods of the invention.

[0144] In embodiments, the host cell is of the order Pseudomonadales (referred to herein as a “Pseudomonad”). Where the host cell is of the order Pseudomonadales, it may be a member of the family Pseudomonadaceae, including the genus Pseudomonas. Gamma Proteobacterial hosts include members of the species Escherichia coli and members of the species Pseudomonas fluorescens. Other Pseudomonas organisms may also be useful. Pseudomonads and closely related species include Gram-negative Proteobacteria Subgroup 1, which include the group of Proteobacteria belonging to the families and/or genera described as “Gram-Negative Aerobic Rods and Cocci” by R. E. Buchanan and N. E. Gibbons (eds.), Bergey’s Manual of Determinative Bacteriology, pp. 217-289 (8th ed., 1974) (The Williams & Wilkins Co., Baltimore, Md., USA), incorporated by reference herein in its entirety. (i.e., a host cell of the order Pseudomonadales) Table 3 presents these families and genera of organisms.

TABLE 3
Families and Genera (“Gram-Negative Aerobic Rods and Cocci,” Bergey's, 1974)

Family I. Pseudomonadaceae       Gluconobacter, Pseudomonas, Xanthomonas, Zoogloea
Family II. Azotobacteraceae      Azomonas, Azotobacter, Beijerinckia, Derxia
Family III. Rhizobiaceae         Agrobacterium, Rhizobium
Family IV. Methylomonadaceae     Methylococcus, Methylomonas
Family V. Halobacteriaceae       Halobacterium, Halococcus
Other Genera                     Acetobacter, Alcaligenes, Bordetella, Brucella, Francisella, Thermus

[0145] Pseudomonas and closely related bacteria are generally part of the group defined as “Gram(-) Proteobacteria Subgroup 1” or “Gram-Negative Aerobic Rods and Cocci” (Buchanan and Gibbons (eds.) (1974) Bergey’s Manual of Determinative Bacteriology, pp. 217-289). Pseudomonas host strains are described in the literature, e.g., in U.S. Pat. App. Pub. No. 2006/0040352, incorporated by reference herein in its entirety.

[0146] “Gram-negative Proteobacteria Subgroup 1” also includes Proteobacteria that would be classified in this heading according to the criteria used in the classification. The heading also includes groups that were previously classified in this section but are no longer, such as the genera Acidovorax, Brevundimonas, Burkholderia, Hydrogenophaga, Oceanimonas, Ralstonia, and Stenotrophomonas, the genus Sphingomonas (and the genus Blastomonas, derived therefrom), which was created by regrouping organisms belonging to (and previously called species of) the genus Xanthomonas, the genus Acidomonas, which was created by regrouping organisms belonging to the genus Acetobacter as defined in Bergey (1974). In addition hosts can include cells from the genus Pseudomonas, Pseudomonas enalia (ATCC 14393), Pseudomonas nigrifaciens (ATCC 19375), and Pseudomonas putrefaciens (ATCC 8071), which have been reclassified respectively as Alteromonas haloplanktis, Alteromonas nigrifaciens, and Alteromonas putrefaciens. Similarly, e.g., Pseudomonas acidovorans (ATCC 15668) and Pseudomonas testosteroni (ATCC 11996) have since been reclassified as Comamonas acidovorans and Comamonas testosteroni, respectively; and Pseudomonas nigrifaciens (ATCC 19375) and Pseudomonas piscicida (ATCC 15057) have been reclassified respectively as Pseudoalteromonas nigrifaciens and Pseudoalteromonas piscicida. “Gram-negative Proteobacteria Subgroup 1” also includes Proteobacteria classified as belonging to any of the families: Pseudomonadaceae, Azotobacteraceae (now often called by the synonym, the “Azotobacter group” of Pseudomonadaceae), Rhizobiaceae, and Methylomonadaceae (now often called by the synonym, “Methylococcaceae”). Consequently, in addition to those genera otherwise described herein, further Proteobacterial genera falling within “Gram-negative Proteobacteria Subgroup 1” include: 1) Azotobacter group bacteria of the genus Azorhizophilus; 2) Pseudomonadaceae family bacteria of the genera Cellvibrio, Oligella, and Teredinibacter; 3) Rhizobiaceae family bacteria of the genera Chelatobacter, Ensifer, Liberibacter (also called “Candidatus Liberibacter”), and Sinorhizobium; and 4) Methylococcaceae family bacteria of the genera Methylobacter, Methylocaldum, Methylomicrobium, Methylosarcina, and Methylosphaera.

[0147] The host cell can be selected from “Gram-negative Proteobacteria Subgroup 16.” “Gram-negative Proteobacteria Subgroup 16” is defined as the group of Proteobacteria of the following Pseudomonas species (with the ATCC or other deposit numbers of exemplary strain(s) shown in parenthesis): Pseudomonas abietaniphila (ATCC 700689); Pseudomonas aeruginosa (ATCC 10145); Pseudomonas alcaligenes (ATCC 14909); Pseudomonas anguilliseptica (ATCC 33660); Pseudomonas citronellolis (ATCC 13674); Pseudomonas flavescens (ATCC 51555); Pseudomonas mendocina (ATCC 25411); Pseudomonas nitroreducens (ATCC 33634); Pseudomonas oleovorans (ATCC 8062); Pseudomonas pseudoalcaligenes (ATCC 17440); Pseudomonas resinovorans (ATCC 14235); Pseudomonas straminea (ATCC 33636); Pseudomonas agarici (ATCC 25941); Pseudomonas alcaliphila; Pseudomonas alginovora; Pseudomonas andersonii; Pseudomonas asplenii (ATCC 23835); Pseudomonas azelaica (ATCC 27162); Pseudomonas beijerinckii (ATCC 19372); Pseudomonas borealis; Pseudomonas boreopolis (ATCC 33662); Pseudomonas brassicacearum; Pseudomonas butanovora (ATCC 43655); Pseudomonas cellulosa (ATCC 55703); Pseudomonas aurantiaca (ATCC 33663); Pseudomonas chlororaphis (ATCC 9446, ATCC 13985, ATCC 17418, ATCC 17461); Pseudomonas fragi (ATCC 4973); Pseudomonas lundensis (ATCC 49968); Pseudomonas taetrolens (ATCC 4683); Pseudomonas cissicola (ATCC 33616); Pseudomonas coronafaciens; Pseudomonas diterpeniphila; Pseudomonas elongata (ATCC 10144); Pseudomonas flectens (ATCC 12775); Pseudomonas azotoformans; Pseudomonas brenneri; Pseudomonas cedrella; Pseudomonas corrugata (ATCC 29736); Pseudomonas extremorientalis; Pseudomonas fluorescens (ATCC 35858); Pseudomonas gessardii; Pseudomonas libanensis; Pseudomonas mandelii (ATCC 700871); Pseudomonas marginalis (ATCC 10844); Pseudomonas migulae; Pseudomonas mucidolens (ATCC 4685); Pseudomonas orientalis; Pseudomonas rhodesiae; Pseudomonas synxantha (ATCC 9890); Pseudomonas tolaasii (ATCC 33618); Pseudomonas veronii (ATCC 700474); Pseudomonas frederiksbergensis; Pseudomonas geniculata (ATCC 19374); Pseudomonas gingeri; Pseudomonas graminis; Pseudomonas grimontii; Pseudomonas halodenitrificans; Pseudomonas halophila; Pseudomonas hibiscicola (ATCC 19867); Pseudomonas huttiensis (ATCC 14670); Pseudomonas hydrogenovora; Pseudomonas jessenii (ATCC 700870); Pseudomonas kilonensis; Pseudomonas lanceolata (ATCC 14669); Pseudomonas lini; Pseudomonas marginata (ATCC 25417); Pseudomonas mephitica (ATCC 33665); Pseudomonas denitrificans (ATCC 19244); Pseudomonas pertucinogena (ATCC 190); Pseudomonas pictorum (ATCC 23328); Pseudomonas psychrophila; Pseudomonas fulva (ATCC 31418); Pseudomonas monteilii (ATCC 700476); Pseudomonas mosselii; Pseudomonas oryzihabitans (ATCC 43272); Pseudomonas plecoglossicida (ATCC 700383); Pseudomonas putida (ATCC 12633); Pseudomonas reactans; Pseudomonas spinosa (ATCC 14606); Pseudomonas balearica; Pseudomonas luteola (ATCC 43273); Pseudomonas stutzeri (ATCC 17588); Pseudomonas amygdali (ATCC 33614); Pseudomonas avellanae (ATCC 700331); Pseudomonas caricapapayae (ATCC 33615); Pseudomonas cichorii (ATCC 10857); Pseudomonas ficuserectae (ATCC 35104); Pseudomonas fuscovaginae; Pseudomonas meliae (ATCC 33050); Pseudomonas syringae (ATCC 19310); Pseudomonas viridiflava (ATCC 13223); Pseudomonas thermocarboxydovorans (ATCC 35961); Pseudomonas thermotolerans; Pseudomonas thivervalensis; Pseudomonas vancouverensis (ATCC 700688); Pseudomonas wisconsinensis; and Pseudomonas xiamenensis. In one embodiment, the host cell is Pseudomonas fluorescens.

[0148] The host cell can also be selected from “Gram-negative Proteobacteria Subgroup 17.” “Gram-negative Proteobacteria Subgroup 17” is defined as the group of Proteobacteria known in the art as the “fluorescent Pseudomonads” including those belonging, e.g., to the following Pseudomonas species: Pseudomonas azotoformans; Pseudomonas brenneri; Pseudomonas cedrella; Pseudomonas corrugata; Pseudomonas extremorientalis; Pseudomonas fluorescens; Pseudomonas gessardii; Pseudomonas libanensis; Pseudomonas mandelii; Pseudomonas marginalis; Pseudomonas migulae; Pseudomonas mucidolens; Pseudomonas orientalis; Pseudomonas rhodesiae; Pseudomonas synxantha; Pseudomonas tolaasii; and Pseudomonas veronii.

[0149] In embodiments, a bacterial host cell used in the methods of the invention is defective in the expression of a protease. In embodiments, the bacterial host cell defective in the expression of a protease is a Pseudomonad. In embodiments, the bacterial host cell defective in the expression of a protease is a Pseudomonas. In embodiments, the bacterial host cell defective in the expression of a protease is Pseudomonas fluorescens.

[0150] In embodiments, a bacterial host cell used in the methods of the invention is not defective in the expression of a protease. In embodiments, the bacterial host cell that is not defective in the expression of a protease is a Pseudomonad. In embodiments, the bacterial host cell that is not defective in the expression of a protease is a Pseudomonas. In embodiments, the bacterial host cell that is not defective in the expression of a protease is Pseudomonas fluorescens.

[0151] In embodiments, a Pseudomonas host cell used in the methods of the invention is defective in the expression of Lon protease (e.g., SEQ ID NO: 14), Lal protease (e.g., SEQ ID NO: 15), AprA protease (e.g., SEQ ID NO: 16), or a combination thereof. In embodiments, the Pseudomonas host cell is defective in the expression of AprA (e.g., SEQ ID NO: 16), HtpX (e.g., SEQ ID NO: 17), or a combination thereof. In embodiments, the Pseudomonas host cell is defective in the expression of Lon (e.g., SEQ ID NO: 14), Lal (e.g., SEQ ID NO: 15), AprA (e.g., SEQ ID NO: 16), HtpX (e.g., SEQ ID NO: 17), or a combination thereof. In embodiments, the Pseudomonas host cell is defective in the expression of Npr (e.g., SEQ ID NO: 20), DegP1 (e.g., SEQ ID NO: 18), DegP2 (e.g., SEQ ID NO: 19), or a combination thereof. In embodiments, the Pseudomonas host cell is defective in the expression of Lal (e.g., SEQ ID NO: 15), Prc1 (e.g., SEQ ID NO: 21), Prc2 (e.g., SEQ ID NO: 22), PrthB (e.g., SEQ ID NO: 23), or a combination thereof. These proteases are known in the art and described in, e.g., U.S. Pat. No. 8,603,824, “Process for Improved Protein Expression by Strain Engineering,” U.S. Pat. App. Pub. No. 2008/0269070 and U.S. Pat. App. Pub. No. 2010/0137162, which disclose the open reading frame sequences for the proteases listed above.

[0152] Examples of P. fluorescens host strains derived from base strain MB101 (ATCC deposit PTA-7841) are useful in the methods of the present invention. In embodiments, the P. fluorescens used to express an hPTH fusion protein is, e.g., DC454, DC552, DC572, DC1084, DC1106, DC508, DC992.1, PF1201.9, PF1219.9, PF1326.1, PF1331, PF1345.6, or DC1040.1-1. In embodiments, the P. fluorescens host strain is PF1326.1. In embodiments, the P. fluorescens host strain is

PF1345.6. These and other strains useful in the methods of the invention can be readily constructed by those of skill in the art using information provided herein, recombinant DNA methods known in the art and described in the literature, and materials available, e.g., P. fluorescens strain MB101, on deposit with the ATCC as described.

Expression Strains

[0153] Expression strains useful for practicing the methods of the invention can be constructed using methods described herein and in the published literature. In embodiments, an expression strain useful in the methods of the invention comprises a plasmid overexpressing one or more P. fluorescens chaperone or folding modulator protein. For example, DnaJ-like protein, FrnE, FklB, or EcpD, can be overexpressed in the expression strain. In embodiments, a P. fluorescens folding modulator overexpression (FMO) plasmid encodes ClpX, FklB3, FrnE, ClpA, Fkbp, or ppiA. An example of an expression plasmid encoding Fkbp is pDOW1384-1. In embodiments, an expression plasmid not encoding a folding modulator is introduced into an expression strain. In these embodiments, the plasmid is, e.g., pDOW2247. In embodiments, a P. fluorescens expression strain useful for expressing an hPTH fusion protein in the methods of the invention is STR35970, STR35984, STR36034, STR36085, STR36150, STR36169, STR35949, STR36098, or STR35783, as described elsewhere herein.

[0154] In embodiments, a P. fluorescens host strain used in the methods of the invention is DC1106 (mtlDYZ knock-out mutant ΔpyrF ΔproC ΔbenAB lsc::lacIQ1), a derivative of deposited strain MB101 in which the genes pyrF, proC, benA, benB, and mtlDYZ from the mannitol (mtl) operon are deleted, and the E. coli lacI transcriptional repressor is inserted and fused with the levansucrase gene (lsc). Sequences for these genes and methods for their use are known in the art and described in the literature, e.g., in U.S. Pat. Nos. 8,288,127 and 8,017,355, “Mannitol induced promoter systems in bacterial host cells,” and U.S. Pat. No. 7,794,972, “Benzoate- and anthranilate-inducible promoters,” each incorporated by reference herein.

[0155] A host cell equivalent to DC1106 or any of the host cells or expression strains described herein can be constructed from MB101 using methods described herein and in the published literature. In embodiments, a host cell equivalent to DC1106 is used. Host cell DC454 is described by Schneider, et al., 2005, where it is referred to as DC206, and in U.S. Pat. No. 8,569,015, “rPA Optimization,” incorporated herein by reference in its entirety. DC206 is the same strain as DC454; it was renamed DC454 after passage three times in animal-free media.

[0156] One with ordinary skill in the art will appreciate that in embodiments, a genomic deletion or mutation (e.g., an inactivating or debilitating mutation) can be made by, e.g., allele exchange, using a deletion plasmid carrying regions that flank the gene to be deleted, which does not replicate in P. fluorescens. The deletion plasmid can be constructed by PCR amplifying the gene to be deleted, including the upstream and downstream regions of the gene to be deleted. The deletion can be verified by sequencing a PCR product amplified from genomic DNA using analytical primers, observed after separation by electrophoresis in an agarose slab gel, followed by DNA sequencing of the fragment. In embodiments, a gene is inactivated by complete deletion, partial deletion, or mutation, e.g., frameshift, point, or insertion mutation.

[0157] In embodiments, a strain used has been transformed with an FMO plasmid according to methods known in the art. For example, DC1106 host cells can be transformed with FMO plasmid pDOW1384, which overexpresses FkbP (RXF06591.1), a folding modulator belonging to the peptidyl-prolyl cis-trans isomerase family, to generate the expression strain STR36034. The genotypes for certain examples of hPTH fusion protein expression strains and corresponding host cells useful for expressing hPTH according to the methods of the invention are set forth in Table 4. In embodiments, a host cell equivalent to any host cell described in Table 4 is transformed with an equivalent FMO plasmid as described herein, to obtain an expression strain equivalent to one described herein for expressing hPTH 1-34 using the methods of the invention. As discussed, appropriate expression strains can be similarly derived according to methods described herein and in the literature.

TABLE 4
P. fluorescens Host Cells and Expression Strains for PTH 1-34 Fusion Protein Production

Host Strain | Expression Strain | Protease Deletions | FMO Plasmid | Fusion Protein
DC508-1 | STR35970 | M50 S2P Protease Family Membrane metalloprotease | | DnaJ-like protein-PTH
DC992.1 | STR35984 | PriC, AprA | pDOW2247 (empty vector; no folding modulator) | DnaJ-like protein-PTH
DC1084-1 | STR35949 | Lon, La1, DegP2 | pDOW2247 | DnaJ-like protein-PTH
PF1201.9 | STR35985 | AprA, Lon, La1, DegP1, DegP2, Prc1 | pDOW2247 | DnaJ-like protein-PTH
PF1326.1 | STR36005 | HtpX, AprA | pDOW2247 | DnaJ-like protein-PTH
DC1106-1 | STR36034 | AprA, Lon, La1 | pDOW1384-1 FkbP (RXF06591.1) | FklB-PTH
PF1326.1 | STR36085 | HtpX, AprA | pDOW2247 | FklB-PTH
PF1345.6 | STR36098 | HtpX, AprA, Lon, La1 | pDOW2247 | FklB-PTH
DC1040.1-1 | STR35783 | rxf04495 (Serralysin), AprA | pDOW2247 | FklB-PTH
PF1219.9 | STR36150 | Npr, DegP1, DegP2 | | FrnE-PTH
PF1331 | STR36169 | La1, Prc1, Prc2, PrtB | | FrnE-PTH

[0158] In embodiments, a host cell or strain listed in Table 4, or equivalent to any host cell or strain described in Table 4, is used to express a fusion protein comprising a polypeptide of interest as described herein, using the methods of the invention. In embodiments, a host cell or strain listed in Table 4, or equivalent to any host cell or strain described in Table 4, is used to express a fusion protein comprising hPTH, GCSF, or an insulin polypeptide, e.g., a proinsulin as described herein, using the methods of the invention. In embodiments, a wild-type host cell, e.g., DC454 or an equivalent, is used to express a fusion protein comprising a polypeptide of interest as described herein, using the methods of the invention.

[0159] The sequences of these and other proteases and folding modulators useful for generating host strains of the present invention are known in the art and published in the literature, for example, as provided in Tables A to F of U.S. Pat. No. 8,603,824, described above and incorporated by reference herein in its entirety. For example, the M50 S2P Protease Family Membrane metalloprotease open reading frame sequence is provided therein as RXF04692.

High Throughput Screens

[0160] In some embodiments, a high throughput screen can be conducted to determine optimal conditions for expressing a soluble recombinant fusion protein. The conditions that can be varied in the screen include, for example, the host cell, genetic background of the host cell (e.g., deletions of different proteases), type of promoter in an expression construct, type of secretion leader fused to the sequence encoding the recombinant protein, growth temperature, OD at induction when an inducible promoter is used, concentration of IPTG used for induction when a lacZ promoter is used, duration of protein induction, growth temperature following addition of an inducing agent to a culture, rate of agitation of culture, method of selection for plasmid maintenance, volume of culture in a vessel, and method of cell lysing.

[0161] In some embodiments, a library (or “array”) of host strains is provided, wherein each strain (or “population of host cells”) in the library has been genetically modified to modulate the expression of one or more target genes in the host cell. An “optimal host strain” or “optimal expression system” can be identified or selected based on the quantity, quality, and/or location of the expressed recombinant fusion protein compared to other populations of phenotypically distinct host cells in the array. Thus, an optimal host strain is the strain that produces the recombinant fusion protein according to a desired specification. While the desired specification will vary depending on the protein being produced, the specification includes the quality and/or quantity of protein, e.g., whether the protein is sequestered or secreted, and in what quantities, whether the protein is properly or desirably processed and/or folded, and the like. In embodiments, improved or desirable quality can be production of the recombinant fusion protein with high titer expression and low levels of degradation. In embodiments, the optimal host strain or optimal expression system produces a yield, characterized by the amount or quantity of soluble recombinant fusion protein, the amount or quantity of recoverable recombinant fusion protein, the amount or quantity of properly processed recombinant fusion protein, the amount or quantity of properly folded recombinant fusion protein, the amount or quantity of active recombinant fusion protein, and/or the total amount or quantity of recombinant fusion protein, of a certain absolute level or a certain level relative to that produced by an indicator strain, i.e., a strain used for comparison.

[0162] Methods of screening microbial hosts to identify strains with improved yield and/or quality in the expression of recombinant fusion proteins are described, e.g., in U.S. Patent Application Publication No. 2008/0269070.

Fermentation Format

[0163] An expression strain of the present invention can be cultured in any fermentation format. For example, batch, fed-batch, semi-continuous, and continuous fermentation modes may be employed herein.

[0164] In embodiments, the fermentation medium may be selected from among rich media, minimal media, and mineral salts media. In other embodiments either a minimal medium or a mineral salts medium is selected. In certain embodiments, a mineral salts medium is selected.

[0165] Mineral salts media consist of mineral salts and a carbon source such as, e.g., glucose, sucrose, or glycerol. Examples of mineral salts media include, e.g., M9 medium, Pseudomonas medium (ATCC 179), and Davis and Mingioli medium (see, Davis, B. D., and Mingioli, E. S., 1950, J. Bact. 60:17-28). The mineral salts used to make mineral salts media include those selected from among, e.g., potassium phosphates, ammonium sulfate or chloride, magnesium sulfate or chloride, and trace minerals such as calcium chloride, borate, and sulfates of iron, copper, manganese, and zinc. Typically, no organic nitrogen source, such as peptone, tryptone, amino acids, or a yeast extract, is included in a mineral salts medium. Instead, an inorganic nitrogen source is used and this may be selected from among, e.g., ammonium salts, aqueous ammonia, and gaseous ammonia. A mineral salts medium will typically contain glucose or glycerol as the carbon source. In comparison to mineral salts media, minimal media can also contain mineral salts and a carbon source, but can be supplemented with, e.g., low levels of amino acids, vitamins, peptones, or other ingredients, though these are added at very minimal levels. Suitable media for use in the methods of the present invention can be prepared using methods described in the literature, e.g., in U.S. Pat. App. Pub. No. 2006/0040352, referenced and incorporated by reference above. Details of cultivation procedures and mineral salts media useful in the methods of the present invention are described by Riesenberg, D. et al., 1991, “High cell density cultivation of Escherichia coli at controlled specific growth rate,” J. Biotechnol. 20(1):17-27, incorporated by reference herein.

[0166] In embodiments, production can be achieved in bioreactor cultures. Cultures can be grown in, e.g., up to 2 liter bioreactors containing a mineral salts medium, and maintained at 32° C. and pH 6.5 through the addition of ammonia. Dissolved oxygen can be maintained in excess through increases in agitation and flow of sparged air and oxygen into the fermentor. Glycerol can be delivered to the culture throughout the fermentation to maintain excess levels. In embodiments, these conditions are maintained until a target culture cell density, e.g., an optical density at 575 nm (A575), for induction is reached and IPTG is added to initiate the target protein production. It is understood that the cell density at induction, the concentration of IPTG, pH, temperature, CaCl2 concentration, and dissolved oxygen flow rate each can be varied to determine optimal conditions for expression. In embodiments, cell density at induction can be varied from A575 of 40 to 200 absorbance units (AU). IPTG concentrations can be varied in the range from 0.02 to 1.0 mM, pH from 6 to 7.5, temperature from 20 to 35° C., CaCl2 concentration from 0 to 0.5 g/L, and the dissolved oxygen flow rate from 1 LPM (liters per minute) to 10 LPM. After 6-48 hours, the culture from each bioreactor can be harvested by centrifugation and the cell pellet frozen at -80° C. Samples can then be analyzed, e.g., by SDS-CGE, for product formation.

[0167] Fermentation may be performed at any scale. The expression systems according to the present invention are useful for recombinant protein expression at any scale. Thus, e.g., microliter-scale, milliliter scale, centiliter scale, and deciliter scale fermentation volumes may be used, and 1 Liter scale and larger fermentation volumes can be used.

[0168] In embodiments, the fermentation volume is at or above about 1 Liter. In embodiments, the fermentation volume is about 1 Liter to about 100 Liters. In embodiments, the fermentation volume is about 1 Liter, about 2 Liters, about 3 Liters, about 4 Liters, about 5 Liters, about 6 Liters, about 7 Liters, about 8 Liters, about 9 Liters, or about 10 Liters. In embodiments, the fermentation volume is about 1 Liter to about 5 Liters, about 1 Liter to about 10 Liters, about 1 Liter to about 25 Liters, about 1 Liter to about 50 Liters, about 1 Liter to about 75 Liters, about 10 Liters to about 25 Liters, about 25 Liters to about 50 Liters, or about 50 Liters to about 100 Liters. In other embodiments, the fermentation volume is at or above 5 Liters, 10 Liters, 15 Liters, 20 Liters, 25 Liters, 50 Liters, 75 Liters, 100 Liters, 200 Liters, 250 Liters, 300 Liters, 500 Liters, 1,000 Liters, 2,000 Liters, 5,000 Liters, 10,000 Liters, or 50,000 Liters.

[0169] In general, the amount of a recombinant protein yielded by a larger culture volume, e.g., a 50 mL shake-flask culture, a 1 Liter culture, or greater, is increased relative to that observed in a smaller culture volume, e.g., a 0.5 mL high-throughput screening culture. This can be due to not only the increase in culture size but, e.g., the ability to grow cells to a higher density in large-scale fermentation (e.g., as reflected by culture absorbance). For example, the volumetric yield from the same strain can increase up to ten-fold from HTP scale to large-scale fermentation. In embodiments, the volumetric yield observed for the same expression strain is 2-fold to 10-fold greater following large-scale fermentation than HTP scale growth. In embodiments, the yield observed for the same expression strain is 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 2-fold to 10-fold, 2-fold to 9-fold, 2-fold to 8-fold, 2-fold to 7-fold, 2-fold to 6-fold, 2-fold to 5-fold, 2-fold to 4-fold, 2-fold to 3-fold, 3-fold to 10-fold, 3-fold to 9-fold, 3-fold to 8-fold, 3-fold to 7-fold, 3-fold to 6-fold, 3-fold to 5-fold, 3-fold to 4-fold, 4-fold to 10-fold, 4-fold to 9-fold, 4-fold to 8-fold, 4-fold to 7-fold, 4-fold to 6-fold, 4-fold to 5-fold, 5-fold to 10-fold, 5-fold to 9-fold, 5-fold to 8-fold, 5-fold to 7-fold, 5-fold to 6-fold, 6-fold to 10-fold, 6-fold to 9-fold, 6-fold to 8-fold, 6-fold to 7-fold, 7-fold to 10-fold, 7-fold to 9-fold, 7-fold to 8-fold, 8-fold to 10-fold, 8-fold to 9-fold, or 9-fold to 10-fold greater following large-scale fermentation than following HTP-scale growth. See, e.g., Retallack, et al., 2012, “Reliable protein production in a Pseudomonas fluorescens expression system,” Prot. Exp. and Purif. 81:157-165, incorporated herein by reference in its entirety.

Bacterial Growth Conditions

[0170] Growth conditions useful in the methods of the provided invention can comprise a temperature of about 4° C. to about 42° C. and a pH of about 5.7 to about 8.8. When an expression construct with a lacZ promoter is used, expression can be induced by adding IPTG to a culture at a final concentration of about 0.01 mM to about 1.0 mM.

[0171] The pH of the culture can be maintained using pH buffers and methods known to those of skill in the art. Control of pH during culturing also can be achieved using aqueous ammonia. In embodiments, the pH of the culture is about 5.7 to about 8.8. In embodiments, the pH is about 5.7, 5.8, 5.9, 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, or 8.8. In embodiments, the pH is about 5.7 to about 8.8, about 5.7 to about 8.5, about 5.7 to about 8.3, about 5.7 to about 8, about 5.7 to about 7.8, about 5.7 to about 7.6, about 5.7 to about 7.4, about 5.7 to about 7.2, about 5.7 to about 7, about 5.7 to about 6.8, about 5.7 to about 6.6, about 5.7 to about 6.4, about 5.7 to about 6.2, about 5.7 to about 6, about 5.9 to about 8.8, about 5.9 to about 8.5, about 5.9 to about 8.3, about 5.9 to about 8, about 5.9 to about 7.8, about 5.9 to about 7.6, about 5.9 to about 7.4, about 5.9 to about 7.2, about 5.9 to about 7, about 5.9 to about 6.8, about 5.9 to about 6.6, about 5.9 to about 6.4, about 5.9 to about 6.2, about 6 to about 8.8, about 6 to about 8.5, about 6 to about 8.3, about 6 to about 8, about 6 to about 7.8, about 6 to about 7.6, about 6 to about 7.4, about 6 to about 7.2, about 6 to about 7, about 6 to about 6.8, about 6 to about 6.6, about 6 to about 6.4, about 6 to about 6.2, about 6.1 to about 8.8, about 6.1 to about 8.5, about 6.1 to about 8.3, about 6.1 to about 8, about 6.1 to about 7.8, about 6.1 to about 7.6, about 6.1 to about 7.4, about 6.1 to about 7.2, about 6.1 to about 7, about 6.1 to about 6.8, about 6.1 to about 6.6, about 6.1 to about 6.4, about 6.2 to about 8.8, about 6.2 to about 8.5, about 6.2 to about 8.3, about 6.2 to about 8, about 6.2 to about 7.8, about 6.2 to about 7.6, about 6.2 to about 7.4, about 6.2 to about 7.2, about 6.2 to about 7, about 6.2 to about 6.8, about 6.2 to about 6.6, about 6.2 to about 6.4, about 6.4 to about 8.8, about 6.4 to about 8.5, about 6.4 to about 8.3, about 6.4 to about 8, about 6.4 to about 7.8, about 6.4 to about 7.6, about 6.4 to about 7.4, about 6.4 to about 7.2, about 6.4 to about 7, about 6.4 to about 6.8, about 6.4 to about 6.6, about 6.6 to about 8.8, about 6.6 to about 8.5, about 6.6 to about 8.3, about 6.6 to about 8, about 6.6 to about 7.8, about 6.6 to about 7.6, about 6.6 to about 7.4, about 6.6 to about 7.2, about 6.6 to about 7, about 6.6 to about 6.8, about 6.8 to about 8.8, about 6.8 to about 8.5, about 6.8 to about 8.3, about 6.8 to about 8, about 6.8 to about 7.8, about 6.8 to about 7.6, about 6.8 to about 7.4, about 6.8 to about 7.2, about 6.8 to about 7, about 7 to about 8.8, about 7 to about 8.5, about 7 to about 8.3, about 7 to about 8, about 7 to about 7.8, about 7 to about 7.6, about 7 to about 7.4, about 7 to about 7.2, about 7.2 to about 8.8, about 7.2 to about 8.5, about 7.2 to about 8.3, about 7.2 to about 8, about 7.2 to about 7.8, about 7.2 to about 7.6, about 7.2 to about 7.4, about 7.4 to about 8.8, about 7.4 to about 8.5, about 7.4 to about 8.3, about 7.4 to about 8, about 7.4 to about 7.8, about 7.4 to about 7.6, about 7.6 to about 8.8, about 7.6 to about 8.5, about 7.6 to about 8.3, about 7.6 to about 8, about 7.6 to about 7.8, about 7.8 to about 8.8, about 7.8 to about 8.5, about 7.8 to about 8.3, about 7.8 to about 8, about 8 to about 8.8, about 8 to about 8.5, or about 8 to about 8.3. In embodiments, the pH is about 6.5 to about 7.2.

[0172] In embodiments, the growth temperature is maintained at about 4° C. to about 42° C. In embodiments, the growth temperature is about 4° C., about 5° C., about 6° C., about 7° C., about 8° C., about 9° C., about 10° C., about 11° C., about 12° C., about 13° C., about 14° C., about 15° C., about 16° C., about 17° C., about 18° C., about 19° C., about 20° C., about 21° C., about 22° C., about 23° C., about 24° C., about 25° C., about 26° C., about 27° C., about 28° C., about 29° C., about 30° C., about 31° C., about 32° C., about 33° C., about 34° C., about 35° C., about 36° C., about 37° C., about 38° C., about 39° C., about 40° C., about 41° C., or about 42° C. In embodiments, the growth temperature is about 25° C. to about 32° C. In embodiments, the growth temperature is maintained at about 22° C. to about 27° C., about 22° C. to about 28° C., about 22° C. to about 29° C., about 22° C. to about 30° C., about 23° C. to about 27° C., about 23° C. to about 28° C., about 23° C. to about 29° C., about 23° C. to about 30° C., about 24° C. to about 27° C., about 24° C. to about 28° C., about 24° C. to about 29° C., about 24° C. to about 30° C., about 25° C. to about 27° C., about 25° C. to about 28° C., about 25° C. to about 29° C., about 25° C. to about 30° C., about 25° C. to about 31° C., about 25° C. to about 32° C., about 25° C. to about 33° C., about 26° C. to about 28° C., about 26° C. to about 29° C., about 26° C. to about 30° C., about 26° C. to about 31° C., about 26° C. to about 32° C., about 26° C. to about 33° C., about 27° C. to about 29° C., about 27° C. to about 30° C., about 27° C. to about 31° C., about 27° C. to about 32° C., about 27° C. to about 33° C., about 28° C. to about 30° C., about 28° C. to about 31° C., about 28° C. to about 32° C., about 29° C. to about 31° C., about 29° C. to about 32° C., about 29° C. to about 33° C., about 30° C. to about 32° C., about 30° C. to about 33° C., about 31° C. to about 33° C., about 31° C. to about 32° C., about 21° C. to about 42° C., about 22° C. to about 42° C., about 23° C. to about 42° C., about 24° C. to about 42° C., or about 25° C. to about 42° C. In embodiments, the growth temperature is about 25° C. to about 28.5° C. In embodiments, the growth temperature is above about 20° C., above about 21° C., above about 22° C., above about 23° C., above about 24° C., above about 25° C., above about 26° C., above about 27° C., above about 28° C., above about 29° C., or above about 30° C.

[0173] In embodiments, the temperature is changed during culturing. In embodiments, the temperature is maintained at about 30° C. to about 32° C. before an agent, e.g., IPTG, is added to the culture to induce expression from the construct, and after adding the induction agent, the temperature is reduced to about 25° C. to about 28° C. In embodiments, the temperature is maintained at about 30° C. before an agent, e.g., IPTG, is added to the culture to induce expression from the construct, and after adding the induction agent, the temperature is reduced to about 25° C.

[0174] As described elsewhere herein, inducible promoters can be used in the expression construct to control expression of the recombinant fusion protein, e.g., a lac promoter. In the case of the lac promoter derivatives or family members, e.g., the tac promoter, the effector compound is an inducer, such as a gratuitous inducer like IPTG. In embodiments, a lac promoter derivative is used, and recombinant protein expression is induced by the addition of IPTG to a final concentration of about 0.01 mM to about 1.0 mM, when the cell density has reached a level identified by an OD575 of about 40 to about 180. In embodiments, the OD575 at the time of culture induction for the recombinant protein can be about 40, about 50, about 60, about 70, about 80, about 90, about 110, about 120, about 130, about 140, about 150, about 160, about 170, or about 180. In other embodiments, the OD575 is about 40 to about 50, about 50 to about 60, about 60 to about 70, about 70 to about 80, about 80 to about 90, or about 90 to about 100. In other embodiments, the OD575 is about 40 to about 100, about 100 to about 120, about 120 to about 130, about 130 to about 140, about 140 to about 150, about 150 to about 160, about 160 to about 170, or about 170 to about 180. In other embodiments, the OD575 is about 40 to about 140, or about 80 to 180. The cell density can be measured by other methods and expressed in other units, e.g., in cells per unit volume. For example, an OD575 of about 40 to about 160 of a P. fluorescens culture is equivalent to approximately 4×10¹⁰ to about 1.6×10¹¹ colony forming units per mL or 17.5 to 70 g/L dry cell weight. In embodiments, the cell density at the time of culture induction is equivalent to the cell density as specified herein by the absorbance at OD575, regardless of the method used for determining cell density or the units of measurement. One of skill in the art will know how to make the appropriate conversion for any cell culture.

[0175] In embodiments, the final IPTG concentration of the culture is about 0.01 mM, about 0.02 mM, about 0.03 mM, about 0.04 mM, about 0.05 mM, about 0.06 mM, about 0.07 mM, about 0.08 mM, about 0.09 mM, about 0.1 mM, about 0.2 mM, about 0.3 mM, about 0.4 mM, about 0.5 mM, about 0.6 mM, about 0.7 mM, about 0.8 mM, about 0.9 mM, or about 1 mM. In embodiments, the final IPTG concentration of the culture is about 0.08 mM to about 0.1 mM, about 0.1 mM to about 0.2 mM, about 0.2 mM to about 0.3 mM, about 0.3 mM to about 0.4 mM, about 0.2 mM to about 0.4 mM, about 0.08 to about 0.2 mM, or about 0.1 to 1 mM.

[0176] In embodiments wherein a non-lac type promoter is used, as described herein and in the literature, other inducers or effectors can be used. In one embodiment, the promoter is a constitutive promoter.

[0177] After adding an inducing agent, cultures can be grown for a period of time, for example about 24 hours, during which time the recombinant protein is expressed. After adding an inducing agent, a culture can be grown for about 1 hr, about 2 hr, about 3 hr, about 4 hr, about 5 hr, about 6 hr, about 7 hr, about 8 hr, about 9 hr, about 10 hr, about 11 hr, about 12 hr, about 13 hr, about 14 hr, about 15 hr, about 16 hr, about 17 hr, about 18 hr, about 19 hr, about 20 hr, about 21 hr, about 22 hr, about 23 hr, about 24 hr, about 36 hr, or about 48 hr. After an inducing agent is added to a culture, the culture can be grown for about 1 to 48 hr, about 1 to 24 hr, about 1 to 8 hr, about 10 to 24 hr, about 15 to 24 hr, or about 20 to 24 hr. Cell cultures can be concentrated by centrifugation, and the culture pellet resuspended in a buffer or solution appropriate for the subsequent lysis procedure.

[0178] In embodiments, cells are disrupted using equipment for high pressure mechanical cell disruption (which are available commercially, e.g., Microfluidics Microfluidizer, Constant Cell Disruptor, Niro-Soavi homogenizer or APV Gaulin homogenizer). Cells expressing the recombinant protein can be disrupted, for example, using sonication. Any appropriate method known in the art for lysing cells can be used to release the soluble fraction. For example, in embodiments, chemical and/or enzymatic cell lysis reagents, such as cell-wall lytic enzyme and EDTA, can be used. Use of frozen or previously stored cultures is also contemplated in the methods of the invention. Cultures can be OD-normalized prior to lysis. For example, cells can be normalized to an OD600 of about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20.

[0179] Centrifugation can be performed using any appropriate equipment and method. Centrifugation of cell culture or lysate for the purposes of separating a soluble fraction from an insoluble fraction is well-known in the art. For example, lysed cells can be centrifuged at 20,800×g for 20 minutes (at 4° C.), and the supernatants removed using manual or automated liquid handling. The cell pellet obtained by centrifugation of cell culture, or the insoluble fraction obtained by centrifugation of cell lysate, can be resuspended in a buffered solution. Resuspension of the cell pellet or insoluble fraction can be carried out using, e.g., equipment such as impellers connected to an overhead mixer, magnetic stir-bars, rocking shakers, etc.

Non-Denaturing Conditions

[0180] Lysis of the induced host cells is carried out under non-denaturing conditions. In embodiments, the non-denaturing conditions comprise use of a non-denaturing treatment buffer, e.g., to resuspend the cell pellet or paste. In embodiments, the non-denaturing treatment buffer comprises sodium phosphate or Tris buffer, glycerol, and sodium chloride. In embodiments wherein affinity chromatography is carried out by immobilized metal affinity chromatography (IMAC), the non-denaturing treatment buffer comprises imidazole. In embodiments, the non-denaturing treatment buffer comprises 0 to 50 mM imidazole. In embodiments, the non-denaturing treatment buffer comprises no imidazole. In embodiments, the non-denaturing treatment buffer comprises 25 mM imidazole. In embodiments, the non-denaturing treatment buffer comprises 10-30 mM sodium phosphate or Tris, pH 7 to 9. In embodiments, the non-denaturing treatment buffer has a pH of 7.3, 7.4, or 7.5. In embodiments, the non-denaturing treatment buffer comprises 2-10% glycerol. In embodiments, the non-denaturing treatment buffer comprises 50 mM to 750 mM NaCl. In embodiments, the cell paste is resuspended to 10-50% solids. In embodiments, the non-denaturing treatment buffer comprises 20 mM sodium phosphate, 5% glycerol, 500 mM sodium chloride, 20 mM imidazole, at pH 7.4, and is resuspended to 20% solids. In embodiments, the non-denaturing treatment buffer comprises 20 mM Tris, 50 mM NaCl, at pH 7.5, and is resuspended to 20% solids.

[0181] In embodiments, the non-denaturing treatment buffer does not comprise a chaotropic agent. Chaotropic agents disrupt the 3-dimensional structure of a protein or nucleic acid, causing denaturation. In embodiments, the non-denaturing treatment buffer comprises a non-denaturing concentration of a chaotropic agent. In embodiments, the chaotropic agent is, e.g., urea or guanidinium hydrochloride. In embodiments, the non-denaturing treatment buffer comprises 0 to 4M urea or guanidinium hydrochloride. In embodiments, the non-denaturing treatment buffer comprises urea or guanidinium hydrochloride at a concentration of less than 4M, less than 3.5M, less than 3M, less than 2.5M, less than 2M, less than 1.5M, less than 1M, less than 0.5M, about 0.1M, about 0.2M, about 0.3M, about 0.4M, about 0.5M, about 0.6M, about 0.7M, about 0.8M, about 0.9M, about 1.0M, about 1.1M, about 1.2M, about 1.3M, about 1.4M, about 1.5M, about 1.6M, about 1.7M, about 1.8M, about 1.9M, about 2.0M, about 2.1M, about 2.2M, about 2.3M, about 2.4M, about 2.5M, about 2.6M, about 2.7M, about 2.8M, about 2.9M, about 3M, about 3.1M, about 3.2M, about 3.3M, about 3.4M, about 3.5M, about 3.6M, about 3.7M, about 3.8M, about 3.9M, about 4M, about 0.5 to about 3.5M, about 0.5 to about 3M, about 0.5 to about 2.5M, about 0.5 to about 2M, about 0.5 to about 1.5M, about 0.5 to about 1M, about 1 to about 4M, about 1 to about 3.5M, about 1 to about 3M, about 1 to about 2.5M, about 1 to about 2M, about 1 to about 1.5M, about 1.5 to about 4M, about 1.5 to about 3.5M, about 1.5 to about 3M, about 1.5 to about 2.5M, about 1.5 to about 2M, about 2 to about 4M, about 2 to about 3.5M, about 2 to about 3M, about 2 to about 2.5M, about 2.5 to about 4M, about 2.5

column. In embodiments the non-denaturing treatment buffer comprises additional components, e.g., imidazole for IMAC as described elsewhere herein.

[0183] It is understood by those of skill in the art that a denaturing concentration of a chaotropic agent may be influenced by the pH, and that the denaturing levels depend on the characteristics of the protein. For example, the pH can be increased to cause protein denaturation despite a lower concentration of a chaotropic agent.

Product Evaluation

[0184] The quality of the produced recombinant fusion protein or polypeptide of interest can be evaluated by any method known in the art or described in the literature. In embodiments, denaturation of a protein is evaluated based on its solubility, or by lack or loss of biological activity. For many proteins biological activity assays are commercially available. A biological activity assay can include, e.g., an antibody binding assay. In embodiments, physical characterization of the recombinant fusion protein or polypeptide of interest is carried out using methods available in the art, e.g., chromatography and spectrophotometric methods. Evaluation of the polypeptide of interest can include a determination that it has been properly released, e.g., its N-terminus is intact.

[0185] The activity of hPTH, e.g., hPTH 1-34 or 1-84, can be evaluated using any method known in the art or described herein or in the literature, e.g., using antibodies that recognize the N-terminus of the protein. Methods include, e.g., intact mass analysis. PTH bioactivity can be measured by, e.g., cAMP ELISA, homogeneous time-resolved fluorescence (HTRF) assay (Charles River Laboratories), or as described by Nissenson, et al., 1985, “Activation of the Parathyroid Hormone Receptor-Adenylate Cyclase System in Osteosarcoma Cells by a Human Renal Carcinoma Factor,” Cancer Res. 45:5358-5363, and U.S. Pat. No. 7,150,974, “Parathyroid Hormone Receptor Binding Method,” each incorporated by reference herein. Methods of evaluating PTH also are described by Shimizu, et al., 2001, “Parathyroid hormone (1-14) and (1-11) analogs conformationally constrained by α-aminoisobutyric acid mediate full agonist responses via the juxtamembrane region of the PTH-1 receptor,” J. Biol. Chem. 276:49003-49012, incorporated by reference herein.

Purification of the Recombinant Fusion Protein and Polypeptide of Interest

[0186] The solubilized recombinant fusion protein or polypeptide of interest can be isolated or purified from other protein and cellular debris by any method known by those of skill in the art or described in the literature, for example, centrifugation methods and/or chromatography methods such as size exclusion, anion or cation exchange, hydrophobic interaction, or affinity chromatography.
In embodiments, to about 3.5M, about 2.5 to about 3M, about 3 to about 4M, the solubilized protein can be purified using Fast Perfor about 3 to about 3.5M, or 0.5 to about 1M. mance Liquid Chromatography (FPLC). FPLC is a form of 0182. In embodiments wherein a non-denaturing treat liquid chromatography used to separate proteins based on ment buffer is used, the cell paste is slurried at 20% solids in affinity towards various resins. In embodiments, the affinity 20 mM Tris, 50 mM. NaCl, 4 Murea, pH 7.5, for about 1-2.5 tag expressed with the fusion proteins causes the fusion pro hours at 2-8°C. In embodiments the cell paste is subjected to tein, dissolved in a solubilization buffer, to bind to a resin, lysis with a Niro homogenizer, e.g., at 15,000 psi, and batch while the impurities are carried out in the solubilization centrifuged 35 minutes at 14,000xg or continuous centrifuge buffer. Subsequently, an elution buffer is used, in gradually at 15,000xg and 340 mL/min feed, the supe/centrate filtered increasing gradient or added in a step-wise manner, to disso with a depth filter and a membrane filter, diluted 2x in resus ciate the fusion protein from the ion exchange resin and pension buffer, e.g., 1xPBS pH 7.4, and loaded to a capture isolate the pure fusion protein, in the elution buffer. US 2016/0159877 A1 Jun. 9, 2016 26
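The two elution modes mentioned above (a gradually increasing gradient versus step-wise addition of elution buffer) can be sketched numerically. The following is only an illustrative sketch, not a method taken from this disclosure: the function names and the step schedule are hypothetical, and the example ramp (0 mM to 200 mM imidazole over 10 column volumes) borrows the values given herein for the His-tag embodiment.

```python
from bisect import bisect_right


def linear_gradient(cv, start_mm, end_mm, length_cv):
    """Eluent concentration (mM) after `cv` column volumes of a linear ramp."""
    if length_cv <= 0:
        raise ValueError("length_cv must be positive")
    # Clamp so the concentration stays flat before and after the ramp.
    frac = min(max(cv / length_cv, 0.0), 1.0)
    return start_mm + frac * (end_mm - start_mm)


def step_gradient(cv, steps):
    """Eluent concentration (mM) for a step-wise schedule.

    `steps` is a list of (start_cv, concentration_mm) pairs sorted by
    start_cv; each concentration holds until the next step begins.
    """
    starts = [start for start, _ in steps]
    idx = bisect_right(starts, cv) - 1
    if idx < 0:
        raise ValueError("cv precedes the first step")
    return steps[idx][1]


if __name__ == "__main__":
    # Linear ramp: 0 -> 200 mM imidazole over 10 column volumes.
    for cv in (0, 2.5, 5, 10):
        print(cv, linear_gradient(cv, 0.0, 200.0, 10.0))

    # Hypothetical step schedule (assumed values, for illustration only).
    hypothetical_steps = [(0, 0.0), (3, 50.0), (6, 200.0)]
    print(step_gradient(4, hypothetical_steps))
```

A linear ramp resolves species that dissociate at nearby eluent concentrations, whereas a step schedule is simpler to run once the dissociating concentration is known; which is preferable depends on the protein and resin.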

[0187] In embodiments, after the completion of induction, the fermentation broth is harvested by centrifugation, e.g., at 15,900×g for 60 to 90 minutes. The cell paste and supernatant are separated and the paste is frozen at -80° C. The frozen cell paste is thawed in a buffer as described elsewhere herein, e.g., a non-denaturing buffer or buffer with no urea. In embodiments, the frozen cell paste is thawed in and resuspended in 20 mM sodium phosphate, 5% glycerol, 500 mM sodium chloride, pH 7.4. In embodiments, the buffer comprises imidazole. In embodiments, the final volume of the suspension is adjusted to the desired percent solids, e.g., 20% solids. The cells can be lysed chemically or mechanically, e.g., the material can then be homogenized by passing it through a microfluidizer at 15,000 psi. Lysates are centrifuged, e.g., at 12,000×g for 30 minutes, and filtered, e.g., through a Sartorius Sartobran 150 (0.45/0.2 μm) filter capsule.

[0188] In embodiments, fast protein liquid chromatography (FPLC) can be used for purification, e.g., using ÄKTA explorer 100 chromatography systems (GE Healthcare) equipped with Frac-950 fraction collectors. In embodiments wherein a His-tag is used, samples can be loaded onto HisTrap FF, 10 mL columns (two 5 mL HisTrap FF cartridges, GE Healthcare, part number 17-5255-01, connected in series), washed, and eluted, e.g., using a 10 column volume linear gradient of an elution buffer, by varying the imidazole concentration from 0 mM to 200 mM, and fractions collected.

[0189] In embodiments, chromatography can be carried out as appropriate for the polypeptide of interest. For example, immobilized metal ion affinity chromatography purification can be carried out (e.g., using Nickel IMAC) as described herein in the Examples.

Cleavage of Recombinant Fusion Protein

[0190] In embodiments, the purified recombinant fusion protein fractions are incubated with a cleavage enzyme, to cleave the polypeptide of interest from the linker and N-terminal fusion partner. In embodiments, the cleavage enzyme is a protease, for example, a serine protease, e.g., bovine enterokinase, porcine enterokinase, trypsin or any other appropriate protease as described elsewhere herein. Any appropriate protease cleavage method known in the art and described in the literature, including in the manufacturer's instructions, can be used. Proteases are available commercially, e.g., from Sigma-Aldrich (St. Louis, Mo.), ThermoFisher Scientific (Waltham, Mass.), and Promega (Madison, Wis.). For example, in embodiments, for bovine enterokinase (e.g., Novagen cat #69066-3, batch D00155747) cleavage, fusion protein purification fractions can be concentrated and resuspended in a buffer containing 20 mM Tris pH 7.4, 50 mM NaCl, and 2 mM CaCl2. Two units of bovine enterokinase are added to 100 μg protein in a 100 μL reaction. The mixture of fusion protein purification fraction and enterokinase is incubated for an appropriate length of time. In embodiments, control reactions with no enterokinase also are incubated, for comparison. The enzyme reactions can be stopped by the addition of complete protease inhibitor cocktail containing 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (AEBSF, Sigma cat # P8465).

[0191] In embodiments, the cleavage enzyme incubation is carried out for about 1 hour to about 24 hours. In embodiments, the incubation is carried out for about 1 hr, about 2 hr, about 3 hr, about 4 hr, about 5 hr, about 6 hr, about 7 hr, about 8 hr, about 9 hr, about 10 hr, about 11 hr, about 12 hr, about 13 hr, about 14 hr, about 15 hr, about 16 hr, about 17 hr, about 18 hr, about 19 hr, about 20 hr, about 21 hr, about 22 hr, about 23 hr, about 24 hr, about 1 hr to about 24 hr, about 1 hr to about 23 hr, about 1 hr to about 22 hr, about 1 hr to about 21 hr, about 1 hr to about 20 hr, about 1 hr to about 19 hr, about 1 hr to about 18 hr, about 1 hr to about 17 hr, about 1 hr to about 16 hr, about 1 hr to about 15 hr, about 1 hr to about 14 hr, about 1 hr to about 13 hr, about 1 hr to about 12 hr, about 1 hr to about 11 hr, about 1 hr to about 10 hr, about 1 hr to about 9 hr, about 1 hr to about 8 hr, about 1 hr to about 7 hr, about 1 hr to about 6 hr, about 1 hr to about 5 hr, about 1 hr to about 4 hr, about 1 hr to about 3 hr, about 1 hr to about 2 hr, about 2 hr to about 24 hr, about 2 hr to about 23 hr, about 2 hr to about 22 hr, about 2 hr to about 21 hr, about 2 hr to about 20 hr, about 2 hr to about 19 hr, about 2 hr to about 18 hr, about 2 hr to about 17 hr, about 2 hr to about 16 hr, about 2 hr to about 15 hr, about 2 hr to about 14 hr, about 2 hr to about 13 hr, about 2 hr to about 12 hr, about 2 hr to about 11 hr, about 2 hr to about 10 hr, about 2 hr to about 9 hr, about 2 hr to about 8 hr, about 2 hr to about 7 hr, about 2 hr to about 6 hr, about 2 hr to about 5 hr, about 2 hr to about 4 hr, about 2 hr to about 3 hr, about 3 hr to about 24 hr, about 3 hr to about 23 hr, about 3 hr to about 22 hr, about 3 hr to about 21 hr, about 3 hr to about 20 hr, about 3 hr to about 19 hr, about 3 hr to about 18 hr, about 3 hr to about 17 hr, about 3 hr to about 16 hr, about 3 hr to about 15 hr, about 3 hr to about 14 hr, about 3 hr to about 13 hr, about 3 hr to about 12 hr, about 3 hr to about 11 hr, about 3 hr to about 10 hr, about 3 hr to about 9 hr, about 3 hr to about 8 hr, about 3 hr to about 7 hr, about 3 hr to about 6 hr, about 3 hr to about 5 hr, about 3 hr to about 4 hr, about 4 hr to about 24 hr, about 4 hr to about 23 hr, about 4 hr to about 22 hr, about 4 hr to about 21 hr, about 4 hr to about 20 hr, about 4 hr to about 19 hr, about 4 hr to about 18 hr, about 4 hr to about 17 hr, about 4 hr to about 16 hr, about 4 hr to about 15 hr, about 4 hr to about 14 hr, about 4 hr to about 13 hr, about 4 hr to about 12 hr, about 4 hr to about 11 hr, about 4 hr to about 10 hr, about 4 hr to about 9 hr, about 4 hr to about 8 hr, about 4 hr to about 7 hr, about 4 hr to about 6 hr, about 4 hr to about 5 hr, about 5 hr to about 24 hr, about 5 hr to about 23 hr, about 5 hr to about 22 hr, about 5 hr to about 21 hr, about 5 hr to about 20 hr, about 5 hr to about 19 hr, about 5 hr to about 18 hr, about 5 hr to about 17 hr, about 5 hr to about 16 hr, about 5 hr to about 15 hr, about 5 hr to about 14 hr, about 5 hr to about 13 hr, about 5 hr to about 12 hr, about 5 hr to about 11 hr, about 5 hr to about 10 hr, about 5 hr to about 9 hr, about 5 hr to about 8 hr, about 5 hr to about 7 hr, about 5 hr to about 6 hr, about 6 hr to about 24 hr, about 6 hr to about 23 hr, about 6 hr to about 22 hr, about 6 hr to about 21 hr, about 6 hr to about 20 hr, about 6 hr to about 19 hr, about 6 hr to about 18 hr, about 6 hr to about 17 hr, about 6 hr to about 16 hr, about 6 hr to about 15 hr, about 6 hr to about 14 hr, about 6 hr to about 13 hr, about 6 hr to about 12 hr, about 6 hr to about 11 hr, about 6 hr to about 10 hr, about 6 hr to about 9 hr, about 6 hr to about 8 hr, about 6 hr to about 7 hr, about 7 hr to about 24 hr, about 7 hr to about 23 hr, about 7 hr to about 22 hr, about 7 hr to about 21 hr, about 7 hr to about 20 hr, about 7 hr to about 19 hr, about 7 hr to about 18 hr, about 7 hr to about 17 hr, about 7 hr to about 16 hr, about 7 hr to about 15 hr, about 7 hr to about 14 hr, about 7 hr to about 13 hr, about 7 hr to about 12 hr, about 7 hr to about 11 hr, about 7 hr to about 10 hr, about 7 hr to about 9 hr, about 7 hr to about 8 hr, about 8 hr to about 24 hr, about 8 hr to about 23 hr, about 8 hr to about 22 hr, about 8 hr to about 21 hr, about 8 hr to about 20 hr, about 8 hr to about 19 hr, about 8 hr to about 18 hr, about 8 hr to about 17 hr, about 8 hr to about 16 hr, about 8 hr to about 15 hr, about 8 hr to about 14 hr, about 8 hr to about 13 hr, about 8 hr to about 12 hr, about 8 hr to about 11 hr, about 8 hr to about 10 hr, about 8 hr to about 9 hr, about 9 hr to about 24 hr, about 9 hr to about 23 hr, about 9 hr to about 22 hr, about 9 hr to about 21 hr, about 9 hr to about 20 hr, about 9 hr to about 19 hr, about 9 hr to about 18 hr, about 9 hr to about 17 hr, about 9 hr to about 16 hr, about 9 hr to about 15 hr, about 9 hr to about 14 hr, about 9 hr to about 13 hr, about 9 hr to about 12 hr, about 9 hr to about 11 hr, about 9 hr to about 10 hr, about 10 hr to about 24 hr, about 10 hr to about 23 hr, about 10 hr to about 22 hr, about 10 hr to about 21 hr, about 10 hr to about 20 hr, about 10 hr to about 19 hr, about 10 hr to about 18 hr, about 10 hr to about 17 hr, about 10 hr to about 16 hr, about 10 hr to about 15 hr, about 10 hr to about 14 hr, about 10 hr to about 13 hr, about 10 hr to about 12 hr, about 10 hr to about 11 hr, about 11 hr to about 24 hr, about 11 hr to about 23 hr, about 11 hr to about 22 hr, about 11 hr to about 21 hr, about 11 hr to about 20 hr, about 11 hr to about 19 hr, about 11 hr to about 18 hr, about 11 hr to about 17 hr, about 11 hr to about 16 hr, about 11 hr to about 15 hr, about 11 hr to about 14 hr, about 11 hr to about 13 hr, about 11 hr to about 12 hr, about 12 hr to about 24 hr, about 12 hr to about 23 hr, about 12 hr to about 22 hr, about 12 hr to about 21 hr, about 12 hr to about 20 hr, about 12 hr to about 19 hr, about 12 hr to about 18 hr, about 12 hr to about 17 hr, about 12 hr to about 16 hr, about 12 hr to about 15 hr, about 12 hr to about 14 hr, about 12 hr to about 13 hr, about 13 hr to about 24 hr, about 13 hr to about 23 hr, about 13 hr to about 22 hr, about 13 hr to about 21 hr, about 13 hr to about 20 hr, about 13 hr to about 19 hr, about 13 hr to about 18 hr, about 13 hr to about 17 hr, about 13 hr to about 16 hr, about 13 hr to about 15 hr, about 13 hr to about 14 hr, about 14 hr to about 24 hr, about 14 hr to about 23 hr, about 14 hr to about 22 hr, about 14 hr to about 21 hr, about 14 hr to about 20 hr, about 14 hr to about 19 hr, about 14 hr to about 18 hr, about 14 hr to about 17 hr, about 14 hr to about 16 hr, about 14 hr to about 15 hr, about 15 hr to about 24 hr, about 15 hr to about 23 hr, about 15 hr to about 22 hr, about 15 hr to about 21 hr, about 15 hr to about 20 hr, about 15 hr to about 19 hr, about 15 hr to about 18 hr, about 15 hr to about 17 hr, about 16 hr to about 24 hr, about 16 hr to about 23 hr, about 16 hr to about 22 hr, about 16 hr to about 21 hr, about 16 hr to about 20 hr, about 16 hr to about 19 hr, about 16 hr to about 18 hr, about 16 hr to about 17 hr, about 17 hr to about 24 hr, about 17 hr to about 23 hr, about 17 hr to about 22 hr, about 17 hr to about 21 hr, about 17 hr to about 20 hr, about 17 hr to about 19 hr, about 17 hr to about 18 hr, about 18 hr to about 24 hr, about 18 hr to about 23 hr, about 18 hr to about 22 hr, about 18 hr to about 21 hr, about 18 hr to about 20 hr, about 18 hr to about 19 hr, about 19 hr to about 24 hr, about 19 hr to about 23 hr, about 19 hr to about 22 hr, about 19 hr to about 21 hr, about 19 hr to about 20 hr, about 20 hr to about 24 hr, about 20 hr to about 23 hr, about 20 hr to about 22 hr, about 20 hr to about 21 hr, about 21 hr to about 24 hr, about 21 hr to about 23 hr, about 21 hr to about 22 hr, about 22 hr to about 24 hr, or about 22 hr to about 23 hr.

[0192] In embodiments, the extent of cleavage of the recombinant fusion protein after incubation with the protease is about 90% to about 100%. In embodiments, the extent of cleavage after incubation with the protease is about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, about 100%, about 91% to about 100%, about 92% to about 100%, about 93% to about 100%, about 94% to about 100%, about 95% to about 100%, about 96% to about 100%, about 97% to about 100%, about 98% to about 100%, about 99% to about 100%, about 90% to about 99%, about 91% to about 99%, about 92% to about 99%, about 93% to about 99%, about 94% to about 99%, about 95% to about 99%, about 96% to about 99%, about 97% to about 99%, about 98% to about 99%, about 90% to about 98%, about 91% to about 98%, about 92% to about 98%, about 93% to about 98%, about 94% to about 98%, about 95% to about 98%, about 96% to about 98%, about 97% to about 98%, about 90% to about 97%, about 91% to about 97%, about 92% to about 97%, about 93% to about 97%, about 94% to about 97%, about 95% to about 97%, about 96% to about 97%, about 90% to about 96%, about 91% to about 96%, about 92% to about 96%, about 93% to about 96%, about 94% to about 96%, about 95% to about 96%, about 90% to about 95%, about 91% to about 95%, about 92% to about 95%, about 93% to about 95%, about 94% to about 95%, about 90% to about 94%, about 91% to about 94%, about 92% to about 94%, about 93% to about 94%, about 90% to about 93%, about 91% to about 93%, about 92% to about 93%, about 90% to about 92%, about 91% to about 92%, or about 90% to about 91%.

[0193] In embodiments, the protease cleavage results in release of the polypeptide of interest from the recombinant fusion protein. In embodiments, the recombinant fusion protein is properly cleaved, to properly release the polypeptide of interest. In embodiments, proper cleavage of the recombinant fusion protein results in a properly released polypeptide of interest having an intact (undegraded) N-terminus. In embodiments, proper cleavage of the recombinant fusion protein results in a properly released polypeptide of interest that contains the first (N-terminal) amino acid. In embodiments, the amount of properly released polypeptide following protease cleavage is about 90% to about 100%. In embodiments, the amount of properly released polypeptide following protease cleavage is about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, about 100%, about 91% to about 100%, about 92% to about 100%, about 93% to about 100%, about 94% to about 100%, about 95% to about 100%, about 96% to about 100%, about 97% to about 100%, about 98% to about 100%, about 99% to about 100%, about 90% to about 99%, about 91% to about 99%, about 92% to about 99%, about 93% to about 99%, about 94% to about 99%, about 95% to about 99%, about 96% to about 99%, about 97% to about 99%, about 98% to about 99%, about 90% to about 98%, about 91% to about 98%, about 92% to about 98%, about 93% to about 98%, about 94% to about 98%, about 95% to about 98%, about 96% to about 98%, about 97% to about 98%, about 90% to about 97%, about 91% to about 97%, about 92% to about 97%, about 93% to about 97%, about 94% to about 97%, about 95% to about 97%, about 96% to about 97%, about 90% to about 96%, about 91% to about 96%, about 92% to about 96%, about 93% to about 96%, about 94% to about 96%, about 95% to about 96%, about 90% to about 95%, about 91% to about 95%, about 92% to about 95%, about 93% to about 95%, about 94% to about 95%, about 90% to about 94%, about 91% to about 94%, about 92% to about 94%, about 93% to about 94%, about 90% to about 93%, about 91% to about 93%, about 92% to about 93%, about 90% to about 92%, about 91% to about 92%, or about 90% to about 91%.

Recombinant Fusion Protein Evaluation and Yield

[0194] The produced fusion protein and/or polypeptide of interest can be characterized in any appropriate fraction, using any appropriate assay method known in the art or described in the literature for characterizing a protein, e.g., for evaluating the yield or quality of the protein.

[0195] In embodiments, LC-MS or any other appropriate method as known in the art is used to monitor proteolytic clipping, deamidation, oxidation, and fragmentation, and to verify that the N-terminus of the polypeptide of interest is intact following linker cleavage. The yield of recombinant fusion protein or polypeptide of interest can be determined by methods known to those of skill in the art, for example, by SDS-PAGE, capillary gel electrophoresis (CGE), or Western blot analysis. In embodiments, ELISA methods are used to measure host cell protein. For example, the host cell protein (HCP) ELISA can be performed using the “Immunoenzymetric Assay for the Measurement of Pseudomonas fluorescens Host Cell Proteins” from Cygnus Technologies, Inc., catalog number F450, according to the manufacturer's protocol. The plate can be read on a SPECTRAmax Plus (Molecular Devices), using Softmax Pro V3.1.2 software.

[0196] SDS-CGE can be carried out using a LabChip GXII instrument (Caliper LifeSciences, Hopkinton, Mass.) with an HT Protein Express V2 chip and corresponding reagents (part numbers 760499 and 760328, respectively, Caliper LifeSciences). Samples can be prepared following the manufacturer's protocol (Protein User Guide Document No. 450589, Rev. 3) and electrophoresed on polyacrylamide gels. After separation the gel can be stained, destained, and digitally imaged.

[0197] The concentration of a protein, e.g., a purified recombinant fusion protein or polypeptide of interest as described herein, can be determined by absorbance spectroscopy by methods known to those of skill in the art and described in the literature. In embodiments, the absorbance of a protein sample at 280 nm is measured (e.g., using an Eppendorf BioPhotometer, Eppendorf, Hamburg, Germany) and the concentration of protein calculated using the Beer-Lambert Law. An accurate molar absorption coefficient for the protein can be calculated by known methods, e.g., as described by Grimsley, G. R., and Pace, C. N., “Spectrophotometric Determination of Protein Concentration,” in Current Protocols in Protein Science 3.1.1-3.1.9, Copyright © 2003 by John Wiley & Sons, Inc., incorporated by reference herein.

[0198] Table 5 lists the concentration of proteins described herein at an A280 of 1, determined using molar extinction coefficients calculated by VectorNTI, Invitrogen.

TABLE 5
Protein Concentrations for an A280 of 1

Amino Acid SEQ ID NO | Protein | Concentration of Protein (mg/mL) for an A280 of 1 | Molar Extinction Coefficient
1 | PTH 1-34 | 0.72 |
45 | DnaJ-like protein-PTH 1-34 fusion | 0.8 |
46 | FklB-PTH 1-34 fusion | 1.18 |
47 | FrnE-PTH 1-34 fusion | 0.98 |
70 | DnaJ-like protein-EK-GCSF fusion | 1.02 | 29190
71 | EcpD1-EK-GCSF fusion (Full length EcpD1) | 1.21 | 39530
72 | EcpD2-EK-GCSF fusion | 1.37 | 23030
73 | EcpD3-EK-GCSF fusion (50 aa truncated EcpD1) | 1.51 | 17430
74 | FklB-EK-GCSF fusion (Full length FklB) | 26 | 33600
75 | FklB2-EK-GCSF fusion (100 aa truncated FklB) | .83 | 17100
76 | FklB3-EK-GCSF fusion (500 aa truncated FklB) | 52 | 17100
77 | FrnE-EK-GCSF fusion (Full length FrnE) | | 40810
78 | FrnE2-EK-GCSF fusion (100 aa truncated FrnE) | 31 | 24310
79 | FrnE3-EK-GCSF fusion (50 aa truncated FrnE) | .21 | 21750
122 | DnaJ-like protein-EK-Proinsulin-CP-A | | 9210
123 | DnaJ-like protein-EK-Proinsulin-CP-B | | 9210
124 | DnaJ-like protein-EK-Proinsulin-CP-C | | 9210
125 | DnaJ-like protein-Trypsin-Proinsulin-CP-A | .04 | 9210
126 | DnaJ-like protein-Trypsin-Proinsulin-CP-B | | 9210
127 | DnaJ-like protein-Trypsin-Proinsulin-CP-C | | 9210
128 | DnaJ-like protein-EK-Proinsulin-CP-D | 05 | 9210
129 | DnaJ-like protein-Trypsin-Proinsulin-CP-D | 07 | 9210
130 | FklB-EK-Proinsulin-CP-A | 40 | 23620
131 | FklB-EK-Proinsulin-CP-B | 38 | 23620
132 | FklB-EK-Proinsulin-CP-C | 37 | 23620
133 | FklB-Trypsin-Proinsulin-CP-A | 38 | 23620
134 | FklB-Trypsin-Proinsulin-CP-B | 36 | 23620
135 | FklB-Trypsin-Proinsulin-CP-C | 35 | 23620
136 | FklB-EK-Proinsulin-CP-D | 31 | 23620
137 | FklB-Trypsin-Proinsulin-CP-D | 29 | 23620
138 | FklB2-EK-Proinsulin-CP-A | 3.06 |
139 | FklB2-EK-Proinsulin-CP-B | 2.99 |
140 | FklB2-EK-Proinsulin-CP-C | 2.98 |
141 | FklB2-Trypsin-Proinsulin-CP-A | 3.0 |
142 | FklB2-Trypsin-Proinsulin-CP-B | 2.93 |
143 | FklB2-Trypsin-Proinsulin-CP-C | 2.92 |
144 | FklB2-EK-Proinsulin-CP-D | 2.78 |

TABLE 5-continued TABLE 5-continued Protein Concentrations for an Also of 1 Protein Concentrations for an Also of 1 Concentration of Molar Concentration of Molar Amino Acid Protein (mg/mL) for Extinction Amino Acid Protein (mg/mL) for Extinction SEQID NO Protein An A2so of 1 Coefficient SEQID NO Protein An A2so of 1 Coefficient 45 FIkB2-Trypsin 2.72 181 EcpD1-EK-Proinsulin 1.23 295.50 Proinsulin-CP-D CP-D 46 FkB3.1-EK 2.33 182 EcpD1-Trypsin 1.28 295.50 Proinsulin-CP-A Proinsulin-CP-A 47 FkB3-EK-Proinsulin 2.26 (EcpD1-Trypsin as CP-B encoded by 48 FkB3.1-EK 2.25 pFNX4402 does not Proinsulin-CP-C contain the underlined 49 FklB3-Trypsin 2.27 N residue) Proinsulin-CP-A 183 EcpD1-Trypsin 1.26 295.50 50 FIkB3.1-Trypsin 2.2O Proinsulin-CP-B Proinsulin-CP-B (EcpD1-Trypsin as 51 FIkB3.1-Trypsin 2.19 encoded by Proinsulin-CP-C pFNX4402 does not 52 FkB-EK-Proinsulin 2.04 contain the underlined CP-D N residue) 53 FIkB3.1-Trypsin .98 184 EcpD1-Trypsin 1.26 295.50 Proinsulin-CP-D Proinsulin-CP-C S4 FrnE-EK-Proinsulin .14 3O830 (EcpD1-Trypsin as CP-A encoded by 55 FrnE-EK-Proinsulin .12 pFNX4402 does not CP-B contain the underlined 56 FrnE-EK-Proinsulin .12 N residue) CP-C 18S EcpD1-Trypsin 1.21 295.50 57 FrnE-Trypsin 13 Proinsulin-CP-D Proinsulin-CP-A (EcpD1-Trypsin as 58 FrnE-Trypsin .11 encoded by Proinsulin-CP-B pFNX4402 does not 59 FrnE-Trypsin .11 contain the underlined Proinsulin-CP-C N residue) 60 FrnE-EK-Proinsulin O8 86 EcpD2-EK-Proinsulin 1.69 13OSO CP-D CP-A 61 FrnE-Trypsin O6 87 EcpD2-EK-Proinsulin 1.65 13OSO Proinsulin-CP-D CP-B 62 FrnE2-EK-Proinsulin 57 4330 88 EcpD2-EK-Proinsulin 1.64 13OSO CP-A CP-C 63 FrnE2-EK-Proinsulin 53 4330 89 EcpD2-Trypsin 1.65 13OSO CP-B Proinsulin-CP-A 64 FrnE2-EK-Proinsulin 53 4330 90 EcpD2-Trypsin 1.61 13OSO CP-C Proinsulin-CP-B 65 FrnE2-Trypsin 53 4330 91 EcpD2-Trypsin 1.61 13OSO Proinsulin-CP-A Proinsulin-CP-C 66 FrnE2-Trypsin SO 4330 92 EcpD2-EK-Proinsulin 1.53 13OSO Proinsulin-CP-B CP-D 67 FrnE2-Trypsin SO 4330 93 EcpD2-Trypsin 
1...SO 13OSO Proinsulin-CP-C Proinsulin-CP-D 68 FrnE2-EK-Proinsulin 42 4330 94 EcpD3-EK-Proinsulin 2.28 7360 CP-D CP-A 69 FrnE2-Trypsin 39 4330 95 EcpD3-EK-Proinsulin 2.21 7360 Proinsulin-CP-D CP-B 70 FrnE3-EK-Proinsulin .44 1770 96 EcpD3-EK-Proinsulin 2.2O 7360 CP-A CP-C 71 FrnE3-EK-Proinsulin 39 1770 97 EcpD3-Trypsin 2.22 7360 CP-B Proinsulin-CP-A 72 FrnE3-EK-Proinsulin 39 1770 98 EcpD3-Trypsin 2.15 7360 CP-C Proinsulin-CP-B 73 FrnE3-Trypsin 40 1770 99 EcpD3-Trypsin 2.14 7360 Proinsulin-CP-A Proinsulin-CP-C 74 FrnE3-Trypsin 36 1770 200 EcpD3-EK-Proinsulin 2.OO 7360 Proinsulin-CP-B CP-D 75 FrnE3-Trypsin 35 1770 2O1 EcpD3-Trypsin 1.95 7360 Proinsulin-CP-C Proinsulin-CP-D 76 FrnE3-EK-Proinsulin 26 1770 CP-D 77 FrnE3-Trypsin 23 1770 Proinsulin-CP-D 0199 Western blot analysis to determine yield or purity of 78 EcpD1-EK-Proinsulin 30 295.50 the polypeptide of interest can be carried out according to any CP-A 79 EcpD1-EK-Proinsulin 28 295.50 appropriate method known in the art by transferring protein CP-B separated on SDS-PAGE gels to a nitrocellulose membrane EcpD1-EK-Proinsulin 28 295.50 and incubating the membrane with a monoclonal antibody CP-C specific for the polypeptide of interest. Antibodies useful for US 2016/0159877 A1 Jun. 9, 2016 30 any analytical methods described herein can be generated by about 6 g/L to about 23 g/L, about 7 g/L to about 23 g/L, about suitable procedures known to those of skill in the art. 8 g/L to about 23 g/L, about 9 g/L to about 23 g/L, about 10 0200 Activity assays, as described herein and known in g/L to about 23 g/L, about 15 g/L to about 23 g/L, about 20g/L the art, also can provide information regarding protein yield. 
In embodiments, these or any other methods known in the art are used to evaluate proper processing of a protein, e.g., proper secretion leader cleavage.

[0201] Useful measures of recombinant fusion protein yield include, e.g., the amount of soluble recombinant fusion protein per culture volume (e.g., grams or milligrams of protein/liter of culture), percent or fraction of soluble recombinant fusion protein obtained (e.g., amount of soluble recombinant fusion protein/amount of total recombinant fusion protein), percent or fraction of total cell protein (tcp), and percent or proportion of dry biomass. In embodiments, the measure of recombinant fusion protein yield as described herein is based on the amount of soluble recombinant fusion protein obtained. In embodiments, the measurement of soluble recombinant fusion protein is made in a soluble fraction obtained after cell lysis, e.g., a soluble fraction obtained after one or more centrifugation steps, or after purification of the recombinant fusion protein.

[0202] Useful measures of polypeptide of interest yield include, e.g., the amount of soluble polypeptide of interest obtained per culture volume (e.g., grams or milligrams of protein/liter of culture), percent or fraction of soluble polypeptide of interest obtained (e.g., amount of soluble polypeptide of interest/amount of total polypeptide of interest), percent or fraction of active polypeptide of interest obtained (e.g., amount of active polypeptide of interest/total amount polypeptide of interest in the activity assay), percent or fraction of total cell protein (tcp), and percent or proportion of dry biomass.

[0203] In embodiments wherein yield is expressed in terms of culture volume, the culture cell density may be taken into account, particularly when yields between different cultures are being compared. In embodiments, the methods of the present invention can be used to obtain a soluble and/or active and/or properly processed (e.g., having the secretion leader cleaved properly) recombinant fusion protein yield of about 0.5 grams per liter to about 25 grams per liter. In embodiments, the recombinant fusion protein comprises an N-terminal fusion partner which is a cytoplasmic chaperone or folding modulator from the heat shock [...], and the fusion protein is directed to the cytoplasm after expression. In embodiments, the recombinant fusion protein comprises an N-terminal fusion partner which is a periplasmic chaperone or folding modulator from the periplasmic peptidylprolyl isomerase family, and the fusion protein is directed to the periplasm after expression. In embodiments, the yield of the fusion protein, the cytoplasmically expressed fusion protein, or the periplasmically expressed fusion protein, is about 0.5 g/L, about 1 g/L, about 1.5 g/L, about 2 g/L, about 2.5 g/L, about 3 g/L, about 3.5 g/L, about 4 g/L, about 4.5 g/L, about 5 g/L, about 6 g/L, about 7 g/L, about 8 g/L, about 9 g/L, about 10 g/L, about 11 g/L, about 12 g/L, about 13 g/L, about 14 g/L, about 15 g/L, about 16 g/L, about 17 g/L, about 18 g/L, about 19 g/L, about 20 g/L, about 21 g/L, about 22 g/L, about 23 g/L, about 24 g/L, about 25 g/L, about 0.5 g/L to about 25 g/L, about 0.5 g/L to about 23 g/L, about 1 g/L to about 23 g/L, about 1.5 g/L to about 23 g/L, about 2 g/L to about 23 g/L, about 2.5 g/L to about 23 g/L, about 3 g/L to about 23 g/L, about 3.5 g/L to about 23 g/L, about 4 g/L to about 23 g/L, about 4.5 g/L to about 23 g/L, about 5 g/L to about 23 g/L,

to about 23 g/L, about 0.5 g/L to about 20 g/L, about 1 g/L to about 20 g/L, about 1.5 g/L to about 20 g/L, about 2 g/L to about 20 g/L, about 2.5 g/L to about 20 g/L, about 3 g/L to about 20 g/L, about 3.5 g/L to about 20 g/L, about 4 g/L to about 20 g/L, about 4.5 g/L to about 20 g/L, about 5 g/L to about 20 g/L, about 6 g/L to about 20 g/L, about 7 g/L to about 20 g/L, about 8 g/L to about 20 g/L, about 9 g/L to about 20 g/L, about 10 g/L to about 20 g/L, about 15 g/L to about 20 g/L, about 0.5 g/L to about 15 g/L, about 1 g/L to about 15 g/L, about 1.5 g/L to about 15 g/L, about 2 g/L to about 15 g/L, about 2.5 g/L to about 15 g/L, about 3 g/L to about 15 g/L, about 3.5 g/L to about 15 g/L, about 4 g/L to about 15 g/L, about 4.5 g/L to about 15 g/L, about 5 g/L to about 15 g/L, about 6 g/L to about 15 g/L, about 7 g/L to about 15 g/L, about 8 g/L to about 15 g/L, about 9 g/L to about 15 g/L, about 10 g/L to about 15 g/L, about 0.5 g/L to about 12 g/L, about 1 g/L to about 12 g/L, about 1.5 g/L to about 12 g/L, about 2 g/L to about 12 g/L, about 2.5 g/L to about 12 g/L, about 3 g/L to about 12 g/L, about 3.5 g/L to about 12 g/L, about 4 g/L to about 12 g/L, about 4.5 g/L to about 12 g/L, about 5 g/L to about 12 g/L, about 6 g/L to about 12 g/L, about 7 g/L to about 12 g/L, about 8 g/L to about 12 g/L, about 9 g/L to about 12 g/L, about 10 g/L to about 12 g/L, about 0.5 g/L to about 10 g/L, about 1 g/L to about 10 g/L, about 1.5 g/L to about 10 g/L, about 2 g/L to about 10 g/L, about 2.5 g/L to about 10 g/L, about 3 g/L to about 10 g/L, about 3.5 g/L to about 10 g/L, about 4 g/L to about 10 g/L, about 4.5 g/L to about 10 g/L, about 5 g/L to about 10 g/L, about 6 g/L to about 10 g/L, about 7 g/L to about 10 g/L, about 8 g/L to about 10 g/L, about 9 g/L to about 10 g/L, about 0.5 g/L to about 9 g/L, about 1 g/L to about 9 g/L, about 1.5 g/L to about 9 g/L, about 2 g/L to about 9 g/L, about 2.5 g/L to about 9 g/L, about 3 g/L to about 9 g/L, about 3.5 g/L to about 9 g/L, about 4 g/L to about 9 g/L, about 4.5 g/L to about 9 g/L, about 5 g/L to about 9 g/L, about 6 g/L to about 9 g/L, about 7 g/L to about 9 g/L, about 8 g/L to about 9 g/L, about 0.5 g/L to about 8 g/L, about 1 g/L to about 8 g/L, about 1.5 g/L to about 8 g/L, about 2 g/L to about 8 g/L, about 2.5 g/L to about 8 g/L, about 3 g/L to about 8 g/L, about 3.5 g/L to about 8 g/L, about 4 g/L to about 8 g/L, about 4.5 g/L to about 8 g/L, about 5 g/L to about 8 g/L, about 6 g/L to about 8 g/L, about 7 g/L to about 8 g/L, about 0.5 g/L to about 7 g/L, about 1 g/L to about 7 g/L, about 1.5 g/L to about 7 g/L, about 2 g/L to about 7 g/L, about 2.5 g/L to about 7 g/L, about 3 g/L to about 7 g/L, about 3.5 g/L to about 7 g/L, about 4 g/L to about 7 g/L, about 4.5 g/L to about 7 g/L, about 5 g/L to about 7 g/L, about 6 g/L to about 7 g/L, about 0.5 g/L to about 6 g/L, about 1 g/L to about 6 g/L, about 1.5 g/L to about 6 g/L, about 2 g/L to about 6 g/L, about 2.5 g/L to about 6 g/L, about 3 g/L to about 6 g/L, about 3.5 g/L to about 6 g/L, about 4 g/L to about 6 g/L, about 4.5 g/L to about 6 g/L, about 5 g/L to about 6 g/L, about 0.5 g/L to about 5 g/L, about 1 g/L to about 5 g/L, about 1.5 g/L to about 5 g/L, about 2 g/L to about 5 g/L, about 2.5 g/L to about 5 g/L, about 3 g/L to about 5 g/L, about 3.5 g/L to about 5 g/L, about 4 g/L to about 5 g/L, about 4.5 g/L to about 5 g/L, about 0.5 g/L to about 4 g/L, about 1 g/L to about 4 g/L, about 1.5 g/L to about 4 g/L, about 2 g/L to about 4 g/L, about 2.5 g/L to about 4 g/L, about 3 g/L to about 4 g/L, about 0.5 g/L to about 3 g/L, about 1 g/L to about 3 g/L, about 1.5 g/L to about 3 g/L, about 2 g/L to about 3 g/L, about 0.5 g/L to about 2 g/L, about 1 g/L to about 2 g/L, or about 0.5 g/L to about 1 g/L.
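The yield measures described in paragraphs [0201] to [0203] reduce to a few ratios. The following is a minimal sketch of that arithmetic, not the applicants' procedure: the class and function names are hypothetical, the choice of total fusion protein as the numerator for percent tcp is an assumption, and the sample values are illustrative. The molecular weights in the last step use figures given elsewhere in this document (PTH 1-34 at about 4118 Da, the DnaJ-like protein-PTH fusion at about 14 kDa).

```python
from dataclasses import dataclass


@dataclass
class YieldSample:
    """Hypothetical container for one culture's measurements (all per liter of culture)."""
    soluble_fusion_g_per_l: float      # soluble fusion protein
    total_fusion_g_per_l: float        # soluble + insoluble fusion protein
    total_cell_protein_g_per_l: float  # total cell protein (tcp)
    od600: float                       # culture cell density at harvest


def percent_soluble(s: YieldSample) -> float:
    """Soluble fusion protein as a percentage of total fusion protein."""
    return 100.0 * s.soluble_fusion_g_per_l / s.total_fusion_g_per_l


def percent_tcp(s: YieldSample) -> float:
    """Fusion protein as a percentage of total cell protein (assumes total fusion as numerator)."""
    return 100.0 * s.total_fusion_g_per_l / s.total_cell_protein_g_per_l


def yield_per_od(s: YieldSample) -> float:
    """Volumetric yield normalized by cell density, for comparing different cultures."""
    return s.soluble_fusion_g_per_l / s.od600


def poi_yield_from_fusion(fusion_g_per_l: float, poi_mw_da: float, fusion_mw_da: float) -> float:
    """Polypeptide-of-interest yield calculated from its known mass proportion of the fusion."""
    return fusion_g_per_l * poi_mw_da / fusion_mw_da


# Illustrative values only; they are not data from this document.
sample = YieldSample(
    soluble_fusion_g_per_l=2.0,
    total_fusion_g_per_l=2.5,
    total_cell_protein_g_per_l=10.0,
    od600=100.0,
)
poi = poi_yield_from_fusion(sample.soluble_fusion_g_per_l, poi_mw_da=4118.0, fusion_mw_da=14000.0)
print(percent_soluble(sample), percent_tcp(sample), yield_per_od(sample), round(poi, 3))
```

Normalizing by OD600 is one simple way to take cell density into account; dry biomass could be used instead if it is measured.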

[0204] In embodiments, the polypeptide of interest is hPTH and the yield of the recombinant fusion protein directed to the cytoplasm is about 0.5 g/L to about 2.4 grams per liter.

[0205] In embodiments, the polypeptide of interest is hPTH and the yield of the recombinant fusion protein directed to the periplasm is about 0.5 grams per liter to about 6.7 grams per liter.

Yield of Polypeptide of Interest

[0206] In embodiments, the polypeptide of interest is released from the full recombinant fusion protein by protease cleavage within the linker. In embodiments, the polypeptide of interest obtained after cleavage with protease is the properly released polypeptide of interest. In embodiments, the yield of the polypeptide of interest, either based on measurement of properly released protein or calculated based on the known proportion of polypeptide of interest to total fusion protein, is about 0.7 grams per liter to about 25.0 grams per liter. In embodiments, the yield of the polypeptide of interest is about 0.5 g/L (500 mg/L), about 1 g/L, about 1.5 g/L, about 2 g/L, about 2.5 g/L, about 3 g/L, about 3.5 g/L, about 4 g/L, about 4.5 g/L, about 5 g/L, about 6 g/L, about 7 g/L, about 8 g/L, about 9 g/L, about 10 g/L, about 11 g/L, about 12 g/L, about 13 g/L, about 14 g/L, about 15 g/L, about 16 g/L, about 17 g/L, about 18 g/L, about 19 g/L, about 20 g/L, about 21 g/L, about 22 g/L, about 23 g/L, about 24 g/L, about 25 g/L, about 0.5 g/L to about 23 g/L, about 1 g/L to about 23 g/L, about 1.5 g/L to about 23 g/L, about 2 g/L to about 23 g/L, about 2.5 g/L to about 23 g/L, about 3 g/L to about 23 g/L, about 3.5 g/L to about 23 g/L, about 4 g/L to about 23 g/L, about 4.5 g/L to about 23 g/L, about 5 g/L to about 23 g/L, about 6 g/L to about 23 g/L, about 7 g/L to about 23 g/L, about 8 g/L to about 23 g/L, about 9 g/L to about 23 g/L, about 10 g/L to about 23 g/L, about 15 g/L to about 23 g/L, about 20 g/L to about 23 g/L, about 0.5 g/L to about 20 g/L, about 1 g/L to about 20 g/L, about 1.5 g/L to about 20 g/L, about 2 g/L to about 20 g/L, about 2.5 g/L to about 20 g/L, about 3 g/L to about 20 g/L, about 3.5 g/L to about 20 g/L, about 4 g/L to about 20 g/L, about 4.5 g/L to about 20 g/L, about 5 g/L to about 20 g/L, about 6 g/L to about 20 g/L, about 7 g/L to about 20 g/L, about 8 g/L to about 20 g/L, about 9 g/L to about 20 g/L, about 10 g/L to about 20 g/L, about 15 g/L to about 20 g/L, about 0.5 g/L to about 15 g/L, about 1 g/L to about 15 g/L, about 1.5 g/L to about 15 g/L, about 2 g/L to about 15 g/L, about 2.5 g/L to about 15 g/L, about 3 g/L to about 15 g/L, about 3.5 g/L to about 15 g/L, about 4 g/L to about 15 g/L, about 4.5 g/L to about 15 g/L, about 5 g/L to about 15 g/L, about 6 g/L to about 15 g/L, about 7 g/L to about 15 g/L, about 8 g/L to about 15 g/L, about 9 g/L to about 15 g/L, about 10 g/L to about 15 g/L, about 0.5 g/L to about 12 g/L, about 1 g/L to about 12 g/L, about 1.5 g/L to about 12 g/L, about 2 g/L to about 12 g/L, about 2.5 g/L to about 12 g/L, about 3 g/L to about 12 g/L, about 3.5 g/L to about 12 g/L, about 4 g/L to about 12 g/L, about 4.5 g/L to about 12 g/L, about 5 g/L to about 12 g/L, about 6 g/L to about 12 g/L, about 7 g/L to about 12 g/L, about 8 g/L to about 12 g/L, about 9 g/L to about 12 g/L, about 10 g/L to about 12 g/L, about 0.5 g/L to about 10 g/L, about 1 g/L to about 10 g/L, about 1.5 g/L to about 10 g/L, about 2 g/L to about 10 g/L, about 2.5 g/L to about 10 g/L, about 3 g/L to about 10 g/L, about 3.5 g/L to about 10 g/L, about 4 g/L to about 10 g/L, about 4.5 g/L to about 10 g/L, about 5 g/L to about 10 g/L, about 6 g/L to about 10 g/L, about 7 g/L to about 10 g/L, about 8 g/L to about 10 g/L, about 9 g/L to about 10 g/L, about 0.5 g/L to about 9 g/L, about 1 g/L to about 9 g/L, about 1.5 g/L to about 9 g/L, about 2 g/L to about 9 g/L, about 2.5 g/L to about 9 g/L, about 3 g/L to about 9 g/L, about 3.5 g/L to about 9 g/L, about 4 g/L to about 9 g/L, about 4.5 g/L to about 9 g/L, about 5 g/L to about 9 g/L, about 6 g/L to about 9 g/L, about 7 g/L to about 9 g/L, about 8 g/L to about 9 g/L, about 0.5 g/L to about 8 g/L, about 1 g/L to about 8 g/L, about 1.5 g/L to about 8 g/L, about 2 g/L to about 8 g/L, about 2.5 g/L to about 8 g/L, about 3 g/L to about 8 g/L, about 3.5 g/L to about 8 g/L, about 4 g/L to about 8 g/L, about 4.5 g/L to about 8 g/L, about 5 g/L to about 8 g/L, about 6 g/L to about 8 g/L, about 7 g/L to about 8 g/L, about 0.5 g/L to about 7 g/L, about 1 g/L to about 7 g/L, about 1.5 g/L to about 7 g/L, about 2 g/L to about 7 g/L, about 2.5 g/L to about 7 g/L, about 3 g/L to about 7 g/L, about 3.5 g/L to about 7 g/L, about 4 g/L to about 7 g/L, about 4.5 g/L to about 7 g/L, about 5 g/L to about 7 g/L, about 6 g/L to about 7 g/L, about 0.5 g/L to about 6 g/L, about 1 g/L to about 6 g/L, about 1.5 g/L to about 6 g/L, about 2 g/L to about 6 g/L, about 2.5 g/L to about 6 g/L, about 3 g/L to about 6 g/L, about 3.5 g/L to about 6 g/L, about 4 g/L to about 6 g/L, about 4.5 g/L to about 6 g/L, about 5 g/L to about 6 g/L, about 0.5 g/L to about 5 g/L, about 1 g/L to about 5 g/L, about 1.5 g/L to about 5 g/L, about 2 g/L to about 5 g/L, about 2.5 g/L to about 5 g/L, about 3 g/L to about 5 g/L, about 3.5 g/L to about 5 g/L, about 4 g/L to about 5 g/L, about 4.5 g/L to about 5 g/L, about 0.5 g/L to about 4 g/L, about 1 g/L to about 4 g/L, about 1.5 g/L to about 4 g/L, about 2 g/L to about 4 g/L, about 2.5 g/L to about 4 g/L, about 3 g/L to about 4 g/L, about 0.5 g/L to about 3 g/L, about 1 g/L to about 3 g/L, about 1.5 g/L to about 3 g/L, about 2 g/L to about 3 g/L, about 0.5 g/L to about 2 g/L, about 1 g/L to about 2 g/L, or about 0.5 g/L to about 1 g/L, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale.

[0207] In embodiments, hPTH is produced as a fusion protein having an N-terminal fusion partner and hPTH construct as described in Table 8. In embodiments, expression of the hPTH fusion protein produces at least 100, at least 125, at least 150, at least 175, at least 200, at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, at least 650, or at least 1000 mg/L total hPTH fusion protein, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale.

[0208] In embodiments, a proinsulin, e.g., proinsulin for an insulin analog, for example, glargine, is produced as a proinsulin fusion protein having an N-terminal fusion partner and proinsulin construct comprising a C-peptide sequence as described in Table 19. In embodiments, expression of a proinsulin fusion protein according to the methods of the invention produces at least about 10, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, at least about 110, at least about 120, at least about 130, at least about 140, at least about 150, at least about 200, or at least about 250 mg/L soluble proinsulin, at 0.5 mL to 100 L, 50 mL, 100 mL, 1 L, 2 L, or larger scale, either as measured when properly released or calculated based on its known proportion of the fusion protein.

[0209] In embodiments, expression of a proinsulin fusion protein according to the methods of the invention produces about 10 to about 500, about 15 to about 500, about 20 to about 500, about 30 to about 500, about 40 to about 500, about 50 to about 500, about 60 to about 500, about 70 to about 500, about 80 to about 500, about 90 to about 500, about 100 to about 500, about 200 to about 500, about 10 to about 400, about 15 to about 400, about 20 to about 400, about 30 to about 400, about 40 to about 400, about 50 to about 400, about 60 to about 400, about 70 to about 400, about 80 to about 400, about 90 to about 400, about 100 to about 400, about 200 to about 400, about 10 to about 300, about 15 to about 300, about 20 to about 300, about 30 to about 300, about 40 to about 300, about 50 to about 300, about 60 to about 300, about 70 to about 300, about 80 to about 300, about 90 to about 300, about 100 to about 300, about 200 to about 300, about 10 to about 250, about 15 to about 250, about 20 to about 250, about 30 to about 250, about 40 to about 250, about 50 to about 250, about 60 to about 250, about 70 to about 250, about 80 to about 250, about 90 to about 250, about 100 to about 250, about 10 to about 200, about 15 to about 200, about 20 to about 200, about 30 to about 200, about 40 to about 200, about 50 to about 200, about 60 to about 200, about 70 to about 200, about 80 to about 200, about 90 to about 200, or about 100 to about 200 mg/L soluble proinsulin, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale, either as measured when properly released or calculated based on its known proportion of the fusion protein.

[0210] In embodiments, expression of a proinsulin fusion protein produces at least about 100, at least about 125, at least about 150, at least about 175, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, at least about 550, at least about 600, at least about 650, or at least about 1000 mg/L of total soluble and insoluble proinsulin. In embodiments, expression of the proinsulin fusion protein produces about 100 to about 2000 mg/L, about 100 to about 1500 mg/L, about 100 to about 1000 mg/L, about 100 to about 900 mg/L, about 100 to about 800 mg/L, about 100 to about 700 mg/L, about 100 to about 600 mg/L, about 100 to about 500 mg/L, about 100 to about 400 mg/L, about 200 to about 2000 mg/L, about 200 to about 1500 mg/L, about 200 to about 1000 mg/L, about 200 to about 900 mg/L, about 200 to about 800 mg/L, about 200 to about 700 mg/L, about 200 to about 600 mg/L, about 200 to about 500 mg/L, about 300 to about 2000 mg/L, about 300 to about 1500 mg/L, about 300 to about 1000 mg/L, about 300 to about 900 mg/L, about 300 to about 800 mg/L, about 300 to about 700 mg/L, or about 300 to about 600 mg/L of total soluble and insoluble proinsulin, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale. In embodiments, the proinsulin is cleaved to release the C-peptide and produce mature insulin. In embodiments, expression of the proinsulin fusion protein produces at least about 100, at least about 200, at least about 250, at least about 300, at least about 400, at least about 500, about 100 to about 2000 mg/L, about 200 to about 2000 mg/L, about 300 to about 2000 mg/L, about 400 to about 2000 mg/L, about 500 to about 2000 mg/L, about 100 to about 1000 mg/L, about 200 to about 1000 mg/L, about 300 to about 1000 mg/L, about 400 to about 1000 mg/L, about 500 to about 1000 mg/L, mature insulin, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale, either as measured when properly released or calculated based on its known proportion of the fusion protein.

[0211] In embodiments, GCSF is produced as a GCSF fusion protein having an N-terminal fusion partner as described in Table 21. In embodiments, expression of a GCSF fusion according to the methods of the invention produces soluble fusion protein comprising at least 100, at least 200, at least 250, at least 300, at least 400, at least 500, or at least 1000, about 100 to about 1000, about 200 to about 1000, about 300 to about 1000, about 400 to about 1000, or about 500 to about 1000 mg/L soluble GCSF, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale, either as measured when properly released or calculated based on its known proportion of the fusion protein. In embodiments, expression of a GCSF fusion according to the methods of the invention produces at least 100, at least 200, at least 250, at least 300, at least 400, at least 500, or at least 1000 mg/L soluble GCSF. In embodiments, expression of the GCSF fusion produces at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, at least 650, at least 700, at least 850, at least, at least 550, at least 600, at least 650, about 100 to about 1000, about 200 to about 1000, about 300 to about 1000, about 400 to about 1000, or about 500 to about 1000 mg/L of total soluble and insoluble GCSF, at 0.5 mL to 100 L, 0.5 mL, 50 mL, 100 mL, 1 L, 2 L, or larger scale.

[0212] In embodiments, the amount of recombinant fusion protein produced is about 1% to about 75% of the total cell protein. In certain embodiments, the amount of recombinant fusion protein produced is about 1%, about 2%, about 3%, about 4%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 1% to about 5%, about 1% to about 10%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 60%, about 1% to about 75%, about 2% to about 5%, about 2% to about 10%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 60%, about 2% to about 75%, about 3% to about 5%, about 3% to about 10%, about 3% to about 20%, about 3% to about 30%, about 3% to about 40%, about 3% to about 50%, about 3% to about 60%, about 3% to about 75%, about 4% to about 10%, about 4% to about 20%, about 4% to about 30%, about 4% to about 40%, about 4% to about 50%, about 4% to about 60%, about 4% to about 75%, about 5% to about 10%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 60%, about 5% to about 75%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 60%, about 10% to about 75%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 60%, about 20% to about 75%, about 30% to about 40%, about 30% to about 50%, about 30% to about 60%, about 30% to about 75%, about 40% to about 50%, about 40% to about 60%, about 40% to about 75%, about 50% to about 60%, about 50% to about 75%, about 60% to about 75%, or about 70% to about 75%, of the total cell protein.

Solubility and Activity

[0213] The "solubility" and "activity" of a protein, though related qualities, are generally determined by different means. Solubility of a protein, particularly a hydrophobic protein, indicates that hydrophobic amino acid residues are improperly located on the outside of the folded protein. Protein activity, which can be evaluated using methods as determined to be appropriate for the polypeptide of interest by one of skill in the art, is another indicator of proper protein conformation. "Soluble, active, or both" as used herein, refers to protein that is determined to be soluble, active, or both soluble and active, by methods known to those of skill in the art.

[0214] In general, with respect to an amino acid sequence, the term "modification" includes substitutions, insertions, elongations, deletions, and derivatizations alone or in combination. In embodiments, the recombinant fusion proteins may

include one or more modifications of a "non-essential" amino acid residue. In this context, a "non-essential" amino acid residue is a residue that can be altered, e.g., deleted or substituted, in the novel amino acid sequence without abolishing or substantially reducing the activity (e.g., the agonist activity) of the recombinant fusion protein. By way of example, the recombinant fusion protein may include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more substitutions, either in a consecutive manner or spaced throughout the recombinant fusion protein molecule. Alone or in combination with the substitutions, the recombinant fusion protein may include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more insertions, again either in consecutive manner or spaced throughout the recombinant fusion protein molecule. The recombinant fusion protein, alone or in combination with the substitutions and/or insertions, may also include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more deletions, again either in consecutive manner or spaced throughout the recombinant fusion protein molecule. The recombinant fusion protein, alone or in combination with the substitutions, insertions and/or deletions, may also include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more amino acid additions.

[0215] Substitutions include conservative amino acid substitutions. A "conservative amino acid substitution" is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain, or physicochemical characteristics (e.g., electrostatic, hydrogen bonding, isosteric, hydrophobic features). The amino acids may be naturally occurring or nonnatural (unnatural). Families of amino acid residues having similar side chains are known in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, methionine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, tryptophan), β-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Substitutions may also include non-conservative changes.

[0216] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

EXAMPLES

Example I

High-Throughput Screening of Strains Expressing hPTH 1-34 Fusions

[0217] This study was conducted to test levels of recombinant protein produced by P. fluorescens strains expressing hPTH 1-34 fusion proteins comprising DnaJ-like protein, FklB, or FrnE as the N-terminal fusion partner.

Materials and Methods

[0218] Construction of PTH 1-34 Fusion Protein Expression Plasmids:

[0219] Gene fragments encoding PTH 1-34 fusion proteins were synthesized using DNA 2.0, a gene design and synthesis service (Menlo Park, Calif.). Each gene fragment included a coding sequence for a P. fluorescens folding modulator (DnaJ-like protein, FklB, or FrnE), fused with a coding sequence for PTH 1-34, and a linker. Each gene fragment also included recognition sequences for the restriction enzymes SpeI and XhoI, a "Hi" ribosome binding site, and an 18 basepair spacer that includes a ribosome binding site and a restriction site (SEQ ID NO: 58) added upstream to the coding sequences and three stop codons. Nucleotide sequences encoding these PTH 1-34 fusion proteins are provided as SEQ ID NOS: 52-57.

[0220] To generate expression plasmids p708-004, -005 and -006 (listed in Table 6), the PTH 1-34 fusion protein gene fragments were digested using SpeI and XhoI restriction enzymes, and subcloned into expression vector pDOW1169, containing the pTac promoter and rrnT1T2 transcriptional terminator. pDOW1169 is described in the literature, e.g., in U.S. Pat. No. 7,833,752, "Bacterial Leader Sequences for Increased Expression," and Schneider et al., 2005, "Auxotrophic markers pyrF and proC can replace antibiotic markers on protein production plasmids in high-cell-density Pseudomonas fluorescens fermentation," Biotechnol. Progress 21(2): 343-8, both incorporated by reference herein. The plasmids were electroporated into competent P. fluorescens DC454 host cells (ΔpyrF lsc::lacIQ1).

TABLE 6
PTH 1-34 Fusion Protein Plasmids

Plasmid Number    N-terminal Fusion Partner    Fusion Protein
p708-004          DnaJ-like protein            DnaJ-like protein-PTH
p708-005          FklB                         FklB-PTH
p708-006          FrnE                         FrnE-PTH

[0221] DNA Sequencing:

[0222] The presence of the cloned fragments in the fusion protein expression plasmids was confirmed by DNA sequencing using a BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, 4337455). The DNA sequencing reactions, containing 50 fmol of plasmid DNA to be analyzed, were prepared by mixing 1 µL of sequencing premix, 0.5 µL of 100 µM primer stock solutions, 3.5 µL of sequencing buffer, and water added to a final volume of 20 µL. The results were assembled and analyzed using the Sequencher™ software (Gene Codes).

[0223] Growth and Expression in 96-Well Format (HTP):

[0224] The fusion protein expression plasmids were transformed into P. fluorescens host strains in an array format. The transformation reaction was initiated by mixing 35 µL of P. fluorescens competent cells and a 10 µL volume of plasmid DNA (2.5 ng). A 25 µL aliquot of the mixture was transferred to a 96-multi-well Nucleocuvette® plate (Lonza). Electroporation was carried out using the Nucleofector™ 96-well Shuttle™ system (Lonza AG), and the electroporated cells were subsequently transferred to a fresh 96-well deep well plate, containing 500 µL M9 salts supplemented with 1% glucose medium, and trace elements. The plates were incubated at 30° C. with shaking for 48 hours, to generate seed cultures.

[0225] Ten µL aliquots of the seed cultures were transferred in duplicate into 96-well deep well plates. Each well contained 500 µL of HTP-YE medium (Teknova), supplemented with trace elements and 5% glycerol. The seed cultures, plated in the glycerol supplemented HTP media, were incubated for 24 hours, in a shaker, at 30° C. Isopropyl-β-D-1-thiogalactopyranoside (IPTG) was added to each well at a final concentration of 0.3 mM to induce expression of the PTH 1-34 fusion proteins. For strains containing folding modulator over-expressing plasmids (see Table 4), IPTG was supplemented with mannitol (Sigma, M1902) at a final concentration of 1% to induce the expression of the folding modulators. In addition, 0.01 µL of a 250 unit/µL stock of Benzonase (Novagen, 70746-3) was added per well at the time of induction to reduce the potential for culture viscosity. After 24 hours of induction, cell density was calculated by measuring the optical density at 600 nm (OD600). The cells were subsequently harvested, diluted 1:3 with 1x Phosphate Buffered Saline (PBS) to a final volume of 400 µL and frozen for later processing.

[0226] Soluble Lysate Sample Preparation for Analytical Characterization:

[0227] The harvested cell samples were diluted and lysed by sonication with a Cell Lysis Automated Sonication System (CLASS, Scinomix) using a 24 probe tip horn. The lysates were centrifuged at 5,500xg for 15 minutes at 8° C. The supernatant was collected and labeled as the soluble fraction. The pellets were collected, resuspended in 400 µL of 1x PBS pH 7.4 by another round of sonication, and labeled as the insoluble fraction.

[0228] SDS-CGE Analysis:

[0229] The soluble and insoluble fractions were analyzed by HTP microchip SDS capillary gel electrophoresis using a LabChip GXII instrument (Caliper LifeSciences) with a HT Protein Express V2 chip and corresponding reagents (part numbers 760499 and 760328, respectively, Caliper LifeSciences). Samples were prepared following the manufacturer's protocol (Protein User Guide Document No. 450589, Rev. 3). Briefly, 4 µL aliquots of either the soluble or the insoluble fraction samples were mixed with 14 µL of buffer, with or without dithiothreitol (DTT) reducing agent, in 96-well polypropylene conical well PCR plates, heated at 95° C. for 5 minutes, and diluted with 70 µL deionized water. Lysates from null host strains, which were not transformed with fusion protein expression plasmid, were run as controls in parallel with test samples, and quantified using the system internal standard.

[0230] Shake Flask Expression:

[0231] Seed cultures for each of the fusion protein expression strains being evaluated were grown in M9 Glucose (Teknova) to generate intermediate cultures, and a 5 mL volume of each intermediate culture was used to inoculate each of four 1 liter baffled bottom flasks containing 250 mL HTP medium (Teknova 3H1129). Following 24 hours of growth at 30° C., the cultures were induced with 0.3 mM IPTG and 1% mannitol, and incubated for an additional 24 hours at 30° C. The shake flask broths were then centrifuged to harvest cells and the harvested cell paste was frozen for future use.

[0232] Mechanical Release and Purification:

[0233] Frozen cell pastes, at quantities of 5 grams or 10 grams, were thawed and resuspended in 3x PBS, 5% glycerol, 50 mM imidazole pH 7.4, to prepare final volumes of 50 mL or 100 mL, respectively. The suspensions were subsequently homogenized in two passes through a microfluidizer (Microfluidics, Inc., model M-110Y) at 15,000 psi. Lysates were centrifuged at 12,000xg for 30 minutes and filtered through a Sartorius Sartobran 150 (0.45/0.2 µm) filter capsule.

[0234] Chromatography:

[0235] Fast protein liquid chromatography (FPLC) operations were performed using AKTA explorer 100 chromatography systems (GE Healthcare) equipped with Frac-950 fraction collectors. The soluble fraction samples, prepared from HTP expression broths, were loaded onto 5 mL HisTrap FF columns (GE Healthcare, part number 17-5248-02) pre-equilibrated with 3x PBS, 5% glycerol, 50 mM imidazole pH 7.4. The columns were washed with 4 column volumes of equilibration buffer, and the fusion proteins were eluted from the HisTrap columns using 10 column volumes of elution buffer, applying a linear gradient of imidazole from 50 mM to 200 mM. The entire process was run at 100 cm/h, which was equivalent to a 1.5 minute residence time. The purification fractions were analyzed by SDS-CGE, using the SDS-CGE analysis methods described above.

[0236] Enterokinase Cleavage:

[0237] A first set of samples was prepared by dialyzing the purification fractions containing the fusion protein overnight at 4° C. against 1x PBS pH 7.4 supplemented with 2 mM CaCl2, using 7000 molecular weight cutoff (MWCO) Slide-A-Lyzer cassettes (Pierce). The dialyzed samples were maintained at about 1 mg/mL concentration. A second set of samples was prepared by 2x dilution of the purification fractions containing the fusion proteins, with water, and stored in a buffer comprising 1.5x PBS, 2.5% glycerol, and ~30-70 mM imidazole at a concentration of 0.5 mg/mL. A stock solution of porcine enterokinase (Sigma E0632-1.5KU) was added to the samples either at 5x or 20x dilution (corresponding to enterokinase concentrations of 40 µg/mL and 10 µg/mL, respectively). CaCl2 was also added to a 2 mM final concentration, and the reaction mixture was incubated overnight at room temperature.

[0238] Liquid Chromatography-Mass Spectrometry:

[0239] A Q-ToF mass spectrometer (Waters) with an electrospray interface (ESI) coupled to an Agilent 1100 HPLC equipped with an autosampler, column heater, and UV detector, was used for Liquid Chromatography-Mass Spectrometry (LC-MS) analysis. A CN-reversed phase column, which had an internal diameter (ID) of 2.1 mm, length of 150 mm, particle size of 5 µm, and pore size of 300 Å (Agilent, catalog number 883750-905) was used with a guard column (Agilent, catalog number 820950-923). The HPLC run was carried out at a temperature of 50° C. and the flowrate was maintained at 2° C. The HPLC buffers were 0.1% formic acid (mobile phase A) and 90% acetonitrile with 0.1% formic acid (mobile phase B). Approximately 4 µg of fusion protein sample was loaded onto the HPLC column. The HPLC running conditions were set at 95% mobile phase A while loading the sample. The fusion protein was eluted using a reversed phase gradient exemplified in Table 7.
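Several figures in the purification and cleavage steps above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 2.5 cm bed height assumed for the 5 mL HisTrap FF column is not stated in this document. Under that assumption, 100 cm/h corresponds to the stated 1.5 minute residence time, and the two enterokinase dilutions (5x giving 40 µg/mL, 20x giving 10 µg/mL) imply the same stock concentration.

```python
def residence_time_min(bed_height_cm: float, linear_velocity_cm_per_h: float) -> float:
    """Residence time in minutes for a packed bed run at a given linear velocity."""
    return bed_height_cm / linear_velocity_cm_per_h * 60.0


def implied_stock_conc(final_conc: float, dilution_factor: float) -> float:
    """Stock concentration implied by a final concentration after an N-fold dilution."""
    return final_conc * dilution_factor


def imidazole_mM(column_volumes_elapsed: float, start_mM: float = 50.0,
                 end_mM: float = 200.0, gradient_cv: float = 10.0) -> float:
    """Imidazole concentration along the linear elution gradient (50 to 200 mM over 10 CV)."""
    fraction = min(max(column_volumes_elapsed / gradient_cv, 0.0), 1.0)
    return start_mM + (end_mM - start_mM) * fraction


# Assumption: 2.5 cm bed height for a 5 mL HisTrap FF column.
print(residence_time_min(2.5, 100.0))   # minutes at 100 cm/h
print(implied_stock_conc(40.0, 5.0))    # ug/mL stock implied by the 5x dilution
print(implied_stock_conc(10.0, 20.0))   # ug/mL stock implied by the 20x dilution
print(imidazole_mM(5.0))                # mM imidazole halfway through the gradient
```

Both dilutions pointing to the same stock concentration is a quick consistency check on the stated enterokinase concentrations.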

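The reversed phase gradient of Table 7 is a list of time points with a mobile phase composition at each. A minimal sketch of reading the percentage of mobile phase B at an arbitrary time is given here, under stated assumptions: the time column is taken to be in minutes (the table does not give a unit), every segment is interpolated linearly, and the "Step" at 57.1 is therefore treated as a very short ramp rather than an instantaneous change. The function name is hypothetical.

```python
from bisect import bisect_right

# (time, % mobile phase B) pairs taken from Table 7; time unit assumed to be minutes.
GRADIENT = [
    (0.0, 5.0),
    (10.0, 10.0),
    (50.0, 65.0),
    (52.0, 100.0),
    (57.0, 100.0),
    (57.1, 5.0),
    (65.0, 5.0),
]


def percent_b(t: float) -> float:
    """Percent mobile phase B at time t, by linear interpolation between table rows."""
    if t < GRADIENT[0][0] or t > GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    times = [point[0] for point in GRADIENT]
    i = bisect_right(times, t) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]  # exactly at the final time point
    t0, b0 = GRADIENT[i]
    t1, b1 = GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


for t in (0.0, 30.0, 55.0, 65.0):
    b = percent_b(t)
    print(t, b, 100.0 - b)  # time, % B, % A
```

Percent mobile phase A follows as 100 minus percent B, which matches the rows of the table.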
TABLE 7
Reverse Phase Gradient for Mass Spectrometric Analysis of Purified Protein Sample

Time | % Mobile Phase A | % Mobile Phase B | Flow (ml/min) | Curve
0.0 | 95.0 | 5 | 0.2 |
10.0 | 90 | 10 | 0.2 | Linear
50.0 | 35 | 65 | 0.2 | Linear
52.0 | 0 | 100.0 | 0.2 | Linear
57.0 | 0 | 100.0 | 0.2 | Hold
57.1 | 95.0 | 5.0 | 0.2 | Step
65.0 | 95.0 | 5.0 | 0.2 | Hold

[0240] UV absorbance spectra were collected from 180 nm to 500 nm, prior to MS. The ESI-MS source was used in positive mode at 2.5 kV. MS scans were carried out using a range of 600-2600 m/z at 2 scans per second. MS and UV data were analyzed using MassLynx software (Waters). UV chromatograms and MS total ion current (TIC) chromatograms were generated. The MS spectra of the target peaks were summed. These spectra were deconvoluted using MaxEnt 1 (Waters) scanning for a molecular weight range of 2,800-6,000 (for PTH 1-34, which has a theoretical molecular weight of 4118 Da, and higher window for fusion proteins or N-terminal fusion partners), resolution of 1 Da per channel, and Gaussian width of 0.25 Da.

Results

[0241] Design of PTH 1-34 Gene Fusion Fragments:

[0242] To facilitate high level expression of PTH 1-34 fusion proteins, three folding modulators, DnaJ-like protein (SEQ ID NO: 2, cytoplasmic chaperone), FrnE (SEQ ID NO: 3, cytoplasmic PPIase) and FklB (SEQ ID NO: 4, periplasmic PPIase), from P. fluorescens, were selected based on high soluble expression, molecular weight less than 25 kDa and an isoelectric point (pI) significantly different than that of PTH 1-34 (which has a pI of 8.52). Characteristics of the folding modulators are shown in Table 8. As shown in Table 8, the pIs of DnaJ-like protein, FklB and FrnE, between 4.6 and 4.8, were well separated from that of PTH 1-34. This allowed for ready separation by ion exchange. To further aid the purification of the fusion proteins, a hexa-histidine tag was included in the linker. The linker also contained an enterokinase cleavage site (DDDDK) to facilitate separation of the N-terminal fusion partner from the desired PTH 1-34 polypeptide of interest. The amino acid sequences for the PTH 1-34 fusion proteins are shown in FIG. 2A (DnaJ-like protein-PTH, SEQ ID NO: 45), 2B (FklB-PTH, SEQ ID NO: 46), and 2C (FrnE-PTH, SEQ ID NO: 47). The amino acids corresponding to the linker are underlined and those corresponding to PTH 1-34 are italicized in FIGS. 2A, B, and C.

TABLE 8
Physicochemical Properties of Selected N-terminal Fusion Partners

Fusion Partner | Molecular Weight (Da) | Molar equivalent of 1 µg (pMoles) | Molar Extinction Coefficient | A280 nm for 1 mg/mL (AU-absorbance unit) | Isoelectric Point (pI) | Charge at pH 7
DnaJ-like protein (79 aa) | 9176.27 | 108.977 | 13370 | 1.46 | 4.83 | -5.04
FklB (206 aa) | 21770.89 | 45.933 | 17780 | 1.22 | 4.71 | -9.94
FrnE (218 aa) | 23945 | 41.762 | 24990 | 1.04 | 4.62 | -14.77

[0243] Construction of PTH Fusion Expression Vectors and HTP Expression:

[0244] Synthetic gene fragments encoding each of the three PTH fusion proteins listed in Table 6 were synthesized by DNA 2.0. The synthetic gene fragments were digested with SpeI and XhoI and ligated to pDOW1169 (digested with the same enzymes), generating the expression plasmids p708-004, p708-005 and p708-006. Following confirmation of the inserts, the plasmids were used to electroporate an array of P. fluorescens host strains and generate the expression strains listed in Table 4. The resulting transformed strains were grown and induced with IPTG and mannitol following the procedures described in the Materials and Methods. After induction the cells were harvested, sonicated, and centrifuged to separate soluble and insoluble fractions. Both the soluble and insoluble fractions were collected and analyzed using reduced SDS-CGE to measure PTH 1-34 fusion protein expression levels. A total of six strains, including two high HTP expressing strains for each of the three PTH 1-34 fusion proteins, were selected for shake flask expression. The strains screened using the shake flask expression method are listed in Table 9.

[0245] Shake Flask Expression:

[0246] Each of the six strains was grown and induced at 250 mL culture scale (4x250 mL cultures each) as described in the Materials and Methods (Shake Flask Expression) section. Following induction, samples from each culture (whole cell broth, WCB) were retained; a subset of the samples were diluted 3x with PBS, sonicated and centrifuged to produce soluble and insoluble fractions. The remainder of each culture was centrifuged to generate cell paste and a supernatant cell free broth (CFB). The cell paste was retained for purification. The WCB, CFB, and soluble fractions were evaluated by reduced SDS-CGE (FIG. 3).

[0247] Fusion proteins (bands corresponding to a molecular weight of about 14 kDa for the DnaJ-like protein-PTH fusion, and about 26 kDa for the FrnE-PTH and FklB-PTH fusions) were observed in the WCB and in the soluble fractions; no fusion protein was observed in the CFB. The shake flask expression titers for STR35984, STR36085, and STR36169 were 50% of the HTP expression titer, whereas the shake flask expression titers for the strains STR35970, STR36034, and STR36150 were 70-100% of that observed at HTP scale. The HTP and shake flask expression titers are listed in Table 9.

TABLE 9
HTP and Shake Flask Expression Titer of Selected PTH 1-34 Fusion Protein Expression Strains

Strain Barcode | Plasmid | Host Cell | Fusion Partner | Size (kDa) | HTP Expression Titer (g/L) | Shake Flask Expression Titer (g/L)
STR35970 | p708-004 | DC508-1 | DnaJ-like protein | 14 | 0.552 | 0.382
STR35984 | p708-004 | DC992.1-1 | DnaJ-like protein | 14 | 0.490 | 0.266
STR36034 | p708-005 | DC1106-1 | FklB | 26 | 0.672 | 0.573
STR36085 | p708-005 | PF1326.1-1 | FklB | 26 | 0.670 | 0.233
STR36150 | p708-006 | PF1219.9-1 | FrnE | 26 | 0.577 | 0.651
STR36169 | p708-006 | PF1331-1 | FrnE | 26 | 0.551 | 0.284
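The comparison between scales in Table 9 can be reproduced by dividing each shake flask titer by the corresponding HTP titer. The sketch below is illustrative only: the titers are transcribed from Table 9, while the dictionary and function names are hypothetical and not part of the described method.

```python
# Titers in g/L transcribed from Table 9: strain -> (HTP titer, shake flask titer).
TABLE_9_TITERS = {
    "STR35970": (0.552, 0.382),
    "STR35984": (0.490, 0.266),
    "STR36034": (0.672, 0.573),
    "STR36085": (0.670, 0.233),
    "STR36150": (0.577, 0.651),
    "STR36169": (0.551, 0.284),
}


def shake_flask_percent_of_htp(htp_titer, shake_flask_titer):
    """Shake flask titer expressed as a percentage of the HTP titer."""
    return 100.0 * shake_flask_titer / htp_titer


for strain, (htp, flask) in TABLE_9_TITERS.items():
    pct = shake_flask_percent_of_htp(htp, flask)
    print(f"{strain}: shake flask titer is {pct:.0f}% of the HTP titer")
```

Printing the per-strain percentages makes it easy to see how closely each strain tracks the rounded ranges quoted in the text.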

[0248] IMAC Purification of PTH Fusion Protein Expression Strains Grown in HTP and Shake Flask Scales, to Isolate PTH Fusion Proteins:

[0249] The cell pastes of the six strains were subjected to mechanical lysis and IMAC purification. Each purification run resulted in highly enriched fractions. Peak fractions derived from the DnaJ-like protein-PTH expression strain STR35970 were 60-80% pure, those from the FklB-PTH expression strain STR36034 were 60-90% pure and those from the FrnE-PTH expression strain STR36150 were 90-95% pure.

[0250] Enterokinase Cleavage of the PTH Fusion Proteins:

[0251] The highly pure, concentrated fractions from IMAC purification runs, containing the fusion proteins, were selected for enterokinase cleavage reaction to confirm that the N-terminal fusion partner could be cleaved from the PTH 1-34. Porcine-derived enterokinase was used for the study. Since the 4 kDa PTH 1-34 polypeptide of interest was not readily detectable by SDS-CGE, a molecular weight shift of the total fusion protein, from 14 kDa to 10 kDa for DnaJ-like protein-PTH fusion protein, and 26 kDa to 22 kDa for the FklB-PTH and FrnE-PTH fusion proteins, was accepted as evidence of enterokinase cleavage. The samples were treated with either 40 µg/mL or 10 µg/mL enterokinase overnight. Following enterokinase treatment, the samples were analyzed by SDS-CGE. As shown in FIG. 4 by the shift in MW compared with uncleaved samples (lanes 1-6), complete cleavage of the fusion partner from PTH 1-34 was observed when 40 µg/mL enterokinase was used (lanes 7-12) and partial cleavage was observed when 10 µg/mL enterokinase was used (lanes 13-18).

[0252] Intact Mass Analysis of PTH Fusion Proteins after Enterokinase Cleavage:

[0253] The DnaJ-like protein-PTH fusion protein, purified from strain STR35970, was used for additional enterokinase cleavage experiments and intact mass analysis. A purification fraction, containing the DnaJ-like protein-PTH fusion protein, derived from STR35970, was incubated with porcine enterokinase for 1 to 3 hours at room temperature followed by immediate intact mass analysis. As shown in FIG. 5, the C-terminal PTH 1-34 polypeptide was detected. Details of the intact mass analysis are summarized in Table 10. In addition to full length PTH 1-34, fragments corresponding to N-terminal deletions of 5 or 8 amino acids also were detected. The proteolysis observed was likely due to host cell protein contaminants or contaminants in the porcine enterokinase preparation. Recombinant enterokinase also can be used to evaluate cleavage, via similar steps. Observed and theoretical molecular weights (MW) are indicated in Table 10 for the major species detected by intact mass analysis. The retention time for the uncleaved fusion protein was about 33 minutes, compared to an average retention time of 27 minutes for the fusion proteins subjected to enterokinase cleavage for 1 to 3 hours.

TABLE 10

Intact Mass Results

Theoretical MW: DnaJ-like protein-PTH, 15207.95; PTH 1-34, 4117.8

Sample Name | Major Species, Observed MW | Observed minus Theoretical MW (DnaJ-like protein-PTH) | Observed minus Theoretical MW (PTH 1-34)
DnaJ-like protein-PTH fraction (about 3 hrs cleavage reaction) | 4118 | | 0.2
DnaJ-like protein-PTH fraction (about 1 hr cleavage) | 4118 | | 0.2
DnaJ-like protein-PTH fraction (about 2 hrs cleavage) | 4119 | | 1.2
DnaJ-like protein-PTH fraction (no cleavage reaction) | 15207 | -1.0 |
140116 PTH (Reagent Proteins Cat # RAB-391) | 4117 | | -0.8
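The mass differences reported in Table 10 are simply the observed mass minus the theoretical mass of the matching species. A minimal sketch of that arithmetic follows; the masses are transcribed from Table 10, whereas the parts-per-million figure and all function and variable names are additions for illustration and do not appear in the source.

```python
# Theoretical masses (Da) transcribed from Table 10.
THEORETICAL_MW = {
    "DnaJ-like protein-PTH": 15207.95,
    "PTH 1-34": 4117.8,
}

# (sample description, species the major peak is assigned to, observed mass in Da)
OBSERVATIONS = [
    ("about 3 hrs cleavage", "PTH 1-34", 4118.0),
    ("about 1 hr cleavage", "PTH 1-34", 4118.0),
    ("about 2 hrs cleavage", "PTH 1-34", 4119.0),
    ("no cleavage reaction", "DnaJ-like protein-PTH", 15207.0),
    ("PTH reference standard", "PTH 1-34", 4117.0),
]


def mass_error(observed, theoretical):
    """Return (difference in Da, difference in ppm); ppm is an added convenience."""
    delta = observed - theoretical
    return delta, delta / theoretical * 1e6


for sample, species, observed in OBSERVATIONS:
    delta, ppm = mass_error(observed, THEORETICAL_MW[species])
    print(f"{sample}: {species} {delta:+.2f} Da ({ppm:+.0f} ppm)")
```

Because the observed masses are reported to the nearest dalton, differences of a fraction of a dalton are within the reporting precision.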

Example II

Large-Scale Fermentation and Expression of PTH 1-34 Fusion Proteins

[0254] The PTH 1-34 fusion proteins described in Example I also were evaluated for large-scale expression in P. fluorescens, to identify a highly productive expression strain for the large-scale manufacture of PTH 1-34. The P. fluorescens strains screened in this study were the DnaJ-like protein-PTH fusion expression strains STR35970, STR35984, STR35949, STR36005, STR35985, the FklB-PTH fusion protein expression strains STR36034, STR36085, STR36098, and the FrnE-PTH fusion protein expression strains STR36150, STR36169, listed in Tables 11 and 12.

TABLE 11
DnaJ-like Protein-PTH Fusion Expression Strains for Large-scale Fermentation

Strain | Plasmid | Host
STR35949 | p708-004 | DC1084
STR35970 | p708-004 | DC508
STR35984 | p708-004 | DC992.1
STR35985 | p708-004 | PF1201.9
STR36005 | p708-004 | PF1326.1

TABLE 12
FrnE-PTH and FklB-PTH Fusion Expression Strains for Large-scale Fermentation

Strain | Plasmid | Host
STR36034 | p708-005 | DC1106
STR36085 | p708-005 | PF1326.1
STR36098 | p708-005 | PF1345.6
STR36150 | p708-006 | PF1219.9
STR36169 | p708-006 | PF1331

Materials and Methods

[0255] MBR Fermentation:

[0256] Shake flasks containing medium supplemented with yeast extract were inoculated with a frozen culture stock of the selected strain. For the minibioreactors (MBR), 250 mL shake flasks containing 50 mL of chemically defined medium supplemented with yeast extract were used. Shake flask cultures were incubated for 16 to 24 hours with shaking at 30° C. Aliquots from the shake flask cultures were used to seed the MBR (Pall Micro-24). The MBR cultures were operated at a volume of 4 mL in each 10 mL well of the disposable minibioreactor cassette under controlled conditions for pH, temperature, and dissolved oxygen. Cultures were induced with IPTG when the initial amount of glycerol contained in the medium was depleted. The fermentation was continued for 16 hours, and samples were collected and frozen for analysis.

[0257] CBR Fermentation:

[0258] The inocula for the 1 Liter CBR (conventional bioreactor) fermentor cultures were generated by inoculating a shake flask, containing 600 mL of chemically defined medium supplemented with yeast extract and glycerol, with a frozen culture stock of the selected strain. After 16 to 24 hours incubation, with shaking, at 32° C., equal aliquots from each shake flask culture were then aseptically transferred to each of an 8 unit multiplex fermentation system comprising 2 liter bioreactors (1 liter working volume). The fed-batch high cell density fermentation process consisted of a growth phase followed by an induction phase, initiated by the addition of IPTG once the culture reached the target optical density.

[0259] The induction phase of the fermentation was allowed to proceed for 8 hours, and analytical samples were withdrawn from the fermentor to determine cell density at 575 nm (OD575). The analytical samples were frozen for subsequent analyses to determine the level of fusion protein expression. After the completion of 8 hours of induction, the entire fermentation broth (approximately 0.8 L broth per 2 L bioreactor) of each vessel was harvested by centrifugation at 15,900xg for 60 to 90 minutes. The cell paste and supernatant were separated and the paste was frozen at -80° C.

[0260] Mechanical Homogenization and Purification:

[0261] Frozen cell paste (20 g), obtained from the CBR fermentation process, as described above, was thawed and resuspended in 20 mM sodium phosphate, 5% glycerol, 500 mM sodium chloride, 20 mM imidazole pH 7.4. The final volume of the suspension was adjusted to ensure that the concentration of solids was 20%. The material was then homogenized in two passes through a microfluidizer (Microfluidics, Inc., model M-110Y) at 15,000 psi. Lysates were centrifuged at 12,000xg for 30 minutes and filtered through a Sartorius Sartobran 150 (0.45/0.2 µm) filter capsule.

[0262] Chromatography:

[0263] Fast protein liquid chromatography (FPLC) operations were performed using AKTA explorer 100 chromatography systems (GE Healthcare) equipped with Frac-950 fraction collectors. Samples were loaded onto HisTrap FF, 10 mL columns (two 5 mL HisTrap FF cartridges, GE Healthcare, part number 17-5255-01, connected in series), washed, and eluted using a 10 column volume linear gradient of an elution buffer, by varying the imidazole concentration from 0 mM to 200 mM. Two milliliter volume fractions were collected.

[0264] Immobilized metal ion affinity chromatography (IMAC) purification was performed using Nickel IMAC (GE Healthcare, part number 17-5318-01). The analytical samples collected after CBR fermentation were separated into soluble and insoluble fractions. A 600 µL aliquot of the soluble fraction was incubated with 100 µL IMAC resin for one hour on a rocker at room temperature, and centrifuged for one minute at 12,000xg to pellet the resin. The supernatant was removed

and labeled as flow-through. The resin was then washed thrice with 1 mL of wash buffer containing 20 mM Na phosphate pH 7.3, 500 mM NaCl, 5% glycerol, and 20 mM imidazole. After the third wash, the resin was resuspended in 200 µL of the wash buffer containing 400 mM imidazole and centrifuged. The supernatant was collected and labeled as elution.

[0265] Enterokinase Cleavage:

[0266] PTH 1-34 fusion protein purification fractions were concentrated and resuspended in a buffer containing 20 mM Tris pH 7.4, 50 mM NaCl, and 2 mM CaCl2. Two units of enterokinase (Novagen cat # 69066-3, batch D00155747) were added to 100 µg protein in a 100 µL reaction. The mixture of fusion protein purification fraction and enterokinase was incubated for either one hour, or overnight, at room temperature. Control reactions with no enterokinase also were incubated for one hour or overnight, at room temperature. The enzyme reactions were stopped by the addition of complete protease inhibitor cocktail containing 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (AEBSF, Sigma cat # P8465).

Results

[0267] Fermentation Assessment of DnaJ-Like Protein-PTH, FklB-PTH and FrnE-PTH Fusion Expression Strains:

[0268] The five top expressing DnaJ-like protein-PTH fusion strains, three FklB-PTH expression strains, and two FrnE-PTH expression strains, listed in Tables 11 and 12, each were evaluated for fermentation, first in minibioreactors (MBR), and then in conventional bioreactors (CBR).

[0269] The soluble fraction from each MBR fermentation of the DnaJ-like protein-PTH fusion expression strains was analyzed by SDS-CGE, following the protocol described in the Materials and Methods section of Example I. The MBR fermentation yields for the DnaJ-like protein-PTH fusion expression strains are listed in Table 13. Overall, the strain with the highest MBR expression level of the soluble fusion protein was STR35949, at 2.1 g/L.

TABLE 13
Soluble Fusion Protein Yield for the DnaJ-like-hPTH Fusion Strains Tested in MBR Fermentors

Strain | Soluble Fusion Protein Yields
STR35949 | 0.6-2.1 g/L
STR36005 | 1.5 g/L
STR35970 | 1.1 g/L
STR35985 | 0.9 g/L

[0270] The DnaJ-like protein-PTH fusion strains were assessed for fermentation at the 1 L scale, in conventional bioreactors (CBR). CBR expression levels of the DnaJ-like protein-PTH fusion protein strains were comparable to the MBR levels, as shown in Table 14. The expression levels were higher at the 8-hour post-induction time points than at the 24-hour post-induction time points.

TABLE 14
Soluble Fusion Protein Yield for the DnaJ-like-hPTH 1-34 Fusion Strains, Evaluated in CBR Fermentors, at 8 (I8) and 24 (I24) Hours Post-induction

Strain | Soluble Fusion Protein Yields (I8) | Soluble Fusion Protein Yields (I24)
STR35949 | 1.5-2.4 g/L | 1.1-1.9 g/L
STR35970 | 2.0 g/L | 0.9 g/L
STR35985 | 1.7-2.4 g/L | 0.3-0.6 g/L
STR36005 | 2.1 g/L | 1.4 g/L

[0271] The soluble fractions from the MBR fermentations for the FklB-PTH and FrnE-PTH fusion expression strains were analyzed by SDS-CGE under reducing conditions (results shown in Table 15).

TABLE 15
Soluble Fusion Protein Yields for the FklB-hPTH 1-34 and FrnE-hPTH 1-34 Fusion Strains Evaluated in MBR Fermentors

Strain | Soluble Fusion Protein Yields
STR36085 | 6.4 g/L
STR36034 | 3.4-5.8 g/L
STR36098 | 3.4-4.7 g/L
STR36150 | 0.8-2.2 g/L

[0272] Overall, the strain with the highest expression level for the soluble fusion protein was STR36034 at 6.4 g/L. The same strains also were assessed for large scale fermentation in conventional bioreactors (CBR) (results shown in Table 16). The strain with the maximum yield, in CBR fermentation, was STR36034, expressing the FklB-PTH fusion protein at 6.7 g/L, after an induction period of 24 hours.

TABLE 16
Soluble Fusion Protein Yield for the FklB-hPTH 1-34 and FrnE-hPTH 1-34 Fusion Strains Evaluated in CBR Fermentors, at 24 (I24) Hours Post-induction

Strain | Soluble Fusion Protein Yields (I24)
STR36034 | 4.9-6.7 g/L
STR36085 | 4.6-4.9 g/L
STR36098 | 2.9-5.2 g/L
STR36150 | 2.6-3.8 g/L

[0273] Evaluation of Purification and Enterokinase Cleavage of DnaJ-Like Protein-PTH and FklB-PTH Fusion Proteins:

[0274] The cell paste obtained after induction of expression and growth in DnaJ-like protein-PTH fusion expression strain STR36005 was subjected to mechanical lysis and IMAC purification as described in the Materials and Methods. Each purification run resulted in highly enriched fractions. The purity of the peak fractions was 90% or higher.

[0275] Highly pure concentrated fractions of the DnaJ-like protein-PTH fusion protein purified from strain STR36005 were used for enterokinase cleavage testing to confirm that the N-terminal fusion partner could be cleaved from the PTH 1-34 polypeptide of interest. Recombinant bovine enterokinase was used for cleavage reactions. Soluble fractions from the analytical scale samples were used for a small scale batch enrichment of the fusion protein using IMAC resin (FIG. 6). After one hour of incubation with enterokinase, partial cleavage of the DnaJ-like protein fusion partner was observed (lanes 2-4). Cleavage was complete after overnight incubation (lanes 6-8).

[0276] The FklB-PTH fusion strains appeared to be robust at the 1 liter scale. Purification samples were further analyzed to confirm that the fusion protein could be enriched and cleaved with enterokinase. Soluble fractions from the analytical scale samples were used for a small scale batch enrichment of the fusion protein using IMAC resin. One enriched sample for each of the three expression strains, STR36034, STR36085, and STR36098, was treated with enterokinase and subjected to intact mass analysis using methods described in Example I. The PTH 1-34 polypeptide of interest was identified and observed to be of the correct mass, 4118 Da, for each sample, as shown in FIG. 7.

Example III

Construction of Enterokinase Fusions

[0277] DnaJ-like protein, FklB, and FrnE N-terminal fusion partner-Enterokinase fusion proteins were designed and expression constructs generated, for use in expressing recombinant Enterokinase (SEQ ID NO: 31).

[0278] Construction of Enterokinase Fusion Expression Plasmids:

[0279] Enterokinase (EK) fusion coding regions evaluated are listed in Table 17. The gene fragments encoding the fusion proteins were synthesized by DNA2.0. The fragments included SpeI and XhoI restriction enzyme sites, a "Hi" ribosome binding site, an 18 basepair spacer (5'-actagtaggaggtctaga-3') added upstream of the coding sequences, and three stop codons.

[0280] Standard cloning methods were used to construct expression plasmids. Plasmid DNA containing each enterokinase fusion coding sequence was digested using SpeI and XhoI restriction enzymes, then subcloned into SpeI-XhoI digested pDOW1169 expression vector containing the pTac promoter and rrnT1T2 transcriptional terminator. Inserts and vectors were ligated overnight with T4 DNA ligase (Fermentas EL0011), resulting in enterokinase fusion protein expression plasmids. The plasmids were electroporated into competent P. fluorescens DC454 host cells. Positive clones were screened for presence of enterokinase fusion protein sequence insert by PCR, using Ptac and Term sequence primers (AccuStart II, PCR SuperMix from Quanta, 95137-500).

TABLE 17
Enterokinase Fusion Proteins

Gene ID | Fusion Partner | Fusion Protein
EK1 | DnaJ-like protein (SEQ ID NO: 2) | DnaJ-like protein-Enterokinase (SEQ ID NO: 48)
EK2 | FklB (SEQ ID NO: 4) | FklB-Enterokinase (SEQ ID NO: 49)
EK4 | EcpD (SEQ ID NO: 65) | EcpD-Enterokinase (SEQ ID NO: 50)
EK5 | None | Enterokinase (SEQ ID NO: 51)

Example IV

Large-Scale Fermentation of Enterokinase Fusion Proteins (DnaJ-Like, FklB, FrnE N-Term Partners)

[0281] The expression strains described in Example III are tested for expression of recombinant protein by HTP analysis, following methods similar to those described in Example I.

[0282] Expression strains are selected for fermentation studies based on soluble fusion protein expression levels. The selected strains are grown and induced, and the induced cells are centrifuged, lysed, and centrifuged again as described above for the PTH 1-34 fusion proteins. The resulting insoluble fraction and soluble fractions are extracted using extraction conditions described above, and the EK fusion protein extract supernatants are quantitated using SDS-CGE.

Example V

High Throughput Screening of Strains Expressing Insulin Fusion Proteins

[0283] This study was conducted to test levels of recombinant protein produced by P. fluorescens strains expressing proinsulin fusion proteins comprising DnaJ-like protein, EcpD, FklB, FrnE, or a truncation of EcpD, FklB, or FrnE as the N-terminal fusion partner.

Materials and Methods

Construction of Proinsulin Expression Vectors:

[0284] Optimized gene fragments encoding proinsulin (insulin glargine) were synthesized by DNA 2.0 (Menlo Park, Calif.). Gene fragments and proinsulin amino acid sequences encoded by the proinsulin coding sequences contained within the gene fragments are listed in Table 18. Each gene fragment contained peptide A and B coding sequences, and one of four different glargine C peptide sequences: CP-A (MW=9336.94 Da; pI=5.2; 65% of A+B Glargine), CP-B (MW=8806.42 Da; 69% of A+B Glargine), CP-C (MW=8749.32 Da; 69% of A+B Glargine), and CP-D (MW=7292.67 Da; 83% of A+B Glargine). The gene fragments were designed with SapI restriction enzyme sites added upstream and downstream of the proinsulin coding sequences to enable the rapid cloning of the gene fragments into various expression vectors. The gene fragments also included, within the 5' flanking region, either a lysine amino acid codon (AAG) or an arginine amino acid codon (CGA), to facilitate ligation into expression vectors containing an enterokinase cleavage site or a trypsin cleavage site, respectively. In addition, three stop codons (TGA, TAA, TAG) were included within the 3' flanking region of all the gene fragments.
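The "% of A+B Glargine" figures quoted for the four proinsulin variants can be approximated as the mass of mature glargine (A and B chains) divided by the mass of each proinsulin. The sketch below is an illustration under a stated assumption: the mature glargine mass of about 6063 Da is a commonly cited value that is not given in this text, and the names used are hypothetical.

```python
# Assumption: mature insulin glargine (A + B chains) is about 6063 Da.
# This value is NOT stated in the source text; it is used here for illustration only.
GLARGINE_AB_DA = 6063.0

# Proinsulin masses (Da) for each C-peptide variant, as quoted in the text.
PROINSULIN_MW_DA = {
    "CP-A": 9336.94,
    "CP-B": 8806.42,
    "CP-C": 8749.32,
    "CP-D": 7292.67,
}


def percent_mature(proinsulin_da, mature_da=GLARGINE_AB_DA):
    """Approximate share of the proinsulin mass contributed by the A and B chains."""
    return 100.0 * mature_da / proinsulin_da


for variant, mw in PROINSULIN_MW_DA.items():
    print(f"{variant}: about {percent_mature(mw):.0f}% of the proinsulin mass is A+B glargine")
```

Shorter C-peptides leave a larger share of each proinsulin molecule as mature product, which is why CP-D gives the highest percentage.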

TABLE 18
Proinsulin Gene Fragments and C-peptide Amino Acid Sequences

Gene Fragment | Nucleotide Sequence | Glargine B-peptide | Glargine C-Peptide | Glargine A-Peptide | Proinsulin Amino Acid Sequence | pI | MW (kDa)
G737-001 | SEQ ID NO: 80 | SEQ ID NO: 93 | CP-A (SEQ ID NO: 97) | SEQ ID NO: 92 | SEQ ID NO: 88 | 5.2 | 9.34
G737-002 | SEQ ID NO: 81 | SEQ ID NO: 93 | CP-B (SEQ ID NO: 98) | SEQ ID NO: 92 | SEQ ID NO: 89 | 6.07 | 8.81
G737-003 | SEQ ID NO: 82 | SEQ ID NO: 93 | CP-C (SEQ ID NO: 99) | SEQ ID NO: 92 | SEQ ID NO: 90 | 5.52 | 8.75
G737-007 | SEQ ID NO: 83 | SEQ ID NO: 93 | CP-D (SEQ ID NO: 100) | SEQ ID NO: 92 | SEQ ID NO: 91 | 6.07 | 7.29
G737-009 | SEQ ID NO: 84 | SEQ ID NO: 93 | CP-A (SEQ ID NO: 97) | SEQ ID NO: 92 | SEQ ID NO: 88 | 5.2 | 9.34
G737-017 | SEQ ID NO: 85 | SEQ ID NO: 93 | CP-B (SEQ ID NO: 98) | SEQ ID NO: 92 | SEQ ID NO: 89 | 6.07 | 8.81
G737-018 | SEQ ID NO: 86 | SEQ ID NO: 93 | CP-C (SEQ ID NO: 99) | SEQ ID NO: 92 | SEQ ID NO: 90 | 5.52 | 8.75
G737-031 | SEQ ID NO: 87 | SEQ ID NO: 93 | CP-D (SEQ ID NO: 100) | SEQ ID NO: 92 | SEQ ID NO: 91 | 6.07 | 7.29
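The flanking codons described for these gene fragments (AAG or CGA at the 5' end, and TGA, TAA, TAG at the 3' end) can be checked against the standard genetic code. The following is a small sketch; the codon assignments are standard, while the pairing of lysine with enterokinase vectors and arginine with trypsin vectors simply mirrors the description above, and the helper names are hypothetical.

```python
# Subset of the standard genetic code covering the codons named in the text.
CODON_TO_RESIDUE = {
    "AAG": "Lys",
    "CGA": "Arg",
    "TGA": "Stop",
    "TAA": "Stop",
    "TAG": "Stop",
}

# Junction residue paired with each cleavage site, as described in the text.
JUNCTION_RESIDUE = {
    "enterokinase": "Lys",  # final residue of the DDDDK recognition sequence
    "trypsin": "Arg",
}


def junction_codon_matches(codon, protease):
    """True if the 5' flanking codon encodes the residue paired with the protease."""
    return CODON_TO_RESIDUE.get(codon.upper()) == JUNCTION_RESIDUE[protease]


def all_stop_codons(codons):
    """True if every codon in the 3' flanking list is a stop codon."""
    return all(CODON_TO_RESIDUE.get(c.upper()) == "Stop" for c in codons)


print(junction_codon_matches("AAG", "enterokinase"))
print(junction_codon_matches("CGA", "trypsin"))
print(all_stop_codons(["TGA", "TAA", "TAG"]))
```

A check of this kind is useful before ordering synthetic fragments, since a mismatched junction codon would put the wrong residue next to the cleavage site.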

[0285] The proinsulin coding sequences were then subcloned into expression vectors containing different fusion partners (Table 19), by ligating the coding sequences into expression vectors using T4 DNA ligase (New England Biolabs, M0202S). The ligated vectors were electroporated in 96-well format into competent DC454 P. fluorescens cells.

TABLE 19
Vectors for Glargine Proinsulin Fusion Protein Expression

Expression Vector | N-terminal Fusion Partner-Cleavage Site | Amino Acid Sequence (SEQ ID NO) | Nucleic Acid Sequence (SEQ ID NO) | Protein Size (kDa) | pI
pFNX4401 | DnaJ-like protein-Trypsin | SEQ ID NO: 101 | SEQ ID NO: 202 | 10.67 | 6.03
pFNX4402 | EcpD1-Trypsin | SEQ ID NO: 102 | SEQ ID NO: 203 | 28.52 | 9.15
pFNX4403 | EcpD2-Trypsin | SEQ ID NO: 104 | SEQ ID NO: 204 | 12.25 | 9.78
pFNX4404 | EcpD3-Trypsin | SEQ ID NO: 105 | SEQ ID NO: 205 | 7.04 | 9.70
pFNX4405 | FklB-Trypsin | SEQ ID NO: 106 | SEQ ID NO: 206 | 23.27 | 5.41
pFNX4406 | FklB2-Trypsin | SEQ ID NO: 107 | SEQ ID NO: 207 | 12.07 | 6.04
pFNX4407 | FklB3-Trypsin | SEQ ID NO: 108 | SEQ ID NO: 208 | 6.85 | 6.28
pFNX4408 | FrnE-Trypsin | SEQ ID NO: 109 | SEQ ID NO: 209 | 25.44 | 5.12
pFNX4409 | FrnE2-Trypsin | SEQ ID NO: 110 | SEQ ID NO: 210 | 12.7 | 5.85
pFNX4410 | FrnE3-Trypsin | SEQ ID NO: 111 | SEQ ID NO: 211 | 7.17 | 5.90
pFNX4411 | DnaJ-like protein-EK | SEQ ID NO: 112 | SEQ ID NO: 212 | 11.11 | 5.32
pFNX4412 | EcpD1-EK | SEQ ID NO: 113 | SEQ ID NO: 213 | 28.95 | 7.26
pFNX4413 | EcpD2-EK | SEQ ID NO: 114 | SEQ ID NO: 214 | 12.68 | 8.05
pFNX4414 | EcpD3-EK | SEQ ID NO: 115 | SEQ ID NO: 215 | 7.48 | 7.22
pFNX4415 | FklB-EK | SEQ ID NO: 116 | SEQ ID NO: 216 | 23.70 | 4.99
pFNX4416 | FklB2-EK | SEQ ID NO: 117 | SEQ ID NO: 217 | 12.49 | 5.19
pFNX4417 | FklB3-EK | SEQ ID NO: 118 | SEQ ID NO: 218 | 7.28 | 5.22
pFNX4418 | FrnE-EK | SEQ ID NO: 119 | SEQ ID NO: 219 | 25.88 | 4.84
pFNX4419 | FrnE2-EK | SEQ ID NO: 120 | SEQ ID NO: 220 | 13.13 | 5.17
pFNX4420 | FrnE3-EK | SEQ ID NO: 121 | SEQ ID NO: 221 | 7.60 | 4.99
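Tables 18 and 19 together give the sizes needed to estimate what fraction of a given fusion protein is proinsulin, which in turn converts a fusion protein titer into a proinsulin titer. The sketch below assumes that the fusion protein mass is approximately the sum of the fusion partner (with linker) and proinsulin masses; the example fusion titer of 1000 mg/L is hypothetical, as are the function names.

```python
def payload_fraction(partner_kda, payload_kda):
    """Mass fraction of the fusion protein contributed by the payload (e.g. proinsulin).

    Assumes fusion mass ~ partner mass + payload mass (an approximation).
    """
    return payload_kda / (partner_kda + payload_kda)


def payload_titer(fusion_titer_mg_per_l, partner_kda, payload_kda):
    """Payload titer = fusion protein titer x payload mass fraction."""
    return fusion_titer_mg_per_l * payload_fraction(partner_kda, payload_kda)


# Sizes from Table 19 (fusion partner with linker) and Table 18 (proinsulin).
FKLB_EK_KDA = 23.70        # pFNX4415, FklB-EK
PROINSULIN_CP_A_KDA = 9.34  # G737-001, CP-A proinsulin

fraction = payload_fraction(FKLB_EK_KDA, PROINSULIN_CP_A_KDA)
print(f"Proinsulin is about {fraction:.0%} of the FklB-EK fusion by mass")

# Hypothetical fusion titer, for illustration only (not a measured value).
example_fusion_titer = 1000.0  # mg/L
print(f"Proinsulin titer: about {payload_titer(example_fusion_titer, FKLB_EK_KDA, PROINSULIN_CP_A_KDA):.0f} mg/L")
```

The same calculation shows why shorter fusion partners are attractive: for a fixed fusion protein titer, a smaller partner leaves a larger share of the mass as product.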

[0286] Growth and Expression in 96 Well Format (HTP):

[0287] The plasmids containing proinsulin coding sequences and the fusion partners were transformed into a P. fluorescens DC454 host strain. Twenty-five microliters of competent cells were thawed, transferred into a 96-multi-well Nucleovette® plate (Lonza VHNP-1001) and mixed with the ligation mixture prepared in the previous step. The electroporation was carried out using the Nucleofector™ 96-well Shuttle™ system (Lonza AG) and the transformed cells were then transferred to 96-well deep well plates (seed plates) with 400 µL M9 salts 1% glucose medium and trace elements. The seed plates were incubated at 30° C. with shaking for 48 hours to generate seed cultures.

[0288] Ten microliters of seed culture were transferred in duplicate into fresh 96-well deep well plates, each well containing 500 µL of HTP medium (Teknova 3H1129), supplemented with trace elements and 5% glycerol, and incubated at 30° C. with shaking for 24 hours. Isopropyl-β-D-1-thiogalactopyranoside (IPTG) was added to each well at a final concentration of 0.3 mM to induce expression of the proinsulin fusion proteins. In addition, 0.01 µL of 250 units/µL stock Benzonase (Novagen, 70746-3) was added per well at time of

induction to reduce the potential for culture viscosity. Cell TABLE 20 density was quantified by measuring optical density at 600 nm (ODoo), 24hours after induction. Twenty four hours after HTP Expression Titer of Exemplary Proinsulin Fusion Proteins induction, cells were harvested, diluted 1:3 with 1xPBS to a Proinsulin Total final volume of 400 ul, and then frozen for later processing. Gene C-peptide Proinsulin Fragment N-terminal Fusion Sequence Soluble titer (mg/L) 0289 Soluble Lysate Sample Preparation for Analytical (SEQID Partner-Cleavage Site (SEQID Proinsulin (Soluble + Characterization: NOS in (SEQID NOS in NOS in titer Insoluble Table 18) Table 19) Table 18) (mg/L) Fractions) 0290 The culture broth samples, prepared and stored fro G737-001 DnaJ-like protein-EK CP- 66 235 Zen as described above, were thawed, diluted, and Sonicated. G737-002 DnaJ-like protein-EK CP-B 81 241 The lysates obtained by sonication were centrifuged at G737-003 DnaJ-like protein-EK CP-C 88 267 5,500xg for 15 minutes, at a temperature of 8°C., to separate G737-007 DnaJ-like protein-EK CP-D 50 499 the soluble (supernatant) and insoluble (pellet) fractions. The G737-009 DnaJ-like protein-Trypsin CP-A 9 136 G737-017 DnaJ-like protein-Trypsin CP-B 7 81 insoluble fractions were resuspended in PBS using sonica G737-018 DnaJ-like protein-Trypsin CP-C 21 331 tion. G737-031 DnaJ-like protein-Trypsin CP-D 10 487 G737-001 FkB-EK CP-A 50 445 0291 SDS-CGE Analysis: G737-002 FkB-EK CP-B 38 321 G737-003 FkB-EK CP-C 33 210 0292. The test protein samples prepared as discussed G737-007 FkB-EK CP-D 10 343 above were analyzed by HTP microchip SDS capillary gel G773-009 FklB-Trypsin CP-A 8 578 electrophoresis using a LabChip GXII instrument (Perki G737-017 FklB-Trypsin CP-B 23 375 nElmer) with a HT Protein Express V2 chip and correspond G737-018 FklB-Trypsin CP-C 18 59 G737-O31 FklB-Trypsin CP-D 10 321 ing reagents (Part Numbers 760499 and 760328, respectively, G737-001 FkB2-EK CP-A 7 528 PerkinElmer). 
Samples were prepared following manufactur G737-002 FkB2-EK CP-B 46 60 er's protocol (Protein User Guide Document No. 450589, G737-003 FkB2-EK CP-C 36 69 Rev. 3). In a 96-well conical well PCR plate, 4 uL sample G737-007 FkB2-EK CP-D 22 339 G773-009 FklB2-Trypsin CP-A O 658 were mixed with 14 ul of sample buffer, with or without a G737-017 FklB2-Trypsin CP-B 6 92 Dithiotreitol (DTT) reducing agent. The mixture was heated G737-018 FklB2-Trypsin CP-C 6 2O at 95°C. for 5 min and diluted by adding 70LL of deionized G737-O31 FklB2-Trypsin CP-D 1 193 Water. G737-001 FkB3-EK CP-A 3 565 G737-002 FkB3-EK CP-B O 109 0293. The proinsulin titer at the 96-well scale was deter G737-003 FkB3-EK CP-C 1 26 mined based on the fusion protein titer multiplied by the G737-007 FkB3-EK CP-D O 12 G737-009 FklB3-Trypsin CP-A 2 222 percentage of the fusion protein comprised of proinsulin. G737-017 FklB3-Trypsin CP-B 9 108 Total titer represents the sum of soluble and insoluble target G737-018 FklB3-Trypsin CP-C 7 70 expression (mg/L). G737-O31 FklB3-Trypsin CP-D 5 457 G737-001 FrnE-EK CP-A 132 258 G737-007 FrnE-EK CP-D 6 52 Results G737-009 FrnE-Trypsin CP-A 30 65 G737-017 FrnE-Trypsin CP-B 41 63 G737-018 FrnE-Trypsin CP-C 43 56 0294 As shown in Table 20, the glargine proinsulin fusion G737-O31 FrnE-Trypsin CP-D 3 218 proteins having DnaJ-like protein as the N-terminal fusion G737-009 FrnE2-Trypsin CP-A 2O 96 partner showed the highest levels of proinsulin expression. G737-017 FrnE2-Trypsin CP-B 6 39 Surprisingly, proinsulin fusion proteins containing the Small G737-018 FrnE2-Trypsin CP-C 3 53 G737-007 FrnE2-EK CP-D O 219 est version of EcpD fusion partner, the 50 amino acid fusion G737-O31 FrnE2-Trypsin CP-D 5 2O1 partner EcpD3, showed higher levels of expression compared G737-001 FrnE3-EK CP-A 8 266 to full length fusion partner EcpD1 and the 100 amino acid G737-002 FrnE3-EK CP-B O 248 truncated version EcpD2. 
For proinsulin fusion proteins con G737-003 FrnE3-EK CP-C 9 171 G737-007 FrnE3-EK CP-D 3 161 taining an FklB or FrnEN-terminal fusion partner, the expres G773-009 FrnE3-Trypsin CP-A 8 144 sion of proinsulin fused to the Smallest fusion partner frag G737-017 FrnE3-Trypsin CP-B 8 49 ment, Fk1B3 and FrnE3 respectively, was equal to or slightly G737-018 FrnE3-Trypsin CP-C 7 22 lower than expression of the constructs having the longer G737-O31 FrnE3-Trypsin CP-D 7 307 G737-001 EcpD1-EK CP-A 9 194 N-terminal fusion partners. Table 20 Summarizes proinsulin G737-002 EcpD1-EK CP-B 5 131 protein titers, both soluble and total, observed during the high G737-003 EcpD1-EK CP-B 5 132 throughput expression study. G737-007 EcpD1-EK CP-D 5 22 G773-009 EcpD1-Trypsin CP-A 21 86 0295 Therefore, mature glargine was determined to be G737-017 EcpD1-Trypsin CP-B 16 39 successfully released from the purified fusion protein (and the G737-018 EcpD1-Trypsin CP-C 27 74 G737-O31 EcpD1-Trypsin CP-D 4 2O6 C-peptide) following trypsin cleavage. IMAC enrichment G737-001 EcpD2-EK CP-A 16 21 followed by trypsin cleavage performed on selected fusion G737-002 EcpD2-EK CP-B 9 24 proteins (DnaJ construct G737-031 and FklB construct G737-003 EcpD2-EK CP-C 9 29 G737-009, purified in the presence of non-denaturing con G737-007 EcpD2-EK CP-D 9 60 G773-009 EcpD2-Trypsin CP-A 18 125 centration of urea, and Frn E1 construct G737-018, purified G737-017 EcpD2-Trypsin CP-B 6 9 without urea) demonstrated that the fusion protein was G737-018 EcpD2-Trypsin CP-C 7 34 cleaved to produce mature insulinas evaluated by SDS-PAGE G737-O31 EcpD2-Trypsin CP-D 5 33 or SDS-CGE, compared to a glargine standard. Receptor G737-001 EcpD3-EK CP-A 8 81 binding assays further indicated activity. US 2016/0159877 A1 Jun. 9, 2016 42

TABLE 20-continued
HTP Expression Titer of Exemplary Proinsulin Fusion Proteins

| Proinsulin Gene Fragment (SEQ ID NOS in Table 18) | N-terminal Fusion Partner-Cleavage Site (SEQ ID NOS in Table 19) | C-peptide Sequence (SEQ ID NOS in Table 18) | Soluble Proinsulin titer (mg/L) | Total Proinsulin titer (mg/L) (Soluble + Insoluble Fractions) |
|---|---|---|---|---|
| G737-002 | EcpD3-EK | CP-B | 15 | 18 |
| G737-003 | EcpD3-EK | CP-C | 17 | 64 |
| G737-007 | EcpD3-EK | CP-D | 10 | 169 |
| G773-009 | EcpD3-Trypsin | CP-A | 8 | 40 |
| G737-017 | EcpD3-Trypsin | CP-B | 9 | 9 |
| G737-018 | EcpD3-Trypsin | CP-C | 10 | 12 |
| G737-031 | EcpD3-Trypsin | CP-D | 7 | 57 |

Example VI

High Throughput Screening of GCSF Fusion Proteins

[0296] This study was conducted to test levels of recombinant GCSF protein produced by P. fluorescens strains expressing GCSF fusion proteins containing DnaJ-like protein, varying lengths of FklB (FklB, FklB2, or FklB3), FrnE (FrnE, FrnE2, or FrnE3), or EcpD (EcpD1, EcpD2, or EcpD3) as the N-terminal fusion partner.

Materials and Methods

[0297] Construction of GCSF Expression Vectors:

[0298] A GCSF gene fragment (SEQ ID NO: 68), containing an optimized gcsf coding sequence, recognition sequences for restriction enzyme SapI both downstream and upstream to the coding sequence, and three stop codons downstream to the coding sequence, was synthesized by DNA2.0 (Menlo Park, Calif.). The GCSF gene fragment of plasmid p201:207232 was digested with restriction enzyme SapI to generate fragments containing the optimized gcsf coding sequence. The gcsf coding sequence was then subcloned into expression vectors containing different fusion partners, by ligation of the GCSF gene fragment and the expression vectors using T4 DNA ligase (Fermentas EL0011), and electroporated in 96-well format into competent P. fluorescens DC454 host cells. A hexahistidine tag was included in a linker between the GCSF and each N-terminal fusion partner along with an enterokinase cleavage site (DDDK) for releasing the N-terminal fusion partner from the GCSF. The resulting plasmids containing the fusion protein constructs are listed in the third column of Table 21.

TABLE 21
Plasmids for GCSF Fusion Protein Expression

| Expression Vector | Fusion Partner-Cleavage Site | Expression Plasmid | Fusion Partner Size (kDa) | GCSF Size (kDa) | % GCSF of Fusion | Fusion Protein Size (kDa) |
|---|---|---|---|---|---|---|
| pFNX4411 | DnaJ-like protein-EK | p529-301 | 11 | 19 | 0.63 | 30 |
| pFNX4412 | EcpD1-EK | p529-302 | 29 | 19 | 0.40 | 48 |
| pFNX4413 | EcpD2-EK | p529-303 | 13 | 19 | 0.60 | 32 |
| pFNX4414 | EcpD3-EK | p529-304 | 7 | 19 | 0.72 | 27 |
| pFNX4415 | FklB-EK | p529-305 | 24 | 19 | 0.45 | 43 |
| pFNX4416 | FklB2-EK | p529-306 | 12 | 19 | 0.61 | 32 |
| pFNX4417 | FklB3-EK | p529-307 | 7 | 19 | 0.73 | 27 |
| pFNX4418 | FrnE-EK | p529-308 | 26 | 19 | 0.43 | 45 |
| pFNX4419 | FrnE2-EK | p529-309 | 13 | 19 | 0.59 | 32 |
| pFNX4420 | FrnE3-EK | p529-310 | 8 | 19 | 0.72 | 27 |

[0299] Growth and Expression in 96 Well Format (HTP):

[0300] The plasmids containing coding sequences for the gcsf gene and the N-terminal fusion partners were transformed into an array of P. fluorescens host strains. Thirty-five microliters of P. fluorescens competent cells were thawed and mixed with 10 µL of 10x diluted plasmid DNA (2.5 ng). Twenty-five microliters of the mixture was transferred into a 96-multi-well Nucleovette® plate (Lonza VHNP-1001) for transformation via electroporation, using the Nucleofector™ 96-well Shuttle™ system (Lonza AG), and the transformed cells were then transferred to 96-well deep well plates (seed plates) containing 500 µL of M9 salts 1% glucose medium and trace elements. The seed plates were incubated at 30° C. with shaking for 48 hours to generate seed cultures.

[0301] Ten microliters of seed culture were transferred in duplicate into fresh 96-well deep well plates, each well containing 500 µL of HTP medium (Teknova 3H1129), supplemented with trace elements and 5% glycerol, and incubated at 30° C. with shaking for 24 hours. Isopropyl-β-D-1-thiogalactopyranoside (IPTG) was added to each well at a final concentration of 0.3 mM to induce expression of the GCSF fusion proteins. In Pseudomonas strains over-expressing folding modulators (FMO strains), mannitol (Sigma, M1902) at a final concentration of 1% was added along with the IPTG, to induce expression of the folding modulators. In addition, 0.01 µL of 250 units/µL stock Benzonase (Novagen, 70746-3) was added per well at the time of induction to reduce the potential for culture viscosity. Cell density was quantified by measuring optical density at 600 nm (OD600) 24 hours after induction. Twenty-four hours after induction, cells were harvested, diluted 1:3 with 1×PBS to a final volume of 400 µL, and then frozen for later processing.

Soluble Lysate Sample Preparation for Analytical Characterization:

[0302] The culture broth samples, prepared and frozen as described above, were thawed, diluted and sonicated using a Cell Lysis Automated Sonication System (CLASS, Scinomix) with a 24 probe tip horn. The lysates obtained by sonication were centrifuged at 5,500×g for 15 minutes, at a temperature of 8° C., to separate the soluble (supernatant) and insoluble (pellet) fractions. The insoluble fractions were resuspended in 400 µL of PBS, at pH 7.4, also by sonication.

[0303] SDS-CGE Analysis:

[0304] The test protein samples prepared as discussed above were analyzed by HTP microchip SDS capillary gel electrophoresis using a LabChip GXII instrument (Caliper LifeSciences) with a HT Protein Express v2 chip and corresponding reagents (Part Numbers 760499 and 760328, respectively, Caliper LifeSciences). Samples were prepared following the manufacturer's protocol (Protein User Guide Document No. 450589, Rev. 3). In a 96-well conical well PCR plate, 4 µL of sample was mixed with 14 µL of sample buffer, with or without a dithiothreitol (DTT) reducing agent. The mixture was heated at 95° C. for 5 min and diluted by adding 70 µL of deionized water. In parallel with the test protein samples, lysates from strains containing no fusion protein (null strains) were also analyzed. The null strain lysates were quantified using the system internal standard without background subtraction. One sample per strain was quantitated during the HTP screen; typically the standard deviation of the SDS-CGE method is ~10%.

Results

[0305] High level expression of GCSF was achieved at the 96-well scale using the fusion partner approach, which presents an alternative to screening protease deficient hosts in order to identify strains that enable high level expression of N-terminal Met-GCSF. Fusion protein and GCSF titers (calculated based on the percent GCSF of total fusion protein, by MW) are shown in Table 22. Wild-type strain DC454 produced 484 mg/L fusion protein, and 305 mg/L GCSF, with the DnaJ fusion partner. All fusion partner constructs yielded fusion protein titers of over 100 mg/L, as shown in Table 22. These high levels observed at the HTP scale show great promise for expression at shake flask or fermentation scale. Furthermore, it is common to observe a significant increase in volumetric titer between HTP and larger scale cultures. In a previous study, the prtB protease deficient strain was shown to enable expression of ~247 mg/L Met-GCSF at the 0.5 mL scale (H. Jin et al., 2011, Protein Expression and Purification 78:69-77, and U.S. Pat. No. 8,455,218). In the present study, as described, expression of a high level of Met-GCSF as part of a fusion protein was observed even in a host cell having no protease deficiency. It is noted that a preparation of Met-GCSF, obtained by expressing as part of any of the described fusion proteins and releasing by protease cleavage, contains virtually 100% Met-GCSF (and no des-Met-GCSF), as cleavage is carried out following the removal of any proteases.

TABLE 22
HTP Expression Titer of GCSF Fusion Proteins

| Fusion Partner-Cleavage Site | Fusion Titer (mg/L) | % Target in Fusion | GCSF Titer (mg/L) |
|---|---|---|---|
| DnaJ-like protein-EK | 155-758 | 63 | 98-478 |
| EcpD1-EK (FL EcpD) | 247-542 | 40 | 96-211 |
| EcpD2-EK | 101-112 | 60 | 61-67 |
| EcpD3-EK | 137-249 | 72 | 99-179 |
| FklB1-EK (FL FklB) | 226-565 | 44 | 99-249 |
| FklB2-EK | 171-362 | 60 | 103-217 |
| FklB3-EK | 79-145 | 72 | 57-104 |
| FrnE1-EK (FL FrnE) | 241-763 | 42 | 101-320 |
| FrnE2-EK | | 59 | |
| FrnE3-EK | 141-260 | 71 | 100-185 |

[0306] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
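The titer arithmetic described for the GCSF screen (GCSF titer taken as the fusion protein titer multiplied by the GCSF share of the fusion, by molecular weight) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the applicants' software: the function names are hypothetical, and rounding to whole mg/L is an assumption chosen because it reproduces the tabulated values for the DnaJ-like protein-EK construct.

```python
GCSF_KDA = 19  # GCSF size in kDa, as listed in Table 21


def gcsf_fraction(fusion_kda: float, gcsf_kda: float = GCSF_KDA) -> float:
    """Share of the fusion protein mass contributed by GCSF (hypothetical helper)."""
    return gcsf_kda / fusion_kda


def gcsf_titer(fusion_titer_mg_l: float, percent_target: float) -> int:
    """GCSF titer in mg/L; rounding to a whole number is an assumption."""
    return round(fusion_titer_mg_l * percent_target / 100)


# DnaJ-like protein-EK: 30 kDa fusion, 63% target, fusion titer range 155-758 mg/L
print(round(gcsf_fraction(30), 2))               # fraction of the fusion that is GCSF
print(gcsf_titer(155, 63), gcsf_titer(758, 63))  # low and high end of the GCSF titer range
print(gcsf_titer(484, 63))                       # the DC454 result quoted in the Results
```

The same two helpers apply to any other row once its fusion size or percent target is supplied; they say nothing about how the fusion titers themselves were measured.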

Table of Sequences

| SEQ ID NO. | Protein/Gene Name | Sequence |
|---|---|---|
| 1 | PTH 1-34 | SVSEIQLMHNLGKHLNSMERVEWLRKKLQDVHNF |
| 2 | DnaJ-like protein (P. fluorescens) | MKVEPGLYQHYKGPQYRVFSVARHSETEEEVVFYQALYGEYGFWVRPLSMFLETVEVDGEQVPRFALVTAEPSLFTGQ |
| 3 | FrnE (P. fluorescens) | MSTPLKIDFVSDVSCPWCIIGLRGLTEALDQLGSEVQAEIHFQPFELNPNMPAEGQNIVEHITEKYGSTAEESQANRARIRDMGAALGFAFRTDGQSRIYNTFDAHRLLHWAGLEGLQYNLKEALFKAYFSDGQDPSDHATLAIIAESVGLDLARAAEILASDEYAAEVREQEQLWVSRGVSSVPTIVFNDQYAVSGGQPAEAFVGAIRQIINESKS |
| 4 | FklB (P. fluorescens) RXF06034.1 (full-length) | MSEVNLSTDETRVSYGIGRQLGDQLRDNPPPGVSLDAILAGLTDAFAGKPSRVDQEQMAASFKVIREIMQAEAAAKAEAAAGAGLAFLAENAKRDGITTLASGLQFEVLTAGTGAKPTREDQVRTHYHGTLIDGTVFDSSYERGQPAEFPVGGVIAGWTEALQLMNAGSKWRVYVPSELAYGAQGVGSIPPHSVLVFDVELLDVL |

Table of Sequences - Continued

| SEQ ID NO. | Protein/Gene Name | Sequence |
|---|---|---|
| | (continued from preceding entry) | CAACGCTATCGCAGTGAGGACGGCATGGTCGGCCCTGGGGAAACCCGGCAGTTCGCGCTGCCCACGCTCAAGGCCAGGCCGTCGAGCCAGGCACAAGTGGAGTTCAGCGCCATCAACGATTACGGCGCGTTGGTCCCGACCCGCAACACGCTGCAGCCCGGTGGGGGTGGGTCGGGTGGTGGTGGGTCGCATCATCATCACCACCACCGA |
| 229 | FLAG Tag | DYKDDDDK |
| 230 | Calmodulin Tag | KRRWKKNFIAVSAANRFKKISSSGAL |
| 231 | HA Tag | YPYDVPDYA |
| 232 | E-tag | GAPVPYPDPLEPR |
| 233 | S-Tag | KETAAAKFERQHMDS |
| 234 | SBP Tag | MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP |
| 235 | Softag 3 | TQDPSRVG |
| 236 | V5 Tag | GKPIPNPLLGLDST |
| 237 | VSV Tag | YTDEMNRLGK |

SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 242

<210> SEQ ID NO 1
<211> LENGTH: 34
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 1

Ser Val Ser Glu Ile Gln Leu Met His Asn Leu Gly Lys His Leu Asn
1               5                   10                  15

Ser Met Glu Arg Val Glu Trp Leu Arg Lys Lys Leu Gln Asp Val His
            20                  25                  30

Asn Phe
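Because the Table of Sequences gives one-letter codes while the sequence listing gives three-letter codes, the two can be cross-checked mechanically. The following Python sketch converts the three-letter entry for SEQ ID NO 1 to one-letter code and checks its length against the stated LENGTH field. The helper name `to_one_letter` is hypothetical, and the mapping covers only the 20 standard amino acids.

```python
# Standard three-letter to one-letter amino acid codes (20 standard residues only).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


# SEQ ID NO 1 (PTH 1-34) as given in the sequence listing
SEQ_1 = (
    "Ser Val Ser Glu Ile Gln Leu Met His Asn Leu Gly Lys His Leu Asn "
    "Ser Met Glu Arg Val Glu Trp Leu Arg Lys Lys Leu Gln Asp Val His "
    "Asn Phe"
)

pth_1_34 = to_one_letter(SEQ_1)
print(pth_1_34, len(pth_1_34))
```

A residue token that is not one of the 20 standard codes raises a `KeyError`, which is a convenient way to surface extraction damage in a listing before comparing it with the table entry.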

<210> SEQ ID NO 2
<211> LENGTH: 78
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 2

Met Lys Val Glu Pro Gly Leu Tyr Gln His Tyr Lys Gly Pro Gln Tyr
1               5                   10                  15

Arg Val Phe Ser Val Ala Arg His Ser Glu Thr Glu Glu Glu Val Val
            20                  25                  30

Phe Tyr Gln Ala Leu Tyr Gly Glu Tyr Gly Phe Trp Val Arg Pro Leu
        35                  40                  45

Ser Met Phe Leu Glu Thr Val Glu Val Asp Gly Glu Gln Val Pro Arg
    50                  55                  60

Phe Ala Leu Val Thr Ala Glu Pro Ser Leu Phe Thr Gly Gln
65                  70                  75

<210> SEQ ID NO 3
<211> LENGTH: 217
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 3

Met Ser Thr Pro Leu Lys Ile Asp Phe Val Ser Asp Val Ser Cys Pro
1               5                   10                  15

Trp Cys Ile Ile Gly Leu Arg Gly Leu Thr Glu Ala Leu Asp Gln Leu
            20                  25                  30

Gly Ser Glu Val Gln Ala Glu Ile His Phe Gln Pro Phe Glu Leu Asn
        35                  40                  45

Pro Asn Met Pro Ala Glu Gly Gln Asn Ile Val Glu His Ile Thr Glu
    50                  55                  60

Lys Tyr Gly Ser Thr Ala Glu Glu Ser Gln Ala Asn Arg Ala Arg Ile
65                  70                  75                  80

Arg Asp Met Gly Ala Ala Leu Gly Phe Ala Phe Arg Thr Asp Gly Gln
                85                  90                  95

Ser Arg Ile Tyr Asn Thr Phe Asp Ala His Arg Leu Leu His Trp Ala
            100                 105                 110

Gly Leu Glu Gly Leu Gln Tyr Asn Leu Lys Glu Ala Leu Phe Lys Ala
        115                 120                 125

Tyr Phe Ser Asp Gly Gln Asp Pro Ser Asp His Ala Thr Leu Ala Ile
    130                 135                 140

Ile Ala Glu Ser Val Gly Leu Asp Leu Ala Arg Ala Ala Glu Ile Leu
145                 150                 155                 160

Ala Ser Asp Glu Tyr Ala Ala Glu Val Arg Glu Gln Glu Gln Leu Trp
                165                 170                 175

Val Ser Arg Gly Val Ser Ser Val Pro Thr Ile Val Phe Asn Asp Gln
            180                 185                 190

Tyr Ala Val Ser Gly Gly Gln Pro Ala Glu Ala Phe Val Gly Ala Ile
        195                 200                 205

Arg Gln Ile Ile Asn Glu Ser Lys Ser
    210                 215

<210> SEQ ID NO 4
<211> LENGTH: 205
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 4

Met Ser Glu Val Asn Leu Ser Thr Asp Glu Thr Arg Val Ser Tyr Gly
1               5                   10                  15

Ile Gly Arg Gln Leu Gly Asp Gln Leu Arg Asp Asn Pro Pro Pro Gly
            20                  25                  30

Val Ser Leu Asp Ala Ile Leu Ala Gly Leu Thr Asp Ala Phe Ala Gly
        35                  40                  45

Lys Pro Ser Arg Val Asp Gln Glu Gln Met Ala Ala Ser Phe Lys Val
    50                  55                  60

Ile Arg Glu Ile Met Gln Ala Glu Ala Ala Ala Lys Ala Glu Ala Ala
65                  70                  75                  80

Ala Gly Ala Gly Leu Ala Phe Leu Ala Glu Asn Ala Lys Arg Asp Gly
                85                  90                  95

Ile Thr Thr Leu Ala Ser Gly Leu Gln Phe Glu Val Leu Thr Ala Gly
            100                 105                 110

Thr Gly Ala Lys Pro Thr Arg Glu Asp Gln Val Arg Thr His Tyr His
        115                 120                 125

Gly Thr Leu Ile Asp Gly Thr Val Phe Asp Ser Ser Tyr Glu Arg Gly
    130                 135                 140

Gln Pro Ala Glu Phe Pro Val Gly Gly Val Ile Ala Gly Trp Thr Glu
145                 150                 155                 160

Ala Leu Gln Leu Met Asn Ala Gly Ser Lys Trp Arg Val Tyr Val Pro
                165                 170                 175

Ser Glu Leu Ala Tyr Gly Ala Gln Gly Val Gly Ser Ile Pro Pro His
            180                 185                 190

Ser Val Leu Val Phe Asp Val Glu Leu Leu Asp Val Leu
        195                 200                 205

<210s, SEQ ID NO 5 &211s LENGTH: 225 212. TYPE: PRT <213s ORGANISM: Pseudomonas fluorescens

<4 OOs, SEQUENCE: 5 Met Ser Arg Tyr Lieu. Phe Lieu Val Phe Gly Lieu Ala Ile Cys Val Ala 1. 5 1O 15 Asp Ala Ser Glu Gln Pro Ser Ser Asn Ile Thr Asp Ala Thr Pro His 2O 25 3O Asp Lieu Ala Tyr Ser Lieu. Gly Ala Ser Lieu. Gly Glu Arg Lieu. Arg Glin 35 4 O 45 Glu Val Pro Asp Lieu. Glin Ile Glin Ala Lieu. Lieu. Asp Gly Lieu Lys Glin SO 55 6 O Ala Tyr Glin Gly Llys Pro Lieu Ala Lieu. Asp Lys Ala Arg Ile Glu Glin 65 70 7s 8O Ile Lieu. Ser Gln His Glu Ala Glin Asn. Thir Ala Asp Ala Glin Lieu Pro 85 90 95 Glin Ser Glu Lys Ala Lieu Ala Ala Glu Glin Glin Phe Lieu. Thir Arg Glu 1OO 105 11 O Lys Ala Ala Ala Gly Val Arg Glin Lieu Ala Asp Gly Ile Lieu. Lieu. Thir 115 12 O 125 Glu Lieu Ala Pro Gly Thr Gly Asn Llys Pro Lieu Ala Ser Asp Glu Val 13 O 135 14 O Glin Val Lys Tyr Val Gly Arg Lieu Pro Asp Gly Thr Val Phe Asp Llys 145 150 155 160 Ser Thr Glin Pro Gln Trp Phe Arg Val Asin Ser Val Ile Ser Gly Trp 1.65 17O 17s Ser Ser Ala Lieu. Glin Gln Met Pro Val Gly Ala Lys Trp Arg Lieu Val 18O 185 19 O

Ile Pro Ser Ala Glin Ala Tyr Gly Ala Asp Gly Ala Gly Glu Lieu. Ile 195 2OO 2O5

Pro Pro Tyr Thr Pro Leu Val Phe Glu Ile Glu Lieu. Leu Gly Thr Arg 21 O 215 22O

His 225

<210> SEQ ID NO 6
<211> LENGTH: 159
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<4 OOs, SEQUENCE: 6 Met Thr Asp Glin Glin Asn Thr Glu Ala Ala Glin Asp Glin Gly Pro Glin 1. 5 1O 15 Phe Ser Lieu. Glin Arg Ile Tyr Val Arg Asp Lieu. Ser Phe Glu Ala Pro 2O 25 3O Lys Ser Pro Ala Ile Phe Arg Glin Glu Trp Thr Pro Ser Val Ala Leu 35 4 O 45 Asp Lieu. Asn. Thir Arg Glin Llys Ser Lieu. Glu Gly Asp Phe His Glu Val SO 55 6 O Val Lieu. Thir Lieu Ser Val Thr Val Lys Asn Gly Glu Glu Val Ala Phe 65 70 7s 8O Ile Ala Glu Val Glin Glin Ala Gly Ile Phe Lieu. Ile Glin Gly Lieu. Asp 85 90 95 Glu Ala Ser Met Ser His Thr Lieu. Gly Ala Phe Cys Pro Asn Ile Leu 1OO 105 11 O Phe Pro Tyr Ala Arg Glu Thir Lieu. Asp Ser Leu Val Thr Arg Gly Ser 115 12 O 125 Phe Pro Ala Lieu Met Lieu Ala Pro Val Asn. Phe Asp Ala Lieu. Tyr Ala 13 O 135 14 O Gln Glu Lieu. Glin Arg Met Glin Glin Glu Gly Ala Pro Thr Val Glin 145 150 155

<210s, SEQ ID NO 7 &211s LENGTH: 241 212. TYPE: PRT <213s ORGANISM: Pseudomonas fluorescens

<4 OO > SEQUENCE: 7 Met Gly Cys Val Pro Leu Pro Asp His Gly Ile Thr Val Phe Met Phe 1. 5 1O 15 Lieu. Lieu. Arg Met Val Lieu. Lieu Ala Cys Gly Lieu. Lieu Val Lieu Ala Pro 2O 25 3O Pro Pro Ala Asp Ala Ala Lieu Lys Ile Glu Gly Thr Arg Lieu. Ile Tyr 35 4 O 45 Phe Gly Glin Asp Lys Ala Ala Gly Ile Ser Val Val Asn Glin Ala Ser SO 55 6 O Arg Glu Val Val Val Glin Thir Trp Ile Thr Gly Glu Asp Glu Ser Ala 65 70 7s 8O Asp Arg Thr Val Pro Phe Ala Ala Thr Glu Pro Lieu Val Glin Lieu. Gly 85 90 95 Ala Gly Glu. His His Llys Lieu. Arg Ile Lieu. Tyr Ala Gly Glu Gly Lieu 1OO O5 11 O

Pro Ser Asp Arg Glu Ser Leu Phe Trp Lieu. Asn Ile Met Glu Ile Pro 115 12 O 125

Lieu Lys Pro Glu Asp Pro Asn. Ser Val Glin Phe Ala Ile Arg Glin Arg 13 O 135 14 O

Lieu Lys Lieu. Phe Tyr Arg Pro Pro Ala Lieu. Glin Gly Gly Ser Ala Glu 145 150 155 160

Ala Val Glin Gln Leu Val Trp Ser Ser Asp Gly Arg Thr Val Thr Val 1.65 17O 17s US 2016/0159877 A1 Jun. 9, 2016 71

- Continued Asn Asn Pro Ser Ala Phe His Lieu. Ser Lieu Val Asn Lieu. Arg Ile Asp 18O 185 19 O Ser Glin Thr Lieu. Ser Asp Tyr Lieu. Lieu. Lieu Lys Pro His Glu Arg Llys 195 2OO 2O5 Thir Lieu. Thir Ala Lieu. Asp Ala Val Pro Lys Gly Ala Thr Lieu. His Phe 21 O 215 22O Thr Glu Ile Thir Asp Ile Gly Lieu. Glin Ala Arg His Ser Thir Ala Lieu. 225 23 O 235 24 O

Asn

<210s, SEQ ID NO 8 &211s LENGTH: 141 212. TYPE: PRT <213> ORGANISM: Escherichia coli

<4 OOs, SEQUENCE: 8 Ala Asp Llys Ile Ala Ile Val Asn Met Gly Ser Lieu. Phe Glin Glin Val 1. 5 1O 15 Ala Glin Llys Thr Gly Val Ser Asn. Thir Lieu. Glu Asn. Glu Phe Lys Gly 2O 25 3O Arg Ala Ser Glu Lieu. Glin Arg Met Glu Thir Asp Lieu. Glin Ala Lys Met 35 4 O 45 Llys Llys Lieu. Glin Ser Met Lys Ala Gly Ser Asp Arg Thr Lys Lieu. Glu SO 55 6 O Lys Asp Wal Met Ala Glin Arg Glin Thr Phe Ala Glin Lys Ala Glin Ala 65 70 7s 8O Phe Glu Glin Asp Arg Ala Arg Arg Ser Asn. Glu Glu Arg Gly Lys Lieu. 85 90 95 Val Thr Arg Ile Glin Thr Ala Wall Lys Ser Val Ala Asn. Ser Glin Asp 1OO 105 11 O Ile Asp Lieu Val Val Asp Ala Asn Ala Val Ala Tyr Asn. Ser Ser Asp 115 12 O 125 Val Lys Asp Ile Thr Ala Asp Val Lieu Lys Glin Val Lys 13 O 135 14 O

<210> SEQ ID NO 9
<211> LENGTH: 20
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<400> SEQUENCE: 9

Gly Gly Gly Gly Ser Gly Gly Gly Gly His His His His His His Asp
1               5                   10                  15

Asp Asp Asp Lys
            20

<210> SEQ ID NO 10
<211> LENGTH: 18
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<400> SEQUENCE: 10

Gly Gly Gly Gly Ser Gly Gly Gly Gly His His His His His His Arg
1               5                   10                  15

Lys Arg

<210> SEQ ID NO 11
<211> LENGTH: 18
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<400> SEQUENCE: 11

Gly Gly Gly Gly Ser Gly Gly Gly Gly His His His His His His Arg
1               5                   10                  15

Arg Arg

<210> SEQ ID NO 12
<211> LENGTH: 19
<212> TYPE: PRT
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Synthetic peptide

<400> SEQUENCE: 12

Gly Gly Gly Gly Ser Gly Gly Gly Gly His His His His His His Leu
1               5                   10                  15

Val Pro Arg

<210> SEQ ID NO 13
<211> LENGTH: 5
<212> TYPE: PRT
<213> ORGANISM: Unknown
<220> FEATURE:
<223> OTHER INFORMATION: Description of Unknown: Enterokinase cleavage site sequence

<400> SEQUENCE: 13

Asp Asp Asp Asp Lys
1               5
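The role of the linker sequences above (a spacer, a hexahistidine tag, and a protease site between the fusion partner and the target) can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the fusion assembled here simply concatenates SEQ ID NO 2, SEQ ID NO 9 and SEQ ID NO 1 for illustration and is not claimed to reproduce any numbered fusion construct; the function name `cleave_after_site` is hypothetical; and cleavage is modelled as a single cut immediately after the lysine of the first DDDDK site (SEQ ID NO 13), which is where enterokinase is generally understood to cut.

```python
EK_SITE = "DDDDK"                            # SEQ ID NO 13
LINKER = "GGGGSGGGG" + "HHHHHH" + EK_SITE    # SEQ ID NO 9: spacer + His tag + EK site
DNAJ_LIKE = (                                # SEQ ID NO 2
    "MKVEPGLYQHYKGPQY" "RVFSVARHSETEEEVV" "FYQALYGEYGFWVRPL"
    "SMFLETVEVDGEQVPR" "FALVTAEPSLFTGQ"
)
PTH_1_34 = "SVSEIQLMHNLGKHLN" "SMERVEWLRKKLQDVH" "NF"   # SEQ ID NO 1


def cleave_after_site(fusion: str, site: str = EK_SITE) -> tuple[str, str]:
    """Split a fusion sequence immediately after the first occurrence of `site`.

    Returns (fusion partner with linker, released target). Hypothetical helper.
    """
    index = fusion.find(site)
    if index < 0:
        raise ValueError(f"cleavage site {site!r} not found")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


# Illustrative fusion: N-terminal partner, linker, then the polypeptide of interest
fusion = DNAJ_LIKE + LINKER + PTH_1_34
partner_with_linker, target = cleave_after_site(fusion)
print(len(fusion), target)
```

Because the cut falls after the final lysine of the site, the released polypeptide begins with its own first residue and carries no linker residues, while the His tag stays on the fusion partner fragment.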

<210s, SEQ ID NO 14 &211s LENGTH: 798 212. TYPE: PRT <213s ORGANISM: Pseudomonas fluorescens

<4 OOs, SEQUENCE: 14 Met Lys Thir Thir Ile Glu Lieu Pro Lieu. Lieu Pro Lieu. Arg Asp Val Val 1. 5 1O 15

Val Tyr Pro His Met Val Ile Pro Leu Phe Val Gly Arg Glu Lys Ser 2O 25 3O

Ile Glu Ala Lieu. Glu Ala Ala Met Thr Gly Asp Llys Glin Ile Lieu. Lieu. 35 4 O 45

Lieu Ala Glin Lys Asn Pro Ala Asp Asp Asp Pro Gly Glu Asp Ala Lieu. SO 55 6 O

Tyr Arg Val Gly. Thir Ile Ala Thr Val Lieu Gln Lieu Lleu Lys Lieu Pro 65 70 7s 8O

Asp Gly Thr Val Llys Val Lieu Val Glu Gly Glu Glin Arg Gly Ala Val US 2016/0159877 A1 Jun. 9, 2016 73

85 90 95 Glu Arg Phe Met Glu Val Asp Gly His Lieu. Arg Ala Glu Val Ala Lieu. 1OO 105 11 O Ile Glu Glu Val Glu Ala Pro Glu Arg Glu Ser Glu Val Phe Val Arg 115 12 O 125 Ser Lieu. Lieu. Ser Glin Phe Glu Glin Tyr Val Glin Lieu. Gly Lys Llys Val 13 O 135 14 O Pro Ala Glu Val Lieu. Ser Ser Lieu. Asn. Ser Ile Asp Glu Pro Ser Arg 145 150 155 160 Lieu Val Asp Thir Met Ala Ala His Met Ala Lieu Lys Ile Glu Glin Lys 1.65 17O 17s Glin Asp Ile Lieu. Glu Ile Ile Asp Lieu. Ser Ala Arg Val Glu. His Val 18O 185 19 O Lieu Ala Met Lieu. Asp Gly Glu Ile Asp Lieu. Lieu. Glin Val Glu Lys Arg 195 2OO 2O5 Ile Arg Gly Arg Val Llys Lys Gln Met Glu Arg Ser Glin Arg Glu Tyr 21 O 215 22O Tyr Lieu. Asn. Glu Glin Met Lys Ala Ile Glin Lys Glu Lieu. Gly Asp Gly 225 23 O 235 24 O Glu Glu Gly His Asn. Glu Ile Glu Glu Lieu Lys Lys Arg Ile Asp Ala 245 250 255 Ala Gly Lieu Pro Lys Asp Ala Lieu. Thir Lys Ala Thr Ala Glu Lieu. Asn 26 O 265 27 O Llys Lieu Lys Gln Met Ser Pro Met Ser Ala Glu Ala Thr Val Val Arg 27s 28O 285 Ser Tyr Ile Asp Trp Leu Val Glin Val Pro Trp Lys Ala Glin Thr Lys 29 O 295 3 OO Val Arg Lieu. Asp Lieu Ala Arg Ala Glu Glu Ile Lieu. Asp Ala Asp His 3. OS 310 315 32O Tyr Gly Lieu. Glu Glu Val Lys Glu Arg Ile Lieu. Glu Tyr Lieu Ala Val 3.25 330 335 Glin Lys Arg Val Llys Lys Ile Arg Gly Pro Val Lieu. Cys Lieu Val Gly 34 O 345 35. O Pro Pro Gly Val Gly Llys Thir Ser Lieu Ala Glu Ser Ile Ala Ser Ala 355 360 365 Thir Asn Arg Llys Phe Val Arg Met Ala Lieu. Gly Gly Val Arg Asp Glu 37 O 375 38O Ala Glu Ile Arg Gly His Arg Arg Thr Tyr Ile Gly Ser Met Pro Gly 385 390 395 4 OO Arg Lieu. Ile Glin Llys Met Thr Llys Val Gly Val Arg Asn. Pro Lieu. Phe 4 OS 41O 415 Lieu. Lieu. Asp Glu Ile Asp Llys Met Gly Ser Asp Met Arg Gly Asp Pro 42O 425 43 O

Ala Ser Ala Lieu. Lieu. Glu Val Lieu. Asp Pro Glu Glin Asn His Asn. Phe 435 44 O 445 Asn Asp His Tyr Lieu. Glu Val Asp Tyr Asp Lieu. Ser Asp Wal Met Phe 450 45.5 460

Lieu. Cys Thir Ser Asn. Ser Met Asn. Ile Pro Pro Ala Lieu. Lieu. Asp Arg 465 470 47s 48O

Met Glu Val Ile Arg Lieu Pro Gly Tyr Thr Glu Asp Glu Lys Ile Asn 485 490 495 US 2016/0159877 A1 Jun. 9, 2016 74

Ile Ala Wall Lys Tyr Lell Ala Pro Lys Glin Ile Ser Ala Asn Gly Luell SOO 505

Gly Glu Ile Glu Phe Glu Wall Glu Ala Ile Arg Asp Ile Wall 515 525

Arg Tyr Thir Glu Ala Gly Wall Arg Gly Lell Glu Arg Glin Ile 53 O 535 54 O

Ala Ile Lys Ala Wall Glu His Ala Lell Glu Arg 5.45 550 555 560

Phe Ser Wall Wall Wall Ala Asp Ser Luell Glu His Phe Luell Gly Wall 565 st O sts

Phe Arg Tyr Gly Lell Ala Glu Glin Glin Asp Glin Wall Gly Glin 585 59 O

Wall Thir Gly Luell Ala Trp Thir Glin Wall Gly Gly Glu Lell Luell Thir Ile 595 605

Glu Ala Ala Wall Ile Pro Gly Gly Glin Luell Ile Thir Gly Ser 610 615

Lell Gly Asp Wall Met Wall Glu Ser Ile Thir Ala Ala Glin Thir Wall Wall 625 630 635 64 O

Arg Ser Arg Ala Arg Ser Lell Gly Ile Pro Luell Asp Phe His Glu 645 650 655

His Asp Thir His Ile His Met Pro Glu Gly Ala Thir Pro Lys Asp Gly 660 665 67 O

Pro Ser Ala Gly Wall Met Cys Thir Ala Luell Wall Ser Ala Luell Thir 675 685

Gly Ile Pro Wall Arg Asp Wall Ala Met Thir Gly Glu Ile Thir Luell 69 O. 695 7 OO

Arg Gly Glin Wall Lell Ile Gly Gly Luell Lys Glu Luell Luell Ala 7 Os

Ala His Arg Gly Gly Thir Wall Ile Ile Pro Glu Glu Asn Wall 72 73 O 73

Arg Asp Luell Lys Glu Pro Asp Asn Ile Glin Asp Luell Glin Ile 740 74. 7 O

Pro Wall Trp Asp Glu Wall Luell Glin Ile Ala Luell Glin Tyr 760 765

Ala Pro Glu Pro Lell Pro Asp Wall Ala Pro Glu Ile Wall Ala Asp 770 775

Glu Arg Glu Ser Asp Ser Glu Arg Ile Ser Thir His 78s 79 O 79.

<210s, SEQ ID NO 15 &211s LENGTH: 806 212. TYPE : PRT <213s ORGANISM: Pseudomonas fluorescens

<4 OOs, SEQUENCE: 15

Met Ser Asp Glin Gln Glu Phe Pro Asp Tyr Asp Lell Asn Asp Tyr Ala 1. 5 15

Asp Pro Glu Asn Ala Glu Ala Pro Ser Ser ASn Thir Gly Luell Ala Luell 25

Pro Gly Glin Asn Lieu Pro Asp Lys Wall Tyr Ile Ile Pro Ile His Asn 35 4 O 45

Arg Pro Phe Phe Pro Ala Glin Wall Luell Pro Wall Ile Wall Asn Glu Glu US 2016/0159877 A1 Jun. 9, 2016 75

SO 55 6 O Pro Trp Ala Glu Thir Lieu. Glu Lieu Val Ser Lys Ser Asp His His Ser 65 70 7s 8O Lieu Ala Leu Phe Phe Met Asp Thr Pro Pro Asp Asp Pro Arg His Phe 85 90 95 Asp Thr Ser Ala Leu Pro Leu Tyr Gly Thr Lieu Val Lys Val His His 1OO 105 11 O Ala Ser Arg Glu Asn Gly Llys Lieu. Glin Phe Val Ala Glin Gly Lieu. Thr 115 12 O 125 Arg Val Arg Ile Llys Thir Trp Lieu Lys His His Arg Pro Pro Tyr Lieu 13 O 135 14 O Val Glu Val Glu Tyr Pro His Gln Pro Ser Glu Pro Thr Asp Glu Val 145 150 155 160 Lys Ala Tyr Gly Met Ala Lieu. Ile Asn Ala Ile Lys Glu Lieu. Lieu Pro 1.65 17O 17s Lieu. Asn Pro Lieu. Tyr Ser Glu Glu Lieu Lys Asn Tyr Lieu. Asn Arg Phe 18O 185 19 O Ser Pro Asn Asp Pro Ser Pro Lieu. Thir Asp Phe Ala Ala Ala Lieu. Thir 195 2OO 2O5 Ser Ala Thr Gly Asn. Glu Lieu. Glin Glu Val Lieu. Asp Cys Val Pro Met 21 O 215 22O Lieu Lys Arg Met Glu Lys Val Lieu Pro Met Lieu. Arg Lys Glu Val Glu 225 23 O 235 24 O Val Ala Arg Lieu Gln Lys Glu Lieu. Ser Ala Glu Val Asn Arg Lys Ile 245 250 255 Gly Glu. His Glin Arg Glu Phe Phe Lieu Lys Glu Glin Lieu Lys Val Ile 26 O 265 27 O Glin Glin Glu Lieu. Gly Lieu. Thir Lys Asp Asp Arg Ser Ala Asp Val Glu 27s 28O 285 Glin Phe Glu Glin Arg Lieu. Glin Gly Llys Val Lieu Pro Ala Glin Ala Glin 29 O 295 3 OO Lys Arg Ile Asp Glu Glu Lieu. Asn Llys Lieu. Ser Ile Lieu. Glu Thr Gly 3. OS 310 315 32O Ser Pro Glu Tyr Ala Val Thr Arg Asn Tyr Lieu. Asp Trp Ala Thir Ser 3.25 330 335 Val Pro Trp Gly Val Tyr Gly Ala Asp Llys Lieu. Asp Lieu Lys His Ala 34 O 345 35. O Arg Llys Val Lieu. Asp Llys His His Ala Gly Lieu. Asp Asp Ile Llys Ser 355 360 365 Arg Ile Lieu. Glu Phe Lieu Ala Val Gly Ala Tyr Lys Gly Glu Val Ala 37 O 375 38O

Gly Ser Ile Val Lieu. Leu Val Gly Pro Pro Gly Val Gly Lys Thr Ser 385 390 395 4 OO

Val Gly Lys Ser Ile Ala Glu Ser Lieu. Gly Arg Pro Phe Tyr Arg Phe 4 OS 41O 415 Ser Val Gly Gly Met Arg Asp Glu Ala Glu Ile Lys Gly. His Arg Arg 42O 425 43 O

Thir Tyr Ile Gly Ala Lieu Pro Gly Lys Lieu Val Glin Ala Lieu Lys Asp 435 44 O 445

Val Glu Val Met Asn Pro Val Ile Met Lieu. Asp Glu Ile Asp Llys Met 450 45.5 460 US 2016/0159877 A1 Jun. 9, 2016 76

Gly Glin Ser Phe Glin Gly Asp Pro Ala Ser Ala Lieu. Lieu. Glu Thir Lieu. 465 470 47s 48O Asp Pro Glu Glin Asn Val Glu Phe Lieu. Asp His Tyr Lieu. Asp Lieu. Arg 485 490 495 Lieu. Asp Lieu. Ser Llys Val Lieu. Phe Val Cys Thr Ala Asn. Thir Lieu. Asp SOO 505 51O Ser Ile Pro Gly Pro Lieu. Lieu. Asp Arg Met Glu Val Ile Arg Lieu. Ser 515 52O 525 Gly Tyr Ile Thr Glu Glu Lys Val Ala Ile Ala Lys Arg His Lieu. Trp 53 O 535 54 O Pro Lys Glin Lieu. Glu Lys Ala Gly Val Ala Lys Asn. Ser Lieu. Thir Ile 5.45 550 555 560 Ser Asp Gly Ala Lieu. Arg Ala Lieu. Ile Asp Gly Tyr Ala Arg Glu Ala 565 st O sts Gly Val Arg Glin Lieu. Glu Lys Glin Lieu. Gly Lys Lieu Val Arg Lys Ala 58O 585 59 O Val Val Lys Lieu. Lieu. Asp Glu Pro Asp Ser Val Ile Lys Ile Gly Asn 595 6OO 605 Lys Asp Lieu. Glu Ser Ser Lieu. Gly Met Pro Val Phe Arg Asn. Glu Glin 610 615 62O Val Lieu. Ser Gly Thr Gly Val Ile Thr Gly Leu Ala Trp Thr Ser Met 625 630 635 64 O Gly Gly Ala Thr Lieu Pro Ile Glu Ala Thir Arg Ile His Thr Lieu. Asn 645 650 655 Arg Gly Phe Llys Lieu. Thr Gly Glin Lieu. Gly Glu Wal Met Lys Glu Ser 660 665 67 O Ala Glu Ile Ala Tyr Ser Tyr Ile Ser Ser Asn Lieu Lys Ser Phe Gly 675 68O 685 Gly Asp Ala Lys Phe Phe Asp Glu Ala Phe Val His Lieu. His Val Pro 69 O. 695 7 OO Glu Gly Ala Thr Pro Lys Asp Gly Pro Ser Ala Gly Val Thr Met Ala 7 Os 71O 71s 72O Ser Ala Lieu. Lieu. Ser Lieu Ala Arg Asn Glin Pro Pro Llys Lys Gly Val 72 73 O 73 Ala Met Thr Gly Glu Lieu. Thir Lieu. Thr Gly His Val Lieu Pro Ile Gly 740 74. 7 O Gly Val Arg Glu Lys Val Ile Ala Ala Arg Arg Glin Lys Ile His Glu 7ss 760 765 Lieu. Ile Lieu Pro Glu Pro Asn Arg Gly Ser Phe Glu Glu Lieu Pro Asp 770 775 78O Tyr Lieu Lys Glu Gly Met Thr Val His Phe Ala Lys Arg Phe Ala Asp 78s 79 O 79. 8OO

Val Ala Lys Val Lieu. Phe 805

<210s, SEQ ID NO 16 &211s LENGTH: 477 212. TYPE: PRT <213s ORGANISM: Pseudomonas fluorescens

<4 OOs, SEQUENCE: 16 Met Ser Llys Val Lys Asp Lys Ala Ile Val Ser Ala Ala Glin Ala Ser US 2016/0159877 A1 Jun. 9, 2016 77

1. 5 1O 15 Thr Ala Tyr Ser Glin Ile Asp Ser Phe Ser His Lieu. Tyr Asp Arg Gly 2O 25 3O Gly Asn Lieu. Thr Val Asn Gly Llys Pro Ser Tyr Thr Val Asp Glin Ala 35 4 O 45 Ala Thr Glin Lieu. Lieu. Arg Asp Gly Ala Ala Tyr Arg Asp Phe Asp Gly SO 55 6 O Asn Gly Lys Ile Asp Lieu. Thir Tyr Thr Phe Lieu. Thir Ser Ala Thr Glin 65 70 7s 8O Ser Thr Met Asn Lys His Gly Ile Ser Gly Phe Ser Glin Phe Asn Thr 85 90 95 Glin Gln Lys Ala Glin Ala Ala Lieu Ala Met Glin Ser Trp Ala Asp Wall 1OO 105 11 O Ala Asn Val Thr Phe Thr Glu Lys Ala Ser Gly Gly Asp Gly His Met 115 12 O 125 Thir Phe Gly Asn Tyr Ser Ser Gly Glin Asp Gly Ala Ala Ala Phe Ala 13 O 135 14 O Tyr Lieu Pro Gly Thr Gly Ala Gly Tyr Asp Gly Thr Ser Trp Tyr Lieu. 145 150 155 160 Thr Asn Asn Ser Tyr Thr Pro Asn Llys Thr Pro Asp Lieu. Asn Asn Tyr 1.65 17O 17s Gly Arg Glin Thr Lieu. Thir His Glu Ile Gly His Thr Lieu. Gly Lieu Ala 18O 185 19 O His Pro Gly Asp Tyr Asn Ala Gly Asn Gly ASn Pro Thr Tyr Asn Asp 195 2OO 2O5 Ala Thr Tyr Gly Glin Asp Thr Arg Gly Tyr Ser Leu Met Ser Tyr Trp 21 O 215 22O Ser Glu Ser Asn. Thir Asn Glin Asn. Phe Ser Lys Gly Gly Val Glu Ala 225 23 O 235 24 O Tyr Ala Ser Gly Pro Lieu. Ile Asp Asp Ile Ala Ala Ile Glin Llys Lieu. 245 250 255 Tyr Gly Ala Asn Lieu. Ser Thr Arg Ala Thr Asp Thr Thr Tyr Gly Phe 26 O 265 27 O Asn Ser Asn. Thr Gly Arg Asp Phe Lieu. Ser Ala Thir Ser Asn Ala Asp 27s 28O 285 Llys Lieu Val Phe Ser Val Trp Asp Gly Gly Gly Asn Asp Thir Lieu. Asp 29 O 295 3 OO Phe Ser Gly Phe Thr Glin Asn Gln Lys Ile Asn Lieu. Thir Ala Thir Ser 3. OS 310 315 32O Phe Ser Asp Val Gly Gly Lieu Val Gly Asn Val Ser Ile Ala Lys Gly 3.25 330 335

Val Thir Ile Glu Asn Ala Phe Gly Gly Ala Gly Asn Asp Lieu. Ile Ile 34 O 345 35. O

Gly Asn Glin Val Ala Asn. Thir Ile Lys Gly Gly Ala Gly Asn Asp Lieu. 355 360 365

Ile Tyr Gly Gly Gly Gly Ala Asp Glin Lieu. Trp Gly Gly Ala Gly Ser 37 O 375 38O

Asp Thr Phe Val Tyr Gly Ala Ser Ser Asp Ser Llys Pro Gly Ala Ala 385 390 395 4 OO

Asp Llys Ile Phe Asp Phe Thir Ser Gly Ser Asp Llys Ile Asp Lieu. Ser 4 OS 41O 415 US 2016/0159877 A1 Jun. 9, 2016 78

Gly Ile Thr Lys Gly Ala Gly Val Thr Phe Val Asn Ala Phe Thr Gly 42O 425 43 O His Ala Gly Asp Ala Val Lieu. Ser Tyr Ala Ser Gly Thr Asn Lieu. Gly 435 44 O 445 Thir Lieu Ala Val Asp Phe Ser Gly His Gly Val Ala Asp Phe Lieu Val 450 45.5 460 Thir Thr Val Gly Glin Ala Ala Ala Ser Asp Ile Val Ala 465 470 47s

<210s, SEQ ID NO 17 &211s LENGTH: 295 212. TYPE: PRT <213s ORGANISM: Pseudomonas fluorescens

<4 OOs, SEQUENCE: 17 Met Met Arg Ile Lieu Lleu Phe Lieu Ala Thir Asn Lieu Ala Val Val Lieu. 1. 5 1O 15 Ile Ala Ser Val Thr Lieu Ser Leu Phe Gly Phe Asn Gly Phe Met Ala 2O 25 3O Ala Asn Gly Val Asp Lieu. Asn Lieu. Asn Glin Lieu. Lieu. Ile Phe Cys Ala 35 4 O 45 Val Phe Gly Phe Ala Gly Ser Leu Phe Ser Leu Phe Ile Ser Lys Trp SO 55 6 O Met Ala Lys Met Ser Thr Ser Thr Glin Ile Ile Thr Gln Pro Arg Thr 65 70 7s 8O Arg His Glu Gln Trp Lieu Met Gln Thr Val Glu Gln Leu Ser Glin Glu 85 90 95 Ala Gly Ile Llys Met Pro Glu Val Gly Ile Phe Pro Ala Tyr Glu Ala 1OO 105 11 O Asn Ala Phe Ala Thr Gly Trp Asn Lys Asn Asp Ala Lieu Val Ala Val 115 12 O 125 Ser Glin Gly Lieu. Lieu. Glu Arg Phe Ser Pro Asp Glu Val Lys Ala Val 13 O 135 14 O Lieu Ala His Glu Ile Gly His Val Ala Asn Gly Asp Met Val Thir Lieu. 145 150 155 160 Ala Leu Val Glin Gly Val Val Asn Thr Phe Val Met Phe Phe Ala Arg 1.65 17O 17s Ile Ile Gly Asn. Phe Val Asp Llys Val Ile Phe Lys Asn. Glu Glu Gly 18O 185 19 O Arg Gly Ile Ala Tyr Phe Val Ala Thir Ile Phe Ala Glu Lieu Val Lieu 195 2OO 2O5 Gly Phe Lieu Ala Ser Ala Ile Val Met Trp Phe Ser Arg Lys Arg Glu 21 O 215 22O

Phe Arg Ala Asp Glu Ala Gly Ala Arg Leu Ala Gly Thr Ser Ala Met 225 230 235 240

Ile Gly Ala Leu Gln Arg Leu Arg Ser Glu Gln Gly Leu Pro Val His 245 250 255

Met Pro Asp Ser Leu Thr Ala Phe Gly Ile Asn Gly Gly Ile Lys Gln 260 265 270

Gly Leu Ala Arg Leu Phe Met Ser His Pro Pro Leu Glu Glu Arg Ile 275 280 285

Asp Ala Leu Arg Arg Arg Gly 290 295

<210> SEQ ID NO 18
<211> LENGTH: 386
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 18

Met Leu Lys Ala Leu Arg Phe Phe Gly Trp Pro Leu Leu Ala Gly Val 1 5 10 15

Leu Ile Ala Met Leu Ile Ile Gln Arg Tyr Pro Gln Trp Val Gly Leu 20 25 30

Pro Thr Leu Asp Val Asn Leu Gln Gln Ala Pro Gln Thr Asn Thr Val 35 40 45

Val Gln Gly Pro Val Thr Tyr Ala Asp Ala Val Val Ile Ala Ala Pro 50 55 60

Ala Val Val Asn Leu Tyr Thr Thr Lys Val Ile Asn Lys Pro Ala His 65 70 75 80

Pro Leu Phe Glu Asp Pro Gln Phe Arg Arg Tyr Phe Gly Asp Asn Gly 85 90 95

Pro Lys Gln Arg Arg Met Glu Ser Ser Leu Gly Ser Gly Val Ile Met 100 105 110

Ser Pro Glu Gly Tyr Ile Leu Thr Asn Asn His Val Thr Thr Gly Ala 115 120 125

Asp Gln Ile Val Val Ala Leu Arg Asp Gly Arg Glu Thr Leu Ala Arg 130 135 140

Val Val Gly Ser Asp Pro Glu Thr Asp Leu Ala Val Leu Lys Ile Asp 145 150 155 160

Leu Lys Asn Leu Pro Ala Ile Thr Leu Gly Arg Ser Asp Gly Leu Arg 165 170 175

Val Gly Asp Val Ala Leu Ala Ile Gly Asn Pro Phe Gly Val Gly Gln 180 185 190

Thr Val Thr Met Gly Ile Ile Ser Ala Thr Gly Arg Asn Gln Leu Gly 195 200 205

Leu Asn Ser Tyr Glu Asp Phe Ile Gln Thr Asp Ala Ala Ile Asn Pro 210 215 220

Gly Asn Ser Gly Gly Ala Leu Val Asp Ala Asn Gly Asn Leu Thr Gly 225 230 235 240

Ile Asn Thr Ala Ile Phe Ser Lys Ser Gly Gly Ser Gln Gly Ile Gly 245 250 255

Phe Ala Ile Pro Val Lys Leu Ala Met Glu Val Met Lys Ser Ile Ile 260 265 270

Glu His Gly Gln Val Ile Arg Gly Trp Leu Gly Ile Glu Val Gln Pro 275 280 285

Leu Thr Lys Glu Leu Ala Glu Ser Phe Gly Leu Thr Gly Arg Pro Gly 290 295 300

Ile Val Val Ala Gly Ile Phe Arg Asp Gly Pro Ala Gln Lys Ala Gly 305 310 315 320

Leu Gln Leu Gly Asp Val Ile Leu Ser Ile Asp Gly Ala Pro Ala Gly 325 330 335

Asp Gly Arg Lys Ser Met Asn Gln Val Ala Arg Ile Lys Pro Thr Asp 340 345 350

Lys Val Ala Ile Leu Val Met Arg Asn Gly Lys Glu Ile Lys Leu Ser 355 360 365

Ala Glu Ile Gly Leu Arg Pro Pro Pro Ala Thr Ala Pro Val Lys Glu 370 375 380

Glu Gln 385

<210> SEQ ID NO 19
<211> LENGTH: 478
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 19

Met Ser Ile Pro Arg Leu Lys Ser Tyr Leu Ser Ile Val Ala Thr Val 1 5 10 15

Leu Val Leu Gly Gln Ala Leu Pro Ala Gln Ala Val Glu Leu Pro Asp 20 25 30

Phe Thr Gln Leu Val Glu Gln Ala Ser Pro Ala Val Val Asn Ile Ser 35 40 45

Thr Thr Gln Lys Leu Pro Asp Arg Lys Val Ser Asn Gln Gln Met Pro 50 55 60

Asp Leu Glu Gly Leu Pro Pro Met Leu Arg Glu Phe Phe Glu Arg Gly 65 70 75 80

Met Pro Gln Pro Arg Ser Pro Arg Gly Gly Gly Gly Gln Arg Glu Ala 85 90 95

Gln Ser Leu Gly Ser Gly Phe Ile Ile Ser Pro Asp Gly Tyr Ile Leu 100 105 110

Thr Asn Asn His Val Ile Ala Asp Ala Asp Glu Ile Leu Val Arg Leu 115 120 125

Ala Asp Arg Ser Glu Leu Lys Ala Lys Leu Ile Gly Thr Asp Pro Arg 130 135 140

Ser Asp Val Ala Leu Leu Lys Ile Glu Gly Lys Asp Leu Pro Val Leu 145 150 155 160

Lys Leu Gly Lys Ser Gln Asp Leu Lys Ala Gly Gln Trp Val Val Ala 165 170 175

Ile Gly Ser Pro Phe Gly Phe Asp His Thr Val Thr Gln Gly Ile Val 180 185 190

Ser Ala Ile Gly Arg Ser Leu Pro Asn Glu Asn Tyr Val Pro Phe Ile 195 200 205

Gln Thr Asp Val Pro Ile Asn Pro Gly Asn Ser Gly Gly Pro Leu Phe 210 215 220

Asn Leu Ala Gly Glu Val Val Gly Ile Asn Ser Gln Ile Tyr Thr Arg 225 230 235 240

Ser Gly Gly Phe Met Gly Val Ser Phe Ala Ile Pro Ile Asp Val Ala 245 250 255

Met Asp Val Ser Asn Gln Leu Lys Ser Gly Gly Lys Val Ser Arg Gly 260 265 270

Trp Leu Gly Val Val Ile Gln Glu Val Asn Lys Asp Leu Ala Glu Ser 275 280 285

Phe Gly Leu Asp Lys Pro Ala Gly Ala Leu Val Ala Gln Ile Gln Asp 290 295 300

Asn Gly Pro Ala Ala Lys Gly Gly Leu Lys Val Gly Asp Val Ile Leu 305 310 315 320

Ser Met Asn Gly Gln Pro Ile Ile Met Ser Ala Asp Leu Pro His Leu 325 330 335

Val Gly Ala Leu Lys Ala Gly Gly Lys Ala Lys Leu Glu Val Ile Arg 340 345 350

Asp Gly Lys Arg Gln Asn Val Glu Leu Thr Val Gly Ala Ile Pro Glu 355 360 365

Glu Gly Ala Thr Leu Asp Ala Leu Gly Asn Ala Lys Pro Gly Ala Glu 370 375 380

Arg Ser Ser Asn Arg Leu Gly Ile Ala Val Val Glu Leu Thr Ala Glu 385 390 395 400

Gln Lys Lys Thr Phe Asp Leu Gln Ser Gly Val Val Ile Lys Glu Val 405 410 415

Gln Asp Gly Pro Ala Ala Leu Ile Gly Leu Gln Pro Gly Asp Val Ile 420 425 430

Thr His Leu Asn Asn Gln Ala Ile Asp Thr Thr Lys Glu Phe Ala Asp 435 440 445

Ile Ala Lys Ala Leu Pro Lys Asn Arg Ser Val Ser Met Arg Val Leu 450 455 460

Arg Gln Gly Arg Ala Ser Phe Ile Thr Phe Lys Leu Ala Glu 465 470 475

<210> SEQ ID NO 20
<211> LENGTH: 353
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 20

Met Cys Val Arg Gln Pro Arg Asn Pro Ile Phe Cys Leu Ile Pro Pro 1 5 10 15

Tyr Met Leu Asp Gln Ile Ala Arg His Gly Asp Lys Ala Gln Arg Glu 20 25 30

Val Ala Leu Arg Thr Arg Ala Lys Asp Ser Thr Phe Arg Ser Leu Arg 35 40 45

Met Val Ala Val Pro Ala Lys Gly Pro Ala Arg Met Ala Leu Ala Val 50 55 60

Gly Ala Glu Lys Gln Arg Ser Ile Tyr Ser Ala Glu Asn Thr Asp Ser 65 70 75 80

Leu Pro Gly Lys Leu Ile Arg Gly Glu Gly Gln Pro Ala Ser Gly Asp 85 90 95

Ala Ala Val Asp Glu Ala Tyr Asp Gly Leu Gly Ala Thr Phe Asp Phe 100 105 110

Phe Asp Gln Val Phe Asp Arg Asn Ser Ile Asp Asp Ala Gly Met Ala 115 120 125

Leu Asp Ala Thr Val His Phe Gly Gln Asp Tyr Asn Asn Ala Phe Trp 130 135 140

Asn Ser Thr Gln Met Val Phe Gly Asp Gly Asp Gln Gln Leu Phe Asn 145 150 155 160

Arg Phe Thr Val Ala Leu Asp Val Ile Gly His Glu Leu Ala His Gly 165 170 175

Val Thr Glu Asp Glu Ala Lys Leu Met Tyr Phe Asn Gln Ser Gly Ala 180 185 190

Leu Asn Glu Ser Leu Ser Asp Val Phe Gly Ser Leu Ile Lys Gln Tyr 195 200 205

Ala Leu Lys Gln Thr Ala Glu Asp Ala Asp Trp Leu Ile Gly Lys Gly 210 215 220

Leu Phe Thr Lys Lys Ile Lys Gly Thr Ala Leu Arg Ser Met Lys Ala 225 230 235 240

Pro Gly Thr Ala Phe Asp Asp Lys Leu Leu Gly Lys Asp Pro Gln Pro 245 250 255

Gly His Met Asp Asp Phe Val Gln Thr Tyr Glu Asp Asn Gly Gly Val 260 265 270

His Ile Asn Ser Gly Ile Pro Asn His Ala Phe Tyr Gln Val Ala Ile 275 280 285

Asn Ile Gly Gly Phe Ala Trp Glu Arg Ala Gly Arg Ile Trp Tyr Asp 290 295 300

Ala Leu Arg Asp Ser Arg Leu Arg Pro Asn Ser Gly Phe Leu Arg Phe 305 310 315 320

Ala Arg Ile Thr His Asp Ile Ala Gly Gln Leu Tyr Gly Val Asn Lys 325 330 335

Ala Glu Gln Lys Ala Val Lys Glu Gly Trp Lys Ala Val Gly Ile Asn 340 345 350

Val

<210> SEQ ID NO 21
<211> LENGTH: 704
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 21

Met Arg Tyr Gln Leu Pro Pro Arg Arg Ile Ser Met Lys His Leu Phe 1 5 10 15

Pro Ser Thr Ala Leu Ala Phe Phe Ile Gly Leu Gly Phe Ala Ser Met 20 25 30

Ser Thr Asn Thr Phe Ala Ala Asn Ser Trp Asp Asn Leu Gln Pro Asp 35 40 45

Arg Asp Glu Val Ile Ala Ser Leu Asn Val Val Glu Leu Leu Lys Arg 50 55 60

His His Tyr Ser Lys Pro Pro Leu Asp Asp Ala Arg Ser Val Ile Ile 65 70 75 80

Tyr Asp Ser Tyr Leu Lys Leu Leu Asp Pro Ser Arg Ser Tyr Phe Leu 85 90 95

Ala Ser Asp Ile Ala Glu Phe Asp Lys Trp Lys Thr Gln Phe Asp Asp 100 105 110

Phe Leu Lys Ser Gly Asp Leu Gln Pro Gly Phe Thr Ile Tyr Lys Arg 115 120 125

Tyr Leu Asp Arg Val Lys Ala Arg Leu Asp Phe Ala Leu Gly Glu Leu 130 135 140

Asn Lys Gly Val Asp Lys Leu Asp Phe Thr Gln Lys Glu Thr Leu Leu 145 150 155 160

Val Asp Arg Lys Asp Ala Pro Trp Leu Thr Ser Thr Ala Ala Leu Asp 165 170 175

Asp Leu Trp Arg Lys Arg Val Lys Asp Glu Val Leu Arg Leu Lys Ile 180 185 190

Ala Gly Lys Glu Pro Lys Ala Ile Gln Glu Leu Leu Thr Lys Arg Tyr 195 200 205

Lys Asn Gln Leu Ala Arg Leu Asp Gln Thr Arg Ala Glu Asp Ile Phe 210 215 220

Gln Ala Tyr Ile Asn Thr Phe Ala Met Ser Tyr Asp Pro His Thr Asn 225 230 235 240

Tyr Leu Ser Pro Asp Asn Ala Glu Asn Phe Asp Ile Asn Met Ser Leu 245 250 255

Ser Leu Glu Gly Ile Gly Ala Val Leu Gln Ser Asp Asn Asp Gln Val 260 265 270

Lys Ile Val Arg Leu Val Pro Ala Gly Pro Ala Asp Lys Thr Lys Gln 275 280 285

Val Ala Pro Ala Asp Lys Ile Ile Gly Val Ala Gln Ala Asp Lys Glu 290 295 300

Met Val Asp Val Val Gly Trp Arg Leu Asp Glu Val Val Lys Leu Ile 305 310 315 320

Arg Gly Pro Lys Gly Ser Val Val Arg Leu Glu Val Ile Pro His Thr 325 330 335

Asn Ala Pro Asn Asp Gln Thr Ser Lys Ile Val Ser Ile Thr Arg Glu 340 345 350

Ala Val Lys Leu Glu Asp Gln Ala Val Gln Lys Lys Val Leu Asn Leu 355 360 365

Lys Gln Asp Gly Lys Asp Tyr Lys Leu Gly Val Ile Glu Ile Pro Ala 370 375 380

Phe Tyr Leu Asp Phe Lys Ala Phe Arg Ala Gly Asp Pro Asp Tyr Lys 385 390 395 400

Ser Thr Thr Arg Asp Val Lys Lys Ile Leu Thr Glu Leu Gln Lys Glu 405 410 415

Lys Val Asp Gly Val Val Ile Asp Leu Arg Asn Asn Gly Gly Gly Ser 420 425 430

Leu Gln Glu Ala Thr Glu Leu Thr Ser Leu Phe Ile Asp Lys Gly Pro 435 440 445

Thr Val Leu Val Arg Asn Ala Asp Gly Arg Val Asp Val Leu Glu Asp 450 455 460

Glu Asn Pro Gly Ala Phe Tyr Lys Gly Pro Met Ala Leu Leu Val Asn 465 470 475 480

Arg Leu Ser Ala Ser Ala Ser Glu Ile Phe Ala Gly Ala Met Gln Asp 485 490 495

Tyr His Arg Ala Leu Ile Ile Gly Gly Gln Thr Phe Gly Lys Gly Thr 500 505 510

Val Gln Thr Ile Gln Pro Leu Asn His Gly Glu Leu Lys Leu Thr Leu 515 520 525

Ala Lys Phe Tyr Arg Val Ser Gly Gln Ser Thr Gln His Gln Gly Val 530 535 540

Leu Pro Asp Ile Asp Phe Pro Ser Ile Ile Asp Thr Lys Glu Ile Gly 545 550 555 560

Glu Ser Ala Leu Pro Glu Ala Met Pro Trp Asp Thr Ile Arg Pro Ala 565 570 575

Ile Lys Pro Ala Ser Asp Pro Phe Lys Pro Phe Leu Ala Gln Leu Lys 580 585 590

Ala Asp His Asp Thr Arg Ser Ala Lys Asp Ala Glu Phe Val Phe Ile 595 600 605

Arg Asp Lys Leu Ala Leu Ala Lys Lys Leu Met Glu Glu Lys Thr Val 610 615 620

Ser Leu Asn Glu Ala Asp Arg Arg Ala Gln His Ser Ser Ile Glu Asn 625 630 635 640

Gln Gln Leu Val Leu Glu Asn Thr Arg Arg Lys Ala Lys Gly Glu Asp 645 650 655

Pro Leu Lys Glu Leu Lys Lys Glu Asp Glu Asp Ala Leu Pro Thr Glu 660 665 670

Ala Asp Lys Thr Lys Pro Glu Asp Asp Ala Tyr Leu Ala Glu Thr Gly 675 680 685

Arg Ile Leu Leu Asp Tyr Leu Lys Ile Thr Lys Gln Val Ala Lys Gln 690 695 700

<210> SEQ ID NO 22
<211> LENGTH: 437
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 22

Met Leu His Leu Ser Arg Leu Thr Ser Leu Ala Leu Thr Ile Ala Leu 1 5 10 15

Val Ile Gly Ala Pro Leu Ala Phe Ala Asp Gln Ala Ala Pro Ala Ala 20 25 30

Pro Ala Thr Ala Ala Thr Thr Lys Ala Pro Leu Pro Leu Asp Glu Leu 35 40 45

Arg Thr Phe Ala Glu Val Met Asp Arg Ile Lys Ala Ala Tyr Val Glu 50 55 60

Pro Val Asp Asp Lys Ala Leu Leu Glu Asn Ala Ile Lys Gly Met Leu 65 70 75 80

Ser Asn Leu Asp Pro His Ser Ala Tyr Leu Gly Pro Glu Asp Phe Ala 85 90 95

Glu Leu Gln Glu Ser Thr Ser Gly Glu Phe Gly Gly Leu Gly Ile Glu 100 105 110

Val Gly Ser Glu Asp Gly Gln Ile Lys Val Val Ser Pro Ile Asp Asp 115 120 125

Thr Pro Ala Ser Lys Ala Gly Ile Gln Ala Gly Asp Leu Ile Val Lys 130 135 140

Ile Asn Gly Gln Pro Thr Arg Gly Gln Thr Met Thr Glu Ala Val Asp 145 150 155 160

Lys Met Arg Gly Lys Leu Gly Gln Lys Ile Thr Leu Thr Leu Val Arg 165 170 175

Asp Gly Gly Asn Pro Phe Asp Val Thr Leu Ala Arg Ala Thr Ile Thr 180 185 190

Val Lys Ser Val Lys Ser Gln Leu Leu Glu Ser Gly Tyr Gly Tyr Ile 195 200 205

Arg Ile Thr Gln Phe Gln Val Lys Thr Gly Asp Glu Val Ala Lys Ala 210 215 220

Leu Ala Lys Leu Arg Lys Asp Asn Gly Lys Lys Leu Asn Gly Ile Val 225 230 235 240

Leu Asp Leu Arg Asn Asn Pro Gly Gly Val Leu Gln Ser Ala Val Glu 245 250 255

Val Val Asp His Phe Val Thr Lys Gly Leu Ile Val Tyr Thr Lys Gly 260 265 270

Arg Ile Ala Asn Ser Glu Leu Arg Phe Ser Ala Thr Gly Asn Asp Leu 275 280 285

Ser Glu Asn Val Pro Leu Ala Val Leu Ile Asn Gly Gly Ser Ala Ser 290 295 300

Ala Ser Glu Ile Val Ala Gly Ala Leu Gln Asp Leu Lys Arg Gly Val 305 310 315 320

Leu Met Gly Thr Thr Ser Phe Gly Lys Gly Ser Val Gln Thr Val Leu 325 330 335

Pro Leu Asn Asn Glu Arg Ala Leu Lys Ile Thr Thr Ala Leu Tyr Tyr 340 345 350

Thr Pro Asn Gly Arg Ser Ile Gln Ala Gln Gly Ile Val Pro Asp Ile 355 360 365

Glu Val Arg Arg Ala Lys Ile Thr Asn Glu Ile Asp Gly Glu Tyr Tyr 370 375 380

Lys Glu Ala Asp Leu Gln Gly His Leu Gly Asn Gly Asn Gly Gly Ala 385 390 395 400

Asp Gln Pro Thr Gly Ser Arg Ala Lys Ala Lys Pro Met Pro Gln Asp 405 410 415

Asp Asp Tyr Gln Leu Ala Gln Ala Leu Ser Leu Leu Lys Gly Leu Ser 420 425 430

Ile Thr Arg Ser Arg 435

<210> SEQ ID NO 23
<211> LENGTH: 1242
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 23

Met Asp Val Ala Gly Asn Gly Phe Thr Val Ser Gln Arg Asn Arg Thr 1 5 10 15

Pro Arg Phe Lys Thr Thr Pro Leu Thr Pro Ile Ala Leu Gly Leu Ala 20 25 30

Leu Trp Leu Gly His Gly Ser Val Ala Arg Ala Asp Asp Asn Pro Tyr 35 40 45

Thr Pro Gln Val Leu Glu Ser Ala Phe Arg Thr Ala Val Ala Ser Phe 50 55 60

Gly Pro Glu Thr Ala Val Tyr Lys Asn Leu Arg Phe Ala Tyr Ala Asp 65 70 75 80

Ile Val Asp Leu Ala Ala Lys Asp Phe Ala Ala Gln Ser Gly Lys Phe 85 90 95

Asp Ser Ala Leu Lys Gln Asn Tyr Glu Leu Gln Pro Glu Asn Leu Thr 100 105 110

Ile Gly Ala Met Leu Gly Asp Thr Arg Arg Pro Leu Asp Tyr Ala Ser 115 120 125

Arg Leu Asp Tyr Tyr Arg Ser Arg Leu Phe Ser Asn Ser Gly Arg Tyr 130 135 140

Thr Thr Asn Ile Leu Asp Phe Ser Lys Ala Ile Ile Ala Asn Leu Pro 145 150 155 160

Ala Ala Lys Pro Tyr Thr Tyr Val Glu Pro Gly Val Ser Ser Asn Leu 165 170 175

Asn Gly Gln Leu Asn Ala Gly Gln Ser Trp Ala Gly Ala Thr Arg Asp 180 185 190

Trp Ser Ala Asn Ala Gln Thr Trp Lys Thr Pro Glu Ala Gln Val Asn 195 200 205

Ser Gly Leu Asp Arg Thr Asn Ala Tyr Tyr Ala Tyr Ala Leu Gly Ile 210 215 220

Thr Gly Lys Gly Val Asn Val Gly Val Leu Asp Ser Gly Ile Phe Thr 225 230 235 240

Glu His Ser Glu Phe Gln Gly Lys Asn Ala Gln Gly Gln Asp Arg Val 245 250 255

Gln Ala Val Thr Ser Thr Gly Glu Tyr Tyr Ala Thr His Pro Arg Tyr 260 265 270

Arg Leu Glu Val Pro Ser Gly Glu Phe Lys Gln Gly Glu His Phe Ser 275 280 285

Ile Pro Gly Glu Tyr Asp Pro Ala Phe Asn Asp Gly His Gly Thr Glu 290 295 300

Met Ser Gly Val Leu Ala Ala Asn Arg Asn Gly Thr Gly Met His Gly 305 310 315 320

Ile Ala Phe Asp Ala Asn Leu Phe Val Ala Asn Thr Gly Gly Ser Asp 325 330 335

Asn Asp Arg Tyr Gln Gly Ser Asn Asp Leu Asp Tyr Asn Ala Phe Met 340 345 350

Ala Ser Tyr Asn Ala Leu Ala Ala Lys Asn Val Ala Ile Val Asn Gln 355 360 365

Ser Trp Gly Gln Ser Ser Arg Asp Asp Val Glu Asn His Phe Gly Asn 370 375 380

Val Gly Asp Ser Ala Ala Gln Asn Leu Arg Asp Met Thr Ala Ala Tyr 385 390 395 400

Arg Pro Phe Trp Asp Lys Ala His Ala Gly His Lys Thr Trp Met Asp 405 410 415

Ala Met Ala Asp Ala Ala Arg Gln Asn Thr Phe Ile Gln Ile Ile Ser 420 425 430

Ala Gly Asn Asp Ser His Gly Ala Asn Pro Asp Thr Asn Ser Asn Leu 435 440 445

Pro Phe Phe Lys Pro Asp Ile Glu Ala Lys Phe Leu Ser Ile Thr Gly 450 455 460

Tyr Asp Glu Thr Ser Ala Gln Val Tyr Asn Arg Cys Gly Thr Ser Lys 465 470 475 480

Trp Trp Cys Val Met Gly Ile Ser Gly Ile Pro Ser Ala Gly Pro Glu 485 490 495

Gly Glu Ile Ile Pro Asn Ala Asn Gly Thr Ser Ala Ala Ala Pro Ser 500 505 510

Val Ser Gly Ala Leu Ala Leu Val Met Gln Arg Phe Pro Tyr Met Thr 515 520 525

Ala Ser Gln Ala Arg Asp Val Leu Leu Thr Thr Ser Ser Leu Gln Ala 530 535 540

Pro Asp Gly Pro Asp Thr Pro Val Gly Thr Leu Thr Gly Gly Arg Thr 545 550 555 560

Tyr Asp Asn Leu Gln Pro Val His Asp Ala Ala Pro Gly Leu Pro Gln 565 570 575

Val Pro Gly Val Val Ser Gly Trp Gly Leu Pro Asn Leu Gln Lys Ala 580 585 590

Met Gln Gly Pro Gly Gln Phe Leu Gly Ala Val Ala Val Ala Leu Pro 595 600 605

Ser Gly Thr Arg Asp Ile Trp Ala Asn Pro Ile Ser Asp Glu Ala Ile 610 615 620

Arg Ala Arg Arg Val Glu Asp Ala Ala Glu Gln Ala Thr Trp Ala Ala 625 630 635 640

Thr Lys Gln Gln Lys Gly Trp Leu Ser Gly Leu Pro Ala Asn Ala Ser 645 650 655

Ala Asp Asp Gln Phe Glu Tyr Asp Ile Gly His Ala Arg Glu Gln Ala 660 665 670

Thr Leu Thr Arg Gly Gln Asp Val Leu Thr Gly Ser Thr Tyr Val Gly 675 680 685

Ser Leu Val Lys Ser Gly Asp Gly Glu Leu Val Leu Glu Gly Gln Asn 690 695 700

Thr Tyr Ser Gly Ser Thr Trp Val Arg Gly Gly Lys Leu Ser Val Asp 705 710 715 720

Gly Ala Leu Thr Ser Ala Val Thr Val Asp Ser Ser Ala Val Gly Thr 725 730 735

Arg Asn Ala Asp Asn Gly Val Met Thr Thr Leu Gly Gly Thr Leu Ala 740 745 750

Gly Asn Gly Thr Val Gly Ala Leu Thr Val Asn Asn Gly Gly Arg Val 755 760 765

Ala Pro Gly His Ser Ile Gly Thr Leu Arg Thr Gly Asp Val Thr Phe 770 775 780

Asn Pro Gly Ser Val Tyr Ala Val Glu Val Gly Ala Asp Gly Arg Ser 785 790 795 800

Asp Gln Leu Gln Ser Ser Gly Val Ala Thr Leu Asn Gly Gly Val Val 805 810 815

Ser Val Ser Leu Glu Asn Ser Pro Asn Leu Leu Thr Ala Thr Glu Ala 820 825 830

Arg Ser Leu Leu Gly Gln Gln Phe Asn Ile Leu Ser Ala Ser Gln Gly 835 840 845

Ile Gln Gly Gln Phe Ala Ala Phe Ala Pro Asn Tyr Leu Phe Ile Gly 850 855 860

Thr Ala Leu Asn Tyr Gln Pro Asn Gln Leu Thr Leu Ala Ile Ala Arg 865 870 875 880

Asn Gln Thr Thr Phe Ala Ser Val Ala Gln Thr Arg Asn Glu Arg Ser 885 890 895

Val Ala Thr Val Ala Glu Thr Leu Gly Ala Gly Ser Pro Val Tyr Glu 900 905 910

Ser Leu Leu Ala Ser Asp Ser Ala Ala Gln Ala Arg Glu Gly Phe Lys 915 920 925

Gln Leu Ser Gly Gln Leu His Ser Asp Val Ala Ala Ala Gln Met Ala 930 935 940

Asp Ser Arg Tyr Leu Arg Glu Ala Val Asn Ala Arg Leu Gln Gln Ala 945 950 955 960

Gln Ala Leu Asp Ser Ser Ala Gln Ile Asp Ser Arg Asp Asn Gly Gly 965 970 975

Trp Val Gln Leu Leu Gly Gly Arg Asn Asn Val Ser Gly Asp Asn Asn 980 985 990

Ala Ser Gly Tyr Ser Ser Ser Thr Ser Gly Val Leu Leu Gly Leu Asp 995 1000 1005

Thr Glu Val Asn Asp Gly Trp Arg Val Gly Ala Ala Thr Gly Tyr 1010 1015 1020

Thr Gln Ser His Leu Asn Gly Gln Ser Ala Ser Ala Asp Ser Asp 1025 1030 1035

Asn Tyr His Leu Ser Val Tyr Gly Gly Lys Arg Phe Glu Ala Ile 1040 1045 1050

Ala Leu Arg Leu Gly Gly Ala Ser Thr Trp His Arg Leu Asp Thr 1055 1060 1065

Ser Arg Arg Val Ala Tyr Ala Asn Gln Ser Asp His Ala Lys Ala 1070 1075 1080

Asp Tyr Asn Ala Arg Thr Asp Gln Val Phe Ala Glu Ile Gly Tyr 1085 1090 1095

Thr Gln Trp Thr Val Phe Glu Pro Phe Ala Asn Leu Thr Tyr Leu 1100 1105 1110

Asn Tyr Gln Ser Asp Ser Phe Lys Glu Lys Gly Gly Ala Ala Ala 1115 1120 1125

Leu His Ala Ser Gln Gln Ser Gln Asp Ala Thr Leu Ser Thr Leu 1130 1135 1140

Gly Val Arg Gly His Thr Gln Leu Pro Leu Thr Ser Thr Ser Ala 1145 1150 1155

Val Thr Leu Arg Gly Glu Leu Gly Trp Glu His Gln Phe Gly Asp 1160 1165 1170

Thr Asp Arg Glu Ala Ser Leu Lys Phe Ala Gly Ser Asp Thr Ala 1175 1180 1185

Phe Ala Val Asn Ser Val Pro Val Ala Arg Asp Gly Ala Val Ile 1190 1195 1200

Lys Ala Ser Ala Glu Met Ala Leu Thr Lys Asp Thr Leu Val Ser 1205 1210 1215

Leu Asn Tyr Ser Gly Leu Leu Ser Asn Arg Gly Asn Asn Asn Gly 1220 1225 1230

Ile Asn Ala Gly Phe Thr Phe Leu Phe 1235 1240

<210> SEQ ID NO 24
<211> LENGTH: 450
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 24

Met Ser Ala Leu Tyr Met Ile Val Gly Thr Leu Val Ala Leu Gly Val 1 5 10 15

Leu Val Thr Phe His Glu Phe Gly His Phe Trp Val Ala Arg Arg Cys 20 25 30

Gly Val Lys Val Leu Arg Phe Ser Val Gly Phe Gly Met Pro Leu Leu 35 40 45

Arg Trp His Asp Arg Arg Gly Thr Glu Phe Val Ile Ala Ala Ile Pro 50 55 60

Leu Gly Gly Tyr Val Lys Met Leu Asp Glu Arg Glu Gly Glu Val Pro 65 70 75 80

Ala Asp Gln Leu Asp Gln Ser Phe Asn Arg Lys Thr Val Arg Gln Arg 85 90 95

Ile Ala Ile Val Ala Ala Gly Pro Ile Ala Asn Phe Leu Leu Ala Met 100 105 110

Val Phe Phe Trp Val Leu Ala Met Leu Gly Ser Gln Gln Val Arg Pro 115 120 125

Val Ile Gly Ala Val Glu Ala Asp Ser Ile Ala Ala Ala Gly Leu 130 135 140

Thr Ala Gly Gln Glu Ile Val Ser Ile Asp Gly Glu Pro Thr Thr Gly 145 150 155 160

Trp Gly Ala Val Asn Leu Gln Leu Val Arg Arg Leu Gly Glu Ser Gly 165 170 175

Thr Val Asn Val Val Val Arg Asp Gln Asp Ser Ser Ala Glu Thr Pro 180 185 190

Arg Ala Leu Ala Leu Asp His Trp Leu Gly Ala Asp Glu Pro Asp 195 200 205

Pro Ile Ser Leu Gly Ile Arg Pro Trp Arg Pro Ala Leu Pro Pro 210 215 220

Val Leu Ala Glu Leu Asp Pro Gly Pro Ala Gln Ala Ala Gly Leu 225 230 235 240

Thr Gly Asp Arg Leu Leu Ala Leu Asp Gly Gln Ala Leu Gly Asp 245 250 255

Trp Gln Gln Val Val Asp Leu Val Arg Val Arg Pro Asp Thr Ile 260 265 270

Val Leu Lys Val Glu Arg Glu Gly Ala Gln Ile Asp Val Pro Val Thr 275 280 285

Leu Ser Val Arg Gly Glu Ala Ala Ala Gly Gly Leu Gly Ala 290 295 300

Gly Val Gly Val Glu Trp Pro Pro Ser Met Val Arg Glu Val Ser 305 310 315 320

Gly Pro Leu Ala Ala Ile Gly Glu Gly Ala Arg Thr Trp Thr 325 330 335

Met Ser Val Leu Thr Leu Glu Ser Leu Met Leu Phe Gly Glu 340 345 350

Leu Ser Val Asn Leu Ser Gly Pro Ile Thr Ile Ala Val Ala 355 360 365

Gly Ala Ser Ala Gln Ser Gly Val Ala Asp Phe Leu Asn Phe Leu Ala 370 375 380

Tyr Leu Ser Ile Ser Leu Gly Val Leu Asn Leu Leu Pro Ile Pro Val 385 390 395 400

Leu Asp Gly Gly His Leu Leu Phe Leu Val Glu Trp Val Arg Gly 405 410 415

Arg Pro Leu Ser Asp Arg Val Gln Gly Trp Gly Ile Gln Ile Gly Ile 420 425 430

Ser Leu Val Val Gly Val Met Leu Leu Ala Leu Val Asn Asp Leu Gly 435 440 445

Arg Leu 450

<210> SEQ ID NO 25
<211> LENGTH: 246
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 25

Met Lys Gln His Arg Leu Ala Ala Ala Val Ala Leu Val Ser Leu Val 1 5 10 15

Leu Ala Gly Cys Asp Ser Gln Thr Ser Val Glu Leu Lys Thr Pro Ala 20 25 30

Gln Lys Ala Ser Tyr Gly Ile Gly Leu Asn Met Gly Lys Ser Leu Ala 35 40 45

Gln Glu Gly Met Asp Asp Leu Asp Ser Lys Ala Val Ala Gln Gly Ile 50 55 60

Glu Asp Ala Val Gly Lys Lys Glu Gln Lys Leu Lys Asp Asp Glu Leu 65 70 75 80

Val Glu Ala Phe Ala Ala Leu Gln Lys Arg Ala Glu Glu Arg Met Thr 85 90 95

Lys Met Ser Glu Glu Ser Ala Ala Ala Gly Lys Lys Phe Leu Glu Asp 100 105 110

Asn Ala Lys Lys Asp Gly Val Val Thr Thr Ala Ser Gly Leu Gln Tyr 115 120 125

Lys Ile Val Lys Lys Ala Asp Gly Ala Gln Pro Lys Pro Thr Asp Val 130 135 140

Val Thr Val His Tyr Thr Gly Lys Leu Thr Asn Gly Thr Thr Phe Asp 145 150 155 160

Ser Ser Val Asp Arg Gly Ser Pro Ile Asp Leu Pro Val Ser Gly Val 165 170 175

Ile Pro Gly Trp Val Glu Gly Leu Gln Leu Met His Val Gly Glu Lys 180 185 190

Val Glu Leu Tyr Ile Pro Ser Asp Leu Ala Tyr Gly Ala Gln Ser Pro 195 200 205

Ser Pro Ala Ile Pro Ala Asn Ser Val Leu Val Phe Asp Leu Glu Leu 210 215 220

Leu Gly Ile Lys Asp Pro Ala Lys Ala Glu Ala Ala Asp Ala Pro Ala 225 230 235 240

Ala Pro Ala Ala Lys Lys 245

<210> SEQ ID NO 26
<211> LENGTH: 427
<212> TYPE: PRT
<213> ORGANISM: Pseudomonas fluorescens

<400> SEQUENCE: 26

Met Thr Asp Thr Arg Asn Gly Glu Asp Asn Gly Lys Leu Leu Tyr Cys 1 5 10 15

Ser Phe Cys Gly Lys Ser Gln His Glu Val Arg Lys Leu Ile Ala Gly 20 25 30

Pro Ser Val Phe Ile Cys Asp Glu Cys Val Asp Leu Cys Asn Asp Ile 35 40 45

Ile Arg Glu Glu Val Gln Glu Ala Gln Ala Glu Ser Ser Ala His Lys 50 55 60

Leu Pro Ser Pro Lys Glu Ile Ser Gly Ile Leu Asp Gln Tyr Val Ile 65 70 75 80

Gly Gln Glu Arg Ala Lys Lys Val Leu Ala Val Ala Val Tyr Asn His 85 90 95