Visualization of Coupled Protein Folding and Binding in Bacteria and Purification of the Heterodimeric Complex

Visualization of coupled protein folding and binding in bacteria and purification of the heterodimeric complex Haoyong Wang and Shaorong Chong* New England Biolabs, Inc., 32 Tozer Road, Beverly, MA 01915 Edited by James A. Wells, Sunesis Pharmaceuticals, Inc., San Francisco, CA, and approved November 21, 2002 (received for review October 8, 2002) During overexpression of recombinant proteins in Escherichia coli, ence with the folding of the GFP domain might be prevented and misfolded proteins often aggregate and form inclusion bodies. If an the GFP domain be allowed to fold correctly. To test this idea, we aggregation-prone recombinant protein is fused upstream (as an set up a model system in which an aggregation-prone subunit of the N-terminal fusion) to GFP, aggregation of the recombinant protein E. coli integration host factor (IHF) was used as the upstream domain also leads to misfolding of the downstream GFP domain, (N-terminal) domain in a GFP fusion. IHF is a heterodimer of ␣ resulting in a decrease or loss of fluorescence. We investigated and ␤ subunits (11). Separate overexpression of IHF-␣ and -␤ in E. whether the GFP domain could fold correctly if aggregation of the coli resulted in unstable polypeptides and insoluble aggregates, upstream protein domain was prevented in vivo by a coupled protein respectively (12). Coexpression of both IHF subunits was necessary folding and binding interaction. Such interaction has been previously for the production of soluble and active proteins. The expressed shown to occur between the E. coli integration host factors ␣ and ␤, IHF proteins were purified as a 1:1 complex of tightly bound ␣ and and between the domains of the general transcriptional coactivator ␤ subunits (12). It seems that one expressed IHF subunit could not cAMP response element binding protein (CREB)-binding protein and adopt a correctly folded structure until it interacted with the other the activator for thyroid hormone and retinoid receptors. In this subunit, a process of mutual chaperoning which probably occurs ␤ study, fusion of integration host factor or the CREB-binding protein early during protein synthesis and folding. Using IHF-␤ as the domain upstream to GFP resulted in aggregation of the fusion upstream domain in a GFP fusion protein, we demonstrated in this protein. Coexpression of their respective partners, on the other hand, study that the interaction between the two IHF subunits allowed the allowed soluble expression of the fusion protein and a dramatic downstream GFP domain to fold correctly, resulting in a dramatic increase in fluorescence. The study demonstrated that coupled pro- increase in fluorescence. tein folding and binding could be correlated to GFP fluorescence. A There have been numerous examples of unstructured proteins modified miniintein containing an affinity tag was inserted between that adopt folded structures when binding to their biological the upstream protein domain and GFP to allow rapid purification and partners (13). These ‘‘coupled folding and binding’’ events, often identification of the heterodimeric complex. The GFP coexpression found in eukaryotes, represent a distinct class of biomolecular fusion system may be used to identify novel protein–protein inter- recognition that is quite different from the ‘‘lock and key’’ or actions that involve coupled folding and binding or protein partners ‘‘adhesive surface’’ model of biomolecular interaction, often ob- that can solubilize aggregation-prone recombinant proteins. served between folded or structured macromolecules (14, 15). We chose a recently characterized interaction, named synergistic fold- rotein aggregation and inclusion body formation are frequently ing, between domains of the general transcriptional coactivator Pencountered when recombinant proteins are overexpressed in cAMP response element binding protein (CREB)-binding protein Escherichia coli (1–3). By expressing GFP as a C-terminal fusion to (CBP) and the activator for thyroid hormone and retinoid receptors the protein of interest, the solubility of the fusion protein in vivo, (ACTRs) (16). We show here that the unstructured CBP domain indicative of productive protein folding, can be monitored by fused upstream to GFP led to aggregation and inclusion body fluorescence (4). If the protein of interest is prone to aggregation formation. Coexpression of the ACTR domain resulted in a sig- when expressed alone, it would likely also do so in the GFP fusion nificant increase in fluorescence and copurification of the ACTR͞ (4, 5). It appears that the aggregation of the upstream protein CBP complex. Our data suggest that the GFP fusion may be of domain causes misfolding of the downstream GFP domain. With- general use in monitoring coupled protein folding and binding out a correctly folded structure, GFP cannot form the chromophore events in vivo. essential for its fluorescence (6, 7). It has been suggested that protein aggregation during protein Materials and Methods overexpression in a heterologous host is the result of early events Fusion Vector Construction. The fusion vectors were derived from during protein synthesis and folding that lead to off-pathway the pTEG vector, which contained an isopropyl ␤-D-thiogalacto- association of folding intermediates (8, 9). Because the GFP variant pyranoside-inducible T7 promoter and expressed a modified mi- used in the GFP fusion protein folds properly when expressed alone niintein fused to a downstream GFP domain as described (5). The (10), the misfolding of the GFP domain in the GFP fusion protein E. coli IHF ␣ and ␤ genes were amplified from E. coli chromosomal is probably due to the interference of the early GFP-folding pathway by the upstream aggregation-prone protein. In a previous DNA by using PCR. The domain fragment of human ACTR study (4), GFP fusion proteins produced in vivo exhibited no (residues 1,018–1,088) (16) was synthesized by using overlapping significant difference in fluorescence compared to the same pro- oligonucleotides. The domain fragment of mouse CBP (residues teins produced by in vitro transcription͞translation. This lack of 2,059–2,117) was amplified from a vector carrying a full-length dependence in fluorescence on bulk concentrations of macromolecules seemed to suggest that the misfolding of the GFP domain in This paper was submitted directly (Track II) to the PNAS office. the fusion proteins involved intra-molecular interactions that oc- Abbreviations: ACTR, activator for thyroid hormone and retinoid receptor; CBP, cAMP curred early during protein synthesis and folding. Based on these response element binding protein (CREB)-binding protein; IHF, integration host factor; hypotheses, we reasoned that, if we introduced a second protein MBP, maltose-binding protein; Ni-NTA, nickel-nitrilotriacetate; sup, clarified cell extracts; known to bind and fold the aggregation-prone protein during early CBPf, CBP domain, residues 2,059–2,117; ACTRf, ACTR domain, residues 1,018–1,088. stages of protein synthesis and folding, the nonproductive interfer- *To whom correspondence should be addressed. E-mail: [email protected]. 478–483 ͉ PNAS ͉ January 21, 2003 ͉ vol. 100 ͉ no. 2 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0236088100 Downloaded by guest on September 27, 2021 coexpression of ␤, ␣-␤IGЈ (Fig. 1B), the ␤ gene was expressed from the pTEG vector whereas the ␣-␤IG bicistronic expression was from the pACCExp vector. Cell Culture and Protein Expression. All fusion constructs were transformed into E. coli ER 2566 (New England Biolabs, Bev- erly, MA) (genotype: F–␭–fhu A2 [lon] ompT lacZ::T7 gene1 gal sulA11 ⌬(mcrC-mrr)114::IS10R(mcr-73::miniTn10–TetS)2 R(zgb-210::Tn10)(TetS) endA1 [dcm]). Cells were grown at 37°C in LB medium supplemented with 100 ␮g͞ml ampicillin in shake ϭ flasks. At OD600 0.05, the cultures were induced by using 0.4 mM isopropyl ␤-D-thiogalactoside and allowed to continue growth at 20 or 30°C for an additional 16 h. The cells were then used for fluorescence measurements or harvested by centrifu- gation for solubility analyses and protein purification. Fluorescence Measurements. For each fusion construct, fluorescence was measured in triplicate from three independently grown postinduction cultures (50 ml). For the construct IG and all fusion constructs of ␤IG (Fig. 3A), the cultures were induced at 20°Cfor 16 h. For the fusion constructs of CBPfIG (Fig. 5A), the cultures were induced at 30°C for 16 h. The cultures were diluted to OD600 1.0 and aliquots (200 ␮l each) were then added to wells of a microplate. The fluorescence was measured by using a Perkin– Elmer LS50B luminescence spectrometer (excitation, 395 nm; emission, 510 nm; each with 2.5-nm bandwidth). A culture containing a GFP-minus construct and grown and induced under the same conditions was used as a blank, and its fluorescence value was subtracted from those of the cultures containing the GFP fusion constructs. The fluorescence images of induced cultures (wc) and clarified cell extracts (sup) (Figs. 3B and 5B) were taken by using a digital camera. The samples were prepared in the same microplate as used for the fluorescence measurement and illuminated by a Fig. 1. Schematic diagram of the coexpression fusion constructs. T7 pro- hand-held long-wave UV lamp (365 nm). All cultures were diluted moter (T7) is indicated by the arrow. rbs, bacterial ribosome-binding site; I, the to the same optical density before their fluorescence images were modified miniintein; G, green fluorescence protein; ␣, E. coli IHF␣; ␤, E. coli taken. The clarified cell

Visualization of Coupled Protein Folding and Binding in Bacteria and Purification of the Heterodimeric Complex

Evolution of Biomolecular Structure Class II Trna-Synthetases and Trna

The Architecture of the Protein Domain Universe

Documentation and Localization of Force-Mediated Filamin a Domain

Lipid-Targeting Pleckstrin Homology Domain Turns Its Autoinhibitory Face Toward the TEC Kinases

Pleckstrin Homology Domains and the Cytoskeleton

A New Census of Protein Tandem Repeats and Their Relationship with Intrinsic Disorder

Chapter 6 Protein Structure and Folding

Deep Learning of Protein Structural Classes: Any Evidence for an ‘Urfold?

Predicting Protein Domain Boundary from Sequence Alone Using Stacked Bidirectional LSTM

Insights Into the Molecular Architecture of the 26S Proteasome

Introduction to Bioinformatics (Master Chemoinformatique)

Tertiary Structure