Using Small Globular Proteins to Study Folding Stability and Aggregation

Virginia Castillo Cano
September 2012

Universitat Autònoma de Barcelona
Departament de Bioquímica i Biologia Molecular
Institut de Biotecnologia i de Biomedicina

Doctoral thesis presented by Virginia Castillo Cano for the degree of PhD in Biochemistry, Molecular Biology and Biomedicine from the Autonomous University of Barcelona. Thesis performed in the Department of Biochemistry and Molecular Biology and the Institute of Biotechnology and Biomedicine, supervised by Dr. Salvador Ventura Zamora.

Virginia Castillo Cano
Salvador Ventura Zamora

Cerdanyola del Vallès, September 2012

SUMMARY

The purpose of the thesis entitled “Using small globular proteins to study folding stability and aggregation” is to contribute to understanding how globular proteins fold into their native, functional structures or, alternatively, misfold and aggregate into toxic assemblies. Protein misfolding diseases include a considerable number of human disorders, such as Parkinson’s and Alzheimer’s disease, which are related to conformational changes from soluble non-toxic species to aggregated toxic species. Moreover, the over-expression of recombinant proteins usually leads to the accumulation of protein aggregates, which is a major bottleneck in several biotechnological processes. Hence, the development of strategies to diminish or avoid these aberrant reactions has become an important issue in both the biomedical and biotechnological industries. In the present thesis we have used a battery of biophysical and computational techniques to analyze the folding, conformational stability and aggregation propensity of several globular proteins.
The combination of experimental (in vivo and in vitro) and bioinformatic approaches has provided insights into the intrinsic and structural properties, including the presence of disulfide bonds and the quaternary structure, that modulate these processes under physiological conditions. Overall, the data illustrate how the establishment of native-like contacts, which provide folding intermediates, native structures or protein interfaces with significant thermodynamic stability, is a crucial process both to drive protein folding and to compete with toxic aggregation.

This thesis is based on the following scientific papers:

I. Designing out disulfide bonds of leech carboxypeptidase inhibitor: implications for its folding, stability and function. Arolas JL, Castillo V, Bronsoms S, Aviles FX, Ventura S. J Mol Biol. 2009 Sep 18; 392(2): 529-46.
II. Deciphering the role of the thermodynamic and kinetic stabilities of SH3 domains on their aggregation inside bacteria. Castillo V, Espargaró A, Gordo V, Vendrell J, Ventura S. Proteomics. 2010 Dec; 10(23): 4172-85.
III. Amyloidogenic regions and interaction surfaces overlap in globular proteins related to conformational diseases. Castillo V, Ventura S. PLoS Comput Biol. 2009 Aug; 5(8): e1000476.

Subsequent research works are included in the annex section:

IV. The in vivo and in vitro aggregation properties of globular proteins correlate with their conformational stability: the SH3 case. Espargaró A, Castillo V, de Groot NS, Ventura S. J Mol Biol. 2008 May 16; 378(5): 1116-31.
V. The N-terminal helix controls the transition between the soluble and amyloid states of an FF domain. Castillo V, Chiti F, Ventura S. Submitted for publication.

ABBREVIATIONS

1S        One-disulfide intermediates
2S        Two-disulfide intermediates
3S        Three-disulfide intermediates
4S        Four-disulfide intermediates
Å         Ångström
Aβ        Amyloid-β peptide
ASA       Accessible surface area
β2m       β2-microglobulin
BPTI      Bovine pancreatic trypsin inhibitor
CD        Circular dichroism
CTD       Carboxyl-terminal domain
EGF       Epidermal growth factor
FALS      Familial amyotrophic lateral sclerosis
FTIR      Fourier transform infrared spectroscopy
GFP       Green fluorescent protein
IB        Inclusion body
LCI       Leech carboxypeptidase inhibitor
LDTI      Leech-derived trypsin inhibitor
mRNA      Messenger ribonucleic acid
N         Native state
NMR       Nuclear magnetic resonance
ODA       Optimal Docking Area
PDB       Protein Data Bank
RNase A   Ribonuclease A
SOD1      Superoxide dismutase 1
SPC-SH3   Spectrin-SH3
Sso AcP   Sulfolobus solfataricus acylphosphatase
TEM       Transmission electron microscopy
ThT       Thioflavin-T
TS        Transition state
TTR       Transthyretin
U         Unfolded state

INDEX

I - INTRODUCTION
  1. - Protein folding
    1.1 - First insights
    1.2 - Folding intermediates and folding pathways
    1.3 - Energy and kinetic diagrams
    1.4 - Energy landscapes
    1.5 - Disulfide bonds and protein folding
    1.6 - Folding pathways of small disulfide proteins
  2. - Protein misfolding and disease
  3. - Protein aggregation and amyloid fibrils formation
    3.1 - Amyloid fibrils
    3.2 - Functional fibrils
    3.3 - Fibril formation mechanism
    3.4 - Native-like aggregation
  4. - In vivo protein aggregation
  5. - Molecular determinants and prediction of amyloid aggregation
    5.1 - Intrinsic physico-chemical properties and external conditions
    5.2 - Hot spots and gatekeepers
    5.3 - Aggregation prediction algorithms
  6. - Interaction surfaces
    6.1 - Quaternary structure and disease
    6.2 - Interface structural properties and prediction algorithms
  7. - Protein models
    7.1 - Leech carboxypeptidase inhibitor (LCI)
    7.2 - SH3 domain of α-spectrin protein (SPC-SH3)
    7.3 - FF domain of yeast URN1 protein (URN1-FF)
II - AIMS
III - DISCUSSION
IV - GENERAL DISCUSSION
V - CONCLUDING REMARKS
VI - REFERENCES
VII - SCIENTIFIC PAPERS
  Paper I: Designing out disulfide bonds of leech carboxypeptidase inhibitor: implications for its folding, stability and function.
  Paper II: Deciphering the role of the thermodynamic and kinetic stabilities of SH3 domains on their aggregation inside bacteria.
  Paper III: Amyloidogenic regions and interaction surfaces overlap in globular proteins related to conformational diseases.
VIII - ANNEX
  Paper IV: The in vivo and in vitro aggregation properties of globular proteins correlate with their conformational stability: the SH3 case.
  Paper V: The N-terminal helix controls the transition between the soluble and amyloid states of an FF domain.

I - INTRODUCTION

1. - PROTEIN FOLDING

The process by which a polypeptide chain acquires its three-dimensional native structure is termed protein folding. After synthesis of the amino acid chain on the ribosome, local and long-range intramolecular interactions drive the folding process. The main forces leading to the attainment of the secondary and tertiary structure are hydrogen bonds, electrostatic interactions and hydrophobic forces. The native conformation comprises a large number of dynamic states, which occupy different but energetically close minima. The fluctuations between these species provide flexibility, which is indispensable for protein functional activity under physiological conditions. Understanding the molecular basis of protein folding and unfolding has become a key topic in biology because errors in the acquisition or maintenance of the native state are linked to an increasing number of fatal diseases (Chiti and Dobson, 2006).

1.1 - First insights

In the early 1960s, Christian B.
Anfinsen and co-workers studied the in vitro refolding of Ribonuclease A (RNase A), demonstrating that the amino acid sequence suffices to encode its three-dimensional structure (Anfinsen et al., 1961). The native state was proposed to be the most stable structure under physiological conditions, and thermodynamics the main determinant of the folding process. Later on, Cyrus Levinthal illustrated how the acquisition of a particular structure by a random search would require astronomically long times, considering the large number of possible conformations that the polypeptide would have to sample (Levinthal, 1968; Zwanzig et al., 1992). This argument led to the search for folding pathways in the following years.

1.2 - Folding intermediates and folding pathways

Many efforts have focused on characterizing folding intermediates by monitoring protein structural changes. Promoting jumps in the folding and unfolding reactions makes it possible to identify the folding model for a given protein. For instance, a protein follows a “two-state” model when only the unfolded and native states are populated at any point of the folding reaction. When partially folded conformers are populated along the folding reaction, they can correspond to either on-pathway or off-pathway intermediates, depending on whether they can reach the native structure or remain trapped in an energy minimum. Several studies have revealed that folding pathways could be guided by specific native-like interactions that direct polypeptide chains towards the lowest-energy structure (Daggett and Fersht, 2003). The “nucleation-growth” model proposed that secondary structures are propagated rapidly towards tertiary contacts, leading to the acquisition of the native structure, but this mechanism did not take into account the presence of folding intermediates (Wetlaufer, 1973). The alternative “framework” model postulated secondary structures as the first formed elements.
The native form would then be reached by docking these secondary structure elements (Kim and Baldwin, 1982; Kim and Baldwin, 1990; Karplus and Weaver, 1976). The last proposed model was based on “hydrophobic collapse”, in which hydrophobic forces promote a rapid compaction of the structure, reducing the conformational search towards the native structure (Baldwin,
