<<

University of Massachusetts Medical School eScholarship@UMMS

GSBS Dissertations and Theses Graduate School of Biomedical Sciences

2007-08-22

Folding and Assembly of Multimeric : Dimeric HIV-1 Protease and a Trimeric Component of a Complex Hemoglobin Scaffold: A Dissertation

Amanda Ann Fitzgerald University of Massachusetts Medical School

Let us know how access to this document benefits ou.y Follow this and additional works at: https://escholarship.umassmed.edu/gsbs_diss

Part of the Amino Acids, Peptides, and Proteins Commons, Genetic Phenomena Commons, Nucleic Acids, Nucleotides, and Nucleosides Commons, and the Viruses Commons

Repository Citation Fitzgerald AA. (2007). Folding and Assembly of Multimeric Proteins: Dimeric HIV-1 Protease and a Trimeric Coiled Coil Component of a Complex Hemoglobin Scaffold: A Dissertation. GSBS Dissertations and Theses. https://doi.org/10.13028/750v-ht77. Retrieved from https://escholarship.umassmed.edu/ gsbs_diss/341

This material is brought to you by eScholarship@UMMS. It has been accepted for inclusion in GSBS Dissertations and Theses by an authorized administrator of eScholarship@UMMS. For more information, please contact [email protected].

A Dissertation Presented

By

Amanda Ann Fitzgerald

Submitted to the Faculty of the

University of Massachusetts Graduate School of Biomedical Sciences,

Worcester

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

August 22, 2007

Biochemistry and Molecular Pharmacology

ii

Copyright Page

CHAPTER II of this thesis is in preparation as the manuscript

“The Folding Free Energy Surface of HIV-1 Protease: Monomer Folding Controls the

Formation of the Native Dimeric β-barrel.”

A.A.Fitzgerald, O.Bilsel, Y. Wu, A. Kundu, J. A. Zitzewitz., and C.R. Matthews for submission to Journal of Molecular Biology

C. Robert Matthews, Ph.D., Thesis Advisor

Anthony Carruthers, Ph.D., Member of Committee

William Royer, Ph.D., Member of Committee

David Lambright, Ph.D., Member of Committee

Celia Schiffer, Ph.D., Chair of Committee

that the student has met all graduation requirements of the school.

Anthony Carruthers, Ph.D., Dean of the Graduate School of Biomedical Sciences iv

Acknowledgements

My journey to a doctoral degree would never have been possible if I had not had a tremendous amount of support and encouragement from many people. To express my gratitude, I would like to take this opportunity to acknowledge some of the many people who have helped me along the way.

I would like to thank Bob Matthews for mentoring my dissertation research. I will always remember our many conversations about chemical kinetics and the intricacies of folding. I am also grateful to the members of the Matthews’ lab who were always willing to lend a hand. Jill Zitzewitz guided me every day through my research, helping me to design experiments and analyze my projects in innovative ways. Osman

Bilsel was my consulting expert on data analysis, reminding me to consider the physical implications of every possible model. Ying Wu taught me molecular biology, protein purification, and how to work efficiently. Agnita Kundu helped me to persevere through the difficulties of the HIV-1 protease project by sharing with me her own difficulties on a parallel research project. Jessica Vasale taught me peptide synthesis and purification.

Karen Welch was a great source of help for everything. Ram Vadrevu, Randy Forsyth,

Louise Wallace, Iain Day, Rob Simler, Anna Svensson, Sagar Kathuria, Zhenyu Gu,

Xiaoyan Yang, and Can Kayatekin also provided a supportive lab environment.

I would like to thank my Thesis Research Advisory Committee. As my committee chair, Celia Schiffer constantly reminded me to remember the biological relevance of the projects I studied and consider all possibilities within the experimental v

data. Bill Royer was especially helpful in the coiled coil project and provided structural biology expertise. Dave Lambright gave thoughtful suggestions in data analysis. Tony

Carruthers provided helpful suggestions regarding kinetic data and also reminded me of the big picture. I am also thankful to José Argüello for serving as my outside thesis examiner.

I am grateful to the people who encouraged me to pursue graduate work to

completion. Carol Russell, my advisor at Framingham State College, has been a constant

cheerleader in my pursuit of science. Gael Daly has provided a listening ear and

prayerful encouragement. Kathryn Bard has reminded me to keep my goals in sight.

Moses Prabu has been immensely helpful in editing my thesis and keeping me focused.

I would like to thank my fellow graduate students for sharing in the journey.

Julie, Tom, and Mandy– thank you for helping me through core course and qualifying.

Jared, Jenn, Kristen, Trista, Jeff, Anthony, Steve, Naveen, Nick, Yufeng, Chaitali, and

Madhavi– thank you for being supportive colleagues in the BMP department. Mary and

Melonnie– I am forever grateful for our friendship that has developed over the past five years.

Last, but most definitely not least, I would like to thank my family for their

constant love and encouragement. Uncle Reid– thank you for not letting me quit.

Michael– thank you for being there. Aunt Maureen– thank you for always reminding me

“You go girl!” Mom and Dad– thank you for teaching me to be intellectually curious and

dedicated to my passions. And finally, my fiancé Dave– you have been my greatest

source of strength, laughter, and love. I could not have done this without you. vi

To My Family

vii

Abstract

Knowledge of how a polypeptide folds from a space-filling random coil into a

biologically-functional, three-dimensional structure has been the essence of the problem. Though mechanistic details of DNA transcription and RNA translation are well understood, a specific code by which the primary structure dictates the acquisition of secondary, tertiary, and quarternary structure remains unknown. However, the demonstrated reversibility of in vitro protein folding allows for a thermodynamic analysis of the folding reaction. By probing both the equilibrium and kinetics of protein folding, a protein folding mechanism can be postulated. Over the past 40 years, folding mechanisms have been determined for many proteins; however, a generalized folding code is far from clear. Furthermore, most protein folding studies have focused on monomeric proteins even though a majority of biological processes function via the association of multiple subunits. Consequently, a complete understanding of the acquisition of quarternary is essential for applying the basic principles of protein folding to biology.

The studies presented in this dissertation examined the folding and assembly of

two very different multimeric proteins. Underlying both of these investigations is the

need for a combined analysis of a repertoire of approaches to dissect the folding

mechanism for multimeric proteins. Chapter II elucidates the detailed folding energy

landscape of HIV-1 protease, a dimeric protein containing β-barrel subunits. The folding

of this viral exhibited a sequential three-step pathway, involving the rate-limiting viii

formation of a monomeric intermediate. The energetics determined from this analysis and their applications to HIV-1 function are discussed. In contrast, Chapter III illustrates the association of a coiled coil component of L. terrestris erythrocruorin. This extracellular hemoglobin consists of a complex scaffold of linker chains with a central ring of interdigitating coiled coils. Allostery is maintained by twelve dodecameric hemoglobin subunits that dock upon this scaffold. Modest association was observed for this coiled coil, and the implications of this fragment to linker assembly are addressed.

These studies depict the complexity of multimeric folding reactions. Chapter II demonstrates that a detailed energy landscape of a dimeric protein can be determined by combining traditional equilibrium and kinetic approaches with information from a global analysis of kinetics and a monomer construct. Chapter III indicates that fragmentation of large complexes can show the contributions of separate domains to hierarchical organization. As a whole, this dissertation highlights the importance of pursuing mulitmeric protein folding studies and the implications of these folding mechanisms to biological function.

ix

Table of Contents

Title page i Copyright page ii Approval page iii Acknowledgements iv Abstract vi Table of Contents viii List of Tables xiv List of Figures xv Abbreviations xvii

CHAPTER I: INTRODUCTION AND LITERATURE REVIEW 1 Overview of the protein folding problem 2 The central dogma of molecular biology 2 Searching for a folding code 3 Postulated models of protein folding 5 Computational efforts in protein folding 5 Benefits from solving the protein folding problem 8 Protein therapeutics 8 Misfolding and disease 8 Emerging questions in the protein folding field 11 Structure-guided studies 11 Methods of protein engineering 12 Thermodynamic versus kinetic control 13 Chemical kinetics view of folding 13 Technology Development 16 Fluorescence spectroscopy 16 NMR spectroscopy 17 x

Mass spectrometry 18 Small angle X-ray scattering 18 Ultra-fast folding detection 19 Mutational analysis of folding reactions 21 Topology and folding 22 Relative contact order 22 Topology modules within a proteins 22 Key protein folding studies on different motifs 24 Folding trends in mainly α-helical proteins 24 Folding trends in α/β proteins 26 β−Protein Folding Studies 27 Patterns in β-motif formation 27 Model β-protein systems 28 Multimeric systems: an added layer of complexity for the protein folding 30 problem Complexity of bimolecular kinetics 30 α-helical multimer folding 31 α/β dimer folding trends 32 Folding of β motif multimers 33 Dissecting multimeric folding with monomer constructs 34 AFUs to dissect monomer folding 34 Creating stable monomer subunits 35 Scope of the thesis 37 HIV-1 protease folding 37 Coiled coils involved in assembly 40

CHAPTER II: THE FOLDING FREE ENERGY SURFACE OF HIV-1 41 PROTEASE: MONOMER FOLDING CONTROLS THE FORMATION OF xi

THE NATIVE DIMERIC β-BARREL

Manuscript in preparation 42 Abstract 43 Introduction 45 HIV-1 protease 45 Drug resistant HIV-1 protease 45 Drug desistance and the HIV-1 protease free energy landscape 46 Structure of HIV-1 protease 47 Thermodynamics of HIV-1 Protease 47 Motivation of the current study 49 Materials and methods 52 Standard buffers and reagents 52 Protein expression and purification 52 Determination of protein concentration 53 Equilibrium studies 53 Kinetic studies 54 Global analysis methods 55 Results 56 Secondary and tertiary structural probes of HIV-1 protease folding 56 Equilibrium unfolding 58 Chevron analysis 65 Global analysis 70 Monomer construct studies 79 Contribution of the Trp residues as FL probes 82 Summary of results 83 Discussion 84 HIV-1 protease folding mechanism 84 Energetic landscape of HIV-1 protease 84 xii

Significance of a monomeric intermediate 85 Comparison to superoxide dismutase (SOD) folding 86 Comparisons with β-barrel dimers 87 Monomer construct folding comparisons 88 Conclusion 88

CHAPTER III: THE ROLE OF COILED COIL ASSEMBLY IN THE 91 SCAFFOLD OF A COMPLEX HEMOGLOBIN Abstract 92 Introduction 93 Hemoglobin as a model of protein structure studies 93 A model annelid hemoglobin 94 Structure of a coiled coil 98 Importance of coiled coils 101 GCN4-pI: A model coiled coil 102 Coiled coil diversity 107 Aim of this chapter 107 Experimental methods 110 Helix propensity prediction 110 Peptide synthesis and purification 110 Buffers 112 Determination of Protein Concentration 112 Size-exclusion chromatography (SEC) 113 Equilibrium (CD) experiments 113 Results 114 Predicted helicity of the constructs 114 Peptide solubility 116 Secondary structure of the individual linker chains 118 Secondary structure induced by interactions between the peptides 121 xiii

Oligomeric state of the peptides 126 Summary of results 128 Discussion 129

CHAPTER IV: GENERAL DISCUSSION 133 Importance of protein folding studies 134 Beyond monomer folding 134 Current perspectives on multimer folding 135 Significance of this thesis 137 HIV-1 Protease 137 L. terrestris erythrocruorin 138 Contribution to multimeric protein folding 139 Macromolecular crowding in the cell 140 Considerations of macromolecular crowding in these studies 141 Folding macromolecules within the cell 143

BIBLIOGRAPHY 146 xiv

LIST OF TABLES AND SCHEMES

Table 2.1: Thermodynamic Parameters Obtained from Equilibrium Titrations 62 or Global Analysis of the Kinetic Data

Table 2.2: Kinetic parameters 75

Scheme 2.1: Putative models based on chevron analysis 72

xv

LIST OF FIGURES

Figure 1.1: The protein folding problem. 4

Figure 1.2: Models of protein folding 6

Figure 1.3: Possible misfolding models 10

Figure 1.4: Energy landscapes in protein folding 15

Figure 1.5: Timescale of protein folding reactions 20

Figure 1.6: Crystal Structure of HIV-1 protease 38

Figure 1.7: The 5.5 angstrom crystal map of erythrocrourin 39

Figure 2.1: Structure and Topology of HIV-PR 47

Figure 2.2: CD and FL spectral characteristics of HIV-PR 57

Figure 2.3: Reversibility of HIV-PR folding 59

Figure 2.4: Equilibrium folding properties of HIV-PR 60

Figure 2.5: Fraction unfolded protein (Fapp) plots of the SVD equilibrium 63

unfolding CD and FL vectors fit globally to a two-state process

Figure 2.6 HIV-PR Folding Kinetics 66

Figure 2.7: Chevron analysis of HIV-PR folding kinetics 67

Figure 2.8: Global analysis of HIV-PR folding kinetics 71

Figure 2.9: Comparison of microscopic rate constants with chevron of 76

HIV-PR folding kinetics

Figure 2.10: Predicted populations of HIV-PR folding species 77

Figure 2.11: Folding properties of m-HIV-PR 81

Figure 3.1: The 5.5 Å electron density map of Lumbricus terrestris erythrocurorin95 xvi

Figure 3.2: The linker scaffold is organized in one-twelfth subunits 96

Figure 3.3: The sequences of four distinct linker chains have been 97

identified in L. terrestris hemoglobin

Figure 3.4: Comparison of helical packing 99

Figure 3.5: Sequence of GCN4-p1 represented in format 100

Figure 3.6: Crystal structure of GCN4 103

Figure 3.7: Coiled coil assembly 108

Figure 3.8 Peptide constructs corresponding to the long coiled coil 111

Figure 3.9: Agadir prediction of ccL peptide helicity 115

Figure 3.10: UV Absorption spectra of the ccL peptides 117

Figure 3.11: CD Spectra of the individual ccL peptides 119

Figure 3.12: Unfolding of ccL1 120

Figure 3.13: CD Spectra of the pairwise ccL combinations 122

Figure 3.14: CD Spectra of the putative trimer linker combination 124

Figure 3.15: CD Spectra of the putative 1234 linker combination 125

Figure 3.16: Size Exclusion Chromatography of the ccL peptides 127

Figure 3.17: Contacts occurring between the globular heads of the linker domains 130

xvii

Abbreviations

AIDS acquired immunodeficiency syndrome

ALS amyotrophic lateral sclerosis

ANS anilinonaphthalene-8-sulphonate ccL coiled-coil domain of L. terrestris erythrocruorin

CD circular dichroism chevron semi-logarithmic plot of relaxation time versus denaturant

concentration

DHFR dihydrofolate reductase

Fapp fraction apparent

FL fluorescence

FRET Förster Resonance Energy Transfer

GCN4-pI yeast transcriptional activator GCN4 zipper peptide

GCN4-IQI engineered GCN4-pI trimer

0 ΔG (H2O) Gibbs free energy

Η/D exchange hydrogen/deuterium exchange

HIV human immunodeficiency virus

PR-HIV D25N Q7K/D25N/C67A/C95A HIV-1 protease

Δ 99-96 PR-HIV D25N C-terminal deletion monomer construct of HIV-PR

HPV domain human papillomavirus E2 transcriptional regulator DNA binding

domain xviii

MS mass spectrometry m value urea dependence of unfolding

NMR nuclear magnetic resonance spectroscopy

SAXS small angle X-ray scattering

SF-FL stopped-flow fluorescence

SIV simian immunodeficiency virus

S/N signal-to-noise

SOD human apo-Cu, Zn superoxide dismutase

SVD singular value decomposition

1

CHAPTER I

INTRODUCTION AND LITERATURE REVIEW

2

Overview of the protein folding problem

The central dogma of molecular biology. Beginning with the discovery of DNA structure by Watson and Crick in the 1950’s (Watson and Crick 1953a; Watson and Crick

1953b; Watson and Crick 1953c), through the development of recombinant DNA techniques in the 1970’s and 1980’s (Nash 1975a; Nash 1975b; Wetzel 1980; McGregor

1983; Georgiou 1995), and currently enlightened with the promise of RNAi (Seydoux et al. 1996; Fire et al. 1998; Tabara et al. 1998; Sharp and Zamore 2000; Shankar et al.

2005), the late 20th century and the dawn of the 21st century have been regarded as the genetic revolution (Hennig 2004). Scientific discoveries relating to all aspects of the genetic code hold the keys to transforming medicine, industry, and quality of life as we know it (Berg and Singer 1995).

The central dogma of molecular biology describes how DNA sequences are transcribed to RNA sequences and these RNA sequences are then translated to amino acid sequences of polypeptide chains (Crick 1958; Thieffry and Sarkar 1998). The systematic analysis of catalytic mechanisms, correlated with the structures of many of the relevant protein/DNA and protein/RNA complexes, has made these processes relatively well understood (Wahl and Sundaralingam 1995; Millar 1996; Steitz and Moore 2003;

Moore and Steitz 2005). However, the details of the process by which a fully-functional three-dimensional protein is rapidly and efficiently folds from an unstructured linear sequence of amino acids remain unclear. The quest to understand the molecular events that link the primary structure and the functional native protein is viewed as “the protein

3

folding problem” and is commonly referred to as the “second half” of the genetic code

(Figure 1.1).

Searching for a folding code. Through his pioneering work with ribonuclease in the 1960’s, Anfinsen showed that a denatured protein can spontaneously refold in solution (Anfinsen 1973). The amino acid sequence alone is capable of directing the folding reaction. This observation led to the idea that protein folding is a reversible process that can be described in terms of thermodynamic principles (Anfinsen et al.

1961). However, if a sequence of 100 amino acids were allowed to sample two conformations per amino acid, with one picosecond per conformation, the protein would require the age of the universe to reach its most stable fold. The Levinthal paradox

(Levinthal 1968) implies that folding is a directed search and has motivated an entire field of research on the mechanisms of this remarkable process. The coupling of the thermodynamic and kinetic perspectives has shaped the search for a folding code.

Over the past thirty-five years, experimentalists have studied the in vitro folding of purified proteins in an attempt to characterize the steps and requirements for protein folding. Many types of folding mechanisms have been delineated and these efforts have demonstrated the complexity of the protein folding problem, with intricacies dependent upon sequence, fold, and function (Matthews 1993; Rumbley et al. 2001). Consequently, the specificity and preference of amino acid sequences for a particular fold have yet to be determined.

4

Transcription Translation Folding DNA RNA Protein

Functional 3D-form

Amino acids are the building blocks of protein structure.

1° structure 2° structure 3° structure 4° structure

Figure 1.1: The protein folding problem. Knowledge of how a protein attains its functional, three-dimensional form from a primary amino acid code will further enhance scientific advances based on the genetic code.

5

Postulated models of protein folding. Several simplified models of how a protein

folds to its native state have been developed to explain experimental observations

(Dobson et al. 1994; Radford 2000) (Figure 1.2). The framework model involves the

initial formation of small secondary structural elements that diffuse and then dock into

the native tertiary fold via -collision. The hydrophobic collapse model proposes

the formation of a compact molten globule intermediate as the protein buries its

hydrophobic side chains away from solvent during early folding (Kuwajima 1989). The

search for the native structure is then constrained within the molten globule state. The

nucleation-growth model, suggests that a limited number of amino acids form a folding

nucleus that propagates structure through the chain in a sequential manner.

Computational efforts in protein folding. While experimental approaches are required to determine the actual molecular events that occur during folding, computational approaches are an appealing alternative to solving the folding code because they potentially provide a general approach towards handling the explosive growth of DNA and protein sequences (Hardin et al. 2002; Moult 2005). Molecular dynamics simulations have been used to monitor the folding of very small proteins or domains (Pande et al. 1997a; Pande et al. 1997b; Pande et al. 1998; Pande and Rokhsar

1998; Pande and Rokhsar 1999a; Pande and Rokhsar 1999b; Snow et al. 2002; Zagrovic et al. 2002; Pande et al. 2003; Lucent et al. 2007). However, the combinatorics of larger, typical proteins precludes the calculations of complete folding trajectories with existing

6

Figure 1.2: Models of protein folding. While experimental evidence supports each of the three models in distinct ways, it is thought that the folding of a protein involves a coordination of all three of these models (Dobson, 1994).

7

computational power. A simplified representation of proteins involving only the Cα atoms, i.e., the Gō model, has enabled the simulation of entire folding and unfolding reactions for proteins of 100-250 amino acids. However, the omission of side chain information and the focus on native interactions has led to skepticism regarding their validity. Even with these limitations, comparisons of experimentally-determined and simulated structures of partially folded states have shown reasonable agreement in several cases. By implication, the progressive development of native-like structure may be a common feature of folding reactions. Combining the wealth of data from experimental protein folding studies and the power of computational simulations may someday enable the prediction of a native protein fold from its amino acid sequence

(Honig 1999).

8

Benefits from solving the protein folding problem

Protein therapeutics. At present nearly 150 diseases, including autoimmune diseases, cancer, and infectious diseases, are treated with medicines derived from biotechnology, many of which are protein-based therapeutics (Frokjaer and Otzen 2005).

The advances in biotechnological production methods have made protein therapeutics more readily available than by previous traditional tissue extraction methods. Knowledge

of the sequence requirements for a given fold topology and the influence of various

interactions within these folds will enable the synthesis of designed proteins with

improved stability, function, and efficacy. The therapeutic effect of recombinant human

interleukin-2 (hIL-2) was improved through molecular design by replacing a cysteine

with a serine (Cassell et al. 2002). This recombinant form of hIL-2 has a longer half-life

in vivo. Also, mutations in self-association sites of human insulin have led to the

development of rapid-acting monomeric human insulin analogues (Brange et al. 1990).

These examples illustrate the power of combining protein folding information with

recombinant DNA technology to create more powerful therapeutics.

Misfolding and disease. The relevance of the protein folding problem to human

health has also been brought to light with the discovery of a link between misfolded

proteins and disease (Dobson 2001). Misfolded proteins can form aggregates that have

been connected to a host of pathologies including amyotrophic lateral sclerosis,

Parkinson’s disease, Huntington’s chorea, Alzheimer’s disease, systemic amyloidosis,

9

and sickle cell anemia through mutations that cause a gain-of-function. In other

afflictions, such as cystic fibrosis, hypercholesterolemia, type II diabetes, and cataracts, a change in the normal conformation or assembly of a functional protein is hypothesized to lead to a disease state through a loss-of-function.

A leading hypothesis connecting protein misfolding with disease is that a conformational change occurs within a protein that propagates an aggregation chain of

events (Dobson 2002) (Figure 1.3). This idea involves a “seeding” of misfolded protein

with a single misfolding event. Proteins that are prone to aggregation either have

intrinsic beta structure or adopt beta structure prior to forming aggregates. These

monomeric beta structures then form small oligomers that coalesce into protofibrils and

fibrils. While the formation of aggregated, misfolded proteins has been linked to human

disease, the exact cause of their toxicity remains yet to be discovered. It is not clear

whether protofibril-fibril formation leads to toxicity or whether such macromolecular

structures serve as a protective mechanism to remove toxic protein from living organisms

(Dobson 2001; Dobson 2004). Investigations that determine what factors might cause

proteins to misfold and self-associate may lead to a cure for these diseases.

10

Figure 1.3: Possible misfolding models. Gain or loss of protein function may occur through a shift in protein equilibrium between the native state and disordered aggregates and prefibrillar species (Dobson, 2002). U represents the unfolded polypeptide chain, I represents a folding intermediate, and N represents then functional native structure. As indicated, misfolding of a protein may arise from all three states in the folding reaction.

11

Emerging questions in the protein folding field

The search for a protein folding code has evolved with the technology accessible for its study and has broadened perspectives on protein folding mechanisms. When

Anfinsen first suggested reversible protein folding, proteins were only available from naturally occurring sources and the probes of study were limited. Advances in molecular biology and synthetic chemistry have vastly increased the number of model systems as well as the modifications that can be made to these systems. Continuing developments in spectroscopic techniques and the resulting improvements in structural resolution have provided more detailed information on the folding reaction. The very rapid of protein folding reactions has motivated the development of faster mixing techniques providing insights into microsecond folding events. In conjunction with the wealth of experimental data resulting from these advances, the increasing capability of computers to solve complex problems has moved the field ever closer to a protein folding code.

Structure-guided studies. Detailed knowledge of protein structure is essential to probe the role that specific structural features may have in the folding reaction. The determination of a number of biological genomes has led to a revolution in protein structure determination because structural data confirms the existence of proteins that are predicted from gene sequences. Advanced crystallographic methods, coupled with the ability to produce recombinant protein, have provided the means to determine numerous crystal structures of proteins whose structural information was previously inaccessible

(Hendrickson 1991; Hendrickson 2000; Dale et al. 2003; Derewenda 2004; Lecomte et al.

12

2004). Furthermore, higher resolution NMR magnets have allowed for solution structures of many small proteins (Yee et al. 2006). This information provides a map of specific residue interactions for dissecting the role of various amino acids and substructures in the development of the native protein fold.

Methods of protein engineering. Understanding the role of individual amino acids or domains in the folding of various proteins will enhance the development of a folding database. Comparisons between the folding mechanisms of the variant and wild-type protein have been made possible with advances in recombinant molecular biology. Site- directed mutagenesis can be used to make point mutations, insert or delete loops, circularly permute a sequence, and truncate a sequence (Matthews 1987; Garvey and

Matthews 1990). Large scale overexpression of recombinant proteins yields a sufficient quantity of purified protein for folding studies that may not be naturally available

(Makrides 1996; Baneyx 1999; Sorensen and Mortensen 2005b; Sorensen and Mortensen

2005a). Although molecular biology is a powerful tool for producing proteins and variants, some systems are not as tractable to the standard applications of this technique.

In these cases, limited proteolysis (Porter 1950; Porter 1959) and peptide synthesis

(Merrifield 1965) can be used to make fragments of proteins that may not express well by recombinant methods. Other alternatives include peptide synthesis can to insert nonnatural amino acids into proteins and chemical ligation techniques to add synthetic fluorescent probes to proteins combine domains that may not co express in recombinant proteins (Muir 2003).

13

Thermodynamic versus kinetic control. When the protein folding problem was first posed, a major question involved the idea of thermodynamic vs. kinetic control,

whether the functional fold of a protein is determined by the lowest energy state that the

protein can attain or the rate at which that fold will occur (Anfinsen and Scheraga 1975).

It is now widely accepted that many proteins fold under the influence of thermodynamic

control, attaining the most stable fold or ensemble of folds possible (Govindarajan and

Goldstein 1998). However, a few proteins, including those with pro sequences at the N-

terminus that are subsequently removed after folding (Baker et al. 1992a; Baker et al.

1992b; Eder et al. 1993; Wang et al. 1996) or heterodimers involved in signaling

pathways, such as Fos/Jun (d'Avignon et al. 2006), have been shown to fold via kinetic

control. The question remains as to whether this fold is merely a “trap” from which the

protein cannot attain the activation energy to fold to the native state or the only functional

state for the protein (Sohl et al. 1998).

Chemical kinetics view of folding. Initial attempts to resolve Levinthal’s paradox described folding as a discrete sequence of events in which an unfolded polymer of amino acids reaches its most thermodynamically favored state through a directed search along a productive reaction coordinate (Baldwin 1975; Chothia 1975; Tanaka and

Scheraga 1975; Matthews 1993). While this complex reaction coordinate was meant to imply that ensembles of intermediates were involved in a folding mechanism, the two dimensional nature of the diagram could be mistaken to suggest that only one particular intermediate or transition state could occur. Applications of polymer theory to protein

14

folding have proposed that folding occurs through a three-dimensional funnel, rather than a two-dimensional reaction coordinate (Wolynes et al. 1995) (Figure 1.4). One can visualize this concept as a three-dimensional reaction coordinate, with the apparent folding free energy (less the chain entropy contribution) on the z-axis and chain entropy in a 2-D format on the x- and y- axes. Depending upon the complexity of the folding reaction, the landscape can be rough or smooth. Thus, the landscape describes the possible paths a polypeptide might take as it folds to a functional protein (Lazaridis and

Karplus 1997; Matagne and Dobson 1998). While the polymer physics basis of protein folding appeared to contradict the chemical kinetics interpretation, both views are now reconciled with the concept of a folding energy landscape (Baldwin 1995).

15

Figure 1.4: Energy Landscapes in Protein Folding. The figure above demonstrates

the concept of how the entropic term enters the thermodynamics of folding. Rather than follow a two dimensional reaction coordinate, proteins are thought to fold via an ensemble of intermediates towards the lowest free energy state (Wolynes, 1995).

16

Technology Development

Advances in detection technology have expanded the experimental repertoire for

understanding protein folding, in providing a wealth of information on folding

intermediates. Conventionally far-UV circular dichroism (CD) has been used to probe

the formation of secondary structure (Woody 1995). Ultraviolet-Visible (UV-VIS),

fluorescence (FL), and near-UV CD spectroscopies have provided information about

tertiary packing around intrinsic aromatic side chains (Jennings et al. 1991). However,

the growing availability of other detection methods have broadened and deepened the

scope of the questions that can be posed about folding reactions and partially-folded

states. These techniques include fluorescence resonance energy transfer and anisotropy,

Nuclear Magnetic Resonance (NMR) spectroscopy, mass spectrometry (MS), and small

angle X-ray scattering (SAXS).

Fluorescence spectroscopy. Improved FL detection methods and access to

commercially-available probes have created new opportunities to obtain site-specific

information on folding reactions. Since tryptophan and tyrosine fluorescence report on

local folding events surrounding these side chains, both naturally occurring and point- mutated side chains with fluorescent properties have been followed by this detection method (Lakowicz 1980; Lakowicz et al. 1983; Lakowicz 1986). The enhanced fluorescence of 1-anilinonaphthalene-8-sulphonate (ANS) during the refolding of a protein indicates the transient formation of hydrophobic clusters to which this dye binds

17

(Jones et al. 1994; Uversky et al. 1998). Binding studies with fluorescent cofactors have elucidated the formation of active sites during folding, such as NADP+ binding to DHFR

(Jennings et al. 1993). FL anisotropy has been used to determine tryptophan rotational correlation times as a means of determining the rotational freedom and solvent exposure of each tryptophan side-chain, and monitor the role of hydrophobic collapse in proteins

(Jones et al. 1995; Mei et al. 2003). Förster Resonance Energy Transfer (FRET) has allowed the measure of discrete distances between donor and dye-labeled acceptor amino acids (Selvin 1995).

NMR spectroscopy. NMR spectroscopy can provide high-resolution structural information of transient intermediates for soluble proteins whose molecular weights are less than ~30,000 Da (Dyson and Wright 1996; Dyson and Wright 2005). Non-random structure in unfolded states (Neri et al. 1992a; Neri et al. 1992b; Kemmink and Creighton

1993; Dyson and Wright 2004; Mohana-Borges et al. 2004; Bowler 2007), the structure of a very small populations of a folding intermediate (Bezsonova et al. 2006; Korzhnev et al. 2006; Mittermaier and Kay 2006; Neudecker et al. 2006), and the structure of a stable model of a folding intermediate (Bai et al. 1995) have all been reported. NMR also has been extensively used to monitor the exchange of main chain amide hydrogens with solvent deuterium (H/D exchange) as a means of determining the persistence of hydrogen bonds in rare high-energy states (Meier et al. 2004) or the appearance of stable hydrogen bonds in early folding intermediates (Roder et al. 1988; Udgaonkar and Baldwin 1988;

Udgaonkar and Baldwin 1990). Real-time NMR has provided information about the

18

environment of side chains and also solvent accessibility of aromatic side chains within

50 ms (Zeeb and Balbach 2004). Dynamic NMR involving lineshape analysis among

equilibrium species in the transition region has provided details of early folding events

(Mok and Hore 2004).

Mass spectrometry. H/D exchange methods coupled with mass spectrometry have been used to provide detailed structural information about stable and transiently formed folding intermediates (Krishna et al. 2004). This technique is powerful because the limit of this detection method is near the picomolar concentration range and it is also amenable to larger proteins (Eyles and Kaltashov 2004). It is, however, limited to the detection of structure at the level of the peptides generated by proteases under acidic conditions where back exchange is very slow. By contrast, H/D exchange coupled with

NMR provides site-specific information on stable hydrogen bonds.

Small angle X-ray scattering. The size and shape of a protein during its folding reaction can be determined with small angle X-ray scattering (SAXS). Guinier plots of the scattering data provide estimates of the radius of gyrations, Rg, and Kratky plots reveal the shape of the protein in various solvents and during the folding reaction (Plaxco et al. 1999; Segel et al. 1999a; Segel et al. 1999b; Uversky et al. 1999; Moncoq et al.

2004; Bilsel and Matthews 2006; Jang do et al. 2006; Arai et al. 2007). With this information, it is possible to determine when compaction of the polypeptide chain occurs during folding and the size and shape of the ensemble of intermediates. One drawback of

19

this technique is that it requires relatively high concentrations of protein, ~1mg mL-1;

aggregation of marginally-soluble intermediates can obscure the signal.

Ultra-fast folding detection. Faster mixing techniques and other technology can probe a wider range of folding timescales and enrich the knowledge of kinetic folding mechanisms (Figure 1.5). Traditional manual-mixing techniques, although appropriate for very slow reactions, > 100s, miss the first 5-10 seconds of the folding reaction

(Jennings et al. 1991; Matthews 1993). Stopped-flow techniques can follow folding reactions from the millisecond to the hundreds of seconds timescale. Ultra-fast, continuous-flow mixing methods and dynamic NMR have provided a glimpse into the microseconds range, but these methods require much larger quantities of protein than traditional optical stopped-flow and manual mixing techniques (Huang and Oas 1995;

Ghaemmaghami et al. 1998; Myers and Oas 2002; Arora et al. 2004; Roder et al. 2004;

Bilsel and Matthews 2006). The nanosecond timescale has been probed by temperature- jump methods, but primarily for small, two-state folding proteins (Wang et al. 2005;

Bunagan et al. 2006a; Bunagan et al. 2006b; Xu et al. 2006; Gai et al. 2007). Although

T-jump methods have been successfully used to examine nanosecond to microsecond timescale kinetics of larger proteins (Ballew et al. 1996; Gilmanshin et al. 1997a;

Gilmanshin et al. 1997b; Gilmanshin et al. 1998; Hagen and Eaton 2000; Gulotta et al.

2001), they are limited to relaxation kinetics starting from a cold denatured state and require accurate knowledge of the equilibrium constant to obtain microscopic rates.

20

dynamic NMR manual laser-induced temperature jump mixing

pressure jump

MD simulations ultrafast mixing stopped-flow Technique

-9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 log(Time (s))

helix/coil β-hairpin transition formation hydrophobic collapse protein rotation proline Folding events loop closure isomerization trp fluor. side-chain contacts form

native state formation

FIGURE 1.5: Timescale of protein folding reactions. A vast array of experimental approaches has been used to map the folding of proteins kinetically (Bilsel, unpublished).

21

Mutational analysis of folding reactions

Intermediates in protein folding mechanisms can be probed with mutational

analysis (Matthews 1987). Site-directed mutagenesis can be used to introduce point

mutations at individual residues that may stabilize transient intermediates. The folding

reaction can be measured over a time course with spectroscopic detection methods. The resulting kinetic traces can then be fit to a single exponential, or a series of exponentials, to yield a relaxation time. Using a chevron analysis, a semi-logarithmic plot of kinetic relaxation times as a function of denaturant concentration, the changes in folding kinetics

of different variants compared to those of the wild-type protein can determine a map of

the energetic landscape for a given protein (Jennings et al. 1991). In the case of proteins

with two-state folding mechanisms, point mutations perturb the folding to suggest

whether a certain residue contributes to the formation of the transition state.

Phi-value analysis utilizes protein variants containing point mutations to

characterize the contribution of individual residues to the formation of the transition state

(Fersht et al. 1991). The change in the activation free energy of folding between the

variant and the wild-type protein is divided by the overall change in stability between the

variant and wild-type protein. A mutation at a residue leading to a phi value closer to unity indicates that the variant more closely resembles the native state, while a resulting phi value closer to zero indicates the variant more closely resembles the unfolded state.

However, when a phi value lies near the value of 0.5, the interpretation becomes ambiguous. Phi value analysis has been attempted for three-state folders, but the results are not definitive (Lindberg et al. 2005).

22

Topology and folding

Relative contact order. To address the relationship between topology and folding rates, Plaxco and colleagues have developed an algorithm that calculates the relative contact order of residues in the protein using a known crystal structure of a protein

(Plaxco et al. 1998; Plaxco et al. 2000; Ivankov et al. 2003). They have shown that a linear relationship exists between the logarithm of the refolding rate and the relative contact order of a two-state folding protein. The greater the fraction of local contacts, the lower the contact order and the faster the refolding reaction. In essence, chain entropy governs the rate of the reaction. The folding reactions of proteins with more complex folding mechanisms do not obey this same linear relationship. A possible explanation for these deviations from the linear relationship is that this method only works for two-state proteins because the rates in multi-step folding mechanisms are highly coupled to each other.

Topology modules within a protein. Other current approaches relating topology

to folding include searching for discrete folding modules within protein families. This

method has been used with TIM barrels, as they have been demonstrated to have

conserved folding mechanisms based on similar topologies. (Gu et al. 2007; Wu et al.

2007; Yang et al. 2007) These homologous proteins of low sequence identity populate a

similar equilibrium intermediate that contains a high amount of structure and stability.

Fragmentation analysis of the α-subunit of Trp synthase showed that the minimum

23

structural module is likely to require a βαβαβ sequence (Zitzewitz and Matthews 1999).

(βα)n folding modules also are being investigated in the family by the

BASiC hypothesis that hydrophobic clusters dominated by Branched Aliphatic Side

Chains from Leu, Ile and Val residues play crucial roles in guiding the early folding reactions and in stabilizing subsequent equilibrium intermediates (Matthews, unpublished). The similar topology dictates similar folding mechanisms for members with very different sequences (Forsyth et al. 2007).

24

Key protein folding studies on different motifs

A comparison of the folding mechanisms from all classes of proteins (α,

β+α, β/α, and β) illustrates that proteins fold as a function of both enthalpic and entropic

contributions. However, it has been observed that the initial formation of structure can be

described by two major models: the framework model and the hydrophobic collapse

model. The framework model suggests that proteins fold by building upon local

interactions (e.g. helices) to form structure in a hierarchical manner. In contrast, the

hydrophobic collapse model is based upon the tendency of nonpolar side chains to be

excluded from aqueous solvent in a non-specific manner. In the hydrophobic collapse

model, the search for the native state proceeds from a non-specifically collapsed state

which has been suggested to reduce the conformational search towards the native state.

For some proteins, the extent of local bias in determining the folding pathway along the

energy landscape depends on the native structure and the prevailing mechanism may be

better described as a composite of the two extremes (Arai et al. 2007). Based on the

intricate characterization of a few canonical proteins, trends have been suggested for

these models.

Folding trends in mainly α-helical proteins. Several helical proteins have been studied extensively, as these proteins behave well in solution, are relatively easy to access, and yield a robust CD signal. Typically, the framework model drives structure formation in these proteins. Both monomeric lambda repressor (Yang and Gruebele

25

2004) and cytochrome c (Krantz et al. 2002) fold via a two-state mechanism. The small size (~100 amino acids or fewer) and high helicity of these proteins favors this mechanism. Interestingly, the heme group of cytochrome c enables more intricate folding investigations, including ultra-fast studies of its native-like contacts in burst phase folding (Abel et al. 2007). Among the helical immunity proteins, Capaldi and Radford have shown that overall global stability in IM9 yields a two-state mechanism, while more localized packing in IM7 favors formation of a compact intermediate (Ferguson et al.

1999; Capaldi et al. 2001; Ferguson et al. 2001; Gorski et al. 2001; Capaldi et al. 2002;

Friel et al. 2003; Gorski et al. 2004; Spence et al. 2004). Barstar also folds via a three- state mechanism with a native-like intermediate (Agashe et al. 1995; Shastry and

Udgaonkar 1995; Agashe et al. 1997; Schoppe et al. 1997; Lakshmikanth et al. 2001;

Pradeep and Udgaonkar 2007; Sinha and Udgaonkar 2007). Using NMR, Wright and colleagues have shown that intermediate formation during apomyoglobin folding is dictated by interhelical packing stabilized primarily by the (Jennings and Wright 1993; Eliezer et al. 1995; Eliezer et al. 1997; Eliezer et al. 1998; Nishimura et al. 2002; Schwarzinger et al. 2002; Nishimura et al. 2003). Typically, helical proteins exhibit local structural propensity and helical packing interactions that govern intermediate and transition state formation (Gunasekaran et al. 2001). Therefore, proteins with helices close in sequence will fold quickly and then the three-dimensional pattern of the various helices will determine whether intermediates form.

26

Folding trends in β+α and β/α proteins. The folding mechanisms of many

β+α and β/α proteins, including β+α ribonuclease A, β+α lysozyme, β/α the alpha subunit of , and β/α dihydrofolate reductase (DHFR), have also been well characterized. Ribonuclease A has the simplest mechanism of these four with three- state folding involving a native-like intermediate (Neira and Rico 1997; Neira et al. 1999;

Dyson et al. 2006). Both lysozyme (Dobson et al. 1994; Matagne and Dobson 1998) and the alpha subunit of tryptophan synthase (Bilsel et al. 1999a; Bilsel et al. 1999b; Wu and

Matthews 2002a; Wu and Matthews 2002b; Wu and Matthews 2003; Wu et al. 2005; Wu et al. 2007), a TIM barrel protein, become complicated by proline isomerization issues, leading to parallel folding pathways. In addition, alpha subunit of tryptophan synthase folds into an off-pathway intermediate, probably the result of barrel packing issues.

DHFR folding becomes more complex, as multiple ensembles fold in a parallel manner

(Iwakura et al. 1993; Jennings et al. 1993; Jones et al. 1994; Iwakura et al. 1995; Jones et al. 1995; Jones and Matthews 1995). However, some of these native-states are competent to bind ligand, while others are not. In genomic comparisons of both TIM barrels and

DHFR, different folding mechanisms have been shown for different species, but the overall trend of ensembles populated is conserved by topology. These results suggest that sequence, secondary structural components, and topology are all implicated in the path by which a protein will fold.

27

β−Protein Folding Studies

More recently, with help of a growing toolbox of technological advances, folding studies of mainly β-sheet proteins have been added to the protein folding database.

Unlike their α-helical, β+α, and β/α counterparts, β-proteins have a diminished native

CD spectral signature and less signal change upon unfolding. Thus, the thermodynamic information available from CD methods is less reliable for these proteins because the signal-to-noise ratios are smaller. While native tryptophan and tyrosine residues within

beta proteins can provide spectral probes, the information available from FL detection is

dependent upon the location of the residues within the proteins and the solvent exposure

changes of these residues upon unfolding. For example, Gierasch and colleagues (Clark

et al. 1998) have compared β-sheet formation with β-barrel closure in CRABP by introducing a single tryptophan into several positions in the sequence. Additionally, β- proteins have a higher propensity towards aggregation, further complicating folding studies.

Patterns in β-motif formation. Model β-sheet peptide studies have examined

fundamental aspects of β-structure during folding (Searle and Ciani 2004). A variety of

small β-hairpins investigated showed the common theme that the β-turn sequence in the hairpin provides steric direction and that electrostatic salt bridge formation stabilizes the structure, but does not drive folding as these interactions occur later in the folding reaction (Gibbs et al. 2002). Furthermore, minimal β-sheet models have shown that

28

hydrophobic side chain association, not the hydrogen bonding network, drives the

formation of multi-stranded β-sheet structures suggesting that long range interactions are more important in β-sheet folding (Kiehna and Waters 2003). Cooperative folding within

β-proteins is thought to proceed via distal tertiary contacts, not residues close in sequence

(Klimov and Thirumalai 2002).

Model β-protein systems. Small β-proteins, those containing fewer than 100

amino acids, typically fold through a cooperative two-state mechanism. The folding of

SH3 domains, small β-sandwich proteins consisting of two or three β-hairpins, is well- described by a two-state mechanism involving a transition state with a structured three- stranded β-sheet (Grantcharova and Baker 1997; Riddle et al. 1999; Grantcharova et al.

2000; McCallister et al. 2000; Yi et al. 2003). FRET measurements during refolding and a phi-value analysis have shown that cold shock proteins, small β-barrels with a two-

stranded and three-stranded sheet structure have a native-like transition state, preceded by

a rapid collapse in structure (Garcia-Mira et al. 2004; Magg and Schmid 2004). The IgG

binding domains of protein L and protein G consist of a helix packed against a four-

stranded β-sheet. A comparison between their two respective folding mechanisms indicates that local interactions lead to the stability of the isolated β-hairpins, but their

overall topology provides for an intact β-turn in the transition state (McCallister et al.

2000). Phi-value analysis has provided a transition state map for TI 127, from the human cardiac titin, in which the protein is highly structured except for a few residues in the N-

terminal region (Fowler and Clarke 2001). Comparison of TI 127 with an unrelated

29

fibronectin type III domain revealed that these β-sandwich proteins have a common core of residues involved in folding nucleation, but that the nucleus can become deformed to accommodate changes in sequence (Geierhaas et al. 2004).

Larger β-proteins display more complex folding pathways. Intracellular -

binding proteins, including CRABP and IFABP, contain approximately 130 amino acids

and have been shown to fold via native-like intermediates (Ropson et al. 1990; Clark et

al. 1996; Burns et al. 1998; Clark et al. 1998; Eyles et al. 1999; Dalessio and Ropson

2000; Ropson et al. 2000; Burns and Ropson 2001; Rotondi and Gierasch 2003a; Rotondi

and Gierasch 2003b; Rotondi et al. 2003; Ignatova and Gierasch 2005). Interleukin-1β,

with a β-trefoil fold, contains 153 amino acids and folds through an intermediate with

partial structure (Heidary et al. 1997; Jennings et al. 1998; Covalt et al. 2001; Finke and

Jennings 2002; Heidary and Jennings 2002; Roy and Jennings 2003; Heidary et al. 2005;

Roy et al. 2005; Gosavi et al. 2006). Human γD crystallin, a 174-residue, β-sheet protein consisting of four Greek keys in two domains, has two intermediates in its folding pathway and a large propensity for aggregation (Kosinski-Collins et al. 2004; Flaugh et al. 2005a; Flaugh et al. 2005b). These intricate folding pathways most likely arise due to the larger size and numerous distal contacts that must be made in the complex topology

of these proteins

30

Multimeric systems: an added layer of complexity for the protein folding problem

Building upon the base of knowledge from monomeric proteins, explorations into the folding mechanisms of multimeric systems have begun to be elucidated in recent years. From simple coiled-coil mechanisms to complex multidomain systems, studies of multimeric proteins have provided insights into the development of quarternary structure during folding to their biologically functional forms (Zitzewitz et al. 1995; Gloss and

Matthews 1997; Gloss and Matthews 1998a; Gloss and Matthews 1998b; Barry and

Matthews 1999; Zitzewitz and Matthews 1999; Zitzewitz et al. 2000; Gloss et al. 2001;

Gloss and Placek 2002; Knappenberger et al. 2002; Placek and Gloss 2002; Banks and

Gloss 2003; Banks and Gloss 2004; Doyle et al. 2004; Topping et al. 2004; Placek and

Gloss 2005; Placek et al. 2005; d'Avignon et al. 2006; Gloss 2007; Riley et al. 2007;

Steinmetz et al. 2007). These extensions are very important because a majority of proteins in biology exist as multi-subunit complexes. However, increase in the complexity of the multimeric folding problem has dictated that most studies focus primarily on relatively small multimers (Jaenicke and Lilie 2000).

Complexity of bimolecular kinetics. As demonstrated in several monomeric proteins, pockets of tertiary structure often form in the individual subunits before the development of secondary, tertiary, and quarternary structure (Jaenicke and Lilie 2000).

The addition of quartenary structure formation to the folding process leads to the question of whether each subunit must be completely folded before the coordination of the

31

functional protein or if the chains can fold together in a combined molten globule state.

Multimeric systems that have been characterized to date have shown that both scenarios are possible and the kinetic order of the association reaction often depends on the nature of the interface between a given subunit. For example, the subunits in some multimeric proteins fully acquire their tertiary structure before associating (Svensson et al. 2006a), while in other proteins formation of tertiary structure within each subunit occurs subsequent to oligomerization (de Prat-Gay et al. 2005). Additionally, traditional relaxation kinetics cannot approximate the appropriate rates for the association reaction because of its bimolecular character (Bernasconi 1976). Furthermore, if other steps of the folding reaction are highly coupled to its association, monomeric rate approximations will also be affected. Thus, multimeric folding analysis is far more involved than simply adding another step to the reaction pathway.

α-helical multimer folding. The coiled coil is the simplest model that has been used to study oligomeric folding. Studies on dimeric GCN4-pI, a peptide-based model system, showed that the folding mechanism is well-described by a simple bimolecular model (Zitzewitz et al. 1995; Zitzewitz et al. 2000; Knappenberger et al. 2002; Ibarra-

Molero et al. 2004). However, the four orders in magnitude reduction in the second order rate constant was interpreted in terms of a very rapid pre-equilibrium between coil and minor helix populations in each chain. Folding appears two-state, but may involve an intermediate that associates and rapidly propagates to produce the native structures.

Helical bundles, such as Rop proteins, show that tetramer forms through a dimer

32

association reaction between preformed dimers of two helices (Munson et al. 1997).

These systems illustrate the complexity arising from coupling an association reaction

with secondary and/or tertiary structure formation. The formation of quarternary

structure cannot be viewed simply as a final association step in folding, but must be

considered as part of a coordinated process that integrates the formation of secondary,

tertiary, and quarternary structure.

α/β dimer folding trends. Expanding upon the lessons learned from coiled coils,

the folding mechanisms of several dimeric proteins with both alpha helical and α/β content have shown key trends in the development of their native structure. Trp repressor folds via 3 parallel pathways and a dimeric intermediate (Gloss and Matthews

1997; Gloss and Matthews 1998b; Gloss and Matthews 1998a). Further dissection of this highly intertwined dimer into a dimeric core identified inter-subunit helical packing to be the driving force of the folding reaction, as well as dictating the relative stabilities of several intermediates (Gloss et al. 2001; Simler et al. 2006). Trp repressor and P22 arc

repressor (Milla and Sauer 1994; Milla et al. 1995; Sauer et al. 1996) both show the

acquisition of secondary structure and association concurrently. The H2A/H2B histone

heterodimer has a three-state mechanism with a rapid-appearing dimeric intermediate and

dimer maturation is rate-limiting (Gloss and Placek 2002; Placek and Gloss 2002; Banks

and Gloss 2003; Placek and Gloss 2005). Glutathione transferases also involve fast

association with a dimer intermediate (Wallace and Dirr 1999). In each of the above

33

mentioned proteins formation of the dimer interface precedes mature structure in each

subunit.

Folding of β motif multimers. The current literature contains relatively few

examples of multimeric folding mechanisms for proteins whose subunit structures are

dominated by the β motif. The low signal-to-noise in CD spectra of beta proteins, the

less than optimal location of fluorescent probes, and the propensity for these proteins to

aggregate complicates the already difficult task of dissecting tertiary and quarternary

folding steps. Human Cu, Zn apo-Superoxide Dismutase (Lindberg et al. 2004; Svensson

et al. 2006b) and the antibody domain CH3 (Feige et al. 2004) both slowly form folded

monomers prior to a fast dimerization process. In contrast, the parallel folding pathway

of the Human Papillomavirus protein involves a fast folding monomeric intermediate that

either quickly folds to native or a non-native dimeric intermediate (de Prat-Gay et al.

2005). In the case of the two former proteins, the dimer is comprised of two distinct β- barrels, while for the latter protein, β-barrel formation depends upon dimer association.

Thus, it appears that the rate-limiting step in these folding pathways involves the contacts necessary for β-barrel formation.

34

Dissecting multimeric folding with monomer constructs

The concept of autonomous folding units (AFUs) to study the folding of monomeric proteins has provided many insights into folding intermediates that may not be apparent in the cooperative folding, induced by chemical denaturants, of the whole protein that can spontaneously fold to a native-like structure. An AFU is comprised of a segment of a protein, independently-folded from a single secondary structural element to an assembled subdomain. These substructures of proteins have been suggested to be involved in both the nucleation and higher order assembly processes of protein folding

(Peng and Wu 2000). The thermodynamic stability of these AFUs, as well as their

folding kinetics, has yielded valuable insights into the folding mechanisms of several

monomeric proteins (Peng and Wu 2000). While AFUs were originally isolated with

limited proteolysis (Porter 1950), advances in protein engineering have enabled the

construction and expression of AFUs as recombinant proteins (Zitzewitz and Matthews

1999).

AFUs to dissect monomer folding. AFUs have been used to dissect the complex

folding of both the alpha subunit of tryptophan synthase and DHFR. In the case of the alpha subunit, N-terminal folding cores were generated utilizing recombinant DNA technology to systematically place stop codons after β-α-β modules (Zitzewitz and

Matthews 1999). These studies indicated that a stable core formed from the folding of

amino acids 1-188 exhibited an apparent two-state equilibrium folding, but more complex

35

kinetics with two refolding phases. The native state ensemble displayed a burst-phase

that resembled an intermediate that is populated at 3 M urea in the full length protein.

Additionally, the refolding kinetics of this fragment did not exhibit the slow, prolyl

isomerization phases, suggesting that those phases could be attributed to the C-terminus.

Fragments of DHFR were made by chemical cleavage of cysteine point mutations within

the protein (Gegg et al. 1997). Interestingly, even though the N-terminal residues 1-36

are essential to the core of the full-length protein, a fragment comprising residues 37-159

was able to form a stable, unique fold. These studies were used to determine the roles of

the adenine binding domain and the discontinuous loop domain in DHFR folding. In

both cases, a highly complex folding mechanism was teased apart by examining the

folding of modular components.

Creating stable monomer subunits. Investigators have utilized the AFU approach with multimeric systems. Since the association reaction in a multimer complicates kinetic analysis, a stable monomer construct that does not self-associate allows for the dissection of multimeric folding pathways. A monomer of Trp repressor demonstrated a non-native structure in the transient, monomer intermediate of the folding pathway (Shao and Matthews 1998). The monomer construct, consisting of two point mutations at the dimer interface, of a glutathione transferase, rGSTM-1, displays thermodynamic stability independent of dimerization, similar structural properties to a monomer subunit of the native dimer, but loss of catalytic activity (Thompson et al. 2006). A similar approach was used in the study of the superoxide dismutase folding mechanism, where the folded

36

monomers were shown to undergo a diffusion-limited association reaction to produce the native dimeric forms (Svensson et al. 2006a). The ability to dissect a multimeric system into its parts, by decoupling the subunit folding and the association reaction, has been exceedingly useful in understanding their complex folding mechanisms.

37

Scope of the thesis

The goal of this thesis was to probe the folding and assembly of multimeric

proteins. Two very different proteins, HIV-1 protease and L. terrestris erythrocruorin,

were studied as model systems. HIV-1 protease is as dimeric protein comprised of two

subunits, with an interface dominated by a four-stranded formed

from the interdigitation of the N- and C- termini of this 99-residue protein (Figure1.6).

The erythrocruorin is a very much larger protein, 3.5 ¯ 106 Da, consisting of twelve

dodecameric hemoglobin subunits docked upon a scaffolding complex, whose core is

formed with coiled coils (Figure 1.7). As outlined below, these two proteins were studied

in very different manners, but with the same basic goal of understanding how multimeric

proteins fold to their functional forms.

HIV-1 protease folding. HIV-1 protease is the key protein the maturation of the

HIV virus (Turner and Summers 1999). HIV-1 protease is synthesized as part of the

Gag-Pro-Pol polyprotein, and is the enzyme that cleaves both the Gag and Gag-Pro-Pol polyproteins into viral proteins. A plethora of efforts have focused on designing inhibitors to this protein as part of developing new therapeutic treatments for AIDS.

With the goal of studying the intricacies of folding of beta protein multimers, in particular those with beta barrel motifs, the folding mechanism of HIV-1 protease is the major focus of this thesis. The folding of the full length dimer and a stable monomeric construct were studied using both equilibrium and kinetic folding methods with the goal of defining the folding energy landscape of HIV-1 protease.

38

Figure 1.6: Crystal Structure of HIV-1 protease. Shown is the unliganded crystal structure of an inactive variant of HIV-1 protease. The molecule is a homodimer consisting of beta barrel subunits. The dimer interface is comprised of an active aspartic acid and a beta sheet formed by the intertwining of the N- and C- termini of each subunit.

The mutations that enhance thermodynamic reversibility (Q7K/D25N/C67A/C95A) as well as the intrinsic fluorophores (Trp6, Trp42, and Tyr59) are highlighted in one subunit.

39

Figure 1.7: The 5.5 angstrom crystal map of erythrocrourin. The majority of the complex is hemoglobin subunits docking on an inner scaffolding complex. The Q dyad indicates the two-fold axis of symmetry and the P dyad represents the six-fold axis of symmetry. The inner scaffold is composed of intertwining coiled coils, which may initiate the folding of this protein (Royer, 2000).

40

Coiled coils involved in assembly. The erythrocruorin molecule provides extracellular, oxygen transport for the L. terrestris earthworm. Due to its ability to exist outside the red blood cell, this hemoglobin analogue has been investigated as a potential blood substitute (Hirsch et al. 1997; Dorman et al. 2002; Harrington et al. 2007). The 3.5

Å resolution crystal structure shows that this protein is a dodecamer of dodecameric hemoglobin units assembled onto a scaffolding complex. The scaffolding complex appears to be composed of intertwining linker chains with a coiled coil structure acting as the initiating unit of assembly. Based upon the demonstrated role of coiled coils in other proteins and their example of simple subunit assembly, it was proposed that these substructures drive the assembly of the linker chain complex. In this thesis, peptides corresponding to the segments of the linker chain coiled-coil region were synthesized, characterized, and mixed to examine their role in driving the assembly of the linker chain scaffold.

Though these two proteins seem vastly different, common approaches to probing their folding mechanisms can provide useful insights into the development of higher order structure in biology. Based upon the theme of folding and assembly, this thesis aims to advance the frontier of protein folding in multi-subunit complexes.

41

CHAPTER II

THE FOLDING FREE ENERGY SURFACE OF HIV PROTEASE:

MONOMER FOLDING CONTROLS THE FORMATION OF THE

NATIVE DIMERIC β-BARREL

42

The following chapter is in preparation as the manuscript

“The Folding Free Energy Landscape of HIV-1 Protease: Monomer Folding Controls the

Formation of the Native Dimeric β-barrel”

A. A. Fitzgerald, O. Bilsel, Y. Wu, A. Kundu, J.A. Zitzewitz, and C. Robert Matthews

for submission to Journal of Molecular Biology.

Contributions to the work are as follows:

With the exception of data regarding the Trp variants, all of the data was collected and analyzed by Amanda A. Fitzgerald.

Osman Bilsel developed the software for and guided the global analysis efforts, as well as helping to postulate a mechanism.

Ying Wu helped determine the initial conditions for the study presented here and also provided insightful discussions regarding the project.

Agnita Kundu created the Trp variants and collected and analyzed that data in conjunction with an HIV-1 protease mutational analysis that she is pursuing in the lab.

Jill Zitzewitz provided guidance for experimental design and interpretation. Jill also provided useful insights in the development of this manuscript.

Bob Matthews served as the principal investigator on this project and has provided insightful analysis and discussions of the studies presented here.

43

Abstract

The folding energy landscape of HIV-1 protease, the viral enzyme responsible for

HIV viral maturation, has been explored using traditional protein folding methods, as

well as a global analysis of the folding kinetics. HIV-1 protease is a homodimer

consisting of two β-barrel subunits; each subunit contains 99 residues and has a

molecular weight of 21 kDa. In this study, equilibrium folding of HIV-1 protease,

monitored using both circular dichroism and fluorescence spectroscopy, displayed an

apparent two-state behavior. The kinetics of the folding reaction were probed as a

function of both urea and protein concentration and monitored via manual-mix circular

dichroism and stopped-flow fluorescence methods. A chevron analysis of the kinetics

revealed that folding of monomers limits access to the dimeric native state in a three-state

sequential pathway folding mechanism. Global analysis of the entire kinetic data was

used to quantitatively test this mechanism and obtain microscopic rate constants

corresponding to the monomer folding rate (4.7× 10-2 s-1) and the association rate (1.0 ×

106 s-1M-1) at pH 6.0 and 25 ºC. The thermodynamic parameters obtained from this

global fit were used to simulate the populations of all species in both equilibrium and

kinetic conditions. To further discriminate monomer kinetics from dimerization kinetics,

the folding reaction of a monomeric HIV-1 protease construct containing a C-terminal

deletion of the last 4 residues was also characterized. The isolated monomer folding

reaction studied with this construct suggested that an additional fast folding phase,

44

observed in both monomeric and dimeric constructs, represents a localized folding event in the monomer chains. This combined analysis revealed that HIV-1 protease folds via a four-state, sequential folding mechanism.

45

Introduction

HIV-1 protease. The infection by human immunodeficiency virus (HIV), leading to acquired immunodeficiency syndrome, AIDS, has come to the forefront as a global public health crisis since the initial identification of this disease twenty-five years ago.

HIV-1 protease, the viral enzyme belonging to the aspartyl protease family, is responsible for the maturation of HIV through the proteolytic cleavage of the viral Gag and Gag-Pro-

Pol polyproteins thereby releasing the structural proteins and functional . As such, HIV-1 protease is a therapeutic target in the treatment of AIDS (Kohl et al. 1988).

The sequences of the cleavage sites within the Gag and Gag-Pro-Pol are highly variable

(Chou 1996). Structural studies have demonstrated that HIV-1 protease attains its specificity for its substrates from a consensus asymmetric substrate shape known as the

“substrate-envelope” (Prabu-Jeyabalan et al. 2002).

Drug resistant HIV-1 protease. Structure-based drug design has guided the leading efforts in developing the first generation of HIV-1 protease inhibitors (Wlodawer and Erickson 1993). Though these inhibitors initially showed great promise, the

infidelity of the reverse transcriptase coupled with a high viral multiplicity rate resulted

in multiple HIV-1 protease variants that maintained proteolytic activity, yet were drug-

resistant (Chen et al. 1994; Gulnik et al. 2000; Wu et al. 2003). In response to this

setback, a wealth of structural information has improved the understanding of the drug

resistance mechanism. The second generation of inhibitor design has, in fact, built upon

46

the progress in the molecular-level understanding of the mechanism by which an enzyme can acquire drug resistance while maintaining specificity for nine different catalytic sites

(Mahalingam et al. 2001; King et al. 2004; Ozer et al. 2006).

Drug resistance and the HIV-1 protease free energy landscape. The presence of

both and non-active site mutations (Ohtaka et al. 2003) in these variants

suggests that the change in activity may arise from an altered free energy landscape of

accessible HIV-1 protease conformations (Martin et al. 2005). This idea of differential

perturbations in the HIV-1 protease conformational ensemble is supported by the

application of the protein folding landscape theory to ligand binding (Kumar et al. 2000).

Crystal structures detailing a crucial “intermediate” conformation of the HIV-1 protease

flap tips during substrate binding further confirm this role of conformational changes in

drug resistance (Prabu-Jeyabalan et al. 2006).

A common hypothesis has suggested that drug-resistant HIV-1 protease variants

arise as a consequence of the native ensemble conformational plasticity that may allow

for population shifts favoring enzymatic activity. Proteolytic enzymes, especially

aspartyl proteases (Guruprasad et al. 1995), have been shown to operate through

conformational selection (Fairlie et al. 2000). Molecular dynamics simulations have postulated that the catalytic efficiency of HIV-1 protease is not limited to the active site, but due to mechanical fluctuations of the whole protein (Piana et al. 2002). Moreover, calculations of the reaction path between two alternate ligand-free HIV-1 protease conformations have shown that the small free energy difference between the “closed” and

47

“open” conformations of the flap region allows for mutations, thermal changes, and

substrate binding to alter the protein stability (Rick et al. 1998). Hence, an experimental

examination of the HIV-1 protease folding free energy landscape could add detailed

insights into the importance of the HIV-1 protease conformational ensemble.

Structure of HIV-1 protease. Each subunit of the homodimer HIV-1 protease

(Figure 2.1a) contains ninety-nine residues and is comprised of nine β-strands and one α-

helix. β-strands two through eight are involved in the formation of a jelly-roll β-barrel

topology within one subunit. The dimeric interface nestles the active site and is

stabilized by an anti-parallel β-sheet formed by the interdigitation of the N- and C-

termini of each subunit (Figure 2.1b). Interactions at the subunit interface contribute

additional stability to the dimer (Wlodawer et al. 1989). Nuclear magnetic resonance

spectroscopy (NMR) studies of HIV-1 protease variants have demonstrated that key

mutations will disrupt the monomer-dimer equilibrium to shift the population

predominantly toward monomer (Ishima et al. 2001; Ishima et al. 2003; Louis et al.

2003). Also, deletion of the C-terminal beta strand of HIV-1 protease produces a folded monomer in solution (Ishima et al. 2003). This truncated monomer construct is disordered at the N- and C-termini, as well as the flap tips; the active site is solvent- exposed (Ishima et al. 2003).

Thermodynamics of HIV-1 Protease. Several investigations have examined the thermodynamic properties of HIV-1 protease under a variety of conditions. Meek and

48

(a) (b)

Figure 2.1: Structure and Topology of HIV-PR. (a) The crystal structure of unliganded, homodimer, D25N HIV-1 (1TW7) protease shows two β−barrel subunits with a dimerization interface comprised primarily of an anti-parallel β−sheet formed from the interdigitation of the N- and C- termini of each subunit. The flaps (residues 43-

56) at the top of each barrel are in an “open position” with respect to each other. Mutated residues in the wild-type HIV-1 protease and FL chromophores are shown in stick model representation and are indicated in one subunit of the structure by residue number: Q7K

(yellow), D25N (purple), C67A and C95A (green), Tyr59 (magenta) and Trp6 and Trp42

(cyan). (b) The topology map of one subunit has a jelly-roll beta barrel topology.

β−strands are indicated with arrows, the helix is represented by a coil, and loop regions are displayed with lines. Strands are numbered successively and the corresponding N- and C-terminal residues are shown on the side of the strand. Dashed arrows indicate the

N- and C- terminal beta strands from the interacting subunit.

49

co-workers have shown that the conformational stability of HIV-1 protease is greater than

that of SIV-1 protease, and that inhibitor-binding causes a substantial increase in stability

for HIV-1 protease (Grant et al. 1992). Assays involving binding of inhibitor have

demonstrated that the slow dissociation of HIV-1 protease is strongly pH dependent

(Darke et al. 1994). Sedimentation equilibrium studies have further elucidated a

connection between drug-resistant mutations that arise in HIV-1 protease and a reduction

in dimer stability compared to wild-type (Xie et al. 1997). Freire and co-workers have

provided an analysis of stabilization energetics with differential scanning calorimetry,

defining the dimerization interface as the major contribution to the Gibbs free energy

(Todd et al. 1998). A few kinetics studies have implied local unfolding interactions of

HIV-1 protease (Panchal and Hosur 2000; Panchal et al. 2001). Although these studies

have clarified aspects of thermodynamic and kinetic folding properties of HIV-1

protease, a comprehensive knowledge of its folding landscape remains elusive.

Motivation for the current study. A host of studies have defined the folding pathways of β-barrel monomers (Carlsson and Jonsson 1995; Clark et al. 1996; Burns et al. 1998; Capaldi and Radford 1998; Clark et al. 1998; Eyles et al. 1999; Covalt et al.

2001; Gunasekaran et al. 2001; Feige et al. 2004), but there have been only a few detailed analyses of dimeric β-barrel folding mechanisms to date, including human apo-Cu, Zn

superoxide dismutase (SOD) (Lindberg et al. 2004; Svensson et al. 2006a), the antibody

C-terminal domains of IgG (CH3) (Thies et al. 1999), and human papillomavirus E2 transcriptional regulator DNA binding domain (HPV domain) (de Prat-Gay et al. 2005).

50

The folding landscape of SOD, a dimer comprised of two β-barrel subunits, involves the

unfolded chain, slow folding to an obligatory monomeric intermediate, and rapid near

diffusion-limited formation of the native dimer in a sequential pathway (Svensson et al.

2006a). Similarly, the antibody domain CH3, a dimeric β-sandwich homodimer that is a simple model system for studying immunoglobin folding and assembly, exhibits a folding mechanism with a rate-limiting monomer formation followed by fast dimer association

(Thies et al. 1999). A trans to cis prolyl isomerization reaction in the monomer is required prior to dimerization in this system. In contrast, the parallel folding pathway of

the HPV domain, which forms a β-barrel upon dimerization, proceeds through a fast- folding monomeric intermediate that folds either directly to the native state or to a non-

native dimeric intermediate, which subsequently rearranges to the native state (Mok et al.

1996; Mok et al. 2000; de Prat-Gay et al. 2005).

Although HIV-1 protease is also a β-barrel dimer with an apparent two-state

folding equilibrium (Grant et al. 1992; Todd et al. 1998), the existence of a stable

monomeric form (Ishima et al. 2003) suggests that its kinetic folding mechanism may

also involve a monomeric intermediate. Computational simulations of HIV-1 protease

folding have suggested a sequential folding pathway involving the unfolded chains, a

monomeric intermediate, and the native dimer (Levy et al. 2004; Broglia et al. 2005). In

an effort to define the energetics of the HIV-1 protease folding landscape, an analysis of

the equilibrium and kinetic folding properties of HIV-1 protease is presented. An

inactive variant, D25N, provides an excellent model system for investigating the

thermodynamic properties of HIV-1 protease, as it ensures reversibility while

51

maintaining biological relevance (Prabu-Jeyabalan et al. 2000; Katoh et al. 2003). Global analysis of the folding kinetics for the mature dimer, coupled with insights gained from the folding of a monomer construct, has illustrated the presence of monomeric intermediates in the HIV-1 protease folding energy landscape.

52

Materials and Methods

Standard buffers and reagents. Lysis buffer contained 20 mM Tris hydrochloride,

1 mM sodium EDTA, pH 8.0. The purification buffer consisted of the lysis buffer with the addition of 7 M guanidine hydrochloride. The standard buffer for all folding experiments was 100 mM sodium phosphate, 0.2 mM EDTA, pH 6.0. Guanidine hydrochloride (98% purity) purchased from J. T. Baker Chemicals (Philipsburg, NJ) was used for purification. G75 Sephadex was used as the matrix for the sizing column during purification. Ultrapure urea was purchased from MP Biomedical (Solon, OH). All other reagents were of standard molecular biology grade.

Protein expression and purification. A synthetic gene encoding for HIV-1 protease containing the mutations Q7K/D25N/C67A/C95A was the kind gift of Dr. Celia

Schiffer. A double-stop codon was introduced to create the C-terminal deletion construct

using the Stratagene Quick Change protocol. The vector pET11A was used for

mutagenesis and maintenance of the gene in XL1 blue E. coli cells. DNA sequencing

was performed at the University of California at Davis sequencing center. Protein was

expressed using the E. coli pXC35 vector in the TAP 106 cell line, as previously

described (King et al. 2002).

Inclusion bodies were purified from the cell lysate at 4 °C in standard lysis buffer

following two rounds of sonication and washed in 2 M urea in the standard lysis buffer.

The protein was unfolded from the inclusion bodies in purification buffer for one hour at

53

4 °C. Following overnight dialysis against the purification buffer at 4 °C, the protein was

loaded onto a G75 Sephadex sizing column. Pure fractions, as determined by SDS-

PAGE, were pooled and refolded by four rounds of dialysis against experimental buffer.

Purified protein was then centrifuged and filtered to remove impure aggregates. Purity

was assessed at > 98% by mass spectrometry. Protein was stored in experimental standard buffer at 4 °C.

Determination of protein concentration. The absorbance of tryptophan and tyrosine residues at 280 nm was monitored using an Aviv 14DS UV-VIS spectrometer.

The extinction coefficient of HIV-PR was determined using the method of Gill and von

Hippel (Gill and von Hippel 1989). Protein was unfolded before measuring the absorbance so that all protein concentrations are expressed in terms of the monomer concentration. The protein concentration was then calculated by the absorbance at 280 nm using an extinction coefficient of 12700 M-1cm-1.

Equilibrium studies. The protein was manually unfolded in incrementing concentrations of urea in experimental buffer to the same final protein concentration for each titration. Samples were equilibrated overnight. Unfolding changes were monitored with both CD and FL at 25 °C. Temperature was controlled using a Peltier controller.

Each CD spectrum reflected the average of three spectra taken from 330 nm to 215 nm on a Jasco J810 CD spectrophotometer, with a 2.5 nm bandwidth, a scan rate of 20 nm/min, and an 8 s response time. FL emission spectra were taken on a Photon Technology

54

International fluorimeter equipped with single-grating excitation and emission

monochromators. The excitation wavelength was 295 nm, and emission was monitored

from 300 to 500 nm. Savuka, an in-house non-linear least-squares fitting software, was

used to fit equilibrium unfolding data using two- and three-state dimeric thermodynamic

models (Gloss and Matthews 1998b; Doyle et al. 2004). Global analysis of all

wavelengths utilized singular value decomposition for data reduction as previously

described (Henry and Hofrichter 1992; Gualfetti et al. 1999).

Kinetic studies. Manual-mix kinetics, with a dead-time of 5 s, were initiated by a

10-fold dilution into various final concentrations of urea with buffer and monitored via

CD on a Jasco J810 CD spectrophotometer. Stopped-flow kinetics, with a dead-time of

5 ms, were initiated by a 6-fold dilution with urea or buffer using an Applied

Photophysics Stopped-Flow Fluorimeter (model SX.18MV). Tryptophan fluorescence was detected with a 320 nm emission cut-off filter following excitation at 280 nm, and temperature was controlled at 25 °C using a recirculating water bath. To compensate for pressure-related instrument artifacts associated with driving the syringes, the first 50 ms of each kinetic trace was deleted prior to analysis. All kinetic traces were fit to a constant

plus a series of exponentials, A(t) = A(∞) + Σ∆Aiexp(-t/τi). Each τi was plotted as a function of urea concentration to provide a chevron analysis of the folding reaction

(Matthews 1987).

55

Global analysis methods. Raw kinetic unfolding and refolding FL traces across the protein and urea concentration dimensions were globally fit to several different kinetic folding mechanisms using a Levenberg-Marquardt non-linear least squares fitting algorithm in an in-house fitting package, Savuka. The kinetic rate equations in each case were solved by numerical integration using a Runge-Kutta algorithm with adaptive step size (Zitzewitz et al. 1995; Bilsel et al. 1999b). Kinetic traces were logarithmically averaged and then fit simultaneously using an iterative procedure previously described

(Svensson et al. 2006a). Adjustable global, i.e. linked, parameters in the Marquardt-

Levenberg optimization consisted of the microscopic rate constants, the kinetic m-values, the baselines for the native and unfolded states, and the Z-values, a normalized measure of to what degree the intermediate resembles the unfolded state,

Z = (Y − Y ) /(Y − Y ) . Additionally, a local scaling parameter and an offset I N2 U N2

parameter were used for each kinetic trace to compensate for small (<10%) variations in

protein concentration and instrumental offsets not corrected by blank subtraction.

Each optimization began by solving for the equilibrium concentration of all

species under the starting conditions. For refolding traces the starting conditions were

typically unfolding conditions and for unfolding traces the starting conditions were

typically strongly folding conditions. The rate equations were solved using the same

kinetic parameters under the final conditions and the calculated kinetic trace was

compared with the experimental data. This iterative procedure was repeated until the

kinetic parameters were optimized. The goodness of fit was evaluated by the randomness

of the residuals and the reduced-χ2.

56

Results

Secondary and tertiary structural probes of HIV-1 protease folding. An inactive,

pseudo-wild-type HIV-1 protease ( HIV - PR D25N ), containing the mutational background

Q7K/D25N/C67A/C95A (Figure 2.1a), was used as template for these folding studies.

The Q7K mutation decreases autoproteolysis in active protease variants (Rose et al.

1993). The D25N substitution eliminates activity (Babe et al. 1992; Prabu-Jeyabalan et al. 2000) and insures thermodynamic reversibility of the system by eliminating autoproteolysis. Finally, the C67A and C95A mutations eliminate adverse effects of cysteine oxidation and also enhance the reversibility of the system. The intrinsic FL

chromophores in HIV - PR D25N are Trp6, Trp42, and Tyr59 (Figure 1a). Trp6 is near the dimerization interface, while Trp42 and Tyr59 are near the flap tips (Lys45-Ile54). Both the Trp residues are conserved across all of the HIV-1 protease sequences.

The secondary structure of HIV - PR D25N was probed by CD spectroscopy. An

MRE of -4500 deg cm2 dmol-1 was observed at the 218 nm minimum (Figure 2.2a), a characteristic CD wavelength for β−sheet proteins (Greenfield 2004). The reduced signal of β-structured proteins compared to α-helical proteins, makes it possible to detect a small peak near 230 nm and a broad, near-UV peak at 270 nm. The minimum at 230 nm may reflect exciton coupling between Trp42 and Tyr59 (Woody 1994), and the broad peak at 270 nm reflects aromatic packing interactions within the protease.

57

(a)

6000 ) -1 l Native HIV - PR (―)

o D25N

4000 dm

2 Unfolded HIV - PR D25N (––) ∆96−99 Native HIV - PR D25N (– –) 2000 ∆96−99 (deg cm Unfolded HIV - PR D25N (– –) pticity

i 0

-2000

ean Residue Ell -4000 M

200 220 240 260 280 300 320 Wavelength (nm)

(b) 2e+6

Native HIV - PR D25N (―)

Unfolded HIV - PR D25N (––) 2e+6 ∆96−99 Native HIV - PR D25N (– –) ∆96−99 ce Intensity Unfolded HIV - PR D25N (– –)

1e+6 escen r uo l

5e+5 Relative F

0 300 350 400 450 500

Wavelength (nm) Figure 2.2: CD and FL spectral characteristics of HIV-PR. (a) CD spectra and (b)

FL spectra of HIV-PR 1-99 (solid lines) and m-HIV-PR 1-95 (dashed lines). HIV-PR spectra are shown under native (black) and unfolded (red) conditions. The unfolded FL spectra for the monomer and dimer constructs superimpose.

58

Unfolding in 6 M urea results in a CD spectrum lacking these three features, consistent with a random-coil-like structure lacking significant secondary and tertiary structure.

Similar to other β-structured proteins, the signal change observed upon unfolding is significantly smaller compared to the changes observed for proteins of a similar size with predominantly alpha-helical content (Carlsson and Jonsson 1995).

The dimeric HIV-1 protease exhibits a maximum FL emission peak at 347 nm upon excitation at 295 nm (Figure 2.2b). Unfolding in 6 M urea yields a maximum emission peak at 353 nm, slightly red-shifted and decreased with respect to the native dimer. Consistent with the relatively red-shifted native emission spectrum, DSSP

calculations of the unliganded HIV - PR D25N crystal structure (Kabsch and Sander 1983) indicate that Trp6 and Trp42 are relatively exposed, having respectively, 168 Å2 and 121

Å2 of solvent accessible surface area in the native state. Compared with the most solvent exposed residue, Arg41, Trp6 is 77% solvent exposed and Trp42 is 56% solvent exposed.

Thus, this small shift in FL emission suggests that either one or both of the tryptophans becomes slightly more solvent exposed upon unfolding.

Equilibrium unfolding. HIV - PR D25N equilibrium unfolding was observed as a function of urea denaturation. A comparison of refolding and unfolding equilibrium urea

titrations (Figure 2.3) showed greater than 95% folding reversibility. HIV - PR D25N urea titrations were studied across a protein concentration range of 5 µM to 60 µM and monitored with both CD and FL to better quantify dimer unfolding thermodynamics and

59

-1000

) 5 µM unfolding -1 l

o -1200 5 µM refolding

dm

2 -1400

-1600 (deg cm

-1800

230 nm -2000

MRE

-2200 0246 [Urea] (M)

Figure 2.3: Reversibility of HIV-PR folding. The equilibrium folding transition, in

100 mM sodium phosphate, 0.2 mM EDTA, pH 6.0, at 25 ºC, monitored by CD at 230 nm is shown as a function of urea concentration. Native (black circles) and unfolded (red circles) HIV-PR was equilibrated in incrementing amounts of urea before measurement of the unfolding transition.

60

(a) (b) ) ) ) ) -1000 -1 -1 -1 -1 - 2800 l l - 3000 -1200 230 nm - 3200 220 nm dmo dmo dmol dmol -1400 - 3400 2 2 2 2 - 3600 -1600 cm - 3800 cm g - 4000 g -1800 e e d - 4200 d -2000 - 4400 E ( E ( - 4600 -2200 MRE (deg cm MRE (deg cm MR 0123456MR 0123456 [Urea] (M) [Urea] (M)

(c) (d) ) ) 300 2x106 -1 -1 l l .) 250 .) o o 270 nm 6 .U 200 .U 2x10 350 nm dm dm (A (A 2 2 150 2x106 y y it 100 it cm cm s s 6 n 50 n 1x10 eg eg e e t 0 t n n 1x106 E (d E (d I -50 I L L R R F F 6 M M -100 1x10 0123456 0123456 [Urea] (M) [Urea] (M)

Figure 2.4: Equilibrium folding properties of HIV-PR. Equilibrium unfolding changes, in100 mM sodium phosphate, 0.2 mM EDTA, pH 6.0, at 25 ºC, at local wavelengths fit best to a two-state model, N2 ' 2U. Fits (solid lines) are shown probing the secondary, β−sheet structural changes at 220 nm (a), aromatic packing interactions at

(b) 230 nm, and (c) 270 nm, and local changes involving the FL probes at 350 nm (d).

Monomer protein concentration was 30 µM.

61

to potentially facilitate discrimination between monomeric and dimeric unfolding transitions. Local fits of individual titrations at a single wavelength (Figure 2.4) were fit to a two-state model, N2 ' 2U. Wavelengths were chosen on the basis of relevant structural probes; CD at 218 nm reflected overall secondary structural changes within the

β-barrel, CD at 230 nm and CD at 270 nm were indicative of changes in aromatic packing induced by unfolding, and FL at 350 nm reported on the local structural changes around the intrinsic Trp probes. Thermodynamic parameters from the individual two- state (N2 ' 2U) fits of each data set agreed within error, with an average ∆G°(H2O) of

13.0 ± 1.0 kcal (mol dimer)-1 and an average m, or urea sensitivity, value of 2.5 ± 0.5 kcal

(mol dimer)-1 M-1. The free energy for unfolding of the dimer is reflective of protein stability and the urea sensitivity is indicative of exposure of buried surface area upon unfolding.

A global fit of both CD and FL data that reflected all spectroscopic changes

observed for HIV - PR D25N equilibrium folding was also performed using the vectors from singular value decomposition (SVD) analysis. This data reduction algorithm provides the minimum number of statistically significant orthonormal basis spectra necessary to provide a least-squares description of the entire data set (Henry and

Hofrichter 1992). By rejecting signal changes uncorrelated along the wavelength and denaturant axes, this analysis method also provides a way to filter out experimental noise

and improve the signal-to-noise of HIV - PR D25N spectroscopic signals. SVD data

62

Table 2.1: Thermodynamic Parameters Obtained from Equilibrium Titrations or Global Analysis of the Kinetic Data

a a Method ∆G°(H2O) Urea Dependence (m)

13.89 ±0.08 kcal (mol 2.5 kcal ± 0.04 (mol Dimer Three-State -1 -1 -1 Kinetic dimer) dimer) M 14.15 ± 0.21 kcal (mol 2.85 ± 0.07 kcal (mol Dimer Two-State -1 -1 -1 Equilibrium dimer) dimer) M Monomer Two-State 0.579 ± 0.10 kcal mol-1 1.09 ± 0.13 kcal mol-1 M-1 Kinetic Monomer Two-State 1.35 ± 0.05 kcal mol-1 1.45± 0.24 kcal mol-1 M-1 Equilibrium

a The reported errors are the standard errors obtained in the fit of each data set to a two-state dimeric (2U ' N2) or monomeric (U ' M) model for HIV - PR D25N and ∆96−99 HIV - PR D25N , respectively. Kinetic thermodynamics were obtained from a global ∆96−99 analysis of HIV - PR D25N FL kinetics and the chevron analysis of HIV - PR D25N . Errors were propagated for the kinetic thermodynamic parameters.

63

1.0

0.8

0.6

5 µM 0.4 30 µM 60 µM 0.2 Fraction Unfolded Protein

0.0

0123456

[Urea] (M)

Figure 2.5: Fraction unfolded protein (Fapp) plots of the SVD equilibrium unfolding

CD and FL vectors fit globally to a two-state process. Fraction unfolded plots of CD

(circles) and FL (triangles) SVD vectors overlay at individual protein concentrations, suggesting that the observed folding is two-state. Representative concentrations of 5 µM

(black), 30 µM (red), and 60 µM (blue) display the expected protein concentration dependence of the transition midpoint for a dimer.

64

reductions for each set of CD and FL spectra for a particular titration were performed separately. A total of four significant SVD vectors, two from CD and two from FL, were considered significant based on the degree of randomness, the autocorrelation and the singular values for a given vector. A total of twenty SVD vectors, from five different protein concentration titrations, were fit globally to a two-state (N2 ' 2U) equilibrium model and provided robust equilibrium thermodynamic parameters (Table 2.1). There is little evidence for an equilibrium folding intermediate because the CD and FL equilibrium transitions for any given protein concentration overlay within error, and the

SVD analysis of both methods resulted in only two significant vectors per method.

While it is mathematically possible to have only two vectors yet still be three-state, the requirement that the intermediate be either native-like or unfolded-like, is physically unlikely. The fraction apparent plot of the combined CD and FL fits (Figure 2.5) displays the expected protein concentration dependence of the transition midpoint for a

dimeric system (Neet and Timm 1994). The equilibrium folding of HIV - PR D25N fit best to an apparent two-state process between native dimer and unfolded monomer with a ∆G° in the absence of denaturant, or overall stability, of 14.15 ± 0.21 kcal (mol dimer)-1 and an m value, or urea sensitivity, of 2.85 ± 0.07 kcal (mol dimer)-1 M-1, based upon a global fit of both CD and FL SVD vectors. These values are in agreement with previous

-1 equilibrium studies of active protease, including 14 kcal (mol dimer) (Grant et al. 1992) and 12 to 15 kcal (mol dimer)-1 (Todd et al. 1998), suggesting that this inactive variant of

HIV-1 protease is a relevant model in which to study the folding mechanism.

65

concentration range, τ2 becomes the rate-limiting phase at nanomolar protein concentrations.

Chevron analysis. The unfolding (Figure 2.6a) and refolding (Figure 2.6b) kinetics monitored by CD displayed a single, slow kinetic phase (Figure 2.7a). The absence of a CD burst-phase signal suggests that all of the expected refolding and unfolding CD amplitudes are observable in the manual mixing experiments and that more rapid reactions (τ<10 s) exhibiting a CD signal are not present. A semi-log plot of the relaxation times versus urea concentration shows classical chevron behavior, having a peak near the equilibrium denaturation midpoint and linear slope on either side of the peak on the semi-log plot (Figure 2.7a). However, no protein concentration dependence in the refolding kinetics over the 1 µM to 12 µM concentration range, as might be expected given the absence of a burst-phase, was observed. Either dimerization has occurred within the burst-phase without significant secondary structure formation or the bimolecular association step is rate-limited by a prior unimolecular monomer folding reaction. The inherently weak CD signal of the β-barrel motif, the limited S/N of the CD technique itself and the limited solubility of the protein precluded carrying out CD studies covering a wider protein concentration range. Therefore, fluorescence spectroscopy was used to further elucidate the folding mechanism.

The enhanced sensitivity of fluorescence relative to CD permitted kinetic studies to performed using stopped-flow techniques, which provided approximately two orders of magnitude greater time resolution over the manual mixing CD studies. Monitoring tryptophan fluorescence, three exponential phases (Figure 2.6c), slow (τ1), intermediate

66

(a) -3.5 (b) -3 -4.0 -4 -4.5 )

g 0.5 M urea -5.0 -5 -5.5 2.8 M urea -6

-6.0 (mde y 2.0 M urea -6.5 -7

-7.0 5.0 M urea ticit p -7.5 -8 0.6 0.6 Elli Ellipticity (mdeg)

0.0 0.0

-0.6 als -0.6 als 0.6 0.6

0.0 0.0 Residu Residu -0.6 -0.6 0 100 200 300 400 500 0 100 200 300 400 500 Time (s) Time (s)

0.02 (c) (d) 0.01

0.01 0.00

0.00 2.6 M urea y y -0.01

-0.01 1.0 M urea -0.02

-0.02 -0.03

-0.03 4.2 M urea -0.04 2.0 M urea -0.04 -0.05 FL Intensti FL Intensti -0.05 -0.06 -0.06 0 200 400 600 800 1000 -0.07 0 100 200 300 400 500 0.003 0.003 0.002 0.002 0.001 0.001 0.000 0.000 -0.001 -0.001

als -0.002 als -0.002 -0.003 -0.003 0 200 400 600 800 1000 0.003 0 100 200 300 400 500 0.002 0.003 0.001 0.002 0.000

Residu 0.001 Residu -0.001 0.000 -0.002 -0.001 -0.003 -0.002 0 100 200 300 400 500 -0.003 0 200 400 600 800 1000 Time (s) Time (s)

Figure 2.6: HIV-PR Folding Kinetics. (a) Manual-mixing CD unfolding kinetics fit best to one exponential, (b) Manual-mixing CD refolding kinetics fit best to one exponential, (c) SFFL unfolding kinetics fit best to three exponentials, and (d) SFFL refolding kinetics fit best to three exponentials.

67

(a) (b)

1000

Bimolecular phase at 100nM ) 100 (s ) e s (

e 100 m ti

n 10 tion tim o a ti a x lax a 10 l e e R R

1 0 2 4 6 8 10 12

1 [protein] (µM) 012345 Urea (M)

Figure 2.7: Chevron Analysis of HIV-PR Folding Kinetics. (a) Relaxation times (τ), extrapolated from an exponential fit of the kinetic traces at a 4 µM final protein concentration were plotted as a function of final urea concentration. All symbols are open for refolding and closed for unfolding. Manual-mixing kinetic traces observed by

CD at 230 nm displayed single exponential behavior (circles). Stopped-flow kinetic traces monitored by FL >320 nm, with λexc = 280 nm, fit to three exponentials in both refolding and unfolding; the phases are represented in as follows: τ1 in circles, τ2 in squares, and τ3 in triangles, protein concentrations in blue 4 µM, red 12 µM, and pink 0.1

µM. (b) Relaxation times for refolding to 1 M urea are shown as a function of protein concentration. Of the three observed refolding phases, only τ2 exhibited a protein concentration dependence. While τ2 is not rate limiting in the micromolar protein (τ2), and fast (τ3), were detected in stopped-flow FL unfolding kinetics (Figure 2.7a). All of the unfolding phases exhibited semi-logarithmic behavior, with exponential decreases in

68

relaxation times with increasing urea concentration (Matthews 1987). The FL slow phase

(τ1) accounted for 85% of the total FL amplitude and was coincident in relaxation time with the single CD phase, suggesting that this unfolding phase reflects the predominant

loss of structure in HIV - PR D25N unfolding. The intermediate (τ2) and fast (τ3) phases contributed 12% and 3%, respectively, of the total FL amplitude (data not shown).

Furthermore, the unfolding kinetics were independent of protein concentration, demonstrating the expected unimolecular nature of the unfolding reaction.

Refolding kinetics also displayed three exponential phases by FL (Figure 2.6d): a slow (τ1), intermediate (τ2), and fast (τ3) phase (Figure 2.7a). Because no burst-phase was detected by CD manual-mixing kinetics, the faster kinetic phases may correspond to local structural changes in the vicinity of one or both of the tryptophans and are not accompanied by significant changes in secondary structure. As discussed below, one or both of the faster phases may reflect the inaccuracies of attempting to fit bimolecular reactions to exponentials. As in the manual-mixing CD experiments, the slowest phase, detected as a single exponential by CD and the predominant FL phase (τ1), was protein concentration independent. The presumed rate-liming folding step, in the absence of urea had a relaxation time of 8.5 seconds.

Key insights into the folding mechanism were obtained from the intermediate FL phase (τ2), which displayed protein concentration dependence (Figure 2.7b). At final protein concentrations above 4 µM, this phase contributed ~12 % of the amplitude and had relaxation times in the absence of urea that were 4 seconds faster than the presumed rate-limiting step. However, when the final protein concentration was decreased to 1

69

µM, the kinetic traces were difficult to fit to three exponentials, suggesting that the τ1 and

τ2 kinetic phases were too similar to permit discrimination. Only when the protein concentration was dropped to 100 nM did this phase become rate-limiting in forming the native dimer. Consistent with this decrease in reaction rate, the contribution of this phase to the total amplitude increased to 55%. At nanomolar concentrations the association reaction (presumably approximately manifested by the τ2 phase) is slowed sufficiently to become comparable to the monomer folding rate. This disappearance of the protein concentration dependence at micromolar protein concentrations and dominant amplitude of the slowest refolding phase are consistent with monomer folding becoming rate- limiting to forming the native state. The association reaction following monomer folding is presumably significantly more rapid at micromolar concentrations. The slow FL phase

(τ1) contributed less than 5% of the refolding amplitude at nanomolar concentrations and exhibited the same time constant when the intermediate FL phase (τ2) became rate- limiting, suggesting that τ1 precedes τ2 as a monomer folding step occurring before the bimolecular association. The fast FL phase (τ3) was also protein concentration independent, but contributed less than 5% of the amplitude. The role of this phase was unclear from the chevron analysis, requiring that further analysis (see below).

Based upon the chevron analysis, a number of models can be postulated. The simplest folding models are sequential, and this type of behavior is consistent with the equal number of phases observed in the chevron for both unfolding and refolding reactions. The concentration independence of the dominant refolding phase at

concentrations down to several hundred nanomolar, and the persistence of the τ1 phase at

70

nanomolar concentrations, when association is rate-limiting, is consistent with a folding mechanism in which a unimolecular monomer folding step precedes the bimolecular association step. The simplest kinetic model consistent with the main features of the observable kinetics is 2U ' 2M ' N2, where U represents an unstructured monomer, M is a folded monomer and N2 is the dimeric native state. This model does not explicitly account for the minor fast phases that may reflect folding either prior to or subsequent to folding of the mature monomer, as discussed below.

Global analysis. The ability of the kinetic model, 2U ' 2M ' N2, to describe much of the available raw data was tested using a non-linear least squares fitting algorithm that simultaneously fit 96 kinetic traces. Both protein and urea concentration dimensions were included in the fit to provide a comprehensive test of the mechanism.

As shown in Figure 2.8, the model provides a very good description of the data over the

100 ms to 100 s time range and over the 50 nM to 5 µM protein concentration range. The quality of the fits, the robustness of the chi-square minimum and the parameter values give support to the validity of the model. The monomer refolding, monomer unfolding, and dimer dissociation microscopic rate constants were in agreement with the chevron

(Figure 2.9a). A rigorous analysis of the bimolecular phase (Figure 2.9b), revealed a confidence interval of ± 0.5 ¯ 106 M-1 s-1. Although models including additional steps

(Scheme 2.1) were also performed, a statistical comparison of the reduced χ2 values confirmed that the simplest model, 2U ' 2M ' N2 (Model 4 in Scheme 2.1) provided the most statistically significant description of the data. The reduced χ2 value improve

71

1.0 1.0 2.2 M urea s n 0.8 0.8 ie c ies c Protei

4.5 M urea 0.6 0.6 Spe ded Spe on

i 0.6 M urea

0.4 tion 0.4 c act n Unfol Fr Fra 0.2 0.2 Fractio 2.4 M urea 0.0 0.0 10 100 1000 110100 Time (s) Time (s) (a) (b)

1.0

0.1 µMprotein 0.8 s ie

ec 0.6 p 12 µMprotein S n o i

t 0.4 c a Fr 0.2

0.0 1 10 100 (c) Time (s)

Figure 2.8: Global Analysis of HIV-PR Folding Kinetics. Representative fits of the data to the 2U ' 2M ' N2 model are shown. (a) Unfolding kinetics between 2.4 M and

4.2 M urea at 4 µM protease. (b) Refolding kinetics between 0.6 M and 2.0 M urea at 4

µM protease. (c) Refolding kinetics to 1.0 M urea at increasing protein concentrations.

This model describes the data equally well across all 96 kinetic unfolding and refolding traces.

72

Model 1

2U ' 2I ' D2 'N2

Model 2

2U ' 2I ' 2M ' N2

Model 3

* 2U ' 2I ' N2 ' N2

Model 4

2U ' 2M ' N2

Scheme 2.1: Putative Models Based on Chevron Analysis. Four sequential models were postulated based upon the observed kinetic phases (Figure 2.7a) and the respective trends for each phase. U represents the unfolded polypeptide chain, I represents an early folding monomeric intermediate, M represents a late folding monomeric intermediate, D2

* represents a dimeric intermediate, and N2 represents a native dimer that undergoes a rapid rearrangement.

73

20-fold for a three-state model with a monomer intermediate compared with a two-state model with two unfolded chains and the native dimer. However, with four-state models the χ2 values never improved more than 5%, indicative of overparamaterization. An F- test of the χ2 values for these models suggests that less than 5% change was not probable in improving the model. Therefore, inclusion of the additional steps was not warranted by the available data even though their presence cannot be ruled out (see below).

As illustrated in Table 2.2, global modeling yields the microscopic rate constants and urea dependences for each step of the folding mechanism, 2U ' 2M ' N2. In unfolding, dimer dissociation (N2 ' 2M) is rate-limiting, with subsequent faster unfolding of a mature monomer intermediate (2M ' 2U). Interestingly, folding of the mature monomer intermediate under strongly folding conditions (2U ' 2M) is the rate- limiting step in refolding. The bimolecular association occurs subsequently and rapidly at micromolar concentrations. A comparison of the overall stability and m values determined from equilibrium (Table 2.2) and kinetic measurements (Table 2.2) shows excellent agreement between the two methods.

The thermodynamic parameters provided by the kinetic global analysis yield a Kd of 2 nM at pH 6.0 and 25 ºC. This result is in agreement with previously reported indirect inhibitor binding measurements of 0.75 nM at 30 °C and 3.4 nM at 37 °C at pH

5.5 (Darke et al. 1994). Although comparisons of alternatively reported Kd values are complicated by variations in the temperature, pH, and the buffering conditions, the micromolar to picomolar range in values (Grant et al. 1992; Xie et al. 1997; Todd et al.

1998), determined by methods assuming a two-state N2 ' 2U dissociation, exceeds

74

expectations. The rates from the global analysis were used to calculate a stability of 2.10 kcal (mol dimer)-1 for the monomer and 11.79 kcal (mol dimer)-1for the dimer association step, providing a total 13.89 kcal (mol dimer)-1of stability for the native dimer.

Compared with the two-state equilibrium value of 14.15 ± 0.21 kcal (mol dimer)-1, the overall stabilities determined by either equilibrium or kinetic measurements are very consistent, giving further support to the proposed kinetic model (Table 2.1). Although the stability of the monomer is only 2.10 kcal (mol dimer)-1, models assuming a two-state

N2 ' 2U do not separate this stability from the dimerization free energy and overestimate the dissociation constant, Kd. Therefore it appears that this global kinetic analysis, which detects a mature monomer intermediate (M), has also provided a more

accurate and precise measure of the HIV - PR D25N dissociation constant, Kd.

The population distribution of the unfolded (U), mature monomer intermediate

(M), and native dimer (N2) were simulated under equilibrium conditions based on the thermodynamic parameters obtained form the kinetic global analysis. The mature monomer intermediate is not significantly populated at any concentration between 0.1

µM and 60 µM (Figure 2.10a). It was not surprising that the population distribution was predominantly comprised of unfolded monomer chains and native dimer, as the measured equilibrium properties had apparent two-state behavior. These simulations emphasize that the thermodynamic parameters derived from the global kinetic analysis and the observed equilibrium thermodynamic parameters are consistent.

75

Step Microscopic Rate Urea Dependence Keq ∆G Constant 2U Œ 2M 4.72 e-2 ± 3.23 e-4 s-1 0.717 ± 0.003 kcal mol-1M-1 0.169 2.10 kcal mol-1 2M Œ 2U 7.97 e-3 ± 4.73 e-4 s-1 -0.175 ± 0.015 kcal mol-1M-1

1.02 e6 ± 6.49 e4 M-1 s-1 0.582 ± 0.019 kcal (mol dimer)-1 M-1 2.11 nM 11.79 kcal (mol dimer)-1 2M Œ N2

2.15 e-3 ± 1.61 e-5 s-1 -0.135 ± 7.66 e-5 kcal mol-1M-1 N2 Œ 2M

Table 2.2: Kinetic and Implied Thermodynamic Parameters: The rate constants and their associated urea dependence extracted from the global analysis define the HIV-1 protease folding landscape.

76

1.0

0.9 y t i 0.8

0.7

Probabil 0.6

0.5 0.5 1.0 1.5 2.0 2.5 Confidence Interval

(a) (b)

Figure 2.9: Comparison of Microscopic Rate Constants with Chevron of HIV-PR

Folding Kinetics. (a) A pseudo-first order approximation of the microscopic rates for

D25N HIV-1 protease folding at 4 µM final protein concentration is shown on the template of the dimer chevron from Figure 2.7a. The rate-limiting refolding step (solid line) corresponds to the 2U ' 2M transition. The rate-limiting unfolding step (dotted line) corresponds to the N2 ' 2M dissociation rate. The intermediate unfolding step

(dashed line) correlates with the 2M ' 2U unfolding rate. The bimolecular association rate (dotted and dashed line) does not overlay with any one particular observed relaxation time. A rigorous analysis of the bimolecular constant (b) demonstrates that the microscopic rate obtained from this fit is within a precise confidence interval.

77

(a) s e i

0.1 µM

Spec 5 µM n 60 µM o i t

ac Native Fr Intermediate Unfolded

(b)

1.0 0.1 µM

s 0.8 5 µM ie c e p 0.6 Native S n Intermediate

tio 0.4 c Unfolded Fra 0.2

0.0 (c) 1 10 100 1000 1.0 TiTimme e( s(s))

0.8 0.1 µM

s 1 µM ie

c 4 µM e 0.6

Sp Native on

ti 0.4 Intermediate c

Fra Unfolded 0.2

0.0 0.01 0.1 1 10 100 Time (s)

Figure 2.10: Predicted Populations of HIV-PR Folding Species. (a) Simulated equilibrium species plots at final monomer protein concentrations of 60 µM, 5 µM and

100 nM. (b) Simulated kinetic species plots to strongly unfolding conditions with final urea concentrations of 4 M and (c) strongly refolding conditions with final urea concentrations of 1 M.

78

The population of kinetic species populated under strongly unfolding and refolding conditions was also simulated based on the global analysis parameters. The unfolding simulations (Figure 2.10b) displayed the same population of species across several protein concentrations, consistent with the observation that all relaxation times for unfolding are protein concentration independent. In refolding (Figure 2.10c), the mature monomer intermediate is only marginally populated (10 %) in the micromolar concentration range, but became significantly more populated (~30%) at nanomolar concentrations.

Although the global analysis provided an assignment of microscopic rates to the

majority of the chevron, some aspects of the HIV - PR D25N folding mechanism remain unresolved because the fastest minor (~5%) refolding and unfolding phases were not incorporated into the global analysis. One possible origin of the rapid phases in refolding may be from the inherent limitation of the chevron plot in describing bimolecular reactions. The chevron plot is an empirical description of the data in terms of exponential functions which does not provide a true description of a second-order kinetic reaction.

Therefore, it is possible that at least some of the observed refolding relaxation times arise from spurious phases introduced into the chevron plot by treating bimolecular folding kinetics as exponentials. The global analysis treats the bimolecular kinetics explicitly by integrating the rate equations and does not suffer from these limitations. The coupling of the microscopic rate constants in the mechanism are also accounted for in the global analysis. While the minor kinetic phases proved difficult to incorporate into a global fit because of overparameterization, insights of their role in the folding of D25N HIV-1

79

protease could provide a more detailed mechanism. Conveniently, a monomer construct is experimentally accessible (Ishima et al. 2001; Ishima et al. 2003) and may elucidate the role of these fast phases.

Monomer construct studies. The isolated monomer folding reaction was studied

using a construct with a C-terminal deletion (∆96-99) introduced into the HIV - PR D25N

∆96−99 background ( HIV - PR D25N ) (Ishima et al. 2003). The same experimental conditions as previously described for a direct comparison between the full length dimer and the truncated monomer. Size exclusion chromatography (data not shown) confirms that

∆96−99 HIV - PR D25N does not dimerize, even at stock concentrations of 80 µM.

∆96−99 HIV - PR D25N maintains a similar CD signature spectrum (Figure 2.2a) as the full-length

∆96−99 dimer. However, compared to HIV - PR D25N , the 218 nm minimum for HIV - PR D25N is broader, suggesting that the C-terminal deletion construct may be more molten globule- like in its overall 2° structure (Greenfield 2004). The unfolded CD spectra for

∆96−99 HIV - PR D25N at high denaturant concentrations (6 M urea) were also collected, and they exhibit high similarity (Figure 2.2a).

∆96−99 HIV - PR D25N has a FL spectrum with a maximum emission at 351 nm and

diminished intensity compared to HIV - PR D25N (Figure 2.2b). Unfolding of

∆96−99 HIV - PR D25N decreases the relative intensity of the maximum emission peak, but does

80

not shift the maximal wavelength. The FL unfolded spectra of both HIV - PR D25N and

∆96−99 HIV - PR D25N are equivalent. The difference in maximum emission wavelength between

∆96−99 HIV - PR D25N and HIV - PR D25N indicates that the tryptophans within the

∆96−99 HIV - PR D25N are more solvent-exposed compared with those of HIV - PR D25N . This result is consistent with the NMR results (Ishima et al. 2003) which have shown an increase in solvent accessible surface area of the isolated monomer.

∆96−99 Equilibrium unfolding titrations of HIV - PR D25N were monitored with both CD and FL. This equilibrium transition did not yield a distinct native baseline. Attempts to achieve a native baseline with the osmolyte, TMAO (Baskakov and Bolen 1998; Qu et al.

1998), and temperatures did not provide meaningful results (data not shown).

∆96−99 Fortunately, the initial amplitudes from HIV - PR D25N CD unfolding kinetics provided an estimate for the native baseline at higher urea concentrations (Figure 2.11a). By linear extrapolation to lower urea concentrations, it was possible to obtain a robust fit of the monomer equilibrium unfolding data and obtain thermodynamic parameters (Table 2.1).

Only a single exponential phase was observed in either unfolding or refolding of

∆96−99 HIV - PR D25N by manual-mix CD kinetics (Figure 2.11b). A two-state fit of the

∆96−99 HIV - PR D25N chevron resulted in both a lower stability and a lower m value compared to the equilibrium fit (Table 2.1), indicating that this phase does not account for the entire folding reaction of the monomer.

81

(a)

(b)

Figure 2.11: Folding properties of m-HIV-PR. (a) CD equilibrium unfolding transition at 230 nm (filled circles). Initial kinetic amplitudes (open squares) provided a native baseline parameter. (b) Comparison of monomer construct chevron, manual-mix CD phase in circles and stopped-flow FL slow (τ1) phase in squares and fast (τ2) phase in fast and triangles, with dimer chevron shown as lines. Unfolding and refolding kinetics are represented with closed and open circles, respectively.

82

Stopped-flow FL kinetics displayed a slow (τm1) and fast (τm2) phase in both unfolding and refolding (Figure 2.11b), with the slow (τm1) phase coincident with the single CD

∆96−99 phase. The amplitude distribution of the HIV - PR D25N FL phases was 90% for τm1 and

10% for τm2, which is similar to the amplitude distribution of the fastest and slowest

∆96−99 phases in the refolding of HIV - PR D25N . The HIV - PR D25N fast FL phase (τm2) appears

similar to the fast FL phase (τ3) observed in HIV - PR D25N , suggesting that this phase may reflect a local monomer folding event.

Although the global analysis was not able to account for the minor FL phases

∆96−99 observed in unfolding and refolding, the monomer construct, HIV - PR D25N , provided

∆96−99 additional insights to the complete sequential folding pathway. HIV - PR D25N indicates that a local folding event within the monomer subunit independent of the global folding of the monomer. The complete kinetic mechanism for HIV-1 protease folding is shown

∆96−99 in Scheme 2.2. The thermodynamics obtained for HIV - PR D25N demonstrate that the complete monomer folding reaction in the dimer context contributes 2.7 kcal (mol dimer)-1, which is an additional 0.6 kcal mol-1 of stability compared to the monomer thermodynamic parameters resulting from the global fit.

Contribution of the Trp residues as FL probes. Mutational analysis involving two

HIV - PR D25N constructs, W6F and W42F was used to determine whether each Trp was reporting on different folding events (A. Kundu, unpublished data). The W42F variant displayed a lack of the inflection point at 230 nm in the native CD spectrum. Since the

83

CD kinetics were coincident at 220 nm and 230 nm, this wavelength is considered to reflect global folding events. The kinetics of the W42F variant exhibited the same kinetics as wild-type, suggesting this Trp may serve as probe of global folding events within the protease. The W6F variant retained the 230 nm inflection, but displayed different SFFL unfolding kinetics from wild-type protease. Refolding kinetics for this variant were inconclusive due to aggregation. The intermediate unfolding phase became slower and the fast unfolding phase disappeared. Interestingly, the fast unfolding phase also appeared slower in the C-terminal deletion monomer construct containing Trp6.

These results suggest that Trp6 is involved in a local folding event within the monomer that may involve interactions with the C-terminus.

Summary of results. By combining the global kinetic analysis of the data for the dimeric HIV-1 protease with the results for the monomer construct, a more complete energy landscape has emerged. Based upon the results presented in this study,

HIV - PR D25N folds via a four-state sequential pathway, involving unfolded polypeptide chains, partially folded monomers, mature monomers, and the native dimer. Although the accepted two-state equilibrium of HIV-PR provided a comparison benchmark for the

validity of studying this inactive HIV - PR D25N , which also exhibits a two-state

equilibrium denaturation by urea, the kinetic folding mechanism of HIV - PR D25N , and presumably HIV-PR, is considerably more complex. A four-state, sequential folding mechanism provides a more complete description of the folding pathway of

HIV - PR D25N .

84

Discussion

HIV-1 protease folding mechanism. A folding mechanism involving the unfolded chains (U) , a fast folding monomer intermediate (I), a mature monomer intermediate

(M), and a native dimer (N2), as depicted in Scheme 2, has been proposed based upon the

combined analysis of HIV - PR D25N equilibrium properties, the HIV - PR D25N chevron,

∆96−99 HIV - PR D25N global modeling, and HIV - PR D25N folding. These results have shown

that equilibrium analysis alone could not elucidate the details of the HIV - PR D25N folding landscape and that a more in-depth analysis was required. The detection of the mature monomer intermediate agrees with the proposed computational simulations that suggest a sequential, three-state folding of HIV-1 protease (Levy et al. 2004; Broglia et al. 2005) and validates the biological relevance of the isolated monomer construct.

Energetic landscape of HIV-1 protease. This analysis also revealed the wealth of information provided by FL kinetics, as phases beyond the rate-limiting steps were observed. Interestingly, monomer folding is rate limiting and the bimolecular association is rapid (1°106 M-1 s-1), while dimer dissociation is the rate-limiting step in unfolding and the faster monomer unfolding is detected beyond this step. Also, the observed fast folding monomer intermediate suggests that the stopped-flow FL kinetics used in this study were sensitive to the local structural folding events occurring prior to mature monomer folding and subsequent to unfolding of the mature monomer, as elucidated by

Trp6. Hosur and co-workers have also observed the local tryptophan unfolding events

85

monitored by NMR, in the context of a tethered dimer system (Panchal and Hosur 2000;

Panchal et al. 2001). It is possible that these fast folding rates correspond to formation of the N- and C-terminal β-sheet within one subunit prior to dimerization and the complete formation of the β-sheet. The idea that the fast folding steps are local events could explain why computational simulations show two steps while this kinetic study shows three steps. The fast folding of the monomer is most likely highly coupled to the stability of the mature monomer intermediate because the slow kinetic rates do not account for the total apparent two-state equilibrium stability. Thus, the monomer intermediate may be spectroscopically silent, as the mature monomer was in the apparent two-state dimeric equilibrium.

Significance of a monomeric intermediate. The three-step folding of

HIV - PR D25N is intriguing. Previous investigations have postulated that autoproteolysis involves degradation of a monomer HIV-PR strand by active HIV-PR dimer in the context of a monomer-dimer equilibrium (Darke et al. 1994). Controversy still exists as to whether active dimer can be proteolyzed (Kumar et al. 2002). Populating monomer intermediates may regulate the degradation of HIV-PR thereby significantly altering the

HIV viral life cycle. Rate-limiting folding of the monomer may ensure that HIV-PR is properly structured before gaining activity, while rate-limiting dissociation provides a mechanism to confirm that HIV-PR is no longer required for viral maturation before degradation. The activity of HIV-PR within the cell is therefore regulated by

LeChatelier’s principle. As protein concentration increases dimer formation is favored, but as the protein concentration within the virion decreases HIV-PR dissociates and

86

begins to cleave its monomeric chains. Eventually all of the HIV-PR dissociates and is degraded within the virion.

Comparison to superoxide dismutase (SOD) folding. The HIV-1 protease folding mechanism is quite similar to the folding mechanism of SOD, another dimeric β-barrel protein (Svensson et al. 2006a), with a sequential pathway involving a rate-limiting monomer folding reaction, rapid association of folded monomer to form the dimer, and rate-limiting dimer dissociation in unfolding. However, there are two distinct differences between the energy landscapes of these two proteins. First, the Kd for HIV-PR dimerization is in the nanomolar range, whereas for SOD it is nearly three orders of magnitude tighter (picomolar range) even for the Zn and Cu free construct.

Possibly this Kd difference could be attributed to the differing biological functions of each dimeric protein. HIV-PR is only intended to be functionally available during viral maturation; SOD requires long-term function throughout the body. Also, the HIV-

PR monomer has is less stable in the monomer construct then in the dimer context, while the SOD monomer is more stable in an isolated construct which was generated by charged mutations at the dimer interface (Banci et al. 1995; Banci et al. 1998). The

∆96−99 difference in chevron patterns between HIV - PR D25N and the monomeric subunit of

HIV - PR D25N suggests that the C-terminal β−strand may play a role in the stability of the subunit. The HIV-PR monomer has had biological implications only as a result of structure-based drug design efforts; the SOD monomer has clinical implications as a causative agent of ALS. It is also expected that these two proteins would have potential

87

differences in folding mechanisms because of difference in their dimerization interfaces;

HIV-PR is a dimer of β-barrels with an interface consisting of an antiparallel β−sheet formed from interactions between the subunits and the SOD dimer interface arises from the nonpolar surface contacts between the two β-barrels.

Comparisons with β-barrel dimers. A comparison can also be made between the

HIV - PR D25N folding mechanism with those for other dimeric β-barrel proteins, such as the CH3 antibody domains and the HPV E2 transcriptional regulator DNA binding

domain (HPV domain). While both the HIV - PR D25N and SOD (Svensson et al. 2006b) folding mechanisms are similar to that for the CH3 antibody domain folding (Thies et al.

1999), all three differ from the folding mechanism of the HPV domain (de Prat-Gay et al.

2005). It appears that the common factor in the rate-limiting step of the first three proteins is the β-barrel formation interaction. As in the other three β-barrels, the folding of the HPV domain also involves a monomeric intermediate; however, this intermediate folds rapidly and the bimolecular association is the rate-limiting step, occurring through parallel pathways. The HPV domain dimer interface forms a β-barrel (Bussiere et al.

1998), while HIV-PR and SOD form the β-barrel prior to association. More importantly, these three proteins all fold on a slower time scale than similarly sized dimeric proteins containing predominantly α−helical content (Gloss and Matthews 1998b). This difference in time scales can be attributed to both the more non-local stabilizing

88

interactions required for β−sheet formation and the long-range interactions between

β−sheets that occur in development of the tertiary β-barrel structure.

∆96−99 Monomer construct folding comparisons. The HIV - PR D25N folding mechanism is more complex than the folding of other monomeric β−proteins that are fewer than 100 amino acids in size. The SH3 domain proteins (small β−sandwich proteins consisting of three β hairpins) (Grantcharova and Baker 1997; Riddle et al. 1997), the cold-shock family of proteins (small β-barrels with a two-stranded and three-stranded sheet structure) (Schindler and Schmid 1996; Srimathi et al. 2002; Garcia-Mira et al. 2004;

Magg and Schmid 2004], and the -trefoil hFGF {Srimathi, 2003 #31; Kathir et al.

2005) all fold via a rapid two-state folding reaction with a native-like transition state.

Some Ig-type proteins, β-barrels with a Greek-key topology, are also two-state folders

∆96−99 (Clarke et al. 1999; Geierhaas et al. 2004; Geierhaas et al. 2006). HIV - PR D25N folding and stability may differ from that of other equal-sized monomeric β−proteins because it is really just a piece of a larger protein.

Conclusion. The results presented in this study have shown that the HIV - PR D25N folding mechanisms more complex than the apparent two-state process described by equilibrium methods. The processing of HIV-1 protease from within the Gag-Pro-Pol polyprotein previously has been shown to be tightly coupled to protein folding (Louis et al. 1999). Within the polyprotein tertiary structure is weakly formed, but dimerization is

89

sufficient to allow cleavage and release of the N-terminus. Since the proregion of the protease is not as well folded, release of the N-terminus is necessary for mature tertiary structure formation and enzymatic activity.

Thus, in the virion HIV-1 protease activity can be modulated via LeChatelier’s principle. As the polyprotein is pulled to associate through reverse transcriptase association, HIV-1 protease forms a weakly associating dimer that cleaves its intramolecular N-terminus. Subsequent to this step, the protease becomes more stable and can cleave other HIV-1 protease chains from the Gag-Pro-Pol polyproteins. As the concentration of HIV-1 protease increases in the virion, dimer formation is favored. The dimeric state remains preferred until all of the Gag and Gag-Pol polyproteins become cleaved to their respective viral proteins. Then as these proteins disperse through the virion in their various functions, the effective protein concentration begins to decrease.

At this point, the HIV-1 protease system is pushed toward dissociation. Since the protease begins to cleave dissociated, unfolded polypeptide chains, the protein concentration decreases more until no HIV-1 protease remains. The monomer intermediate provides a checkpoint for both viral maturation and autoproteolysis by adding a folding step both before activity is gained to insure the viability of the process and after activity is complete to insure that degradation should begin.

Application of this HIV - PR D25N folding analysis to drug-resistant HIV-1 protease variants could elucidate whether a mutation confers resistance within one subunit or through the dimerization interface. For example, studies HIV-1 protease variants with mutations that are not spatially near the active site have been shown to confer drug

90

resistance by a hydrophobic sliding mechanism (Foulkes-Murzycki et al. 2007).

Comparisons of the folding mechanisms of drug resistant variants with the mechanism presented here, as well as exploring the role of invariant residues in folding by mutational analysis will further extend the understanding of the HIV-1 protease folding energetics.

Details of the differences in the folding energy landscape between the many HIV-1 protease variants may enhance the on-going efforts of HIV-1 protease structure-based inhibitor design to create drugs that can avoid resistance.

91

CHAPTER III

THE ROLE OF COILED COIL ASSEMBLY IN THE SCAFFOLD OF A COMPLEX HEMOGLOBIN

92

Abstract

Coiled coil motifs are involved in the structural assembly of proteins required for diverse biological processes such as cytoskeletal architecture, cellular motion, virus assembly, cellular transport, and DNA transcription. The 5.5 Å resolution crystal structure of Lumbricus erythrocruorin, determined by Royer et al., demonstrates a hierarchical organization of 144 hemoglobin subunits and 36 linker subunits, comprised of four non-hemoglobin linker chains, ccL1, ccL2, ccL3, and ccL4. The 36 linker subunits arrange into a dodecameric scaffolding complex with helical coiled coils interdigitating similar to spokes in a wheel. The hemoglobin subunits dock onto the globular linker heads on the outer surface of the scaffolding complex. To determine whether the linker scaffolding complex of Lumbricus terrestris erythrocruorin assembles in a stepwise fashion driven by the formation of a coiled coil heterotrimer consisting of three of the four linker chains, ccL1-ccL2-ccL3 or ccL1-ccL2-ccL4, peptide constructs of each of the linker chains were combined in order of addition experiments and their secondary structure was assayed by circular dichroism (CD) spectroscopy. While the expected signal of a coiled-coil motif was not evident, modest helical structure was observed indicating that these domains can associate. The inclusion of larger linker chain domains could cooperatively organize structure.

93

Introduction

Hemoglobin as a model of protein structure studies. Evolutionary diversification has led to a number of proteins that share high homology in both structure and function, yet differ in their state of assembly. A classic example of one such family of proteins is the hemoglobins, multimeric biomolecules having the ability to bind oxygen reversibly by means of a heme moiety containing a Fe+2 cofactor. All hemoglobins share similar tertiary structures; however, amongst different species the quarternary association exhibits diversity (Weber and Vinogradov 2001; Royer et al. 2005). Hemoglobin and myoglobin were the first proteins whose three-dimensional structures were determined by

X-ray crystallography (Perutz 1960; Perutz 1964); as a result, this class of proteins has been studied extensively as a model of the relationship between protein structure and function (Perutz 1976; Perutz 1978). Although hundreds of hemoglobin crystal structures have been determined since then, new insights on this protein continue to be discovered through structure-function studies.

While human hemoglobin has evolved in the environment of the red blood cell, annelid hemoglobins exist in an extracellular environment. This evolutionary difference may account for the divergence between hemoglobins in humans and those in annelids.

Inside the red blood cell 2,3-bisphosphoglycerate allosterically regulates oxygen release

(Benesch and Benesch 1967), and reducing agents maintain the iron in ferrous form

(Dorman et al. 2002). In annelids, these processes are not available because hemoglobins are not contained within a cellular environment. Thus, it has been postulated that these

94

hemoglobins have evolved with a scaffold that helps maintain their function and assembly (Royer et al. 2006).

A model annelid hemoglobin. The 5.5 Å-resolution crystal structure of L. terrestris erythrocruorin (Figure 3.1) (Royer et al. 2006) exhibits a self-limited hierarchical organization of 144 hemoglobin subunits and 36 linker subunits, comprising four linker chains, ccL1, ccL2, ccL3, and ccL4. The 36 non-hemoglobin linker subunits arrange into a dodecameric scaffolding complex with helical coiled-coils interdigitating, similar to spokes in a wheel. The hemoglobin subunits dock onto the globular linker heads on the outer surface of the scaffolding complex (Figure 3.2). This molecule serves as an excellent model for self-limited assembly, with a high cooperativity dependent on interactions with both divalent cations and protons (Royer et al. 2000).

One of the most striking features of the erythrocruorin structure is the interdigitation of 12 putative triple-stranded coiled-coil helices near the center of the complex. The N-terminus of the linker is at the center of the scaffolding complex and proceeds outward with an approximately 45 Å long coiled-coil region followed by a bend, a short 20 Å coiled-coil region and terminating with a cysteine-rich LDL receptor domain. (Royer et al. 2000) The sequences of all four linker chains (Figure 3.3) have been identified (Kao et al. 2006), At 5.5 Å, it appears that the interactions between the long coiled-coil region of the linker chains are primarily responsible for the assembly of this scaffolding complex.

95

Figure 3.1: The 5.5 Å electron density map of Lumbricus terrestris erythrocruorin.

This 3.6 MDa molecular weight protein is organized into 144 hemoglobin subunits, which are docked upon a scaffold comprised of 36 linker subunits. The scaffold has a D6 symmetry with coiled coils separating the two dyads. The Q dyad is a view along the two-fold axis of symmetry and the P Dyad displays the six-fold axis of symmetry.

96

(a)

(b) (c)

Figure 3.2: The linker scaffold is organized in one-twelfth subunits. The one-twelfth subunit of the complex (a) shows the formation of a distinct coiled coil. The interactions between these subunits along each dyad (b) suggest that coiled coil formation may drive assembly of the linker scaffold (c).

97

long coiled-coil domain

short coiled-coil domain LDL receptor-like domain

β-barrel domain

Figure 3.3: The amino acid sequences of four distinct linker chains have been identified in L. terrestris hemoglobin. Each linker chain contains a β-barrel region, a cysteine-rich LDL receptor-like domain, and short and long coiled coils separated by a break in sequence. These linkers interact to form a scaffolding complex with coiled coils intertwined in the central part. The stars indicate the cysteines involved in a disulfide linkage

98

Structure of a coiled coil. An ideal coiled coil is identified on the basis of “knobs- into-holes” packing. (Crick 1952) This arrangement involves the packing of one side chain of one helix into four side chains of the facing helix with residues of the first helix directly to the side of the equivalent residue of the second helix. The “knobs-into-holes” arrangement differs from typical helix formation within globular proteins. The latter is more similar to a “ridges-into-grooves” arrangement, where the residues pack above or beneath the equivalent residue from the facing helix (Figure 3.4).

In a coiled coil, there is a characteristic heptad repeat contained within each helix designated as (abcdefg)n. The amino acids are arranged in a pattern consisting of hydrophobic residues spaced every four and three residues apart (4, 3 hydrophobic repeat) with 3.5 residues per turn of the helix. A helical wheel representation (Schiffer and Edmundson 1967) of a parallel, dimeric coiled coil is shown in Figure 3.5. However, some coiled-coils exhibit interruptions in the heptad repeat classified as stutters and stammers (Brown et al. 1996). Generally, positions a and d are occupied by hydrophobic residues which lie on the same face of the helix, and the hydrophobic interactions that occur between these residues on adjacent helices are key to the formation of a coiled-coil.

Positions e and g have been shown to play a role in the electrostatic interactions of interhelical ion pairs and contribute to salt bridges with a pairing of the i residue of one helix to the i +5 residue of the adjacent helix. The structure can be described in terms of pitch (the distance the superhelix requires to complete a full turn), pitch angle (the angle that each helix assumes relative to the superhelix axis), and helix-crossing angle (the

99

GCN4-pI “Knobs-into-Holes”

ROP “Ridges-into-Grooves”

Figure 3.4 Comparison of helical packing “Knobs-into-holes” packing, characteristic of coiled coils, is shown in the top structure of GCN4-p1. This packing contrasts with

“ridges-into-grooves” organization of helices within globular proteins, such as ROP.

100

Figure 3.5 Sequence of GCN4-p1 represented in helical wheel format. The heptad repeat is notated at the corresponding amino acid position.

101

angle between two neighboring helices). The coiled-coil can be distinguished from other amphipathic helices by the periodicity of hydrophobic residues (3.5 versus 3.65), the length (long versus short), and the packing interactions (“knobs-into-holes” versus

“ridges-into-grooves”) (Lupas 1996).

Importance of coiled coils. Coiled-coil motifs occur ubiquitously in nature. They provide structural support in fibrous proteins such as , , and fibrinogen, and they are necessary for formation (Oas and Endow 1994; Lupas

1996; Burkhard et al. 2001). Larger α-helical proteins involved in intracellular trafficking, such as and , have been shown to require coiled-coils for structural integrity (Mo et al. 1991a; Mo et al. 1991b). In bZIP proteins, small transcription factors consisting of an activation domain and a DNA binding domain, coiled coils have been shown to be the driving force of assembly (Landschulz et al. 1988;

O'Shea et al. 1991). Coiled coils have been observed in a binding domain of a Rab5 effector early endosome 1 that acts in endosome tethering and fusion (Merithew et al.

2003). These motifs have been shown to play a role in membrane fusion, as exemplified by the three-stranded influenza virus hemagglutanin protein (Wilson et al. 1981; Bentz

2000) and SNARE complexes (Short and Barr 2004). Additionally, coiled coils are found in the core of many multimeric proteins, as first discovered in catabolite gene activator protein (McKay and Steitz 1981). In Newcastle virus, coiled coils are postulated to mediate folding and assembly (McGinnes et al. 2001; Sergel et al. 2001).

102

Given the functional diversity of the proteins which contain coiled coils, the simplicity of the coiled-coil structure, and the extent of these motifs within the genome, the coiled-coil motif has served as an excellent model for investigations into the rules governing protein- protein interactions.

GCN4-p1: A model coiled coil. Knowledge of GCN4-p1 serves as a guide in the investigation of the role of these coiled coils in the scaffolding complex assembly. The

1.8 Å-crystal structure of the GCN4-p1 , determined by Kim, Alber, and colleagues (O'Shea et al. 1991), has served as a model of the coiled-coil motif for a number of investigations into the determinants of coiled-coil structure. This dimeric structure consists of a pair of 33 amino acid peptides corresponding to the leucine zipper domain of GCN4, a yeast transcriptional activator (Figure 3.6). GCN4-p1 is a dimeric coiled-coil consisting of two α-helices twisted into a left-handed supercoil. While

GCN4-p1 is a dimeric, parallel coiled-coil peptide, other coiled-coils have been shown to form dimers, trimers, and tetramers. Additionally, these coiled-coils can orient in a parallel (the N-termini of all helices are located at the same end) or an anti-parallel (the

N-terminus of one helix pairs with the C-terminus of another helix) fashion. Though some of the systems that have been investigated have included coiled-coils from naturally occurring proteins, other model systems have been engineered to study their determinants of oligomerization, orientation, and homogeneity versus heterogeneity of coiled coils.

Most of the designed coiled coils have been engineered by modifying the leucine zipper via mutagenesis.

103

e

Asn16

d

a g

Figure 3.6 Crystal structure of GCN4-p1. Top: Side view of the dimer with the a and d positions highlighted in blue and red respectively. The electrostatic e and g positions are in green and cyan. The buried polar residue is noted at Asn16.

104

Alber and coworkers studied the importance of hydrophobic effects by exploring the role of mutations at the a and d position of the heptad repeat in the GCN4-p1 sequence (Harbury et al. 1993). Mutations were simultaneously made at four of the a residues (Val9, Asn16, Val23, Val30) and the four d residues (Leu5, Leu12, Leu19,

Leu26). These resultant mutants were named p-IL, p-II, p-LI, p-VI, p-VL, p-LV, and p-LL, with the first letter corresponding to the amino acid at position a and the second letter denoting the amino acid at position d of the core. The p-IL, p-II, and p-LI mutants formed parallel oligomers such that associated as dimers, trimers, and tetramers respectively. Additionally, crystallographic packing constraints appear to influence the oligomeric state of the peptides. The dimeric coiled-coils resemble the parallel “knobs- into-holes” packing at the a position and perpendicular at d, the trimeric coiled-coils illustrate an acute geometry involving neither parallel nor perpendicular packing a and d, and the tetrameric coiled-coils form perpendicular packing at a and parallel packing at d.

The pitch of the helix also changes with oligomeric state: 148 Å for the dimer, 175 Å for the trimer, and 205 Å for the tetramer. The distance between helices also increases with oligomeric state. The helices of the p-LI tetramer were 5.5 Å further apart than for the p-IL dimer, while p-VI, p-VL, p-LV, and p-LL formed mixtures of multiple oligomeric states. Particularly, the oligomeric states of p-VI and p-VL were also concentration dependent. These multiple populations were postulated to be due to the absence of the buried polar residue at Asn16, which was crucial for oligomeric specificity.

The buried polar residue at Asn16 has been investigated by mutagenesis experiments. An allosteric switch was created with the N16V mutation of GCN4-p1 that

105

forms trimers upon benzene binding and dimers with no ligand (Gonzalez et al. 1996a;

Gonzalez et al. 1996b; Gonzalez et al. 1996c). An N16Q mutation incorporated into

GCN4-p1 formed the stable trimeric coiled-coil, GCN4-pIQI. The stability of this trimer is, however, lower than that of GCN4-pII, with all at the a and d positions

(Eckert et al. 1998). An N16L mutation causes a loss of specificity in the association orientation in GCN4-pII, resulting in both parallel and antiparallel dimers (Oakley and

Kim 1998). These studies have demonstrated that the packing of the coiled coil is influenced by the presence or absence of the polar group Asn16, and also by its side- chain interactions with the structural surroundings.

Several groups have shown also that the stability of the oligomeric association is influenced by electrostatic interactions between e and g′ residues of opposing α-helices within the coiled coil, while maintaining hydrophobic residues in the a and d positions

(John et al. 1994; Lumb and Kim 1995a; Lumb and Kim 1995b; Yu et al. 1996a; Yu et al.

1996b; Kohn et al. 1998; Krylov et al. 1998). The number of ionic interactions between the g positions of one heptad and the e′ positions of the next constitute an ‘interhelical salt-bridge rule,’ which has been used to describe the homo- and hetero-oligomerization of coiled-coils and also the parallel vs. anti-parallel orientation (Lupas 1996). Lys-Glu

' gn-e n+1 ion pairs tend to form trimers in parallel or anti-parallel orientations, while Gln-

' Gln gn-e n+1 interactions lead to dimer formation.

In contrast to the well-studied homodimer GCN4-p1, another ,

Fos/Jun forms a heterodimer (O'Shea et al. 1992). Although homodimers of Fos and

Jun can form, the Fos/Jun homodimer preferentially forms within the cell. The Fos/Jun

106

heterodimer has been postulated to form due to the ionic interaction between the two negatively charged Glu residues in g1 and e2 of the Fos leucine zipper and the two positively charged Lys residues in the equivalent positions of the Jun zipper. The repulsive electrostatic interactions within the Fos homodimer thermodynamically favor formation of the Fos/Jun heterodimer. When compared to the stability of GCN4-p1, the less stable Jun homodimers further explain the heterodimer association preference

(O'Shea et al. 1992; d'Avignon et al. 2006).

The folding mechanism has been well characterized for the model system GCN4- p1, a homodimeric coiled-coil bZIP protein (Zitzewitz et al. 1995; Sosnick et al. 1996;

Kentsis and Sosnick 1998; Bhattacharyya and Sosnick 1999; Moran et al. 1999; Zitzewitz et al. 2000; Ibarra-Molero et al. 2004). This small, highly helical protein with thirty-three amino acids folds via both apparent two-state equilibrium and kinetic mechanisms. As partially folded intermediates are not well populated, an extensive mutational analysis of the dimeric transition state elucidated the sequence requirements for structure formation

(Zitzewitz et al. 2000). The C-terminal part of the helix has been shown to fold first, but this formation depends upon the specific sequence rather than its location within the peptide chain. Furthermore, the buried Asn at residue 16 in GCN4-p1 confers thermodynamic stability to the dimer, but is not involved in determining the association rate constant (Zitzewitz et al. 2000). These insights from GCN4-p1 folding can be applied to other coiled coil folding studies.

107

Coiled coil diversity. Though studies of designed coiled coils have illustrated their properties, many natural coiled coils also have interesting structural characteristics.

For instance, the “stutter” and “stammer” observed in mannose binding protein, hemagglutanin, and myosin refer to deletions of three and four residues in the heptad repeat that can cause underwinding and overwinding of the supercoil (Brown et al. 1996).

Right-handed coiled coils have been observed in tetrabrachion, a filamentous archaebacterial surface protein assembly, but contain an 11 residue repeat containing three hydrophobic residues at positions a, d, and h in an undistorted helix (Yan et al.

1993; Peters et al. 1995).

Although numerous attempts have sought to discern a set of guidelines for coiled- coil prediction, many of the available algorithms have not been entirely reliable due to breaks in the heptad repeat in naturally occurring proteins (Wolf et al. 1997; Walshaw et al. 2001; Walshaw and Woolfson 2001; Woolfson 2005). Visual inspection of the sequences remains the best method (Lupas 1996). While some basic guidelines have been deduced, it is important to note that many of these rules are context dependent, and often the behavior of the coiled coil depends on the environment of the overall protein.

Aim of this chapter. In contrast to studies of engineered heteromultimeric coiled-coils, L. terrestris erythrocruorin provides a hierarchical model in which nature utilizes coiled- coils for complex assembly. The central region of the L. terrestris erythrocruorin scaffold consists of twelve coiled coils interdigitating similar to spokes in a wheel. Given the essential role of coiled coils in other structurally complex proteins with architectural

108

(a) (b)

Figure 3.7: Coiled coil assembly The long coiled-coil domain is proposed to drive assembly of the scaffold linker complex. (a) The one-twelfth subunit of the linker chain complex. (b) The long coiled-coil domain of the one-twelfth subunit (PDB 2GTL).

109

hierarchy, it is proposed that heterogenous coiled coil formation in this hemoglobin macromolecule (Figure 3.7) may be the component that initiates assembly of the linker scaffolding complex. In this study, peptides corresponding to the long coiled-coil domain of each of the four linker chains have been synthesized and the interactions between these peptides have been examined to determine to the role of this component in scaffold assembly. Evidence of enhanced helical structure, as well as oligomerization, was used to determine if these isolated domains were competent to assemble without the remainder of the linker chain. The results of these assays would test the hypothesis that coiled coil formation serves as the initial step in the assembly of this complex.

110

Experimental methods

Helix propensity prediction. AGADIR, an algorithm based on helix/coil transition theory, was used to predict the helix propensity of the individual peptides

(Munoz and Serrano 1997). This algorithm examines short range interactions to predict the helix propensity of monomeric peptides. Each individual sequence was input to determine its likelihood of forming a helix independent of the other peptides. The information output was then plotted with helical propensity as a function of residue number.

Peptide synthesis and purification. Peptide constructs corresponding to the ~45 Å long coiled coil domain of the four linker chains were synthesized on an Applied

Biosystems 433 synthesizer utilizing solid-phase 9-fluorenylmethoxycarbonyl (Fmoc) chemistry (Fields and Noble 1990). The sequences of these domains are shown in Figure

3.8. N-terminal CGG linkers were synthesized to allow for the addition of thiol-reactive fluorescent probes for fluorescence resonance energy transfer experiments. Since the peptides ccL2, ccL3, and ccL4 do not contain intrinsic chromophores, they were synthesized with a C-terminal tyrosine for spectroscopic determination of concentration

(Gill and von Hippel 1989). The peptides were N-terminal acetylated and C-terminal amidated to increase helix stability by neutralizing partial charges (Scholtz and Baldwin

1992). Since branched amino acids are difficult to attach during synthesis, these residues

111

defgabcdefgabcdefgabcdefgabcde ccL1 Ac-CGGFQYLVKNQNLHIDYLAKKLHDIEEEYNKLT-NH2 ccL2 Ac-CGGLDPRLGANAFLIIRLDRIIEKLRTKLDEAEY-NH2 ccL3 Ac-CGGHDEIIDKLIERTNKITTSISHVESLLDDRLY-NH2 ccL4 Ac-CGGEDNRARDISERIDKLTAEAFKLGRNLDARLY-NH2

Figure 3.8: Peptide constructs corresponding to the long coiled coil. The long coiled coil was hypothesized to consist of three heterogenous helices. The ccL peptides correspond to the long coiled coil domain of the four identified linkers. Each peptide was N-acetylated and C-amidated. An N-terminal CGG was added to allow for the addition of thiol-reactive probes and C-terminal Tyr were added for spectroscopic determination of protein concentration. The helices, ccL1 and ccL2 are postulated to be present in all twelve of the coiled coils, while ccL3 and ccL4 alternate.

112

were added with a double coupling cycle. Synthetic peptides were cleaved from the resin with the reagent K method (King et al. 1990). Following cleavage from the resin, the peptide s were dissolved in 5% glacial acetic acid and lyophilized. The lyophilized peptides were subsequently resuspended in 1% trifluoroacetic acid and purified as a single peak using HPLC on a Waters Breeze system with a reverse-phase column and a

35 - 45 % acetonitrile gradient. Purity of the peptides was confirmed using electrospray mass spectrometry. The purified peptides were lyophilized and stored at 4˚C.

Buffers. Standard buffering conditions of 10 mM monobasic/dibasic potassium phosphate, 0.2 mM ethylenediamine tetraacetic acid (EDTA), 1 mM β−mercaptoethanol

(βME), at pH 7.8, the pH of earthworm blood, were used for all studies.

Determination of Protein Concentration. Absorbance at 276 nm, corresponding to the chromophore tyrosine, was monitored using an Aviv 140S UV-VIS spectrometer.

The extinction coefficient of each peptide was determined using the method of Gill and von Hippel (Gill and von Hippel 1989). The molar extinction coefficient for each peptide was taken as 1500 M-1cm -1, based upon the contribution of a single tyrosine per peptide.

The concentrations of these peptides was then calculated according to Beer’s law

A = εcl (1) where A is the absorbance of the peptide, ε is the molar extinction coefficient (M-1cm -1) for the peptide, c is the given peptide concentration (M), and l is the path length of the cell (cm).

113

Size-exclusion chromatography (SEC). SEC, using an ÄKTA explorer FPLC with a sepharyl 75 peptide sizing column, was used to determine the oligomeric state of the peptides, both isolated and combined. The eluent used was 0.1 % trifluoracetic acid

/30 % acetonitrile. Molecular weight standards were dimeric GCN4, and an engineered trimeric GCN4, GCN4-IQI. Data were analyzed qualitatively by comparing the retention volumes of these coiled-coil molecular weight standards to those of the proteins in question.

Equilibrium circular dichroism (CD) experiments. Using a JASCO J-810

Spectropolarimeter equilibrium CD, wavelength scans from 300 nm to 185 nm were performed for individual and mixed peptides over a range of concentrations with standard buffering conditions. The scans were compared on the basis of mean residue ellipticity

(MRE, deg cm2 dmol-1)

θ MRE = (2) 10× c × l × n where θ is the ellipticity in millidegrees (mdeg), l is the pathlength of the cell (cm), n is the number of amino acids in the peptide, and c is the peptide concentration (M).

114

Results

Predicted helicity of the constructs. Coiled coils appear to form in the scaffold center of the 5.5 Å crystal structure of erythrocruorin. Examination of the corresponding sequences within each linker domain revealed aspects of the characteristic coiled-coil heptad repeat. Peptide constructs, corresponding to the 45 Å long coiled-coil domain of each linker chain, were synthesized. Each linker chain natively contains either Ile or Leu at multiple a and d positions, but the distribution of these residues is not specifically all

Ile or Leu as the positioning in engineered coiled coils to predict a trimeric oligomerization (Woolfson 2005). Although many of the e and g positions in each peptide are electrostatic residues, there again is no particular pattern to suggest the formation of parallel trimers.

To check the helix propensity of each chain, the peptide sequences were individually analyzed for propensity to form helix using the program AGADIR (Munoz and Serrano 1994; Munoz and Serrano 1997). This algorithm, based on helix/coil transition theory, explores short range interactions to predict the amount of helix a monomeric peptide will form. The predicted helix propensity by AGADIR of each peptide construct was plotted as a function of residue number (Figure 3.9).

The predictions suggested that ccL4 has the most helical propensity at a maximum of 15% helix from residues 17 through 23. Peptide ccL1 was predicted to have between 5-10% helical content in its C-terminus. Predictions for ccL2 displayed a 5% tendency to form helix in the C-terminal portion of the peptide. The construct ccL3 was

115

Figure 3.9: Agadir prediction of ccL peptide helicity. The program Agadir was used to predict the helical propensity of each of the four isolated sequences. Though each sequence does not display a high propensity to form helix, it is possible that interactions between three heterogenous sequences will result in coiled coil formation because different regions of each sequence display different helical propensities.

116

predicted to have the least helical character with less than 5% helical propensity. These same parameters were used to calculate the predicted helix propensity for a single chain of GCN4-pIQI, a homotrimeric coiled coil, as a standard for trimeric coiled-coil helix propensity. While the N-terminus of GCN4-pIQI was predicted to have less than 5% helix propensity, the C-terminus was predicted to have as much as 50% helix propensity.

These predictions suggest that the individual peptides may not form helices in isolation.

Rather, as illustrated in the crystal structure (Figure 3.1), each construct may need other peptides to form heterotrimeric coiled coils.

Peptide solubility. Peptide solubility was determined by measuring the volume of buffer necessary to dissolve a specific mass of peptide and then analyzing the absorbance spectrum of the particular solution. The peptides ccL1 and ccL4 more charged polar residues than the peptides ccL2 and ccL3, and as expected, ccL1 and ccL4 were the most soluble. Conversely, ccL3 was marginally soluble and ccL2 exhibited minor solubility with a tendency to aggregate (Figure 3.10). The high hydrophobicity and limited solubility of ccL2 suggests that it could associate with other chains (ccL1, ccL3 or ccL4) early in the folding process. Mutation of a phenylalanine to tyrosine in ccL2 did not drastically improve its solubility, creating a problem for its characterization in a primarily aqueous solvent. As shown in the UV absorbance plots, baseline scattering indicates that all of the constructs were prone to aggregation. Furthermore, solubility of the ccL2 peptide was not increased in helix-inducing organic solvents, including trifluoroethanol

117

Figure 3.10: UV Absorption Spectra of the ccL peptides. The peptides corresponding to ccL1 and ccL4 were the most soluble of the four peptides. ccL3 was moderately soluble and ccL2 was essentially insoluble. Mutation of a phenylalanine to tyrosine in ccL2 did not drastically improve its solubility.

118

(Storrs et al. 1992; Luo and Baldwin 1997) and acetonitrile. Although these results appeared discouraging in terms of individually characterizing the solution structure of the linker chains, they suggest that the linkers most likely assemble into heterogenous coiled coils.

Secondary structure of the individual linker chains. Circular dichroism (CD) spectra of each linker chain show the secondary structure of the individual peptide sequences (Figure 3.11). The ccL1 peptide demonstrates the most helical character of the four peptides with a -4000 MRE at 222 nm. Compared to the GCN4-p1 standard, with a highly helical content at -33,000 MRE, this peptide was only 12% helical. Urea denaturation displayed a progressive loss of structure for ccL1 (Figure 3.12). However, determination of the thermodynamics of folding was not possible due to the marginal stability of ccL1.

CD spectra revealed that ccL4 was only 9% helical with a -3000 MRE at 222 nm.

Compared to the local 222 nm minimum, both ccL1 and ccL4 exhibit a larger minimum in their CD spectra at 205 nm (-8000 MRE), suggesting these peptides were mostly unstructured with a relatively small helical content. At 222 nm ccL3 exhibited the same spectral shape as ccL4, but below 205 nm it is far more unstructured. Peptide ccL2 displayed a baseline spectrum that may be attributed to the insolubility of this construct.

The predicted helicity for this peptide is lowest of the four peptides, and experimental cross-validation with CD spectroscopy was unfortunately not possible due to its poor

119

Figure 3.11: CD Spectra of the individual ccL peptides. The spectra of the isolated peptides at individual concentrations of 10 µM in 10 mM potassium phosphate, 0.2 mM

EDTA, 1 mM BME, pH 7.8 were compared with each other as well as the spectrum of

GCN4-p1 at 10 µM (inset) in the same buffer conditions.

120

Figure 3.12: Urea denaturation of ccL1 displays a progressive loss of secondary structure. However, thermodynamic analysis was not possible because the unfolding reaction of the peptide does not display a cooperative transition.

121

solubility. The behavior of the individual peptides in solution further reinforces the idea that these chains contain a sequence preference for heterogeneous interactions.

Secondary structure induced by interactions between the peptides. After examining the isolated linker sequence constructs (ccL1, ccL2, ccL3, ccL4), mixing experiments were conducted to determine whether these constructs could form coiled coils outside the context of the linker scaffold, and if so to measure their stability. The linkers were mixed according to their apparent organization within the erythrocruorin scaffold. In the electron density maps, the one-twelfth subunit is comprised of a trimer of linker chains, containing either linkers 1-2-3 or linkers 1-2-4. Based on modeling, it is hypothesized that the interdigitation between each one-twelfth subunit is likely to be the ccL1-ccL2-ccL3 coiled coil interacting with the ccL1-ccL2-ccL4 coiled coil.

To determine whether certain linkers preferentially associated as dimers before the development of a stable trimer, the six possible combinatorial dimer mixtures between the three peptides were pre-equilibrated for twenty-four hours and their CD spectra were taken (Figure 3.13). The combination of ccL1 and ccL4 displayed the most helical structure of the six mixtures, followed by that of ccL1 and ccL3. All of those mixtures containing ccL2 appeared to exhibit a spectrum close to baseline, suggesting that these solutions contained minimal ccL2 compared to the other component peptide.

As a control, ccL3 and ccL4 were mixed, but showed little helical character. Compared to GCN4-p1, these six combinations lack significant helical structure beyond that of their

122

Figure 3.13: CD Spectra of the pairwise ccL combinations. The spectra of the combined peptides at a total concentration of 20 µM (10 µM each) in 10 mM potassium phosphate, 0.2 mM EDTA, 1 mM BME, pH 7.8 were compared with the summed spectra of the individual peptides.

123

constituent peptides. However, this result may be an indication that all three peptides must be in contact to fully form the heterotrimer.

Based upon these preliminary observations, the peptides were mixed in pre- equilibrated equimolar amounts of ccL1- ccL2-ccL3 as well as ccL1-ccL2-ccL4 and analyzed by CD (Figure 3.14). Compared to the sum of signals from the individual peptides, the signal strengths for the trimeric peptide mixtures exhibit a relative enhancement, yet neither combination of peptides resulted in the signature coiled coil spectrum of GCN4-p1. In fact, both of these combinations display only 10% of the helicity displayed by GCN4-p1, and they appear to be disordered with a spectral minimum at 205 nm rather than at 222 nm. As a control, a mixture of ccL2-ccL3- ccL4, which is not observed in this erythrocruorin scaffold, was studied but showed no helical signal suggesting that the ccL1-ccL2-ccL3 and ccL1-ccL2-ccL4 mixtures may have been forming a small amount of helical structure. Hence, the relative increase in signal for the trimeric mixtures, although modest, correlates with the results from the studies of the dimeric mixtures.

The CD spectrum of a pre-equilibrated combination of ccL1-ccL2-ccL3- ccL4

(2:2:1:1 ratio of peptides) was collected. This spectrum is then compared with both the summed CD spectral contributions of these four peptides as well as the CD spectrum of

GCN4-p1 (Figure 3.15). Although there was a slight enhancement of helical structure for the mixture compared to the sum of spectral contributions of the four peptides (13% helix for the mix and 7.5% helix for the sum), the overall spectrum resembled a disordered protein rather than a coiled coil.

124

Figure 3.14: CD Spectra of the putative trimer linker combination. The spectrum of the combined peptides of the three possible ccL combinations at a total concentration of

30 µM (10 µM each) in 10 mM potassium phosphate, 0.2 mM EDTA, 1 mM BME, pH

7.8 were compared with the summed spectra of the three individual peptides as well as the spectrum of GCN4-p1 at 10 µM (inset) in the same buffer conditions.

125

Figure 3.15: CD Spectra of the putative ccL1-ccL2-ccL3-ccL4 linker combination.

The spectrum of the combined peptides at a total concentration of 60 µM (20 µM ccL1,

20 µM ccL2, 10 µM ccL3, and 10 µM ccL4.) in 10 mM potassium phosphate, 0.2 mM

EDTA, 1 mM BME, pH 7.8 was compared with the summed spectra of the four individual peptides as well as the spectrum of GCN4-p1 at 10 µM in the same buffer conditions.

126

Oligomeric state of the peptides. The oligomeric states of the peptides were qualitatively assayed by size exclusion chromatography. As the stock concentrations are diluted by ten-fold in this technique, the low solubility of the ccL peptides complicated the detection of these molecules in the assay. Although the traditional quantitative assessment of molecular weight was not possible, the relative elution volumes of each peptide indicated a qualitative measure of its oligomeric state. Figure 3.16 shows representative sizing data from ccL1, ccL3, the mix ccL1- ccL2-ccL3, the dimer standard

GCN4-p1, and the trimer standard IQI. The two standards are well-defined in their oligomeric state and their elution positions were assumed to correspond with that state.

While only exact elution with either the dimer or trimer standards would suggest an accurate oligomeric state of the peptide, elution volumes occurring after the two standards were assumed to indicate monomer. It is to be acknowledged that the non- globular shape of coiled coils and peptides could distort traditional sizing analysis.

Compared with the dimer GCN4-p1 and the engineered trimer GCN4-pIQI, ccL1 exhibited an elution profile extending from the trimer standard to beyond the dimer standard. This peptide eluted as a single peak at a later volume than the dimer standard when a lower initial protein concentration stock was assayed. This result was interpreted to indicate equilibrium between monomer, dimer, and trimer that could be shifted to primarily monomer by lowering the protein concentration. However, sizing of ccL3 exhibited only a single peak extending beyond the dimer standard elution volume, suggesting a monomer-dimer equilibrium. Sizing of ccL2 was not possible because of solubility limits. The mix of ccL1-ccL2-ccL3 eluted similarly to ccL3, interpreted as a

127

Figure 3.16: Size Exclusion Chromatography of the ccL peptides. Based upon comparisons to GCN4-p1 and GCN4-pIQI, an engineered trimeric coiled coil, ccL1 displays an equilibrium between dimer and monomer that can be shifted to monomer by lowering the protein concentration, while ccL3 exhibits a monomer-dimer equilibrium.

The mix of ccL1, ccL2, and ccL3 elutes in a monomer-dimer equilibrium that can be shifted to monomer by lowering the protein concentration. These sizing data are qualitative, as elution of the dimer and trimer standards is not well resolved.

128

monomer-dimer equilibrium, and elution could be shifted to monomer by lowering the protein concentration. (At the time of the sizing experiments the ccL4 peptide was unavailable. Based upon the analysis of the CD data, further quantities of ccL4 were not synthesized to pursue this study.) These results suggest the possibility of oligomerization, but no strong evidence of heterotrimer formation.

Summary of results. These studies were influenced profoundly by the low solubility of ccL2. The presence of the other peptides was not enough to bring the ccL2 chain into solution as part of the structure, suggesting that ccL2 may organize the initial structure formation of each coiled coil. Alternatively, there may not have been enough ccL2 present to accurately quantify any possible interactions. It is possible that a larger domain of linker chain two may be necessary to confer solubility. These results may suggest that the interactions within the long coiled-coil domain of the linkers alone may not be sufficient to form initiating components of the scaffold, but rather requires other interactions to build the scaffold. However, the poor solubility of ccL2 did not provide experimental evidence to eliminate the ability of the long coiled-coil domain to associate.

129

Discussion

While the proposed initiation of structure within the erythrocruorin scaffold by coiled coils was not shown by the approaches used in this study, the idea of hierachical folding within this molecule should not be abandoned. At 5.5 Å resolution, rods of density within the scaffold of this molecule were a clear indicator of coiled coils, but β- sheet formation within the globular head domain of the linker chains was less apparent.

More recent structural data, at 3.5 Å, revealed that interactions within the scaffold occurred not only between the long coiled coils, but also between the β-barrel portions of the linker head groups (Figure 3.17) (Royer et al. 2006). In fact, the contacts between the linker head groups account for 1000 Å2 of buried surface area while the interactions of the coiled coil regions shows only 200 Å2 of buried surface area. While the initial 5.5 Å- resolution data suggested that the long coiled coil of the linker chain was the driving force of the scaffold assembly, this new data favors a more cooperative process of assembly involving both the long coiled coil and the β-barrel portion of the linker head groups.

A comparative study between the erythrocruourins from Arenicola marina and

Lumbricus terrestris further elucidates the lack of coiled coil formation observed in the folding study presented here. In the more evolutionarily primitive species, Arenicola marina, the coiled coil region of the linker chain extends ~70 Å continuously from the

LDL receptor-like portion of the linker chain head (Royer, JMB 2007). This structural difference is proposed to account for the architectural differences between the two

130

The type I architecture of Lumbricus erythrocruorin permits extensive packing of β-barrel domains from linker subunit L1 across the Q-dyad

Arenicola marina erythrocruorin Lumbricus terrestris erythrocruorin

Figure 3.17: Contacts occurring between the globular heads of the linker domains.

The 3.5 Å resolution density map of the L. terrestris erythrocruorin shows that there are stabilizing contacts between the β-barrel domain of the globular heads, which may coordinate with the coiled coils in the scaffolding assembly process (Royer, 2007).

131

erythrocruorins. A break between the long and short coiled coils of the L. terrestris erythrocruorin may account for the shift in alignment between the two dyads of the scaffolding complex. Thus, the β-barrels of the linker heads make stabilizing contacts within the complex.

This comparison between these two evolutionarily divergent erythrocruorins suggests that it would be appealing to study longer segments of the linker chains in L. terrestris erythrocruorin with constructs incorporating the long and short coiled coils.

With the added anchor of the short coiled coils, the linkers may be more likely to associate. Also, by using larger segments of these constructs, it could be possible to increase the solubility of these constructs. In contrast to L. terrestris, the dyads of the A. marina erythrocruorin are separated by the more rigid, continuous coiled coil. Since this coiled coil has no breaks in the heptad repeat, it may be a better model for studying the scaffold assembly of erythrocruorins. To guide these efforts, a recent model for the molecular basis of coiled coil formation (Steinmetz et al. 2007) would provide insights regarding “trigger sequences” within the coiled coil domains.

The unique hierarchy of this class of protein makes it an appealing model to investigate self-limited assembly. Their extracellular existence and remarkable oxygen transport properties provide not only a system from which to design blood substitutes, but also an example of a protein whose assembly may not be limited to the constraints of homeostasis within the cell. Additionally, the availability of varied erythrocruorins from different species, including oligochaetes (earthworm), achaetes (leech), vestimentiferans

(hydrothermal vent tube worm), and polychaetes (marine worms) allows for a

132

comparison of evolutionary conservation and diversification (Royer et al. 2007). The

~70 Å coiled coil from the more primitive scaffold architecture may be more likely to assemble isolated in solution, as the dyads appear to have no other interacting contacts.

Information regarding the folding of these coiled coils could then be used to examine the folding of the scaffold with a more intertwined linker complex. Thus, future folding and assembly studies within this class of proteins may enhance protein engineering efforts towards biological therapeutics.

As technology advances and expression of these isolated scaffolds becomes more accessible, studying the details involved in the assembly of these complex systems will then be logistically possible. For instance, the high degree of buried surface area amongst the beta barrel domains of the linker head groups suggests that these parts might drive the assembly of the L. terrestris erythrocruorin scaffold resulting in the long coiled coils. Alternatively, combining both the long and short coiled coil domain may satisfy the requirements for initiating assembly of the scaffolding complex. As technological developments yield an expression system for these linker chains, it may be possible to probe longer fragments of the chain. Furthermore, the ability to express this protein, rather than synthesize small constructs, may allow for studies that probe coiled-coil assembly with a single chain variant containing three long coiled-coil domains. Yet, another possibility is the cooperative interaction between both the coiled coil and the β- barrel domains. This study demonstrates that although simplified models can be examined to understand multimeric folding systems, the complexity of interactions often requires the cooperativity of multiple subunits.

133

Chapter IV

GENERAL DISCUSSION

134

Importance of protein folding studies

Elucidation of the mechanism by which a given polypeptide folds from a random coil into a functionally competent protein with a well-defined, three-dimensional structure further enhances the current knowledge base of the protein folding field. In the period since Levinthal posed the “protein folding problem” (Levinthal 1968), a myriad of protein folding pathways and energy landscapes have been defined. Comparisons within this database of protein folding mechanisms indicate that specific trends exist within structural classes of proteins; however, distinct correlations between protein families and their folding mechanisms have yet to be assigned. As further information about protein folding mechanisms is discovered, these correlations may become clearer. Thus, the need for protein folding studies remains today.

Beyond monomer folding. Biological function is dependent upon protein stability, and determination of the mechanism of protein folding for a number of proteins is important for understanding the molecular basis of this stability (Matthews 1993).

Furthermore, a large number of these proteins are comprised of an intricate quarternary structure. As the protein folding field moves toward discovering a folding code, it is important to build upon the knowledge base of smaller, homogenous monomeric proteins. Many proteins in biology function as a part of a larger system rather than a discrete unit. Although the database of protein folding mechanisms for relatively small proteins is ever increasing, in reality many proteins are quite large and comprised of many subunits in their biologically functional form. In addition to being complementary

135

to the existing knowledge, multimeric folding studies are essential to understanding the complete picture of protein folding.

Multimeric protein folding studies have been sparse in comparison to the wealth of monomeric studies in part because multimers pose more experimental challenges.

Often complex heteromers are more difficult to obtain by recombinant methods because it is difficult to coexpress multiple polypeptide chains in a prokaryotic cell. Even when protein expression is possible, purification becomes complicated by the presence of multiple polypeptide chains and issues with aggregation. While multimeric folding reactions yield another dimension of experimental data with their inherent protein dependence, the bimolecular character of association reactions (Bernasconi 1976) creates a layer of complexity in analysis. In cases where either the size or oligomeric state of a multimeric protein limit the feasibility of traditional technology, studies of small proteins and peptides can serve as models for the macromolecular assembly of larger more complex proteins (Lecomte and Matthews 1993).

Current perspectives on multimer folding. At present, relatively few energy landscapes have been defined for multimeric proteins. The current literature on multimeric folding (Mann et al. 1995; Gloss and Matthews 1997; Shao et al. 1997; Gloss and Matthews 1998b; Gloss and Matthews 1998a; Shao and Matthews 1998; Thies et al.

1999; Jaenicke and Lilie 2000; Kreisberg et al. 2000; Zitzewitz et al. 2000; Gloss et al.

2001; Gloss and Placek 2002; Knappenberger et al. 2002; Kreisberg et al. 2002; Placek and Gloss 2002; Thies et al. 2002; Banks and Gloss 2003; Banks and Gloss 2004; Doyle

136

et al. 2004; Feige et al. 2004; Ibarra-Molero et al. 2004; Topping et al. 2004; Placek and

Gloss 2005; Placek et al. 2005; Simler et al. 2006; Svensson et al. 2006a) illustrates that development of quartnerary structure varies throughout classes of proteins. In the case of highly helical dimers, development of secondary structure occurs is highly coupled with the association reaction (Zitzewitz et al. 2000; Knappenberger et al. 2002; Ibarra-Molero et al. 2004). Other α/β multimers display complex dimeric intermediates (Mann et al.

1995; Gloss and Matthews 1997; Shao et al. 1997; Gloss and Matthews 1998b; Gloss and

Matthews 1998a; Simler et al. 2006). Recently, proteins containing mainly β-motifs (Thies et al. 1999; Thies et al. 2002; Feige et al. 2004; Svensson et al. 2006a) have been shown to display monomer structure formation before association. These studies highlight the intricacies of quartenary association and address the necessity of further probing these types of reactions.

137

Significance of this thesis

This thesis examined the assembly of two structurally different proteins: a viral protease from the human immunodeficiency virus type I (HIV-1) and a coiled-coil component of L. terrestris erythrocruorin (ccL). HIV-1 protease is a 22 KDa homodimer consisting of β-barrel subunits, and its function is to cleave the Gag and Gag-Pro-Pol precursor proteins thereby resulting in viral maturation (Johnston et al. 1989). This protein is not meant to exist for more than a viral life cycle and degrades via autoproteolysis. L. terrestris erythrocrourin is a 3.5 MDa extracellular hemoglobin comprised of a hierarchical architecture with dodecamers of dodecameric hemoglobin subunits docking upon a complex linker scaffold. This erythrocruorin has evolved so that it allosterically binds oxygen with a high level of cooperativity. Though these two proteins appear on the outset to have relatively little in common, the unifying theme is that both proteins are excellent models for probing the mechanism of multimeric protein assembly.

HIV-1 Protease. Chapter II of the thesis defined the energy landscape for HIV-1

protease. To enhance the reversibility of the system, an inactive variant, HIV - PR D25N , was used to eliminate any complications that may be incurred by autoproteolytic degradation and ensure a complete thermodynamic characterization. The relevance of this inactive construct to wild-type HIV-1 protease was confirmed through a comparison of its equilibrium folding with a number of published stabilities for the equilibrium

138

folding of a variety of active HIV-1 protease constructs. A combination of equilibrium and kinetic studies were pursued on both the full-length dimer, as well as a monomer

∆96−99 construct, HIV - PR D25N , to provide a complete folding mechanism for HIV-1 protease.

A global analysis of the FL kinetic traces, coupled with kinetic information from the folding of the monomer construct, elucidated a sequential, four-state folding mechanism.

Furthermore, the thermodynamic values obtained from an independent analysis of the kinetics were in agreement with those from the equilibrium studies validating this proposed energy landscape. An enhanced understanding of in vitro HIV-1 protease folding could provide insights into mechanisms of catalytic turnover and drug resistance.

Additionally these studies may lead to alternatives to current inhibitor therapies.

L. terrestris erythrocruorin. Chapter III of this thesis extends beyond the idea of small (< 160 residues per subunit) multimers and examines the assembly issues of a macromolecule with a complex hierarchy. The erythrocruorin in L. terrestris is similar to other annelid hemoglobin molecules, with multiple hemoglobin subunits docked upon a large scaffolding complex. It is postulated that these hemoglobins evolved to accommodate allosteric binding in an extracellular environment (Weber and Vinogradov

2001; Royer et al. 2005). In particular, this chapter presented a study of the long coiled- coil region of the erythrocruorin scaffold. Based upon the results of a 5.5 Å electron density map, it was proposed that the heterogenous association of the long coiled-coil domain was the driving force in the assembly of the central scaffolding complex. Peptide constructs were generated and CD spectroscopy and Size Exclusion Chromatography

139

were used to assay their ability to oligomerize. Though modest helix formation was present, the results of this chapter show that coiled coils did not form in this isolated context. Coupled with new information on subunit interfaces from a 3.5 Å structure, it was apparent that a larger region of the linker chain may be necessary to induce complete structure formation. It is also the case that contacts between the linker head domains may be crucial in stabilizing this complex structure. Information pertaining to scaffolding proteins has nanotechnology applications and could help to stabilize therapeutic proteins.

In particular, details of annelid erythrocruorin assembly could lead to the development of universal blood substitutes.

Contribution to multimeric protein folding. The two studies presented in this thesis have added to the current knowledge of multimer folding. The well-defined energy landscape of HIV-1 protease, elucidated in Chapter II, has added to the limited folding knowledge of β-barrel dimers. The coiled coils explored in Chapter III provide an interesting perspective on a domain that has been of great interest to protein engineers.

Their modest, at best, assembly suggests the need for an improved technology to address the questions of hierarchical assembly. The examination of both of these systems has shown that there is still more to learn in the context of protein folding and that each protein experimental study provides more nuances to enhance the predictive powers of future algorithms.

140

Macromolecular crowding in the cell

While in vitro studies provide an isolated system in which to study a given protein of interest, they do not mimic the in vivo environment experienced by a protein within the cell. Within the last two decades, a number of studies have examined the role of macromolecular crowding in protein folding and association (Minton 2005; Ellis and

Minton 2006). Excluded volume theory suggests that the crowding in a cell will preferentially shift the equilibrium between folded and unfolded protein species so that the given protein excludes the least volume possible (Minton 1993). As a consequence, compact globular proteins are typically favored under these conditions because they exclude less volume than extended polypeptide chains. A major consideration of the crowded conditions within the cell is the limit in the ability of proteins to diffuse through solution. Thus, protein association reactions that are diffusion-controlled will decrease in their association rate. In contrast, association reactions with relatively slower in vitro rates will increase because association of subunits becomes favored to decrease the volume excluded by individual monomers.

The folding of lysozyme has been examined using polysaccharide crowding agents such as Ficoll and dextran to mimic the crowding of cytoskeletal environment in vitro (van den Berg et al. 1999; van den Berg et al. 2000). In conjunction with these crowding agents, proteins such as bovine serum albumin (BSA) and ovalbumin have been added as protein crowding agents. These studies have shown that protein crowding agents have a greater effect on the folding reaction than the polysaccharides and suggest that unfavorable protein-protein interactions may lead to decreased folding yields.

141

Additionally, these studies have confirmed the physiological role of protein disulfide isomerase as an effective chaperone in the crowded folding conditions of the cell.

Considerations of macromolecular crowding in these studies. In Chapter II of the thesis, association rates for HIV-1 protease were presented as part of the energy landscape. Additionally, the disputed range of dissociation constants (Kd) determined for the many variants of this protein were discussed. A major conclusion was that the global analysis of HIV-1 folding kinetics provided the best assay for the Kd of HIV-1 protease because it took the maturation of the monomer folding into consideration, rather than assuming a simple, two-state equilibrium between native folded dimer and unfolded monomer chains. Also, this Kd was in excellent agreement with the Kd measured by indirect inhibitor-binding methods (Darke et al. 1994).

However, Chapter II only considered the association reaction of HIV-1 protease in vitro. This analysis shows a Kd that is less than two orders of magnitude below the rate of diffusion control. Given the literature on macromolecular crowding, it is entirely possible that the Kd for HIV-1 protease within the virion is much higher than reported here. This putative increase in the Kd of HIV-1 protease has relevance to its disease potency because it has been reported that as the HIV-1 virus matures, the catalytic efficiency of HIV-1 protease changes (Szeltner and Polgar 1996). HIV-1 protease may tune its own activity via LeChatelier’s principle due to crowding conditions within the virion.

142

The idea of macromolecular crowding may or may not impact the hierarchical assembly of the erythrocruorin studied in chapter III. This hemoglobin functions in an extracellular environment, where macromolecular crowding issues are not a large consideration. However, the large assembly of this molecule suggests the need for chaperone-mediated folding. The mode of assembly for these large complexes will be elucidated as knowledge of in vivo folding within a variety of organisms increases and more assay methods become available.

143

Folding macromolecules within the cell

The studies presented in this dissertation focused on the interactions of purified proteins in controlled, aqueous, buffer solutions. However, in their biologically functional forms proteins do not exist in an isolated environment, but are susceptible to the interactions of other macromolecules surrounding them. In an effort to understand the lessons of in vitro protein folding it is necessary to consider the mechanisms that occur intracellularly.

A common argument to in vitro protein folding studies is that the N-terminal sequence is released from the ribosome by a vectorial process. The logic then follows that the N-terminal domain of the protein may begin to fold prior to the C-terminus.

However, a number of proteins have a more compact C-terminus that contradicts the vectorial folding argument.(Laio and Micheletti 2006) In the case of both ribonuclease (Neira and Rico 1997) and GCN4-p1 (Zitzewitz et al. 2000) the C- terminal sequence is a prerequisite for initiation of folding. These findings can be contrasted with the folding of the α-subunit of tryptophan synthase, which has an N- terminal requirement for folding (Zitzewitz and Matthews 1999). Moreover, many small single-domain proteins are proposed to fold similary in vitro and in vivo

(Daggett and Fersht 2003). Since single-domain proteins can serve as models of larger proteins, the in vitro folding mechanisms may provide insights to their cellular folding energy landscape as well.

144

Following protein synthesis along the ribosome, proteins are either directed to the endoplasmic reticulum or the cytoplasm to undergo folding. In the cytoplasm, proteins encounter a host of chaperones that aid in the folding process (Hartl and

Martin 1995; Fink 1999). Additionally, there are other chaperones involved in degrading misfolded and/or aggregated proteins. In concert, these chaperones maintain a protein homeostasis in the cell (Morimoto 2006). Though these protein chaperones alter the way in which a protein folds within the cell, the overall thermodynamics controlling protein folding do not change (Bukau et al. 2006). In fact, it is supposed that chaperones play a role in partitioning the folding kinetics between functional protein and aggregation that is not evident in the cell. Thus, in vitro study of protein folding remains a valid assessment of the thermodynamics that occur in the protein folding energy landscape (Jaenicke 1998).

While it is fully acknowledged that the in vitro system removes proteins from their biological context, this approach provides a view of protein folding that is not as clearly analyzed in the cell. Moreover, as illustrated here, even purified multimeric proteins in solution present a variety of challenges in their analysis. These two studies demonstrate that fragmentation of a multimeric protein can elucidate details of its folding mechanism that are not easily probed in the whole protein. Thus, examining the assembly of both the whole protein and its fragmented subunits provides a more complete picture of how a protein achieves its fold. By the same comparison, in vitro protein folding enables a simplified system in which to probe the behavior a specific protein ensemble. Building upon this knowledge, in vivo protein

145

folding studies can further investigate the basic folding principles within the cell. In conclusion, increased knowledge of in vitro multimeric protein folding mechanisms will serve to enhance and direct the understanding of in vivo protein folding and protein-protein interactions.

146

Bibliography

Abel, C. J., R. A. Goldbeck, R. F. Latypov, H. Roder and D. S. Kliger (2007). "Conformational equilibration time of unfolded protein chains and the folding speed limit." Biochemistry 46(13): 4090-9.

Agashe, V. R., F. X. Schmid and J. B. Udgaonkar (1997). "Thermodynamics of the complex protein unfolding reaction of barstar." Biochemistry 36(40): 12288-95.

Agashe, V. R., M. C. Shastry and J. B. Udgaonkar (1995). "Initial hydrophobic collapse in the folding of barstar." Nature 377(6551): 754-7.

Anfinsen, C. B. (1973). "Principles that govern the folding of protein chains." Science 181(96): 223-30.

Anfinsen, C. B., E. Haber, M. Sela and F. H. White, Jr. (1961). "The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain." Proc Natl Acad Sci U S A 47: 1309-14.

Anfinsen, C. B. and H. A. Scheraga (1975). "Experimental and theoretical aspects of protein folding." Adv Protein Chem 29: 205-300.

Arai, M., E. Kondrashkina, C. Kayatekin, C. R. Matthews, M. Iwakura and O. Bilsel (2007). "Microsecond hydrophobic collapse in the folding of Escherichia coli dihydrofolate reductase, an alpha/beta-type protein." J Mol Biol 368(1): 219-29.

Arora, P., T. G. Oas and J. K. Myers (2004). "Fast and faster: a designed variant of the B-domain of protein A folds in 3 microsec." Protein Sci 13(4): 847-53.

Babe, L. M., J. Rose and C. S. Craik (1992). "Synthetic "interface" peptides alter dimeric assembly of the HIV 1 and 2 proteases." Protein Sci 1(10): 1244-53.

Bai, Y., T. R. Sosnick, L. Mayne and S. W. Englander (1995). "Protein folding intermediates: native-state hydrogen exchange." Science 269(5221): 192-7.

Baker, D., J. L. Silen and D. A. Agard (1992a). "Protease pro region required for folding is a potent inhibitor of the mature enzyme." Proteins 12(4): 339-44.

Baker, D., J. L. Sohl and D. A. Agard (1992b). "A protein-folding reaction under kinetic control." Nature 356(6366): 263-5.

Baldwin, R. L. (1975). "Intermediates in protein folding reactions and the mechanism of protein folding." Annu Rev Biochem 44: 453-75.

147

Baldwin, R. L. (1995). "The nature of protein folding pathways: the classical versus the new view." J Biomol NMR 5(2): 103-9.

Ballew, R. M., J. Sabelko and M. Gruebele (1996). "Direct observation of fast protein folding: the initial collapse of apomyoglobin." Proc Natl Acad Sci U S A 93(12): 5759-64.

Banci, L., M. Benedetto, I. Bertini, R. Del Conte, M. Piccioli and M. S. Viezzoli (1998). "Solution structure of reduced monomeric Q133M2 copper, zinc superoxide dismutase (SOD). Why is SOD a dimeric enzyme?" Biochemistry 37(34): 11780-91.

Banci, L., I. Bertini, C. Y. Chiu, G. T. Mullenbach and M. S. Viezzoli (1995). "Synthesis and characterization of a monomeric mutant Cu/Zn superoxide dismutase with partially reconstituted enzymic activity." Eur J Biochem 234(3): 855-60.

Baneyx, F. (1999). "Recombinant protein expression in Escherichia coli." Curr Opin Biotechnol 10(5): 411-21.

Banks, D. D. and L. M. Gloss (2003). "Equilibrium folding of the core histones: the H3-H4 tetramer is less stable than the H2A-H2B dimer." Biochemistry 42(22): 6827- 39.

Banks, D. D. and L. M. Gloss (2004). "Folding mechanism of the (H3-H4)2 histone tetramer of the core nucleosome." Protein Sci 13(5): 1304-16.

Barry, J. K. and K. S. Matthews (1999). "Thermodynamic analysis of unfolding and dissociation in lactose repressor protein." Biochemistry 38(20): 6520-8.

Baskakov, I. and D. W. Bolen (1998). "Forcing thermodynamically unfolded proteins to fold." J Biol Chem 273(9): 4831-4.

Benesch, R. and R. E. Benesch (1967). "The effect of organic phosphates from the human erythrocyte on the allosteric properties of hemoglobin." Biochem Biophys Res Commun 26(2): 162-7.

Bentz, J. (2000). "Membrane fusion mediated by coiled coils: a hypothesis." Biophys J 78(2): 886-900.

Berg, P. and M. F. Singer (1995). "The recombinant DNA controversy: twenty years later." Proc Natl Acad Sci U S A 92(20): 9011-3.

Bernasconi, C. F. (1976). Relaxation Kinetics. New York, Academic Press, Inc. .

148

Bezsonova, I., D. M. Korzhnev, R. S. Prosser, J. D. Forman-Kay and L. E. Kay (2006). "Hydration and packing along the folding pathway of SH3 domains by pressure-dependent NMR." Biochemistry 45(15): 4711-9.

Bhattacharyya, R. P. and T. R. Sosnick (1999). "Viscosity dependence of the folding kinetics of a dimeric and monomeric coiled coil." Biochemistry 38(8): 2601-9.

Bilsel, O. and C. R. Matthews (2006). "Molecular dimensions and their distributions in early folding intermediates." Curr Opin Struct Biol 16(1): 86-93.

Bilsel, O., L. Yang, J. A. Zitzewitz, J. M. Beechem and C. R. Matthews (1999a). "Time-resolved fluorescence anisotropy study of the refolding reaction of the alpha- subunit of tryptophan synthase reveals nonmonotonic behavior of the rotational correlation time." Biochemistry 38(13): 4177-87.

Bilsel, O., J. A. Zitzewitz, K. E. Bowers and C. R. Matthews (1999b). "Folding mechanism of the alpha-subunit of tryptophan synthase, an alpha/beta barrel protein: global analysis highlights the interconversion of multiple native, intermediate, and unfolded forms through parallel channels." Biochemistry 38(3): 1018-29.

Bowler, B. E. (2007). "Thermodynamics of protein denatured states." Mol Biosyst 3(2): 88-99.

Brange, J., D. R. Owens, S. Kang and A. Volund (1990). "Monomeric insulins and their experimental and clinical implications." Diabetes Care 13(9): 923-54.

Broglia, R. A., G. Tiana, L. Sutto, D. Provasi and F. Simona (2005). "Design of HIV- 1-PR inhibitors that do not create resistance: blocking the folding of single monomers." Protein Sci 14(10): 2668-81.

Brown, J. H., C. Cohen and D. A. Parry (1996). "Heptad breaks in alpha-helical coiled coils: stutters and stammers." Proteins 26(2): 134-45.

Bukau, B., J. Weissman and A. Horwich (2006). "Molecular chaperones and protein quality control." Cell 125(3): 443-51.

Bunagan, M. R., L. Cristian, W. F. DeGrado and F. Gai (2006a). "Truncation of a cross-linked GCN4-p1 coiled coil leads to ultrafast folding." Biochemistry 45(36): 10981-6.

149

Bunagan, M. R., X. Yang, J. G. Saven and F. Gai (2006b). "Ultrafast folding of a computationally designed Trp-cage mutant: Trp2-cage." J Phys Chem B 110(8): 3759-63.

Burkhard, P., J. Stetefeld and S. V. Strelkov (2001). "Coiled coils: a highly versatile protein folding motif." Trends Cell Biol 11(2): 82-8.

Burns, L. L., P. M. Dalessio and I. J. Ropson (1998). "Folding mechanism of three structurally similar beta-sheet proteins." Proteins 33(1): 107-18.

Burns, L. L. and I. J. Ropson (2001). "Folding of intracellular retinol and retinoic acid binding proteins." Proteins 43(3): 292-302.

Bussiere, D. E., X. Kong, D. A. Egan, K. Walter, T. F. Holzman, F. Lindh, T. Robins and V. L. Giranda (1998). "Structure of the E2 DNA-binding domain from human papillomavirus serotype 31 at 2.4 A." Acta Crystallogr D Biol Crystallogr 54(Pt 6 Pt 2): 1367-76.

Capaldi, A. P., C. Kleanthous and S. E. Radford (2002). "Im7 folding mechanism: misfolding on a path to the native state." Nat Struct Biol 9(3): 209-16.

Capaldi, A. P. and S. E. Radford (1998). "Kinetic studies of beta-sheet protein folding." Curr Opin Struct Biol 8(1): 86-92.

Capaldi, A. P., M. C. Shastry, C. Kleanthous, H. Roder and S. E. Radford (2001). "Ultrarapid mixing experiments reveal that Im7 folds via an on-pathway intermediate." Nat Struct Biol 8(1): 68-72.

Carlsson, U. and B. H. Jonsson (1995). "Folding of beta-sheet proteins." Curr Opin Struct Biol 5(4): 482-7.

Cassell, D. J., S. Choudhri, R. Humphrey, R. E. Martell, T. Reynolds and A. B. Shanafelt (2002). "Therapeutic enhancement of IL-2 through molecular design." Curr Pharm Des 8(24): 2171-83.

Chen, Z., Y. Li, E. Chen, D. L. Hall, P. L. Darke, C. Culberson, J. A. Shafer and L. C. Kuo (1994). "Crystal structure at 1.9-A resolution of human immunodeficiency virus (HIV) II protease complexed with L-735,524, an orally bioavailable inhibitor of the HIV proteases." J Biol Chem 269(42): 26344-8.

Chothia, C. (1975). "Structural invariants in protein folding." Nature 254(5498): 304-8.

150

Chou, K. (1996). "Prediction of human immunodeficiency virus protease cleavage sites in proteins." Anal Biochem 233: 1-14.

Clark, P. L., Z. P. Liu, J. Zhang and L. M. Gierasch (1996). "Intrinsic tryptophans of CRABPI as probes of structure and folding." Protein Sci 5(6): 1108-17.

Clark, P. L., B. F. Weston and L. M. Gierasch (1998). "Probing the folding pathway of a beta-clam protein with single-tryptophan constructs." Fold Des 3(5): 401-12.

Clarke, J., E. Cota, S. B. Fowler and S. J. Hamill (1999). "Folding studies of immunoglobulin-like beta-sandwich proteins suggest that they share a common folding pathway." Structure 7(9): 1145-53.

Covalt, J. C., Jr., M. Roy and P. A. Jennings (2001). "Core and surface mutations affect folding kinetics, stability and cooperativity in IL-1 beta: does alteration in buried water play a role?" J Mol Biol 307(2): 657-69.

Crick, F. H. (1952). "Is alpha-keratin a coiled coil?" Nature 170(4334): 882-3.

Crick, F. H. (1958). "On protein synthesis." Symp Soc Exp Biol 12: 138-63. d'Avignon, D. A., G. L. Bretthorst, M. E. Holtzer, K. A. Schwarz, R. H. Angeletti, L. Mints and A. Holtzer (2006). "Site-specific experiments on folding/unfolding of Jun coiled coils: thermodynamic and kinetic parameters from spin inversion transfer nuclear magnetic resonance at leucine-18." Biopolymers 83(3): 255-67.

Daggett, V. and A. Fersht (2003). "The present view of the mechanism of protein folding." Nat Rev Mol Cell Biol 4(6): 497-502.

Dale, G. E., C. Oefner and A. D'Arcy (2003). "The protein as a variable in protein crystallization." J Struct Biol 142(1): 88-97.

Dalessio, P. M. and I. J. Ropson (2000). "Beta-sheet proteins with nearly identical structures have different folding intermediates." Biochemistry 39(5): 860-71.

Darke, P. L., S. P. Jordan, D. L. Hall, J. A. Zugay, J. A. Shafer and L. C. Kuo (1994). "Dissociation and association of the HIV-1 protease dimer subunits: equilibria and rates." Biochemistry 33(1): 98-105.

de Prat-Gay, G., A. D. Nadra, F. J. Corrales-Izquierdo, L. G. Alonso, D. U. Ferreiro and Y. K. Mok (2005). "The folding mechanism of a dimeric beta-barrel domain." J Mol Biol 351(3): 672-82.

151

Derewenda, Z. S. (2004). "The use of recombinant methods and molecular engineering in protein crystallization." Methods 34(3): 354-63.

Dobson, C. M. (2001). "The structural basis of protein folding and its links with human disease." Philos Trans R Soc Lond B Biol Sci 356(1406): 133-45.

Dobson, C. M. (2002). "Getting out of shape." Nature 418(6899): 729-30.

Dobson, C. M. (2004). "Experimental investigation of protein folding and misfolding." Methods 34(1): 4-14.

Dobson, C. M., P. A. Evans and S. E. Radford (1994). "Understanding how proteins fold: the lysozyme story so far." Trends Biochem Sci 19(1): 31-7.

Dorman, S. C., C. F. Kenny, L. Miller, R. E. Hirsch and J. P. Harrington (2002). "Role of redox potential of hemoglobin-based oxygen carriers on methemoglobin reduction by plasma components." Artif Cells Blood Substit Immobil Biotechnol 30(1): 39-51.

Doyle, S. M., O. Bilsel and C. M. Teschke (2004). "SecA folding kinetics: a large dimeric protein rapidly forms multiple native states." J Mol Biol 341(1): 199-214.

Dyson, H. J. and P. E. Wright (1996). "Insights into protein folding from NMR." Annu Rev Phys Chem 47: 369-95.

Dyson, H. J. and P. E. Wright (2004). "Unfolded proteins and protein folding studied by NMR." Chem Rev 104(8): 3607-22.

Dyson, H. J. and P. E. Wright (2005). "Elucidation of the protein folding landscape by NMR." Methods Enzymol 394: 299-321.

Dyson, H. J., P. E. Wright and H. A. Scheraga (2006). "The role of hydrophobic interactions in initiation and propagation of protein folding." Proc Natl Acad Sci U S A 103(35): 13057-61.

Eckert, D. M., V. N. Malashkevich and P. S. Kim (1998). "Crystal structure of GCN4-pIQI, a trimeric coiled coil with buried polar residues." J Mol Biol 284(4): 859-65.

Eder, J., M. Rheinnecker and A. R. Fersht (1993). "Folding of subtilisin BPN': role of the pro-sequence." J Mol Biol 233(2): 293-304.

152

Eliezer, D., P. A. Jennings, H. J. Dyson and P. E. Wright (1997). "Populating the equilibrium molten globule state of apomyoglobin under conditions suitable for structural characterization by NMR." FEBS Lett 417(1): 92-6.

Eliezer, D., P. A. Jennings, P. E. Wright, S. Doniach, K. O. Hodgson and H. Tsuruta (1995). "The radius of gyration of an apomyoglobin folding intermediate." Science 270(5235): 487-8.

Eliezer, D., J. Yao, H. J. Dyson and P. E. Wright (1998). "Structural and dynamic characterization of partially folded states of apomyoglobin and implications for protein folding." Nat Struct Biol 5(2): 148-55.

Ellis, R. J. and A. P. Minton (2006). "Protein aggregation in crowded environments." Biol Chem 387(5): 485-97.

Eyles, S. J., T. Dresch, L. M. Gierasch and I. A. Kaltashov (1999). "Unfolding dynamics of a beta-sheet protein studied by mass spectrometry." J Mass Spectrom 34(12): 1289-95.

Eyles, S. J. and I. A. Kaltashov (2004). "Methods to study protein dynamics and folding by mass spectrometry." Methods 34(1): 88-99.

Fairlie, D. P., J. D. Tyndall, R. C. Reid, A. K. Wong, G. Abbenante, M. J. Scanlon, D. R. March, D. A. Bergman, C. L. Chai and B. A. Burkett (2000). "Conformational selection of inhibitors and substrates by proteolytic enzymes: implications for drug design and polypeptide processing." J Med Chem 43(7): 1271-81.

Feige, M. J., S. Walter and J. Buchner (2004). "Folding mechanism of the CH2 antibody domain." J Mol Biol 344(1): 107-18.

Ferguson, N., A. P. Capaldi, R. James, C. Kleanthous and S. E. Radford (1999). "Rapid folding with and without populated intermediates in the homologous four- helix proteins Im7 and Im9." J Mol Biol 286(5): 1597-608.

Ferguson, N., W. Li, A. P. Capaldi, C. Kleanthous and S. E. Radford (2001). "Using chimeric immunity proteins to explore the energy landscape for alpha-helical protein folding." J Mol Biol 307(1): 393-405.

Fersht, A. R., M. Bycroft, A. Horovitz, J. T. Kellis, Jr., A. Matouschek and L. Serrano (1991). "Pathway and stability of protein folding." Philos Trans R Soc Lond B Biol Sci 332(1263): 171-6.

Fields, G. B. and R. L. Noble (1990). "Solid phase peptide synthesis utilizing 9- fluorenylmethoxycarbonyl amino acids." Int J Pept Protein Res 35(3): 161-214.

153

Fink, A. L. (1999). "Chaperone-mediated protein folding." Physiol Rev 79(2): 425- 49.

Finke, J. M. and P. A. Jennings (2002). "Interleukin-1 beta folding between pH 5 and 7: experimental evidence for three-state folding behavior and robust transition state positions late in folding." Biochemistry 41(50): 15056-67.

Fire, A., S. Xu, M. K. Montgomery, S. A. Kostas, S. E. Driver and C. C. Mello (1998). "Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans." Nature 391(6669): 806-11.

Flaugh, S. L., M. S. Kosinski-Collins and J. King (2005a). "Contributions of hydrophobic domain interface interactions to the folding and stability of human gammaD-crystallin." Protein Sci 14(3): 569-81.

Flaugh, S. L., M. S. Kosinski-Collins and J. King (2005b). "Interdomain side-chain interactions in human gammaD crystallin influencing folding and stability." Protein Sci 14(8): 2030-43.

Forsyth, W. R., O. Bilsel, Z. Gu and C. R. Matthews (2007). "Topology and Sequence in the Folding of a TIM Barrel Protein: Global Analysis Highlights Partitioning between Transient Off-pathway and Stable On-pathway Folding Intermediates in the Complex Folding Mechanism of a (betaalpha)(8) Barrel of Unknown Function from B. subtilis." J Mol Biol.

Foulkes-Murzycki, J. E., W. R. Scott and C. A. Schiffer (2007). "Hydrophobic sliding: a possible mechanism for drug resistance in human immunodeficiency virus type 1 protease." Structure 15(2): 225-33.

Fowler, S. B. and J. Clarke (2001). "Mapping the folding pathway of an : structural detail from Phi value analysis and movement of the transition state." Structure 9(5): 355-66.

Friel, C. T., A. P. Capaldi and S. E. Radford (2003). "Structural analysis of the rate- limiting transition states in the folding of Im7 and Im9: similarities and differences in the folding of homologous proteins." J Mol Biol 326(1): 293-305.

Frokjaer, S. and D. E. Otzen (2005). "Protein drug stability: a formulation challenge." Nat Rev Drug Discov 4(4): 298-306.

Gai, F., D. Du and Y. Xu (2007). "Infrared temperature-jump study of the folding dynamics of alpha-helices and beta-hairpins." Methods Mol Biol 350: 1-20.

154

Garcia-Mira, M. M., D. Boehringer and F. X. Schmid (2004). "The folding transition state of the cold shock protein is strongly polarized." J Mol Biol 339(3): 555-69.

Garvey, E. P. and C. R. Matthews (1990). "Site-directed mutagenesis and its application to protein folding." Biotechnology 14: 37-63.

Gegg, C. V., K. E. Bowers and C. R. Matthews (1997). "Probing minimal independent folding units in dihydrofolate reductase by molecular dissection." Protein Sci 6(9): 1885-92.

Geierhaas, C. D., R. B. Best, E. Paci, M. Vendruscolo and J. Clarke (2006). "Structural comparison of the two alternative transition states for folding of TI I27." Biophys J 91(1): 263-75.

Geierhaas, C. D., E. Paci, M. Vendruscolo and J. Clarke (2004). "Comparison of the transition states for folding of two Ig-like proteins from different superfamilies." J Mol Biol 343(4): 1111-23.

Georgiou, G. (1995). "Recombinant DNA technology." Trends Biotechnol 13(3): 79- 80.

Ghaemmaghami, S., J. M. Word, R. E. Burton, J. S. Richardson and T. G. Oas (1998). "Folding kinetics of a fluorescent variant of monomeric lambda repressor." Biochemistry 37(25): 9179-85.

Gibbs, A. C., T. C. Bjorndahl, R. S. Hodges and D. S. Wishart (2002). "Probing the structural determinants of type II' beta-turn formation in peptides and proteins." J Am Chem Soc 124(7): 1203-13.

Gill, S. C. and P. H. von Hippel (1989). "Calculation of protein extinction coefficients from amino acid sequence data." Anal Biochem 182(2): 319-26.

Gilmanshin, R., R. H. Callender and R. B. Dyer (1998). "The core of apomyoglobin E-form folds at the diffusion limit." Nat Struct Biol 5(5): 363-5.

Gilmanshin, R., S. Williams, R. H. Callender, W. H. Woodruff and R. B. Dyer (1997a). "Fast events in protein folding: relaxation dynamics and structure of the I form of apomyoglobin." Biochemistry 36(48): 15006-12.

Gilmanshin, R., S. Williams, R. H. Callender, W. H. Woodruff and R. B. Dyer (1997b). "Fast events in protein folding: relaxation dynamics of secondary and tertiary structure in native apomyoglobin." Proc Natl Acad Sci U S A 94(8): 3709- 13.

155

Gloss, L. M. (2007). "Tying the knot that binds." Structure 15(1): 2-4.

Gloss, L. M. and C. R. Matthews (1997). "Urea and thermal equilibrium denaturation studies on the dimerization domain of Escherichia coli Trp repressor." Biochemistry 36(19): 5612-23.

Gloss, L. M. and C. R. Matthews (1998a). "The barriers in the bimolecular and unimolecular folding reactions of the dimeric core domain of Escherichia coli Trp repressor are dominated by enthalpic contributions." Biochemistry 37(45): 16000- 10.

Gloss, L. M. and C. R. Matthews (1998b). "Mechanism of folding of the dimeric core domain of Escherichia coli trp repressor: a nearly diffusion-limited reaction leads to the formation of an on-pathway dimeric intermediate." Biochemistry 37(45): 15990-9.

Gloss, L. M. and B. J. Placek (2002). "The effect of salts on the stability of the H2A- H2B histone dimer." Biochemistry 41(50): 14951-9.

Gloss, L. M., B. R. Simler and C. R. Matthews (2001). "Rough energy landscapes in protein folding: dimeric E. coli Trp repressor folds through three parallel channels." J Mol Biol 312(5): 1121-34.

Gonzalez, L., Jr., R. A. Brown, D. Richardson and T. Alber (1996a). "Crystal structures of a single coiled-coil peptide in two oligomeric states reveal the basis for structural polymorphism." Nat Struct Biol 3(12): 1002-9.

Gonzalez, L., Jr., J. J. Plecs and T. Alber (1996b). "An engineered allosteric switch in leucine-zipper oligomerization." Nat Struct Biol 3(6): 510-5.

Gonzalez, L., Jr., D. N. Woolfson and T. Alber (1996c). "Buried polar residues and structural specificity in the GCN4 leucine zipper." Nat Struct Biol 3(12): 1011-8.

Gorski, S. A., A. P. Capaldi, C. Kleanthous and S. E. Radford (2001). "Acidic conditions stabilise intermediates populated during the folding of Im7 and Im9." J Mol Biol 312(4): 849-63.

Gorski, S. A., C. S. Le Duff, A. P. Capaldi, A. P. Kalverda, G. S. Beddard, G. R. Moore and S. E. Radford (2004). "Equilibrium hydrogen exchange reveals extensive hydrogen bonded secondary structure in the on-pathway intermediate of Im7." J Mol Biol 337(1): 183-93.

156

Gosavi, S., L. L. Chavez, P. A. Jennings and J. N. Onuchic (2006). "Topological frustration and the folding of interleukin-1beta." J Mol Biol 357(3): 986-96.

Govindarajan, S. and R. A. Goldstein (1998). "On the thermodynamic hypothesis of protein folding." Proc Natl Acad Sci U S A 95(10): 5545-9.

Grant, S. K., I. C. Deckman, J. S. Culp, M. D. Minnich, I. S. Brooks, P. Hensley, C. Debouck and T. D. Meek (1992). "Use of protein unfolding studies to determine the conformational and dimeric stabilities of HIV-1 and SIV proteases." Biochemistry 31(39): 9491-501.

Grantcharova, V. P. and D. Baker (1997). "Folding dynamics of the src SH3 domain." Biochemistry 36(50): 15685-92.

Grantcharova, V. P., D. S. Riddle and D. Baker (2000). "Long-range order in the src SH3 folding transition state." Proc Natl Acad Sci U S A 97(13): 7084-9.

Greenfield, N. J. (2004). "Analysis of circular dichroism data." Methods Enzymol 383: 282-317.

Gu, Z., J. A. Zitzewitz and C. R. Matthews (2007). "Mapping the structure of folding cores in TIM barrel proteins by hydrogen exchange mass spectrometry: the roles of motif and sequence for the indole-3-glycerol phosphate synthase from Sulfolobus solfataricus." J Mol Biol 368(2): 582-94.

Gualfetti, P. J., O. Bilsel and C. R. Matthews (1999). "The progressive development of structure and stability during the equilibrium folding of the alpha subunit of tryptophan synthase from Escherichia coli." Protein Sci 8(8): 1623-35.

Gulnik, S., J. W. Erickson and D. Xie (2000). "HIV protease: enzyme function and drug resistance." Vitam Horm 58: 213-56.

Gulotta, M., R. Gilmanshin, T. C. Buscher, R. H. Callender and R. B. Dyer (2001). "Core formation in apomyoglobin: probing the upper reaches of the folding energy landscape." Biochemistry 40(17): 5137-43.

Gunasekaran, K., S. J. Eyles, A. T. Hagler and L. M. Gierasch (2001). "Keeping it in the family: folding studies of related proteins." Curr Opin Struct Biol 11(1): 83- 93.

Guruprasad, K., V. Dhanaraj, M. Groves and T. L. Blundell (1995). "Aspartic proteinases: The structures and functions of a versatile superfamily of enzymes." Perspectives in Drug Discovery and Design 2(3): 329-41.

157

Hagen, S. J. and W. A. Eaton (2000). "Two-state expansion and collapse of a polypeptide." J Mol Biol 301(4): 1019-27.

Harbury, P. B., T. Zhang, P. S. Kim and T. Alber (1993). "A switch between two-, three-, and four-stranded coiled coils in GCN4 leucine zipper mutants." Science 262(5138): 1401-7.

Hardin, C., T. V. Pogorelov and Z. Luthey-Schulten (2002). "Ab initio protein structure prediction." Curr Opin Struct Biol 12(2): 176-81.

Harrington, J. P., S. Kobayashi, S. C. Dorman, S. L. Zito and R. E. Hirsch (2007). "Acellular invertebrate hemoglobins as model therapeutic oxygen carriers: unique redox potentials." Artif Cells Blood Substit Immobil Biotechnol 35(1): 53-67.

Hartl, F. U. and J. Martin (1995). "Molecular chaperones in cellular protein folding." Curr Opin Struct Biol 5(1): 92-102.

Heidary, D. K., L. A. Gross, M. Roy and P. A. Jennings (1997). "Evidence for an obligatory intermediate in the folding of interleukin-1 beta." Nat Struct Biol 4(9): 725-31.

Heidary, D. K. and P. A. Jennings (2002). "Three topologically equivalent core residues affect the transition state ensemble in a protein folding reaction." J Mol Biol 316(3): 789-98.

Heidary, D. K., M. Roy, G. O. Daumy, Y. Cong and P. A. Jennings (2005). "Long- range coupling between separate docking sites in interleukin-1beta." J Mol Biol 353(5): 1187-98.

Hendrickson, W. A. (1991). "Determination of macromolecular structures from anomalous diffraction of synchrotron radiation." Science 254(5028): 51-8.

Hendrickson, W. A. (2000). "Synchrotron crystallography." Trends Biochem Sci 25(12): 637-43.

Hennig, W. (2004). "The revolution of the biology of the genome." Cell Res 14(1): 1- 7.

Henry, E. R. and J. Hofrichter (1992). "Singular value decomposition: Application to analysis of experimental data." Methods in Enzymology 210: 129-192.

Hirsch, R. E., L. A. Jelicks, B. A. Wittenberg, D. K. Kaul, H. L. Shear and J. P. Harrington (1997). "A first evaluation of the natural high molecular weight

158

polymeric Lumbricus terrestris hemoglobin as an oxygen carrier." Artif Cells Blood Substit Immobil Biotechnol 25(5): 429-44.

Honig, B. (1999). "Protein folding: from the levinthal paradox to structure prediction." J Mol Biol 293(2): 283-93.

Huang, G. S. and T. G. Oas (1995). "Structure and stability of monomeric lambda repressor: NMR evidence for two-state folding." Biochemistry 34(12): 3884-92.

Ibarra-Molero, B., J. A. Zitzewitz and C. R. Matthews (2004). "Salt-bridges can stabilize but do not accelerate the folding of the homodimeric coiled-coil peptide GCN4-p1." J Mol Biol 336(5): 989-96.

Ignatova, Z. and L. M. Gierasch (2005). "Aggregation of a slow-folding mutant of a beta-clam protein proceeds through a monomeric nucleus." Biochemistry 44(19): 7266-74.

Ishima, R., R. Ghirlando, J. Tozser, A. M. Gronenborn, D. A. Torchia and J. M. Louis (2001). "Folded monomer of HIV-1 protease." J Biol Chem 276(52): 49110-6.

Ishima, R., D. A. Torchia, S. M. Lynch, A. M. Gronenborn and J. M. Louis (2003). "Solution structure of the mature HIV-1 protease monomer: insight into the tertiary fold and stability of a precursor." J Biol Chem 278(44): 43311-9.

Ivankov, D. N., S. O. Garbuzynskiy, E. Alm, K. W. Plaxco, D. Baker and A. V. Finkelstein (2003). "Contact order revisited: influence of protein size on the folding rate." Protein Sci 12(9): 2057-62.

Iwakura, M., B. E. Jones, C. J. Falzone and C. R. Matthews (1993). "Collapse of parallel folding channels in dihydrofolate reductase from Escherichia coli by site- directed mutagenesis." Biochemistry 32(49): 13566-74.

Iwakura, M., B. E. Jones, J. Luo and C. R. Matthews (1995). "A strategy for testing the suitability of cysteine replacements in dihydrofolate reductase from Escherichia coli." J Biochem (Tokyo) 117(3): 480-8.

Jaenicke, R. (1998). "Protein self-organization in vitro and in vivo: partitioning between physical biochemistry and cell biology." Biol Chem 379(3): 237-43.

Jaenicke, R. and H. Lilie (2000). "Folding and association of oligomeric and multimeric proteins." Adv Protein Chem 53: 329-401.

Jang do, S., H. J. Lee, B. Lee, B. H. Hong, H. J. Cha, J. Yoon, K. Lim, Y. J. Yoon, J. Kim, M. Ree, H. C. Lee and K. Y. Choi (2006). "Detection of an intermediate during

159

the unfolding process of the dimeric ketosteroid isomerase." FEBS Lett 580(17): 4166-71.

Jennings, P., M. Roy, D. Heidary and L. Gross (1998). "Folding pathway of interleukin-1 beta." Nat Struct Biol 5(1): 11.

Jennings, P. A., B. E. Finn, B. E. Jones and C. R. Matthews (1993). "A reexamination of the folding mechanism of dihydrofolate reductase from Escherichia coli: verification and refinement of a four-channel model." Biochemistry 32(14): 3783-9.

Jennings, P. A., S. M. Saalau-Bethell, B. E. Finn, X. W. Chen and C. R. Matthews (1991). "Mutational analysis of protein folding mechanisms." Methods Enzymol 202: 113-26.

Jennings, P. A. and P. E. Wright (1993). "Formation of a molten globule intermediate early in the kinetic folding pathway of apomyoglobin." Science 262(5135): 892-6.

John, M., J. P. Briand, M. Granger-Schnarr and M. Schnarr (1994). "Two pairs of oppositely charged amino acids from Jun and Fos confer heterodimerization to GCN4 leucine zipper." J Biol Chem 269(23): 16247-53.

Johnston, M. I., H. S. Allaudeen and N. Sarver (1989). "HIV proteinase as a target for drug action." Trends Pharmacol Sci 10(8): 305-7.

Jones, B. E., J. M. Beechem and C. R. Matthews (1995). "Local and global dynamics during the folding of Escherichia coli dihydrofolate reductase by time-resolved fluorescence spectroscopy." Biochemistry 34(6): 1867-77.

Jones, B. E., P. A. Jennings, R. A. Pierre and C. R. Matthews (1994). "Development of nonpolar surfaces in the folding of Escherichia coli dihydrofolate reductase detected by 1-anilinonaphthalene-8-sulfonate binding." Biochemistry 33(51): 15250- 8.

Jones, B. E. and C. R. Matthews (1995). "Early intermediates in the folding of dihydrofolate reductase from Escherichia coli detected by hydrogen exchange and NMR." Protein Sci 4(2): 167-77.

Kabsch, W. and C. Sander (1983). "How good are predictions of protein secondary structure?" FEBS Lett 155(2): 179-82.

Kao, W. Y., J. Qin, K. Fushitani, S. S. Smith, T. A. Gorr, C. K. Riggs, J. E. Knapp, B. T. Chait and A. F. Riggs (2006). "Linker chains of the gigantic hemoglobin of the

160

earthworm Lumbricus terrestris: primary structures of linkers L2, L3, and L4 and analysis of the connectivity of the disulfide bonds in linker L1." Proteins 63(1): 174- 87.

Kathir, K. M., T. K. Kumar, D. Rajalingam and C. Yu (2005). "Time-dependent changes in the denatured state(s) influence the folding mechanism of an all beta- sheet protein." J Biol Chem 280(33): 29682-8.

Katoh, E., J. M. Louis, T. Yamazaki, A. M. Gronenborn, D. A. Torchia and R. Ishima (2003). "A solution NMR study of the binding kinetics and the internal dynamics of an HIV-1 protease-substrate complex." Protein Sci 12(7): 1376-85.

Kemmink, J. and T. E. Creighton (1993). "Local conformations of peptides representing the entire sequence of bovine pancreatic trypsin inhibitor and their roles in folding." J Mol Biol 234(3): 861-78.

Kentsis, A. and T. R. Sosnick (1998). "Trifluoroethanol promotes helix formation by destabilizing backbone exposure: desolvation rather than native hydrogen bonding defines the kinetic pathway of dimeric coiled coil folding." Biochemistry 37(41): 14613-22.

Kiehna, S. E. and M. L. Waters (2003). "Sequence dependence of beta-hairpin structure: comparison of a salt bridge and an aromatic interaction." Protein Sci 12(12): 2657-67.

King, D. S., C. G. Fields and G. B. Fields (1990). "A cleavage method which minimizes side reactions following Fmoc solid phase peptide synthesis." Int J Pept Protein Res 36(3): 255-66.

King, N. M., L. Melnick, M. Prabu-Jeyabalan, E. A. Nalivaika, S. S. Yang, Y. Gao, X. Nie, C. Zepp, D. L. Heefner and C. A. Schiffer (2002). "Lack of synergy for inhibitors targeting a multi-drug-resistant HIV-1 protease." Protein Sci 11(2): 418- 29.

King, N. M., M. Prabu-Jeyabalan, E. A. Nalivaika and C. A. Schiffer (2004). "Combating susceptibility to drug resistance: lessons from HIV-1 protease." Chem Biol 11(10): 1333-8.

Klimov, D. K. and D. Thirumalai (2002). "Stiffness of the distal loop restricts the structural heterogeneity of the transition state ensemble in SH3 domains." J Mol Biol 317(5): 721-37.

Knappenberger, J. A., J. E. Smith, S. H. Thorpe, J. A. Zitzewitz and C. R. Matthews (2002). "A buried polar residue in the hydrophobic interface of the coiled-coil

161

peptide, GCN4-p1, plays a thermodynamic, not a kinetic role in folding." J Mol Biol 321(1): 1-6.

Kohl, N. E., E. A. Emini, W. A. Schleif, L. J. Davis, J. C. Heimbach, R. A. Dixon, E. M. Scolnick and I. S. Sigal (1988). "Active human immunodeficiency virus protease is required for viral infectivity." Proc Natl Acad Sci U S A 85(13): 4686-90.

Kohn, W. D., C. M. Kay and R. S. Hodges (1998). "Orientation, positional, additivity, and oligomerization-state effects of interhelical ion pairs in alpha-helical coiled-coils." J Mol Biol 283(5): 993-1012.

Korzhnev, D. M., I. Bezsonova, F. Evanics, N. Taulier, Z. Zhou, Y. Bai, T. V. Chalikian, R. S. Prosser and L. E. Kay (2006). "Probing the transition state ensemble of a protein folding reaction by pressure-dependent NMR relaxation dispersion." J Am Chem Soc 128(15): 5262-9.

Kosinski-Collins, M. S., S. L. Flaugh and J. King (2004). "Probing folding and fluorescence quenching in human gammaD crystallin Greek key domains using triple tryptophan mutant proteins." Protein Sci 13(8): 2223-35.

Krantz, B. A., A. K. Srivastava, S. Nauli, D. Baker, R. T. Sauer and T. R. Sosnick (2002). "Understanding protein formation with kinetic H/D amide isotope effects." Nat Struct Biol 9(6): 458-63.

Kreisberg, J. F., S. D. Betts, C. Haase-Pettingell and J. King (2002). "The interdigitated beta-helix domain of the P22 tailspike protein acts as a molecular clamp in trimer stabilization." Protein Sci 11(4): 820-30.

Kreisberg, J. F., S. D. Betts and J. King (2000). "Beta-helix core packing within the triple-stranded oligomerization domain of the P22 tailspike." Protein Sci 9(12): 2338-43.

Krishna, M. M., L. Hoang, Y. Lin and S. W. Englander (2004). "Hydrogen exchange methods to study protein folding." Methods 34(1): 51-64.

Krylov, D., J. Barchi and C. Vinson (1998). "Inter-helical interactions in the leucine zipper coiled coil dimer: pH and salt dependence of coupling energy between charged amino acids." J Mol Biol 279(4): 959-72.

Kumar, M., K. K. Kannan, M. V. Hosur, N. S. Bhavesh, A. Chatterjee, R. Mittal and R. V. Hosur (2002). "Effects of remote mutation on the autolysis of HIV-1 PR: X-ray and NMR investigations." Biochem Biophys Res Commun 294(2): 395-401.

162

Kumar, S., B. Ma, C. J. Tsai, N. Sinha and R. Nussinov (2000). "Folding and binding cascades: dynamic landscapes and population shifts." Protein Sci 9(1): 10-9.

Kuwajima, K. (1989). "The molten globule state as a clue for understanding the folding and cooperativity of globular-protein structure." Proteins 6(2): 87-103.

Laio, A. and C. Micheletti (2006). "Are structural biases at protein termini a signature of vectorial folding?" Proteins 62(1): 17-23.

Lakowicz, J. R. (1980). "Fluorescence spectroscopic investigations of the dynamic properties of proteins, membranes and nucleic acids." J Biochem Biophys Methods 2(1): 91-119.

Lakowicz, J. R. (1986). "Fluorescence studies of structural fluctuations in macromolecules as observed by fluorescence spectroscopy in the time, lifetime, and frequency domains." Methods Enzymol 131: 518-67.

Lakowicz, J. R., B. P. Maliwal, H. Cherek and A. Balter (1983). "Rotational freedom of tryptophan residues in proteins and peptides." Biochemistry 22(8): 1741-52.

Lakshmikanth, G. S., K. Sridevi, G. Krishnamoorthy and J. B. Udgaonkar (2001). "Structure is lost incrementally during the unfolding of barstar." Nat Struct Biol 8(9): 799-804.

Landschulz, W. H., P. F. Johnson and S. L. McKnight (1988). "The leucine zipper: a hypothetical structure common to a new class of DNA binding proteins." Science 240(4860): 1759-64.

Lazaridis, T. and M. Karplus (1997). ""New view" of protein folding reconciled with the old through multiple unfolding simulations." Science 278(5345): 1928-31.

Lecomte, C., B. Guillot, N. Muzet, V. Pichon-Pesme and C. Jelsch (2004). "Ultra- high-resolution X-ray structure of proteins." Cell Mol Life Sci 61(7-8): 774-82.

Lecomte, J. T. and C. R. Matthews (1993). "Unraveling the mechanism of protein folding: new tricks for an old problem." Protein Eng 6(1): 1-10.

Levinthal, C. (1968). "Are there pathways for protein folding?" J Chim Phys 65: 44- 45.

Levy, Y., A. Caflisch, J. N. Onuchic and P. G. Wolynes (2004). "The folding and dimerization of HIV-1 protease: evidence for a stable monomer from simulations." J Mol Biol 340(1): 67-79.

163

Lindberg, M. J., R. Bystrom, N. Boknas, P. M. Andersen and M. Oliveberg (2005). "Systematically perturbed folding patterns of amyotrophic lateral sclerosis (ALS)- associated SOD1 mutants." Proc Natl Acad Sci U S A 102(28): 9754-9.

Lindberg, M. J., J. Normark, A. Holmgren and M. Oliveberg (2004). "Folding of human superoxide dismutase: disulfide reduction prevents dimerization and produces marginally stable monomers." Proc Natl Acad Sci U S A 101(45): 15893-8.

Louis, J. M., G. M. Clore and A. M. Gronenborn (1999). "Autoprocessing of HIV-1 protease is tightly coupled to protein folding." Nat Struct Biol 6(9): 868-75.

Louis, J. M., R. Ishima, I. Nesheiwat, L. K. Pannell, S. M. Lynch, D. A. Torchia and A. M. Gronenborn (2003). "Revisiting monomeric HIV-1 protease. Characterization and redesign for improved properties." J Biol Chem 278(8): 6085-92.

Lucent, D., V. Vishal and V. S. Pande (2007). "Protein folding under confinement: a role for solvent." Proc Natl Acad Sci U S A 104(25): 10430-4.

Lumb, K. J. and P. S. Kim (1995a). "A buried polar interaction imparts structural uniqueness in a designed heterodimeric coiled coil." Biochemistry 34(27): 8642-8.

Lumb, K. J. and P. S. Kim (1995b). "Measurement of interhelical electrostatic interactions in the GCN4 leucine zipper." Science 268(5209): 436-9.

Luo, P. and R. L. Baldwin (1997). "Mechanism of helix induction by trifluoroethanol: a framework for extrapolating the helix-forming properties of peptides from trifluoroethanol/water mixtures back to water." Biochemistry 36(27): 8413-21.

Lupas, A. (1996). "Coiled coils: new structures and new functions." Trends Biochem Sci 21(10): 375-82.

Magg, C. and F. X. Schmid (2004). "Rapid collapse precedes the fast two-state folding of the cold shock protein." J Mol Biol 335(5): 1309-23.

Mahalingam, B., J. M. Louis, J. Hung, R. W. Harrison and I. T. Weber (2001). "Structural implications of drug-resistant mutants of HIV-1 protease: high- resolution crystal structures of the mutant protease/substrate analogue complexes." Proteins 43(4): 455-64.

Makrides, S. C. (1996). "Strategies for achieving high-level expression of genes in Escherichia coli." Microbiol Rev 60(3): 512-38.

164

Mann, C. J., X. Shao and C. R. Matthews (1995). "Characterization of the slow folding reactions of trp aporepressor from Escherichia coli by mutational analysis of prolines and catalysis by a peptidyl-prolyl isomerase." Biochemistry 34(44): 14573- 80.

Martin, P., J. F. Vickrey, G. Proteasa, Y. L. Jimenez, Z. Wawrzak, M. A. Winters, T. C. Merigan and L. C. Kovari (2005). ""Wide-open" 1.3 A structure of a multidrug-resistant HIV-1 protease as a drug target." Structure 13(12): 1887-95.

Matagne, A. and C. M. Dobson (1998). "The folding process of hen lysozyme: a perspective from the 'new view'." Cell Mol Life Sci 54(4): 363-71.

Matthews, C. R. (1987). "Effect of point mutations on the folding of globular proteins." Methods Enzymol 154: 498-511.

Matthews, C. R. (1993). "Pathways of protein folding." Annu Rev Biochem 62: 653- 83.

McCallister, E. L., E. Alm and D. Baker (2000). "Critical role of beta-hairpin formation in protein G folding." Nat Struct Biol 7(8): 669-73.

McGinnes, L. W., T. Sergel, H. Chen, L. Hamo, S. Schwertz, D. Li and T. G. Morrison (2001). "Mutational analysis of the membrane proximal heptad repeat of the newcastle disease virus ." Virology 289(2): 343-52.

McGregor, W. C. (1983). "Large-scale isolation and purification of proteins from recombinant E. coli." Ann N Y Acad Sci 413: 231-7.

McKay, D. B. and T. A. Steitz (1981). "Structure of catabolite gene activator protein at 2.9 A resolution suggests binding to left-handed B-DNA." Nature 290(5809): 744- 9.

Mei, G., A. Di Venere, F. De Matteis and N. Rosato (2003). "The recovery of dipolar relaxation times from fluorescence decays as a tool to probe local dynamics in single tryptophan proteins." Arch Biochem Biophys 417(2): 159-64.

Meier, S., S. Guthe, T. Kiefhaber and S. Grzesiek (2004). "Foldon, the natural trimerization domain of T4 fibritin, dissociates into a monomeric A-state form containing a stable beta-hairpin: atomic details of trimer dissociation and local beta- hairpin stability from residual dipolar couplings." J Mol Biol 344(4): 1051-69.

Merithew, E., C. Stone, S. Eathiraj and D. G. Lambright (2003). "Determinants of Rab5 interaction with the N terminus of early endosome antigen 1." J Biol Chem 278(10): 8494-500.

165

Merrifield, R. B. (1965). "Solid-Phase Peptide Syntheses." Endeavour 24: 3-7.

Milla, M. E., B. M. Brown, C. D. Waldburger and R. T. Sauer (1995). "P22 Arc repressor: transition state properties inferred from mutational effects on the rates of protein unfolding and refolding." Biochemistry 34(42): 13914-9.

Milla, M. E. and R. T. Sauer (1994). "P22 Arc repressor: folding kinetics of a single- domain, dimeric protein." Biochemistry 33(5): 1125-33.

Millar, D. P. (1996). "Fluorescence studies of DNA and RNA structure and dynamics." Curr Opin Struct Biol 6(3): 322-6.

Minton, A. P. (1993). "Macromolecular crowding and molecular recognition." J Mol Recognit 6(4): 211-4.

Minton, A. P. (2005). "Influence of macromolecular crowding upon the stability and state of association of proteins: predictions and observations." J Pharm Sci 94(8): 1668-75.

Mittermaier, A. and L. E. Kay (2006). "New tools provide new insights in NMR studies of protein dynamics." Science 312(5771): 224-8.

Mo, J., M. E. Holtzer and A. Holtzer (1991a). "Kinetics of folding and unfolding of alpha alpha-tropomyosin and of nonpolymerizable alpha alpha-tropomyosin." Biopolymers 31(12): 1417-27.

Mo, J. M., M. E. Holtzer and A. Holtzer (1991b). "Kinetics of self-assembly of alpha alpha-tropomyosin coiled coils from unfolded chains." Proc Natl Acad Sci U S A 88(3): 916-20.

Mohana-Borges, R., N. K. Goto, G. J. Kroon, H. J. Dyson and P. E. Wright (2004). "Structural characterization of unfolded states of apomyoglobin using residual dipolar couplings." J Mol Biol 340(5): 1131-42.

Mok, K. H. and P. J. Hore (2004). "Photo-CIDNP NMR methods for studying protein folding." Methods 34(1): 75-87.

Mok, Y. K., L. G. Alonso, L. M. Lima, M. Bycroft and G. de Prat-Gay (2000). "Folding of a dimeric beta-barrel: residual structure in the urea denatured state of the human papillomavirus E2 DNA binding domain." Protein Sci 9(4): 799-811.

166

Mok, Y. K., M. Bycroft and G. de Prat-Gay (1996). "The dimeric DNA binding domain of the human papillomavirus E2 protein folds through a monomeric intermediate which cannot be native-like." Nat Struct Biol 3(8): 711-7.

Moncoq, K., I. Broutin, C. T. Craescu, P. Vachette, A. Ducruix and D. Durand (2004). "SAXS study of the PIR domain from the Grb14 molecular adaptor: a natively unfolded protein with a transient structure primer?" Biophys J 87(6): 4056-64.

Moore, P. B. and T. A. Steitz (2005). "The ribosome revealed." Trends Biochem Sci 30(6): 281-3.

Moran, L. B., J. P. Schneider, A. Kentsis, G. A. Reddy and T. R. Sosnick (1999). "Transition state heterogeneity in GCN4 coiled coil folding studied by using multisite mutations and crosslinking." Proc Natl Acad Sci U S A 96(19): 10699-704.

Morimoto, R. I. (2006). "Stress, aging, and neurodegenerative disease." N Engl J Med 355(21): 2254-5.

Moult, J. (2005). "A decade of CASP: progress, bottlenecks and prognosis in protein structure prediction." Curr Opin Struct Biol 15(3): 285-9.

Muir, T. W. (2003). "Semisynthesis of proteins by expressed protein ligation." Annu Rev Biochem 72: 249-89.

Munoz, V. and L. Serrano (1994). "Elucidating the folding problem of helical peptides using empirical parameters." Nat Struct Biol 1(6): 399-409.

Munoz, V. and L. Serrano (1997). "Development of the multiple sequence approximation within the AGADIR model of alpha-helix formation: comparison with Zimm-Bragg and Lifson-Roig formalisms." Biopolymers 41(5): 495-509.

Munson, M., K. S. Anderson and L. Regan (1997). "Speeding up protein folding: mutations that increase the rate at which Rop folds and unfolds by over four orders of magnitude." Fold Des 2(1): 77-87.

Myers, J. K. and T. G. Oas (2002). "Mechanism of fast protein folding." Annu Rev Biochem 71: 783-815.

Nash, H. A. (1975a). "Integrative recombination in bacteriophage lambda: analysis of recombinant DNA." J Mol Biol 91(4): 501-14.

Nash, H. A. (1975b). "Integrative recombination of bacteriophage lambda DNA in vitro." Proc Natl Acad Sci U S A 72(3): 1072-6.

167

Neet, K. E. and D. E. Timm (1994). "Conformational stability of dimeric proteins: quantitative studies by equilibrium denaturation." Protein Sci 3(12): 2167-74.

Neira, J. L. and M. Rico (1997). "Folding studies on ribonuclease A, a model protein." Fold Des 2(1): R1-11.

Neira, J. L., P. Sevilla, M. Menendez, M. Bruix and M. Rico (1999). "Hydrogen exchange in ribonuclease A and ribonuclease S: evidence for residual structure in the unfolded state under native conditions." J Mol Biol 285(2): 627-43.

Neri, D., M. Billeter, G. Wider and K. Wuthrich (1992a). "NMR determination of residual structure in a urea-denatured protein, the 434-repressor." Science 257(5076): 1559-63.

Neri, D., G. Wider and K. Wuthrich (1992b). "1H, 15N and 13C NMR assignments of the 434 repressor fragments 1-63 and 44-63 unfolded in 7 M urea." FEBS Lett 303(2-3): 129-35.

Neudecker, P., A. Zarrine-Afsar, W. Y. Choy, D. R. Muhandiram, A. R. Davidson and L. E. Kay (2006). "Identification of a collapsed intermediate with non-native long-range interactions on the folding pathway of a pair of Fyn SH3 domain mutants by NMR relaxation dispersion spectroscopy." J Mol Biol 363(5): 958-76.

Nishimura, C., H. J. Dyson and P. E. Wright (2002). "The apomyoglobin folding pathway revisited: structural heterogeneity in the kinetic burst phase intermediate." J Mol Biol 322(3): 483-9.

Nishimura, C., P. E. Wright and H. J. Dyson (2003). "Role of the B helix in early folding events in apomyoglobin: evidence from site-directed mutagenesis for native- like long range interactions." J Mol Biol 334(2): 293-307.

O'Shea, E. K., J. D. Klemm, P. S. Kim and T. Alber (1991). "X-ray structure of the GCN4 leucine zipper, a two-stranded, parallel coiled coil." Science 254(5031): 539- 44.

O'Shea, E. K., R. Rutkowski and P. S. Kim (1992). "Mechanism of specificity in the Fos-Jun oncoprotein heterodimer." Cell 68(4): 699-708.

Oakley, M. G. and P. S. Kim (1998). "A buried polar interaction can direct the relative orientation of helices in a coiled coil." Biochemistry 37(36): 12603-10.

Oas, T. G. and S. A. Endow (1994). "Springs and hinges: dynamic coiled coils and discontinuities." Trends Biochem Sci 19(2): 51-4.

168

Ohtaka, H., A. Schon and E. Freire (2003). "Multidrug resistance to HIV-1 protease inhibition requires cooperative coupling between distal mutations." Biochemistry 42(46): 13659-66.

Ozer, N., T. Haliloglu and C. A. Schiffer (2006). "Substrate specificity in HIV-1 protease by a biased sequence search method." Proteins 64(2): 444-56.

Panchal, S. C., N. S. Bhavesh and R. V. Hosur (2001). "Real time NMR monitoring of local unfolding of HIV-1 protease tethered dimer driven by autolysis." FEBS Lett 497(1): 59-64.

Panchal, S. C. and R. V. Hosur (2000). "Unfolding kinetics of tryptophan side chains in the dimerization and hinge regions of HIV-I protease tethered dimer by real time NMR spectroscopy." Biochem Biophys Res Commun 269(2): 387-92.

Pande, V. S., I. Baker, J. Chapman, S. P. Elmer, S. Khaliq, S. M. Larson, Y. M. Rhee, M. R. Shirts, C. D. Snow, E. J. Sorin and B. Zagrovic (2003). "Atomistic protein folding simulations on the submillisecond time scale using worldwide distributed computing." Biopolymers 68(1): 91-109.

Pande, V. S., A. Grosberg and T. Tanaka (1997a). "On the theory of folding kinetics for short proteins." Fold Des 2(2): 109-14.

Pande, V. S., A. Grosberg, T. Tanaka and D. S. Rokhsar (1998). "Pathways for protein folding: is a new view needed?" Curr Opin Struct Biol 8(1): 68-79.

Pande, V. S., A. Y. Grosberg and T. Tanaka (1997b). "Statistical mechanics of simple models of protein folding and design." Biophys J 73(6): 3192-210.

Pande, V. S. and D. S. Rokhsar (1998). "Is the molten globule a third phase of proteins?" Proc Natl Acad Sci U S A 95(4): 1490-4.

Pande, V. S. and D. S. Rokhsar (1999a). "Folding pathway of a lattice model for proteins." Proc Natl Acad Sci U S A 96(4): 1273-8.

Pande, V. S. and D. S. Rokhsar (1999b). "Molecular dynamics simulations of unfolding and refolding of a beta-hairpin fragment of protein G." Proc Natl Acad Sci U S A 96(16): 9062-7.

Peng, Z. Y. and L. C. Wu (2000). "Autonomous protein folding units." Adv Protein Chem 53: 1-47.

Perutz, M. F. (1960). "Structure of hemoglobin." Brookhaven Symp Biol 13: 165-83.

169

Perutz, M. F. (1964). "The Structure of Myoglobin and Haemoglobin." Dan Tidsskr Farm 38: 113-22.

Perutz, M. F. (1976). "Structure and mechanism of haemoglobin." Br Med Bull 32(3): 195-208.

Perutz, M. F. (1978). "Hemoglobin structure and respiratory transport." Sci Am 239(6): 92-125.

Peters, J., M. Nitsch, B. Kuhlmorgen, R. Golbik, A. Lupas, J. Kellermann, H. Engelhardt, J. P. Pfander, S. Muller, K. Goldie and et al. (1995). "Tetrabrachion: a filamentous archaebacterial surface protein assembly of unusual structure and extreme stability." J Mol Biol 245(4): 385-401.

Piana, S., P. Carloni and M. Parrinello (2002). "Role of conformational fluctuations in the enzymatic reaction of HIV-1 protease." J Mol Biol 319(2): 567-83.

Placek, B. J. and L. M. Gloss (2002). "The N-terminal tails of the H2A-H2B histones affect dimer structure and stability." Biochemistry 41(50): 14960-8.

Placek, B. J. and L. M. Gloss (2005). "Three-state kinetic folding mechanism of the H2A/H2B histone heterodimer: the N-terminal tails affect the transition state between a dimeric intermediate and the native dimer." J Mol Biol 345(4): 827-36.

Placek, B. J., L. N. Harrison, B. M. Villers and L. M. Gloss (2005). "The H2A.Z/H2B dimer is unstable compared to the dimer containing the major H2A isoform." Protein Sci 14(2): 514-22.

Plaxco, K. W., I. S. Millett, D. J. Segel, S. Doniach and D. Baker (1999). "Chain collapse can occur concomitantly with the rate-limiting step in protein folding." Nat Struct Biol 6(6): 554-6.

Plaxco, K. W., K. T. Simons and D. Baker (1998). "Contact order, transition state placement and the refolding rates of single domain proteins." J Mol Biol 277(4): 985-94.

Plaxco, K. W., K. T. Simons, I. Ruczinski and D. Baker (2000). "Topology, stability, sequence, and length: defining the determinants of two-state protein folding kinetics." Biochemistry 39(37): 11177-83.

Porter, R. R. (1950). "The formation of a specific inhibitor by hydrolysis of rabbit antiovalbumin." Biochem J 46(4): 479-84.

170

Porter, R. R. (1959). "The hydrolysis of rabbit y-globulin and antibodies with crystalline papain." Biochem J 73: 119-26.

Prabu-Jeyabalan, M., E. Nalivaika and C. A. Schiffer (2000). "How does a symmetric dimer recognize an asymmetric substrate? A substrate complex of HIV-1 protease." J Mol Biol 301(5): 1207-20.

Prabu-Jeyabalan, M., E. Nalivaika and C. A. Schiffer (2002). "Substrate shape determines specificity of recognition for HIV-1 protease: analysis of crystal structures of six substrate complexes." Structure 10(3): 369-81.

Prabu-Jeyabalan, M., E. A. Nalivaika, K. Romano and C. A. Schiffer (2006). "Mechanism of substrate recognition by drug-resistant human immunodeficiency virus type 1 protease variants revealed by a novel structural intermediate." J Virol 80(7): 3607-16.

Pradeep, L. and J. B. Udgaonkar (2007). "Diffusional barrier in the unfolding of a small protein." J Mol Biol 366(3): 1016-28.

Qu, Y., C. L. Bolen and D. W. Bolen (1998). "Osmolyte-driven contraction of a random coil protein." Proc Natl Acad Sci U S A 95(16): 9268-73.

Radford, S. E. (2000). "Protein folding: progress made and promises ahead." Trends Biochem Sci 25(12): 611-8.

Rick, S. W., J. W. Erickson and S. K. Burt (1998). "Reaction path and free energy calculations of the transition between alternate conformations of HIV-1 protease." Proteins 32(1): 7-16.

Riddle, D. S., V. P. Grantcharova, J. V. Santiago, E. Alm, I. Ruczinski and D. Baker (1999). "Experiment and theory highlight role of native state topology in SH3 folding." Nat Struct Biol 6(11): 1016-24.

Riddle, D. S., J. V. Santiago, S. T. Bray-Hall, N. Doshi, V. P. Grantcharova, Q. Yi and D. Baker (1997). "Functional rapidly folding proteins from simplified amino acid sequences." Nat Struct Biol 4(10): 805-9.

Riley, P. W., H. Cheng, D. Samuel, H. Roder and P. N. Walsh (2007). "Dimer dissociation and unfolding mechanism of coagulation factor XI apple 4 domain: spectroscopic and mutational analysis." J Mol Biol 367(2): 558-73.

Roder, H., G. A. Elove and S. W. Englander (1988). "Structural characterization of folding intermediates in cytochrome c by H-exchange labelling and proton NMR." Nature 335(6192): 700-4.

171

Roder, H., K. Maki, H. Cheng and M. C. Shastry (2004). "Rapid mixing methods for exploring the kinetics of protein folding." Methods 34(1): 15-27.

Ropson, I. J., J. I. Gordon and C. Frieden (1990). "Folding of a predominantly beta- structure protein: rat intestinal fatty acid binding protein." Biochemistry 29(41): 9591-9.

Ropson, I. J., B. C. Yowler, P. M. Dalessio, L. Banaszak and J. Thompson (2000). "Properties and crystal structure of a beta-barrel folding mutant." Biophys J 78(3): 1551-60.

Rose, J. R., R. Salto and C. S. Craik (1993). "Regulation of autoproteolysis of the HIV-1 and HIV-2 proteases with engineered amino acid substitutions." J Biol Chem 268(16): 11939-45.

Rotondi, K. S. and L. M. Gierasch (2003a). "Local sequence information in cellular retinoic acid-binding protein I: specific residue roles in beta-turns." Biopolymers 71(6): 638-51.

Rotondi, K. S. and L. M. Gierasch (2003b). "Role of local sequence in the folding of cellular retinoic abinding protein I: structural propensities of reverse turns." Biochemistry 42(26): 7976-85.

Rotondi, K. S., L. F. Rotondi and L. M. Gierasch (2003). "Native structural propensity in cellular retinoic acid-binding protein I 64-88: the role of locally encoded structure in the folding of a beta-barrel protein." Biophys Chem 100(1-3): 421-36.

Roy, M., L. L. Chavez, J. M. Finke, D. K. Heidary, J. N. Onuchic and P. A. Jennings (2005). "The native energy landscape for interleukin-1beta. Modulation of the population ensemble through native-state topology." J Mol Biol 348(2): 335-47.

Roy, M. and P. A. Jennings (2003). "Real-time NMR kinetic studies provide global and residue-specific information on the non-cooperative unfolding of the beta-trefoil protein, interleukin-1beta." J Mol Biol 328(3): 693-703.

Royer, W. E., Jr., M. N. Omartian and J. E. Knapp (2007). "Low resolution crystal structure of Arenicola erythrocruorin: influence of coiled coils on the architecture of a megadalton respiratory protein." J Mol Biol 365(1): 226-36.

Royer, W. E., Jr., H. Sharma, K. Strand, J. E. Knapp and B. Bhyravbhatla (2006). "Lumbricus erythrocruorin at 3.5 A resolution: architecture of a megadalton respiratory complex." Structure 14(7): 1167-77.

172

Royer, W. E., Jr., K. Strand, M. van Heel and W. A. Hendrickson (2000). "Structural hierarchy in erythrocruorin, the giant respiratory assemblage of annelids." Proc Natl Acad Sci U S A 97(13): 7107-11.

Royer, W. E., Jr., H. Zhu, T. A. Gorr, J. F. Flores and J. E. Knapp (2005). "Allosteric hemoglobin assembly: diversity and similarity." J Biol Chem 280(30): 27477-80.

Rumbley, J., L. Hoang, L. Mayne and S. W. Englander (2001). "An amino acid code for protein folding." Proc Natl Acad Sci U S A 98(1): 105-12.

Sauer, R. T., M. E. Milla, C. D. Waldburger, B. M. Brown and J. F. Schildbach (1996). "Sequence determinants of folding and stability for the P22 Arc repressor dimer." Faseb J 10(1): 42-8.

Schiffer, M. and A. B. Edmundson (1967). "Use of helical wheels to represent the structures of proteins and to identify segments with helical potential." Biophys J 7(2): 121-35.

Schindler, T. and F. X. Schmid (1996). "Thermodynamic properties of an extremely rapid protein folding reaction." Biochemistry 35(51): 16833-42.

Scholtz, J. M. and R. L. Baldwin (1992). "The mechanism of alpha-helix formation by peptides." Annu Rev Biophys Biomol Struct 21: 95-118.

Schoppe, A., H. J. Hinz, V. R. Agashe, S. Ramachandran and J. B. Udgaonkar (1997). "DSC studies of the conformational stability of barstar wild-type." Protein Sci 6(10): 2196-202.

Schwarzinger, S., P. E. Wright and H. J. Dyson (2002). "Molecular hinges in protein folding: the urea-denatured state of apomyoglobin." Biochemistry 41(42): 12681-6.

Searle, M. S. and B. Ciani (2004). "Design of beta-sheet systems for understanding the thermodynamics and kinetics of protein folding." Curr Opin Struct Biol 14(4): 458-64.

Segel, D. J., A. Bachmann, J. Hofrichter, K. O. Hodgson, S. Doniach and T. Kiefhaber (1999a). "Characterization of transient intermediates in lysozyme folding with time-resolved small-angle X-ray scattering." J Mol Biol 288(3): 489-99.

Segel, D. J., D. Eliezer, V. Uversky, A. L. Fink, K. O. Hodgson and S. Doniach (1999b). "Transient dimer in the refolding kinetics of cytochrome c characterized by small-angle X-ray scattering." Biochemistry 38(46): 15352-9.

173

Selvin, P. R. (1995). "Fluorescence resonance energy transfer." Methods Enzymol 246: 300-34.

Sergel, T. A., L. W. McGinnes and T. G. Morrison (2001). "Mutations in the fusion peptide and adjacent heptad repeat inhibit folding or activity of the Newcastle disease virus fusion protein." J Virol 75(17): 7934-43.

Seydoux, G., C. C. Mello, J. Pettitt, W. B. Wood, J. R. Priess and A. Fire (1996). "Repression of in the embryonic germ lineage of C. elegans." Nature 382(6593): 713-6.

Shankar, P., N. Manjunath and J. Lieberman (2005). "The prospect of silencing disease using RNA interference." Jama 293(11): 1367-73.

Shao, X., P. Hensley and C. R. Matthews (1997). "Construction and characterization of monomeric tryptophan repressor: a model for an early intermediate in the folding of a dimeric protein." Biochemistry 36(32): 9941-9.

Shao, X. and C. R. Matthews (1998). "Single-tryptophan mutants of monomeric tryptophan repressor: optical spectroscopy reveals nonnative structure in a model for an early folding intermediate." Biochemistry 37(21): 7850-8.

Sharp, P. A. and P. D. Zamore (2000). "Molecular biology. RNA interference." Science 287(5462): 2431-3.

Shastry, M. C. and J. B. Udgaonkar (1995). "The folding mechanism of barstar: evidence for multiple pathways and multiple intermediates." J Mol Biol 247(5): 1013-27.

Short, B. and F. A. Barr (2004). "Membrane fusion: caught in a trap." Curr Biol 14(5): R187-9.

Simler, B. R., Y. Levy, J. N. Onuchic and C. R. Matthews (2006). "The folding energy landscape of the dimerization domain of Escherichia coli Trp repressor: a joint experimental and theoretical investigation." J Mol Biol 363(1): 262-78.

Sinha, K. K. and J. B. Udgaonkar (2007). "Dissecting the non-specific and specific components of the initial folding reaction of barstar by multi-site FRET measurements." J Mol Biol 370(2): 385-405.

Snow, C. D., B. Zagrovic and V. S. Pande (2002). "The Trp cage: folding kinetics and unfolded state topology via molecular dynamics simulations." J Am Chem Soc 124(49): 14548-9.

174

Sohl, J. L., S. S. Jaswal and D. A. Agard (1998). "Unfolded conformations of alpha- lytic protease are more stable than its native state." Nature 395(6704): 817-9.

Sorensen, H. P. and K. K. Mortensen (2005a). "Advanced genetic strategies for recombinant protein expression in Escherichia coli." J Biotechnol 115(2): 113-28.

Sorensen, H. P. and K. K. Mortensen (2005b). "Soluble expression of recombinant proteins in the cytoplasm of Escherichia coli." Microb Cell Fact 4(1): 1.

Sosnick, T. R., S. Jackson, R. R. Wilk, S. W. Englander and W. F. DeGrado (1996). "The role of helix formation in the folding of a fully alpha-helical coiled coil." Proteins 24(4): 427-32.

Spence, G. R., A. P. Capaldi and S. E. Radford (2004). "Trapping the on-pathway folding intermediate of Im7 at equilibrium." J Mol Biol 341(1): 215-26.

Srimathi, T., T. K. Kumar, Y. H. Chi, I. M. Chiu and C. Yu (2002). "Characterization of the structure and dynamics of a near-native equilibrium intermediate in the unfolding pathway of an all beta-barrel protein." J Biol Chem 277(49): 47507-16.

Steinmetz, M. O., I. Jelesarov, W. M. Matousek, S. Honnappa, W. Jahnke, J. H. Missimer, S. Frank, A. T. Alexandrescu and R. A. Kammerer (2007). "Molecular basis of coiled-coil formation." Proc Natl Acad Sci U S A 104(17): 7062-7.

Steitz, T. A. and P. B. Moore (2003). "RNA, the first macromolecular catalyst: the ribosome is a ribozyme." Trends Biochem Sci 28(8): 411-8.

Storrs, R. W., D. Truckses and D. E. Wemmer (1992). "Helix propagation in trifluoroethanol solutions." Biopolymers 32(12): 1695-702.

Svensson, A. K., O. Bilsel, E. Kondrashkina, J. A. Zitzewitz and C. R. Matthews (2006a). "Mapping the Folding Free Energy Surface for Metal-free Human Cu,Zn Superoxide Dismutase." J Mol Biol.

Svensson, A. K., J. A. Zitzewitz, C. R. Matthews and V. F. Smith (2006b). "The relationship between chain connectivity and domain stability in the equilibrium and kinetic folding mechanisms of dihydrofolate reductase from E.coli." Protein Eng Des Sel 19(4): 175-85.

Szeltner, Z. and L. Polgar (1996). "Rate-determining steps in HIV-1 protease catalysis. The hydrolysis of the most specific substrate." J Biol Chem 271(50): 32180-4.

175

Tabara, H., A. Grishok and C. C. Mello (1998). "RNAi in C. elegans: soaking in the genome sequence." Science 282(5388): 430-1.

Tanaka, S. and H. A. Scheraga (1975). "Model of protein folding: inclusion of short- , medium-, and long-range interactions." Proc Natl Acad Sci U S A 72(10): 3802-6.

Thieffry, D. and S. Sarkar (1998). "Forty years under the central dogma." Trends Biochem Sci 23(8): 312-6.

Thies, M. J., J. Mayer, J. G. Augustine, C. A. Frederick, H. Lilie and J. Buchner (1999). "Folding and association of the antibody domain CH3: prolyl isomerization preceeds dimerization." J Mol Biol 293(1): 67-79.

Thies, M. J., F. Talamo, M. Mayer, S. Bell, M. Ruoppolo, G. Marino and J. Buchner (2002). "Folding and oxidation of the antibody domain C(H)3." J Mol Biol 319(5): 1267-77.

Thompson, L. C., J. Walters, J. Burke, J. F. Parsons, R. N. Armstrong and H. W. Dirr (2006). "Double mutation at the subunit interface of glutathione transferase rGSTM1-1 results in a stable, folded monomer." Biochemistry 45(7): 2267-73.

Todd, M. J., N. Semo and E. Freire (1998). "The structural stability of the HIV-1 protease." J Mol Biol 283(2): 475-88.

Topping, T. B., D. A. Hoch and L. M. Gloss (2004). "Folding mechanism of FIS, the intertwined, dimeric factor for inversion stimulation." J Mol Biol 335(4): 1065-81.

Turner, B. G. and M. F. Summers (1999). "Structural biology of HIV." J Mol Biol 285(1): 1-32.

Udgaonkar, J. B. and R. L. Baldwin (1988). "NMR evidence for an early framework intermediate on the folding pathway of ribonuclease A." Nature 335(6192): 694-9.

Udgaonkar, J. B. and R. L. Baldwin (1990). "Early folding intermediate of ribonuclease A." Proc Natl Acad Sci U S A 87(21): 8197-201.

Uversky, V. N., A. S. Karnoup, R. Khurana, D. J. Segel, S. Doniach and A. L. Fink (1999). "Association of partially-folded intermediates of staphylococcal nuclease induces structure and stability." Protein Sci 8(1): 161-73.

Uversky, V. N., S. Winter and G. Lober (1998). "Self-association of 8-anilino-1- naphthalene-sulfonate molecules: spectroscopic characterization and application to the investigation of protein folding." Biochim Biophys Acta 1388(1): 133-42.

176

van den Berg, B., R. J. Ellis and C. M. Dobson (1999). "Effects of macromolecular crowding on protein folding and aggregation." Embo J 18(24): 6927-33. van den Berg, B., R. Wain, C. M. Dobson and R. J. Ellis (2000). "Macromolecular crowding perturbs protein refolding kinetics: implications for folding inside the cell." Embo J 19(15): 3870-5.

Wahl, M. C. and M. Sundaralingam (1995). "New crystal structures of nucleic acids and their complexes." Curr Opin Struct Biol 5(3): 282-95.

Wallace, L. A. and H. W. Dirr (1999). "Folding and assembly of dimeric human glutathione transferase A1-1." Biochemistry 38(50): 16686-94.

Walshaw, J., J. M. Shipway and D. N. Woolfson (2001). "Guidelines for the assembly of novel coiled-coil structures: alpha-sheets and alpha-cylinders." Biochem Soc Symp(68): 111-23.

Walshaw, J. and D. N. Woolfson (2001). "Socket: a program for identifying and analysing coiled-coil motifs within protein structures." J Mol Biol 307(5): 1427-50.

Wang, T., W. L. Lau, W. F. DeGrado and F. Gai (2005). "T-jump infrared study of the folding mechanism of coiled-coil GCN4-p1." Biophys J 89(6): 4180-7.

Wang, Z., J. Mottonen and E. J. Goldsmith (1996). "Kinetically controlled folding of the serpin plasminogen activator inhibitor 1." Biochemistry 35(51): 16443-8.

Watson, J. D. and F. H. Crick (1953a). "Genetical implications of the structure of deoxyribonucleic acid." Nature 171(4361): 964-7.

Watson, J. D. and F. H. Crick (1953b). "Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid." Nature 171(4356): 737-8.

Watson, J. D. and F. H. Crick (1953c). "The structure of DNA." Cold Spring Harb Symp Quant Biol 18: 123-31.

Weber, R. E. and S. N. Vinogradov (2001). "Nonvertebrate hemoglobins: functions and molecular adaptations." Physiol Rev 81(2): 569-628.

Wetzel, R. (1980). "Applications of recombinant DNA technology." Am Sci 68(6): 664-75.

177

Wilson, I. A., J. J. Skehel and D. C. Wiley (1981). "Structure of the haemagglutinin membrane glycoprotein of influenza virus at 3 A resolution." Nature 289(5796): 366-73.

Wlodawer, A. and J. W. Erickson (1993). "Structure-based inhibitors of HIV-1 protease." Annu Rev Biochem 62: 543-85.

Wlodawer, A., M. Miller, M. Jaskolski, B. K. Sathyanarayana, E. Baldwin, I. T. Weber, L. M. Selk, L. Clawson, J. Schneider and S. B. Kent (1989). "Conserved folding in retroviral proteases: crystal structure of a synthetic HIV-1 protease." Science 245(4918): 616-21.

Wolf, E., P. S. Kim and B. Berger (1997). "MultiCoil: a program for predicting two- and three-stranded coiled coils." Protein Sci 6(6): 1179-89.

Wolynes, P. G., J. N. Onuchic and D. Thirumalai (1995). "Navigating the folding routes." Science 267(5204): 1619-20.

Woody, R. W. (1994). "Contributions of tryptophan side chains to the far- ultraviolet circular dichroism of proteins." Eur Biophys J 23(4): 253-62.

Woody, R. W. (1995). "Circular dichroism." Methods Enzymol 246: 34-71.

Woolfson, D. N. (2005). "The design of coiled-coil structures and assemblies." Adv Protein Chem 70: 79-112.

Wu, T. D., C. A. Schiffer, M. J. Gonzales, J. Taylor, R. Kantor, S. Chou, D. Israelski, A. R. Zolopa, W. J. Fessel and R. W. Shafer (2003). "Mutation patterns and structural correlates in human immunodeficiency virus type 1 protease following different protease inhibitor treatments." J Virol 77(8): 4836-47.

Wu, Y. and C. R. Matthews (2002a). "A cis-prolyl peptide bond isomerization dominates the folding of the alpha subunit of Trp synthase, a TIM barrel protein." J Mol Biol 322(1): 7-13.

Wu, Y. and C. R. Matthews (2002b). "Parallel channels and rate-limiting steps in complex protein folding reactions: prolyl isomerization and the alpha subunit of Trp synthase, a TIM barrel protein." J Mol Biol 323(2): 309-25.

Wu, Y. and C. R. Matthews (2003). "Proline replacements and the simplification of the complex, parallel channel folding mechanism for the alpha subunit of Trp synthase, a TIM barrel protein." J Mol Biol 330(5): 1131-44.

178

Wu, Y., R. Vadrevu, S. Kathuria, X. Yang and C. R. Matthews (2007). "A tightly packed hydrophobic cluster directs the formation of an off-pathway sub-millisecond folding intermediate in the alpha subunit of tryptophan synthase, a TIM barrel protein." J Mol Biol 366(5): 1624-38.

Wu, Y., R. Vadrevu, X. Yang and C. R. Matthews (2005). "Specific structure appears at the N terminus in the sub-millisecond folding intermediate of the alpha subunit of tryptophan synthase, a TIM barrel protein." J Mol Biol 351(3): 445-52.

Xie, D., S. Gulnik, L. Collins, E. Gustchina, L. Suvorov and J. W. Erickson (1997). "Dissection of the pH dependence of inhibitor binding energetics for an aspartic protease: direct measurement of the protonation states of the catalytic aspartic acid residues." Biochemistry 36(51): 16166-72.

Xu, Y., P. Purkayastha and F. Gai (2006). "Nanosecond folding dynamics of a three- stranded beta-sheet." J Am Chem Soc 128(49): 15836-42.

Yan, Y., E. Winograd, A. Viel, T. Cronin, S. C. Harrison and D. Branton (1993). "Crystal structure of the repetitive segments of spectrin." Science 262(5142): 2027- 30.

Yang, W. Y. and M. Gruebele (2004). "Folding lambda-repressor at its speed limit." Biophys J 87(1): 596-608.

Yang, X., R. Vadrevu, Y. Wu and C. R. Matthews (2007). "Long-range side-chain- main-chain interactions play crucial roles in stabilizing the (betaalpha)8 barrel motif of the alpha subunit of tryptophan synthase." Protein Sci 16(7): 1398-409.

Yee, A., A. Gutmanas and C. H. Arrowsmith (2006). "Solution NMR in structural genomics." Curr Opin Struct Biol 16(5): 611-7.

Yi, Q., P. Rajagopal, R. E. Klevit and D. Baker (2003). "Structural and kinetic characterization of the simplified SH3 domain FP1." Protein Sci 12(4): 776-83.

Yu, Y., O. D. Monera, R. S. Hodges and P. L. Privalov (1996a). "Investigation of electrostatic interactions in two-stranded coiled-coils through residue shuffling." Biophys Chem 59(3): 299-314.

Yu, Y., O. D. Monera, R. S. Hodges and P. L. Privalov (1996b). "Ion pairs significantly stabilize coiled-coils in the absence of electrolyte." J Mol Biol 255(3): 367-72.

179

Zagrovic, B., C. D. Snow, M. R. Shirts and V. S. Pande (2002). "Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide- distributed computing." J Mol Biol 323(5): 927-37.

Zeeb, M. and J. Balbach (2004). "Protein folding studied by real-time NMR spectroscopy." Methods 34(1): 65-74.

Zitzewitz, J. A., O. Bilsel, J. Luo, B. E. Jones and C. R. Matthews (1995). "Probing the folding mechanism of a leucine zipper peptide by stopped-flow circular dichroism spectroscopy." Biochemistry 34(39): 12812-9.

Zitzewitz, J. A., B. Ibarra-Molero, D. R. Fishel, K. L. Terry and C. R. Matthews (2000). "Preformed secondary structure drives the association reaction of GCN4-p1, a model coiled-coil system." J Mol Biol 296(4): 1105-16.

Zitzewitz, J. A. and C. R. Matthews (1999). "Molecular dissection of the folding mechanism of the alpha subunit of tryptophan synthase: an amino-terminal autonomous folding unit controls several rate-limiting steps in the folding of a single domain protein." Biochemistry 38(31): 10205-14.