The Development of the Prediction of Protein Structure

6 The Development of the Prediction of Protein Structure Gerald D. Fasman I. Introduction .................................................................... 194 II. Protein Topology. .. 196 III. Techniques of Protein Prediction ................................................... 198 A. Sequence Alignment .......................................................... 199 B. Hydrophobicity .............................................................. 200 C. Minimum Energy Calculations ................................................. 202 IV. Approaches to Protein Conformation ................................................ 203 A. Solvent Accessibility ......................................................... 203 B. Packing of Residues .......................................................... 204 C. Distance Geometry ........................................................... 205 D. Amino Acid Physicochemical Properties ......................................... 205 V. Prediction of the Secondary Structure of Proteins: a Helix, ~ Strands, and ~ Turn .......... 208 A. ~ Turns .................................................................... 209 B. Evaluation of Predictive Methodologies .......................................... 218 C. Other Predictive Algorithms ................................................... 222 D. Chou-Fasman Algorithm ...................................................... 224 E. Class Prediction ............................................................. 233 VI. Prediction of Tertiary Structure .................................................... 235 A. Combinatorial Approach ...................................................... 236 B. ~ Sheets ................................................................... 239 C. Packing of a Helices (aia) .................................................... 243 D. Amphipathic a Helices and ~ Strands: Dipole Moments and Electrostatic Interactions .... 245 E. Packing of a Helices and ~ Pleated Sheets (ai~) .................................. 250 F. Prediction of Protein Conformation by Minimum-Energy Calculations ................. 253 G. Expert Systems .............................................................. 256 VII. Prediction of Membrane Structure: Methods of Prediction ............................... 257 VIII. Predicted Membrane Structures .................................................... 264 A. Bacteriorhodopsin ............................................................ 264 B. Acetylcholine Receptor ....................................................... 267 Gerald D. Fasman • Graduate Department of Biochemistry, Brandeis University, Waltham, Massachusetts 02254. 193 G. D. Fasman (ed.), Prediction of Protein Structure and the Principles of Protein Conformation © Plenum Press, New York 1989 194 Gerald D. Fasman C. ATPases 269 D. T-Cell Receptor ............................................................. 271 E. Sodium Channel ............................................................. 272 IX. References ..................................................................... 277 X. Appendixes..................................................................... 303 Appendix 1: List of Reviews on Protein Folding and Prediction of Secondary and Tertiary Structure . .. 303 Appendix 2: Programs Available through This Publication for Protein Secondary Structure Prediction . .. 305 Appendix 3: Commercially Available Programs ....................................... 309 Appendix 4: Relevant Programs Described in the Literature ............................. 311 Appendix 5: National Resource Data Bases .......................................... 313 I. INTRODUCTION The tenet of structural biology that function follows fonn had its seeds in the monograph by C. B. Anfinsen, The Molecular Basis of Evolution (Anfinsen, 1959), wherein he stated "Pro tein chemists naturally feel that the most likely approach to the understanding of cellular behavior lies in the study of structure and function of protein molecules. " The achievement of protein crystallography over the past 30 years has confinned this view whereby the description of the structure and function of proteins is now frequently understood at the atomic level. The classical experiments of Anfinsen and co-workers (Anfinsen et al., 1961; Anfinsen, 1973) proved that ribonuclease could be denatured and refolded without loss of enzymatic activity. This implied that the amino acid sequence contains sufficient infonnation to define the three-dimensional structure of a protein in a particular environment. The acceptance of this tenet has led to multifarious efforts to predict the confonnation of proteins based only on the consideration of sequence alone, which has been termed the protein-folding problem. Both theoreticians (e.g., Levitt and Warshel, 1975; Nemethy and Scheraga, 1977; Karplus and Weaver, 1979; Schulz and Schinner, 1979; Sternberg, 1983) and experimentalists (e.g., Shoemaker et al., 1987; Creighton, 1978) have tackled the chain-folding problem with very limited success. It is thought that the native structure of a protein will lie near the minimum of free energy; however, it will not fold by sampling every possible conformation (Levinthal, 1968; Wet laufer, 1973), but there will be one or more pathways along which the protein folds (Creighton, 1979). Anfinsen (1973) had proposed that one or more regions of secondary structure, e.g., a helices or a two-stranded antiparallel ~ sheet, having marginal stability would act as nucleation sites and direct the refolding. The advent of recombinant DNA techniques has led to an explosion of infonnation concerning the sequences of receptors and enzymes that will be important for drug, herbicide, and pesticide design. Technological developments of industrial, clinical, and agriCUltural importance may be achieved in the coming years by imitation of the interactions between macromolecules and ligands that occur naturally in the living cell (Blundell et ai., 1987). Engineered honnones may have more desirable properties than their native counterparts in tenns of stability or activity. All these promising prospects have fIred the desire to predict the confonnation of and design protein molecules with a high degree of sophistication. The theoretical efforts could be categorized into three main areas: energetic, heuristic, and statistical. The school following the assumption that a protein folds so as to minimize the free energy of the system has had many contributors (e.g., Levitt and Warshel, 1975; Nemethy Development of Protein Structure Prediction 195 and Scheraga, 1977; McCammon et at., 1977; Weiner et at., 1984). These researchers have developed potential functions to describe the energy surface of a polypeptide chain. Chain folding is simulated computationally, directed by surface gradients to find the energy mini mum. Alternatively, conformation space is probed from a starting point by integrating the equations of motion over time. These theoretical predictions of structure have been influenced by both thermodynamic and kinetic arguments. The thermodynamic properties of the polypep tide chain were the first to be cOlisidered. Liquori and co-workers (DeSantis et al., 1965) and Ramachandran et al. (1963) first demonstrated that the peptide unit can adopt only certain allowed conformations. They constructed <\>, I)J plots, which were subsequently improved by semiempirical energy calculations (Lewis et at., 1973b), to predict preferred regions for the various secondary structures of polypeptides. Recent energy calculations have been refined, either by choosing parameters to fit crystal structures (Hagler et al., 1974) or by performing complex molecular orbital calculations (Pullman and Pullman, 1974). On the basis of the thermodynamic hypothesis, by calculating the total free energy of a protein and finding the global minimum, it should be possible to predict the native structure. However, minimization schemes have failed to predict chain folding accurately (Hagler and Honig, 1978; Cohen and Sternberg, 1980a,b). The lack of success presumably stems from the difficulties in modeling protein-solvent interactions, the use of analytical functions to approximate the chemical potential, and the compounding of these errors in the computed gradient; in addition, the energy surface has multiple minima, which make it nearly impossible to locate a global minimum. Attempts at solving these problems are underway. Molecular dynamics offers solutions to these problems but remains computationally limited as a technique for studying chain folding (McCammon et al., 1977). Chain folding is thought to take place in the milli second time scale (Baldwin, 1980). Elaborate computing resources must be applied to sample 100 nsec in the life of a small protein (Post et at., 1986). One of the first attempts to predict the conformation of a protein, ribonuclease, was that of Scheraga (1960). This model used the available chemical information, deuterium-hydrogen exchange data, and the knowledge of the a helix. It later was shown to have little similarity to the native molecule. The prediction of the secondary structure of protein conformation was pioneered by Guzzo (1965). By analysis of the known sequences and structures of myoglobin and the a- and l3-hemoglobins, he found that groups of amino acids were helix disrupters. He also empha sized the role of hydrophobic interactions. He noted that Blout (1962) had pointed out that the known poly(a-amino acids) fall into two categories, helix

The Development of the Prediction of Protein Structure

Computation and Visualization of Protein Topology Graphs Including Ligand Information

Predictive Energy Landscapes for Folding Α-Helical Transmembrane Proteins

Protein Sequencing Production of Ions for Mass Spectrometry

The Future of Protein Secondary Structure Prediction Was Invented by Oleg Ptitsyn

Determination of Pk, Values of the Histidine Side

Families and the Structural Relatedness Among Globular Proteins

Chapter 4 the Three-Dimensional Structure of Proteins

Conformational Properties of Constrained Proline Analogues and Their Application in Nanobiology”

Protein Folding and the Organization of the Protein Topology Universe

Jacquelyn S. Fetrow

9 3 Innovirolgy Examples and Applications

Single-Molecule Peptide Fingerprinting