Poly(L-Alanine) As a Universal Reference Material for Understanding Protein Energies and Structures TERESA HEAD-GORDON, FRANK H
Total Page:16
File Type:pdf, Size:1020Kb
Proc. Nati. Acad. Sci. USA Vol. 89, pp. 11513-11517, December 1992 Biophysics Poly(L-alanine) as a universal reference material for understanding protein energies and structures TERESA HEAD-GORDON, FRANK H. STILLINGER, MARGARET H. WRIGHT, AND DAVID M. GAY AT&T Bell Laboratories, Murray Hill, NJ 07974 Contributed by Frank H. Stillinger, June 17, 1992 ABSTRACT We present a proposition, the "poly(L- alanine) hypothesis," which asserts that the native backbone geometry for any polypeptide or protein of M residues has a closely mimicking, mechanically stable, image in poly(L- POLY-L-ALANINE alanine) of the same number of residues. Using a molecular co mechanics force field to represent the relevant potential energy a: hypersurfaces, we have carried out calculations over a wide z range of M values to show that poly(L-alanine) possesses the w structural versatility necessary to satisfy the proposition. These w include poly(L-alanine) representatives of minima correspond- U- ing to secondary and supersecondary structures, as well as PROTEIN poly(L-alanine) images for tertiary structures of the naturally occurring proteins bovine pancreatic trypsin inhibitor, crambin, ribonuclease A, and superoxide dismutase. The suc- cessful validation of the hypothesis presented in this paper BACKBONE COORDINATES indicates that poly(L-alanine) will serve as a good reference FIG. 1. Topographic basis ofthe poly(L-alanine) hypothesis. Free material in thermodynamic perturbation theory and calcula- energy hypersurfaces are compared for a given protein and for tions aimed at evaluating relative free energies for competing poly(L-alanine) with the same number of residues. candidate tertiary structures in real polypeptides and proteins. structure. A third advantage, as explained in Section 4, is that the hypothesis provides a context in which the specific Section 1: Introduction structural roles of disulfide bonds, charged side chains, In this paper we present an idea, the "poly(L-alanine) hy- packing of amino acid side chains, and special residues such pothesis," for the purpose of offering an independent ap- as proline and glycine can be independently and quantita- proach to understanding protein structure and energetics. It tively assessed. asserts that there is a close correspondence, or short-distance A variety of calculations has been undertaken to demon- mapping, between the backbone geometry at the native- strate the validity of the poly(L-alanine) hypothesis. These structure free-energy minimum for any polypeptide or pro- calculations range in M, the number of residues, from very tein of M residues and that of a (metastable) minimum of small to quite large polypeptides and demonstrate the ability poly(L-alanine) of the same number of residues. Fig. 1 of poly(L-alanine) to adopt an impressive repertoire of struc- illustrates this concept. Alanine is the most natural choice of tural alternatives. This diversity is evident even at the dimer residue for such a comprehensive comparison because its level (4). Using larger oligomers, we have demonstrated degree of geometric plasticity, as measured by a Ramachan- stability for representative secondary structures (helices, dran plot (1), is most representative ofall amino acids except turns, sheets), for supersecondary conformers, and for ter- glycine and proline. Furthermore it is the simplest chiral tiary structures exhibited by several naturally occurring residue. Notice that inverse correspondences are not postu- proteins. lated: for steric reasons the global free energy minimum for (L-Ala)M may have no close image among the local minima for Section 2: Methods the free-energy function of a given M-residue protein. Several conceptual advantages arise from successful val- Potential Energy Functions. The empirical potential energy idation of the poly(L-alanine) hypothesis. One is that me- function used has the typical molecular mechanics form and chanical stability of protein native structures do not depend has been described elsewhere (4). We have used the param- on side-chain details but that absolute energy (or free energy) stability is indeed controlled by those details. In this con- eters ofboth the all-atom representation (5) and the extended- nection Matthews and coworkers' (2, 3) T4-lysozyme muta- atom representation (version 19) ofCHARMM (6). For reasons genesis studies have a special significance, wherein replace- of expediency, the all-atom representation (5) has been used ment of several residues by L-alanine has been demonstrated for the cases where the polypeptide has 10 amino acids or to preserve native structure. A second advantage is that fewer; in all other cases the extended-atom representation poly(L-alanine) can serve as a natural reference material in was used (6). For the smaller cases we have also used methyl theoretical calculations of the relative stabilities of distinct blocking groups instead of the amino- and carboxyl- folded structures for a given protein or of the relative terminating groups of naturally occurring peptides. For the stabilities of distinct M-residue proteins in the same tertiary larger cases we have used the zwitterionic form for initiating and terminating the polypeptide. The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" Abbreviations: CRN, crambin; BPTI, bovine pancreatic trypsin in accordance with 18 U.S.C. §1734 solely to indicate this fact. inhibitor; SOD, superoxide dismutase. 11513 Downloaded by guest on October 1, 2021 11514 Biophysics: Head-Gordon et al. Proc. Nad. Acad. Sci. USA 89 (1992) The poly(L-alanine) hypothesis requires demonstrating presented in Section 2 was used to find the secondary that there exists a minimum on the poly(L-alanine) hypersur- structure minimum for the larger polypeptides. face closely mimicking a given protein's native structure. In We have also found poly(L-alanine) minima corresponding Section 3, where we consider supersecondary and tertiary to at least two types of supersecondary structures: the structure of long polypeptides, we will almost always have a parallel 3-sheet-a-helix-parallel 1-sheet (13-a--) and a-he- crystal structure with which to compare the native and the lix-turn-a-helix-turn-a-helix (a-t-a-t-a). For the case of poly(L-alanine) structures. For the secondary structures of a-t-a-t-a, the starting configuration of 30 residues was built small peptides considered in Section 3, we rely on the "by hand," and energy minimized to give this supersecond- adequacy of the all-atom gas phase potential of blocked ary structure type. The initial configuration of the 13-a-f poly(L-alanine) since high-level ab initio studies (7) indicate secondary structure was taken from a subsequence (residues that the all-atom model of alanine dipeptide performs quite 200-249) ofthe crystal structure ofcarboxypeptidase A (14). well structurally and energetically. Nonaliphatic heavy atom centers were given hydrogens so Optimization Techniques. We have used two types of that all hydrogen positions satisfie excluded volume and optimization procedures for finding minima on the potential geometric constraints (6). (Nonpolar hydrogens are repre- energy surfaces described above. For the cases of small sented by an extended version of the aliphatic carbon to peptides (number ofamino acids '10) in secondary structure which they are attached.) The resulting hydrogenated crystal conformations, we have used a quasi-Newton sequential structure was m ed using the penalty finction protocol quadratic programming algorithm (8) with nonlinear con- presented in Section 2. This structure was edited so that the straints. In these cases, five types of constraints were con- backbone atoms and 1-carbon side-chain atoms were re- structed: an L-chirality constraint, a peptide torsion con- tained; for proline and glycine, the backbone hydrogen atom straint (trans or cis), (Cii-N,-CQ,-C,) and T (Ni-Cai-Ci- and 1-carbon side chain were added, respectively, with all Ni+1) constraints appropriate for the secondary structure geometric and steric constraints satisfied. This poly(L- under consideration, and hydrogen bond constraints, again alanine) starting structure was minimized again with the when appropriate for the secondary structure type. The penalty function protocol (Section 2). The rms deviation minimization was started with a structure closely resembling between the native sequence and alanine sequence appears in the secondary structure under consideration and minimized Table 1. with the objective function and the five types of constraints While it is well appreciated that poly(L-alanine) is capable discussed above. The nonlinear constraints were imposed of adopting many different secondary structures, it has not with upper and lower bounds. The optimization was consid- been demonstrated that poly(L-alanine) can mimic wide va- ered converged when the L-chirality dihedral (Cai-Ni-CrCli) rieties of tertiary structure. We now demonstrate that this is constraint of 330 was satisfied to within ±30, the peptide so for the following representative examples: the 46-residue torsion (Cai-Ci-Ni+ICai+i) constraint of00 or 1800 satisfied to protein crambin (CRN) (15), the 58-residue protein bovine within ±100, the backbone dihedrals and T converging to pancreatic trypsin inhibitor (BPTI) (16), the 124-residue within ±0.50 of their secondary structure value, and the protein A (RNase) (17), and the 158-residue protein super- hydrogen bond (O.-Hj) constraint of 1.9 A falling within a oxide dismutase (SOD) (18). window of ±0.4 A. The second stage of the procedure is to All crystal structures were taken from the Protein Data minimize with no constraints starting with the structure Bank (Brookhaven National Laboratory) and supplied with obtained from the constrained first stage. polar hydrogens. The resulting structure was minimized For the cases ofpolypeptides and proteins of >10 residues, using the penalty function protocol described in Section 2 and we have imposed a penalty function protocol for relaxing the cHARmm potential with parameters aroiate for all 20 starting structures into a nearby local minimum.