Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 3661-3665, May 1991
Biophysics

Calculation of protein conformation as an assembly of stable overlapping segments: Application to bovine pancreatic trypsin inhibitor

(conformational energy calculations/short-range interactions/build-up procedure/"conformon")

ISTVAN SIMON*, LESLIE GLASSER†, AND HAROLD A. SCHERAGA‡

Baker Laboratory of Chemistry, Cornell University, Ithaca, NY 14853-1301

Contributed by Harold A. Scheraga, January 24, 1991

Abbreviation: BPTI, bovine pancreatic trypsin inhibitor.
*On leave from: Institute of Enzymology, B.R.C., Hungarian Academy of Sciences, H-1518 Budapest, P.O. Box 7, Hungary 1989-90.
†On leave from: Department of Chemistry, University of the Witwatersrand, Wits 2050, South Africa 1989-90.
‡To whom reprint requests should be addressed.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

ABSTRACT Conformations of bovine pancreatic trypsin inhibitor were calculated by assuming that the final structure as well as properly chosen overlapping segments thereof are simultaneously in low-energy (not necessarily the lowest-energy) conformational states. Therefore, the whole chain can be built up from building blocks whose conformations are determined primarily by short-range interactions. Our earlier buildup procedure was modified by taking account of a statistical analysis of known amino acid sequences that indicates that there is nonrandom pairing of amino acid residues in short segments along the chain, and by carrying out energy minimization on only these segments and on the whole chain [without minimizing the energies of intermediate-size segments (20-30 residues long)]. Results of this statistical analysis were used to determine the variable sizes of the overlapping oligopeptide building blocks used in the calculations; these varied from tripeptides to octapeptides, depending on the amino acid sequence. Successive stages of approximations were used to combine the low-energy conformations of these building blocks in order to keep the number of variables in the computations to a manageable size. The calculations led to a limited number of conformations of the protein (only two different groups, with very similar structure within each group), most residues of which were in the same conformational state as in the native structure.

To overcome the large entropy difference between the unique native state and the ensemble of conformations constituting the unfolded states, the native conformation of a protein must correspond to a deep minimum in its conformational energy hypersurface. The whole conformational space available to the protein cannot possibly be explored within any reasonable amount of time, so that the success of refolding experiments implies that the minimum in the Gibbs free energy function which corresponds to the native state must be significantly deeper than any other minima attainable during the course of refolding. When the whole of conformational space is considered, however, there are so many minima (1) that there is no hope of finding the one minimum corresponding to the native conformation by attempting to examine all of them (2). Therefore, the native conformation must be identified in a computationally feasible way that does not require comparison of the energies of all the minimum-energy conformations.

In this paper, we present a procedure that uses information only about the amino acid sequence and locations of the disulfide bonds of the protein and that results in a very limited number of conformations, which are close enough to the native one for application of further refinement techniques (1). This procedure has been applied to bovine pancreatic trypsin inhibitor (BPTI), a 58-residue protein.

Designation of the Native Conformation

We assume that the native conformation is not only the one of highest stability but also the one in which all properly chosen segments of the polypeptide chain are simultaneously in low-energy (not necessarily the lowest-energy) conformations (3). This implies that short-range interactions play a dominant role in determining the conformations of these segments (4).

The number of low-energy conformations of an oligopeptide is usually several orders of magnitude smaller than the number of combinations constructed by combining the low-energy conformations of the individual residues of which the oligopeptide is constituted. Since the common part of two overlapping segments must be in the same conformation in both of the overlapping oligopeptides (5), the number of conformations of the whole polypeptide chain in which most of the overlapping segments are simultaneously in a low-energy conformation is rather limited (3). Such whole polypeptide conformations are designated as "conformons" [formerly called X-conformations (3)], and it has been suggested that, for properly defined segment sizes and energy ranges, the native conformation is the only conformon of the whole polypeptide chain (3); for short-chain proteins, this unique conformon would be the native conformation, but for long chains it would be the native conformation only of an independently folding domain. Thus, we search the conformational space for the conformon of the whole chain (defined by short-range interactions) and assume that it corresponds to the native conformation.

On the other hand, we must keep in mind that, whereas short-range interactions dominate in determining the minimum-energy conformation corresponding to the conformon of the whole chain, long-range and protein-solvent interactions affect the stabilization free energy and contribute in determining the exact conformation of the protein; these additional interactions are incorporated by minimizing the energy of the whole structure (with inclusion of solvent and disulfide bonds) in the final stage of our procedure.

Low-energy conformations of oligopeptides can be obtained by using a buildup procedure (6). Vásquez and Scheraga (7, 8) built up the whole BPTI conformation from low-energy fragments by using a limited number of distance constraints from simulated NMR spectra. In that approach, the low-energy conformations of all the constituent tetrapeptides were calculated independently of one another and then overlapped. Since the number of conformations of larger fragments built up from various combinations of these tetrapeptide conformations was exceedingly large, about 200 distance constraints (from simulated NMR spectra) were required to reduce the magnitude of the computational problem to select the final, native structure.

In this paper, we modify the buildup procedure by introducing statistical information expressing the correlation between the types of residues that can exist at various positions along the chain; this information dictates the sizes of the fragments (which depend on the amino acid sequence) that should be used in the buildup procedure and obviates the previous need (7, 8) to introduce distance constraints. At the present stage of development of this modified procedure, however, the final results are not yet as good as those of refs. 7 and 8.

Initial Screening of the Conformational Space

We first considered the influence of neighboring residues on each other's conformations, without introducing correlations among the types of amino acid residues at this stage. For this purpose, we began with tripeptides as the initial building blocks.

When the conformational energies of tripeptides (and, later, of larger oligopeptides) were calculated, their N and C termini were blocked with acetyl and methylamide groups, respectively.

dipeptide conformations differed from each other (i.e., in φ, ψ, χ¹), irrespective of the conformation of residue R. These selected PD dipeptide conformations were combined with all of the low-energy conformations of the next residue, F. With this overlapping new tripeptide, the whole of the above procedure was repeated. The process was then repeated for each successive overlapping tripeptide of the whole BPTI polypeptide chain. An 8-kcal/mol cutoff was used for tripeptides that did not contain half-cystine, C, whereas a higher, 10-kcal/mol cutoff was used for those that did contain C, because of the extra constraining covalent bond that can compensate for higher conformational energies.

It is important to note that, when the conformations of the second tripeptide, PDF, were calculated, the lowest-energy conformation was not necessarily the known global-minimum one of the isolated tripeptide PDF. This is because we used only those initial PD conformations which appeared among the low-energy conformations of the first tripeptide, RPD, and this ensemble of PD conformations reflects the influence of the residue R. Some of the conformations of isolated PDF might have lower energies than those computed here but, when the PD portion is combined with R, they would have higher energies than the 8-kcal/mol cutoff for RPD. The cutoff level was measured with respect to the minimum of this modified set of PDF conformations (influenced by residue R), rather than from the global minimum of isolated PDF, and a range of energies (8-10 kcal/mol), smaller than that used previously (7, 8), was enough to retain the native conformations of all tripeptides. By using this