Predicting the Side-Chain Dihedral Angle Distributions of Nonpolar, Aromatic, and Polar Amino Acids Using Hard Sphere Models Alice Qinhua Zhou,1,2 Corey S

proteins STRUCTURE O FUNCTION O BIOINFORMATICS Predicting the side-chain dihedral angle distributions of nonpolar, aromatic, and polar amino acids using hard sphere models Alice Qinhua Zhou,1,2 Corey S. O’Hern,2,3 and Lynne Regan1,2* 1 Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut, 06520-8114 2 Integrated Graduate Program in Physical and Engineering Biology (IGPPEB), Yale University, New Haven, Connecticut, 06520-8114 3 Departments of Mechanical Engineering & Materials Science, Applied Physics, and Physics, and Graduate Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, 06520-8286 ABSTRACT The side-chain dihedral angle distributions of all amino acids have been measured from myriad high-resolution protein crystal structures. However, we do not yet know the dominant interactions that determine these distributions. Here, we explore to what extent the defining features of the side-chain dihedral angle distributions of different amino acids can be captured by a simple physical model. We find that a hard-sphere model for a dipeptide mimetic that includes only steric interactions plus stereochemical constraints is able to recapitulate the key features of the back-bone dependent observed amino acid side-chain dihedral angle distributions of Ser, Cys, Thr, Val, Ile, Leu, Phe, Tyr, and Trp. We find that for certain amino acids, performing the calculations with the amino acid of interest in the central position of a short a-helical segment improves the match between the predicted and observed distributions. We also identify the atomic interactions that give rise to the differences between the predicted distributions for the hard-sphere model of the dipeptide and that of the a- helical segment. Finally, we point out a case where the hard-sphere plus stereochemical constraint model is insufficient to recapitulate the observed side-chain dihedral angle distribution, namely the distribution P(v3) for Met. Proteins 2014; 82:2574–2584. VC 2014 Wiley Periodicals, Inc. Key words: protein structure prediction; side-chain dihedral angle distribution; dipeptide; helical segment; protein crystal structure. INTRODUCTION Using such hard-sphere calculations, they identified the backbone dihedral angle combinations (u and w) for an Many types of interactions, including purely repulsive alanyl dipeptide mimetic that are sterically allowed, given steric, electrostatic, and the attractive component of van physically reasonable values for the atom sizes, bond der Waals interactions, determine the structure of proteins lengths, and bond angles. They found that large regions in general and the conformations of amino acid side chains of u and w space are disallowed due to steric clashes in particular. However, the relative contributions of the between atoms in the dipeptide. Moreover, we now know different types of interactions are not known. For example, that essentially every amino acid in every protein the dominant interactions that control the side-chain dihedral angle distributions for each residue have not been identified, even though there is now a wealth of high- Additional Supporting Information may be found in the online version of this resolution structural data on thousands of proteins. We article. Grant sponsor: National Science Foundation; Grant numbers: DMR-1006537, seek to determine to what extent the key features of the DMR-1307712, and PHY-1019147; Grant sponsor: Raymond and Beverly Sackler side-chain dihedral angle distributions of different amino Institute for Biological, Physical, and Engineering Sciences; Grant sponsor: acids observed in protein crystal structures can be captured National Science Foundation; Grant sponsor: Howard Hughes Medical Institute Interanational Research Fellowship; Grant number: CNS 08-21132. by steric repulsion and stereochemical constraints alone. *Correspondence to: Lynne Regan, 322 Bass Center, 266 Whitney Ave., New A similar question motivated Ramachandran and col- Haven, CT 06511. E-mail: [email protected] Received 23 April 2014; Accepted 30 May 2014 leagues to model peptides as hard spheres with stereo- Published online 10 June 2014 in Wiley Online Library (wileyonlinelibrary.com). chemical constraints for the bond lengths and angles.1,2 DOI: 10.1002/prot.24621 2574 PROTEINS VC 2014 WILEY PERIODICALS, INC. Predicting Side-Chain Dihedral Angle Distributions structure obeys the hard-sphere limits on u and w cally disallowed due to atomic clashes with the surround- defined by Ramachandran and coworkers.3,4 ing residues in the a-helical segment, not by clashes Dunbrack and colleagues have also gained significant within the local environment of the dipeptide mimetic. insight into the side-chain conformations by analyzing The ability to predict amino acid side-chain dihedral high-resolution protein crystal structures.5–8 They angle distributions using a simple physical model pro- emphasized that the side-chain dihedral angle distribu- vides an efficient tool for computational protein design tions are rotameric, with high probabilities at specific v1 strategies that does not rely on knowledge-based poten- 17 and v2 combinations that depend sensitively on the tials for determining side-chain dihedral angles. backbone dihedral angles. They also showed that certain rotamers are rare because of steric repulsions analogous to those that constrain the conformations of hydrocar- MATERIALS AND METHODS bon chains. However, these studies did not specifically address whether hard-sphere interactions plus stereo- Databases of protein crystal structures chemical constraints alone can recapitulate the key fea- Our calculations of the sterically allowed side-chain tures of side-chain dihedral angle distributions. dihedral angle distributions are compared with the Common computational strategies for calculating the observed side-chain dihedral angle distributions obtained side-chain dihedral angle distributions in proteins involve from a database of high-resolution protein crystal struc- 9 quantum mechanical calculations and molecular tures provided by Dr. Roland Dunbrack, Jr.18 This data- 10 11 mechanics force fields, such as Amber, CHARMM, base is composed of 850 non-homologous protein 12 13 GROMOS, and OPLS. However, these simulation structures with resolution 1.7 A˚ or less, side chain B- methods require prohibitively large computing resources factors per residue 30 A˚ 2 or less, and R-factors 0.2 or when considering large proteins. In addition, molecular less. mechanics force fields directly sample the experimentally We used a higher-resolution (1.0 A˚ or less) set of 221 measured backbone and side-chain dihedral angle distri- structures8 to fix the bond lengths, bond angles, and x butions using knowledge-based potentials, such as backbone dihedral angles of the dipeptide mimetics for 14 15 CHARMM-CMAP and v-CMAP, or Amber- our calculations. The higher-resolution set limits the 16 ILDN. Thus, the molecular mechanics force fields do refinement bias on the bond lengths, bond angles, and x not allow one to quantify the distinct contribution of dihedral angles associated with the crystal structure steric repulsive interactions to the side-chain dihedral determination, but there are too few structures in this set angle distributions and build-in the observed side-chain to provide meaningful comparisons with the calculated dihedral angle distributions that we seek to predict. side-chain dihedral angle distributions. The numbers of In this manuscript, we pose the key question: To what each residue type in both databases are displayed in extent, do steric interactions plus stereochemical con- Table S1 in the Supporting Information. In addition, we straints determine the side-chain dihedral angle distribu- identified 53 protein crystal structures containing Nle in tions for each residue? Specifically, we determine the the protein data bank (PDB). 24 of the 53 crystal struc- sterically allowed side-chain dihedral angle distributions tures are nonhomologous. After removing structures for the nonpolar (Leu, Ile, and Val), aromatic (Phe, Tyr, with resolution higher than 1.7 A˚ , 11 Nle amino acids and Trp), and polar residues (Cys, Ser, and Thr) in either remained. a-helix or b-sheet backbone conformations using a hard- sphere model that only includes steric interactions plus stereochemical constraints for the bond lengths and Hard-sphere plus stereochemical constraint model of dipeptide mimetics and a-helical angles. For each residue, we determine which features of segments the side-chain dihedral angle distributions can be explained by a hard-sphere model, in the context of an Figure 1 shows a stick representation of the Ala dipep- amino acid dipeptide mimetic (Fig. 1). For several amino tide mimetic as well as nine other residues (Val, Thr, acids, the major features of the observed distributions Phe, Ser, Cys, Tyr, Leu, Ile, and Trp). The u backbone are recapitulated by the hard-sphere model for a dipep- dihedral angle is defined by the clockwise rotation tide mimetic. For certain other amino acids, some fea- around the N-Ca bond (viewed from N to Ca) involving tures of the observed distributions are not captured by the backbone atoms C-N-Ca-C. The w backbone dihe- the hard-sphere model for a dipeptide mimetic. These dral angle is defined by the clockwise rotation about the discrepancies were especially apparent for a-helical back- Ca-C bond (viewed from Ca to C) involving the backbone conformations. We therefore also performed hard-

Predicting the Side-Chain Dihedral Angle Distributions of Nonpolar, Aromatic, and Polar Amino Acids Using Hard Sphere Models Alice Qinhua Zhou,1,2 Corey S

Compact Hyperbolic Tetrahedra with Non-Obtuse Dihedral Angles

Using Information Theory to Discover Side Chain Rotamer Classes

Predicting Backbone Cα Angles and Dihedrals from Protein Sequences by Stacked Sparse Auto-Encoder Deep Neural Network

Toroidal Diffusions and Protein Structure Evolution

H2O2 and NH 2 OH

Klein 4A Alkanes+Isomerism

Nomenclature and Conformations of Alkanes and Cycloalkanes1 on The

Unit-2 Conformational Isomerism

Elasticity, Structure, and Relaxation of Extended Proteins Under Force

Dihedral Orbitals” Approach To

Monitoring Molecular Chirality Exchange by Photon Echoes

Remarks on Dihedral and Polyhedral Angles