Expanding Molecular Modeling and Design Tools to Nonnatural Sidechains
Total Page:16
File Type:pdf, Size:1020Kb
WWW.C-CHEM.ORG FULL PAPER Expanding Molecular Modeling and Design Tools to Non-Natural Sidechains David Gfeller,[a] Olivier Michielin,*[a,b,c] and Vincent Zoete*[a] Protein–protein interactions encode the wiring diagram of increase peptide ligand binding affinity. Our results obtained on cellular signaling pathways and their deregulations underlie a non-natural mutants of a BCL9 peptide targeting beta-catenin variety of diseases, such as cancer. Inhibiting protein–protein show very good correlation between predicted and interactions with peptide derivatives is a promising way to experimental binding free-energies, indicating that such develop new biological and therapeutic tools. Here, we develop predictions can be used to design new inhibitors. Data a general framework to computationally handle hundreds of generated in this work, as well as PyMOL and UCSF Chimera non-natural amino acid sidechains and predict the effect of plug-ins for user-friendly visualization of non-natural sidechains, inserting them into peptides or proteins. We first generate all are all available at http://www.swisssidechain.ch. Our results structural files (pdb and mol2), as well as parameters and enable researchers to rapidly and efficiently work with hundreds topologies for standard molecular mechanics software of non-natural sidechains. VC 2012 Wiley Periodicals, Inc. (CHARMM and Gromacs). Accurate predictions of rotamer probabilities are provided using a novel combined knowledge and physics based strategy. Non-natural sidechains are useful to DOI: 10.1002/jcc.22982 Introduction are currently expanding peptide screening technologies to include non-natural sidechains. Correct interpretation of these Protein–protein interactions are fundamental to most biologi- results at the atomic level requires structural and biochemical cal and biochemical processes. However, targeting protein– information about these sidechains, such as rotamers. More- protein interactions to develop therapeutics has remained a [1] over, as the number of possible non-natural sidechains is significant challenge in recent years. This can be partly huge, rational structure-based design of non-natural peptides understood by the relatively flat and surface exposed bind- provides a powerful alternative to high-throughput screen- ing interfaces with distant hotspots that are not very well [8] ing. As such, computer-aided and modeling strategies are suited for standard small molecule high-throughput screen- promising tools to help narrowing-down the list of ligands to ing experiments. Recent experimental advances have pro- be experimentally tested. In silico approaches are also particu- vided an ever-increasing amount of information about pro- larly appropriate for non-natural sidechains insertion, since this tein–protein interactions, in particular interactions mediated corresponds to a relatively small extrapolation from experi- by short peptides binding to a protein, as often found in sig- mental structural data. In particular, mutating one natural naling pathways.[2] Structural studies have enabled character- amino acid to a non-natural one is likely to leave unchanged izing a very large number of these complexes.[3] Moreover, [4] the binding mode of the rest of the peptide. high-throughput techniques, such as peptide arrays, [13] [5] [6] Most existing molecular modeling software, such as FoldX phage-display, or ribosome-display, reveal unprecedented [14] [7] or TINKER, as well as molecular dynamics (MD) software, insights into binding specificity. However, the use of natu- such as Gromacs[15] or CHARMM,[16] only contain a limited set ral peptides as inhibitors is often restricted by their sensitiv- of amino acid sidechains in addition to the 20 natural ones ity to protease-mediated degradation. In addition, several naturally occurring interactions (either between proteins of the same organism, or between host and pathogens) have This article was published online on 14 April 2012. An error was already been optimized along evolution, suggesting that subsequently identified. This notice is included in the online and print versions to indicate that both have been corrected 24 April 2012. peptides restricted to the set of 20 natural amino acids will [a] D. Gfeller, O. Michielin, V. Zoete hardly be appropriate for inhibiting or competing with these Swiss Institute of Bioinformatics (SIB), Quartier Sorge, Baˆtiment Genopode, interactions. CH-1015 Lausanne, Switzerland Peptido-mimetics, and especially incorporation of non-natu- E-mail: [email protected], [email protected] [b] O. Michielin ral sidechains, is a promising strategy to harness the current Ludwig Institute for Cancer Research, Centre Hospitalier Universitaire wealth of data about natural peptides towards the develop- Vaudois, Lausanne, Switzerland ment of more potent inhibitors.[8] Some remarkable experi- [c] O. Michielin mental advances to genetically encode non-natural side- Pluridisciplinary Center for Clinical Oncology (CePO), Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland chains,[9] such as the use of amber codon,[10] tRNA [11] [12] acylation, or engineered quadruplet-decoding ribosomes VC 2012 Wiley Periodicals, Inc. Journal of Computational Chemistry 2012, 33, 1525–1535 1525 FULL PAPER WWW.C-CHEM.ORG (typically consisting of phosphorylated or methylated residues, interactions with the backbone of the protein, and interactions as well as a few others). To bridge this gap and expand the with other atoms found in the vicinity of the sidechain. Back- sidechain chemical alphabet that can be used in molecular bone dependent rotamer libraries capture the former two modeling and in silico drug design studies, we built structural aspects by providing rotamer probability distributions for each files (mol2, pdb) for 209 non-natural sidechains. Parameter and value of backbone dihedral angles f and w (typically using a topology files to run standard MD simulations with these non- 10 Â 10 grid on these angles).[20] Backbone independent natural sidechains in CHARMM and Gromacs have been gener- rotamer libraries instead provide rotamer probabilities without ated using direct mapping from existing parameters supple- incorporating dependencies on backbone dihedral angles. mented by data retrieved from the SwissParam[18] web service Such data are crucial to select conformations that are physi- when necessary. Rotamer probabilities are predicted using a cally reasonable when refining experimental structures or novel algorithm combining energy calculations with statistical mutating in silico a sidechain.[21,22] For natural sidechains, analysis of experimental data. We show that accurate binding rotamers are computed by running statistics on existing free-energy predictions can be obtained with our data for structures in the PDB.[20,23–25] In the absence of sufficient non-natural sidechains. These results are made available at experimental data for non-natural sidechains, we designed a http://www.swisssidechain.ch and the non-natural sidechains novel strategy to predict rotamer probabilities, combining can be inserted into peptide or protein structures and visual- physical and statistical approaches. We first validate this ized using our PyMOL and UCSF Chimera[19] plug-ins. strategy on natural sidechains, and then use it to predict rotamer probabilities of our non-natural sidechains. Results and Discussion Building non-natural sidechains Physics-based approach The chemical alphabet for non-natural amino acid sidechains is Rotamer probabilities have often been predicted using molec- [26–28] nearly infinite. Here, to maximize the applicability of the results, ular mechanics or quantum mechanics calculations. Here, we focused on amino acid sidechains with structural informa- we rely on MD simulations. A tri-alanine peptide was used, tion in the Protein Data Bank (PDB), as well as commercially where the second residue is mutated to all other sidechains available ones. Non-natural amino acids that modify the back- (see Material and Methods). Simulations were carried out for a [16] bone, such as b-homo, cyclic or aromatic backbones, or proline total of 200ns per sidechain with CHARMM using the FACTS [29] derivatives, were not included, since they are more likely to per- implicit solvent model. Rotamer probabilities are estimated turb the overall conformation of peptides or proteins and are as frequencies of conformations corresponding to each therefore less amenable to molecular modeling studies. This rotamer along the trajectories (see Material and Methods). This resulted in a total of 209 non-natural sidechains, with 141 being way of computing rotamer probabilities is similar to the way present in the PDB. For the latter, pdb files were downloaded experimental rotamer probabilities are computed (the only dif- directly from this database. For the rest we used the tools of ference being that the average is done over all existing struc- ChemAxon (MarvinSketch) and the UCSF Chimera modeling tures, rather than over a trajectory). Similar statistics as in Ref. [23] software.[19] Mol2 and smiles files were generated with OpenBa- have been applied to smoothen and extrapolate probabil- bel. We next generated parameter and topology files for ities to low-sampling regions (see Material and Methods). CHARMM and Gromacs. For 33% of the sidechains, topology To assess the capability of computed rotamer probabilities files, especially partial charges, could be readily generated by to reproduce experimental ones, we first tested