From: ISMB-98 Proceedings. Copyright © 1998, AAAI (www.aaai.org). All rights reserved.

A Computational System for Modelling Flexible Protein-Protein and Protein-DNA Docking

Michael J. E. Sternberg (1) ¶, Patrick Aloy (1,2), Henry A. Gabb, Richard M. Jackson (1), Gidon Moont (1), Enrique Querol (2) & Francesc X. Aviles

¶ Corresponding Author

(1) Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London WC2A 3PX, UK. e-mail [email protected]
(2) Institut de Biologia Fonamental and Departament de Bioquímica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Abstract

A computational system is described that predicts the structure of protein/protein and protein/DNA complexes starting from unbound coordinate sets. The approach is (i) a global search with rigid-body docking for complexes with shape complementarity and favourable electrostatics; (ii) use of distance constraints from experimental (or predicted) knowledge of critical residues; (iii) use of pair potentials to screen docked complexes and (iv) refinement and further screening by protein side-chain optimisation and interfacial energy minimisation. The system has been applied to model ten protein/protein and eight protein-repressor/DNA (steps i to iii only) complexes. In general a few complexes, one of which is close to the true structure, can be generated.

1 - Introduction

The aim of predictive macromolecular docking is to start with the coordinates of the two molecules in their unbound states and compute the three-dimensional structure of the complex including the conformational change on association. There is an increasing need for successful docking algorithms as the rate of protein structure determination is increasing (> 5,000 deposited coordinates) whilst there are structures of far fewer protein-protein and protein/DNA complexes (< 200 coordinates). It is the structures of these complexes that provide valuable understanding of the functions of the molecules and can serve as the basis for the structure-based design of novel regulators of activity. For reviews of macromolecular docking see (Janin, 1995; Shoichet and Kuntz, 1996; Sternberg et al., 1998).

This paper describes a system for macromolecular docking recently developed in the group (see Figure 1). The four components of the strategy are:

i) FTDOCK which performs a global search for favourable rigid-body docking of the two components (Gabb et al., 1997);
ii) FILTR which screens the favourable solutions from (i) by use of distance constraints from known or predicted regions involved in complex formation (Gabb et al., 1997);
iii) RPDOCK which rapidly screens complexes using empirical residue pair potentials (Moont et al., 1998);
iv) MULTIDOCK which refines the structure of the complex including both the side-chain conformation and rigid-body minimisation and thereby provides another screen of the complexes (Jackson et al., 1998).

The entire approach has been tested on ten protein/protein complexes (Gabb et al., 1997; Jackson et al., 1998; Moont et al., 1998) and the first three stages applied to model eight protein-repressor/DNA complexes (Aloy et al., 1998). Nearly all the starting coordinates were unbound (i.e. not taken from the complex) but an occasional lack of data required one of the two coordinate sets to be taken from the bound state. The results show that computational methods with biochemical data can generate a few docked complexes, one of which is close to the correct conformation.

Figure 1 - Schematic of the strategy for macromolecular docking

Starting coordinates (unbound state)
FTDOCK - Global search by rigid-body docking using Fourier correlation for shape and electrostatic complementarity: 4000
FILTR - Screen complexes on distance constraint (e.g. active site, combining loops or DNA fingerprint): 40-400
RPDOCK - Screen via empirical amino-acid/amino-acid or amino-acid/nucleotide potentials: 4-40
MULTIDOCK - Refine and screen via all-atom side-chain multi-copy and limited rigid-body minimisation: 1-10
Final model(s) for testing

The number given for each stage is the number of complexes generated for the enzyme-inhibitor studies (see Table I).

2 - Algorithm

The approach will be described as used for protein-protein docking and then the modifications to tackle protein/DNA complexes will be reported.

2.1) FTDOCK - Rigid-body docking by Fourier Correlation

The number of internal degrees of freedom for two macromolecules with numerous rotatable bonds, together with the six degrees of associational freedom, makes a global search for the docked conformation impracticable at present. Accordingly the present approach is to start with a rigid-body docking and to incorporate softness into the scoring function to allow for conformational changes on association. We have developed a program FTDOCK to perform the initial step of rigid-body docking, basing the approach on the Fourier correlation method of Katchalski-Katzir et al. (1992).

FTDOCK searches for complexes with complementarity of shape. The two molecules A and B are placed onto three-dimensional grids each of size N x N x N. For the larger molecule A each node l,m,n is assigned a value

\[ a_{l,m,n} = \begin{cases} 1 & \text{for grid points on the surface} \\ \rho & \text{for the core} \\ 0 & \text{for the outside of the molecule} \end{cases} \]

where \rho has a negative value (we use -15) and the surface layer has thickness t (we use between 1.5 Å and 1.2 Å). For the smaller molecule the nodes are assigned values

\[ b_{l,m,n} = \begin{cases} 1 & \text{for the surface} \\ 0 & \text{for the outside of the molecule} \end{cases} \]

The complementarity of shape is given by the correlation c_{\alpha,\beta,\gamma} between the molecules and is evaluated from

\[ c_{\alpha,\beta,\gamma} = \sum_{l=1}^{N} \sum_{m=1}^{N} \sum_{n=1}^{N} a_{l,m,n} \cdot b_{l+\alpha,\, m+\beta,\, n+\gamma} \]

where \alpha, \beta, \gamma are the translational shift of molecule B with respect to A for a given relative orientation of the two grids. A high value for c denotes a complex with good shape complementarity since there is substantial overlap of the surface grid nodes of A with molecule B without a high degree of penetration of the core of A into B. Calculation of c is time consuming as for each of the N^3 shifts \alpha, \beta, \gamma there are of the order of N^3 multiplications and additions. Katchalski-Katzir et al. suggested that proceeding via a discrete Fourier transform enables the correlation function to be calculated in the order of N^3 ln N^3 steps. For a global search of rigid-body docking, molecule B is rotated through the three Eulerian angles in 15° steps and for each step the correlation c is calculated.

An important additional constraint of macromolecular complexes is that there nearly always will be a favourable electrostatic interaction. To consider this we have augmented the Fourier correlation approach with an electrostatic calculation. As with shape complementarity, the modelling of electrostatics is designed to be sufficiently soft to cope with the conformational changes on association. For molecule A, charges are assigned to the atoms and the electrostatic potential evaluated outside the molecule from

\[ \phi_{l,m,n} = \sum_{j} \frac{q_j}{\varepsilon(r_{ij})\, r_{ij}} \]

where \phi_{l,m,n} is the potential at node l,m,n (position i), q_j is the charge on atom j, r_{ij} is the distance between i and j (with a minimum value of 2.3, to avoid artificially large values of the potential) and \varepsilon(r_{ij}) is a distance-dependent dielectric function. Inside molecule A, \phi_{l,m,n} is zero. For molecule B charges are assigned to neighbouring grid points giving a function q_{l,m,n}. The electrostatic interaction e_{\alpha,\beta,\gamma} for a shift of \alpha, \beta, \gamma is calculated from

\[ e_{\alpha,\beta,\gamma} = \sum_{l=1}^{N} \sum_{m=1}^{N} \sum_{n=1}^{N} \phi_{l,m,n} \cdot q_{l+\alpha,\, m+\beta,\, n+\gamma} \]

The Fourier correlation approach is also used to calculate e_{\alpha,\beta,\gamma}.

FTDOCK is written in FORTRAN77. The time-consuming Fourier correlation can be performed using the Silicon Graphics libraries and can be run in parallel on a Silicon Graphics Challenge, taking a few hours of cpu time using 8 R10000 processors. Serial code is also available.

The results of benchmarking this procedure on ten protein complexes showed that electrostatics is best used as a binary filter excluding unfavourable interactions, with the allowed complexes then ranked solely by shape complementarity. To sample docking space, the three most favourable complexes from each orientation of B are stored and then, after all orientations have been considered, the top set (we use 4000) is examined. From this list one needs to consider at least hundreds of complexes to include one complex that is close (< 2.5 Å rms for the Cα atoms at the interface) to the correct structure.

2.2) FILTR - Distance constraints

The lack of selectivity after a global search is a consequence of the simplified scoring functions used and the lack of consideration of conformational change on association. However, in many practical applications of docking there is knowledge about the binding site on at least one of the two components. The procedure FILTR is used to screen complexes generated by FTDOCK for compatibility with residue-residue distance constraints. The inter-Cα distance is calculated and checked to be less than the sum of the effective side-chain radii plus 4.5 Å.

2.3) RPDOCK - Screening docked complexes by

... where n_a and n_b are the total occurrences of residues of types a and b. A log-odds score for a pairing of residue types a and b is derived:

\[ S_{ab} = \log_{10}(\ldots) \]

Empirical residue pair potentials are derived by applying Boltzmann's equation to the log-odds score to obtain the potential. However, the validity of this approach has been questioned (Skolnick et al., 1997; Thomas and Dill, 1996) and therefore we simply use the log odds as a statistical measure of the tendencies of different residue types to pair.

The program RPDOCK evaluates the stability of a complex by evaluating a total score, which is the sum of S_{ab} values for all intermolecular residue pairings with a distance less than d_cut (= 12 Å) and where each residue has a relative accessibility above a cut-off A_cut (= 5%) to exclude buried side-chains.
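The RPDOCK total score described above can be sketched as follows. This is a minimal illustration and not the RPDOCK code: the function name `rpdock_score`, the residue tuples, the use of one representative coordinate per residue for the distance test, and the toy table of S_ab values are all assumptions made for the example. Only the two cut-offs (12 Å and 5%) come from the text.

```python
import math

D_CUT = 12.0  # distance cut-off in Angstrom, as quoted in the text
A_CUT = 5.0   # relative accessibility cut-off in percent, as quoted in the text


def rpdock_score(residues_a, residues_b, s_table, d_cut=D_CUT, a_cut=A_CUT):
    """Sum S_ab over intermolecular residue pairs that pass both cut-offs.

    Each residue is (residue_type, (x, y, z), relative_accessibility_percent).
    One representative coordinate per residue is an assumption of this sketch.
    """
    total = 0.0
    for type_a, xyz_a, acc_a in residues_a:
        if acc_a <= a_cut:  # buried residue in molecule A: excluded
            continue
        for type_b, xyz_b, acc_b in residues_b:
            if acc_b <= a_cut:  # buried residue in molecule B: excluded
                continue
            if math.dist(xyz_a, xyz_b) < d_cut:
                # Assumed symmetric table, keyed by the sorted pair of types.
                total += s_table[tuple(sorted((type_a, type_b)))]
    return total


# Hypothetical S_ab values, chosen only to make the example easy to follow.
toy_scores = {("ASP", "LYS"): 0.5, ("ASP", "LEU"): -0.25, ("LEU", "LEU"): 0.25}
mol_a = [("ASP", (0.0, 0.0, 0.0), 40.0), ("LEU", (3.0, 0.0, 0.0), 2.0)]
mol_b = [("LYS", (5.0, 0.0, 0.0), 30.0), ("LEU", (30.0, 0.0, 0.0), 50.0)]
print(rpdock_score(mol_a, mol_b, toy_scores))
```

In this toy case only the ASP/LYS pair contributes: the LEU of the first molecule is buried (2% accessibility) and the LEU of the second molecule lies beyond the 12 Å cut-off.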

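The Fourier correlation used by FTDOCK (Section 2.1) can be illustrated with a short NumPy sketch. It is not the FORTRAN77 implementation: the function names, the 8 x 8 x 8 toy grids and the periodic (wrap-around) treatment of shifts are assumptions of this example, and a real grid would need enough empty padding for wrap-around not to matter. The sketch checks that the transform route gives the same c as the direct triple sum.

```python
import numpy as np


def correlate_fft(a, b):
    """c[al,be,ga] = sum over l,m,n of a[l,m,n] * b[l+al,m+be,n+ga].

    Indices are taken modulo N (periodic grid, an assumption of this sketch).
    For real grids the correlation is IFFT(conj(FFT(a)) * FFT(b)).
    """
    return np.fft.ifftn(np.conj(np.fft.fftn(a)) * np.fft.fftn(b)).real


def correlate_direct(a, b):
    """Same quantity by brute force: N^3 shifts with N^3 products each."""
    n = a.shape[0]
    c = np.zeros(a.shape, dtype=float)
    for al in range(n):
        for be in range(n):
            for ga in range(n):
                # np.roll with a negative shift gives shifted[l] = b[l + al].
                shifted = np.roll(b, shift=(-al, -be, -ga), axis=(0, 1, 2))
                c[al, be, ga] = np.sum(a * shifted)
    return c


N = 8
RHO = -15.0  # core penalty, the value quoted in the text

# Toy molecule A: a 3x3x3 cube whose outer nodes are surface (1) and whose
# centre node is core (RHO).
a = np.zeros((N, N, N))
a[1:4, 1:4, 1:4] = 1.0
a[2, 2, 2] = RHO

# Toy molecule B: a 2x2x2 block of occupied nodes.
b = np.zeros((N, N, N))
b[0:2, 0:2, 0:2] = 1.0

c = correlate_fft(a, b)
best_shift = np.unravel_index(np.argmax(c), c.shape)
print("best shift:", best_shift, "score:", round(float(c.max()), 6))
```

Any shift that places B fully inside the toy cube covers the core node and is penalised, so the highest scores come from B lying against a face of A, which is the behaviour the scoring function is designed to reward.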