Rdock Reference Guide Rdock Development Team August 27, 2015 Contents
Total Page:16
File Type:pdf, Size:1020Kb
rDock Reference Guide rDock Development Team August 27, 2015 Contents 1 Preface 4 2 Acknowledgements 4 3 Introduction 4 4 Configuration 4 5 Cavity mapping 6 5.1 Two sphere method . 6 5.2 Reference ligand method . 8 6 Scoring function reference 11 6.1 Component Scoring Functions . 11 6.1.1 van der Waals potential . 11 6.1.2 Empirical attractive and repulsive polar potentials . 11 6.1.3 Solvation potential . 12 6.1.4 Dihedral potential . 13 6.2 Intermolecular scoring functions under evaluation . 13 6.2.1 Training sets . 13 6.2.2 Scoring Functions Design . 13 6.2.3 Scoring Functions Validation . 14 6.3 Code Implementation . 15 7 Docking protocol 17 7.1 Protocol Summary . 17 7.1.1 Pose Generation . 17 7.1.2 Genetic Algorithm . 17 7.1.3 Monte Carlo . 17 7.1.4 Simplex . 18 7.2 Code Implementation . 18 7.3 Standard rDock docking protocol (dock.prm) . 18 8 System definition file reference 22 8.1 Receptor definition . 22 8.2 Ligand definition . 23 8.3 Solvent definition . 24 8.4 Cavity mapping . 25 8.5 Cavity restraint . 27 8.6 Pharmacophore restraints . 27 8.7 NMR restraints . 28 8.8 Example system definition files . 28 9 Molecular files and atoms typing 30 9.1 Atomic properties. 30 9.2 Difference between formal charge and distributed formal charge . 30 9.3 Parsing a MOL2 file . 31 9.4 Parsing an SD file . 31 9.5 Assigning distributed formal charges to the receptor . 31 10 rDock file formats 32 10.1 .prm file format . 32 10.2 Water PDB file format . 33 10.3 Pharmacophore restraints file format . 34 2 11 rDock programs 35 11.1 Programs reference . 35 11.1.1 rbdock . 35 11.1.2 rbcavity . 36 11.1.3 rbcalcgrid . 37 11.1.4 make grid.csh . 37 11.1.5 rbmoegrid . 37 11.1.6 sdrmsd . 38 11.1.7 sdtether . 38 11.1.8 sdfilter . 39 11.1.9 sdreport . 39 11.1.10 sdsplit . 39 11.1.11 sdsort . 40 11.1.12 sdmodify . 40 11.1.13 rbhtfinder . 40 11.1.14 rblist . 41 12 Common Use cases 42 12.1 Standard docking . 42 12.1.1 Standard docking workflow . 42 12.2 Tethered scaffold docking . 42 12.2.1 Example ligand definition for tethered scaffold . 43 12.3 Docking with pharmacophore restraints . 43 12.4 Docking with explicit waters . 43 13 Appendix 45 3 1 Preface It is intended to develop this document into a full reference guide for the rDock platform, describing the software tools, parameter files, scoring functions, and search engines. The reader is encouraged to cross- reference the descriptions with the corresponding source code files to discover the finer implementation details. 2 Acknowledgements Third-party source code. Two third-party C++ libraries are included within the rDock source code, to provide support for specific numerical calculations. The source code for each library can be distributed freely without licensing restrictions and we are grateful to the respective authors for their contributions. • Nelder-Meads Simplex search, from Prof. Virginia Torczon's group, College of William and Mary, Department of Computer Science, VA. (http://www.cs.wm.edu/va/software/) • Template Numerical Toolkit, from Roldan Pozo, Mathematical and Computational Sciences Divi- sion, National Institute of Standards and Technology (http://math.nist.gov/tnt/) 3 Introduction The rDock platform is a suite of command-line tools for high-throughput docking and virtual screening. The programs and methods were developed and validated from 1998 to 2002 at RiboTargets (more re- cently, Vernalis) for proprietary use. The original program (RiboDock) was designed for high-throughput virtual screening of large ligand libraries against RNA targets, in particular the bacterial ribosome. Since 2002 the programs have been substantially rewritten and validated for docking of drug-like ligands to protein and RNA targets. A variety of experimental restraints can be incorporated into the docking calculation, in support of an integrated Structure-Based Drug Design process. In 2006, the software was licensed to the University of York for maintenance and distribution and, in 2012, Vernalis and the University of York agreed to release the program as Open Source software. rDock is licensed under GNU-LGPL version 3.0 with support from the University of Barcelona - rdock.sourceforge.net. 4 Configuration Before launching rDock, make sure the following environment variables are defined. Precise details are likely to be site-specific. • RBT ROOT environment variable: should be defined to point to the rDock installation direc- tory. • RBT HOME environment variable: is optional, but can be defined to to point to a user project directory containing rDock input files and customised data files. • PATH environment variable: $RBT ROOT/bin should be added to the $PATH environment variable. • LD LIBRARY PATH. $RBT ROOT/lib should be added to the $LD LIBRARY PATH envi- ronment variable. Input file locations. The search path for the majority of input files for rDock is: • Current working directory • $RBT HOME, if defined • The appropriate subdirectory of $RBT ROOT/data/. For example, the default location for scoring function files is $RBT ROOT/data/sf/. 4 The exception is that input ligand SD files are always specified as an absolute path. If you wish to customise a scoring function or docking protocol, it is sufficient to copy the relevant file to the current working directory or to $RBT HOME, and to modify the copied file. Launching rDock. For small scale experimentation, the rDock executables can be launched directly from the command line. However, serious virtual screening campaigns will likely need access to a compute farm. In common with other docking tools, rDock uses the embarrassingly parallel approach to distributed computing. Large ligand libraries are split into smaller chunks, each of which is docked independently on a single machine. Docking jobs are controlled by a distributed resource manager (DRM) such as Condor or SGE. 5 5 Cavity mapping Virtual screening is very rarely conducted against entire macromolecules. The usual practice is to dock small molecules in a much more confined region of interest. rDock makes a clear distinction between the region the ligand is allowed to explore (known here as the docking site), and the receptor atoms that need to be included in order to calculate the score correctly. The former is controlled by the cavity mapping algorithm, whilst the latter is scoring function dependent as it depends on the distance range of each component term (for example, vdW range >> polar range). For this reason, it is usual practice with rDock to prepare intact receptor files (rather than truncated spheres around the region of interest), and to allow each scoring function term to isolate the relevant receptor atoms within range of the docking site. rDock provides two methods for defining the docking site: • Two sphere method • Reference ligand method Note All the keywords found in capital letters in following cavity mapping methods explanation (e.g. RADIUS), make reference to the parameters defined in prm rDock configuration file. For more informa- tion, go to section 8.4 - Cavity mapping on page 25. 5.1 Two sphere method The two sphere method aims to find cavities that are accessible to a small sphere (of typical atomic or solvent radius) but are inaccessible to a larger sphere. The larger sphere probe will eliminate flat and convex regions of the receptor surface, and also shallow cavities. The regions that remain and are accessible to the small sphere are likely to be the nice well defined cavities of interest for drug design purposes. 1. A grid is placed over the cavity mapping region, encompassing a sphere of radius=RADIUS, cen- ter=CENTER. Cavity mapping is restricted to this sphere. All cavities located will be wholly within this sphere. Any cavity that would otherwise protrude beyond the cavity mapping sphere will be truncated at the periphery of the sphere. 6 2. Grid points within the volume occupied by the receptor are excluded (coloured red). The radii of the receptor atoms are increased temporarily by VOL INCR in this step. 3. Probes of radii LARGE SPHERE are placed on each remaining unallocated grid point and checked for clashes with receptor excluded volume. To eliminate edge effects, the grid is extended beyond the cavity mapping region by the diameter of the large sphere (for this step only). This allows the large probe to be placed on grid points outside of the cavity mapping region, yet partially protrude into the cavity mapping region. 4. All grid points within the cavity mapping region that are accessible to the large probe are excluded (coloured green). 7 5. Probes of radii SMALL SPHERE are placed on each remaining grid point and checked for clashes with receptor excluded volume (red) or large probe excluded volume (green) 6. All grid points that are accessible to the small probe are selected (yellow). 7. The final selection of cavity grid points is divided into distinct cavities (contiguous regions). In this example only one distinct cavity is found. User-defined filters of MIN VOLUME and MAX CAVITIES are applied at this stage to select a subset of cavities if required. Note that the filters will accept or reject intact cavities only. 5.2 Reference ligand method The reference ligand method provides a much easier option to define a docking volume of a given size around the binding mode of a known ligand, and is particularly appropriate for large scale automated validation experiments. 8 1. Reference coordinates are read from REF MOL. A grid is placed over the cavity mapping region, encompassing overlapping spheres of radius=RADIUS, centered on each atom in REF MOL.