Improving Posing and Ranking of Molecular Docking by Izhar Wallach
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto

Copyright © 2012 by Izhar Wallach

Abstract

Molecular docking is a computational tool commonly applied in drug discovery projects and fundamental biological studies of protein-ligand interactions. Traditionally, molecular docking is used to address one of the following three questions: (i) given a ligand molecule and a protein receptor, predict the binding mode (pose) of the ligand within the context of the receptor, (ii) screen a collection of small molecules against a receptor and rank the ligands by their likelihood of being active, and (iii) given a ligand molecule and a target receptor, predict the binding affinity of the two. Here, we focus on the first two questions, namely pose prediction and ranking. Currently, state-of-the-art docking algorithms predict poses within 2 Å of the native pose at a rate lower than ∼60% and, in many cases, below 40%. In ranking, their ability to identify active ligands is inconsistent and generally suffers from a high false-positive rate.

In this thesis we present novel algorithms to enhance the ability of molecular docking to address these two questions. These algorithms do not substitute for traditional docking algorithms but are instead applied on top of them to provide a synergistic effect. Our algorithms improve pose predictions by 0.5-1.0 Å and ranking order for 23% of the targets in gold-standard benchmarks. Equally important, the algorithms improve the consistency of the posing and ranking predictions over diverse sets of targets and screening libraries. In addition to posing and ranking, we present the pharmacophore concept.
A pharmacophore is an ensemble of physicochemical descriptors associated with a biological target that elucidates common interaction patterns of ligands with that target. We introduce a novel pharmacophore inference algorithm and demonstrate its use in molecular docking.

This thesis is outlined as follows. First, we introduce the molecular docking approach for pose prediction and ranking. Second, we discuss the pharmacophore concept and present algorithms for pharmacophore inference. Third, we demonstrate the use of pharmacophores for pose prediction by re-scoring candidate poses generated by docking algorithms. Finally, we present algorithms to improve ranking by reducing bias in the scoring functions employed by docking algorithms.

Contents

1 Introduction to Molecular Docking
  1.1 Introduction
  1.2 Overview of docking and scoring approaches
    1.2.1 Conformational Search
    1.2.2 Scoring Functions
  1.3 Pose prediction using molecular docking
    1.3.1 Consensus scoring
    1.3.2 Ligand-based constraints
  1.4 Ranking using molecular docking
    1.4.1 Descriptor-based methods
    1.4.2 Receptor-based methods
    1.4.3 Ligand-based methods
  1.5 Pharmacophore Inference and its application in molecular docking
    1.5.1 Protein–small-molecule binding patterns
    1.5.2 Pharmacophore inference methods
2 Sub-Cavity-based Pharmacophore Inference
  2.1 Introduction
  2.2 Methods
    2.2.1 Dataset generation
    2.2.2 Ligand chemical analysis
    2.2.3 Shape identification and characterization of binding sites
    2.2.4 Binding site division into sub-cavities
    2.2.5 Sub-cavity similarity
    2.2.6 Sub-cavity clustering and reshaping
  2.3 Results & Discussion
    2.3.1 Simulated data
    2.3.2 Clustering different protein classes
    2.3.3 PDB sub-cavity analysis and inference
    2.3.4 Extensions and limitations
  2.4 Conclusion
3 Pose Prediction using Pharmacophore Hypotheses
  3.1 Introduction
  3.2 Methods
    3.2.1 Dataset generation
    3.2.2 Binding mode prediction
  3.3 Results & Discussion
    3.3.1 Scoring function analysis
    3.3.2 Predictions
    3.3.3 Robustness of the algorithm
  3.4 Conclusion & Future Work
4 Virtual Decoy Sets for Molecular Docking Benchmarks
  4.1 Introduction
  4.2 Results
    4.2.1 Benchmarks vs. DUD
    4.2.2 "Self-DUD" experiment
    4.2.3 Controlled bias experiments
    4.2.4 Benchmarks using EA-Inventor as decoy generator
  4.3 Discussion & Conclusion
  4.4 Methods
    4.4.1 Docking Procedure
    4.4.2 Self-Generated DUD
    4.4.3 Decoy generation algorithm
5 Normalizing Molecular Docking Rankings
  5.1 Introduction
  5.2 Results
    5.2.1 Experiments using a 0.8 Tanimoto coefficient threshold
    5.2.2 Experiments using a 0.5 Tanimoto coefficient threshold
    5.2.3 Results with two validated systems
    5.2.4 Experiments using randomly selected drug-like libraries
  5.3 Discussion
  5.4 Conclusion
  5.5 Methods
    5.5.1 Generation of datasets
    5.5.2 Docking
    5.5.3 Fitting and normalization
6 Summary & Conclusion
Bibliography

List of Tables

2.1 Inference success rate by homogeneity score cutoffs. Clusters having higher homogeneity scores demonstrate better inference precision. The iterative clustering–reshaping process increases the precision for clusters with higher homogeneity scores. This supports the assumption that the similarity within a cluster comes from the sharing of a common substructure: the reshaping process uncovers this structure and increases prediction accuracy. Clusters with lower homogeneity scores are less likely to share a common substructure and benefit less from the reshaping process.

2.2 Inference results for HIV-1 Protease active site. Sub-cavity inference results for the binding site of HIV-1 Protease.
Using three different homogeneity score thresholds (0.65, 0.75, 0.85), the predicted sub-cavity labels were compared to a set of nine ligands. R-groups refer to Figure 2.5A. The predictions made by our algorithm appear in the 'Prediction' row (HBD: hydrogen bond donor, Arom: aromatic). No prediction was made when the cluster's homogeneity score did not pass the set threshold (indicated by a '-'). An entry X/Y indicates X correct predictions made for the Y ligands with a corresponding R-group (i.e., not all ligands have a substituted chemical group at each R position).

2.3 Sub-cavity inference results for the binding site of Thrombin. Using three different homogeneity score thresholds (0.65, 0.75, 0.85), the predicted sub-cavity labels were compared to a set of nine ligands. The predictions made by our algorithm appear in the 'Prediction' row (HBD: hydrogen bond donor, HBA: hydrogen bond acceptor). No prediction was made when the cluster's homogeneity score did not pass the set threshold (indicated by a '-'). An entry X/Y indicates X correct predictions made for the Y ligands with a corresponding R-group (i.e., not all ligands have a substituted chemical group at each R position).

3.1 Pearson correlation coefficient of configuration score vs. RMSD using the Sigmoid and Euclidean similarity functions and Thrombin (1TOM) as the target protein (the p-values of all correlations are < 10^-10).

3.2 The fraction of ligands with a correct solution ranked first (< 2.5 Å) which also have a solution τ percent better within the top 50 ranked candidate binding modes (see text). For 87% of the tested cases, when a correct binding mode appeared at the top of the ranked list, a pose more similar to the native binding mode appeared elsewhere in the top 50 poses.

3.3 Average ligand similarity of 1200 randomly generated prediction experiments. The table lists the average Tanimoto coefficient and standard deviation (std. dev.)
of the ligand sets using Daylight and MACCS-like fingerprints.

4.1 A comparison of different enrichments attained using the VDS and the DUD decoy sets over the 40 DUD protein targets. AUC denotes the area under the ROC curve. A larger AUC value indicates better enrichment. In our case, datasets that produce smaller AUC values are generally considered better for benchmarking. EF1, EF20, and EFmax correspond to the enrichment factors at 1% of the decoys, 20% of the decoys, and the maximal enrichment over the whole set of decoys.

4.2 Using the eHiTS docking algorithm. A comparison of different enrichments attained using the VDS and the DUD decoy sets over the 13 targets from Andrew Good's DUD clustering. AUC denotes the area under the ROC curve. EF3 and EF20 correspond to the early enrichment at 3% of the decoys and late enrichment at 20% of the decoys.

4.3 Using the Glide docking algorithm. A comparison of different enrichments attained using the VDS and the DUD decoy sets over the 13 targets from Andrew Good's DUD clustering. AUC denotes the area under the ROC curve. EF3 and EF20 correspond to the early enrichment at 3% of the decoys and late enrichment at 20% of the decoys.

4.4 A comparison between the VDS and the DUD datasets over five physical descriptors used as indicators of physical similarity between active ligands and decoys. For every target and every physical descriptor, the properties of the VDS and DUD decoys are compared to the properties of the corresponding active ligands.

5.1 Using Glide. Distance between the distributions of docking scores of decoys generated for active (active-VDS) and non-active (non-active-VDS) molecules. Active- and non-active-VDS were generated with a 0.8 and a 0.5 TC penalty threshold.