SCIENTIFIC PUBLICATIONS 43 Overview of the Articles 45
Total Page:16
File Type:pdf, Size:1020Kb
Head Office: Università degli Studi di Padova Department of Chemical Sciences ___________________________________________________________________ Ph.D. COURSE IN: Molecular Sciences CURRICULUM: Pharmaceutical Sciences SERIES XXX EXPLORING PROTEIN FLEXIBILITY DURING DOCKING TO INVESTIGATE LIGAND-TARGET RECOGNITION Coordinator: Ch. mo Prof. Leonard Jan Prins Supervisor : Ch. mo Prof. Stefano Moro Ph.D. student : Veronica Salmaso A mia mamma, mio papà e mia sorella ABSTRACT Abstract Ligand-protein binding models have experienced an evolution during time: from the lock-key model to induced-fit and conformational selection, the role of protein flexibility has become more and more relevant. Understanding binding mechanism is of great importance in drug-discovery, because it could help to rationalize the activity of known binders and to optimize them. The application of computational techniques to drug-discovery has been reported since the 1980s, with the advent computer-aided drug design. During the years several techniques have been developed to address the protein flexibility issue. The present work proposes a strategy to consider protein structure variability in molecular docking, through a ligand-based/structure-based integrated approach and through the development of a fully automatic cross- docking benchmark pipeline. Moreover, a full exploration of protein flexibility during the binding process is proposed through the Supervised Molecular Dynamics. The application of a tabu-like algorithm to classical molecular dynamics accelerates the binding process from the micro-millisecond to the nanosecond timescales. In the present work, an implementation of this algorithm has been performed to study peptide-protein recognition processes. 3 ABSTRACT Sommario I modelli di riconoscimento ligando-proteina si sono evoluti nel corso degli anni: dal modello chiave-serratura a quello di fit-indotto e selezione conformazionale, il ruolo della flessibilità proteica è diventato via via più importante. Capire il meccanismo di riconoscimento è di grande importanza nella progettazione di nuovi farmaci, perchè può dare la possibilità di razionalizzare l’attività di ligandi noti e di ottimizzarli. L’applicazione di tecniche computazionali alla scoperta di nuovi farmaci risale agli anni ‘80, con l’avvento del cosiddetto “Computer-Aided Drug Design”, o, tradotto, progettazione di farmaci aiutata dal computer. Negli anni sono state sviluppate molte tecniche che hanno affrontato il problema della flessibilità proteica. Questo lavoro propone una strategia per considerare la variabilità delle strutture proteiche nel docking, attraverso un approccio combinato ligand-based/structure-based e attraverso lo sviluppo di una procedura completamente automatizzata di docking incrociato. In aggiunta, viene proposta una piena esplorazione della flessibilità proteica durante il processo di legame attraverso la Dinamica Molecolare Supervisionata. L’applicazione di un algoritmo simil-tabu alla dinamica molecolare classica accelera il processo di riconoscimento dalla scala dei micro-millisecondi a quella dei nanosecondi. Nel presente lavoro è stata fatta un’implementazione di questa algoritmica per studiare il processo di riconoscimento peptide-proteina. 4 TABLE OF CONTENTS INTRODUCTION 7 1. Ligand-protein binding models 9 2. Computational methods to study ligand-protein binding 11 2.1 Molecular Docking 12 2.1.1 Scoring functions 12 2.1.2 Sampling 15 2.1.3 Semi-flexible docking 16 2.1.4 Flexible docking 19 2.2 Molecular Dynamics 21 2.2.1 Molecular Dynamics and exploration of the phase space 22 2.2.2 Advances in classical MD simulations 23 2.2.3 Enhanced sampling techniques 25 2.2.3.1 Collective Variables-free methods 25 Replica Exchange Molecular Dynamics (REMD) 25 Accelerated Molecular Dynamics (aMD) 26 2.2.3.2 Collective Variables-dependent methods 27 Steered Molecular Dynamics (SMD) 27 Random Acceleration Molecular Dynamics (RAMD) 27 Umbrella Sampling (US) 28 Metadynamics 28 5 TABLE OF CONTENTS References 30 AIM OF THE WORK 39 SCIENTIFIC PUBLICATIONS 43 Overview of the articles 45 DockBench as docking selector tool: the lesson learned from D3R Grand Challenge 2015 47 Combining self- and cross-docking as benchmark tools: the performance of DockBench in the D3R Grand Challenge 2 73 Sulfonamido-derivatives of unsubstituted carbazole as BACE1 Inhibitors 95 Synthesis, structure-activity relationships and biological evaluation of 7-phenyl-pyrroloquinolinone 3-amide derivatives as potent antimitotic agents 105 The role of 5-arylalkylamino- and 5-piperazino- moieties on the 7-aminopyrazolo[4,3-d]pyrimidine core in affecting adenosine A1 and A2A receptor affinity and selectivity profiles 141 Deciphering the Complexity of Ligand–Protein Recognition Pathways Using Supervised Molecular Dynamics (SuMD) Simulations 175 Exploring Protein-Peptide Recognition Pathways Using a Supervised Molecular Dynamics Approach 205 New Trends in Inspecting GPCR-ligand Recognition Process: the Contribution of the Molecular Modeling Section (MMS) at the University of Padova 221 CONCLUSIONS 237 6 INTRODUCTION Introduction 7 INTRODUCTION 8 INTRODUCTION 1. Ligand-protein binding models No protein is an island, but exerts its function through recognition with other molecular partners. Ligand- protein interactions are involved in many biological processes, so understanding binding mechanism has been occupying scientists for a long time. Several theories have been developed during the years, with an increasing emphasis on the degree of flexibility of the ligand and protein counterparts. The first explanation of binding was provided by Emil Fischer in 1894 [1], who firstly proposed the “lock-key” model to explain enzyme specificity: in this model the ligand is natively complementary in shape to its protein binding site, which it recognizes and occupies rigidly as a key to its lock. However, this model could explain neither the behaviour of enzyme noncompetitive inhibition nor allosteric modulation. A modification of the “lock-key” theory was introduced by Koshland [2], who proposed the “induced-fit” theory starting from his observations on enzyme-substrate interactions: according to this binding model, the ligand is able to induce conformational changes to the protein, optimizing their interactions. Later works suggested that many conformational states of the same protein can exist before binding [3, 4], and a ligand can bind preferentially to one of these conformations; this laid the basis for the “conformational selection” model. Protein-ligand binding is better explained in terms of protein energy landscape, a concept originated to explain protein folding. The free energy landscape model describes the whole conformational space accessible to a system: it consists of the free energy of the system as a function of its conformations (i. e. coordinates of its atoms), resulting in a hypersurface [5]. At a given instant, a protein exists as a single conformational state, but it explores different conformations in time. The probability to occupy a particular region of the phase space is correlated with the energy of that state. In other terms, the energy landscape is conceived as the probability distributions of all conformational states of a protein. The shape of the surface is made of hills and valleys, which respectively stands for low and high probability states. The global minimum and local minima respectively represents native state and metastable states, which are separated by energy barriers (transition states). The folding energy surface answered to the Levinthal’s paradox [6], which wondered about how protein could fold rapidly given the vastness of the search space. The folding landscape has the shape of a funnel, with the bottom occupied by the folded state; the funnel means that multiple parallel pathways could lead from the multitude of unfolded states to the stable native state [7–11]. The shape of the funnel bottom represents the flexibility of the folded protein: if it is smooth, the protein is rigid, while if it is rugged, with numerous valleys separated by low energy barriers, the protein is flexible. On this basis, the binding process could be explained in terms of binding energy landscape [5, 12]. Binding can be interpreted as a folding without chain connectivity, and thus with a greater number of degrees of 9 INTRODUCTION freedom; this makes the binding energy landscape even more complicated than the folding one. Starting from the simpler example, where both the binding counterparts are rigid (i. e. smooth folding-funnel), recognition process can be conceived as the search for the global minimum of the free energy surface described by the three translational and rotational degrees of freedom of the ligand and the protein. If the interacting counterparts are both flexible (i. e. rugged folding-funnel bottom), they exist as an ensemble of different conformations, and each conformation of the ligand may in principle interact with each conformation of the protein, making the binding landscape really complex [13]. The binding event can stabilize one of the conformations of the protein, either the native or a different one; in the second case, a shift is induced in the equilibrium of protein populations [14]. “Population-shift” (“conformational selection”) differs from “induced-fit” in the chronological sequence of events that bring to binding, and have different ranges of applicability [15–17]; according to the first, the ligand picks one of the already available conformations of the