Free-Energy Calculations in Structure-Based Drug Design

5 Free-energy calculations in structure-based drug design Michael R. Shirts, David L. Mobley, and Scott P.Brown INTRODUCTION that is required of any computational method will depend greatly its speed. A number of rapid structure-based virtual The ultimate goal of structure-based drug design is a sim- screening methods, generally categorized as “docking,” can ple, robust process that starts with a high-resolution crys- help screen large molecular libraries for potential binders tal structure of a validated biological macromolecular target and locate a putative binding site (see Chapter 7 for more and reliably generates an easily synthesized, high-affinity information on docking). However, recent studies have small molecule with desirable pharmacological proper- illustrated that although docking methods can be useful ties. Although pharmaceutical science has made significant for identifying putative binding sites and identifying ligand gains in understanding how to generate, test, and validate poses, scoring methods are not reliable for predicting small molecules for specific biochemical activity, such a compound binding affinities and do not currently possess complete process does not now exist. In any drug design the accuracy necessary for lead optimization.4–6 project, enormous amounts of luck, intuition, and trial and Atomistic, physics-based computational methods are error are still necessary. appealing because of their potential for high transferabil- For any small molecule to be considered a likely drug ity and therefore greater reliability than methods based on candidate, it must satisfy a number of different absorp- informatics or extensive parameterization. Given a suffi- tion/distribution/metabolism/excretion (ADME) proper- ciently accurate physical model of a protein/ligand com- ties and have a good toxicological profile. However, a small plex and thorough sampling of the conformational states molecule must above all be active, which in most cases of this system, one can obtain accurate predictions of means that it must bind tightly and selectively to a specific binding affinities that could then be robustly incorporated location in the protein target before any of the other impor- into research decisions. By using a fundamental physi- tant characteristics are relevant. To design a drug, large cal description, such methods are likely to be valid for regions of chemical space must be explored to find candi- any given biological system under study, as long as suffi- date molecules with the desired biological activity. High- cient physical detail is included. Yet another advantage of throughput experimental screening methods have become physics-based models is that the failures can be more eas- the workhorse for finding such hits.1,2 However, their results ily recognized and understood in the context of the physi- are limited by the quality and diversity of the preexisting cal chemistry of the system, which cannot be easily done in chemical libraries, which may contain only molecules rep- informatics-based methods. resentative of a limited portion of the relevant chemical Despite this potential for reliable predictive power, space for a given target. Combinatorial libraries can be pro- few articles exist in the literature that report successful, duced to supplement these efforts, but their use requires prospective use of physics-based tools within industrial or careful design strategies and they are subject to a num- academic pharmaceutical research. Some of the likely rea- ber of pitfalls.3 Morefocuseddirectinvivoorinvitromea- sons for such failures are the very high computational costs surements provide important information about the effect of such methods, insufficiently accurate atomistic mod- of prospective drugs in the complete biological system but els, and software implementations that make it difficult for provide relatively little information that can be directly used even experts to easily set up with each new project. Until to further engineer new molecules. Given a small number of these problems are resolved, there remain significant obsta- molecules, highly accurate assays of binding, such as sur- cles to the realization of more rigorous approaches in indus- face plasmon resonance (SPR) or isothermal calorimetry trial drug research. (ITC), are relatively accessible though rather costly. There have been a number of important technical Ideally, small molecules with high potential biological advances in the computation of free energies since the activity could be accurately and reliably screened by com- late 1990s that, coupled with the rapid increase in compu- puter before ever being synthesized. The degree of accuracy tational power, have brought these calculations closer to 61 62 Michael R. Shirts, David L. Mobley, and Scott P. Brown the goal of obtaining reliable and pharmaceutically useful ability of accurate and reliable computational methods to binding energies. In this chapter, we briefly review these influence drug research productivity. latest advances, with a focus on specific applications of Suppose our chemist sits down each week and envisions these methods in the recent literature. Under “How Accu- a large number of modifications of a lead compound he or rate Must Calculations of Affinity Be to Add Value” we first she would like to make and test. Instead of simply selecting discuss the level of reliability and accuracy that binding cal- only his or her best guess from that list, which would lead to culations must have to add some degree of value to the a distribution in affinity gains similar to the one described pharmaceutical process. Under “Free Energy Methodolo- above, this chemist selects N compounds to submit to an gies” we give an overview of the methods currently used idealized computer screening program. The chemist then to calculate free energies, including recent advances that synthesizes the top-rated compound from the computer may eventually lead to sufficiently high throughput for predictions. What is the expected distribution of affinities effective pharmaceutical utility. Under “MM-PBSA Calcu- arising from this process for different levels of computa- lations” and “Alchemical Calculations” we review recent tional error? ligand binding calculations in the literature, beginning To model this process, we assume the medicinal with relatively computationally efficient methods that are chemist’s proposals are similar to the Abbott data and we generally more approximate but still attempt to calcu- approximate this distribution of binding affinity changes as late a true affinity without system-dependent parameters a Gaussian distribution with mean zero and standard devi- and then address pharmaceutically relevant examples of ation of 1.02 kcal/mol, resulting in 8.5% of changes having most physically rigorous methods. We conclude with a apK i increase of 1.0. We assume the computational pre- discussion of the implications of recent progress in cal- dictions of binding affinity have Gaussian noise with stan- culating ligand binding affinities on structure-based drug dard deviation ⑀. In our thought experiment, we generate N design. “true” binding affinity changes from the distribution. The computational screen adds Gaussian error with width ⑀ to each measurement. We then rank the “noisy” computational estimates and look at distribution of “true” affinities HOW ACCURATE MUST CALCULATIONS OF AFFINITY that emerge from selecting the best of the corresponding BE TO ADD VALUE? “noisy” estimates. Repeating this process a number of times Physics-based binding calculations can be very compu- (for Figure 5.1, one million), we can generate a distribution tationally demanding. Given these time requirements, it of affinities from the screened process. is important to understand quantitatively what levels of Shown in Figure 5.1 is the modeled distribution of exper- precision, throughput, and turnaround time are required imental affinity changes from the chemist’s predictions for any computational method to systematically effect the (blue) versus the distribution of the experimental affin- lead-optimization efforts of industrial medicinal chemists ity changes after computationally screening N = 10 com- in a typical work flow. To be useful, a method does not pounds with noise ⑀ = 0.5 (pink), ⑀ = 1.0 (red), and ⑀ = necessarily need to deliver perfect results, as long as it 2.0 (purple). In other words, the blue distribution of affini- can produce reliable results with some predictive capacity ties is what the medicinal chemist would obtain alone; on time scales relevant to research decision-making pro- the redder curves what the chemist would obtain synthe- cesses. These issues are frequently addressed anecdotally, sizing the computer’s choice of his N proposed modifi- but rarely in a quantitative manner, and we will try to sketch cation. The shaded area represents the total probability out at least one illustration of what the requirements of a of a modification with affinity gain greater than 1.4 kcal/ computational method might be. mol. A recent analysis of more than 50,000 small-molecule With 0.5 kcal/mol computational noise, screening just chemical transformations spanning over 30 protein targets ten molecules results in an almost 50% chance of achieving at Abbott Laboratories found that approximately 80% of the 1pK i binding increase in a single round of synthesis, versus resulting modified molecules had potencies lying within 1.4 an 8.5% chance without screening. With 1 kcal/mol error, 7 kcal/mol (i.e., 1 pK i log unit) of the

Free-Energy Calculations in Structure-Based Drug Design

1 Drug Development Working Group Report: Analysis of Existing

Drug Design Project Tips

Computational Approaches for Drug Design and Discovery: an Overview

Recent Advances in Automated Protein Design and Its Future

Protein Structure Analysis: Introducing Students to Rational Drug Design

Kinase Inhibitor Chemistry Jack W

Drug Design Project: Information and Tips Part of Your Grade in Drug

Computer Aided Drug Design: a Novel Loom to Drug Discovery

Structural Biology and Drug Design

Recent Advances in Developing Small Molecules Targeting Nucleic Acid

Polypeptide and Protein Modeling for Drug Design Encyclopedia Of

Rise of Virtual Drug Design Tools