Advances in Experimental Medicine and Biology 796

Marta Filizola Editor G -Coupled Receptors - Modeling and Simulation Advances in Experimental Medicine and Biology

Volume 796

Editorial Board

Nathan Back, State University of New York at Buffalo, Buffalo, NY, USA Irun R. Cohen, The Weizmann Institute of Science, Rehovot, Israel N. S. Abel Lajtha, Kline Institute for Psychiatric Research, Orangeburg, NY, USA John D. Lambris, University of Pennsylvania, Philadelphia, PA, USA Rodolfo Paoletti, University of Milan, Milan, Italy

For further volumes: http://www.springer.com/series/5584

Marta Filizola Editor

G Protein-Coupled Receptors - Modeling and Simulation

123 Editor Marta Filizola Department of Structural and Chemical Biology Icahn School of Medicine at Mount Sinai New York, NY USA

ISSN 0065-2598 ISSN 2214-8019 (electronic) ISBN 978-94-007-7422-3 ISBN 978-94-007-7423-0 (eBook) DOI 10.1007/978-94-007-7423-0 Springer Dordrecht Heidelberg New York London

Library of Congress Control Number: 2013951640

© Springer Science+Business Media Dordrecht 2014 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com) Preface

G protein-coupled receptors (GPCRs) are membrane of significant interest in pharmaceutical research owing to their involvement in several important biological processes, including those leading to some serious medical conditions. In spite of focused research, progress towards the discovery of effective therapeutics for GPCRs has long been hampered by the lack of high-resolution structural information about these receptors. Although the number of high-resolution crystal structures of GPCRs has grown significantly in the past few years, the information they provide is limited. Not only are we still far from a comprehensive structural coverage of the GPCR superfamily, but the available structures refer to static, heavily engineered, and generally inactive conformational states of receptor subtypes stripped out of their natural lipid environment. Furthermore, it has become increasingly clear that a full understanding of GPCR structure and function requires dynamic information at a level of detail that is likely to require integration of experimental and computational approaches. Significant progress has been made over the past decade in the devel- opment and application of computational approaches to the large family of GPCRs. A dedicated book that discusses in depth this important topic is lacking but strongly needed owing to: (a) the critical (but sometimes unappreciated) impact that these computational approaches have on under- standing the molecular mechanisms underlying the physiological function of GPCRs in support of rational , (b) the recent advances in theory, hardware, and software, and (c) the potential for much-improved applications using newly available experimentally-derived structural and dynamic information on GPCRs. Thus, I sought the help of experts with an established reputation in the development and/or application of computational methods to GPCRs, and asked them to contribute their state-of-the-art views on modeling and simulation of this important family of membrane proteins. I am indebted to the highly distinguished authors of this book for agreeing to participate in this project and to provide chapters for four different sessions: a first one describing the impact of currently available GPCR crystal structures on structural modeling of other receptor subtypes, a second one reporting on critical insights from simulations, a third one focusing on recent progress in rational ligand discovery and mathematical modeling, and a fourth one providing an overview of tools and resources that are available for GPCRs.

v vi Preface

Heartfelt thanks also go to the several anonymous reviewers of the chapters, and to Thijs van Vlijmen from Springer for the opportunity he offered me to edit this volume. I believe this book adds a unique facet to the “Advances in Experimental Medicine and Biology” series, and I hope the reader will find it both fascinating and of enduring interest.

New York, NY, USA Marta Filizola, Ph.D. June 3, 2013 Contents

Part I Progress in Structural Modeling of GPCRs

1 The GPCR Crystallography Boom: Providing an Invaluable Source of Structural Information and Expanding the Scope of Homology Modeling ...... 3 Stefano Costanzi and Keyun Wang 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures: From Monomers to Signaling Complexes ...... 15 Angel Gonzalez, Arnau Cordomí, Minos Matsoukas, Julian Zachmann, and Leonardo Pardo

Part II GPCRs in Motion: Insights from Simulations

3 Structure and Dynamics of G-Protein Coupled Receptors .... 37 Nagarajan Vaidehi, Supriyo Bhattacharya, and Adrien B. Larsen 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated by Their Coupling to the Membrane Environment ...... 55 Sayan Mondal, George Khelashvili, Niklaus Johner, and Harel Weinstein 5 Coarse-Grained Provides Insight into the Interactions of Lipids and Cholesterol with ... 75 Joshua N. Horn, Ta-Chun Kao, and Alan Grossfield 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G Protein-Coupled Receptors with Enhanced Molecular Dynamics Methods ...... 95 Jennifer M. Johnston and Marta Filizola

Part III GPCR-Focused Rational Design and Mathematical Modeling

7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery ...... 129 Albert J. Kooistra, Rob Leurs, Iwan J.P. de Esch, and Chris de Graaf

vii viii Contents

8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn from Empirical and Mechanistic Models? ...... 159 David Roche, Debora Gil, and Jesús Giraldo

Part IV Bioinformatics Tools and Resources for GPCRs

9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners ...... 185 Noga Kowalsman and Masha Y. Niv 10 Bioinformatics Tools for Predicting GPCR Functions ... 205 Makiko Suwa

Index ...... 225 Part I Progress in Structural Modeling of GPCRs The GPCR Crystallography Boom: Providing an Invaluable Source 1 of Structural Information and Expanding the Scope of Homology Modeling

Stefano Costanzi and Keyun Wang

Abstract G protein-coupled receptors (GPCRs) are integral membrane proteins of high pharmaceutical interest. Until relatively recently, their structures have been particularly elusive, and rhodopsin has been for many years the only member of the superfamily with experimentally elucidated structures. However, a number of recent technical and scientific advancements made the determination of GPCR structures more feasible, thus leading to the solution of the structures of several receptors. Besides providing direct structural information, these experimental GPCR structures also provide templates for the construction of GPCR models. In depth studies have been performed to probe the accuracy of these models, in particular with respect to the interactions with their ligands, and to assess their applicability the rational discovery of GPCR modulators. Given the current state of the art and the pace of the field, the future of GPCR structural studies is likely to be characterized by a landscape populated by an increasingly higher number of experimental and theoretical structures.

Keywords Rhodopsin ¥ Homology ¥ Crystal structures ¥ Docking ¥ 3D Models

Lefkowitz and Brian Kobilka for their pioneering 1.1 Introduction and revolutionary studies that led to our current understanding of the structure and the For the scientific community involved in the functioning of these notable receptor proteins. study of G protein-coupled receptors (GPCRs), GPCRs form a superfamily that encompasses 2012 was a special year, since the Nobel a large number of receptors embedded on the Prize in was awarded to Robert lipid bilayer of the plasma membrane of animal cells as well as other eukaryotic cells, including S. Costanzi ()¥K.Wang yeast. In light of their characteristic topology, Department of Chemistry and Center for Behavioral which features one polypeptide chain that spans , American University, Washington, DC ’ 20016, USA seven times the membrane with seven -helices, e-mail: [email protected] they are also known as seven transmembrane

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 3 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__1, © Springer ScienceCBusiness Media Dordrecht 2014 4 S. Costanzi and K. Wang

(7TM) receptors (Pierce et al. 2002). These for drug discovery, as they provide a direct insight receptors function as cellular sensors for stimuli into the structure-function relationships of the provided by extracellular first messengers, targets and a platform for the discovery of novel which can be either endogenous or exogenous modulators of their activity (Congreve et al. 2011; substances. The first category encompasses a Congreve and Marshall 2009; Mason et al. 2012). wide variety of signaling molecules secreted by However, being integral membrane proteins, the organism, among which are neurotransmitters GPCRs are not easy targets for structural biology or hormones. Conversely, the second category studies. The first experimentally determined includes environmental substances that are three-dimensional (3D) GPCR structures, first detected by the organism, for instance molecules in the form of a projection map (Schertler et al. responsible for odor or taste. Beyond chemical 1993), and then of 3D X-ray crystallographic first messengers, some GPCRs are also structures (Palczewski et al. 2000), were stimulated by physical messengers such as light determined for rhodopsin, a photoreceptor photons. Upon activation, GPCRs couple with an activated by light photons. Rhodopsin is a rather array of direct intracellular partners, chiefly G peculiar GPCR, which features a covalently proteins and arrestins, and transduce the stimulus bound 11-cis-retinal molecule that acts as an into biochemical signals involving complex inverse agonist. Light photons isomerize the pathways (Pierce et al. 2002). ligand to its all-trans form and consequently As stated by Robert Lefkowitz in an insightful trigger activation of the receptor. However, personal retrospective on his career devoted to the despite these unique characteristics, rhodopsin study of seven transmembrane receptors, GPCR shares its main structural features with the signaling ultimately regulates virtually all known entire GPCR superfamily. Historically, the first physiological pathways (Lefkowitz 2007a). Thus, direct evidence that, beyond rhodopsin, a seven the superfamily of GPCRs provides a wealth of transmembrane domain topology was actually targets for pharmacological intervention. As a shared by all GPCRs emerged in the 1980s, result, the modulation of GPCR signaling is the when the laboratories of Lefkowitz and Strader mechanism of action of many of the currently cloned the “2 and detected a approved drugs (Overington et al. 2006). Some striking similarity with the hydropathicity profile of these drugs act directly on the receptors, ei- shown by rhodopsin (Dixon et al. 1986; Dohlman ther by stimulating their activity, as in the case et al. 1987). As Lefkowitz recalled in his of agonists of the opioid receptors used for personal retrospective, prior to these discoveries, the treatment of pain (Spetea and Schmidham- nobody had thought of any possible relation- mer 2012), or by preventing their stimulation ships between rhodopsin and other receptors by natural agonists, as in the case of blockers (Lefkowitz 2007a, b). of the receptor used for the prevention We now know that rhodopsin is just one of of platelet aggregation (Jacobson et al. 2011). the members of the very large superfamily of Conversely, other drugs alter GPCR signaling G protein-coupled receptors (Fredriksson et al. through indirect means, for instance by prevent- 2003) (Gloriam et al. 2007). In humans, the ing the reuptake of the natural agonists of the superfamily counts over 800 distinct members receptors, as in the case of the selective serotonin that, on the basis of sequence similarity, can be reuptake inhibitors (SSRIs) used for the treatment divided into five different families: the glutamate of depression (Donati and Rasenick 2003), or by family (G), also known as class C or family III; altering the activity of the enzymes responsible the rhodopsin family (R), which is by far the for the synthesis of these molecules, as in the case largest and is also known as class A or family I; of the inhibitors of the angiotensin-converting the adhesion family (A) and the secretin family enzyme used for the treatment of hypertension (S), which taken together are also known as (Rosskopf and Michel 2008). class B or family 2; and the (F)/taste2 Experimentally determined three-dimensional family (Fredriksson et al. 2003; Gloriam et al. structures of drug targets are very powerful tools 2007). 1 The GPCR Crystallography Boom: Providing an Invaluable Source of Structural Information... 5

As reviewed by Costanzi et al., rhodopsin magnetic resonance (NMR) spectroscopy (Park remained for several years the only GPCR with et al. 2012). Given the fast pace of the field, many experimentally elucidated three-dimensional more structures are very likely to be solved in the structures (Costanzi et al. 2009). Beyond pro- near future. viding direct information on the light-activated receptor, these structures also superseded those of rhodopsin as the template of choice for the 1.2 There Once Was Rhodopsin study of other members of the superfamily, in particular those belonging to class A (Audet and Since the 1970s, Rhodopsin has been the ob- Bouvier 2012; Costanzi et al. 2009). However, ject of biochemical investigations intended to in light of the vast pharmaceutical interest unveil its amino acid sequence, facilitated by the revolving around GPCRs, enormous efforts fact that the protein could be obtained in high were put toward the crystallization of other quantities from the retinas of cows (Hargrave members of the superfamily. In 2007, these 2001). The first complete amino acid sequence efforts culminated in a number of breakthroughs of rhodopsin was published in 1982 and 1983 that yielded the solution of the X-ray structure by the laboratories of Ovchinnikov and Har- of the “2-adrenergic receptors (Cherezov et al. grave (Hargrave et al. 1983; Ovchinnikov et al. 2007; Rasmussen et al. 2007; Rosenbaum et al. 1982). The hydropathicity profile inferred from 2007). Thanks to these advances, the field of the sequence indicated a seven transmembrane GPCR crystallography is now in full blossom topology. Moreover, Hargrave and coworkers had and, at the time of a recent survey study previously determined that the C-terminus of the conducted by Jacobson and Costanzi during the receptor was located in the cytoplasm. Hence, summer of 2012, the Protein Data Bank (www. taken together, these data led to the drawing rcsb.org) enlisted structures for a total of 73 of the first bi-dimensional model of rhodopsin structures for 15 distinct receptors (Jacobson as a serpentine originating in the extracellular and Costanzi 2012). At the time of this writing, milieu, spanning seven times the plasma mem- structures for two additional receptors, namely brane with seven ’-helical domains connected by the NTSR1 1 (NTSR1) three extracellular and three intracellular loops, (White et al. 2012) and the protease-activated and terminating in the cytoplasm (Hargrave et al. receptor 1 (PAR1) (Zhang et al. 2012), have 1983)ÐseeFig.1.1. been published. Moreover, another milestone in The first direct experimental insights into the the field of GPCR structural studies has been three-dimensional structure of rhodopsin were reached with the determination of the structure of provided by Schertler and coworkers, who sub- the chemokine CXCR1 receptor through nuclear jected the receptor to electron crystallography

EL1 EL2 EL3 + 3HN Extracellular milieu

Plasma TM2 TM1 TM7 TM6 TM4 TM5 Fig. 1.1 Schematic membrane TM3 representation of Rhodopsin as a membrane spanning serpentine. Abbreviations: TM Cytosol transmembrane domain, IL COO intracellular loop, EL extracellular loop IL1 IL2 IL3 6 S. Costanzi and K. Wang

also part of the termini and almost the entirety of the intracellular and extracellular loops that interconnect the transmembrane domains were visible in the structure. In this regard, the structure unveiled that the second extracellular loop, which connects the fourth and fifth transmembrane domains, in rhodopsin assumes a “-hairpin conformation and obstructs the access to the helical bundle from the extracellular milieu like a closed lid, thus creating a closed binding pocket for retinal. As we discuss later in the article, subsequent X-ray crystallography studies highlighted that the extracellular domains in general, and the second extracellular loop in particular, are highly variable structures that adopt different conformations in different members of the GPCR family.

1.3 2007: The Year of Change

Twenty years after the discovery of the homology between rhodopsin and the “ adrenergic receptors (Dixon et al. 1986; Dohlman et al. 1987) and 7 years after the solution of the crystal structure of rhodopsin (Palczewski et al. 2000), crystal structures were solved for members of the “ Fig. 1.2 The three-dimensional structure of bovine rhodopsin (PDB ID: 1F88), showing the backbone of the adrenergic receptor family (Fig. 1.3) (Cherezov receptor schematically represented as a ribbon et al. 2007; Day et al. 2007; Rasmussen et al. 2007; Rosenbaum et al. 2007). This milestone and obtained a 2D projection map (Schertler et al. ultimately sanctioned the structural homology 1993). On the basis of this map, Baldwin subse- between rhodopsin and the “ adrenergic receptors quently built a molecular model of the receptor and, more in general, the entire superfamily of that allocated the individual TMs to the various GPCRs postulated 20 years earlier by Lefkowitz peaks (Baldwin 1993). and coworkers (Dixon et al. 1986; Dohlman et al. The year 2000 brought another leap forward 1987). Just as the experiments conducted in the for the structural studies of GPCRs, with the 1980s had suggested, the “ adrenergic receptors publication by Palczewski and coworkers of a indeed share a very high degree of structural 2.8 Å resolution X-ray crystal structure of bovine similarity with rhodopsin. In particular, the root rhodopsin in its inactive, dark-adapted state (PDB mean square deviations (RMSD) of the atomic ID: 1F88) (Palczewski et al. 2000). Palczewski’s coordinates between the C’ atoms of the amino structure, which is shown in Fig. 1.2, provided acid residues located in the membrane spanning the first three-dimensional representation of regions of the two receptors amounts to a mere rhodopsin experimentally elucidated at the about 2.7 Å (Cherezov et al. 2007; Rosenbaum atomic level. In particular, the structure revealed et al. 2007). Moreover, a significant overlap be- the fine features of the backbone of the protein as tween the ligands co-crystallized with the two well as the side chains of its amino acid residues. receptors is noticeable (Cherezov et al. 2007; Importantly, not only the ’-helical bundle, but Rosenbaum et al. 2007). 1 The GPCR Crystallography Boom: Providing an Invaluable Source of Structural Information... 7

Fig. 1.3 The three-dimensional structures of (a) the hu- the “2 adrenergic receptor for the first blind assessment man “2-adrenergic receptor (PDB ID: 2RH1) and (b)the of GPCR modeling and docking (Michino et al. 2009). (PDB ID: 3EML). (c) Our model The figure shows the backbone of the three receptors of the adenosine A2A receptor constructed on the basis of schematically represented as a ribbon

The solution of the structures of the adrenergic However, the various ligands co-crystallized receptors marked the beginning of a new era. with their respective receptors occupy different The technical expedient that made the solution regions of this cavity, more or less overlapping of these structures attainable, which include the with each other. Moreover, in some cases fusion of GPCRs with easily crystallizable pro- different ligands were found to bind to the same teins, the use of antibodies, and the introduc- receptor by exploiting different regions of the tion of stabilizing mutations, provided general binding cavity. This situation is particularly tools for the crystallization of other members evident when comparing the binding of a small of the superfamily as well (Hanson and Stevens molecule antagonist and a much larger cyclic 2009; Salon et al. 2011; Steyaert and Kobilka peptide antagonist to the CXCR4 chemokine 2011;Tate2012). As a result, in 2008 it be- receptor, which show a complete lack of overlap. came the turn of a member of the adenosine For pictorial views of the ligand binding cavities receptor family, namely the A2A receptor, to be of the crystallized GPCRs and a comparison of solved through X-ray crystallography (Fig. 1.3) ligand binding modes, see a recent review by (Jaakola et al. 2008). The solution of the struc- Jacobson and Costanzi (2012). ture of various other receptors belonging to the It is also now evident that the extracellular large rhodopsin family rapidly followed, with a regions assume different conformations in the continuous flow that keeps making the list of various receptor families. This high variability is experimentally solved GPCRs longer (Jacobson also found in the second extracellular loop, which and Costanzi 2012). in most receptors is the extracellular domain Taken together, the currently solved GPCR that shows more contacts with the bound ligand. structures indicate that the members of the Notably, however, all the structures of receptors superfamily, or at least those belonging to naturally activated by peptides that were solved the rhodopsin family, share a high degree to date, including chemokine, opioid, and noci- of structural similarity, which is particularly ceptin receptors, featured a very similar second pronounced within the helical bundle. However, extracellular loop, which adopted a particularly it also highlighted the fact that each receptor has solvent exposed “-hairpin conformation that kept its own peculiarity. Specifically, it is now evident the binding cavity wide open and ready for the that for all the crystallized receptors the ligand binding of large ligands (Granier et al. 2012; binding cavity lies within the heptahelical bundle, Manglik et al. 2012; Thompson et al. 2012;Wu near its opening toward the extracellular space. et al. 2010, 2012). 8 S. Costanzi and K. Wang

Most of the crystal structure of GPCRs have topology (Audet and Bouvier 2012; Okada and been solved solely in the inactive state, which is Palczewski 2001). Conversely, as we mentioned believed to be more rigid and, therefore, more in the introduction, the seminal work conducted amenable to crystallization. However, structures by Lefkowitz, Strader and coworkers in the that reflect activated states have been solved for 1980s revealed beyond doubts that rhodopsin some receptors, including rhodopsin, the “ adren- and GPCRs were indeed members of a single ergic receptors, the adenosine receptors (Choe superfamily of receptors (Dixon et al. 1986; et al. 2011;Parketal.2008; Rasmussen et al. Dohlman et al. 1987). Thus, in light of this proven 2011a, b; Scheerer et al. 2008; Standfuss et al. homology, the structures of rhodopsin published 2011;Xuetal.2011). As reviewed by Audet and by Baldwin on the basis of Schertler’s projection Bouvier, these data provide significant insights maps, the X-ray structure of rhodopsin published into the molecular mechanisms at the basis of the in 2000 and a number of additional subsequently activation of GPCRs (Audet and Bouvier 2012). solved X-ray structures of the same receptor, Most notable is the structure that was solved became the templates of choice for the modeling by Kobilka and coworkers for the “2 adrenergic of GPCRs and maintained this status for several receptor in its activated state, in complex with an years, until the structures of additional members agonist and an activated G protein heterotrimer of the superfamily became available (Audet and (Rasmussen et al. 2011b). This structure, which Bouvier 2012; Costanzi et al. 2009). captures a GPCR in the act of signaling, arguably Currently, due to the explosion of GPCR crys- represents the current pinnacle of the structural tallography, a variety of templates are available studies of GPCRs. for the construction of GPCR homology models. Ultimately, the success of any homology mod- eling effort rests in the use of a suitable tem- 1.4 Assessing the Accuracy plate and the accuracy of the sequence alignment. of Homology Models in Light The presence of high sequence similarity and of the Corresponding Crystal the detection of shared features such as com- Structures mon sequence motifs, putatively shared disulfide bridges and a similar distribution of proline and Besides unveiling the three-dimensional topology glycine residues, which notoriously affects the of the protein for which they have been shape of ’-helices, are criteria commonly used determined, experimental structures also provide for template selection. Beyond the selection of templates for the construction of homology single templates, the use of multiple templates to models of homologous proteins. This application model different regions of a target receptor have is particularly prominent in the field of been proposed as well (Costanzi 2012; Mobarec GPCRs, given the size of the superfamily, the et al. 2009; Worth et al. 2009). For a detailed pharmaceutical relevance of its members and the methodological article on the procedures that paucity of direct structural information. Prior typically followed by our group for the con- to the solution of the structures of rhodopsin, struction of homology modeling of GPCRs, the earlier GPCR modeling studies were based on the reader is referred to an article recently published structure of bacteriorhodopsin, the structure of in the journal “Methods in Molecular Biology” which was first obtained in the mid 1970s through (Costanzi 2012). electron microscopy (Ballesteros and Weinstein For many years a direct and straightforward 1992; Henderson and Unwin 1975; IJzerman analysis of the accuracy of GPCR models was et al. 1992, 1994; Oliveira et al. 1993; Pardo not possible, since experimental structures were et al. 1992). Bacteriorhodopsin, however is not a available only for rhodopsin. However, such GPCR but a proton pump. Moreover, despite comparisons became suddenly possible with the featuring seven transmembrane domains, it availability of multiple templates yielded by the shows a significantly different three-dimensional breakthroughs of 2007. In fact, it became possible 1 The GPCR Crystallography Boom: Providing an Invaluable Source of Structural Information... 9 to build a model of an experimentally-solved that the side chains of the residues lining the bind- receptor on the basis of a different template, to ing cavity be modeled with accuracy (Costanzi then compare the similarity between the model 2010; Mobarec et al. 2009). Thus, whenever the and the experimental structure. rotameric state of a residue thought to be in close In this context, our group studied the feasi- proximity to the ligand cannot be guessed from bility of the construction of molecular models the template, the identification and incorporation of GPCRs in complex with their ligands through of experimentally derived constraints that can homology modeling followed by molecular dock- guide the selection of the conformation of its side ing (Costanzi 2010). In particular, soon after chain may yield a markedly more accurate model. the unveiling of the crystal structure of the “2 Concerning the overall structure of the adrenergic receptor in complex with a potent in- receptor, at the level of the transmembrane verse agonist, a comparison of this experimental domains, which demonstrated a high degree structure with models of the same receptor-ligand of structural conservation, homology models complex obtained through rhodopsin-based ho- usually show a significant structural accuracy, mology modeling followed by molecular docking with average RMSD values below 3 Å. was published (Costanzi 2008). The encouraging Conversely, they are substantially less accurate results of these comparative studies were con- at the level of the extracellular and intracellular firmed by our subsequent success in the first domains, especially the long ones. This is not blind assessment of GPCR modeling and dock- surprising, in light of the low degree of sequence ing, which was conducted in concomitance with and structural conservation shown by these the solution of the crystal structure of the adeno- regions. As mentioned, the second extracellular sine A2A receptors (Fig. 1.3) (Jaakola et al. 2008; loop is often of particular importance for Michino et al. 2009). This controlled assessment ligand recognition, since it lines the interhelical was organized to gage how closely receptor- binding cavity for many GPCRs. The homology ligand complexes obtained through homology modeling of this domain, which assumes modeling and molecular docking would resemble different conformations in different receptor the experimental structure itself. Importantly, the families, is not always feasible. However, the assessment was conducted in a blind manner, i.e. modeling process is assisted by the fact that, the models were submitted to the evaluators be- in most GPCRs, this domain is endowed with fore the unveiling of the experimental structure. a characteristic disulfide bridge that links it to Taken together, our models of the “2 adrenergic TM3. The portion of the loop that shows the most receptor (Costanzi 2008) as well as the models contacts with the ligand is usually the segment of the adenosine A2A receptor submitted to the downstream of this bridge. This is typically a blind assessment (Michino et al. 2009) illustrated relatively short stretch of residues, and hence, is that reasonably accurate models of the complexes suited for de novo modeling, which typically can be constructed. The most accurate models yields more accurate results than homology resulted to be those that incorporated experimen- modeling. Particularly encouraging results in tal derived from site-directed mutagenesis, with this direction have been obtained by Friesner and average root mean square deviations (RMSD) coworkers, who demonstrated the possibility from the atomic coordinates of the experimental of reconstructing through de novo modeling structures ranging from 1.7 Å (for the “2 adren- loops previously expunged from GPCR structures ergic receptor) to 2.8 Å (for the adenosine A2A (Goldfeld et al. 2011, 2012). receptor) for the ligands and 2.7 Å (for the “2 A second blind assessment intended to probe adrenergic receptor) to 3.4 Å (for the adenosine the status of GPCR modeling and docking was A2A receptor) for the residues surrounding the organized in 2010, in coordination with the binding pocket. When the models are built for the experimental elucidation of the structures of study of receptor-ligand interactions of for drug- the dopamine D3 receptor and the chemokine discovery purposes, it is of particular importance CXCR4 receptor through X-ray crystallography 10 S. Costanzi and K. Wang

(Chien et al. 2010; Kufareva et al. 2011; Shoichet and coworkers revealed that a homology Wu et al. 2010). This exercise highlighted model of the dopamine D3 receptor constructed that the accuracy of the predictions depend on the basis of a close homologue, namely the “2 dramatically on the similarity of the target adrenergic receptor, was as effective in a prospec- receptor with the templates. In particular, while tive virtual screening as the crystal structure of the structures of receptors that share a substantial the same receptor (Carlsson et al. 2011). Con- sequence similarity with the template used for versely, a model of the CXCR4 receptor built on their modeling are generally endowed with the basis of a more distant homologue performed a significant level of accuracy, predicting the poorly in comparison to the crystal structure of structures of receptors that are more distant the same receptor (Mysinger et al. 2012). from the available templates is a particularly difficult problem, especially in the absence of experimentally derived constraints (Kufareva 1.5 Summary and Conclusions et al. 2011). The ability of experimental structures and ho- Due to the pharmaceutical interest that revolves mology models of GPCRs to serve as a platform around them, G protein-coupled receptors for virtual screening campaigns intended to iden- (GPCRs) have always been the object of intense tify active compounds out of large databases of efforts toward their structural determination. molecules has been probed through a number of However, until relatively recently their structures controlled experiments that measured the ability proved to be elusive. In this context, rhodopsin of these structures to retrieve a pool of known has been for years the only GPCR with an ligands dispersed within a larger set of inactive experimentally solved structure and has been compounds, a non-comprehensive list of which heavily used to construct models of the other is provided in the reference (Cavasotto 2011; members of the superfamily through a technique Cavasotto et al. 2008; Katritch et al. 2010; Phatak known as homology modeling. Thanks to a et al. 2010; Vilar et al. 2010, 2011)Ðfora number of technical and scientific advancements, review of the application of virtual screening to now the solution of GPCR structures through X- GPCRs, see Chapter 18 of the book “G Protein- ray crystallography and, more recently, NMR is Coupled Receptors: From Structure to Function” more feasible. Thus, structures of several GPCRs published by the Royal Society of Chemistry have been recently disclosed. These structures, (Costanzi 2011). These studies highlighted that besides providing direct structural information accurate models are indeed capable of priori- on the receptor in question, also provide a tizing binders versus non-binders, albeit not as larger basis of templates for the construction well as crystal structures. Not surprisingly, the of more accurate models of those receptors that models tended to perform according to their level have not yet been experimentally solved yet. of structural accuracy. For instance, a study con- The availability of multiple templates has also ducted by Katrich and coworkers on the basis offered the possibility of performing accurate of the A2A models submitted to the blind as- studies intended to probe the strengths and the sessment revealed very good performance levels limits of homology modeling and molecular for the three most structurally accurate models docking applied to homology models. Moreover, (Costanzi, Katritch/Abagyan and Lam/Abagyan) controlled virtual screening experiments have (Katritch et al. 2010). Conversely, less accurate been conducted to investigate the extent of the models performed very poorly. applicability of homology models to molecular Not surprisingly, virtual screening campaigns recognition campaigns and to compare it to based on homology models seem to perform that shown by experimental structures. The better when the models are built on the ba- current pace of structural biology and the sis of closely rather than distantly related tem- rapid advancements of plates. Accordingly, a recent study conducted by suggest a future landscape populated by 1 The GPCR Crystallography Boom: Providing an Invaluable Source of Structural Information... 11 increasingly more numerous experimental and Costanzi S (2010) Modeling G protein-coupled receptors: theoretical GPCR structures. The former and the a concrete possibility. Chim Oggi 28:26Ð31 latter, especially when supported by experimental Costanzi S (2011) Chapter 18. Structure-based virtual screening for ligands of G protein-coupled receptors. data and built on the basis of closely related In: Giraldo J, Pin J-P (eds) G protein-coupled recep- templates, will provide a solid basis for the tors: from structure to function. The Royal Society of rational discovery of novel modulators of GPCR Chemistry, London and Cambridge, pp 359Ð374 activity. Costanzi S (2012) Homology modeling of class a G protein-coupled receptors. Methods Mol Biol 857:259Ð279 Costanzi S, Siegel J, Tikhonova I, Jacobson K (2009) References Rhodopsin and the others: a historical perspective on structural studies of G protein-coupled receptors. Curr Pharm Des 15:3994Ð4002 Audet M, Bouvier M (2012) Restructuring G-protein- Day P, Rasmussen S, Parnot C, Fung J, Masood A, coupled receptor activation. Cell 151:14Ð23 Kobilka T, Yao X, Choi H, Weis W, Rohrer D et Baldwin J (1993) The probable arrangement of the helices al (2007) A monoclonal antibody for G protein- in G protein-coupled receptors. EMBO J 12:1693Ð coupled receptor crystallography. Nat Methods 4: 1703 927Ð929 Ballesteros JA, Weinstein H (1992) Analysis and refine- Dixon R, Kobilka B, Strader D, Benovic J, Dohlman H, ment of criteria for predicting the structure and relative Frielle T, Bolanowski M, Bennett C, Rands E, Diehl orientations of transmembranal helical domains. Bio- R et al (1986) Cloning of the gene and cDNA for phys J 62:107Ð109 mammalian beta-adrenergic receptor and homology Carlsson J, Coleman RG, Setola V, Irwin JJ, Fan H, with rhodopsin. Nature 321:75Ð79 Schlessinger A, Sali A, Roth BL, Shoichet BK (2011) Dohlman H, Bouvier M, Benovic J, Caron M, Lefkowitz Ligand discovery from a dopamine D(3) receptor ho- R (1987) The multiple membrane spanning topogra- mology model and crystal structure. Nat Chem Biol phy of the beta 2-adrenergic receptor. Localization 7:769Ð778 of the sites of binding, glycosylation, and regulatory Cavasotto CN (2011) Homology models in docking phosphorylation by limited proteolysis. J Biol Chem and high-throughput docking. Curr Top Med Chem 262:14282Ð14288 11:1528Ð1534 Donati RJ, Rasenick MM (2003) G protein signaling and Cavasotto CN, Orry AJ, Murgolo NJ, Czarniecki MF, the molecular basis of antidepressant action. Life Sci Kocsi SA, Hawes BE, O’Neill KA, Hine H, Burton 73:1Ð17 MS, Voigt JH et al (2008) Discovery of novel chemo- Fredriksson R, Lagerström M, Lundin L, Schiöth H types to a G-protein-coupled receptor through ligand- (2003) The G-protein-coupled receptors in the human steered homology modeling and structure-based vir- genome form five main families. Phylogenetic analy- tual screening. J Med Chem 51:581Ð588 sis, paralogon groups, and fingerprints. Mol Pharmacol Cherezov V, Rosenbaum D, Hanson M, Rasmussen S, 63:1256Ð1272 Thian F, Kobilka T, Choi H, Kuhn P, Weis W, Kobilka Gloriam D, Fredriksson R, Schiöth H (2007) The G B et al (2007) High-resolution crystal structure of an protein-coupled receptor subset of the rat genome. engineered human beta2-adrenergic G protein-coupled BMC Genomics 8:338 receptor. Science 318:1258Ð1265 Goldfeld DA, Zhu K, Beuming T, Friesner RA (2011) ChienEY,LiuW,ZhaoQ,KatritchV,HanGW,Hanson Successful prediction of the intra- and extracellular MA, Shi L, Newman AH, Javitch JA, Cherezov V et al loops of four G-protein-coupled receptors. Proc Natl (2010) Structure of the human dopamine D3 receptor Acad Sci U S A 108:8275Ð8280 in complex with a D2/D3 selective antagonist. Science Goldfeld DA, Zhu K, Beuming T, Friesner RA (2012) 330:1091Ð1095 Loop prediction for a GPCR homology model: algo- Choe HW, Kim YJ, Park JH, Morizumi T, Pai EF, Krauss rithms and results. Proteins 81:214Ð228 N, Hofmann KP, Scheerer P, Ernst OP (2011) Crystal Granier S, Manglik A, Kruse AC, Kobilka TS, Thian FS, structure of metarhodopsin II. Nature 471:651Ð655 Weis WI, Kobilka BK (2012) Structure of the delta- Congreve M, Marshall F (2009) The impact of GPCR bound to naltrindole. Nature 485:400Ð structures on and structure-based drug 404 design. Br J Pharmacol 159:986Ð996 Hanson MA, Stevens RC (2009) Discovery of new GPCR Congreve M, Langmead CJ, Mason JS, Marshall FH biology: one receptor structure at a time. Structure (2011) Progress in structure based for G 17:8Ð14 protein-coupled receptors. J Med Chem 54:4283Ð4311 Hargrave PA (2001) Rhodopsin structure, function, and to- Costanzi S (2008) On the applicability of GPCR ho- pography the Friedenwald lecture. Invest Ophthalmol mology models to computer-aided drug discovery: a Vis Sci 42:3Ð9 comparison between in silico and crystal structures Hargrave P, McDowell J, Curtis D, Wang J, Juszczak E, of the beta2-adrenergic receptor. J Med Chem 51: Fong S, Rao J, Argos P (1983) The structure of bovine 2907Ð2914 rhodopsin. Biophys Struct Mech 9:235Ð244 12 S. Costanzi and K. Wang

Henderson R, Unwin PN (1975) Three-dimensional Okada T, Palczewski K (2001) Crystal structure of model of purple membrane obtained by electron mi- rhodopsin: implications for vision and beyond. Curr croscopy. Nature 257:28Ð32 Opin Struct Biol 11:420Ð426 IJzerman A, Van Galen P, Jacobson K (1992) Molecular Oliveira L, Paiva ACM, Vriend G (1993) A common modeling of adenosine receptors. I. The ligand binding motif in G-protein-coupled 7 transmembrane helix site on the A1 receptor. Drug Des Discov 9:49Ð67 receptors. J Comput Aided Mol Des 7:649Ð658 IJzerman AP, van der Wenden EM, Van Galen PJ, Ja- Ovchinnikov Y, Abdulaev N, Feigina M, Artamonov I, cobson KA (1994) Molecular modeling of adenosine Zolotarev A, Kostina M, Bogachuk A, Miroshnikov A, receptors. The ligand binding site on the rat adenosine Martinov V, Kudelin A (1982) The complete amino- A2A receptor. Eur J Pharmacol 268:95Ð104 acid-sequence of visual rhodopsin. Bioorganicheskaya Jaakola V, Griffith M, Hanson M, Cherezov V, Chien Khimiya 8:1011Ð1014 E, Lane J, Ijzerman A, Stevens R (2008) The 2.6 Overington JP, Al-Lazikani B, Hopkins AL (2006) How angstrom crystal structure of a human A2A adeno- many drug targets are there? Nat Rev Drug Discov sine receptor bound to an antagonist. Science 322: 5:993Ð996 1211Ð1217 Palczewski K, Kumasaka T, Hori T, Behnke C, Motoshima Jacobson KA, Costanzi S (2012) New insights for drug H, Fox B, Le Trong I, Teller D, Okada T, Stenkamp R design from the x-ray crystallographic structures et al (2000) Crystal structure of rhodopsin: a G protein- of g-protein-coupled receptors. Mol Pharmacol 82: coupled receptor. Science 289:739Ð745 361Ð371 Pardo L, Ballesteros JA, Osman R, Weinstein H (1992) Jacobson KA, Deflorian F, Mishra S, Costanzi S (2011) On the use of the transmembrane domain of bacte- Pharmacochemistry of the platelet purinergic recep- riorhodopsin as a template for modeling the three- tors. Purinergic Signal 7:305Ð324 dimensional structure of guanine nucleotide-binding Katritch V, Rueda M, Lam P, Yeager M, Abagyan R regulatory protein-coupled receptors. Proc Natl Acad (2010) GPCR 3D homology models for ligand screen- Sci U S A 89:4009Ð4012 ing: lessons learned from blind predictions of adeno- Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst OP sine A2a receptor complex. Proteins 78:197Ð211 (2008) Crystal structure of the ligand-free G-protein- Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan coupled receptor . Nature 454:183Ð187 R (2011) Status of GPCR modeling and docking as Park SH, Das BB, Casagrande F, Tian Y, Nothnagel reflected by community-wide GPCR Dock 2010 as- HJ, Chu M, Kiefer H, Maier K, De Angelis AA, sessment. Structure 19:1108Ð1126 Marassi FM et al (2012) Structure of the chemokine Lefkowitz RJ (2007a) Seven transmembrane receptors: receptor CXCR1 in phospholipid bilayers. Nature 491: a brief personal retrospective. Biochim Biophys Acta 779Ð783 1768:748Ð755 Phatak SS, Gatica EA, Cavasotto CN (2010) Ligand- Lefkowitz RJ (2007b) Seven transmembrane receptors: steered modeling and docking: a benchmarking study something old, something new. Acta Physiol (Oxf) in class a G-protein-coupled receptors. J Chem Inf 190:9Ð19 Model 50:2119Ð2128 Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen Pierce K, Premont R, Lefkowitz R (2002) Seven- JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, transmembrane receptors. Nat Rev Mol Cell Biol Granier S (2012) Crystal structure of the mu-opioid 3:639Ð650 receptor bound to a morphinan antagonist. Nature Rasmussen S, Choi H, Rosenbaum D, Kobilka T, Thian F, 485:321Ð326 Edwards P, Burghammer M, Ratnala V, Sanishvili R, Mason JS, Bortolato A, Congreve M, Marshall FH (2012) Fischetti R et al (2007) Crystal structure of the human New insights from structural biology into the drugga- beta2 adrenergic G-protein-coupled receptor. Nature bility of G protein-coupled receptors. Trends Pharma- 450:383Ð387 col Sci 33:249Ð260 Rasmussen SG, Choi HJ, Fung JJ, Pardon E, Casarosa Michino M, Abola E, 2008 Participants G, Brooks Cr, P, Chae PS, Devree BT, Rosenbaum DM, Thian FS, Dixon J, Moult J, Stevens R (2009) Community-wide Kobilka TS et al (2011a) Structure of a nanobody- assessment of GPCR structure modelling and ligand stabilized active state of the beta(2) adrenoceptor. docking: GPCR Dock 2008. Nat Rev Drug Discov Nature 469:175Ð180 8:455Ð463 Rasmussen SG, DeVree BT, Zou Y, Kruse AC, Chung KY, Mobarec J, Sanchez R, Filizola M (2009) Modern homol- Kobilka TS, Thian FS, Chae PS, Pardon E, Calinski D ogy modeling of G-protein coupled receptors: which et al (2011b) Crystal structure of the beta2 adrenergic structural template to use? J Med Chem 52:5207Ð5216 receptor-Gs protein complex. Nature 477:549Ð555 Mysinger MM, Weiss DR, Ziarek JJ, Gravel S, Doak Rosenbaum D, Cherezov V, Hanson M, Rasmussen S, AK, Karpiak J, Heveker N, Shoichet BK, Volk- Thian F, Kobilka T, Choi H, Yao X, Weis W, Stevens R man BF (2012) Structure-based ligand discovery for et al (2007) GPCR engineering yields high-resolution the protein-protein interface of structural insights into beta2-adrenergic receptor func- CXCR4. Proc Natl Acad Sci U S A 109:5517Ð5522 tion. Science 318:1266Ð1273 1 The GPCR Crystallography Boom: Providing an Invaluable Source of Structural Information... 13

Rosskopf D, Michel MC (2008) Pharmacogenomics of for ligands of G protein-coupled receptors: Not only G protein-coupled receptor ligands in cardiovascular crystal structures but also in silico models. J Mol medicine. Pharmacol Rev 60:513Ð535 Graph Model. doi:10.1016/j.jmgm.2010.1011.1005 Salon JA, Lodowski DT, Palczewski K (2011) The signif- Vilar S, Karpiak J, Berk B, Costanzi S (2011) In silico icance of G protein-coupled receptor crystallography analysis of the binding of agonists and blockers to for drug discovery. Pharmacol Rev 63:901Ð937 the beta2-adrenergic receptor. J Mol Graph Model Scheerer P, Park JH, Hildebrand PW, Kim YJ, Krauss 29:809Ð817 N, Choe HW, Hofmann KP, Ernst OP (2008) Crystal White JF, Noinaj N, Shibata Y, Love J, Kloss B, Xu F, structure of opsin in its G-protein-interacting confor- Gvozdenovic-Jeremic J, Shah P, Shiloach J, Tate CG mation. Nature 455:497Ð502 et al (2012) Structure of the agonist-bound neurotensin Schertler G, Villa C, Henderson R (1993) Projection receptor. Nature 490:508Ð513 structure of rhodopsin. Nature 362:770Ð772 Worth C, Kleinau G, Krause G (2009) Comparative se- Spetea M, Schmidhammer H (2012) Recent advances in quence and structural analyses of G-protein-coupled the development of 14-alkoxy substituted morphinans receptor crystal structures and implications for molec- as potent and safer opioid analgesics. Curr Med Chem ular models. PLoS One 4:e7011 19:2442Ð2457 Wu B, Chien EY, Mol CD, Fenalti G, Liu W, Katritch Standfuss J, Edwards PC, D’Antona A, Fransen M, Xie G, V, Abagyan R, Brooun A, Wells P, Bi FC et al Oprian DD, Schertler GF (2011) The structural basis (2010) Structures of the CXCR4 chemokine GPCR of agonist-induced activation in constitutively active with small-molecule and cyclic peptide antagonists. rhodopsin. Nature 471:656Ð660 Science 330:1066Ð1071 Steyaert J, Kobilka BK (2011) Nanobody stabilization Wu H, Wacker D, Mileni M, Katritch V, Han GW, Vardy of G protein-coupled receptor conformational states. E, Liu W, Thompson AA, Huang XP, Carroll FI et al Curr Opin Struct Biol 21:567Ð572 (2012) Structure of the human kappa-opioid receptor Tate CG (2012) A crystal clear solution for determining G- in complex with JDTic. Nature 485:327Ð332 protein-coupled receptor structures. Trends Biochem Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao Sci 37:343Ð352 ZG, Cherezov V, Stevens RC (2011) Structure of an Thompson AA, Liu W, Chun E, Katritch V, Wu H, agonist-bound human A2A . Sci- Vardy E, Huang XP, Trapella C, Guerrini R, Calo G ence 332:322Ð327 et al (2012) Structure of the nociceptin/orphanin FQ Zhang C, Srinivasan Y, Arlow DH, Fung JJ, Palmer receptor in complex with a peptide mimetic. Nature D, Zheng Y, Green HF, Pandey A, Dror RO, Shaw 485:395Ð399 DE et al (2012) High-resolution crystal structure Vilar S, Ferino G, Phatak SS, Berk B, Cavasotto CN, of human protease-activated receptor 1. Nature 492: Costanzi S (2010) Docking-based virtual screening 387Ð392 Modeling of G Protein-Coupled Receptors Using Crystal Structures: 2 From Monomers to Signaling Complexes

Angel Gonzalez, Arnau Cordomí, Minos Matsoukas, Julian Zachmann, and Leonardo Pardo

Abstract G proteinÐcoupled receptors constitute a large and functionally diverse family of transmembrane proteins. They are fundamental in the transfer of extracellular stimuli to intracellular signaling pathways and are among the most targeted proteins in drug discovery. Recent advances in crys- tallization methods have permitted to resolve the molecular structure of several members of the family. This chapter focuses on the impact of these structures in the use of homology modeling techniques for building three- dimensional models of homologous G proteinÐcoupled receptors, higher order oligomers, and their complexes with ligands and signaling proteins.

Keywords Transmembrane helices ¥ Homology modeling ¥ Conformation ¥ Ligand binding ¥ Activation mechanism

organism, including fungi and plants. They 2.1 Introduction are highly diversified in mammalian genomes with current estimates of about 1,000 Membrane receptors coupled to guanine (2Ð3 % of the human proteome) (Fredriksson and nucleotide-binding proteins (commonly known Schioth 2005). GPCRs transduce sensory signals as G protein-coupled receptors, GPCRs) of external origin such as odors, , comprise one of the widest and most adaptable or tastes; and endogenous signals such as families of cellular sensors, as they are able neurotransmitters, (neuro)peptides, proteases, to mediate a wide range of transmembrane glycoprotein hormones, purine ligands and signal transduction processes (Kristiansen 2004). ions, among others. The response is operated GPCRs are present in almost every eukaryotic through second messenger cascades controlled by different heterotrimeric guanine nucleotide- A. Gonzalez ¥ A. Cordomí ¥ M. Matsoukas ¥ J. Zachmann binding proteins (G-proteins) coupled at their ¥L.Pardo() intracellular regions (Oldham and Hamm 2008). Laboratori de Medicina Computacional, Unitat de Due to their relevance to cellular physiology Bioestadística, Facultat de Medicina, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain (Smit et al. 2007) and their accessibility from the e-mail: [email protected] extracellular environment, membrane proteins

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 15 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__2, © Springer ScienceCBusiness Media Dordrecht 2014 16 A. Gonzalez et al. represent a significant portion of therapeutic drug set of membrane proteins maintains a strong targets (Arinaminpathy et al. 2009; Imming et al. conservation of the TM structure even at low 2006). sequence identity (<20 %) (Olivella et al. 2013). The GPCR family is not an exception. All crystal structures preserve analogous 2.2 The Structure of G secondary/tertiary structures at the seven-helical- Protein-Coupled Receptors bundle domain (Fig. 2.1) despite the percentage of sequence identity in the TM segments is very Significant advances in crystallization of GPCRs low (Mobarec et al. 2009; Gonzalez et al. 2012). (Day et al. 2007; Serrano-Vega et al. 2008)have Structure conservation in the GPCR family is permitted to elucidate the crystal structures of associated, in contrast to other proteins, to the many receptors (Table 2.1) (see (Katritch et al. presence of at least one highly conserved amino 2012, 2013) for recent reviews). All these struc- acid in each helix (Mirzadegan et al. 2003): N tures share the common architecture of seven in TM1 (present in 98 % of the sequences), D plasma membrane-spanning (or transmembrane) in TM2 (93 %), R in TM3 (95 %), W in TM4 domains (TMs, which also terms this family of (96%),PinTM5(76%),PinTM6(98%), proteins as 7TM receptors) connected to each and P in TM7 (93 %). This feature was used other with three extracellular (ECL) and three by Ballesteros and Weinstein (1995) to define intracellular loops (ICL), a disulphide bridge be- a general numbering scheme consisting of two tween ECL 2 and TM 3, and a cytoplasmic CÐ numbers: the first (1 through 7) corresponds to terminus containing an ’-helix (Hx8) parallel to the helix in which the amino acid of interest the cell membrane. In addition, GPCRs contain is located; the second indicates its position an extracellular N-terminal region (N-terminus) relative to the most conserved residue in the and an intracellular C-terminal tail (C-tail). helix, arbitrarily assigned to 50. Significantly, the position of these highly conserved amino acids in each helix is the same in the superimposition 2.3 Homology Modeling of G of the currently available crystal structures Protein-Coupled Receptors (Fig. 2.1). This finding validates the use of these amino acids as reference points in TM sequence Because of the limited high-resolution struc- alignments (instead of the common procedure tural information on GPCRs, computational of using substitution matrices and fast sequence techniques to predict their structure from the similarity search algorithms) (see red box in amino acid sequence are a valuable tool (Pieper Fig. 2.2), and in the construction of homology et al. 2013). Recently, de novo techniques using models of GPCRs with unknown structure (de la evolutionary constraints have been applied to Fuente et al. 2010; Blattermann et al. 2012). predict 3D structures of TM proteins (Hopf et al. 2012). However, homology models of proteins with unknown experimental structure can also 2.4 The Conformation be built from homologous proteins of known of Transmembrane Helices structure and similar sequence (templates). This in G Protein-Coupled method is based on the fact that in homologous Receptors proteins, structure is more conserved than sequence. Thus, in general, homologous proteins Figure 2.1 shows the superimposition of the with a sequence identity above 35 % have a TM domain of representative crystal structures. similar 3D structure (Krissinel and Henrick Clearly, the structure of the cytoplasmic part is 2004). Because membrane proteins contain highly conserved. This structural conservation only two types of folds in their TM domains, correlates with the fact that most conserved ’-helix bundles and “-barrels, a significant residues are clustered in the central and 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 17

Table 2.1 Crystal structures of G protein coupled receptors Receptor Ligand PDB Reference Rhodopsin Bovine Rhodopsin (bRho) 11-cis retinal 1F88, 1GZM Palczewski et al. (2000) and Li et al. (2004) Squid Rhodopsin (sRho) 11-cis retinal 2Z73 Murakami and Kouyama (2008) Opsin 3CAP Park et al. (2008) Opsin C transducin peptide 3DQB Scheerer et al. (2008) Constitutively active rhodopsin 2X72 Standfuss et al. (2011) Metarhodopsin II 11-trans retinal 3PXO Choe et al. (2011) Metarhodopsin II C transducin peptide 11-trans retinal 3PQR Choe et al. (2011) Biogenic amine receptors

“1-adrenergic (“1AR) Cyanopindolol 2VT4 Warne et al. (2008) “1AR Isoprenaline 2Y03 Warne et al. (2011) “1AR homo-oligomer 4GPO Huang et al. (2013) “2-adrenergic (“2AR) Carazolol 2RH1 Cherezov et al. (2007)and Rosenbaum et al. (2007)

“2AR C nanobody BI-167107 3POG Rasmussen et al. (2011a) “2AR C Gs BI-167107 3SN6 Rasmussen et al. (2011b) Dopamine D3 (D3R) Eticlopride 3PBL Chien et al. (2010)

Histamine H1 (H1R) Doxepin 3RZE Shimamura et al. (2011) Muscarinic M2 (M2R) 3-quinuclidinyl-benzilate 3UON Haga et al. (2012) Muscarinic M3 (M3R) Tiotropium 4DAJ Kruse et al. (2012) Serotonin 5HT1B (5HT1BR) Ergotamine 4IAR Wang et al. (2013a) Serotonin 5HT1B (5HT2BR) Ergotamine 4IB4 Wacker et al. (2013) Nucleotide

Adenosine A2A (A2AR) ZM241385 3EML Jaakola et al. (2008) A2AR UK-432097 3QAK Xu et al. (2011) C A2AR C Na ZM241385 4EIY Liu et al. (2012) Peptide receptors CXCR4 CVX15 3OE0 Wu et al. (2010) CXCR4 IT1t 3ODU Wu et al. (2010) -opioid (-OR) “-funaltrexamine 4DKL Manglik et al. (2012) ›-opioid (›-OR) JDTic 4DJH Wu et al. (2012) •-opioid (•-OR) Naltrindole 4EJ4 Granier et al. (2012) Nociceptin/orphanin FQ C-24 4EA3 Thompson et al. (2012) Neurotensin1 (NTSR1) Neurotensin (8Ð13) 4GRV White et al. (2012) Protease-activated receptor 1 (PAR1) Vorapaxar 3VW7 Zhang et al. (2012) Lipid

Sphingosine S1P (S1P1R) ML056 3V2Y Hanson et al. (2012) Frizzled (class F) (SMO) LY2940680 4JKV Wang et al. (2013b) intracellular regions of the receptor (Mirzadegan that GPCRs, during their evolution, have evolved et al. 2003). In contrast, there is a low degree of to adjust the structural characteristics of their sequence conservation among different GPCRs cognate ligands, by customizing a preserved at their extracellular domains. Accordingly, the scaffold (7TM receptors) through conformational structure of the extracellular part of TM helices plasticity (Deupi et al. 2007). We use this is more divergent. We have previously suggested term to describe the structural differences 18 A. Gonzalez et al.

Fig. 2.1 Comparison of the TM bundle of GPCRs. The structures of bRho (PDB code 1U19), “2R(2RH1),D3R(3PBL), H1R (3RZE), M2R, 5HT1bR(4IAR),A2AR (4EIY), CXCR4 (3ODU), OR (4DKL), NTSR1 (4GRV), PAR1 (3VW7), and S1P1R (3V2Y) are shown. The colour code of the helices is TMs 1 in white,2inyellow,3inred, 4ingrey,5ingreen,6in dark blue,and7inlight blue. The highly conserved N1.50 (in white), D2.50 (in yellow), R3.50 (in red), W4.50 (in grey), P5.50 (in green), P6.50 (in dark blue), and P7.50 (in light blue) are shown as spheres

Fig. 2.2 Sequence alignments of TMs 1–7 of GPCRs with known structures. The highly conserved amino acids in each helix, used as reference points in TM sequence alignments are boxed in red among different receptor subfamilies within the modulating the polytopic membrane protein extracellular side, near the binding site crevices, architecture (Riek et al. 2001). Deviations from responsible for recognition and selectivity of the regular ’-helical context have been associated diverse ligands. to prolines (Von Heijne 1991), glycines (Senes Moreover, comparison among the crystal et al. 2000), serines and threonines (Deupi et al. structures of GPCRs revealed backbone 2004, 2010), or to the insertion or deletions anomalies, in the form of kinks and bulges, (indels) of residues within the TMs (Deville in the majority of TM helices. These non- et al. 2009). Moreover, specific intra- and canonical elements are frequent in TM proteins, interhelical interactions involving polar side 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 19

Fig. 2.3 Comparison of individual TM helices in representative structures of the inactive state of GPCRs. The most conserved domain of the helices is superimposed. The crystal structures of GPCRs revealed backbone anomalies, in the form of kinks and bulges, in the majority of TM helices

chains, backbone carbonyls, disulphide bridges (Fig. 2.3). It appears shifted towards the central and, in some cases, structural water molecules axis of the receptor in Rho, A2AR, CXCR4, embedded in the TM bundle (Pardo et al. 2007) opioid receptors, NTSR1, and PAR1. The major also cause these distortions. Here we present a displacement of TM1 corresponds to CXCR4 due detailed analysis of these distortions and their to the formation of a disulphide bond between implication in modeling other GPCRs. Figure 2.3 the C28 in the N-terminal region and C2747.25 shows the superimposition of the more conserved in TM7 (Wu et al. 2010). In contrast, TM1 is part of individual TM helices in representative pointing outside of the bundle in biogenic amine structures of the inactive state of GPCRs. receptors. The highly conserved N1.50 (97 % in class A non-olfactory GPCRs) most probably influences the packing of the TM bundle (see 2.4.1 Transmembrane Helix 1 Fig. 2.3) since its N•2-H atoms act as hydrogen bond donors in the interaction with the backbone The extracellular region of TM1 displays a bend- carbonyl oxygen of residues at positions 1.46 and ing propensity in some of the crystal structures 7.46, linking TMs 1 and 7. Moreover, O•1 of 20 A. Gonzalez et al.

N1.50 interacts with the highly conserved D2.50 helix at the extracellular part (3.6 residues per (in 92 % of the sequences), via a conserved turn). This conformation of TM2 moves its extra- water molecule, linking TMs 1 and 2. Previous cellular part away from the TM bundle, relative to studies have shown that interactions involving the other structures, and modifies the orientation a polar Asn side chain provide a strong ther- of the side chains at the extracellular side. In modynamic driving force for membrane helix order to translate these structural observations association (Choma et al. 2000). into the sequence space, a two-residue gap in the sequences of S1P1R, CXCR4, opioid receptors and PAR1, or one-residue gap in the sequences 2.4.2 Transmembrane Helix 2 – of bRho, biogenic amine receptors and NTSR1, Extracellular Loop 1 relative to sRho, must be inserted (Gonzalez et al. 2012) (Fig. 2.2). The shape of TM2 at the extracellular part, which Importantly, the conserved Trp residue bends towards TM1 and leans away from TM3, in ECL1, part of the (W/F) (F/L)G motif is similar in all structures (Fig. 2.3); despite previously identified (Klco et al. 2006), points the amino acid sequence is strongly divergent toward the helical bundle, between TMs 2 and with, for instance, Pro residues at either po- 3, in the crystal structures with the exception of sition 2.58 (CXCR4, opioid receptors, PAR1), S1P1R (not shown). 2.59 (biogenic amine receptors, NTSR1) or 2.60 (sRho). The only exception is TM2 of A2AR, which contains Pro at position 2.59 but kinks 2.4.3 Transmembrane Helix 3 towards TM3 due to the Cys-bridge between ECL1 and ECL2 exclusive of this family (not TM3 is the longest and most tilted helix in the shown); and TM2 of S1P1R that lacks Pro in receptor structures. No major deviations among the helix (see below). Contrarily to S1P1R, the structures are observed with the exception of also Pro-lacking bRho and muscarinic receptors A2AR, due to the Cys-bridge between ECL1 and possess TM2 structurally similar to the other ECL2 exclusive of this family (see above). The Pro-containing receptors due to the presence of highly conserved C3.25 forms a disulphide bridge the GGxTT motif in bRho and N2.59 in mus- with a Cys residue located at various positions carinic receptors that hydrogen bonds the back- in ECL2. The cytoplasmic side of TM3 contains bone carbonyl at position 2.55 (Gonzalez et al. the highly conserved (D/E)R3.50(Y/W) motif in- 2012). Interestingly, the superimposition of struc- volved in receptor activation (see below). Impor- tures reveals that the highly D2.50 and the Pro tantly, the central location of TM3 within the residue located at position 2.58, 2.59 or 2.60 TM bundle allows the helix to interact with the are perfectly overlaid (Gonzalez et al. 2012). ligand at the extracellular part and with the G Thus, the backbone helical conformation of the protein at the intracellular part (Venkatakrishnan amino acids located between these two residues et al. 2013). must differ. In this region, TM2 of CXCR4, opioid receptors, and PAR1 adopts a 310 or tight turn (3.0 residues per turn), TM2 of biogenic 2.4.4 Transmembrane Helix 4 amine receptors and NTSR1 adopts a -bulge or wide turn (4.8 residues per turn), and TM2 of TM4, the shortest helix, is almost perpendicular sRho presents an extreme distortion (9 residues to the membrane. However, significant structural per turn) characterized by a cis P2.60 backbone divergences at the extracellular part of TM 4 conformation, which is stabilized by two water are found among structures, which may be re- molecules (Gonzalez et al. 2012). In contrast to lated to the structural requirements necessary these receptors, S1P1R contains a canonical ’- to accommodate the diverse ECL2 architectures 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 21

5.50 5.38 (see below). For instance, in contrast to TM4 relative to P )ofA2AR with F/Y of the of other biogenic amine receptors, TM4 of mus- P5.50-containing structures (i-12 relative to P5.50) carinic receptors bends towards outside of the and F/Y/W5.39 of the P5.50-lacking structures bundle, away from TM3, due to the hydrogen (Fig. 2.2). bond interactions between the side chain of Q4.65 and the backbone carbonyl oxygen at position 4.62. Significantly, the shape of TM4 at the ex- 2.4.6 Transmembrane Helix 6 tracellular part, in peptide receptors (in which ECL2 is formed by two “-strands, see below), TM6 presents the most pronounced kink in the bends towards TM3. In CXCR4, TM4 is longer TM bundle. This severe distortion is energetically and substantially deviate from the conformation stabilized through two structural and functional observed in other peptide receptors. elements. First, P6.50 of the highly conserved CWxP6.50(Y/F) motif introduces a flexible point in TM6 facilitating this extreme conformation. 2.4.5 Transmembrane Helix 5 Second, a structural water molecule located in a small cavity between TMs 6 and 7 help to main- P5.50 (conserved in 76 % of the rhodopsin-like tain the Pro induced distortion. This water acts as sequences) induces a local opening of TM5, at a hydrogen-bond acceptor in the interaction with the 5.43Ð5.48 turn (Pro-unwinding), in all crystal the backbone N-H amide at position 6.51, and as structures except S1P1R (see below), which has a hydrogen bond donor in the interactions with been proposed to be involved in the mechanism of the backbone carbonyl at position 6.47 and 7.38. ligand-induced receptor activation (Sansuk et al. Thus, in addition to stabilizing the kink of TM6, 2011; Rasmussen et al. 2011a). Thus, P5.50 trig- this water molecule links TMs 6 and 7. gers a -bulge or wide turn conformation (5 residues per turn). However, A2AR displays an extended opening of TM5 from positions 5.35Ð 2.4.7 Transmembrane Helix 7 5.48, in contrast to other P5.50-containing struc- tures in which the opening of the helix is re- TM7 start at different position among receptors. stricted to the 5.43Ð5.48 range of amino acids. TM7 in CXCR4 is two helical turns longer than Moreover, P5.50 is absent in melanocortin, gly- in other GPCRs. In this case, the longer TM7 coprotein hormone, lysosphingolipid, prostanoid, allows C7.25 to be placed at the tip of the helix and cannabinoid receptors. In these cases, in a favorable position to form a disulphide bond the similarly conserved Y5.58 (73 % of with Cys28 in the N-terminal region. TM7 is the sequences), functionally involved in the kinked at P7.50 of the highly conserved NPxxY stabilization of the active state of the receptor motif. This region of TM7, involved in key con- by interacting with R3.50 of the (D/E)RY motif formational changes associated with GPCR ac- in TM3, as revealed by the crystal structures tivation (Rosenbaum et al. 2009), is highly ir- of “2AR in complex with Gs (Rasmussen regular. A network of water molecules stabilizes et al. 2011b) and the ligand-free opsin (Park the helical deformation of TM7 and provides et al. 2008), is used as reference for sequence hydrogen-bonding partners to polar side chains. alignment of TM5 (Fig. 2.2). The absence of For instance, the unusual P7.50 deformation re- Pro in TM5 of S1P1R leads to a regular a- moves the intrahelical hydrogen bond between helical conformation (3.6 residues per turn). the carbonyl group and the N-H amide at po- Thus, the alignment of the S1P1R sequence to sitions 7.45 and 7.49, respectively. A conserved the other receptors requires two-residue gap water molecule is located between the backbone relative to A2AR and one-residue gap relative carbonyl at position 7.45 and the backbone N-H to all other structures, which overlays Y5.37 (i-13 amide at position 7.49. 22 A. Gonzalez et al.

covers the other half (Hanson et al. 2012). In 2.5 The Extracellular Surface these cases, retinal (Hildebrand et al. 2009; in Class A G Protein-Coupled Park et al. 2008) and sphingosine-1-phosphate Receptors (Hanson et al. 2012) may gain access to the binding pocket from the lipid bilayer (Martin- The extracellular surface of GPCRs is defined by Couce et al. 2012). In contrast, ECL2 in the conformation of the N-terminus region and biogenic amine receptors, adenosine and peptide ECLs1-3. Notably, the N-terminus and ECL2 in receptors adopt different spatial conformations particular are highly variable in sequence, length, that maintain the binding site rather accessible and structure (Peeters et al. 2011) (Fig. 2.4). from the extracellular environment (Fig. 2.4). In rhodopsin, the N-terminus (formed by two ECL2 of peptide receptors are formed by two “- “-strands) and ECL2 (two “-strands) block the strands, whereas a helical segment forms ECL2 access of the extracellular ligand to the core of of adrenergic receptors. This ’-helix between the receptor (Palczewski et al. 2000). Similarly, TM4 and the disulphide bridge is not conserved in S1P1R, the N-terminus (contains a short in the other members of the biogenic amine ’-helix) covers half the binding pocket and receptor family. Thus, each receptor subfamily ECL2 (formed by a family-specific disulphide has probably developed, during evolution, a bridge within ECL 2, but lacking the conserved specific N-terminus/ECL2 to adjust the structural disulphide bridge between TM3 and ECL 2) characteristics of its cognate ligands, and to

Fig. 2.4 Molecular surface of the extracellular domain in known crystal structures of GPCRs. The N-terminus domain is shown in red, ECL2 is shown in yellow, and the ligand in the binding site is shown as spheres 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 23 modulate the ligand binding/unbinding events (Hurst et al. 2010;Droretal.2011; Gonzalez 2.7 Intracellular Structural et al. 2011). Changes Associated with Activation of G Protein-Coupled Receptors 2.6 Ligand Binding to G Protein-Coupled Receptors The publication of the crystal structure of the ligand-free opsin (Park et al. 2008), which con- Analysis of the known crystal structures of tains several distinctive features of the active state GPCRs shows that ligand binding mostly occurs as it has been confirmed in the recent structure in a main cavity located between the extracellular of the “2-adrenergic receptor bound to Gs (Ras- segments of TMs 3, 5, 6, and 7 or in a minor mussen et al. 2011b), showed that during the binding cavity located between the extracellular process of receptor activation the intracellular segments of TMs 1, 2, 3, and 7 (Rosenkilde part of TM6 tilts outwards, TM5 nears TM6, and et al. 2010) (Fig. 2.5a). Despite these common R3.50 within the (D/E)RY motif in TM3 adopts pockets, different ligands penetrate to different an extended conformation pointing towards the depths within the TM bundle (Venkatakrishnan protein core, to interact with the highly conserved et al. 2013) (Fig. 2.5bÐf). A major issue in these Y5.58 in TM5 and Y7.53 of the (N/D)PxxY motif in common binding modes is the specificity of TM7 (Fig. 2.6). As shown in the original publica- ligands among subfamilies of receptors. tion of the opsin structure, these conformational

Fig. 2.5 Ligand binding to GPCRs.(a) Binding cav- (d) The binding of doxepin (white)toH1R(Shimamura ities in “2AR. (b) The binding of vorapaxar (white)to et al. 2011) and ergotamine (gray)to5HT1bR (Wang et al. PAR1 (Zhang et al. 2012), IT1t (gray) to CXCR4 (Wu 2013a). (e) The binding of ZM241385 (white)toA2AR et al. 2010), and morphinan (olive)to-OR (Manglik (Jaakola et al. 2008). (f) The binding of ML056 to S1P1R et al. 2012). (c) The binding of the CVX15 cyclic peptide (Hanson et al. 2012). The structures of retinal (orange (olive) to CXCR4 (Wu et al. 2010) and aminoacids 8Ð sticks)andC3.25 and W6.48 (green sticks) are shown in 13 of neurotensin (pink) to NTSR1 (White et al. 2012). panels B-F for comparison purposes 24 A. Gonzalez et al.

Fig. 2.6 Intracellular structural changes associated with receptor activation. Comparison of (a, c) the crystal structure of inactive rhodopsin (1GZM) with (b, d)the crystal structure of the ligand-free opsin (3CAP), which contains several distinctive features of the active state, in views parallel (c, d)and perpendicular (a, b)tothe membrane. Panel B shows the positions of TMs 3, 5Ð7 in rhodopsin (transparent cylinders) and opsin (opaque cylinders) for comparison purposes changes disrupt the ionic interaction between bulky hydrophobic side chain at position 3.40 R3.50 with negatively charged side chains at po- (Fig. 2.7a), highly conserved in the whole Class sitions 3.49 in TM3 and 6.30 in TM6 (Fig. 2.6a, A GPCR family (I:40 %, V:25 %, L:11 %). c) and facilitates the interaction between K5.66 Mutation of I3.40 to either Ala or Gly, i.e. re- in TM 5 and E6.30 in TM 6 (Fig. 2.6b, d). It moving the bulky side chain at this position, has been suggested that conserved hydrophobic abolishes the constitutive activity of the histamine amino acids in the environment of these key H1 receptor, the effect of constitutive-activity polar residues form hydrophobic cages, which increasing mutations, as well as the histamine- also restrain GPCRs in inactive conformations induced receptor activation (Sansuk et al. 2011). (Caltabiano et al. 2013). Thus, the inward movement of P5.50 upon ag- onist binding repositions I3.40 and F6.44, which contributes to a rotation and outward movement 2.8 Mechanism of of TM6 for receptor activation (Rasmussen et al. Ligand-Induced G 2011a). Protein-Coupled Receptor The structures of metarhodopsin II (Choe et al. Activation 2011), the constitutively active rhodopsin (Stand- fuss et al. 2011) and the A2A adenosine receptor The crystal structure of a nanobody-stabilized in complex with the agonist UK-432097 (Xu 6.48 active state of the “2-adrenergic receptor bound to et al. 2011) have shown that W moves toward the BI-167107 agonist (Rasmussen et al. 2011a) TM5 relative to the inactive structures (Fig. 2.7b), shows hydrogen bonding interactions with S5.42 facilitating the rotation and tilt of the intracellular and S5.46 (Fig. 2.7a). These interactions stabilize part of TM6. a receptor conformation that includes a 2.1 Å The role of the extracellular domain in re- inward movement of TM5 at position 5.46 and ceptor function still remains unclear. However, 5.50 1.4 Å inward movement of the conserved P NMR studies on the “2-adrenergic receptor have relative to the inactive, carazolol-bound structure shown ligand-specific conformational changes on (Rosenbaum et al. 2007). This key distortion is the extracellular domain (Bokoch et al. 2010). stabilized in the known crystal structures by a Similarly, it has recently been reported that a 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 25

Fig. 2.7 Mechanisms of ligand-induced receptor ac- carazolol-boundstructure. (b) The conformational change tivation.(a) Detailed view of the “2-adrenergic recep- of inactive 11-cis retinal (in white) to the active 11-trans tor bound to the full agonist BI-167107 (in orange). retinal (in orange) stabilizes a receptor conformation that The hydrogen bond interaction between full agonists includes an inward movement of TM5 together with a and S5.46 stabilizes a receptor conformation that in- movement of W6.48 toward TM5 relative to the inactive cludes an inward movement of TM5 relative to the structures (shown in white for comparison purposes) inactive (shown in white for comparison purposes), small cavity (vestibule) present at the entrance and a single G protein, is maximally activated of the ligand-binding cavity controls the extent by agonist binding to a single protomer. Inverse of receptor movement to govern a hierarchical agonist binding to the second protomer enhances order of G-protein coupling (Bock et al. 2012). signaling, whereas agonist binding to the second Finally, the N-terminal domain of melanocortin protomer blunts signaling. Moreover, binding of receptors plays a significant role in their consti- agonists or the G protein to “2- regulates receptor tutive, ligand-independent, activity (Ersoy et al. oligomerization (Fung et al. 2009). Cysteine 2012). cross-linking experiments have suggested that receptor oligomerization involves hydrophobic interactions via the surfaces of TMs1, 4, and/or 2.9 G Protein-Coupled Receptor 5 (Klco et al. 2003; Guo et al. 2005, 2008). Oligomerization Nevertheless, electrostatic interactions of the intracellular domains are key in the formation of GPCRs have been classically described as receptor heteromers (Navarro et al. 2010). monomeric TM receptors that form a ternary The recent release of the high-resolution crys- complex: a ligand, the GPCR, and its associated tal structures of OR (Manglik et al. 2012) and G protein. This is compatible with observations “1-AR (Huang et al. 2013) in the form of homo- that monomeric rhodopsin and “2-adrenergic oligomers (Fig. 2.8) facilitates the task of mod- receptor are capable of activating G proteins eling GPCR dimers and higher order oligomers. (Ernst et al. 2007; Whorton et al. 2007). The structure of OR shows receptor molecules Nevertheless, it is now well accepted that many associated into pairs through two different in- GPCRs have been observed to oligomerize in terfaces (Fig. 2.8a). The first interface is via cells (Pin et al. 2007; Ferre et al. 2009). It has TMs1 and 2 and Hx8, and the second interface been shown that receptor activation is modulated comprises TMs 5 and 6. The structure of “1-AR by allosteric communication between protomers contains a similar TMs1 and 2 and Hx8 interface of dopamine class A GPCR dimers (Han et al. but the other interface engages residues from 2009). The minimal signaling unit, two receptors TMs4 and 5 (Fig. 2.8b). 26 A. Gonzalez et al.

Fig. 2.8 GPCR oligomerization. The recent high- of homo-oligomers, and sRho (Murakami and Kouyama resolution crystal structures of (a) OR (Manglik et al. 2008)andH1R(Shimamuraetal.2011) in the form of 2012)and(b) “1-AR (Huang et al. 2013) in the form homo-dimers 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 27

Additional crystal structures with GPCR terminal ’5 helix of G’ binds to the intracellular dimers have been published. Interestingly, a TM1 cavity that is opened by the movement of interface, similar to the one observed for OR the cytoplasmic end of TM6 away from TM3 and “1-AR, is present in the structures of the ›OR and towards TM5 (see above). The C-terminal (Wu et al. 2012), opsin (Scheerer et al. 2008), and ’5 helix of the ’-subunit interacts with the metarhodopsin II (Choe et al. 2011). Moreover, extended conformation of R3.50, the short loop the TM4/5 interface of “1-AR resembles the connecting TM7 and Hx8, and the inner side of interface previously obtained for rhodopsin the cytoplasmic TMs 5 and 6 (Fig. 2.9). using atomic force microscopy (Fotiadis et al. 2003). The crystal structure of the (Shimamura et al. 2011) contains a TM4 2.11 The Binding of the C-Tail of G interface (Fig. 2.8d), which is different from the Protein-Coupled Receptors TM4/5 interface of “1-AR due to the absence to Arrestin of TM5 contacts. Similarly, the structures of CXCR4 (Wu et al. 2010) and squid rhodopsin Phosphorylation of several residues of the C-tail (Murakami and Kouyama 2008) contain a TM5 of GPCRs, by Ser/Thr kinases called G protein- interface (Fig. 2.8c), which are different from the coupled receptor kinases (GRKs), promotes the TM4/5 interface of “1-AR or the TM5/6 interface interactions between the receptor and arrestin, of OR. leading to receptor desensitization (Lefkowitz and Shenoy 2005). GPCRs can bind four different arrestin proteins: arrestin-1 and arrestin-4 (known 2.10 The Binding of G as visual arrestins) bind to the phosphorylated Protein-Coupled Receptors form of active rhodopsin, whereas arrestin-2 to the G Protein and arrestin-3 interact and regulate the activity of non-visual GPCRs (Gurevich and Gurevich The formation of the complex between the 2006). active conformation of the receptor and the Arrestin comprises two domains (N- and heterotrimeric G protein triggers GDP release C- domains) of antiparallel “-sheets connected from the G’-subunit, GTP binding to the G’- through a hinge region (Granzin et al. 1998) subunit and dissociation of the G“”-subunits (Fig. 2.10). The binding region for phosphory- (Chung et al. 2011), which finally leads to a lated ligand-activated receptor is located at the N- cascade of signals depending on the G-protein terminal domain, which is occupied by the long type. Noteworthy, more than 800 known GPCRs C-terminal tail in the basal state (blue peptide can bind 17 different G’ subunits, which have in Fig. 2.10a). The crystal structure of arrestin-2 been grouped into four different classes (G’s, in complex with a phosphorylated 29-aminoacid G’i,G’q and G’12)(Simonetal.1991). To carboxy-terminal peptide derived from the human date, the crystal structures of the ligand-free V2 (V2Rpp) (Shukla et al. opsin (Scheerer et al. 2008), metarhodopsin 2013) has recently released. This structure shows II (Choe et al. 2011) and the constitutively that the phosphorylated C-tail region of GPCRs active rhodopsin mutant E3.28Q (Standfuss (yellow peptide in Fig. 2.10a) displaces the C- et al. 2011) in complex with a peptide derived tail of arrestin. Moreover, an active conformation from the carboxy terminus of the ’-subunit of arrestin-1, mimicked by C-tail truncation, of the G protein transducin, together with the has also been published (Kim et al. 2013). structure of the “2-adrenergic receptor bound Both structures show significant conformational to Gs (Rasmussen et al. 2011b) have been changes relative to inactive, basal, arrestin. released. These structures have shown that the C- These include rotation of the N- and C-terminal 28 A. Gonzalez et al.

Fig. 2.9 G-protein binding.(a) Crystal structure of complex depicted in panel B.(b) Detailed view of the the “2-adrenergic receptor in complex with the Gs het- interaction between the C-terminal ’5 helix of the ’- erotrimer (’-subunit in olive, “-subunit in white,and”- subunit (in orange) with the short loop connecting TM7 subunit in gray). The C-terminal ’5 helix of the ’-subunit and Hx8 (light blue), TM3 (red), and the inner side of the is shown in orange.Therectangle shows the part of the cytoplasmic TMs 5 (green)and6(blue) domains relative to each other, and major of solved structures that include several members reorientations of the lariat, middle, and finger of the GPCR family (bound to either agonists, loops (Fig. 2.10b). antagonists, or inverse agonists), in the form of monomers or homo-oligomers, in complex with the G protein, or the C-tail bound to 2.12 Conclusions arrestin. Thus, the used of these structures as templates allows molecular modelers to GPCRs are disordered allosteric proteins that simulate the process of signal transduction exhibit modulator behavior with a number of through the cell membrane. These tailor-made guests in both the extracellular (ligand) and models can study ligand binding, receptor intracellular (G protein, arrestin) spaces (Kenakin specificity, receptor activation, G protein and Miller 2010). This considers GPCRs as coupling, allosteric communication among monomeric TM receptors. Nevertheless, it is protomers, among others. However, we want now well accepted that many GPCRs form to emphasize that homology modeling of GPCRs homo- and hetero-oligomers (Khelashvili et al. is far from being a routine technique. Clearly, 2010). Since 2007, innovative crystallographic the inclusion of experimental results can improve techniques (Venkatakrishnan et al. 2013)have the reliability of the models, and their predictive resulted in an exponential growth in the number character. 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 29

Fig. 2.10 The binding of the C-tail of GPCRs to ar- of inactive arrestin (blue peptide). (b) Detailed view of restin.(a) The active conformation of arrestin-2 (PDB the finger, middle and lariat loops, in the presumably id 4JQI, shown in orange) is superimposed to inactive active conformation of arrestin-2, which interact with the arrestin-2 (1G4M, in gray). The phosphorylated C-tail phosphorylated C-tail of GPCRs region of GPCRs (yellow peptide) displaces the C-tail

Ballesteros JA, Weinstein H (1995) Integrated methods References for the construction of three dimensional models and computational probing of structure-function relations Arinaminpathy Y, Khurana E, Engelman DM, Gerstein in G-protein coupled receptors. Methods Neurosci MB (2009) Computational analysis of membrane pro- 25:366Ð428 teins: the largest class of drug targets. Drug Discov Blattermann S, Peters L, Ottersbach PA, Bock A, Konya Today 14(23Ð24):1130Ð1135 V, Weaver CD, Gonzalez A, Schroder R, Tyagi R, 30 A. Gonzalez et al.

Luschnig P, Gab J, Hennen S, Ulven T, Pardo L, Mohr Deupi X, Dolker N, Lopez-Rodriguez M, Campillo M, K, Gutschow M, Heinemann A, Kostenis E (2012) A Ballesteros J, Pardo L (2007) Structural models of biased ligand for OXE-R uncouples G alpha and G class a G protein-coupled receptors as a tool for drug beta gamma signaling within a heterotrimer. Nat Chem design: insights on transmembrane bundle plasticity. Biol 8(7):631Ð638 Curr Top Med Chem 7(10):999Ð1006 Bock A, Merten N, Schrage R, Dallanoce C, Batz J, Deupi X, Olivella M, Sanz A, Dolker N, Campillo M, Klockner J, Schmitz J, Matera C, Simon K, Kebig A, Pardo L (2010) Influence of the g- conformation of Peters L, Muller A, Schrobang-Ley J, Trankle C, Hoff- Ser and Thr on the structure of transmembrane helices. mann C, De Amici M, Holzgrabe U, Kostenis E, Mohr J Struct Biol 169(1):116Ð123 K (2012) The allosteric vestibule of a seven transmem- Deville J, Rey J, Chabbert M (2009) An indel in trans- brane helical receptor controls G-protein coupling. Nat membrane helix 2 helps to trace the molecular evo- Commun 3:1044 lution of class A G-protein-coupled receptors. J Mol Bokoch MP, Zou Y, Rasmussen SG, Liu CW, Nygaard Evol 68(5):475Ð489 R, Rosenbaum DM, Fung JJ, Choi HJ, Thian FS, Dror RO, Pan AC, Arlow DH, Borhani DW, Maragakis P, Kobilka TS, Puglisi JD, Weis WI, Pardo L, Prosser Shan Y, Xu H, Shaw DE (2011) Pathway and mecha- RS, Mueller L, Kobilka BK (2010) Ligand-specific nism of drug binding to G-protein-coupled receptors. regulation of the extracellular surface of a G-protein- Proc Natl Acad Sci U S A 108(32):13118Ð13123 coupled receptor. Nature 463(7277):108Ð112 Ernst OP, Gramse V, Kolbe M, Hofmann KP, Heck Caltabiano G, Gonzalez A, Cordomi A, Campillo M, M (2007) Monomeric G protein-coupled receptor Pardo L (2013) The role of hydrophobic amino acids rhodopsin in solution activates its G protein transducin in the structure and function of the rhodopsin family at the diffusion limit. Proc Natl Acad Sci U S A of G protein-coupled receptors. Methods Enzymol 104(26):10859Ð10864 520:99Ð115 Ersoy BA, Pardo L, Zhang S, Thompson DA, Millhauser Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen G, Govaerts C, Vaisse C (2012) Mechanism of N- SG, Thian FS, Kobilka TS, Choi HJ, Kuhn P, terminal modulation of activity at the melanocortin-4 Weis WI, Kobilka BK, Stevens RC (2007) High- receptor GPCR. Nat Chem Biol 8(8):725Ð730 resolution crystal structure of an engineered human Ferre S, Baler R, Bouvier M, Caron MG, Devi LA, beta2-adrenergic G protein-coupled receptor. Science Durroux T, Fuxe K, George SR, Javitch JA, Lohse 318(5854):1258Ð1265 MJ, Mackie K, Milligan G, Pfleger KD, Pin JP, ChienEY,LiuW,ZhaoQ,KatritchV,HanGW,Hanson Volkow ND, Waldhoer M, Woods AS, Franco R (2009) MA, Shi L, Newman AH, Javitch JA, Cherezov V, Building a new conceptual framework for receptor Stevens RC (2010) Structure of the human dopamine heteromers. Nat Chem Biol 5(3):131Ð134 D3 receptor in complex with a D2/D3 selective antag- Fotiadis D, Liang Y, Filipek S, Saperstein DA, Engel onist. Science 330(6007):1091Ð1095 A, Palczewski K (2003) Atomic-force microscopy: Choe HW, Kim YJ, Park JH, Morizumi T, Pai EF, rhodopsin dimers in native disc membranes. Nature Krauss N, Hofmann KP, Scheerer P, Ernst OP 421(6919):127Ð128 (2011) Crystal structure of metarhodopsin II. Nature Fredriksson R, Schioth HB (2005) The repertoire of G- 471(7340):651Ð655 protein-coupled receptors in fully sequenced genomes. Choma C, Gratkowski H, Lear JD, DeGrado WF (2000) Mol Pharmacol 67(5):1414Ð1425 Asparagine-mediated self-association of a model Fung JJ, Deupi X, Pardo L, Yao XJ, Velez-Ruiz GA, transmembrane helix. Nat Struct Biol 7(2):161Ð166 Devree BT, Sunahara RK, Kobilka BK (2009) Ligand- Chung KY, Rasmussen SG, Liu T, Li S, DeVree BT, Chae regulated oligomerization of beta(2)-adrenoceptors in PS, Calinski D, Kobilka BK, Woods VL Jr, Sunahara a model lipid bilayer. EMBO J 28(21):2384Ð2392 RK (2011) Conformational changes in the G protein Gonzalez A, Perez-Acle T, Pardo L, Deupi X (2011) Gs induced by the beta2 adrenergic receptor. Nature Molecular basis of ligand dissociation in beta- 477(7366):611Ð615 adrenergic receptors. PLoS One 6(9):e23815 Day PW, Rasmussen SG, Parnot C, Fung JJ, Masood Gonzalez A, Cordomí A, Caltabiano G, Campillo A, Kobilka TS, Yao XJ, Choi HJ, Weis WI, Rohrer M, Pardo L (2012) Impact of helix irregulari- DK, Kobilka BK (2007) A monoclonal antibody for ties on sequence alignment and homology mod- G protein-coupled receptor crystallography. Nat Meth- elling of G protein-coupled receptors. Chembiochem ods 4(11):927Ð929 13(10):1393Ð1399 de la Fuente T, Martin-Fontecha M, Sallander J, Benhamu Granier S, Manglik A, Kruse AC, Kobilka TS, Thian B, Campillo M, Medina RA, Pellissier LP, Claeysen FS, Weis WI, Kobilka BK (2012) Structure of the S, Dumuis A, Pardo L, Lopez-Rodriguez ML (2010) delta-opioid receptor bound to naltrindole. Nature Benzimidazole derivatives as new serotonin 5-HT(6) 485(7398):400Ð404 receptor antagonists. Molecular mechanisms of recep- Granzin J, Wilden U, Choe HW, Labahn J, Krafft tor inactivation. J Med Chem 53(3):1357Ð1369 B, Buldt G (1998) X-ray crystal structure of ar- Deupi X, Olivella M, Govaerts C, Ballesteros JA, restin from bovine rod outer segments. Nature Campillo M, Pardo L (2004) Ser and Thr residues 391(6670):918Ð921 modulate the conformation of pro-kinked transmem- Guo W, Shi L, Filizola M, Weinstein H, Javitch JA brane alpha-helices. Biophys J 86(1):105Ð115 (2005) Crosstalk in G protein-coupled receptors: 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 31

changes at the transmembrane homodimer interface SR, Javitch JA, Lohse MJ, Milligan G, Neubig RR, determine activation. Proc Natl Acad Sci U S A Palczewski K, Parmentier M, Pin JP, Vriend G, Cam- 102:17495Ð17500 pagne F, Filizola M (2010) GPCR-OKB: the G protein Guo W, Urizar E, Kralikova M, Mobarec JC, Shi L, coupled receptor oligomer knowledge base. Bioinfor- Filizola M, Javitch JA (2008) Dopamine D2 receptors matics 26(14):1804Ð1805 form higher order oligomers at physiological expres- Kim YJ, Hofmann KP, Ernst OP, Scheerer P, Choe HW, sion levels. EMBO J 27(17):2293Ð2304 Sommer ME (2013) Crystal structure of pre-activated Gurevich VV, Gurevich EV (2006) The structural basis arrestin p44. Nature 497(7447):142Ð146 of arrestin-mediated regulation of G-protein-coupled Klco JM, Lassere TB, Baranski TJ (2003) C5a receptors. Pharmacol Ther 110(3):465Ð502 Receptor oligomerization. I. Disulfide trapping Haga K, Kruse AC, Asada H, Yurugi-Kobayashi T, Shi- reveals oligomers and potential contact surfaces roishi M, Zhang C, Weis WI, Okada T, Kobilka BK, in a G protein-coupled receptor. J Biol Chem Haga T, Kobayashi T (2012) Structure of the human 278(37):35345Ð35353 M2 muscarinic bound to an Klco JM, Nikiforovich GV, Baranski TJ (2006) antagonist. Nature 482(7386):547Ð551 Genetic analysis of the first and third extracellular Han Y, Moreira IS, Urizar E, Weinstein H, Javitch JA loops of the reveals an essential (2009) Allosteric communication between protomers WXFG motif in the first loop. J Biol Chem of dopamine class A GPCR dimers modulates activa- 281(17):12010Ð12019 tion. Nat Chem Biol 5(9):688Ð695 Krissinel E, Henrick K (2004) Secondary-structure match- Hanson MA, Roth CB, Jo E, Griffith MT, Scott FL, Rein- ing (SSM), a new tool for fast align- hart G, Desale H, Clemons B, Cahalan SM, Schuerer ment in three dimensions. Acta Crystallogr D Biol SC, Sanna MG, Han GW, Kuhn P, Rosen H, Stevens Crystallogr 60(Pt 12 Pt 1):2256Ð2268 RC (2012) Crystal structure of a lipid G protein- Kristiansen K (2004) Molecular mechanisms of ligand coupled receptor. Science 335(6070):851Ð855 binding, signaling, and regulation within the superfam- Hildebrand PW, Scheerer P, Park JH, Choe HW, Piechnick ily of G-protein-coupled receptors: molecular model- R, Ernst OP, Hofmann KP, Heck M (2009) A ligand ing and mutagenesis approaches to receptor structure channel through the G protein coupled receptor opsin. and function. Pharmacol Ther 103(1):21Ð80 PLoS One 4(2):e4382 Kruse AC, Hu J, Pan AC, Arlow DH, Rosenbaum DM, Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, Rosemond E, Green HF, Liu T, Chae PS, Dror RO, Marks DS (2012) Three-dimensional structures of Shaw DE, Weis WI, Wess J, Kobilka BK (2012) Struc- membrane proteins from genomic sequencing. Cell ture and dynamics of the M3 muscarinic acetylcholine 149(7):1607Ð1621 receptor. Nature 482(7386):552Ð556 Huang J, Chen S, Zhang JJ, Huang XY (2013) Crystal Lefkowitz RJ, Shenoy SK (2005) Transduction of receptor structure of oligomeric beta1-adrenergic G protein- signals by beta-arrestins. Science 308(5721):512Ð517 coupled receptors in ligand-free basal state. Nat Struct Li J, Edwards PC, Burghammer M, Villa C, Schertler Mol Biol 20(4):419Ð425 GF (2004) Structure of bovine rhodopsin in a trigonal Hurst DP, Grossfield A, Lynch DL, Feller S, Romo crystal form. J Mol Biol 343(5):1409Ð1438 TD, Gawrisch K, Pitman MC, Reggio PH (2010) Liu W, Chun E, Thompson AA, Chubukov P, Xu F, A lipid pathway for ligand binding is necessary for a Katritch V, Han GW, Roth CB, Heitman LH, IJzerman cannabinoid G protein-coupled receptor. J Biol Chem AP, Cherezov V, Stevens RC (2012) Structural basis 285(23):17954Ð17964 for allosteric regulation of GPCRs by sodium ions. Imming P, Sinning C, Meyer A (2006) Drugs, their targets Science 337(6091):232Ð236 and the nature and number of drug targets. Nat Rev Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen Drug Discov 5(10):821Ð834 JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, Jaakola VP, Griffith MT, Hanson MA, Cherezov V, Granier S (2012) Crystal structure of the micro-opioid Chien EY, Lane JR, Ijzerman AP, Stevens RC (2008) receptor bound to a morphinan antagonist. Nature The 2.6 angstrom crystal structure of a human A2A 485(7398):321Ð326 adenosine receptor bound to an antagonist. Science Martin-Couce L, Martin-Fontecha M, Palomares O, 322(5905):1211Ð1217 Mestre L, Cordomi A, Hernangomez M, Palma S, Katritch V, Cherezov V, Stevens RC (2012) Diversity and Pardo L, Guaza C, Lopez-Rodriguez ML, Ortega- modularity of G protein-coupled receptor structures. Gutierrez S (2012) Chemical probes for the recog- Trends Pharmacol Sci 33(1):17Ð27 nition of cannabinoid receptors in native systems. Katritch V, Cherezov V, Stevens RC (2013) Structure- Angewandte Chemie 51(28):6896Ð6899 function of the G protein-coupled receptor superfam- Mirzadegan T, Benko G, Filipek S, Palczewski K ily. Annu Rev Pharmacol Toxicol 53:531Ð556 (2003) Sequence analyses of G-protein-coupled Kenakin T, Miller LJ (2010) Seven transmembrane recep- receptors: similarities to rhodopsin. Biochemistry tors as shapeshifting proteins: the impact of allosteric 42(10):2759Ð2767 modulation and functional selectivity on new drug Mobarec JC, Sanchez R, Filizola M (2009) Modern discovery. Pharmacol Rev 62(2):265Ð304 homology modeling of G-protein coupled receptors: Khelashvili G, Dorff K, Shan J, Camacho-Artacho M, which structural template to use? J Med Chem Skrabanek L, Vroling B, Bouvier M, Devi LA, George 52(16):5207Ð5216 32 A. Gonzalez et al.

Murakami M, Kouyama T (2008) Crystal structure of Riek RP, Rigoutsos I, Novotny J, Graham RM (2001) squid rhodopsin. Nature 453(7193):363Ð367 Non-alpha-helical elements modulate polytopic mem- Navarro G, Ferre S, Cordomi A, Moreno E, Mallol J, brane protein architecture. J Mol Biol 306(2):349Ð362 Casado V, Cortes A, Hoffmann H, Ortiz J, Canela EI, Rosenbaum DM, Cherezov V, Hanson MA, Ras- Lluis C, Pardo L, Franco R, Woods AS (2010) Inter- mussen SG, Thian FS, Kobilka TS, Choi HJ, Yao actions between intracellular domains as key determi- XJ, Weis WI, Stevens RC, Kobilka BK (2007) nants of the quaternary structure and function of recep- GPCR engineering yields high-resolution structural in- tor heteromers. J Biol Chem 285(35):27346Ð27359 sights into beta2-adrenergic receptor function. Science Oldham WM, Hamm HE (2008) Heterotrimeric G protein 318(5854):1266Ð1273 activation by G-protein-coupled receptors. Nat Rev Rosenbaum DM, Rasmussen SG, Kobilka BK (2009) The Mol Cell Biol 9(1):60Ð71 structure and function of G-protein-coupled receptors. Olivella M, Gonzalez A, Pardo L, Deupi X (2013) Nature 459(7245):356Ð363 Relation between sequence and structure in mem- Rosenkilde MM, Benned-Jensen T, Frimurer TM, brane proteins. Bioinformatics 29(13):1589Ð1592. Schwartz TW (2010) The minor binding pocket: a doi:10.1093/bioinformatics/btt249 major player in 7TM receptor activation. Trends Phar- Palczewski K, Kumasaka T, Hori T, Behnke CA, Mo- macol Sci 31(12):567Ð574 toshima H, Fox BA, Trong IL, Teller DC, Okada Sansuk K, Deupi X, Torrecillas IR, Jongejan A, Nijmeijer T, Stenkamp RE, Yamamoto M, Miyano M (2000) S, Bakker RA, Pardo L, Leurs R (2011) A structural in- Crystal structure of rhodopsin: a G protein-coupled sight into the reorientation of transmembrane domains receptor. Science 289(5480):739Ð745 3 and 5 during family a G protein-coupled receptor Pardo L, Deupi X, Dolker N, Lopez-Rodriguez ML, activation. Mol Pharmacol 79(2):262Ð269 Campillo M (2007) The role of internal water Scheerer P, Park JH, Hildebrand PW, Kim YJ, Krauss molecules in the structure and function of the N, Choe HW, Hofmann KP, Ernst OP (2008) Crystal rhodopsin family of G protein-coupled receptors. structure of opsin in its G-protein-interacting confor- Chembiochem 8(1):19Ð24 mation. Nature 455(7212):497Ð502 Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst OP Senes A, Gerstein M, Engelman DM (2000) Statistical (2008) Crystal structure of the ligand-free G-protein- analysis of amino acid patterns in transmembrane coupled receptor opsin. Nature 454(7201):183Ð187 helices: the GxxxG motif occurs frequently and in Peeters MC, van Westen GJ, Li Q, Ijzerman AP (2011) association with “-branched residues at neighboring Importance of the extracellular loops in G protein- positions. J Mol Biol 296:921Ð936 coupled receptors for ligand recognition and receptor Serrano-Vega MJ, Magnani F, Shibata Y, Tate CG (2008) activation. Trends Pharmacol Sci 32(1):35Ð42 Conformational thermostabilization of the beta1- Pieper U, Schlessinger A, Kloppmann E, Chang GA, adrenergic receptor in a detergent-resistant form. Proc Chou JJ, Dumont ME, Fox BG, Fromme P, Hendrick- Natl Acad Sci U S A 105(3):877Ð882 son WA, Malkowski MG, Rees DC, Stokes DL, Stow- Shimamura T, Shiroishi M, Weyand S, Tsujimoto H, ell MH, Wiener MC, Rost B, Stroud RM, Stevens RC, Winter G, Katritch V, Abagyan R, Cherezov V, Liu W, Sali A (2013) Coordinating the impact of structural Han GW, Kobayashi T, Stevens RC, Iwata S (2011) genomics on the human alpha-helical transmembrane Structure of the human histamine H1 receptor complex proteome. Nat Struct Mol Biol 20(2):135Ð138 with doxepin. Nature 475(7354):65Ð70 Pin JP, Neubig R, Bouvier M, Devi L, Filizola M, Javitch Shukla AK, Manglik A, Kruse AC, Xiao K, Reis RI, JA, Lohse MJ, Milligan G, Palczewski K, Parmen- Tseng WC, Staus DP, Hilger D, Uysal S, Huang tier M, Spedding M (2007) International union of LY, Paduch M, Tripathi-Shukla P, Koide A, Koide basic and clinical pharmacology. LXVII. Recommen- S, Weis WI, Kossiakoff AA, Kobilka BK, Lefkowitz dations for the recognition and nomenclature of G RJ (2013) Structure of active beta-arrestin-1 bound to protein-coupled receptor heteromultimers. Pharmacol a G-protein-coupled receptor phosphopeptide. Nature Rev 59(1):5Ð13 497(7447):137Ð141 Rasmussen SG, Choi HJ, Fung JJ, Pardon E, Casarosa Simon MI, Strathmann MP, Gautam N (1991) Diver- P, Chae PS, Devree BT, Rosenbaum DM, Thian sity of G proteins in signal transduction. Science FS, Kobilka TS, Schnapp A, Konetzki I, Suna- 252(5007):802Ð808 hara RK, Gellman SH, Pautsch A, Steyaert J, Weis Smit MJ, Vischer HF, Bakker RA, Jongejan A, Tim- WI, Kobilka BK (2011a) Structure of a nanobody- merman H, Pardo L, Leurs R (2007) Pharma- stabilized active state of the beta(2) adrenoceptor. cogenomic and structural analysis of constitutive G Nature 469(7329):175Ð180 protein-coupled receptor activity. Annu Rev Pharma- Rasmussen SG, DeVree BT, Zou Y, Kruse AC, Chung KY, col Toxicol 47:53Ð87 Kobilka TS, Thian FS, Chae PS, Pardon E, Calinski Standfuss J, Edwards PC, D’Antona A, Fransen M, Xie G, D, Mathiesen JM, Shah ST, Lyons JA, Caffrey M, Oprian DD, Schertler GF (2011) The structural basis Gellman SH, Steyaert J, Skiniotis G, Weis WI, Suna- of agonist-induced activation in constitutively active hara RK, Kobilka BK (2011b) Crystal structure of the rhodopsin. Nature 471(7340):656Ð660 beta2 adrenergic receptor-Gs protein complex. Nature Thompson AA, Liu W, Chun E, Katritch V, Wu H, Vardy 477(7366):549Ð555 E, Huang XP, Trapella C, Guerrini R, Calo G, Roth 2 Modeling of G Protein-Coupled Receptors Using Crystal Structures... 33

BL, Cherezov V, Stevens RC (2012) Structure of the agonist action on a beta(1)-adrenergic receptor. Nature nociceptin/orphanin FQ receptor in complex with a 469(7329):241Ð244 peptide mimetic. Nature 485(7398):395Ð399 White JF, Noinaj N, Shibata Y, Love J, Kloss B, Xu F, Venkatakrishnan AJ, Deupi X, Lebon G, Tate CG, Gvozdenovic-Jeremic J, Shah P, Shiloach J, Tate CG, Schertler GF, Babu MM (2013) Molecular Grisshammer R (2012) Structure of the agonist-bound signatures of G-protein-coupled receptors. Nature neurotensin receptor. Nature 490(7421):508Ð513 494(7436):185Ð194 Whorton MR, Bokoch MP, Rasmussen SG, Huang B, Zare Von Heijne G (1991) Proline kinks in transmembrane RN, Kobilka B, Sunahara RK (2007) A monomeric alpha-helices. J Mol Biol 218(3):499Ð503 G protein-coupled receptor isolated in a high-density Wacker D, Wang C, Katritch V, Han GW, Huang XP, lipoprotein particle efficiently activates its G protein. Vardy E, McCorvy JD, Jiang Y, Chu M, Siu FY, Proc Natl Acad Sci U S A 104(18):7682Ð7687 Liu W, Xu HE, Cherezov V, Roth BL, Stevens RC Wu B, Chien EY, Mol CD, Fenalti G, Liu W, Katritch (2013) Structural features for functional selectivity at V, Abagyan R, Brooun A, Wells P, Bi FC, Hamel serotonin receptors. Science 340(6132):615Ð619 DJ, Kuhn P, Handel TM, Cherezov V, Stevens RC Wang C, Jiang Y, Ma J, Wu H, Wacker D, Katritch (2010) Structures of the CXCR4 chemokine GPCR V, Han GW, Liu W, Huang XP, Vardy E, McCorvy with small-molecule and cyclic peptide antagonists. JD, Gao X, Zhou EX, Melcher K, Zhang C, Bai Science 330(6007):1066Ð1071 F, Yang H, Yang L, Jiang H, Roth BL, Cherezov Wu H, Wacker D, Mileni M, Katritch V, Han GW, Vardy V, Stevens RC, Xu HE (2013a) Structural basis for E, Liu W, Thompson AA, Huang XP, Carroll FI, molecular recognition at serotonin receptors. Science Mascarella SW, Westkaemper RB, Mosier PD, Roth 340(6132):610Ð614 BL, Cherezov V, Stevens RC (2012) Structure of the Wang C, Wu H, Katritch V, Han GW, Huang XP, Liu W, human kappa-opioid receptor in complex with JDTic. Siu FY, Roth BL, Cherezov V, Stevens RC (2013b) Nature 485(7398):327Ð332 Structure of the human smoothened receptor bound to Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao an antitumour agent. Nature 497(7449):338Ð343 ZG, Cherezov V, Stevens RC (2011) Structure of an Warne T, Serrano-Vega MJ, Baker JG, Moukhametzianov agonist-bound human A2A adenosine receptor. Sci- R, Edwards PC, Henderson R, Leslie AG, Tate CG, ence 332(6027):322Ð327 Schertler GF (2008) Structure of a beta(1)-adrenergic Zhang C, Srinivasan Y, Arlow DH, Fung JJ, Palmer G-protein-coupled receptor. Nature 454:486Ð491 D, Zheng Y, Green HF, Pandey A, Dror RO, Shaw Warne T, Moukhametzianov R, Baker JG, Nehme R, DE, Weis WI, Coughlin SR, Kobilka BK (2012) Edwards PC, Leslie AG, Schertler GF, Tate CG High-resolution crystal structure of human protease- (2011) The structural basis for agonist and partial activated receptor 1. Nature 492(7429):387Ð392 Part II GPCRs in Motion: Insights from Simulations Structure and Dynamics of G-Protein Coupled Receptors 3

Nagarajan Vaidehi, Supriyo Bhattacharya, and Adrien B. Larsen

Abstract G-protein coupled receptors (GPCRs) are seven helical transmembrane proteins that mediate cell-to-cell communication. They also form the largest superfamily of drug targets. Hence detailed studies of the three dimensional structure and dynamics are critical to understanding the functional role of GPCRs in signal transduction pathways, and for drug design. In this chapter we compare the features of the crystal structures of various biogenic amine receptors, such as “1 and “2 adrenergic receptors, dopamine D3 receptor, M2 and M3 muscarinic acetylcholine receptors. This analysis revealed that conserved residues are located facing the inside of the transmembrane domain in these GPCRs improving the efficiency of packing of these structures. The NMR structure of the chemokine receptor CXCR1 without any ligand bound, shows significant dynamics of the transmembrane domain, especially the helical kink angle on the transmembrane helix6. The activation mechanism of the “2-adrenergic receptor has been studied using multiscale computational methods. The results of these studies showed that the receptor without any ligand bound, samples conformations that resemble some of the structural characteristics of the active state of the receptor. Ligand binding stabilizes some of the conformations already sampled by the apo receptor. This was later observed in the NMR study of the dynamics of human “2-adrenergic receptor. The dynamic nature of GPCRs leads to a challenge in obtaining purified receptors for biophysical studies. Deriving thermostable mutants of GPCRs has been a successful strategy to reduce the conformational heterogeneity and stabilize the receptors. This has lead to several crystal

N. Vaidehi () ¥ S. Bhattacharya ¥ A.B. Larsen Division of Immunology, Beckman Research Institute of the City of Hope, 1500, E. Duarte Road, Duarte, CA 91010, USA e-mail: [email protected]

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 37 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__3, © Springer ScienceCBusiness Media Dordrecht 2014 38 N. Vaidehi et al.

structures of GPCRs. However, the cause of how these mutations lead to thermostability is not clear. Computational studies are beginning to shed some insight into the possible structural basis for the thermostability. Molecular Dynamics simulations studying the conformational ensemble of thermostable mutants have shown that the stability could arise from both enthalpic and entropic factors. There are regions of high stress in the wild type GPCR that gets relieved upon mutation conferring thermostability.

Keywords Thermostable mutants ¥ Predictions ¥ Conformational states ¥ Drug design ¥ Biogenic amines ¥ Multiscale method ¥ LItiCon

Although the class A GPCRs have a common 3.1 Introduction seven helical transmembrane topology, their cel- lular functions are divergent. Further, the same 3.1.1 Introduction to Class A GPCRs GPCR can bind to different ligands and thereby elicit different responses depending on the nature G-protein coupled receptors (GPCRs), also of the ligand that binds to the GPCR (Audet and known as the seven transmembrane (TM) Bouvier 2012). They do so by varying the three receptors form a superfamily of membrane dimensional structural conformation of the recep- bound proteins that mediate cell signaling tor. Different ligands also elicit different types of and are responsible for many physiological response at a receptor level (Vaidehi and Kenakin functions of the cell. GPCRs belong to one of 2010). A full agonist elicits a maximal response the largest superfamily of receptors that serve while a partial agonist elicits only a fraction of the as important drug targets. GPCRs have been maximal response at saturating concentrations of broadly classified into three classes based on the the ligand. A very low efficacy ligand that may lengths of their amino termini. In this chapter not produce any measurable response but could we discuss only the class A GPCRs (denoted as prevent further response from a full or a partial just GPCRs hereafter) that have short, about agonist is known as neutral antagonist. Ligands 10Ð75 amino acids in their amino terminus. that reduce the basal response exhibited in a GPCRs have a structural topology consisting of ligand-independent fashion in many GPCRs are seven transmembrane (TM) helices connected by known as an inverse agonist. GPCRs exist in intracellular (ICL) and extracellular (ECL) loops. multiple “active” and “inactive” conformational Ligands ranging from small molecules to pep- states that are in equilibrium even in the absence tides and proteins activate GPCRs by binding ei- of any ligand (Niesen et al. 2011; Vaidehi and ther in the TM domain or in the extracellular loop Bhattacharya 2011). This leads to the ligand in- region. Ligand binding leads to conformational dependent basal activation of the receptor and changes in the receptor that results in coupling the level of the basal activation is different for with cytosolic proteins such as trimeric G-protein each GPCR. In the ligand independent state, the or “-arrestin to elicit cellular response (Reiter population of the receptor in the inactive state of et al. 2012). Understanding the mechanism of the receptor is still higher than the active state of how ligand binding in the extracellular region the receptor (Niesen et al. 2011). The “inactive of the receptor, causing an allosteric effect of state” is defined as a conformation of the recep- coupling to cytosolic proteins, triggering the sig- tor that does not activate the G-protein. Upon nal transduction cascade is an important step in binding of an agonist (full or partial) and the modulating various cellular processes initiated by G-protein, the GPCR undergoes conformational GPCRs. changes that lead to the “active state” of the 3 Structure and Dynamics of G-Protein Coupled Receptors 39 receptor (Rasmussen et al. 2011). Upon bind- receptors (Cherezov et al. 2007; Rasmussen et al. ing of an agonist or inverse agonist the relative 2011), muscarinic acetyl choline receptors M2 population of the active and inactive states shift and M3 (Haga et al. 2012; Kruse et al. 2012), depending on the efficacy of the ligand (Yao dopamine D3DR receptor (Chien et al. 2010), his- et al. 2006, 2009; Vaidehi and Kenakin 2010). tamine H1HR receptor (Shimamura et al. 2011), However binding of the agonist alone may not be adenosine A2A receptors (Lebon et al. 2011b, sufficient to activate the receptor, which is evident Xu et al. 2011) chemokine receptor CXCR4 (Wu from a few agonist bound crystal structures of et al. 2010), peptide receptor neurotensin receptor GPCRs (e.g. “2-adrenergic receptor bound to BI- 1 (White et al. 2012) and protease activated 167107, “1-adrenergic receptor bound to isopro- receptor 1 (Zhang et al. 2012), and opioid recep- terenol). These structures are very close to the tors (Granier and Kobilka 2012; Manglik et al. inactive conformations of various GPCRs. It has 2012; Thompson et al. 2012) have been crys- been suggested that binding of both the ago- tallized using these two strategies for stabilizing nist and G-protein are possibly required to shift specific conformations. With the publication of the conformational equilibrium of the receptor all these GPCR structures, one can understand to the fully active state. However, the agonist the structural differences between closely related bound adenosine receptor A2A shows active-like receptors as well as those that are far removed state movement (Lebon et al. 2011b). Exactly in sequence space. Additionally “2-adrenergic how do the receptor conformations change upon receptor has been crystallized both in the “in- binding of various ligands of different efficacies active” and “fully active” conformational states. and how the relative population of the various Adenosine A2A receptor has been crystallized conformational states are modulated by ligand in the inactive and partially active state (Lebon binding is a current topic in GPCR research et al. 2011b;Xuetal.2011). Many of these (Granier and Kobilka 2012). An understanding of GPCRs have also been crystallized with only the the structural and dynamic features of the various agonist bound. In the next section we compare active and inactive conformations of GPCRs, various GPCR crystal structures and analyze the and the molecular mechanism of the conforma- similarities and differences. tional changes from the inactive to the active states is important in designing functionally spe- cific drugs (Galandrin et al. 2007; Mailman and 3.1.2 Comparison of Various GPCR Murthy 2010; Kenakin et al. 2012). In fact this Structures dynamic flexibility of GPCRs poses one of the major bottlenecks in obtaining pure protein for GPCRs have a seven helical topology and the biophysical structural studies of GPCRs (Tate structural differences between two structures and Schertler 2009;Tate2012). These bottle- arise from the differences in the helical tilts necks have been overcome partially, resulting and kinks, helical rotational and translational in many recently published three dimensional orientations, packing of the helices, and the loop crystal structures of GPCRs. There are two major structures. Two GPCR structures are compared strategies that have been used to stabilize GPCRs by aligning the C’ atoms, and the root mean for crystallization: (1) Attachment of an antibody squared deviation (RMSD) in coordinates of a set (Steyaert and Kobilka 2011) or a protein such as of chosen atoms that is common between the two T4Lysozyme (denoted as T4L hereafter) to the compared structures is reported. RMSD is a crude ICL or ECL to facilitate crystallization (Chere- measure of the differences between structures zov et al. 2007; Katritch et al. 2012) and (2) and does not describe the nuances in structural screening for thermostable mutations that stabi- similarities and differences especially for two lize a particular receptor conformation to enable GPCR structures with divergent sequences. crystallization. In fact many GPCRs such as “1- Therefore we developed a computational adrenergic (Warne et al. 2008) and “2-adrenergic method (GPCRCompare) that provides structural 40 N. Vaidehi et al.

Fig. 3.1 Comparison of rigid body structural differences Helices are represented by cylinders; block arrows repre- (helical translation, tilt and axial rotation) of CXCR4 (pdb sent center of mass translations, curved arrows represent ID: 3ODU) and D3DR (pdb ID: 3PBL) crystal structures tilts and rotations. Figures are schematic, and not to actual to that of “2AR (pdb ID: 2RH1); (a) CXCR4; (b)D3DR; scale differences in terms of the relative helical (10ı), while TM6 shows a minor translation tilt, rotational and translational orientations (0.6 Å), tilt (5ı), and rotation (8ı) compared to (Bhattacharya et al. 2013). Such a comparison the “2-adrenergic receptor. provides a more intuitive understanding of the The crystal structures of several class A structural similarities and differences. Figure 3.1 GPCRs have been compared in previous pub- shows the comparison of two crystal structures lications (Katritch et al. 2012; Deupi 2012; Bhat- of similar receptors such as the dopamine D3DR tacharya et al. 2013; Filizola and Devi 2013). In receptor and “2 adrenergic receptor and also the next section we demonstrate the advantages of comparison of two different receptors such as the comparing GPCR structures using helical struc- chemokine receptor, CXCR4 and the dopamine tural parameters with GPCRCompare program receptor D3DR that have very different amino to compare the crystal structures of the inactive acid sequences. The structural comparisons here states of the biogenic amine receptors (“1 and “2 have been made based on relative translational, adrenergic receptors, muscarinic acetyl choline rotational and tilt orientations of the seven receptors M2 and M3, D3DR helices. Such an analysis provides insight into and H1HR) crystallized so far. the type of helical movements that occur when comparing GPCR structures. The structural differences in the helical movements between 3.1.3 Comparison of the Inactive CXCR4 and “2 adrenergic receptor are more States of the Biogenic Amine predominant as anticipated, compared to the Receptors differences between D3DR and “2 adrenergic receptor. Significant structural differences Here we have compared the adrenergic receptor between CXCR4 and “2 adrenergic receptor structures to the dopamine, histamine and are in the rotational and tilt orientations of muscarinic acetylcholine receptors. Since the TM2, TM4 and TM5 as shown in Fig. 3.1a. structures of “1 and “2-adrenergic receptors The difference in the tilt angle of TM1 is also are similar, we report a comparison of the significant, but this difference could stem from adrenergic receptor structures with those of the truncated amino terminus of each of these the other biogenic amine family receptors. The receptors. In D3DR, TM4 shows a small rotation major differences between the dopamine D3 3 Structure and Dynamics of G-Protein Coupled Receptors 41 receptor and the “1 and “2adrenergic receptors in all these receptors. TM3 has more conserved are tilts of TM4, TM5 and TM6 (10ı,9ı,8ı residues than any other TM helix. The extracel- respectively) and rotations of TM4 and TM5 (9ı lular loops show less conserved residues than and 10ı). The histamine H1 receptor shows a the intracellular loops. ICL1 is highly conserved 7ı tilt of TM4 and major rotation of TM6, and in all the receptors. We have omitted histamine TM7 (13ı,8ı). The muscarinic acetylcholine H1HR in this analysis since it had only 20 % se- M2 receptor shows a 1.2 Å translation of TM5, quence identity to other subtypes of the histamine tilts of TM4 and TM5 (7ı,6ı), and rotations receptors and skewed this analysis. of TM1, TM5, TM6 (11ı,6ı,13ı). The M3 muscarinic receptor shows significant differences in the rotational orientations of TM5 and TM6 3.1.4 Analysis of the NMR (9ı,13ı respectively). Thus in comparison to the Structures of the Chemokine “1 and “2adrenergic receptors, TM4, TM5 and Receptor CXCR1 TM6 are the most diverse in conformation among the biogenic amine family of receptors. In these We have also compared the extent of variability receptors, TM3, TM5 and TM6 contain a large and the dynamics observed in the recently pub- proportion of the residues that line the ligand lished three dimensional NMR structures of the binding pocket of the orthosteric ligands. While chemokine receptor CXCR1 (Park et al. 2012). TM3 consists of a conserved aspartate residue This structure is the first, class A GPCR structure that forms a salt bridge with the amine group to be solved without any ligand bound to it. of the endogenous biogenic amines, the confor- The NMR structures of CXCR1 in phospholipids mations of TM5 and TM6 give selectivity to the show significant dynamics. We have analyzed specific ligands each of these receptors binds. The the extent of the dynamics of these structures diverse conformation of TM4 could be related to using GPCRCompare program in this section. selectivity of dimerization with other GPCRs, The RMSD in coordinates among these ten NMR since TM4 has been shown to form the dimer models ranges from 1.8 to 2.2 Å. The ensemble interface in several GPCRs (Johnston et al. 2011). of NMR conformations is shown in Fig. 3.3. With the availability of representative struc- The RMSD in coordinates among these ten NMR tures of the biogenic amine receptors an in depth models range from 1.8 to 2.2 Å. The kink an- analysis of the location of the conserved residues gle on TM6 ranges from 21ı to 34ı showing and their role in structure packing would be significant dynamics in the ten NMR structures useful for structure prediction protocols (Bhat- as shown in Figs. 3.3 and 3.4a. To quantify the tacharya et al. 2008a, b, 2013; Bhattacharya and dynamic fluctuations of different regions of the Vaidehi 2010;Halletal.2009; Nedjai et al. receptor, we computed the average root mean 2011). We have analyzed the extent to which square fluctuations (RMSF) of the C’ coordinate the residues are conserved in each one of these of each residue from the average NMR structure. receptors within their subfamilies and compared Figure 3.4b shows a representative NMR struc- the location of the conserved residues in the re- ture with the residues colored according to their spective receptor structures as shown in Fig. 3.2. mean RMSF, with red being large fluctuations The conserved residues are shown in blue and and blue being less than 1 Å change in RMSD. the non-conserved ones are shown in red. It can TM6 and helix 8 are the most dynamic segments be seen from Fig. 3.2, that the residues in the in the CXCR1 structure, followed by TM3 and intracellular half of the transmembrane domains TM5. Among the loops, ECL2 is packed within of the receptors are more conserved than the the TM bundle and shows very little fluctuations, extracellular half of the receptors. As shown in whereas ICL1 and ICL3 are the most dynamic. Fig. 3.2c, d, the residues that point inwards into In order to further “dissect” the nature of helical the transmembrane barrel are more conserved motions observed in CXCR1, we compared the than the residues that point to the membrane rigid body orientations of the seven TM helices of 42 N. Vaidehi et al.

Fig. 3.2 Comparison of the location of conserved the receptors. (b) The extracellular loops shows a lot of residues in the crystal structures of “1 and “2 adrenergic variability that could confer specificity to ligand binding; receptors, M2 and M3 muscarinic acetylcholine recep- (c) The view from the extracellular side of the receptor tors, D3 dopamine receptor. The residues that are highly showing that the conserved residues mostly face inside the conserved within their respective subtypes are shown in TM bundle, and (d) The view from the intracellular side of blue and the non-conserved residues are shown in red. the receptors showing more conserved residues pointing (a) The side view of the seven TM domains showing that inwards towards the TM bundle more conserved residues are in the intracellular region of the ten NMR conformations against one another deformations (kinks) of the helices. For CXCR1, using the program GPCRCompare. In the course the aligned RMSDs of the TM helices range of this analysis, we measured the C’ RMSD of from 0.4 to 0.8 Å with TM7 showing the highest individual helices, first by aligning the overall aligned RMSD. Next we compared the rigid body structures (referred to as “in place” in Fig. 3.4), orientations of the TM helices. TM5, TM6, and and then by aligning the TM helix of one struc- TM7 show significant variations in the tilt and ture to that of the other (referred to as “aligned” in rotational motion, whereas TM3 shows less tilt Fig. 3.4). Figure 3.4c compares the mean RMSD and larger rotation. A schematic of the dynamic of the TM domains both before and after align- motions of CXCR1 is shown in Fig. 3.4d. The ta- ment. Since the alignment removes the rigid body ble in Fig. 3.4e lists the magnitudes of rigid body orientational differences between the helices, the movements of the different helices, as calculated “aligned” RMSD gives a measure of the internal by GPCRCompare. 3 Structure and Dynamics of G-Protein Coupled Receptors 43

Fig. 3.3 Superposition of the 10 NMR derived structures from the pdb file 2LNL. The backbone of each structure is shown in a different color. The kink on helix6 is dynamic in the NMR time scale

ligands of varied efficacies on the dynamics of the 3.2 Flexibility of GPCR receptor has also been well studied using several Structures and Relevance biophysical techniques. to Drug Design

GPCRs exhibit multiple functions in the cell, that 3.2.1 NMR and Fluorescence is achieved by adopting various receptor confor- Spectroscopy Shows mations influenced by binding of the ligand, G- Flexibility in GPCR protein, other proteins coupling to the receptor, Conformations and by the lipid bilayer environment. Hence an understanding of the receptor flexibility and con- The NMR structure of the chemokine receptor formational dynamics would lead to more selec- CXCR1 in phospholipid bilayers has been pub- tive design of drugs. The fact that the GPCR con- lished recently (Park et al. 2012). As discussed formations are dynamic has been substantiated by in the previous section, comparison of the ten many experimental and computational techniques NMR models shows significant level of dynam- (Swaminath et al. 2004, 2005; Bhattacharya et al. ics in the system. The structural variations in 2008a; Ahuja and Smith 2009;Droretal.2009; the ten NMR structures are in the intracellular Khelashvili et al. 2009; Bhattacharya and Vaidehi part of TM3, kink modulation on TM6 and he- 2010; Vaidehi and Kenakin 2010; Vaidehi 2010; lix8 as shown in Figs. 3.3 and 3.4. Interestingly, Niesen et al. 2011; Provasi et al. 2011;Parketal. GPCR crystal structures show the TM helices 2012; Nygaard et al. 2013). Further the effect of to be less flexible (as evident by low b-factors), 44 N. Vaidehi et al.

Fig. 3.4 Variability among the 10 NMR derived confor- (d) Schematic representation showing movements exhib- mations of CXCR1 (pdb ID: 2LNL); (a) Cartoon represen- ited in TM5, TM6 and TM7 deduced from the ensemble tation of the 10 NMR conformations; (b) Two representa- of NMR structures; (e) Table showing the average value tive NMR structures with residues colored according to of the translation, tilt and rotational orientations of seven the mean RMSD fluctuations in coordinates; (c)RMSDs helices for the ten NMR derived structures calculated by of each transmembrane helix before and after alignment; the program GPCRCompare (Bhattacharya et al. 2013) whereas the loops are more mobile with higher dynamic of all the TMs. The kink angle on b-factors. However in the apoprotein structure of transmembrane helix 6 is also dynamic and varies CXCR1 the TM helices show high mobility. This from 21ı to 34ı. Based on earlier studies on is a marked distinction from crystal structures, activation of GPCRs, the side chain conformation which are cryogenic conformations, whereas the of the Trp6.48 of the highly conserved motif WxP NMR structures represent the dynamic nature of in TM6, was expected to toggle upon activation GPCRs at the physiological temperature. Also (Yao et al. 2006). This toggle was observed in the use of phospholipid bilayers in the NMR the NMR studies of the activation of rhodopsin measurements could contribute to the increased (Ahuja and Smith 2009) but was not seen in mobility of the TM regions. However the high the crystal structure of the active state of “2- mobility of the TM helices is in direct agreement adrenergic receptor (Rasmussen et al. 2011), or with the established notion on GPCR activation, the partial active state of adenosine A2A receptor which suggests that the helical domains (most im- (Lebon et al. 2011a, b;Xuetal.2011). The portantly TM6) undergo conformational change side chain conformation of Trp2556.48 in CXCR1 to trigger activation. It is noteworthy that in the shows significant variation among the ten NMR NMR conformations of CXCR1, TM6 is the most structures, denoting the dynamic nature of this 3 Structure and Dynamics of G-Protein Coupled Receptors 45 residue in the absence of any ligand. Here we to the human “2-adrenergic receptor lead to have used the Ballesteros and Weinstein num- stabilization of ligand specific conformations bering for residues that were defined for class of the receptor (Swaminath et al. 2004, 2005; A GPCRs (Ballesteros and Weinstein 1995). The Yao et al. 2006). Bioluminescence Resonance first number denotes the TM helix in which the Energy Transfer (BRET) studies pioneered by residue is located and the second number is the Bouvier and coworkers, on the kinetics of G position of the residue from the most conserved protein coupling to “2-adrenergic receptor and residue position denoted as x.50 in that helix. “1-adrenergic receptor in living cells, showed In the CXCR1 NMR structures, the distance be- differential activation kinetics by full, partial tween the intracellular regions of TM3 and TM6 and inverse agonists (Vilardaga et al. 2003; (for example, the distance between Arg135 and Galés et al. 2005; Galandrin et al. 2007). Met241) varies between 9.1 and 12.6 Å in the Fluorescence experiments of reconstituted human ten NMR structures. This analysis shows that “2-adrenergic receptor in lipid nanoparticles the GPCR without any ligand samples a wide coupled with the G-protein, showed that ligands range of conformations. Future studies on NMR modulate the binding affinity of the receptor structures with ligand on CXCR1 would reveal to the G-protein and similarly binding of the if conformational selection from the subset of G-protein also modulates the efficacy of the conformations sampled by the apoprotein could receptor to the ligand (Yao et al. 2009). Since be one of the predominant mechanisms of ligand the experimental results on activation of GPCRs binding in GPCRs. are fragmented, computational methods play an Biophysical studies on the visual receptor important role in combining these observations to rhodopsin, using spin-labeling techniques, provide atomic level insight into the mechanism solid-state NMR, fluorescent spectroscopy, of the conformational transitions and activation computational methods, and crystal structure in GPCRs. of the partially active state opsin show that the intracellular region of the TM3 and TM6 move apart upon activation (Park et al. 3.2.2 Multiscale Computational 2008). Additionally, there are several inter- Methods for Understanding residue contacts also known as “conformational the Mechanism switches” made or broken upon activation of of Conformational Transitions rhodopsin (Krishna et al. 2002; Hubbell et al. and Activation of GPCRs 2003; Schertler 2005;Parketal.2008; Ahuja and Smith 2009; Zaitseva et al. 2010). Briefly, Computational methods have been used to comparison of the rhodopsin and opsin crystal delineate the atomic level mechanisms of structures showed considerable conformational activation of GPCRs. All-atom molecular changes in the intracellular region of TM5 dynamics (MD) simulations combined with and TM6. TM5 also shows elongation of the inter-residue distances measured by NMR as helix in this region. Comparatively smaller restraints provided insight into the movement changes were observed in the extracellular of ECL2 in the activation of rhodopsin (Ahuja loop 2 (ECL2) (Ahuja and Smith 2009). There and Smith 2009). Starting from the inactive are substantial rearrangements in all the three state crystal structure of rhodopsin, several MD intracellular loops in the opsin structure. Thus simulation studies have provided insight into the it is evident that activation is associated with early events of activation of rhodopsin (Saam considerable conformational changes especially et al. 2002; Crozier et al. 2007; Grossfield et al. in the intracellular region of the receptor. 2008; Khelashvili et al. 2009). The activation Fluorescence spectroscopic life time mea- mechanism of rhodopsin from MD trajectories surements with purified receptors, have shown has been described in terms of structural that binding of ligands of varied efficacies changes in inter-residue contacts also known 46 N. Vaidehi et al.

Fig. 3.5 A scheme of the conformational sampling in formational transitions. All-atom Cartesian simulations multiscale simulation method. The coarse grained sim- are then performed starting from the low energy and well ulation methods in this figure represents methods that populated structures from the coarse grained simulations. sample wider range of conformations with accurate all- This would lead to more accurate sampling of the low atom forcefield. The coarse grained methods capture con- energy states as microdomain switches. These microdomain all-atom MD simulations. This is because the switches are also modulated by interaction with system gets trapped in the potential energy min- cholesterol (Khelashvili et al. 2009). imum of the inactive state with high free energy Dror et al. performed microseconds of MD barriers to transition to the active state, and the simulations on the inactive state of “2AR crystal dynamic movement to the active state has to structure with and without T4L lysozyme (T4L) occur through a series of rare events as the system bound to show that there are two inactive states moves from one potential energy basin to another. of the receptor with and without the ionic lock Thus advanced multiscale dynamics methods are formed between Arg3.50 and Glu6.30 (Dror et al. required to study the activation mechanism of 2009). More recently, long time scale MD sim- GPCRs. ulations on the pathway of ligand entry into “2- adrenergic receptor showed that the ligand ini- tially binds to the extracellular loop region where 3.2.3 Multiscale Method they shed the water molecules and hence show Simulations of Activation a finite lifetime in this region. Following this of GPCRs event the ligand trickles down into the orthosteric binding site. The pathway is the same for agonists Multiscale simulation methods comprise of using and antagonists tested in these simulations (Dror (1) a coarse grained MD technique that samples et al. 2011). Although all-atom MD simulations multiple conformational states by overcoming provide valuable insights into atomic level mech- energy barriers, (2) followed by finer scale all- anism of the early events of GPCR activation, atom MD simulations starting from conforma- there still exists a bottleneck in simulating long tions extracted from the coarse grained simula- time scale biological processes such as GPCR tions (see Fig. 3.5). In here, by “coarse grained” activation using all atom MD simulations. Start- we mean a wider and enhanced sampling dy- ing from the inactive state of the receptor, it namics method than all-atom MD simulations is not feasible to sample the active state using and not necessarily the coarse grained forcefield. 3 Structure and Dynamics of G-Protein Coupled Receptors 47

Steered MD (Isralewitz et al. 2001), Acceler- inactive state (Niesen et al. 2011). Subsequent ated MD (Hamelberg et al. 2007), metadynamics NMR experiments studying the ensemble MD (Provasi et al. 2011), methods are examples of conformations without the ligand present of methods that sample multiple conformational have actually shown that the “2-adrenergic states of proteins. Both Steered MD (Gouldson receptor without any ligand is lot more flexible et al. 2004) and metadynamics simulations have and samples multiple states, than when a ligand been used to study the ligand specific confor- is bound (Nygaard et al. 2013). This is an mational states of “2-adrenergic receptor. This example, where the computational methods method provides insight into how ligands of var- provided predictions and insights into the ied efficacies modulate the free energy surface conformational sampling of the receptor without of the “2-adrenergic receptor. The method how- any ligand prior to the experiments. Binding ever requires prior knowledge of the active and of agonist norepinephrine or partial agonist inactive states of the receptor as well as pre- salbutamol leads to selection of a subset of determined activation pathways (Provasi et al. these conformations sampled by the receptor, 2011). while inverse agonist carazolol selects only For cases where the active state of the inactive state conformations. This demonstrates receptor is not known, we developed a multiscale the usefulness of the computational methods dynamics approach by combining a coarse- to provide insights that are difficult to probe grained discrete conformational sampling experimentally. Thus we conclude that ligand method, called LITICon (Bhattacharya et al. binding in GPCRs lead to conformational 2008a, b) with fine-grained molecular dynamics selection to a greater extent than the induced investigating the effect of various ligand binding conformational changes. However the role of on the ensemble of conformations sampled by ligand induced receptor movements cannot be human “2-adrenergic receptor (Bhattacharya eliminated completely. and Vaidehi 2010; Niesen et al. 2011). The discrete conformational sampling method allows traversing over energy barriers and 3.3 Insights into the Structural samples multiple conformations of the receptor Basis of Thermostability (Fig. 3.5). We showed that the receptor, in the in GPCRs absence of any ligand samples an extensive conformational space that includes a shear 3.3.1 Thermostabilization of GPCR motion of the intracellular regions of TM5 Conformations and TM6 and a breathing motion of the ligand binding site. This shear motion is similar to The dynamic nature of GPCRs confers its adapt- the reorganization of the TM helices observed ability and versatility to get activated by multiple in the crystal structure of the active state ligands and trigger multiple signaling pathways of “2-adrenergic receptor (Rasmussen et al. in the cell. At the same time the dynamic nature 2011). However the extent of the movement has been a great impediment and challenge to in the intracellular regions of TM5 and TM6 purify GPCRs in detergent solution and amenable with respect to TM3, is considerably less to crystallization. Although membrane proteins when no ligand is present compared to when are extremely stable in bilayer environment they the agonist and G-protein are bound to the are often unstable when solubilized in deter- receptor. Thus, the computational data prior gents (Bowie 2001). Two major strategies have to experiments lead to the prediction that the been used to circumvent this challenge. One is receptor samples “active-like” conformational to attach T4L or antibody to the loop regions states even in the absence of any ligand, and (Hanson and Stevens 2009; Steyaert and Kobilka the population of these “active-like” states is 2011) and the other is to derive thermostable mu- much less compared to the population of the tants that remain stable in detergents (Tate 2012). 48 N. Vaidehi et al.

These two strategies have proved successful in in purifying these proteins. There are multiple obtaining GPCR crystal structures. The experi- challenges in obtaining purified GPCRs for mental procedure for deriving the thermostable structural studies. Some of these challenges mutant GPCRs is tedious and therefore an under- are: (1) Low level of expression of mammalian standing of how do these thermostable mutations GPCRs, (2) optimization of detergents for GPCR affect the stability of GPCRs is unknown. In this solubilization. When solubilized in detergents section we will focus on the thermostable mu- for purification, GPCRs tend to aggregate (Tate tants and demonstrate how computational meth- 2012; Grisshammer 2009;Chiuetal.2008) and, ods have been used to understand the effect of (3) The flexibility of GPCR conformations poses thermostable mutations on the stability of the the greatest challenge for the receptor purification receptor structures. due to the conformational heterogeneity. Rational Membrane proteins show mutations that en- single point mutations to engineer stabilizing hance the protein stability occurring with a high disulfide bridges (Standfuss et al. 2007), or frequency, thus showing that these wild type mutating certain non-conserved residues that proteins are probably not optimized for high sta- would strengthen helical interfaces were tested bility (Bowie 2001). Thermostable mutants have (Roth et al. 2008) for producing thermostable been engineered for GPCRs by time consuming mutants of rhodopsin and “2-adrenergic receptor. processes as described in detail in the next sec- Although these strategies were successful, tion. Various types of assays have been used to they could not be generalized and applied to measure thermostability for membrane proteins any GPCR. Tate and coworkers developed a (Kawate and Gouaux 2006; Alexandrov et al. systematic procedure for deriving thermostable 2008). Among these techniques, the radiolabelled GPCR mutants (Serrano-Vega et al. 2008; ligand binding is the most used technique for Magnani et al. 2008; Shibata et al. 2009;Tate GPCRs (Tate 2012). Thermostable mutants have 2012; Cooke et al. 2013). This procedure consists lead to crystal structures for turkey “1-adrenergic of several steps: receptor (Warne et al. 2008), human adenosine ¥ Ala/Leu scanning mutagenesis where every A2A receptor (Doré et al. 2011, Lebon et al. residue was mutated to Ala (except for Ala 2011a, b) and (White residues that were mutated to Leu), and each et al. 2012). Analysis of the amino acid positions mutant was expressed and its thermostability that lead to thermostability shows that there is was determined relative to the wild type re- no sequence consensus in the positions among ceptor. The thermostability was determined by various receptors, nor there is clustering of the measuring the antagonist binding affinity at thermostable mutant positions. Thus the process high temperatures. of engineering thermostable GPCR mutants is te- ¥ Once thermostabilizing single point mutations dious and time consuming. Therefore a molecular were identified, these positions were mutated level understanding of the effect of thermostable to other amino acid residues to identify im- mutations would throw light into rational engi- provements in thermostability. neering of mutants for other GPCRs. ¥ The best thermostabilizing mutations were then combined to give the optimally stable mutant. 3.3.2 Structural Basis This is a systematic but time-consuming pro- of Thermostability cess, and most of the mutation positions are from Computational Methods not transferable from one GPCR to another and this fact makes it more tedious (Serrano-Vega The fact that GPCRs are dynamic and exhibit and Tate 2009). Insights into why certain mu- multiple active and inactive states have been tations stabilize the receptor, as well as why demonstrated amply with experimental and certain mutations specifically stabilize agonist computational data. This poses a great challenge binding versus antagonist binding (Lebon et al. 3 Structure and Dynamics of G-Protein Coupled Receptors 49

Fig. 3.6 Schematic representation of the receptor population distribution in wt-“1AR (blue) and thermostable mutant m23-“1AR (red). Note that in the mutant m23-“1AR the receptor conformational population is shifted towards the inactive state of the receptor

1.59 2011b) states would greatly aid the design pro- “1AR has six point mutations namely R68S , cess (Balaraman et al. 2010). M90V2.53, Y227A5.58, A282LICL3, F327A7.37 and The structural basis of thermostability of sol- F338M7.48. Using long time scale all-atom MD uble proteins has been studied extensively from simulations in explicit lipid bilayer and water, the perspectives of structure, and ther- we have studied the effect of the mutations on modynamics (Sterpone and Melchionna 2012). In the stability and dynamics of the m23-“1AR as contrast, studies on membrane protein stability compared to the wild type “1AR referred to as are less common, which is probably because wt-“1AR hereafter (Niesen et al. 2013). they are far less tractable experimentally (White We found that thermostabilization results in an and Wimley 1999;Bowie2011). Measurements increase in the number of accessible microscopic of reversible folding and unfolding processes to conformational substates arising from side chain determine the thermodynamic and kinetic prop- fluctuations within the inactive state ensemble, erties of folding have been done only for a few effectively increasing the entropy of the inactive membrane proteins (Booth and Curnow 2009). state. Figure 3.6 shows the schematic represen- These measurements have not yet been possible tation of the energy landscape of the inactive to study the role of thermostabilizing mutations in state of the receptor for both wild type and m23 GPCRs. Therefore computational methods offer a mutant. The number of states shown in the figure complementary and feasible solution to study the comes from computational data (Niesen et al. effect of thermostable mutations on the structure 2013). It is seen that the number of microscopic and dynamics of GPCRs. states in the mutant m23-“1AR has increased at room temperature. The thermostable mutant m23-“1AR showed an increase in side chain 3.3.3 Insights into the Structural movement within the inactive state (multiple mi- Basis of Thermostability croscopic states within the inactive state). This from Computational Studies increase in side chain flexibility in the m23-“1AR plays a critical role in absorbing the extra ther- Serrano-Vega et al. derived a thermally stable mu- mal energy at higher temperatures, thus retain- tant of the inactive state of turkey “1-adrenergic ing the native fold even at higher temperatures receptor (“1AR) and showed that a combina- thus preventing unfolding and aggregation. Such tion of six single point mutations resulted in a entropic stabilization has been observed in sol- ı 20 C increase in thermal stability of the inactive uble thermophilic proteins (Colombo and Merz state of “1AR (Serrano-Vega et al. 2008). This 1999; Hernandez et al. 2000; Wintrode et al. inactive state mutant of “1AR known as m23- 2003; Meinhold et al. 2008). The dynamics of the 50 N. Vaidehi et al.

mutant m23-“1AR also shows loss of correlated leads not only stress relieving stabilization but motion between residues that are in the ligand also loss of activation of the receptor. P3056.50 binding site to the G-protein coupling site and on TM6 showed a high level of stress in the this reduces the ability of the receptor to go into wild type receptor. P3056.50 forms the hinge for the active state. The thermostabilizing mutations, the kink in TM6 and modulation of the proline Y227A5.58 and F338M7.48 stabilize the inactive kink could be crucial for the outward motion of state by removing long range correlated move- TM6 leading to activation (Hornak et al. 2010; ment between residues in the G-protein coupling Bhattacharya et al. 2008a). Thus reduced stress site to the extracellular region thus inhibiting ac- at P3056.50 could lead to increased stability of the 5.58 tivation (Niesen et al. 2013). The Y227A mu- inactive state in m23-“1AR. tation also stabilized the receptor by increasing Thus the basis of thermostabilization of the the propensity for salt-bridge formation between mutants are multifold. They stem from both en- R1393.50 on TM3 and E2856.30 on TM6. The salt- thalpic and entropic contributions. They not only bridge between TM3 and TM6, a putative sig- affect the neighborhood residues of the location nature of the inactive state (Hubbell et al. 2003; of the mutation sites, but also show long range Yao et al. 2006; Palczewski et al. 2000), is sta- allosteric effects upon mutation. Thus studying bilized by these mutations and hence contribute the effect of the mutations on the overall structure to increase in stability. The mutation F338M7.48 and the conformational ensemble is important on TM7 alters the interaction in the conserved rather than looking for specific changes in mi- motif NPxxY(x)5,6F and hence modulates the crodomains of protein structures. interaction of TM2-TM7-helix8 micro-domain (Balaraman et al. 2010). The R68S mutation that is present in the intracellular loop 1, leads to 3.4 Conclusions formation of a hydrogen bond with R355 on helix 8 to gain stabilization. We observed the formation GPCRs are seven helical transmembrane proteins of the hydrogen bond in the simulations and that play a critical physiological role and this was also found subsequently in the crystal are implicated in many serious diseases. The structure of the mutant (Warne et al. 2012). dynamics of GPCR structure plays an important The distribution of the net intra and inter- role in their activation and signal transduction molecular forces, intramolecular repulsive and function. Here we have described a method, attractive forces within a protein structure gov- GPCRCompare to compare the three dimensional erns how the protein reacts to thermal stress, structures of GPCRs in terms of the relative since non-optimal distribution of internal stress rotational, tilt and translational orientations of the can lead to structural destabilization at elevated seven helices. This software is freely available for temperatures (Stadler et al. 2012). However, in- users by contacting the corresponding author. ternal stress also leads to movement of those Comparison of the ten NMR structures of regions of the receptor that lead to the active state. the chemokine receptor shows a significantly Thus a balance of these internal forces is required dynamic range of rotation and tilt angles for to maintain receptor stability and activity. The transmembrane helices 5, 6 and 7. The kink distribution of the net force on each residue angle on TM6 ranges from 21ı to 34ı showing calculated from the snapshots from the MD sim- significant dynamics in the ten NMR structures. ulations of both m23-“1AR and wt-“1AR showed Comparison of the crystal structures of the that Y2275.58, F3277.37 and F3387.48 are all in po- biogenic amine receptors available to date, sitions of high stress in wt-“1AR that get relieved showed that the conserved amino acids face upon mutation, leading to thermostabilization. the interior of the transmembrane barrel. We Y2275.58 is also a position that shows long range have also elaborated the structural dynamics and torsional correlation to the extracellular regions activation mechanism of “2-adrenergic receptor of the receptor. Therefore mutating this residue as derived from multiscale MD simulation meth- 3 Structure and Dynamics of G-Protein Coupled Receptors 51 ods. We observed that the “2-adrenergic receptor Bowie JU (2001) Stabilizing membrane proteins. Curr samples conformations with active state like Opin Struct Biol 11:397Ð402 characteristics and the population of such states Bowie JU (2011) Membrane protein folding: how im- portant are hydrogen bonds? Curr Opin Struct Biol are much lower than those of the inactive state. 21:42Ð49 Ligand binding leads to conformational selection Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen of substates (Niesen et al. 2011). Finally, study SG, Thian FS, Kobilka TS, Choi HJ, Kuhn P, Weis WI, Kobilka BK, Stevens RC (2007) High-resolution of the dynamics of the conformational ensemble crystal structure of an engineered human beta2- of thermostable mutants showed that the stability adrenergic G-protein coupled receptor. Science 318: could arise from both enthalpic and entropic 1258Ð1265 factors. There are regions of high stress in the ChienEY,LiuW,ZhaoQ,KatritchV,HanGW,Hanson MA, Shi L, Newman AH, Javitch JA, Cherezov V, wild type GPCR that get relieved upon mutation Stevens RC (2010) Structure of the human dopamine conferring thermostability. D3 receptor in complex with a D2/D3 selective antag- onist. Science 330:1091Ð1095 Acknowledgements We acknowledge the helpful dis- Chiu ML, Tsang C, Grihalde N, MacWilliams MP (2008) cussions with Dr. Christopher Tate and Dr. Reinhard Over-expression, solubilization and purification of G- Grisshammer. We thank NIH-R01097261 for providing protein coupled receptors for structural biology. Comb funding for our study on GPCR thermostable mutants. Chem High Throughput Screen 11:439Ð462 Colombo G, Merz KM (1999) Stability and activity of mesophilic subtilisin E and its thermophilic homolog: insights from molecular dynamics simulations. J Am References Chem Soc 121:6895Ð6903 Cooke RM, Koglin M, Errey JC, Marshall FH (2013) Ahuja S, Smith SO (2009) Multiple switches in G protein- Preparation of purified GPCRs for structural studies. coupled receptor activation. Trends Pharmacol Sci Biochem Soc Trans 41(1):185Ð190 30:494Ð502 Crozier PS, Stevens MJ, Woolf TB (2007) How a small Alexandrov AI, Mileni M, Chien EY, Hanson MA, change in retinal leads to G-protein activation: initial Stevens RC (2008) Microscale fluorescent thermal events suggested by molecular dynamics calculations. stability assay for membrane proteins. Structure Proteins 66:559Ð574 16:351Ð359 Deupi X (2012) Quantification of structural distortions in Audet M, Bouvier M (2012) Restructuring G-protein the transmembrane helices of GPCRs. Methods Mol coupled receptor activation. Cell 151:14Ð23 Biol 914:219Ð235 Balaraman GS, Bhattacharya S, Vaidehi N (2010) Struc- Doré AS, Robertson N, Errey JC, Ng I, Hollenstein K, tural insights into conformational stability of wild Tehan B, Hurrell E, Bennett K, Congreve M, Mag- type and mutant beta1-adrenergic receptor. Biophys J nani F, Tate CG, Weir M, Marshall FH (2011) Struc- 99:568Ð577 ture of the adenosine A2A receptor in complex with Ballesteros J, Weinstein J (1995) Integrated methods for ZM241385 and xanthines XAC and caffeine. Structure modeling G-protein coupled receptors. Methods Neu- 19(9):1283Ð1293 rosci 25:366Ð428 Dror RO, Arlow DH, Borhani DW, Jensen M¯, Piana S, Bhattacharya S, Vaidehi N (2010) Computational map- Shaw DE (2009) Identification of two distinct inactive ping of the conformational transitions in agonist selec- conformations of the beta2-adrenergic receptor rec- tive pathways of a G-protein coupled receptor. J Am onciles structural and biochemical observations. Proc Chem Soc 132(14):5205Ð5214 Natl Acad Sci U S A 106:4689Ð4694 Bhattacharya S, Hall SE, Vaidehi N (2008a) Agonist- Dror RO, Pan AC, Arlow DH, Borhani DW, Maragakis P, induced conformational changes in bovine rhodopsin: Shan Y, Xu H, Shaw DE (2011) Pathway and mecha- insight into activation of G-protein-coupled receptors. nism of drug binding to G-protein coupled receptors. J Mol Biol 382:539Ð555 Proc Natl Acad Sci U S A 108:13118Ð13123 Bhattacharya S, Hall SE, Li H, Vaidehi N (2008b) Ligand- Filizola M, Devi LA (2013) Grand opening of structure- stabilized conformational states of human beta(2) guided design for novel opioids. Trends Pharmacol Sci adrenergic receptor: insight into G-protein-coupled 34(1):6Ð12 receptor activation. Biophys J 94(6):2027Ð2042 Galandrin S, Oligny-Longpré G, Bouvier M (2007) The Bhattacharya S, Lam AR, Li H, Balaraman G, Niesen MJ, evasive nature of drug efficacy: implications for drug Vaidehi N (2013) Critical analysis of the successes and discovery. Trends Pharmacol Sci 2007(28):423Ð430 failures of homology models of G protein-coupled re- Galés C, Rebols RV, Hogue M, Trieu P, Brelt A, Hébert ceptors. Proteins 81:729Ð739. doi:10.1002/prot.24195 TE, Bouvier M (2005) Real-time monitoring of re- Booth PJ, Curnow P (2009) Folding scene investigation: ceptor and G-protein interactions in living cells. Nat membrane proteins. Curr Opin Struct Biol 19(1):8Ð13 Methods 2:177Ð184 52 N. Vaidehi et al.

Gouldson PR, Kidley NJ, Bywater RP, Psaroudakis G, Kawate T, Gouaux E (2006) Fluorescence-detection Brooks HD, Diaz C, Shire D, Reynolds CA (2004) size exclusion chromatography for precrystallization Toward the active conformations of rhodopsin and the screening of integral membrane proteins. Structure beta2-adrenergic receptor. Proteins 56(1):67Ð84 14:673Ð681 Granier S, Kobilka B (2012) A new era of GPCR structural Kenakin T, Watson C, Muniz-Medina V, Christopoulos and chemical biology. Nat Chem Biol 8:670Ð673 A, Novick S (2012) A simple method for quantifying Granier S, Manglik A, Kruse AC, Kobilka TS, Thian functional selectivity and agonist bias. ACS Chem FS, Weis WI, Kobilka BK (2012) Structure of Neurosci 3:193Ð203 the •-opioid receptor bound to naltrindole. Nature Khelashvili G, Grossfield A, Feller SE, Pitman MC, 485(7398):400Ð404 Weinstein H (2009) Structural and dynamic effects Grisshammer R (2009) Purification of recombinant of cholesterol at preferred sites of interaction with G-protein coupled receptors. Methods Enzymol rhodopsin identified from microsecond length molec- 463:631Ð645 ular dynamics simulations. Proteins 76:403Ð417 Grossfield A, Pittman MC, Feller SE, Soubias O, Krishna A, Menon ST, Terry TJ, Sakmar TP (2002) Gawrisch K (2008) Internal hydration increases dur- Evidence that helix 8 of rhodopsin acts as a ing activation of the G-protein-coupled receptor membrane-dependent conformational switch. Bio- rhodopsin. J Mol Biol 381:478Ð486 chemistry 41:8298Ð8309 Haga K, Kruse AC, Asada H, Yurugi-Kobayashi T, Shi- Kruse AC, Hu J, Pan AC, Arlow DH, Rosenbaum DM, roishi M, Zhang C, Weis WI, Okada T, Kobilka BK, Rosemond E, Green HF, Liu T, Chae PS, Dror RO, Haga T, Kobayashi T (2012) Structure of the human Shaw DE, Weis WI, Wess J, Kobilka BK (2012) Struc- M2 muscarinic acetylcholine receptor bound to an ture and dynamics of the M3 muscarinic acetylcholine antagonist. Nature 482:547Ð551 receptor. Nature 482:552Ð556 Hall SE, Mao A, Nicolaidou V, Finelli M, Wise EL, Nedjai Lebon G, Bennett K, Jazayeri A, Tate CG (2011a) Ther- B, Kanjanapangka J, Harirchian P, Chen D, Selchau V, mostabilization of an agonist bound conformation Ribeiro S, Schyler S, Pease JE, Horuk R, Vaidehi N of the human adenosine A2A receptor. J Mol Biol (2009) Elucidation of binding sites of dual antagonists 409:298Ð310 in the human chemokine receptors CCR2 and CCR5. Lebon G, Warne T, Edwards PC, Bennett K, Langmead Mol Pharm 75:1325Ð1336 CJ, Leslie AG, Tate CG (2011b) Agonist bound adeno- Hamelberg D, de Oliveira CA, McCammon JA (2007) sine A2A receptor structures reveal common features Sampling of slow diffusive conformational transitions of GPCR activation. Nature 474:521Ð525 with accelerated molecular dynamics. J Chem Phys Magnani F, Shibata Y, Serrano-Vega MJ, Tate CG (2008) 127(15):155102Ð155110 Co-evolving stability and conformational homogene- Hanson MA, Stevens RC (2009) Discovery of new GPCR ity of the human adenosine A2a receptor. Proc Natl biology: one receptor structure at a time. Structure Acad Sci U S A 105(31):10744Ð10749 17(1):8Ð14 Mailman RB, Murthy V (2010) Ligand functional se- Hernandez G, Jenney FE, Adams MWW, LeMaster DM lectivity advances our understanding of drug mecha- (2000) Millisecond time scale conformational flexibil- nisms and drug discovery. Neuropsychopharmacology ity in a hyperthermophile protein at ambient tempera- 35:345Ð346 ture. Proc Natl Acad Sci U S A 97:3166Ð3170 Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen Hornak V, Ahuja S, Eilers M, Goncalves JA, Sheves JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, M et al (2010) Light activation of rhodopsin: in- Granier S (2012) Crystal structure of the -opioid sights from molecular dynamics simulations guided by receptor bound to a morphinan antagonist. Nature solid-state NMR distance restraints. J Mol Biol 396: 485(7398):321Ð326 510Ð527 Meinhold L, Clement D, Tehei M, Daniel R, Finney Hubbell WL, Altenbach C, Hubbell CM, Khorana HG JL et al (2008) Protein dynamics and stability: the (2003) Rhodopsin structure, dynamics, and activation: distribution of atomic fluctuations in thermophilic a perspective from crystallography, site directed spin and mesophilic dihydrofolate reductase derived us- labeling, sulfhydryl reactivity, and disulfide cross- ing elastic incoherent neutron scattering. Biophys linking. Adv Protein Chem 63:243Ð290 J 94:4812Ð4818 Isralewitz B, Gao M, Schulten K (2001) Steered molecular Nedjai B, Li H, Stroke IL, Wise EL, Webb ML, Merritt dynamics and mechanical functions of proteins. Curr JR, Henderson I, Klon AE, Cole AG, Horuk R, Vaidehi Opin Struct Biol 11:224Ð230 N, Pease JE (2011) Small-molecule chemokine mimet- Johnston JM, Aburi M, Provasi D, Bortolato A, Urizar ics suggest a molecular basis for the observation E, Lambert NA, Javitch JA, Filizola M (2011) that CXCL10 and CXCL11 are allosteric ligands of Making structural sense of dimerization interfaces CXCR3. Br J Pharmacol 166(3):912Ð923 of delta opioid receptor homodimers. Biochemistry Niesen M, Bhattacharya S, Vaidehi N (2011) Conforma- 50(10):1682Ð1690 tional selection upon ligand binding in G-protein cou- Katritch V, Cherezov V, Stevens RC (2012) Diversity and pled receptors. J Am Chem Soc 133(33):13197Ð13204 modularity of G-protein coupled receptor structures. Niesen M, Bhattacharya S, Tate CG, Grisshammer R, Trends Pharmacol Sci 33:17Ð27 Vaidehi N (2013) Thermostabilization of the “1- 3 Structure and Dynamics of G-Protein Coupled Receptors 53

adrenergic receptor Correlates with Increased Entropy Shimamura T, Shiroishi M, Weyand S, Tsujimoto H, of the Inactive State. J Phys Chem 117(24):7283Ð7291 Winter G, Katritch V, Abagyan R, Cherezov V, Liu W, Nygaard R, Zou Y, Dror RO, Mildorf TJ, Arlow DH, Han GW, Kobayashi T, Stevens RC, Iwata S (2011) Manglik A, Pan AC, Liu CW, Fung JJ, Bokoch MP, Structure of the human histamine H1 receptor complex Thian FS, Kobilka TS, Shaw DE, Mueller L, Prosser with doxepin. Nature 475:65Ð70 RS, Kobilka BK (2013) The dynamic process of “2- Stadler AM, Garvey CJ, Bocahut A, Sacquin-Mora adrenergic receptor activation. Cell 152(3):532Ð542 S, Digel I et al (2012) Thermal fluctuations of Palczewski K, Kumasaka T, Hori T, Behnke CA, haemoglobin from different species: adaptation to Motoshima H, Fox BA, Le Trong I, Teller DC, Okada temperature via conformational dynamics. J R Soc T, Stenkamp RE, Yamamoto M, Miyano M (2000) Interface 9(76):2845Ð2855 Crystal structure of rhodopsin: a G protein-coupled Standfuss J, Xie G, Edwards PC, Burghammer M, receptor. Science 289:739Ð745 Oprian DD, Schertler GF (2007) Crystal structure Park JH, Scheerer P, Hoffman KP, Choe H-W, Ernst OP of a thermally stable rhodopsin mutant. J Mol Biol (2008) Crystal structure of the ligand-free G-protein 372(5):1179Ð1188 coupled receptor opsin. Nature 454:183Ð187 Sterpone F, Melchionna S (2012) Thermophilic proteins: Park SH, Das BB, Casagrande F, Tian Y, Nothnagel HJ, insight and perspective from in silico experiments. Chu M, Kiefer H, Maier K, De Angelis AA, Marassi Chem Soc Rev 41(5):1665Ð1676 FM, Opella SJ (2012) Structure of the chemokine Steyaert J, Kobilka BK (2011) Nanobody stabilization receptor CXCR1 in phospholipid bilayers. Nature of G-protein coupled receptor conformational states. 491(7426):779Ð783 Curr Opin Struct Biol 21:567Ð572 Provasi D, Artacho MC, Negri A, Mobarec JC, Filizola M Swaminath G, Xiang Y, Lee TW, Steenhuis J, Parnot (2011) Ligand induced modulation of the free energy C, Kobilka BK (2004) Sequential binding of ag- landscape of the G-protein coupled receptors explored onists to the beta2 adrenoceptor-kinetic evidence by adaptive biasing techniques. PLoS Comput Biol for intermediate conformational states. J Biol Chem 7(10):e1002193 279:686Ð691 Rasmussen SG, DeVree BT, Zou Y, Kruse AC, Chung Swaminath G, Deupi X, Lee TW, Zhu W, Thian FS, KY, Kobilka TS, Thian FS, Chae PS, Pardon E, Calin- Kobilka TS, Kobilka BK (2005) Probing the beta2 ski D, Mathiesen JM, Shah ST, Lyons JA, Caffrey adrenoceptor binding site with catechol reveals differ- M, Gellman SH, Steyaert J, Skiniotis G, Weis WI, ences in binding and activation by agonists and partial Sunahara RK, Kobilka BK (2011) Crystal structure of agonists. J Biol Chem 280:22165Ð22171 the “2-adrenergic receptor-Gs protein complex. Nature Tate CG (2012) A crystal clear solution for determining G- 477:549Ð555 protein coupled receptor structures. Trends Biochem Reiter E, Ahn S, Shukla AK, Lefkowitz RJ (2012) Molec- Sci 37:343Ð352 ular mechanism of “-arrestin biased agonism at seven Tate CG, Schertler GF (2009) Engineering G protein- transmembrane receptors. Ann Rev Pharmacol Toxicol coupled receptors to facilitate their structure determi- 52:179Ð197 nation. Curr Opin Struct Biol 19:386Ð395 Roth CB, Hanson MA, Stevens RC (2008) Stabilization Thompson AA, Liu W, Chun E, Katritch V, Wu H, Vardy of the human beta2-adrenergic receptor TM4-TM3- E, Huang XP, Trapella C, Guerrini R, Calo G, Roth TM5 helix interface by mutagenesis of Glu122(3.41), BL, Cherezov V, Stevens RC (2012) Structure of the a critical residue in GPCR structure. J Mol Biol nociceptin/orphanin FQ receptor in complex with a 376(5):1305Ð1319 peptide mimetic. Nature 485(7398):395Ð399 Saam J, Tajkhorshid E, Havashi S, Schulten K (2002) Vaidehi N (2010) Dynamics and flexibility of G-protein Molecular dynamics investigation of primary photoin- coupled receptor conformations and their relevance to duced events in the activation of rhodopsin. Biophys drug design. Drug Discov Today 15:951Ð957 J 83:3097Ð3112 Vaidehi N, Bhattacharya S (2011) Multiscale computa- Schertler GF (2005) Structure of rhodopsin and the tional methods for mapping conformational ensembles metarhodopsin I photointermediate. Curr Opin Struct of G-protein coupled receptors. Adv Protein Chem Biol 15:408Ð15 Struct Biol 85:253Ð280 Serrano-Vega MJ, Tate CG (2009) Transferability of Vaidehi N, Kenakin T (2010) The role of confor- thermostabilizing mutations between beta adrenergic mational ensembles of seven transmembrane recep- receptors. Mol Membr Biol 26:385Ð396 tors in functional selectivity. Curr Opin Pharmacol Serrano-Vega MJ, Magnani F, Shibata Y, Tate CG (2008) 10:775Ð781 Conformational thermostabilization of the beta1- Vilardaga JP, Bunemann M, Krasel C, Castro M, Lohse adrenergic receptor in a detergent resistant form. Proc M (2003) Measurement of the millisecond activation Natl Acad Sci U S A 105(3):877Ð882 switch of G protein-coupled receptors in living cells. Shibata Y, White JF, Serrano-Vega MJ, Magnani F, Aloia Nat Biotechnol 21:807Ð812 AL, Grisshammer R, Tate CG (2009) Thermostabi- Warne T, Serrano-Vega MJ, Baker JG, lization of the neurotensin receptor NTS1. J Mol Biol Moukhametzianov R, Edwards PC, Henderson 2390(2):262Ð277 R, Leslie AG, Tate CG, Schertler GF (2008) Structure 54 N. Vaidehi et al.

of a beta1-adrenergic G-protein-coupled receptor. Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao ZG, Nature 454(7203):486Ð491 Cherezov V, Stevens RC (2011) Structure of an ag- Warne T, Edwards PC, Leslie AGW, Tate CG (2012) Crys- onist bound human A2A adenosine receptor. Science tal structures of a stabilized “1-adrenoceptor bound to 332(6027):322Ð327 the biased agonists bucindolol and carvedilol. Struc- Yao X, Parnot C, Deupi X, Ratnala VRP, Swami- ture 20:841Ð849 nath G, Farrens D, Kobilka BK (2006) Coupling White SH, Wimley WC (1999) Membrane protein folding ligand structure to specific conformational switches and stability: physical principles. Annu Rev Biophys in the beta2-adrenoceptor. Nat Chem Biol 2006(2): Biomol Struct 28:319Ð365 417Ð422 White JF, Noinaj N, Shibata Y, Love J, Kloss B, Xu F, Yao XJ, Ruiz G, Whorton MR, Rasmussen SG, De Vree Gvozdenovic-Jeremic J, Shah P, Shiloach J, Tate CG, BT, Deupi X, Sunahara RK, Kobilka B (2009) The Grisshammer R (2012) Structure of the agonist bound effect of ligand efficacy on the formation and stability neurotensin receptor. Nature 490:508Ð513 of a GPCR-G-protein complex. Proc Natl Acad Sci U Wintrode PL, Zhang DQ, Vaidehi N, Arnold FH, God- S A 106:9501Ð9506 dard WA (2003) Protein dynamics in a family of Zaitseva E, Brown MF, Vogel R (2010) Sequential rear- laboratory evolved thermophilic enzymes. J Mol Biol rangement of interhelical networks upon rhodopsin ac- 327:745Ð757 tivation in membranes: the Meta II(a) conformational Wu B, Chien EY, Mol CD, Fenalti G, Liu W, Katritch substate. J Am Chem Soc 132:4815Ð4821 V, Abagyan R, Brooun A, Wells P, Bi FC, Hamel Zhang C, Srinivasan Y, Arlow DH, Fung JJ, Palmer DJ, Kuhn P, Handel TM, Cherezov V, Stevens RC D, Zheng Y, Green HF, Pandey A, Dror RO, Shaw (2010) Structures of the CXCR4 chemokine GPCR DE, Weis WI, Coughlin SR, Kobilka BK (2012) with small-molecule and cyclic peptide antagonists. High-resolution crystal structure of human protease- Science 330(6007):1066Ð1071 activated receptor 1. Nature 492(7429):387Ð392 How the Dynamic Properties and Functional Mechanisms of GPCRs 4 Are Modulated by Their Coupling to the Membrane Environment

Sayan Mondal, George Khelashvili, Niklaus Johner, and Harel Weinstein

Abstract Experimental observations of the dependence of function and organization of G protein-coupled receptors (GPCRs) on their lipid environment have stimulated new quantitative studies of the coupling between the proteins and the membrane. It is important to develop such a quantitative understanding at the molecular level because the effects of the coupling are seen to be physiologically and clinically significant. Here we review findings that offer insight into how membrane-GPCR coupling is connected to the structural characteristics of the GPCR, from sequence to 3D structural detail, and how this coupling is involved in the actions of ligands on the receptor. The application of a recently developed computational approach designed for quantitative evaluation of membrane remodeling and the energetics of membrane-protein interactions brings to light the importance of the radial asymmetry of the membrane-facing surface of GPCRs in their interaction with the surrounding membrane. As the radial asymmetry creates adjacencies of hydrophobic and polar residues at specific sites of the GPCR, the ability of membrane remodeling to achieve complete hydrophobic matching is limited, and the residual mismatch carries a significant energy cost. The adjacencies are shown to be affected by ligand-induced conformational changes. Thus, functionally important organization of GPCRs in the cell membrane can depend both on ligand-determined properties

S. Mondal ¥ G. Khelashvili ¥ N. Johner Department of Physiology and Biophysics, Weill Cornell Medical College, Cornell University, Room E-509, 1300 York Avenue, 10065 New York City, NY, USA The HRH Prince Alwaleed Bin Talal Bin Abdulaziz H. Weinstein () Alsaud Institute for Computational Biomedicine, Weill Department of Physiology and Biophysics, Weill Cornell Cornell Medical College, Cornell University, New York, Medical College, Cornell University, Room E-509, 1300 NY 10065, USA York Avenue, 10065 New York City, NY, USA e-mail: [email protected]

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 55 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__4, © Springer ScienceCBusiness Media Dordrecht 2014 56 S. Mondal et al.

and on the lipid composition of various membrane regions with different remodeling capacities. That this functionally important reorganization can be driven by oligomerization patterns that reduce the energy cost of the residual mismatch, suggests a new perspective on GPCR dimerization and ligand-GPCR interactions. The relation between the modulatory effects on GPCRs from the binding of specific cell-membrane components, e.g., cholesterol, and those produced by the non-local energetics of hydropho- bic mismatch are discussed in this context.

Keywords GPCR ¥ Transmembrane proteins ¥ Structure and remodeling of lipid membranes ¥ Hydrophobic mismatch ¥ Lipid/protein interactions ¥ CTMD ¥ Continuum elastic theory ¥ Molecular dynamics simulations ¥ Residual exposure ¥ Residual mismatch ¥ Membrane deformation en- ergy ¥ Oligomerization ¥ Serotonin receptor ¥ Dopamine receptor ¥ Rhodopsin ¥ Cholesterol/GPCR binding ¥ DHA ¥ Structure-function relations in GPCRs ¥ Functional microdomains in GPCRs

Abbreviations Warne et al. 2008; Chien et al. 2010;Wu et al. 2010; Choe et al. 2011; Lebon et al. 5HT 5-hydroxytryptamine 2011; Rosenbaum et al. 2011;Xuetal.2011; CTMD Continuum-Molecular Dynamics Granier et al. 2012;Hagaetal.2012; Hanson hybrid approach et al. 2012; Liu et al. 2012;Wuetal.2012; DHA ¨-3 Docosahexaenoic Acid Manglik et al. 2012). The literature abounds GPCR G protein-coupled receptor in such information, obtained from decades of MD Molecular Dynamics detailed examination of the molecular structural SASA Solvent Accessible Surface Area determinants for the pharmacologically and SDPC 1-stearoyl-2-docosahexaenoyl-sn- physiologically measured ligand-recognition Glycero-3-phosphocholine properties and mechanisms of receptor activation. SDPE 1-stearoyl-2-docosahexaenoyl-sn- Such structure-based details have emerged from Glycero-3-phosphoethanolamine experiments in which the receptors were probed SM/FM Sequence Motif/ Functional with structurally defined ligands and mutations Microdomain to reveal mechanistic details in activation and TM Transmembrane segment dimerization (Ballesteros et al. 2001; Han et al. 2009; Strader et al. 1987; Suryanarayana et al. 1992; Luttrell et al. 1999; Liapakis et al. 2000; 4.1 Introduction Barak et al. 1995; Chelikani et al. 2007; Fritze et al. 2003; Green and Liggett 1994; Guo et al. A rich body of knowledge and information 2005; Javitch et al. 1995a, b, 1998; Kahsai et al. concerning the molecular pharmacology of 2011; Kofuku et al. 2012;O’Dowdetal.1988; the receptors, and their coupling to signal Prioleau et al. 2002; Shi and Javitch 2004; Shi transduction cascades in the cell, is offering a et al. 2002). Together, computational modeling, functional context for the recently determined mutagenesis, and crystallography have shown structures of GPCRs (Palczewski et al. 2000; that each of the five GPCR families share highly Ruprecht et al. 2004; Cherezov et al. 2007; conserved sequence motifs (SMs) that constitute Rasmussen et al. 2007, 2011; Jaakola et al. functional microdomains (FMs) mediating 2008;Parketal.2008; Scheerer et al. 2008; receptor activation (Ballesteros et al. 1998; 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 57

ple the response of specific membrane regions to the transition among such conformations. This coupling is becoming recognized as a determi- nant factor in (i)-the specificity of ligand binding, and (ii)-the responses elicited from the various activated states. Importantly, the effects of the membrane environment are strongly dependent on the lipid composition and hence can vary between different cell types, and also within dif- ferent regions of the plasma membrane of the same cell. For example, cholesterol, a compo- nent of biological membranes with a relative abundance that varies among cell types and even within different regions of the same plasma mem- brane, has been shown to affect GPCR signaling. Functionally important receptor properties such as thermal stability, ligand binding, and local- ization in the cell membrane have been shown to be cholesterol-dependent for beta2-adrenergic, Fig. 4.1 The position of Structural Motifs recognized 5HT1A, oxytocin, and metabotropic glutamate as Functional Microdomains (SM/FMs) in GPCRs, il- receptors (Xiang et al. 2002; Pucadyil and Chat- lustrated in the molecular model of 5HT2AR (Reprinted topadhyay 2004; Gimpl and Fahrenholz 2002; with permission from Shan et al. 2012) Eroglu et al. 2003). Rhodopsin has served as a prototype for elucidating the role of the lipid Visiers et al. 2002; Prioleau et al. 2002; Barak environment. In systematic studies, the activa- et al. 1995; Fritze et al. 2003; Lagerstrom and tion and the photochemical function of rhodopsin Schioth 2008; Rosenbaum et al. 2009; Palczewski were found to be affected by the cholesterol et al. 2000; Warne et al. 2008; Cherezov et al. concentration, as well as by molecular-level char- 2007; Weinstein 2005). Prominent examples of acteristics of the lipids such as their acyl chain these SM/FMs from Class-A GPCRs include the length, unsaturation of the acyl chain, headgroup “arginine cage” around the E/DRY motif in TM3 charge, and headgroup size (Brown 1994;Brown (Ballesteros et al. 1998), and the NPxxY motif in et al. 2002; Botelho et al. 2002, 2006;Gibson TM7 (Ballesteros et al. 1998; Barak et al. 1995; and Brown 1991, 1993). To take the example Fritze et al. 2003; Prioleau et al. 2002). Figure 4.1 of acyl chain unsaturation, the equilibrium be- illustrates some of these SM/FMs in a structural tween the inactive Metarhodopsin I and the active model of the serotonin receptor 5HT2AR. Metarhodopsin II was shown to shift in favor With the focus on the conformational changes of Metarhodopsin II with increasing molar frac- of structural elements of the GPCR proteins, the tion of lipids containing the polyunsaturated ¨- mechanistic models have largely ignored the in- 3 docosahexaenoic (DHA) tail, as compared to volvement of the membrane environment in the monounsaturated lipid tails (Brown 1994). Such discriminant properties of the GPCRs, and the modulation is physiologically and clinically sig- differential ligand-determined effects. This situa- nificant, as dietary deficiency in ¨-3 fatty acids tion is now changing rapidly, because the mem- is well-known to be implicated in a wide vari- brane environment has emerged as an essential ety of diseases (Innis 2008; Lavie et al. 2009; part of functional mechanisms including the ac- Neuringer et al. 1984; Stahl et al. 2008), at tivation, coupling, and cell-surface organization least some of which involve the modulation of of the GPCRs. In particular, the conformations GPCR function by the lipid environment. In rat of the GPCRs in the various states of activation models, for example, deficiency of dietary ¨-3 Ð both ligand-determined and constitutive Ð cou- fatty acids impaired photochemical function, with 58 S. Mondal et al. the Metarhodopsin II formation being both lower coupling. The focus is on the manner in and slower in response to dim light stimulus (Niu which such coupling is (a)-connected to et al. 2002). the structural/sequence characteristics of the As the structures of GPCRs become known receptor, and (b)-involved in the modes and from crystallography at increasingly high consequences of receptor-ligand interactions. resolution, and various biophysical methods of investigation offer increasingly detailed quantitative information about mechanistic 4.2.1 How GPCRs Couple elements of GPCR function, it becomes possible to the Surrounding to focus on the structural context of the mech- Membrane anisms. In parallel, this development requires a corresponding evaluation of the demonstrated Results from experimental investigations and a role of the lipid environment using appropriate variety of theoretical and computational studies quantitative methods to assess the membrane- suggest that two major aspects of the coupling protein coupling components. Achieving this between the membrane and the embedded GPCR goal requires the development and application have the greatest impact on the functional prop- of quantitative approaches, capable of relating erties of the receptors: (1) the interaction of the lipid-protein interactions to the structural individual lipid molecules with specific sites of characteristics of the receptor in its various the receptor molecule, and (2) the mismatch be- states, and especially in the functional context of tween the hydrophobic/hydrophilic character of ligand-induced conformational rearrangements. the membrane and the membrane-facing surface Significant progress in this direction has been of the receptor. As detailed below, a significant achieved with biophysical methods anchored in part of this mismatch persists even after the fundamental experimental data, and formulated membrane is remodeled around the embedded in the framework of physics-based computational protein, giving rise to an energy cost that affects modeling approaches ranging from molecular the functional properties and spatial organiza- dynamics simulations to continuum-based and tion of the GPCRs. This is due to the inherent mean field methods. We review, in Sect. 4.2, asymmetry of the membrane-facing surface of the new types of insights attained from the the embedded protein, caused by the differences application of such approaches. In Sect. 4.3,we among the TM segments. Consequently, it is an present the essential methodological advances important common characteristic of the energet- that enabled quantitative investigation of the ics of membrane coupling for multi-TM segment relation between lipid-protein interactions and proteins. the GPCR structure. To enable clear references to structural elements in different GPCRs, 4.2.1.1 Specific Interactions of Lipid residues and motifs are identified throughout the Molecules chapter with the Ballesteros-Weinstein notation with Membrane-Embedded (Ballesteros and Weinstein 1995). GPCRs With the details obtained from atomistic Molecular Dynamics (MD) simulations it has 4.2 Relation Between Receptor become possible to discern the dynamics of Structure and Its Interaction membrane components surrounding the receptor with the Membrane molecules, such as cholesterol, DHA, and various in a Functional Context other lipids. The simulations revealed both the nature of the interactions of the various lipid This section reviews findings that demonstrate molecules with specific sites of the GPCR, how the membrane interacts with the embedded and the effect on the membrane and the GPCR, i.e., the nature of the membrane-GPCR proteins as a whole. For example, in a 1.6 s 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 59

Fig. 4.2 Cholesterol dynamics around rhodopsin time. Cholesterol is shown in space-fill, and Pro7.38 in from atomistic MD simulation. Cholesterol interacting green. The TM6-TM7 bundle of rhodopsin is shown as with Pro7.38 in TM7. The time-sequence in panel (a) ribbons (a), or as cylinders (b). (c) The angle between is for the z-directional distance between the cholesterol TM7 and H8 over the course of the trajectory. (Reprinted oxygen and the C’ atom of Pro7.38. Panel (b)showsthe with permission from Khelashvili et al. 2009. Copyright angle between the cholesterol ring and the axis of the by Wiley Periodicals Inc.) extracellular segment of TM7 as a function of trajectory all-atom MD simulation of the prototypical (W,Y)] Ð [ 4.46 (I,V,L)] Ð [2.41 (F,Y)] was GPCR rhodopsin, three molecules of cholesterol calculated to occur in 21 % of human class A were found to be localized preferentially at GPCRs (Hanson et al. 2008), suggesting that three distinct sites of the protein (Khelashvili such specific interactions are a general feature of et al. 2009). Similar preferential localization GPCR-membrane coupling. of cholesterol molecules to specific sites was Results from the MD simulations suggested observed in s-scale MD simulations of ligand- as well a mechanistic role for the specific bind- bound serotonin receptor 5-HT2AR (Shan et al. ing of cholesterol. This was based on observed 2012). Such observations of specific cholesterol- correlations between the dynamics of choles- receptor interactions at identified sites in MD terol molecules and the conformations adopted simulations are consistent with structural data by the SM/FM regions of the receptor struc- for several GPCRs (Khelashvili et al. 2009). ture that had been identified for their involve- Indeed, electron microscopy of the inactive ment in activation. For example, the binding photo-intermediate Metarhodopsin I (Ruprecht mode of a cholesterol molecule in the TM6- et al. 2004), and several X-ray structures of TM7 region of rhodopsin was found to cor- GPCRs, show individual cholesterol molecules relate with the angle between TM7 and H8, associated with specific receptor sites (Hanson which involves the NPxxY SM/FM (see Fig. 4.2) et al. 2012; Liu et al. 2012; Cherezov et al. 2007). (Khelashvili et al. 2009). Similarly, in simula- A proposed cholesterol binding motif identified tions of the agonist-bound 5HT2AR (Shan et al. by groups of residues [4.39Ð4.43 (R,K)] Ð [4.50 2012), the movement of cholesterol away from 60 S. Mondal et al.

Fig. 4.3 Cholesterol dynamics correlates with the V7.52, and L7.55 residues on TM7 moves towards TM6 structural transitions in agonist-bound 5-HT2AR.(a) and engages in interactions with the residues K6.35, I6.39, Evolution of the minimum distances between the choles- F6.42, and V6.46 on TM6. (b) Snapshots at 10 and terol at the extracellular (EC) end of TM6 and selected 30 ns showing the cholesterol from the top panel of (a) TM6 residues in the 5-HT simulation (top panel). Time interacting with EC end of TM6. (c) Snapshots at 50, traces of the minimum distances between the cholesterol 167.6 and 250 ns showing the cholesterol from the bottom at the intracellular (IC) ends of TM6Ð7 and selected panels of (a) interacting with either IC end of TM6 or of residues on TM6 and 7 (middle and bottom panels). The TM7 (Reprinted with permission from Shan et al. 2012) cholesterol initially in contact with the L7.44, V7.48, its association with TM7 and towards TM6 (see components in the first shell (Soubias et al. 2006). Fig. 4.3), coincided with the intracellular end of Statistical analyses of the MD trajectories further TM6 beginning to bend away from TM3. Overall, suggested that DHA preferentially solvates the cholesterol-TM6 distance was found to be rhodopsin due to the lower entropic cost for correlated with agonist-induced conformational its association with rhodopsin, compared to that changes in 5-HT2AR (Shan et al. 2012). incurred by the stearoyl chain (Grossfield et al. Similarly, DHA lipid chains were found to 2006a). localize preferentially at a small number of specific sites of the GPCR from a statistical 4.2.1.2 Hydrophobic Mismatch analysis of 26 independent 100 ns atomistic and Membrane Remodeling MD simulations of rhodopsin embedded in a Alongside the specific sites of interactions membrane composed of a 2:2:1 molar mixture between lipid molecules and the GPCR of SDPC (1-stearoyl-2-docosahexaenoyl-sn- molecules, the remodeling of the membrane Glycero-3-phosphocholine)/SDPE (1-stearoyl-2- surrounding the protein constitutes another docosahexaenoyl-sn-Glycero-3-phosphoethanol- important form of coupling between membranes amine)/cholesterol (Grossfield et al. 2006b). and integral proteins. This remodeling is The suggestion of specific sites for DHA, an driven by the mismatch of the hydrophobic- important component of the cell membrane hydrophilic interfaces between the membrane environment, in binding to rhodopsin (Grossfield and the protein. Deformation of the membrane et al. 2006b), is consistent with spectroscopic to match the hydrophobic length of an embedded data from NMR experiments measuring peptide has been demonstrated in studies magnetization transfer from rhodopsin to lipid employing single-helical peptides as models to 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 61

Fig. 4.4 Membrane-deformation profiles u(x,y)for posed of (a)diC14:1PC, (b)diC20:1PC, and (c) 7:7:6 rhodopsin immersed in lipid bilayers of differ- SDPC/POPC/cholesterol membranes (Reprinted with per- ent bulk thicknesses. u(x,y) was calculated directly mission from Mondal et al. 2011. Copyright by Elsevier) from MD trajectories for rhodopsin in bilayers com- investigate membrane-protein coupling (Harroun assumptions, the calculations also revealed that et al. 1999a). Furthermore, the treatment of a the membrane deforms differently near different membrane-inserted peptide as a cylinder enabled parts of the 7-TM GPCR, in a specific pattern the assumption of radially symmetric bilayer dependent on the properties of the embedded deformations and perfect hydrophobic matching GPCR. This is illustrated in Fig. 4.4 by results between the peptide and the membrane so that the obtained for the systems of rhodopsin embed- energy cost of bilayer remodeling could be eval- ded in a thinner di(C14:1)PC membrane and a uated using the continuum theory of membrane thicker di(C20:1)PC bilayer that had been ex- elasticity (Huang 1986; Andersen and Koeppe amined earlier by the Gawrisch lab (Soubias 2007; Lundbæk and Andersen 1999; Lundbaek et al. 2008). Indeed, the di(C14:1)PC membrane et al. 2003; Nielsen et al. 1998; Goforth et al. is seen to become thicker on average around 2003). This energy cost has been shown to rhodopsin, and the di(C20:1)PC membrane to affect the function of membrane-embedded become thinner. The calculated extents of av- molecules (Andersen and Koeppe 2007; erage membrane deformation around the GPCR Lundbaek et al. 2010). agreed with the findings from solid state NMR However, the membrane remodeling pattern measurements on these systems (Soubias et al. is much more complex for GPCRs. This was 2008). But in addition, our CTMD calculations shown by results from calculations with a novel brought to light the radial asymmetry of the method that did not require the assumption of membrane deformations that exhibit both local radially symmetric membrane deformations and thickening and local thinning near different re- cylindrical proteins. We developed the theoret- gions of the protein (Mondal et al. 2011). For ical framework of the hybrid Continuum-MD example, the di(C14:1)PC lipid bilayer thickened (CTMD) method, described in detail in Sect. 4.3, by 2 Å on average near the protein, similar to to make possible the quantitative evaluation of results from NMR experiments, but it thickened radially asymmetric bilayer deformations around locally by 5 Å near TM4 (Mondal et al. 2011). membrane-inserted proteins (Mondal et al. 2011). The theoretical framework allowed the conclu- The CTMD calculations confirmed the expec- sion that the radial asymmetry of the membrane tation that the average remodeling of a thick deformation is due to the radially asymmetric membrane will make it thinner around an em- hydrophobic surface of the GPCR, which has bedded GPCR, and a thin membrane will become TMs of different hydrophobic lengths (Mondal thicker on average. But, freed of the symmetry et al. 2011). 62 S. Mondal et al.

ab c Gul5.36 Gul5.36 C2 atoms

E5.36 (polar)

Q5.60 F5.63 (polar) Phe5.63 C2 atoms Phe5.63 (hydrophobic)

d V2.66, V2.67 N1.33

Fig. 4.5 Illustration of hydrophobic adaptation and penalty at TM5 for rhodopsin in diC14:1PC (Panels A and residual exposure at the molecular level.(a) A snap- B are reprinted with permission from Mondal et al. 2011. shot from the MD trajectory of rhodopsin in the thin Copyright by Elsevier). (c) In Rhodopsin, adjacency of diC14:1PC bilayer. TM5 is shown in blue, and two residues the hydrophobic Phe5.63 to the polar Gln5.60. The left (Glu5.36 and Phe5.63) are highlighted. (b) A snapshot of panel shows rhodopsin in VdW representation in order rhodopsin in the thick diC20:1PC bilayer. The diC20:1PC to depict the irregular surface with which the membrane bilayer thins near Glu5.36, which substantially reduces interacts. The right panel shows the protein in the cartoon its exposure to the hydrophobic core of the bilayer, thus representation to show the helices. (d) In the dopamine D2 showing hydrophobic adaptation. Phe5.63 remains un- receptor model, adjacency of polar N1.33 and hydropho- favorably exposed to the polar environment in the thin bic V.266, V2.67 in the membrane-facing surface, with diC14:1PC bilayer but not in the thick diC20:1PC bilayer. the protein shown in both, VdW (left panel) and cartoon Thus, Phe5.63 contributes to residual exposure energy representations (right panel)

4.2.1.3 The Residual Hydrophobic more, a number of computational studies of the Mismatch and the Energy Cost coupling between the membrane and embedded of Protein-Membrane Coupling multi-TM proteins identified specific residues of Although hydrophobic mismatch is energetically the GPCRs where the hydrophobic matching re- costly, membrane deformations do not achieve a mains incomplete in a membrane of specific com- complete hydrophobic matching at all TM seg- position, thus highlighting specific regions of the ments (Mondal et al. 2011), and residues in the GPCR where unfavorable hydrophobic-polar in- TM-bundle can remain exposed to unfavorable teractions carry a significant energy cost (Mondal hydrophobic-polar interactions. For example, in et al. 2011, 2012; Shan et al. 2012; Khelashvili MD simulations of rhodopsin in di(C14:1)PC, et al. 2012). TM5 maintains a residual hydrophobic mismatch Results from CTMD calculations and coarse- on its extracellular side where a hydrophobic grained simulations suggest that some of this phenylalanine is exposed to the solvent in spite of energy cost contributes to the modulation of re- the overall membrane remodeling around the pro- ceptor oligomerization by the membrane. GPCRs tein (see Fig. 4.5a). Such incomplete adaptation have been shown to self-associate spontaneously of the membrane to the hydrophobic surface has upon reconstitution into liposomes, without the also been inferred from EPR experiments mea- need for any cellular machinery (Mansoor et al. suring lipid-protein associations for a number of 2006; Harding et al. 2009), with oligomeriza- transmembrane proteins (Marsh 2008). Further- tion occurring preferentially at specific interfaces 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 63

(Fung et al. 2009). The oligomerization is a That adjacencies responsible for the resid- reversible process (Dorsch et al. 2009;Hernetal. ual exposure of multi-TM proteins are a com- 2010), and is driven by hydrophobic mismatch mon feature of GPCRs, is evident from the re- (Botelho et al. 2006). Coarse-grained MD sim- cently determined crystal structures and validated ulations of the interaction of GPCRs in the mem- structural models of a variety of rhodopsin-like brane have provided results consistent with the GPCRs, including B2AR, KOR, DOR, 5-HT2AR, experimental observations, and offered important and D2R in the literature (Mondal et al. 2011, insights into the energetics of this oligomeriza- 2012; Shan et al. 2012). Indeed, computational tion (Periole et al. 2007, 2012). Such simula- modeling with the CTMD method showed that tions, taken together with the quantitative re- residual exposure occurs at these adjacencies in sults from CTMD calculations, suggest that the different GPCRs and in a number of different residual exposure contributes to the organiza- membranes (see Tables 4.1 and 4.2) (Mondal tional role of the hydrophobic mismatch (Mondal et al. 2011, 2012; Shan et al. 2012). This is et al. 2011, 2012). In particular, these calcu- illustrated in Fig. 4.5aÐc showing the residual lations suggest that there is a drive for GPCR exposure in TM5 of rhodopsin at a site where oligomerization at specific interfaces that remove Phe and Gln are juxtaposed. Figure 4.5dshows the structural context for the residual hydropho- another example of the adjacency, with the polar bic mismatch, thereby reducing the hydrophobic N1.33 next to the hydrophobic V2.66 and V2.67 mismatch and its energy cost (Mondal et al. 2011, in a structural model of the dopamine D2 recep- 2012). tor. Here too, the calculations show a residual The detailed quantitative analysis of the exposure, with an energy cost of 2 kT at N1.33 incomplete hydrophobic matching associated in a raft-like membrane. with the protein-membrane coupling of multi- These results from computational modeling TM-segment proteins, revealed a persistent local highlight the two key elements of the coupling exposure of residues to unmatched hydrophobic energy between the membrane and the GPCR, environments, which could not be eliminated viz. (1)-the radially asymmetric membrane de- by the overall membrane remodeling. This formation, and (2)-the residual exposure remain- was termed “residual exposure” (Mondal et al. ing after the membrane deforms to alleviate the 2011). Calculations in several different GPCR- hydrophobic mismatch. The nature of these two membrane systems (Mondal et al. 2011, 2012) identifiable elements demonstrates that the cou- showed that this residual exposure is due to the pling between the membrane and GPCR proteins radial asymmetry of the GPCR’s hydrophobic is not a simple result of the difference between surface. Specifically, if polar and hydrophobic the average hydrophobic length of the protein and residues, e.g., in two different TMs, are adjacent that of the pure membrane as previously postu- in space, membrane deformation would need to lated for model peptides (Harroun et al. 1999a, produce a match to both polar and hydrophobic b; Lundbæk and Andersen 1999;Marsh2008). residues in a small neighborhood, and this may be Rather, the important consequences of the cou- energetically unfavorable. Therefore, adjacencies pling emerge from specific features of the protein of hydrophobically disparate residues mark surface, both with respect to residue identity possible sites of residual exposure and thus and to local structure. Therefore, very different energetically costly membrane-protein interac- consequences can be expected from even small tions. The CTMD approach that determines the differences between highly homologous proteins energetic cost of the residual exposure caused (Mondal et al. 2012), from point mutations, or by such adjacencies (Mondal et al. 2011), is from structural changes due to ligand binding presented in Sect. 4.3 below. (see next Section). 64 S. Mondal et al.

2 Table 4.1 Residual SAres values (in Å ) and the corresponding Gres values (in kBT) calculated for TMs of rhodopsin in diC14:1PC, diC16:1PC, diC18:1PC, diC20:1PC, and 7:7:6 SDPC/POPC/cholesterol membranes

diC14:1PC diC16:1PC diC18:1PC diC20:1PC SDPC/POPC/cholesterol

TM SAres Gres SAres Gres SAres Gres SAres Gres SAres Gres 1 210 9.9 80 3.8 75 3.5 70 3.3 63 3.0 2 74 3.5 29 1.4 0 0 32 1.5 27 1.3 3N.D.a N.D. N.D. N.D. N.D. N.D. N.D. N.D. N.D. N.D. 4 0 0 77 3.6 90 4.3 79 3.7 134 6.3 5 115 5.4 1 0 9 0.4 7 0.3 17 0.8 6 36 1.7 31 1.5 56 2.7 38 1.8 51 2.4 7 119 5.6 44 2.1 22 1.1 0 0 0 0 Reprinted with permission from Mondal et al. (2011). Copyright by Elsevier aN.D. Not Determined

2 Table 4.2 Residual SAres values (in Å ) and the corresponding Gres values (in kBT) calculated for TMs of 5-HT2AR in complex with 5-HT, LSD, and KET, respectively, in an SDPC/POPC/cholesterol lipid bilayer

5-HT2AR with 5-HT 5-HT2AR with LSD 5-HT2ARwithKET

TM SAres Gres SAres Gres SAres Gres 1 0 0 0 0 107 5.1 2301.400 00 3N.D.a N.D. N.D. N.D. N.D. N.D. 4 25 1.2 81 3.8 103 4.9 5 94 4.4 81 3.8 69 3.3 6401.900 00 71 0 0 0 0 0 Reprinted with permission from Mondal et al. (2011). Copyright by Elsevier aN.D. Not Determined

4.2.2 Properties profile reflects the different conformational of GPCR-Membrane rearrangements of the receptor, e.g., the different Interactions Determined tilt angles of TM6 (Shan et al. 2012), induced by Ligand Binding by the three ligands. The patterns of residual exposure calculated for the receptor complexes Of particular significance to GPCR signaling with the three ligands and the host membrane is the observation that both the membrane are shown in Table 4.2 (Mondal et al. 2011). deformation profile and the residual exposure Incomplete hydrophobic matching with an pattern, can be specific to the ligand that energy penalty substantially larger than 1 kT acts on the receptor. For example, analysis of is predicted for TMs 1, 4, and 5 for the ligand-dependent conformational states of a 5- Ketanserin complex, but only for TMs 4 and HT2AR receptor model showed that membrane 5 for the LSD-bound receptor. These calculations remodeling around the receptor bound to the illustrate how ligand binding to GPCRs can affect agonist 5-HT is different from that observed membrane-protein interactions, and consequently around 5-HT2AR in complex with either the the patterns of oligomerization (both extent partial agonist LSD, or the inverse agonist and interface). Indeed, in addition to being Ketanserin (see Fig. 4.6) (Shan et al. 2012). membrane-driven, GPCR oligomerization has The membrane is thinner near TM4 and TM6 been reported to be ligand-sensitive for the when either the agonist 5-HT, or the partial dopamine D2, serotonin 5HT2C, and B2AR agonist LSD, is bound, compared to the structure receptors (Guo et al. 2005; Mancia et al. 2008; stabilized by the inverse agonist Ketanserin. Fung et al. 2009). The findings reviewed here The difference in membrane deformation suggest that the modulation of oligomerization 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 65

Fig. 4.6 Hydrophobic thickness profiles of the mem- colored fields represent distances (in Å) between lipid branes around 5-HT2AR in complex with 5-HT, LSD, backbone C2 atoms from the opposing leaflets. For this or KET. The structures of the various ligand-bound recep- analysis, the membrane plane was divided into square tor models averaged over the last 100 ns of MD simula- 2Å 2 Å bins, and the average C2ÐC2 distances in each tions are shown in cartoon, with only the helices depicted bin were collected by scanning the last 100 ns of trajectory (in different colors) with corresponding TM numbers. The (Reprinted with permission from Shan et al. 2012) in the membrane by a specific ligand involves a radial asymmetry of the hydrophobic surface contribution from its effect on the hydrophobic plays for GPCRs (Sect. 4.2), CTMD appears mismatch. The CTMD approach presented next, particularly suited for, and was in fact first in Sect. 4.3, can be used to obtain a quantitative developed for, the analysis of GPCR-membrane prediction regarding the effects of binding of interactions. We note, however, that the CTMD different ligands to a given GPCR, specifically formulation is general enough to be applied to whether the different effects might result in any transmembrane protein. a substantial difference in the energetics of membrane-protein interactions. 4.3.1 Membrane Deformations

4.3 Continuum-Molecular CTMD employs the well-established continuum, Dynamics (CTMD) Approach elastic theory of membranes (Nielsen et al. 1998; Huang 1986; Lundbaek et al. 2010; The hybrid Continuum-Molecular Dynamics Andersen and Koeppe 2007) to evaluate the (CTMD) approach we developed evaluates the energy cost of membrane deformation (Gdef ). energy cost of hydrophobic mismatch for multi- Formulations with elastic terms representing the TM proteins, such as GPCRs (Mondal et al. compression-extension and the splay-distortion 2011). Taking into consideration the energy of the membrane have explained the functional cost of membrane deformations (Sect. 4.3.1) effects of hydrophobic mismatch in model lipid- and the energy penalty due to residual exposure protein systems at a quantitative level (Huang at specific residues (Sect. 4.3.2), the CTMD is, 1986; Nielsen et al. 1998; Goforth et al. 2003). to our knowledge, the only method currently To date, typical quantitative applications of the available that takes into account the radial continuum, elastic theory of membranes have asymmetry of the protein hydrophobic surface been limited to single transmembrane helical in evaluating the energy cost of hydrophobic proteins because the calculations for such model mismatch. In view of the central role that the peptides could be simplified by assuming the 66 S. Mondal et al. peptide to be a cylinder and assuming perfect cylindrical symmetry. The method (Mondal et al. hydrophobic matching between the protein 2011) is implemented in the CTMDapp (http:// and the membrane. In the CTMD approach memprotein.org/resources/servers-and-software) the formalism has enabled the quantitative and is described below. application of this theory to multi-helical In CTMD, the membrane shape is defined by transmembrane proteins by taking into account the local deformation u(x,y) as the radial asymmetry of the hydrophobic surface of such proteins. Thus CTMD considers (1)-the 1 u .x;y/ D .d .x;y/ d0/ ; (4.1) radially asymmetric membrane deformations, 2 and (2)-the possibility of incomplete alleviation of the hydrophobic mismatch between the protein where d(x,y) is the local bilayer thickness and and the membrane. To this end, the membrane d0 is the bulk thickness of the bilayer (i.e., the deformations at the membrane-protein interface equilibrium thickness away from the protein).  are determined from cognate MD trajectories, Gdef is defined as the sum of contributions and the membrane deformations away from from compression-extension, splay-distortion, the protein are solved at the continuum level and surface tension (Nielsen et al. 1998; Huang without the typical simplifying assumption of 1986):

Z ( !) 1 .2 /2 @2 @2 2 @ 2 @ 2 G K u K u u C ˛ u u d; def D a 2 C c 2 C 2 o C C (4.2) 2 d0 @x @y @x @y 

where Ka and Kc are the elastic constants evaluate the membrane deformations and the for compression-extension and splay-distortion corresponding energy costs without invoking the respectively, C0 is the monolayer spontaneous typical assumption of cylindrical protein, we next curvature, and ˛ the coefficient of surface perform the following two steps: tension. The phenomenological constants Ka, 1. The membrane-protein interface is determined Kc, C0, and d0 characterize the membrane from cognate MD simulations, with €in and property at the continuum level. Experimental u0(x,y) being obtained from the time-averaged measurements for these constants are available position of the P-atoms around the protein. for typical lipid types, see (Rawicz et al. 2000)for Specifically, the trajectory is first centered example. and aligned on the protein, keeping the Z- The local membrane deformations u(x,y)at axis along the membrane normal. Then, the continuum level are calculated by solving the time-averaged z-position of the P-atoms the Euler-Lagrange equations corresponding to is calculated on a rectangular grid with a Eq. 4.2. Following the Euler-Lagrange formula- 2 Å by 2 Å mesh to obtain the membrane-  tion, the Gdef definition in Eq. 4.2 leads to the protein interface as well as the membrane boundary value problem, thickness on this interface. This allows for € 4K ˇ a non-cylindrical in corresponding to the K 4 ˛ 2 a 0; ˇ .x;y/ ; c r u r u C 2 u D u € in D uo membrane-protein interface. Additionally, the d0 ˇ ˇ membrane deformations u0(x,y)isnot simply 2 ˇ 2 ˇ uj€ D 0; r u €in D vo .x;y/ ; r u D 0 out €out the difference between the hydrophobic length (4.3) of the protein and that of the unperturbed membrane, thus allowing for residual where €in and €out represent the boundary at exposure. Note that the pattern of local the GPCR-membrane interface and the outer membrane deformations at the membrane- boundary of the simulation box respectively. To protein interface from MD accounts for 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 67

various interfacial interactions at the atomistic where  is the polar angle corresponding to the level, including interactions between single coordinates (x,y), and an and bn are the Fourier lipid molecules and specific sites of the coefficients. This step is required to make the GPCRs and any tilting of the TMs within method feasible for multi-TM proteins with long the steric constraint of the whole TM-bundle. membrane-protein interface, e.g., the 7 TM- 2. The resulting boundary value problem bundle of GPCRs has a diameter on the order in Eq. 4.3 is numerically solved without of 50 Å, thus sharing a long circumferential simplifying the fourth order partial dif- boundary with the membrane. A given pair of ferential equation in u(x,y)toanordinary fan, bng in Eq. 4.5 defines a particular v0(x,y) differential equation by assuming radial vector, for which Eq. 4.4 yields the membrane symmetry. This is achieved by first expressing deformations u(x,y) and Eq. 4.2 the Gdef . the partial differential equation in u as In the CTMDapp, the non-linear optimization coupled partial differential equations in problem is solved with the objective function u and v, Gdef D ffan, bng using the BFGS optimization algorithm, which is a standard global, quasi- 4K K 2 ˛ a 0; Newtonian optimization procedure. cr v v C 2 u D d0 We note that the procedure actually involves ˇ 2 ˇ two nested minimization procedures. The inner r u D v; uˇ D uo .x;y/ ; €in minimization occurs via the Euler-Lagrange for- ˇ ˇ malism to give the membrane deformation shape ˇ ˇ uˇ D u1 .x;y/ ; vˇ D v0 .x;y/ ; that minimizes the energy cost, given u0 and v0. €out €in ˇ The outer minimization obtains the v0 that min- ˇ imizes the energy cost from the Euler-Lagrange vˇ D v1 .x;y/ (4.4) €out formalism, thus obtaining the curvature at the membrane-protein interface for which the mem- This set of simultaneous equations can be brane deformations are least costly. To verify that solved on a rectangular grid using standard finite membrane shapes calculated from CTMD and difference schemes for Poisson equations. As MD are similar, the macroscopic u(x,y) obtained implemented in the CTMDapp, Eq. 4.4 is dis- by the nested optimization procedure can be com- cretized using the central 5-point approximation pared to u(x,y) calculated directly from the cog- of the Laplacian operator, and the system is then nate microscopic MD simulation. The calcula- solved using the iterative Gauss-Siedel algorithm. tions for a number of GPCR-membrane systems As a guideline, we mention that in our experience have shown that the macroscopic u(x,y) indeed the procedure converged within 130 iterations for converges to a profile similar (within 1Å) GPCR monomers in a 100 Å by 100 Å box. to the u(x,y) obtained from the cognate MD While € , € and u (x,y) are obtained from in out 0 trajectories. This is illustrated for rhodopsin in MD trajectories, the boundary condition on the di(C14:1)PC in Fig. 4.7. Importantly, the CTMD curvature is obtained self-consistently using an calculation evaluates the G using the well- optimization procedure that searches for v (x,y) def 0 tested elastic continuum theory of membrane to minimize G globally. The complexity of def deformations. this optimization procedure is made numerically tractable by reducing the size of the v0(x,y) vector by expressing it as a truncated Fourier series 4.3.2 Residual Exposure X7 vo .x;y/ W v0 ./ fan cos .n/ The residual exposure of each membrane-facing nD0 residue (SAres,i) is quantified in terms of the sur- face area involved in unfavorable hydrophobic- C bn sin .n/g ; (4.5) polar interactions. SAres,i is obtained in terms of 68 S. Mondal et al.

Fig. 4.7 Protocol for the CTMD approach, illustrated in panel A and a random curvature boundary condition schematically for rhodopsin in a diC14:1PC lipid bi- to produce the starting point for the free-energy-based layer.(a) The membrane-deformation profile u(x,y)is optimization (Eqs. 4.4 and 4.5). The right panel is the calculated directly from the MD simulations. The left membrane-deformation profile calculated with the natu- panel is the deformation profile shown as a color map ral boundary condition, which minimizes the membrane- projected onto the surface defined by fitting a grid (spacing deformation energy penalty. Note the agreement between 2 Å) to the positions of the phosphate atoms in the two the profiles in A, calculated using a microscopic theory, leaflets during the trajectory, followed by time averaging and B, calculated using the continuum theory (they are and spatial smoothing. The right panel shows the same de- within 0.5 Å RMSD of each other). The CTMD approach formation color map on the x-y plane. (b) The membrane- also allows evaluation of the protein-induced membrane- deformation profile u(x,y) on a 100 100 Å patch, calcu- deformation energy penalty, which is 4.7 kBT in this case lated with CTMD. The left panel represents the membrane (Reprinted with permission from Mondal et al. 2011. shape calculated with the deformation boundary condition Copyright by Elsevier) at the membrane-protein interface from the MD profile the Solvent Accessible Surface Areas (SASAs) the solute taken to be the protein and the hy- calculated from cognate MD trajectories, as fol- drophobic part of the membrane. lows: The corresponding energy penalty (Gres,i) For hydrophobic residues, SAres,i is calculated can be approximated as linearly proportional to as the SASA with the solute taken to be the SAres,i, protein and hydrophobic core of the bilayer. This gives the surface area of the hydrophobic residues Gres;i D res SAres;i (4.6) exposed to the membrane headgroup or water.  For hydrophilic residues, SAres,i quantifies the with the constant of proportionality res taken to surface area of the residue embedded in the hy- be 0.028 kcal/(mol. Å2) (Choe et al. 2008; Ben- drophobic core of the membrane. It is calculated Tal et al. 1996). as the difference of the SASA with the solute In these calculations, Lys, Arg, Trp, Tyr, Ser, taken to be the protein only and the SASA with and Thr need to be treated with special attention 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 69 to the structural context. If the polar Lys or Arg has been taken to suggest that the lipid-mediated residues reside within the hydrophobic part of effect on rhodopsin function is not due to lipids the membrane but near the water-membrane in- of a specific chemical nature (Botelho et al. terface, their long side chains can extend towards 2006). Such “non-specific interactions” have the phosphate headgroups so that the polar part been interpreted to involve energetically costly of their side-chains interacts with the polar phos- hydrophobic mismatch. phate headgroups (Strandberg and Killian 2003; The computational results described herein Sankararamakrishnan and Weinstein 2002). This identify the nature of both “specific” and “non- conformational preference is known as “snorke- specific” interactions and suggest that the ling”, and it has been estimated that as a result, “specific interactions” are local and the “non- only a small energy penalty is associated with specific” hydrophobic mismatch interactions hydrophobic matching by means of snorkeling are non-local in nature. Notably, however, (Strandberg and Killian 2003). Therefore, Lys individual lipid molecules that bind to specific and Arg residues near the water-membrane in- sites of the GPCR can also participate in terface are not considered in the calculations of hydrophobic matching. For example, in the residual exposure if they are found in the MD 1.6 s simulation of rhodopsin (see Sect. 4.2.1), simulations to snorkel to the lipid headgroup re- the cholesterol that binds specifically at the gion. Interfacial Trp and Tyr are also not consid- extracellular end of TM2-TM3 partly reduces ered as candidates for residual exposure, as their solvent accessibility at the hydrophobic Leu3.27 commonly observed interfacial location is con- (Khelashvili et al. 2009). Therefore, what is sidered favorable (Yau et al. 1998). Finally, Ser fundamental to this “specific” interaction is its and Thr residues are polar, but when they reside local nature, involving electrostatic/H-bonding/ within the hydrophobic part of the membrane, VdW interactions between particular protein their polar parts can form H-bonds with the helix residues and the particular cholesterol molecule. backbone (Gray and Matthews 1984). In such On the other hand, the “non-specific” nature of situations, the Ser and Thr are interacting directly the hydrophobic mismatch is mitigated by the with the protein and not the membrane so they evidence that the membrane deforms differently would not be candidates for residual exposure. near TMs of different hydrophobic lengths, and the residual exposure occurs at specific residues of the receptor depending on the spatial 4.4 Perspective on “Specific” vs. organization of polar and hydrophobic residues. “Non-specific” However, the deformation profile involves the GPCR-Membrane membrane around the GPCR as a whole thus Interactions substantiating the non-local character of the effect. Consequently, membrane deformation Functionally relevant interaction of lipids with profiles could indeed be obtained in the CTMD GPCRs has historically been viewed as being of approach by treating the entire membrane as two different types: (1)-specific lipid-protein an elastic continuum and minimizing the total interactions, e.g., the binding of individual energetic cost of the membrane deformations cholesterol molecules to specific residues of (provided the thickness boundary conditions the GPCR; and (2)-non-specific interactions, were known from atomistic MD simulations). such as hydrophobic mismatch (Botelho et al. We showed for a number of membrane-GPCR 2002, 2006;Brown1994; Gibson and Brown systems that the membrane deformation profile 1991, 1993). For example, in affecting rhodopsin thus obtained was consistent with results from activation, lipids with PE headgroups can microscopic MD calculations that treat the compensate for the absence of lipids with PS membrane at the level of lipid molecules (Mondal headgroups and lipids with DHA tails, which et al. 2011). 70 S. Mondal et al.

for Computational Biomedicine at Weill Medical College 4.5 Conclusion of Cornell University, (2) the New York Blue Gene Com- putational Science facility housed at Brookhaven National The multifaceted role of the membrane in GPCR Lab, and (3) NSF Teragrid allocation MCB090022. function has, of late, come to the forefront of GPCR research. This has been driven by the experimental data suggesting that the membrane References environment affects oligomerization, stability, and activity of GPCRs and the observation Andersen OS, Koeppe RE (2007) Bilayer thickness and of lipid molecules co-crystallized with some membrane protein function: an energetic perspective. Annu Rev Biophys Biomol Struct 36:107Ð130 GPCR crystals. A central aspect of GPCR- Ballesteros JA, Weinstein H (1995) Integrated methods membrane interactions is how the receptor for the construction of three-dimensional models and and the membrane couple to each other at the computational probing of structure-function relations molecular level. Recent computational modeling in G protein-coupled receptors. In: Sealfon SC (ed) Methods in , vol 25. Academic, San has provided significant insight into connecting Diego, pp 366Ð428 GPCR-membrane interactions with the receptor Ballesteros J, Kitanovic S, Guarnieri F, Davies P, Fromme structure. In particular, this review brings to light BJ, Konvicka K, Chi L, Millar RP, Davidson JS, Wein- stein H (1998) Functional microdomains in G-protein- the importance of the radial asymmetry of the coupled receptors. J Biol Chem 273(17):10445Ð10453 membrane-facing surface of GPCRs in their Ballesteros JA, Jensen AD, Liapakis G, Rasmussen SGF, interaction with the surrounding membrane. The Shi L, Gether U, Javitch JA (2001) Activation of the radial asymmetry creates a context of adjacent B2-adrenergic receptor involves disruption of an ionic lock between the cytoplasmic ends of transmembrane hydrophobic and polar residues at specific sites segments 3 and 6. J Biol Chem 276(31):29171Ð29177 of the GPCR where hydrophobic matching may Barak LS, Menard L, Ferguson SSG, Colapietro remain incomplete. Moreover, it is due to this AM, Caron MG (1995) The conserved seven- radial asymmetry that there exist a few specific transmembrane sequence NP (X) 2, 3Y of the G- protein-coupled receptor superfamily regulates multi- sites where specific lipid molecules such as ple properties of the beta 2-adrenergic receptor. Bio- cholesterol bind. These interactions have been chemistry 34(47):15407Ð15414 shown to be involved in various aspects of the Ben-Tal N, Ben-Shaul A, Nicholls A, Honig B (1996) function and organization of GPCRs. Free-energy determinants of alpha-helix insertion into lipid bilayers. Biophys J 70(4):1803Ð1812 We have shown how to take into account Botelho AV, Gibson NJ, Thurmond RL, Wang Y, Brown the radial asymmetry of the hydrophobic surface MF (2002) Conformational energetics of rhodopsin of GPCRs in providing quantitative information modulated by nonlamellar-forming lipids. Biochem- about their interaction with the membrane. With istry 41(20):6354Ð6368 Botelho AV, Huber T, Sakmar TP, Brown MF (2006) the new understanding gained from the findings Curvature and hydrophobic forces drive oligomeriza- and methods presented here, it becomes possible tion and modulate activity of rhodopsin in membranes. to investigate how GPCR mechanisms, includ- Biophys J 91(12):4464Ð4477 ing ligand-determined states and their functional Brown MF (1994) Modulation of rhodopsin function by properties of the membrane bilayer. Chem Phys Lipids properties, are affected in particular lipid envi- 73(1Ð2):159Ð180 ronments. In particular, it becomes possible to Brown MF, Thurmond RL, Dodd SW, Otten D, Beyer quantify the contribution of the lipid-protein in- K (2002) Elastic deformation of membrane bilayers teractions with respect to (1)-the individual lipid probed by deuterium NMR relaxation. J Am Chem Soc 124(28):8471Ð8484 molecules interacting locally with specific sites Chelikani P, Hornak V, Eilers M, Reeves PJ, Smith of the GPCR, (2)-the deformation/remodeling SO, RajBhandary UL, Khorana HG (2007) Role of of the membrane, and (3)-the residual exposure group-conserved residues in the helical core of beta2- profile and its energy cost. adrenergic receptor. Proc Natl Acad Sci U S A 104(17):7027Ð7032 Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen Acknowledgement This work was supported by the Na- SGF, Thian FS, Kobilka TS, Choi HJ, Kuhn P, Weis tional Institutes of Health grants DA012923-09 and U54- WI, Kobilka BK (2007) High-resolution crystal struc- GM087519 to H.W. We also gratefully acknowledge the ture of an engineered human beta2-adrenergic G pro- allocations of computational resources at (1) the Institute tein coupled receptor. Science 318(5854):1258Ð1265 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 71

Chien EYT, Liu W, Zhao Q, Katritch V, Won Han G, Han- Grossfield A, Feller SE, Pitman MC (2006b) A role for son MA, Shi L, Newman AH, Javitch JA, Cherezov V, direct interactions in the modulation of rhodopsin by Stevens RC (2010) Structure of the human dopamine omega-3 polyunsaturated lipids. Proc Natl Acad Sci U D3 receptor in complex with a D2/D3 selective antag- S A 103(13):4888Ð4893 onist. Science 330(6007):1091Ð1095 Guo W, Shi L, Filizola M, Weinstein H, Javitch JA (2005) Choe S, Hecht KA, Grabe M (2008) A continuum method Crosstalk in G protein-coupled receptors: changes at for determining membrane protein insertion energies the transmembrane homodimer interface determine and the problem of charged residues. J Gen Physiol activation. Proc Natl Acad Sci U S A 102(48):17495Ð 131(6):563Ð573 17500 Choe HW, Kim YJ, Park JH, Morizumi T, Pai EF, Krauß Haga K, Kruse AC, Asada H, Yurugi-Kobayashi T, Shi- N, Hofmann KP, Scheerer P, Ernst OP (2011) Crystal roishi M, Zhang C, Weis WI, Okada T, Kobilka BK, structure of metarhodopsin II. Nature 471(7340):651Ð Haga T, Kobayashi T (2012) Structure of the human 655 M2 muscarinic acetylcholine receptor bound to an Dorsch S, Klotz KN, Engelhardt S, Lohse MJ, Bunemann antagonist. Nature 482:547Ð551 M (2009) Analysis of receptor oligomerization by Han Y, Moreira IS, Urizar E, Weinstein H, Javitch JA FRAP microscopy. Nat Methods 6(3):225Ð230 (2009) Allosteric communication between protomers Eroglu Ç, Brügger B, Wieland F, Sinning I of dopamine class A GPCR dimers modulates activa- (2003) Glutamate-binding affinity of Drosophila tion. Nat Chem Biol 5(9):688Ð695 metabotropic is modulated by Hanson MA, Cherezov V, Griffith MT, Roth CB, Jaakola association with lipid rafts. Proc Natl Acad Sci U S A VP, Chien EYT, Velasquez J, Kuhn P, Stevens RC 100(18):10219 (2008) A specific cholesterol binding site is estab- Fritze O, Filipek S, Kuksa V, Palczewski K, Hofmann KP, lished by the 2.8 Å structure of the human beta2- Ernst OP (2003) Role of the conserved NPxxY (x) adrenergic receptor. Structure 16(6):897Ð905 5, 6F motif in the rhodopsin ground state and during Hanson MA, Roth CB, Jo E, Griffith MT, Scott activation. Proc Natl Acad Sci U S A 100(5):2290Ð FL, Reinhart G, Desale H, Clemons B, Cahalan 2295 SM, Schuerer SC, Sanna MG, Han GW, Kuhn P, Fung JJ, Deupi X, Pardo L, Yao XJ, Velez-Ruiz GA, Rosen H, Stevens RC (2012) Crystal structure of a DeVree BT, Sunahara RK, Kobilka BK (2009) Ligand- lipid G protein-coupled receptor. Science 335(6070): regulated oligomerization of 2-adrenoceptors in a 851Ð855 model lipid bilayer. EMBO J 28(21):3315Ð3328 Harding PJ, Attrill H, Boehringer J, Ross S, Wadhams Gibson NJ, Brown MF (1991) Role of phosphatidylserine GH, Smith E, Armitage JP, Watts A (2009) Consti- in the MI-MIII equilibrium of rhodopsin*. Biochem tutive dimerization of the G-protein coupled receptor, Biophys Res Commun 176(2):915Ð921 neurotensin receptor 1, reconstituted into phospholipid Gibson NJ, Brown MF (1993) Lipid headgroup and acyl bilayers. Biophys J 96(3):964Ð973 chain composition modulate the MI-MII equilibrium Harroun TA, Heller WT, Weiss TM, Yang L, Huang of rhodopsin in recombinant membranes. Biochem- HW (1999a) Experimental evidence for hydropho- istry 32(9):2438Ð2454 bic matching and membrane-mediated interactions in Gimpl G, Fahrenholz F (2002) Cholesterol as stabilizer lipid bilayers containing gramicidin. Biophys J 76(2): of the . Biochimi Biophys Acta 937Ð945 Biomembr 1564(2):384Ð392 Harroun TA, Heller WT, Weiss TM, Yang L, Huang Goforth RL, Chi AK, Greathouse DV, Providence LL, HW (1999b) Theoretical analysis of hydrophobic Koeppe RE, Andersen OS (2003) Hydrophobic cou- matching and membrane-mediated interactions in lipid pling of lipid bilayer energetics to channel function. J bilayers containing gramicidin. Biophys J 76(6): Gen Physiol 121(5):477Ð493 3176Ð3185 Granier S, Manglik A, Kruse AC, Kobilka TS, Thian Hern JA, Baig AH, Mashanov GI, Birdsall B, Corrie FS, Weis WI, Kobilka BK (2012) Structure of the JET, Lazareno S, Molloy JE, Birdsall NJM (2010) delta-opioid receptor bound to naltrindole. Nature Formation and dissociation of M1 muscarinic receptor 485(7398):400Ð404 dimers seen by total internal reflection fluorescence Gray TM, Matthews BW (1984) Intrahelical hydrogen imaging of single molecules. Proc Natl Acad Sci U S A bonding of serine, threonine and cysteine residues 107(6):2693Ð2698 within [alpha]-helices and its relevance to membrane- Huang HW (1986) Deformation free energy of bilayer bound proteins. J Mol Biol 175(1):75Ð81 membrane and its effect on gramicidin channel life- Green SA, Liggett SB (1994) A proline-rich region of time. Biophys J 50(6):1061Ð1070 the third intracellular loop imparts phenotypic beta 1- Innis SM (2008) Dietary omega 3 fatty acids and the versus beta 2-adrenergic receptor coupling and seques- developing brain. Brain Res 1237:35Ð43 tration. J Biol Chem 269(42):26215Ð26219 Jaakola VP, Griffith MT, Hanson MA, Cherezov V, Chien Grossfield A, Feller SE, Pitman MC (2006a) Contribution EYT, Lane JR, Ijzerman AP, Stevens RC (2008) of omega-3 fatty acids to the thermodynamics of mem- The 2.6 angstrom crystal structure of a human A2A brane protein solvation. J Phys Chem B 110(18):8907Ð adenosine receptor bound to an antagonist. Science 8909. doi:10.1021/jp060405r 322(5905):1211Ð1217 72 S. Mondal et al.

Javitch JA, Fu D, Chen J (1995a) Residues in the fifth Lundbaek JA, Collingwood SA, Ingólfsson HI, Kapoor membrane-spanning segment of the dopamine D2 re- R, Andersen OS (2010) Lipid bilayer regulation of ceptor exposed in the binding-site crevice. Biochem- membrane protein function: gramicidin channels as istry 34(50):16433Ð16439 molecular force probes. J R Soc Interface 7(44): Javitch JA, Fu D, Chen J, Karlin A (1995b) Mapping the 373Ð395 binding-site crevice of the dopamine D2 receptor by Luttrell LM, Ferguson SSG, Daaka Y, Miller WE, Maud- the substituted-cysteine accessibility method. Neuron sley S, Della Rocca GJ, Lin FT, Kawakatsu H, Owada 14(4):825Ð831 K, Luttrell DK, Caron MG, Lefkowitz RJ (1999) Javitch JA, Ballesteros JA, Weinstein H, Chen J (1998) Beta2-arrestin-dependent formation of beta2 adrener- A cluster of aromatic residues in the sixth membrane- gic receptor-Src protein kinase complexes. Science spanning segment of the dopamine D2 receptor is 283(5402):655Ð661 accessible in the binding-site crevice. Biochemistry Mancia F, Assur Z, Herman AG, Siegel R, Hendrick- 37(4):998Ð1006 son WA (2008) Ligand sensitivity in dimeric associ- Kahsai AW, Xiao K, Rajagopal S, Ahn S, Shukla AK, ations of the serotonin 5HT2c receptor. EMBO Rep Sun J, Oas TG, Lefkowitz RJ (2011) Multiple ligand- 9(4):363Ð369 specific conformations of the B2-adrenergic receptor. Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen Nat Chem Biol 7(10):692Ð700 JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, Khelashvili G, Grossfield A, Feller SE, Pitman MC, Granier S (2012) Crystal structure of the mu-opioid Weinstein H (2009) Structural and dynamic effects receptor bound to a morphinan antagonist. Nature of cholesterol at preferred sites of interaction with 485(7398):321Ð327 rhodopsin identified from microsecond length molecu- Mansoor SE, Palczewski K, Farrens DL (2006) Rhodopsin lar dynamics simulations. Prot Struct Funct Bioinform self-associates in asolectin liposomes. Proc Natl Acad 76(2):403Ð417 Sci U S A 103(9):3060 Khelashvili G, Albornoz PBC, Johner N, Mondal S, Marsh D (2008) Energetics of hydrophobic match- Caffrey M, Weinstein H (2012) Why GPCRs behave ing in lipid-protein interactions. Biophys J 94(10): differently in cubic and lamellar lipidic mesophases. 3996Ð4013 J Am Chem Soc. doi:10.1021/ja3056485 Mondal S, Khelashvili G, Shan J, Andersen OS, Wein- Kofuku Y, Ueda T, Okude J, Shiraishi Y, Kondo K, Maeda stein H (2011) Quantitative modeling of membrane M, Tsujishita H, Shimada I (2012) Efficacy of the B2- deformations by multihelical membrane proteins: ap- adrenergic receptor is determined by conformational plication to G-protein coupled receptors. Biophys J equilibrium in the transmembrane region. Nat Com- 101(9):2092Ð2101 mun 3:1045 Mondal S, Khelashvili G, Wang H, Provasi D, Andersen Lagerstrom MC, Schioth HB (2008) Structural diversity of OS, Filizola M, Weinstein H (2012) Interaction with G protein-coupled receptors and significance for drug the membrane uncovers essential differences between discovery. Nat Rev Drug Discov 7(4):339Ð357 highly homologous GPCRs. Biophys J 102(3):514a Lavie CJ, Milani RV, Mehra MR, Ventura HO (2009) Neuringer M, Connor WE, Van Petten C, Barstad L (1984) Omega-3 polyunsaturated fatty acids and cardiovascu- Dietary omega-3 fatty acid deficiency and visual loss lar diseases. J Am Coll Cardiol 54(7):585Ð594 in infant rhesus monkeys. J Clin Invest 73(1):272 Lebon G, Warne T, Edwards PC, Bennett K, Lang- Nielsen C, Goulian M, Andersen OS (1998) Energetics mead CJ, Leslie AGW, Tate CG (2011) Agonist- of inclusion-induced bilayer deformations. Biophys J bound adenosine A2A receptor structures reveal com- 74(4):1966Ð1983 mon features of GPCR activation. Nature 474(7352): Niu S-L, Mitchell DC, Litman BJ (2002) Manipulation of 521Ð525 cholesterol levels in rod disk membranes by methyl- Liapakis G, Ballesteros JA, Papachristou S, Chan WC, beta-cyclodextrin. J Biol Chem 277(23):20139Ð20145. Chen X, Javitch JA (2000) The forgotten serine: a doi:10.1074/jbc.M200594200 critical role for Ser-2035.42 in ligand binding to and O’Dowd BF, Hnatowich M, Regan JW, Leader WM, activation of the B2-adrenergic receptor. J Biol Chem Caron MG, Lefkowitz RJ (1988) Site-directed mu- 275(48):37779Ð37788 tagenesis of the cytoplasmic domains of the human Liu W, Chun E, Thompson AA, Chubukov P, Xu F, beta 2-adrenergic receptor. Localization of regions Katritch V, Han GW, Roth CB, Heitman LH, Ijzerman involved in G protein-receptor coupling. J Biol Chem AP, Cherezov V, Stevens RC (2012) Structural basis 263(31):15985Ð15992 for allosteric regulation of GPCRs by sodium ions. Palczewski K, Kumasaka T, Hori T, Behnke CA, Science 337(6091):232Ð236 Motoshima H, Fox BA, Trong IL, Teller DC, Lundbæk JA, Andersen OS (1999) Spring constants Okada T, Stenkamp RE (2000) Crystal structure for channel-induced lipid bilayer deformations esti- of rhodopsin: a G protein-coupled receptor. Science mates using gramicidin channels. Biophys J 76(2): 289(5480):739Ð745 889Ð895 Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst Lundbaek JA, Andersen OS, Werge T, Nielsen C (2003) OP (2008) Crystal structure of the ligand-free G- Cholesterol-induced protein sorting: an analysis of protein-coupled receptor opsin. Nature 454(7201): energetic feasibility. Biophys J 84(3):2080Ð2089 183Ð187 4 How the Dynamic Properties and Functional Mechanisms of GPCRs Are Modulated... 73

Periole X, Huber T, Marrink SJ, Sakmar TP (2007) Shi L, Javitch JA (2004) The second extracellular loop G protein-coupled receptors self-assemble in dynam- of the dopamine D2 receptor lines the binding- ics simulations of model bilayers. J Am Chem Soc site crevice. Proc Natl Acad Sci U S A 101(2): 129(33):10126Ð10132 440Ð445 Periole X, Knepp AM, Sakmar TP, Marrink SJ, Huber T Shi L, Liapakis G, Xu R, Guarnieri F, Ballesteros JA, (2012) Structural determinants of the supramolecular Javitch JA (2002) B2 Adrenergic receptor activation: organization of G protein-coupled receptors in bilay- modulation of the proline kink in transmembrane 6 ers. J Am Chem Soc 134(26):10959Ð10965 by a rotamer toggle switch. J Biol Chem 277(43): Prioleau C, Visiers I, Ebersole BJ, Weinstein H, Sealfon 40989Ð40996 SC (2002) Conserved helix 7 tyrosine acts as a mul- Soubias O, Teague WE, Gawrisch K (2006) Evidence for tistate conformational switch in the 5HT2C receptor. specificity in lipid-rhodopsin interactions. J Biol Chem J Biol Chem 277(39):36577Ð36584 281(44):33233Ð33241 Pucadyil TJ, Chattopadhyay A (2004) Cholesterol mod- Soubias O, Niu SL, Mitchell DC, Gawrisch K ulates ligand binding and G-protein coupling to (2008) Lipid- rhodopsin hydrophobic mismatch serotonin 1A receptors from bovine hippocam- alters rhodopsin helical content. J Am Chem Soc pus. Biochimi Biophys Acta Biomembr 1663(1Ð2): 130(37):12465Ð12471 188Ð200 Stahl LA, Begg DP, Weisinger RS, Sinclair AJ (2008) The Rasmussen SGF, Choi HJ, Rosenbaum DM, Kobilka TS, role of omega-3 fatty acids in mood disorders. Curr Thian FS, Edwards PC, Burghammer M, Ratnala VRP, Opin Investig Drugs 9(1):57Ð64 Sanishvili R, Fischetti RF, Schertler GFX, Weis WI, Strader CD, Sigal IS, Register RB, Candelore MR, Rands Kobilka BK (2007) Crystal structure of the human E, Dixon RA (1987) Identification of residues required beta2 adrenergic G-protein-coupled receptor. Nature for ligand binding to the beta-adrenergic receptor. Proc 450(7168):383Ð387 Natl Acad Sci U S A 84(13):4384Ð4388 Rasmussen SGF, Choi HJ, Fung JJ, Pardon E, Casarosa Strandberg E, Killian JA (2003) Snorkeling of lysine side P, Chae PS, DeVree BT, Rosenbaum DM, Thian chains in transmembrane helices: how easy can it get? FS, Kobilka TS (2011) Structure of a nanobody- FEBS Lett 544(1Ð3):69Ð73 stabilized active state of the beta2 adrenoceptor. Na- Suryanarayana S, Von Zastrow M, Kobilka BK (1992) ture 469(7329):175Ð180 Identification of intramolecular interactions in Rawicz W, Olbrich KC, McIntosh T, Needham D, Evans adrenergic receptors. J Biol Chem 267(31): E (2000) Effect of chain length and unsaturation 21991Ð21994 on elasticity of lipid bilayers. Biophys J 79(1): Visiers I, Ebersole BJ, Dracheva S, Ballesteros J, Sealfon 328Ð339 SC, Weinstein H (2002) Structural motifs as functional Rosenbaum DM, Rasmussen SG, Kobilka BK (2009) The microdomains in G-protein coupled receptors: ener- structure and function of G-protein-coupled receptors. getic considerations in the mechanism of activation Nature 459(7245):356Ð363 of the serotonin 5HT2A receptor by disruption of the Rosenbaum DM, Zhang C, Lyons JA, Holl R, Aragao ionic lock of the arginine cage. Int J Quantum Chem D, Arlow DH, Rasmussen SGF, Choi HJ, DeVree 88(1):65Ð75 BT, Sunahara RK, Chae PS, Gellman SH, Dror RO, Warne T, Serrano-Vega MJ, Baker JG, Moukhametzianov Shaw DE, Weis WI, Caffrey M, Gmeiner P, Ko- R, Edwards PC, Henderson R, Leslie AGW, Tate CG, bilka BK (2011) Structure and function of an irre- Schertler GFX (2008) Structure of a beta1-adrenergic versible agonist-beta 2 adrenoceptor complex. Nature G-protein-coupled receptor. Nature 454(7203): 469(7329):236Ð240 486Ð491 Ruprecht JJ, Mielke T, Vogel R, Villa C, Schertler GFX Weinstein H (2005) Hallucinogen actions on 5-HT re- (2004) Electron crystallography reveals the structure ceptors reveal distinct mechanisms of activation and of metarhodopsin I. EMBO J 23(18):3609Ð3620 signaling by G protein-coupled receptors. AAPS J Sankararamakrishnan R, Weinstein H (2002) Positioning 7(4):871Ð884 and stabilization of dynorphin peptides in membrane Wu B, Chien EYT, Mol CD, Fenalti G, Liu W, bilayers: the mechanistic role of aromatic and basic Katritch V, Abagyan R, Brooun A, Wells P, Bi residues revealed from comparative MD simulations. FC, Hamel DJ, Kuhn P, Handel TM, Chere- J Phys Chem B 106(1):209Ð218 zov V, Stevens RC (2010) Structures of the Scheerer P, Park JH, Hildebrand PW, Kim YJ, Krauß CXCR4 chemokine GPCR with small-molecule N, Choe HW, Hofmann KP, Ernst OP (2008) Crystal and cyclic peptide antagonists. Science 330(6007): structure of opsin in its G-protein-interacting confor- 1066Ð1071 mation. Nature 455(7212):497Ð502 Wu H, Wacker D, Mileni M, Katritch V, Han GW, Vardy Shan J, Khelashvili G, Mondal S, Mehler EL, Weinstein E, Liu W, Thompson AA, Huang XP, Carroll FI, H (2012) Ligand-dependent conformations and dy- Mascarella SW, Westkaemper B, Mosier PD, Roth namics of the serotonin 5-HT2A receptor determine BL, Cherezov V, Stevens RC (2012) Structure of the its activation and membrane-driven oligomerization human kappa-opioid receptor in complex with JDTic. properties. PLoS Comput Biol 8(4):e1002473 Nature 485(7398):327Ð332 74 S. Mondal et al.

Xiang Y, Rybin VO, Steinberg SF, Kobilka B (2002) agonist-bound human A2A adenosine receptor. Sci- Caveolar localization dictates physiologic signaling ence 332(6027):322Ð327 of beta2-adrenoceptors in neonatal cardiac myocytes. Yau WM, Wimley WC, Gawrisch K, White J Biol Chem 277(37):34280Ð34286 SH (1998) The preference of tryptophan for Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao membrane interfaces. Biochemistry 37(42):14713Ð ZG, Cherezov V, Stevens RC (2011) Structure of an 14718 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions 5 of Lipids and Cholesterol with Rhodopsin

Joshua N. Horn, Ta-Chun Kao, and Alan Grossfield

Abstract Protein function is a complicated interplay between structure and dy- namics, which can be heavily influenced by environmental factors and conditions. This is particularly true in the case of membrane proteins, such as the visual receptor rhodopsin. It has been well documented that lipid headgroups, polyunsaturated tails, and the concentration of cholesterol in membranes all play a role in the function of rhodopsin. Recently, we used all-atom simulations to demonstrate that different lipid species have preferential interactions and possible binding sites on rhodopsin’s surface, consistent with experiment. However, the limited timescales of the simulations meant that the statistical uncertainty of these results was substantial. Accordingly, we present here 32 independent 1.6 s coarse-grained simulations exploring lipids and cholesterols surrounding rhodopsin and opsin, in lipid bilayers mimicking those found naturally. Our results agree with those found experimentally and in previous simu- lations, but with far better statistical certainty. The results demonstrate the value of combining all-atom and coarse-grained models with experiment to provide a well-rounded view of lipid-protein interactions.

Keywords Rhodopsin ¥ All-atom simulations ¥ Coarse-graining ¥ Lipid-protein interactions ¥ Protein function

5.1 Lipid-Protein Interactions

Protein structure and dynamics play a major role J.N. Horn ¥ T.-C. Kao ¥ A. Grossfield () in function. It is not surprising then that mem- Department of Biochemistry and Biophysics, brane protein function would be strongly in- University of Rochester Medical Center, 601 Elmwood fluenced by interactions at the protein-lipid in- Ave, Box 712, Rochester, NY 14642, USA terface (Marsh, 2008). There are many reasons e-mail: alan_grossfi[email protected] that the lipid environment plays a major part

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 75 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__5, © Springer ScienceCBusiness Media Dordrecht 2014 76 J.N. Horn et al. in the structure and thus the function of inte- properties, protein structure and flexibility, and gral membrane proteins. First, lipid composi- protein function (Haswell et al., 2011). One such tion of membranes is incredibly diverse, mean- channel, MscL (mechanosensitive channel of ing that there is no single property that defines large conductance), responds to turgor pressure a membrane; rather, membranes from different changes and hypotonic shock by opening a cells (or different parts of the same cell) can 30 Å pore to release osmolytes and solutes differ wildly from each other (Brügger et al., (Martinac et al., 1987; Cruickshank et al., 1997). Moreover, the membrane-water interface 1997). The mechanism by which MscL “senses” is a “region of tumultuous chemical heterogene- these conditions is by responding structurally ity” (Wiener and White, 1992), giving membrane to tension changes in the lipid bilayer. Turgor proteins a highly diverse surrounding environ- pressure, as well as cellular processes like cell ment (Engelman, 2005). division, stretches or compresses the bilayer. A number of interactions are at play in the Thorough reviews of this interplay between membrane environment, including hydrophobic membrane structure and protein function for mismatch (Mouritsen and Bloom, 1984), these channels exist elsewhere (Kung et al., 2010; lipid fluidity, membrane tension, hydrocarbon Martinac, 2011; Haswell et al., 2011). chain packing (Fattal and Ben-Shaul, 1993), The bacterial potassium channel, KcsA, bilayer free volume (Mitchell et al., 1990), the is another prototypical example. It has been intrinsic curvature of lipids (Gruner, 1985) and demonstrated that KcsA requires a lipid bilayer elastic strain resulting from bilayer curvature for proper folding, despite being stable and frustration (Lee, 2004). Furthermore, lipid- active in experiments using detergent micelles, protein interactions can be subdivided into bulk and requires the binding of a single negatively interactions, or those that result from bilayer charged lipid on each monomer for activation properties, and specific interactions, or those (Valiyaveetil et al., 2002). Simulation studies that involve an association with individual lipids confirmed a Arg64ÐArg89 binding motif (Valiyaveetil et al., 2002). Specific interactions for acidic lipids that was discovered in the can be further subdivided into those with annular experimental studies (Sansom et al., 2005). lipids, or the boundary lipids forming the first Linear gramicidins are a family of bilayer- shell around a protein, and non-annular lipids, spanning antibacterial cation channels that which can be described as “co-factor” lipids increase the permeability of target membranes with unique binding sites (Simmonds et al., (Harold and Baarda, 1967). The natural folding 1982; Lee, 2003). In some cases, lipids can even preference is to form intertwined (double- be structural elements of membrane proteins stranded) dimers (Burkhart et al., 1998). In the (Lee, 2003). A more complete survey of the presence of lipid bilayers they refold into the suggested lipid-protein interaction models can functional end-to-end (single-stranded) dimers be found in a number of thorough reviews (Andersen et al., 1999), with all four Trp residues (Mouritsen and Bloom, 1993; Andersen and in each subunit hydrogen bonding with the Koeppe, 2007). bilayer at the membrane surface (O’Connell et al., 1990), effectively anchoring the structure in a bilayer spanning configuration. The preference 5.1.1 Model Systems for for this conformation in the presence of a Lipid-Protein Interactions lipid bilayer is driven by these Trp residues at the bilayer-solvent interface, which would A number of model proteins exist that depend create a penalty if they were buried in the heavily on lipid interaction for function. One hydrophobic core. These effects have been such class of proteins is that of mechanosensitive noted for other proteins that show a similar channels. The function of these channels Trp anchoring pattern (Killian and von Heijne, depends on the relationship between membrane 2000). 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 77

provide near-atomic resolution into the possible 5.2 Rhodopsin general and specific mechanisms by which the lipid bilayer interacts with rhodopsin. Rhodopsin, a G protein-coupled receptor (GPCR) responsible for dim-light vision, is a well- characterized transmembrane protein activated 5.2.1 Rhodopsin-Lipid Interactions by the isomerization of 11-cis-retinal to the all- trans configuration via light absorption. The A number of studies by the Brown lab, start- ligand’s isomerization initiates a cascade of ing in the late 1980s and early 1990s (Wied- thermal relaxations in the protein, ending with mann et al., 1988; Gibson and Brown, 1993; metarhodopsin I (MI). MI exists in equilibrium Brown, 1994), focused on MII production as with metarhodopsin II (MII), the tranducin- a result of the photoisomerization of rhodopsin binding (or “active”) form of the protein reconstituted in membranes with a variety of (Liebman et al., 1987). The MI-MII transition phospholipid and fatty acid combinations. They features significant structural motions that lead showed that the population of MII depended to the activation of the G protein and ultimately on the lipid headgroup composition as well as results in signal transduction. the concentration of polyunsaturated acyl chains. Rhodopsin is found in large concentrations Native ROS membranes, as expected, showed the in the rod outer segment disks (ROS) of rod greatest quantities of MII. In membranes of PC cells, making up the vast majority of the with short, saturated acyl chains, for instance protein component of each disk’s membrane di(14:0)PC, rhodopsin is essentially inactive. Us- and occupying about a third of the total area ing di(22:6)PC, they demonstrated that polyun- (Molday, 1998; Buzhynskyy et al., 2011). The saturation increased the degree of activity, but disk membrane phospholipid distribution is not to native levels. The addition of PE lipids about 44 % (PC), 41 % also increased activity, though this increase was phosphatidylethanolamine (PE), 13 % phospha- minor. The presence of polyunsaturation or PE tidylserine (PS), and 2 % phosphatidylinositol lipids alone does not recreate native activity. In- (Boesze-Battaglia and Albert, 1992), with a high stead, a mixture of phospholipids containing both concentration polyunsaturated docosahexaenoic polyunsaturated chains (22:6!3) and PE head acid (DHA) tails (Boesze-Battaglia and Albert, groups had the highest activity among non-native 1989). The concentration of cholesterol in systems (Wiedmann et al., 1988). new disks is high (30 %) and decreases as Exploring the role of chain length, and in turn the disk ages (Boesze-Battaglia et al., 1989). the hydrophobic mismatch between the bilayer Given that rhodopsin is an integral membrane and rhodopsin, has shown that the MII popu- protein with a cascade of structural changes lation is maximized with chain lengths around implicated in its function, it is not surprising 18 carbons, with the equilibrium shifting back that this unique membrane environment has towards MI with chain lengths between 16 and been shown to affect the behavior of rhodopsin, 20 carbons. This is coupled to local bilayer com- particularly the equilibrium between the MI pression and stretching effects (Botelho et al., and MII states; recent reviews of these effects 2006). This mismatch and the resulting bilayer are available (Soubias and Gawrisch, 2012). deformations affect rhodopsin activation by al- Given the biomedical importance of GPCRs tering the helical content of the protein (Soubias (Drews, 2000) and studies of polyunsaturated et al., 2008). All of this suggests that mechanical fatty acids in dietary intake (Neuringer, 2000), and physical properties of the bilayer, including the implications for bilayer regulation of GPCRs the bilayer thickness and area per lipid, likely to human diseases are clear. Here, we intend to modulate the MI-MII transition. However, these not only highlight experimental and simulation bulk effects do not exclude the possibility of work that explores these effects, but also present localized lipid binding sites on the surface of the long time-scale coarse-grained simulations that protein. 78 J.N. Horn et al.

Some preliminary tests were included 5.2.2 Role of Cholesterol exploring the role of headgroups in rhodopsin activation, by comparing the effects of PE and PS A major component of the rod disk membranes, headgroups (Gibson and Brown, 1993). Later, the cholesterol has been shown to have an effect role of the membrane potential at the membrane- on rhodopsin activation as well. The presence water interface due to lipid headgroups was ex- of cholesterol in membranes drives the MI- plored more thoroughly. It was demonstrated that MII equilibrium towards MI, reducing signaling lipids with PS headgroups have two contradictory (Mitchell et al., 1990). However, it is unclear effects: they alter the bilayer’s properties in ways from these experiments whether this is caused by that oppose MII formation, but their net charge cholesterol’s effects on bilayer properties, direct creates an electrostatic environment rich with interactions between it and the protein, or some C H3O ions that promotes MII formation (Brown, combination of these effects. 1997). This is in agreement with other results Cholesterol’s effects on liquid crystalline that showed that MII formation is enhanced in membrane structure are well documented. The acidic conditions (Delange et al., 1997). Finally, presence of cholesterol causes tighter packing they concluded that both PS headgroups and of lipid hydrocarbons (Niu et al., 2002) and the combination of PE headgroups and DHA increases the thickness of the bilayer, leading to chains are needed to maximize the population changes in the lateral compressibility (Needham of MII. The theory suggests that membranes et al., 1988). These bilayer effects may create with only PC and PS headgroups would favor an environment that inhibits the conformational MII based only on electrostatics, but this would transition from MI to MII, though cholesterol be counteracted by the structural unfavorability also promotes negative curvature elastic stress, a of a charged bilayer surface. As a result, ROS property of PE headgroups that tends to promote membranes have high concentrations of lipids MII formation (Chen and Rand, 1997). with highly negative spontaneous curvature (PE) The alternative means by which cholesterol and polyunsaturated chains to counteract this by may affect rhodopsin activation is through di- providing curvature stress and thus promote MII rect and possibly specific interactions. Studies of (Wang et al., 2002). These results led Brown cholesterol, cholestatrienol and ergosterol in ROS and coworkers to a general model, known as the membranes have suggested that there is at least flexible surface model (FSM), where composition one cholesterol binding site on rhodopsin (Albert of the lipid matrix actively regulates rhodopsin et al., 1996). Sites of preferential cholesterol function (Botelho et al., 2002). interaction have been identified via molecular The effects of headgroups on the MI-MII simulation as well (Grossfield et al., 2006b), equilibrium are not limited to electrostatics and although these and other simulations suggested membrane elasticity. In fact, in work intended to that cholesterol is, on the whole, depleted at the discern the energetic contribution of membrane protein surface (Pitman et al., 2005). elasticity to rhodopsin function, Gawrisch and coworkers noted that PE headgroups also induce a shift toward MII that correlates with their 5.2.3 Simulation Applied hydrogen-bonding ability (Soubias et al., 2010). to Rhodopsin Furthermore, saturation transfer NMR studies of rhodopsin in mixed PC/PE bilayers showed In 2000, the first crystal structure of a GPCR greater magnetization transfer to PE lipids when was solved in the form of bovine rhodopsin at compared to PC lipids (Soubias et al., 2006), 2.8 Å resolution (Palczewski et al., 2000), open- suggesting that in addition to their effects on bulk ing the way for the use of molecular dynamics bilayer properties, there may be a specific role simulation to probe the atomic-level interactions for PE headgroups at the surface of rhodopsin. between rhodopsin and its environment. An early 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 79 simulation of rhodopsin, performed by Huber alyzed to explore the interactions between the et al. (2002), was compared to available NMR palmitoyl moieties attached to a pair of cysteines data to quantify membrane deformation in the in the cytoplasmic helix H8 and the helical bundle presence of protein and compute cross-sectional of rhodopsin (Olausson et al., 2012). There was protein areas. They found that rhodopsin imposed a high degree of contact between both chains curvature in the bilayer, which could facilitate and the protein, but also a significant difference selection for polyunsaturated lipids at the surface between the two, even though they are attached of rhodopsin. This was seen as further support for to consecutive residues. The high level of contact the flexible surface model (Huber et al., 2004). between the palmitate on Cys322 and the protein As the available computer power improved, helices suggests that it may play an important role simulations began to explore the interactions beyond that of a nonspecific lipid anchor. between rhodopsin and bilayer constituents via Recently, some groups have begun using longer molecular dynamics trajectories of the coarse-grained simulations to explore rhodopsin- protein embedded in a more realistic bilayer. lipid interactions as well. For example, These systems featured cholesterol, as well as a Periole et al. (2007) explored the effects of mixture of lipids with two different headgroups, varying the bilayer hydrophobic thickness on each of which was linked to one polyunsaturated the oligomerization of rhodopsin, using the tail and one saturated tail. Early results showed a MARTINI force field (Marrink et al., 2007; preference for the polyunsaturated tails at the Monticelli et al., 2008). The results demonstrated surface of rhodopsin and little indication of that oligomerization is driven by frustration specific binding sites for cholesterol (Feller et al., of lipid-protein interactions, something that 2003; Pitman et al., 2005). had already been seen experimentally (Botelho Later, a series of 26 independent 100 ns et al., 2006). Rhodopsin significantly altered simulations of rhodopsin in a realistic lipid the local membrane thickness, encouraging composition was used to address these questions, oligomerization as a way to reduce unfavorable with better statistical sampling (Grossfield protein-lipid interactions. While useful for et al., 2006b). Polyunsaturated tails were again exploring bilayer adaptations attributed to enriched at the surface of the protein. It was hydrophobic mismatch, each simulation featured also now possible to identify residues that homogeneous bilayers, so no new evidence preferentially interacted with cholesterol and was gleaned about the role for cholesterol each of the lipid components. Although this and different lipid headgroups in rhodopsin work required a heroic effort at the time, activation. the sampling was not sufficient to give high In this work, we employed simulations confidence in the predictions about specific featuring membranes with native-like compo- residues, particularly for cholesterol, since any sitions to more thoroughly explore protein-lipid given interaction was only seen in a small interactions. Our focus in these systems was fraction the trajectories (Grossfield et al., the role played by the saturation state of the 2006b). Nonetheless, the results demonstrated lipids, the headgroups of the lipids, and the that computational methods could confirm the presence of cholesterol in modulating lipid- experimentally suggested trend of preferential protein interactions. interactions with polyunsaturated tails, allowing some to speculate on the roles these flexible lipids could play in the activation of rhodopsin 5.3 Methods (Feller and Gawrisch, 2005). Preferential sites of interaction between rhodopsin and cholesterol 5.3.1 Simulation Systems were also identified (Khelashvili et al., 2009). More recently, the same data (supplemented The goal of this research was to explore lipid- by an additional 1.6 s simulation) was rean- protein interactions in a system with rhodopsin 80 J.N. Horn et al. and a biologically relevant model membrane. We retinal-free opsin was also modeled (PDB ID: sought to extend previously published work that 3CAP Park et al. 2008). The MARTINI coarse- featured 26 separate 100 ns all-atom simulations grained force field was used (Marrink et al., of rhodopsin in a membrane (Grossfield et al., 2007) with the extension for proteins (Monticelli 2006b). These systems featured a 2:2:1 ratio of 1- et al., 2008). Recently, the MARTINI forcefield stearoyl-2-docosahexaenoyl-phosphatidylcholine was applied to study the aggregation properties (SDPC), 1-stearoyl-2-docosahexaenoyl-phosph- of rhodopsin (Periole et al., 2012; Knepp et al., atidylethanolamine (SDPE), and cholesterol, and 2012). To create the coarse-grained model for were intended to mimic those membranes found each protein, all molecules present other than biologically and used in experimental studies rhodopsin/opsin were removed from the crystal with model membranes (Boesze-Battaglia and structure. We then mapped the coarse amino Albert, 1989; Wiedmann et al., 1988; Albert acids onto this structure using the martinize.py et al., 1996). The systems described here mimic script available online at the MARTINI website these all-atom systems in an attempt to bring (MARTINI, 2011). The long C-terminal tail together a great deal of experimental work that found in the rhodopsin structure, which was has been done investigating the role of various not resolved in the opsin structure, was not membrane properties on rhodopsin activation, removed. It makes minimal contact with the taking advantage of the ability of molecular bilayer and thus should not affect our results dynamics simulation to explore interactions on in this context. The acyl chains on the cysteine the molecular level. residues at positions 322 and 323 were manually While being a remarkable effort at the time, added with parameters used for palmitoyl chains previous all-atom simulations had some limita- in lipids. The resulting system was energy tions that faster computers and more efficient minimized. Retinal was not explicitly represented simulation methods have allowed us to over- in the rhodopsin model; in the MARTINI protein come. By employing a coarse-grained model, we model, protein fluctuations are dominated by the simulated larger systems for longer timescales, network of restraints required to stabilize the allowing for better sampling of long time-scale tertiary structure, and since the retinal itself does processes critical for bilayer-protein interactions, not interact with the membrane environment in such as the lateral reorganization of the lipid any significant way, we believe it is sufficient to bilayer. The work described here features systems model the retinal-stabilized dark-state rhodopsin nearly three times larger than the all-atom work, structure of the protein in its absence. with timescales an order of magnitude longer. First, we built an SDPC bilayer via self- Furthermore, simulations of “dark” and “active” assembly with a box of randomly placed SDPC rhodopsin were done to make comparisons about lipids and water. For the rhodopsin system, we the lipid and cholesterol binding surfaces in these inserted the protein into this SDPC lipid bilayer differing structures. Overall, the new results agree using a bilayer expansion and compression with the earlier all-atom work, while provid- technique (Kandt et al., 2007). In this method, ing more robust statistics and allowing us to the lipids are translated in the plane of the bilayer make some conclusions about specific and gen- by a large scaling factor, creating space for the eral interactions between the lipid environment protein insertion without clashes. Then, the lipids and rhodopsin. are scaled back to the ideal area-per-lipid using a number of cycles of translation and minimization. This reduces lipid-lipid and lipid-protein clashes. 5.3.2 Construction Water and ions were added to solvate and neutralize the system; additional NaCl was added Rhodopsin was modeled after the same crystal to bring the concentration to 100 mM. After- structure used in the previous work (PDB ID: wards, we held the protein position fixed and 1U19 Okada et al. 2004). For comparison, performed a 10 ns simulation to allow the lipid 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 81 and solvent environment to relax. To generate simulations with only the parameters necessary unique starting states for the bilayer, randomly to maintain the secondary structure. However, selected SDPC lipids were converted to SDPE under these conditions the rhodopsin structure lipids by simply changing the head group beads, moved rapidly away from the crystal structure, while others were swapped with cholesterol. The reaching a transmembrane alpha carbon RMSD result was a set of 16 unique rhodopsin systems as high as 6 Å (data not shown). This problem with 2:2:1 SDPC:SDPE:cholesterol bilayers. We in the MARTINI force field with maintaining minimized the resulting systems and performed protein tertiary structure has been noted another round of equilibration, again holding the before and protocols have been developed for protein fixed. overcoming these limitations utilizing distance- To make the equivalent opsin model, based restraints (Periole et al., 2009). To maintain rhodopsin was replaced with opsin after the integrity of our proteins, we included a backbone alignment for each of the rhodopsin similar network model, restraining the distances systems. These were then equilibrated with between backbone beads between 2 and 10 Å a short simulation (100 ps) to ensure that the apart. We tested multiple force constants using starting configurations for the opsin systems and short trajectories to try to match the amplitude the rhodopsin systems were similar. of fluctuations for the transmembrane helices The final systems included 1 protein molecule to previous all-atom simulations; with a force (either rhodopsin or opsin), 180 SDPC, 180 constant of 800 kJ/molnm2, the rhodopsin SDPE, 90 cholesterol, about 19,000 water beads transmembrane helix RMSD fluctuated between (each representing 4 water molecules), and about 2.0 and 2.5 Å, consistent with previous all-atom 140 each of NaC and Cl. This brings the system results (Grossfield et al., 2008), and the opsin size to about 26,000 CG beads, which is roughly RMSD fluctuated between 2.5 and 3.0 Å. equivalent to an all-atom system with 105,000 We ran 32 independent simulations, 16 each atoms (double the total system size and more for rhodopsin and opsin. Each simulation was than triple the bilayer size of the previous work). 1.6 s, for a total of 51.2 s of simulation time (effective time of about 205 s if we apply a 4X scaling to the time, as suggested by previous 5.3.3 Simulation Protocol authors to account for the enhanced kinetics of the coarse-grained model Marrink et al. 2007). Simulations were performed with version 4.5.4 All times we report here are the actual simulation of the GROMACS molecular dynamics package times, without the 4X scaling. (van der Spoel et al., 2005; Hess et al., 2008)on a Linux cluster. We used a time step of 10 fs, as suggested for accurate integration (Winger et al., 5.3.4 Analysis 2009; Marrink et al., 2010), with the neighbor list updated every 5 steps. We held the temperature at All simulation analyses were performed using 300 K using Nosé-Hoover temperature coupling tools developed using the LOOS library. LOOS (Nosé and Klein, 1983; Hoover, 1985) and treated is an object-oriented library implemented in C++ the pressure semi-isotropically with a reference and Boost for rapidly creating new tools for an- of 1 bar using the Parrinello-Rahman barostat alyzing molecular dynamics simulations (Romo (Parrinello and Rahman, 1981). A shift function and Grossfield, 2009, 2012). All analysis was per- was employed for electrostatics with a coulomb formed on trajectories with 1 ns time resolution. cutoff of 12 Å. The Lennard-Jones potential was shifted between 9 and 12 Å. 5.3.4.1 Bound Lipid Lifetimes It is well established that external restraints In order to measure the time required for lipids are required when simulating native proteins to exchange off the surface of the protein, we using MARTINI. Initially, we performed our calculated a conditional survival probability. 82 J.N. Horn et al.

Specifically, we identified those lipids within 6 Å Where N is the number of atoms in residue of any protein bead as “bound”, and computed n, M is the sum of all atoms for all molecules the probability that if a lipid is bound at time of lipid component m in the system, rij is the t it will also be bound at time t C t. Error distance between atoms i and j . The normalized bars are estimated by treating each simulation as residue score is then simply the residue score an independent measure of the decay curve and divided by the average residue score for all trans- computing the standard error in the means. This membrane alpha helix residues: calculation was implemented using LOOS. R Rı nm nm D 1 PN (5.2) tm Rkm 5.3.4.2 Lateral Radial Distribution Ntm k Function We computed the lateral radial distribution func- 5.3.4.5 Statistical Significance tion (RDF) of various bilayer species relative to Because we have multiple independently con- the center of mass of the protein structure in the structed trajectories of each system, we have membrane plane using the xy_rdf tool distributed attempted to assess the statistical significance of with LOOS tools. Each molecule was treated as a our results. This was done using a Welch’s T- single unit, located at its centroid. test, treating the average result (e.g. the radial distribution density at a particular distance) from 5.3.4.3 Density Maps each trajectory as a single data point. We typically We created 2D density maps to show average plot the p-values on the same axes as the results density of each lipid component in the plane themselves. To compute these p-values, we used of the bilayer. We aligned the protein structure statistical tools available in SciPy (Jones et al., of each frame of the simulation to the initial 2001Ð). structure. Then we binned the centroid of each component in a grid on the plane parallel to the membrane, with 1 Å2 bins. The resulting density 5.4 Results and Discussion histogram is displayed as a heat map. To probe the 3-dimensional distribution of Before beginning any kind of analysis of lipid lipid components about the membrane, we first binding, it is critical that we show thorough aligned our trajectories using transmembrane sampling of the exchange between surface 3 C˛’s. Then we used a 1 Å grid superimposed bound lipids and bulk lipids; otherwise, the over the protein’s bounding box, padded by results would be dominated by the starting 30 Å. Each atom is then placed in the nearest configuration of the lipids. To accomplish this bin and the resulting histogram is convolved we calculated the probability if a lipid is found with a gaussian for a smoother visualization. at the protein surface at time t it will also The gaussian smoothing is only applied to the be there at time t C t. The resulting decay 3D visualizations, not the 2D density maps. curves for SDPE and SDPC are shown in Fig. 5.1. These curves can be fit to a double 5.3.4.4 Residue-Based Binding Scans exponential, resulting in two characteristic decay To highlight interactions between lipid compo- times. The fast time, accounting for roughly nents and individual residues, we computed a half the amplitude, is below 1 ns. Given that residue binding score for each residue of the the analysis was performed using 1 ns samples, protein to each lipid component. The residue this is most likely the result of flickering at score R for residue n and lipid component m can the edge of the “bound” state. The slow decay be expressed as: times for SDPE and SDPC in the rhodopsin system are 78.3 ns (˙2.88) and 63.2 ns (˙1.31), N M 1 X X 1 respectively. Rnm D (5.1) N r6 i j ij 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 83

Fig. 5.1 Survival 1 probability for lipids at the SDPE surface of rhodopsin. Error 0.9 SDPC bars are the standard error of the means of the 0.8 16 trajectories 0.7

0.6

0.5

Probability 0.4

0.3

0.2

0.1

0 0 100 200 300 400 500 Δt (ns)

5.4.1 RDFs Demonstrate Surface D and E show these p-value plots for choles- Preferences for DHA terol, DHA and stearoyl, comparing the means between rhodopsin and opsin. Statistically signif- To begin our analysis, we used a two-dimensional icant (P < 0:01) differences appear between all radial distribution function (RDF) in the plane of analyzed bilayer constituents, but some for only the membrane, as was done in previous work brief stretches of the RDF curves. (Feller et al., 2003), to assess the packing of The short region of cholesterol significance the different members of the bilayer against coincides with a peak in the opsin curve that is rhodopsin. In Fig. 5.2, we show RDFs for the not present in rhodopsin, indicating a region of two lipid tail types and cholesterol. Here we can bulk density at the very surface of opsin where see a drastic enrichment of docosahexaenoic acid the DHA density is the highest. The significant (DHA) between 15 and 20 Å from the center regions are much more substantial for the lipids of the protein, with the peaks for cholesterol tails. For stearoyl, large stretches of the curve and stearoyl beyond 20 Å. Interestingly, while show very significant differences (P < 0:001). cholesterol is not enriched at the surface, it has Visual inspection of the RDFs shows greater significant density deeper into the protein than penetration of the lipid tails between 10 and 20 Å. stearoyl, and nearly as deep as DHA. Given its This can be explained to some degree by the smaller size, this may be indicative of regions greater flexibility and more “open” structure for accessible only to cholesterol. the opsin system; there is a greater area accessible To accurately assess the significance of the to the lipid tails between the helices and in the difference between RDF curves for opsin and protein interior. rhodopsin, we calculated p-values for each point. Given that each point is the mean of a set of 16 independent samples, we have a fairly large 5.4.2 Density Maps Show DHA set of data from which to do this assessment Preference (unusual in the simulation community). The re- sulting p-values are plotted below the RDF curves The above results are consistent with previous with the same x-axis, with confidence levels of simulation and experimental results. However, 0.01 and 0.05 shown for reference. Panels C, simple lateral radial distribution functions 84 J.N. Horn et al.

Fig. 5.2 Lateral radial a distribution functions of 1.4 each of the lipid tails and cholesterol for the (a) 1.2 rhodopsin and (b) opsin 1 systems. Comparison of means tests between 0.8 rhodopsin and opsin for (c) 0.6 cholesterol, (d) DHA, and (e) stearoyl 0.4 CHOL 0.2 DHA Radial Distribution Function STEA b 0 1.4 1.2 1 0.8 0.6 0.4 CHOL 0.2 DHA Radial Distribution Function STEA 0 c 1 .05 .01 .001

P−value .0001 d 1 .05 .01 .001

P−value .0001 e 1 .05 .01 .001

P−value .0001

5 10 15 20 25 30 35

X(Å) contain limited information, because they treat dark region in the center of each frame repre- rhodopsin as a featureless cylinder, integrating sents the excluded volume of the helix bundle, out the distinctions between different portions as we look down from extracellular side. Opsin of the protein surface. Moreover, in these and rhodopsin were aligned by their backbones plots, both leaflets were treated together, again so that they are oriented the same way in all averaging away potentially valuable information. of the heat maps for comparison. These images Accordingly, we instead project our results along represent the average of all 16 simulations for 2 dimensions, using lateral density heat maps. each system. Figure 5.3 shows density maps for the dif- In the plots of DHA density, we see a ferent lipid components for both rhodopsin and bright, thin ring tracing the protein space. In opsin, in both the upper and lower leaflets. The contrast with the low densities for stearoyl 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 85

Rhodopsin Opsin Upper Lower Upper Lower DHA 30 20 10 )

Å 0 Y ( −10 −20 −30 STEA 30 20 10 )

Å 0 Y ( −10 −20 −30 CHOL 30 20 10 )

Å 0 Y ( −10 −20 −30

−30 −20 −10 0 10 20 30 −30 −20 −10 0 10 20 30 −30 −20 −10 0 10 20 30 −30 −20 −10 0 10 20 30

X (Å) X (Å) X (Å) X (Å) 1

0 0.5 1 1.5 2 2.5 3 3.5 2 7 Density

3 6

4 5

Fig. 5.3 Density maps of each bilayer component, for origin and aligned against a reference structure so that each leaflet, in each system (rhodopsin or opsin). Density the maps can be directly compared. The orientation of is reported as lipid components per Å2. All images are the helices is shown in the small panel in the bottom viewed from the extracellular side of the protein. The right corner, also viewed from the extracellular side of upper leaflet refers to the leaflet on the extracellular side. the protein For every map, rhodopsin or opsin was centered at the and cholesterol, this indicates that DHA is stearoyl rings are dimmer and, in general, more preferentially packed against the surface of diffuse. the protein, with the exception of a bright The lateral radial distribution functions, cholesterol spot next to helices H1 and H7. The coupled with the density heat maps, suggest a corresponding stearoyl densities show rings as strong preference for DHA at the surface of both well, immediately outside the DHA ring. The opsin and rhodopsin, in agreement with previous 86 J.N. Horn et al. experimental and computational results. Previous comparison, p-values were computed for every work suggests that this preference is entropically point in the maps and plotted. Panel C shows driven (Grossfield et al., 2006a). It has been the p-value plot comparing SDPE (panel A) and demonstrated that DHA is extremely flexible SDPC (panel B) for the upper leaflet. There (Feller et al., 2002) and rapidly interconverts is a large region of statistical significance that between conformations (Soubias and Gawrisch, indicates differences between the densities of the 2007), making it ideal for packing against two headgroups along helices H3, H4 and H5 the relatively rigid but uneven surface of the on the extracellular side of the protein. No such protein. regions of interest are seen in the other leaflet, and The region just beyond the first shell of DHA it is less pronounced for the opsin system (data chains is enriched in stearate. This result is not not shown). surprising, because the lipids used in these sim- ulations each have one DHA and one stearoyl. For every lipid with a DHA tail packed against 5.4.4 Mapping Density to Structure rhodopsin, there is also a stearoyl facing away Probes Cholesterol Binding from the protein, accounting for the inner DHA Sites ring and the outer stearoyl ring. This outer ring is not as bright in the heat maps as the DHA ring The present simulations, by accessing the mi- because the accessible surface area in this ring is crosecond timescale, are much more effective at far greater, so the motion of these tails is more allowing for sampling of the lateral motion of diffuse. cholesterol and lipids in the bilayer than previous work. Considering the enhanced rate of lipid diffusion in the MARTINI forcefield, each of 5.4.3 SDPE is Preferred at Protein our simulations arguably samples nearly 6.5 s Surface of lateral reorganization. With this in mind, we have the ability to probe for the appearance of In Fig. 5.4, we compare preferences between cholesterol binding sites in all of our simulations SDPE and SDPC within each system to explore and hopefully converge on representative sites, a possible preference for one headgroup over whereas previous work could only note sites another. Visual inspection suggests a slight where some fraction of the simulations had seen preference for SDPE over SDPC at the surface of strong contacts. the protein, which is confirmed to be statistically In the lateral radial distribution plots, there significant; the lower panels show the p-value is evidence that cholesterol can pack inside the for the difference between SDPC and SDPE and helical bundle (deeper than either lipid tail), albeit demonstrates significance at the 0.01 level for not with high abundance. Indeed, integrating the the entire first “solvation” shell of rhodopsin, RDF suggests the presence of roughly 1 choles- and most of that region for opsin. Panel E shows terol immediately at the protein surface (data not that the differences in the SDPC RDF between shown). The heat maps clarify this result; visual rhodopsin and opsin are marginally significant at examination shows one site of clear cholesterol best, but the SDPE RDF (Panel F) does show a preference, next to helices H1 and H7, approxi- significant difference in the location of this initial mately where the palmitoyls attach to helix H8. rise at the protein surface. DHA lipids are excluded from this location, as This is in agreement with experimental results seen by a patch of low density. Whether this that suggest that PE is a preferred partner for site represents a true competitive binding site, rhodopsin (Soubias et al., 2006, 2010). To explore where binding to cholesterol is preferred over the possibility of specific PE interaction sites, other bilayer constituents, or simply a deep pro- we generated density maps of SDPE and SDPC tein pocket or crevice that is only accessible to for rhodopsin, found in Fig. 5.5. For quantitative cholesterol, is unclear. 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 87

1.4 a b 1.2

1

0.8

0.6

0.4 Radial Distribution Function 0.2 SDPE SDPE SDPC SDPC 0 1 .05 .01 .001 P−value .0001 c d 5 10 15 20 25 30 35 5 10 15 20 25 30 35 X (Å) X (Å) 1 .05 .01 .001 P−value .0001 e 1 .05 .01 .001 P−value .0001 f 5 10 15 20 25 30 35

X (Å)

Fig. 5.4 Lateral radial distribution functions of SDPE systems. P-values comparing differences in mean between and SDPC for the (a) rhodopsin and (b) opsin systems. rhodopsin and opsin were plotted for (e)SDPCand(f) Comparison of means tests were performed comparing SDPE SDPE to SDPC within the (c) rhodopsin and (d) opsin

In order to better understand the details of spot” is found in both the opsin and rhodopsin the cholesterol binding, we further expanded the simulations, as is a second region found on the data to look at the full 3-dimensional distribu- opposite side of the protein, between the cyto- tion. Figure 5.6 shows contours of regions with plasmic end of helix H3 and helices H4 and H5. high average cholesterol density superimposed This corresponds with a cholesterol interaction onto the structures of rhodopsin and opsin. The site predicted from a pair of 800 ns simulations brightest cholesterol spot in the 2D maps, which of the adenosine A2A receptor (Lee and Lyman, as mentioned excludes other lipids, corresponds 2012). with the region of high density beside H8, the Lastly, at the density contour level chosen, a intracellular helix lying parallel to the bilayer, third high density cholesterol region is present and packed against H1 and H7 of the protein. in the opsin system packed against helices H5 The density surface packs neatly behind the pair and H6. At lower contour thresholds, this region of post-translational palmitates on Cys322 and appears in the rhodopsin systems as well, but Cys323. In these simulations, H8 is embedded in its presence here indicates a possible difference the bilayer interface, creating a pocket in the hy- in cholesterol packing interfaces between the drophobic core that is too small for lipids to com- two structures. Given that the greatest structural fortably diffuse, leading us to conclude that in this changes between the chosen opsin and rhodopsin case, cholesterol is able to pack where other lipids structures are the orientations of helices H5 cannot penetrate efficiently. This cholesterol “hot and H6, as well as the elongation of helix H5 88 J.N. Horn et al.

ab c 30

20

10

0 Y (Å) −10

−20

−30

cd f 30

20

10

0 Y (Å) −10

−20

−30

−30 −20 −10 0 10 20 30 −30 −20 −10 0 10 20 30 −30 −20 −10 0 10 20 30 X (Å) X (Å) X(Å)

0 0.5 1 1.5 2 2.5 3 3.5 0 0.002 0.004 0.006 0.008 0.01 Density P−value

Fig. 5.5 Density maps of (a)SDPEintheupper leaflet lipid components per Å2. Maps of p-values comparing the of rhodopsin, (b)SDPCintheupper leaflet rhodopsin, means for SDPE and SDPC for (c)theupper leaflet and (f) (d)SDPEinthelower leaflet of rhodopsin and (e)SDPC the lower leaflet are also shown. The protein orientation is in the lower leaflet of rhodopsin. Density is reported as identical to that in Fig. 5.3 in opsin, these changes in cholesterol packing et al., 2006b), as discussed in the methods sec- suggest a role for helix-helix interactions and tion. Unlike that work, here we account for the arrangement in cholesterol preferences. The size of the residues, and report the ratio of the presence of the extra cholesterol binding spot packing score to that of the average score of all is consistent with Fig. 5.2, which shows greater residues in the transmembrane region (Eq. 5.1). cholesterol binding at the surface of the opsin Figure 5.7 shows the protein structures colored system. This is indicative of a general trend by residue score. The residues clustered into three that the opsin structure is more amenable to well-defined groups based on physical location; cholesterol binding, again either because the Table 5.1 lists the residues with values at least surface has greater preference for cholesterol three times greater than the average. These group- or the more “open” opsin structure provides a ings correspond nicely with the three high density greater number of cholesterol accessible pockets. regions for cholesterol as seen previously for the The above visualizations are convenient, but opsin structure. Overall, the high-scoring clusters do not in themselves tell us precisely which pro- bear a striking resemblance to the groups of tein residues are involved in the “binding sites.” residues identified in the previous all-atom work Accordingly, we decided to track specific lipid- (Grossfield et al., 2006b). residue contacts, using a variant of the pack- The most dominant cluster is a collection ing score applied in previous work (Grossfield of residues that pack between helices H1 and 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 89

H8, with the greatest contribution to the cluster score coming from the palmitoyl chains attached to H8. In recent ˇ2-adrenergic receptor crystal structures, cholesterol and palmitic acids have been resolved in the H1/H8 interface, close to where the palmitoyl post-translational modifica- tions in rhodopsin are located (Cherezov et al., 2007). However, it has been suggested that these cholesterols may exist as an artifact of crystal packing or protein dimerization. Our simulations suggest a real effect exists, showing preferential interactions between cholesterol and rhodopsin at helix H8, despite the presence of only one rhodopsin molecule. Despite success correlating our results with previous simulation and crystallographic data, we have been unable to detect significant con- tacts between cholesterol and the groove on the intracellular ends of helices H1, H2, H3 and Fig. 5.6 3D images of regions of high cholesterol density H4, which had been detected in previous all- (gray) for both rhodopsin and opsin. Bottom panels are top down views, as seen from the extracellular side. atom simulations (Khelashvili et al., 2009) and in Rhodopsin and opsin are colored with a spectrum from structures of the ˇ2-adrenergic GPCR (Cherezov the N-terminus (blue) to the C-terminus (red) et al., 2007; Hanson et al., 2008). We were also unable to locate cholesterol binding sites that correspond with three sites identified in a recent high-resolution structure of the adenosine A2A receptor (Liu et al., 2012).Oneofthesesites, between the intracellular ends of helices H1 and H2, was noted in all-atom simulations (Lyman et al., 2009) and crystal structures of the ˇ2- adrenergic structure (Hanson et al., 2008), and absent in other simulations of A2A (Lee and Lyman, 2012).

5.5 Conclusions

The membrane environment around rhodopsin contains a diverse set of constituents that impact receptor activation, from general bilayer struc- tural properties to specific binding interactions. Utilizing coarse-grained simulation and a com- Fig. 5.7 3D images of rhodopsin and cholesterol inter- bination of radial distribution functions, density actions, color coded from low contact in yellow (binding representations and quantitative binding scans, score less than 3) to high contact in red (binding score we explored and identified a number of bilayer- greater than 3) rhodopsin interactions. 90 J.N. Horn et al.

Table 5.1 Clusters of key cholesterol-binding residues identified via binding scan Structure Group Residues Average score 1 43, 46, 50, 53, 56, 294, 301, 321, 322, 323 6.0683 Rhodopsin 2 126, 159, 206, 209, 210, 213, 214, 220, 221 3.6042 3 263 3.3597 1 50, 53, 54, 56, 318, 321, 322, 323 6.7907 Opsin 2 126, 159, 162, 206, 209, 210, 213, 214, 220 3.9135 3 256, 259, 263 3.4568

DHA chains were found in higher concentra- uniquely suited to interactions with cholesterol tions at the protein surface, with stearoyl chains are then unable to interact. The use of an all-atom excluded to a second solvation shell in the bilayer. model would overcome these limitation to some There was enrichment of PE headgroups over degree, but the loss of sufficient sampling remains PC headgroups at the surface of the protein. A a barrier to utilizing these models to explore a region of significant difference was discovered, process as slow as bilayer lateral reorganization. suggesting a possible specific binding site for In the future, a number of other variables could lipids with PE headgroups. Possible cholesterol- be explored. First, it would be worth exploring binding sites were also identified, with the pre- variations in concentration of lipid components in dominant one at the helix H1 and helix H8 inter- the bilayer. Cholesterol concentrations vary with face, behind the palmitoyl chains attached to the the age of the disc membrane and can result in protein. a drastic change in rhodopsin activity. Also, an We also found differences between the improved MARTINI model that accurately main- rhodopsin and opsin systems for these lipid con- tains tertiary structure would remove the need for stituents. In the opsin system, the concentrations restraints and allow us to explore the effect of of cholesterol, stearoyl and DHA reach bulk the lipid bilayer and specific lipid species on the levels deeper in the protein and the stearoyl and structure of rhodopsin. DHA peaks were much higher at the surface of In designing a simulation, there are always two the protein, suggesting a more open structure questions to answer: Does the proposed model with greater available surface area. represent the system with sufficient accuracy that The use of restraints to maintain protein sta- one can draw conclusions from the results? Can bility limited the motions available to rhodopsin the problem be solved using available resources? and opsin, likely preventing any major structural The former questions is answered by evaluating changes that would result from bilayer-protein the force field (and perhaps the system size and interactions. We do not feel that this is an issue contents), while the latter has to do with the for our particular simulations, as we are probing computational cost of running the simulations preferential interaction sites along the surface of long enough to obtain valid statistics. Satisfying the protein, not the effects of these interactions both criteria at the same time—finding a model on protein structure. Our chosen rhodopsin and that is accurate and readily solvable—is often opsin structures represent distant endpoints along a challenge, in that more detailed models are the activation path of rhodopsin, allowing us usually more expensive to implement. Actually, to explore the effect of these major structural an analogous phenomenon exists experimentally, changes on the surface available for interaction. for instance when choosing between more real- Coarse-grained models can limit the ability istic in vivo experiments and simpler (but more of the simulations to capture specific binding interpretable) in vitro ones. interactions. For example, coarse-grained repre- We believe that there is often a certain synergy sentations of cholesterol maintain the molecule’s to combining all-atom and coarse-grained simu- hydrophobicity, but cannot capture the chem- lations, as we have done here. The all-atom cal- ically distinct faces. Interaction sites that are culations can serve to validate surprising results 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 91 from the coarse-grained ones; usually, this role is Botelho AV, Huber T, Sakmar TP, Brown MF (2006) played by comparison to experiment, but some- Curvature and hydrophobic forces drive oligomeriza- times there is no readily available experimental tion and modulate activity of rhodopsin in membranes. Biophys J 91:4464Ð4477 comparison. Alternatively, the all-atom simula- Brown MF (1994) Modulation of rhodopsin function by tions may suggest interesting phenomena, but properties of the membrane bilayer. Chem Phys Lipids lack sufficient duration to draw strong conclu- 73(1Ð2):159Ð180 sions. In this case (as happened here), coarse- Brown MF (1997) Influence of non-lamellar-forming lipids on rhodopsin. Curr Top Membr 44:285Ð356 grained simulations can be used to flesh out the Brügger B, Erben G, Sandhoff R, Wieland FT, Lehmann case. WD (1997) Quantitative analysis of biological mem- In a broader sense, if one asks the question brane lipids at the low picomole level by nano- electrospray ionization tandem mass spectrometry. “Do I want to use all-atom or coarse-grained Proc Natl Acad Sci 94(6):2339Ð2344 simulation or experiment?”, the best answer is Burkhart BM, Li N, Langs DA, Pangborn WA, Duax simply “Yes”. All approaches have their own WL (1998) The conducting form of gramicidin a is a strengths and weaknesses, and the best way to right-handed double-stranded double helix. Proc Natl Acad Sci 95(22):12950Ð12955. doi:10.1073/pnas.95. answer the scientific question is to attack from as 22.12950 many directions as possible. Buzhynskyy N, Salesse C, Scheuring S (2011) Rhodopsin is spatially heterogeneously distributed in rod outer Acknowledgements We would like to thank Nick segment disk membranes. J Mol Recognit 24(3):483Ð Leioatts, Dejun Lin and Tod Romo for critical reviews 489. doi:10.1002/jmr.1086 of this manuscript. We would also like to gratefully Chen Z, Rand R (1997) The influence of choles- acknowledge financial support from the U.S. National terol on phospholipid membrane curvature and bend- Institutes of Health (1R01GM095496). We also thank the ing elasticity. Biophys J 73(1):267Ð276. doi:10.1016/ University of Rochester’s Center for Integrated Research S0006-3495(97)78067-6 Computing for the computing resources necessary to Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen support this work. SGF, Thian FS, Kobilka TS, Choi HJ, Kuhn P, Weis WI, Kobilka BK, Stevens RC (2007) High- resolution crystal structure of an engineered human beta2-adrenergic G protein-coupled receptor. Science References 318(5854):1258Ð1265. doi:10.1126/science.1150577 Cruickshank C, Minchin R, Dain AL, Martinac B Albert AD, Young JE, Yeagle P (1996) Rhodopsin- (1997) Estimation of the pore size of the large- cholesterol interactions in bovine rod out segment disk conductance mechanosensitive ion channel of Es- membranes. Biochim Biophys Acta 1285:47Ð55 cherichia coli. Biophys J 73(4):1925Ð1931. doi:10. Andersen OS, Koeppe RE (2007) Bilayer thickness and 1016/S0006-3495(97)78223-7 membrane protein function: an energetic perspective. Delange F, Merkx M, Bovee-Geurts PHM, Ann Rev Biophys Biomol Struct 36:107Ð130. doi:10. Pistorius AMA, Degrip WJ (1997) Modulation 1146/annurev.biophys.36.040306.132643 of the metarhodopsin I/metarhodopsin II Andersen O,Apell HJ, Bamberg E, Busath D, Koeppe equilibrium of bovine rhodopsin by ionic R, Sigworth F, Szabo G, Urry D, Woolley A (1999) strength. Eur J Biochem 243(1Ð2):174Ð180. doi: Gramicidin channel controversy Ð the structure in a 10.1111/j.1432-1033.1997.0174a.x lipid environment. Nat Struct Mol Biol 6(7):609Ð609 Drews J (2000) Drug discovery: a historical perspective. Boesze-Battaglia K, Albert AD (1989) Fatty acid compo- Science 287(5460):1960Ð1964 sition of bovine rod outer segment plasma membrane. Engelman DM (2005) Membranes are more mosaic Exp Eye Res 49(4):699Ð701 than fluid. Nature 438(7068):578Ð580. doi:10.1038/ Boesze-Battaglia K, Albert AD (1992) Phospholipid dis- nature04394 tribution among bovine rod outer segment plasma Fattal DR, Ben-Shaul A (1993) A molecular model for membrane and disk membranes. Exp Eye Res lipid-protein interactions in membranes: the role of 54(5):821Ð823 hydrophobic mismatch. Biophys J 65:1795Ð1809 Boesze-Battaglia K, Hennessey T, Albert AD (1989) Feller SE, Gawrisch K (2005) Properties of docosahex- Cholesterol heterogeneity in bovine rod outer segment aenoic acid-containing lipids and their influence on disk membranes. J Biol Chem 264(14):8151Ð8155 the function of the GPCR rhodopsin. Curr Opin Struct Botelho AV, Gibson NJ, Thurmond RJ, Wang Y, Brown Biol 15:416Ð422 MF (2002) Conformational energetics of rhodopsin Feller SE, Gawrisch K, MacKerell AD Jr (2002) Polyun- modulated by nonlamellar-forming lipids. Biochem- saturated fatty acids in lipid bilayers: intrinsic and istry 41:6354Ð6368 environmental contributions to their unique physical properties. J Am Chem Soc 124(2):318Ð326 92 J.N. Horn et al.

Feller SE, Gawrisch K, Woolf TB (2003) Rhodopsin Khelashvili G, Grossfield A, Feller SE, Pitman MC, exhibits a preference for solvation by polyunsaturated Weinstein H (2009) Structural and dynamic effects docosohexaenoic acid. J Am Chem Soc 125(15):4434Ð of cholesterol at preferred sites of interaction with 4435. doi:10.1021/ja0345874 rhodopsin identified from microsecond length molec- Gibson NJ, Brown MF (1993) Lipid headgroup and acyl ular dynamics simulations. Proteins 76(2):403Ð417. chain composition modulate the MI-MII equilibrium doi:10.1002/prot.22355 of rhodopsin in recombinant membranes. Biochem- Killian J, von Heijne G (2000) How proteins adapt istry 32:2438Ð2454 to a membrane-water interface. Trends Biochem Sci Grossfield A, Feller SE, Pitman MC (2006a) Contribution 25(9):429Ð434. doi:10.1016/S0968-0004(00)01626-1 of omega-3 fatty acids to the thermodynamics of mem- Knepp AM, Periole X, Marrink SJ, Sakmar TP, Huber brane protein solvation. J Phys Chem B 110(18):8907Ð T (2012) Rhodopsin forms a dimer with cytoplasmic 8909. doi:10.1021/jp060405r helix 8 contacts in native membranes. Biochemistry Grossfield A, Feller SE, Pitman MC (2006b) A 51(9):1819Ð1821. doi:10.1021/bi3001598 role for direct interactions in the modulation of Kung C, Martinac B, Sukharev S (2010) rhodopsin by omega-3 polyunsaturated lipids. Proc Mechanosensitive channels in microbes. Natl Acad Sci USA 103(13):4888Ð4893. doi:10.1073/ Ann Rev Microbiol 64(1):313Ð329. doi: pnas.0508352103 10.1146/annurev.micro.112408.134106 Grossfield A, Pitman MC, Feller SE, Soubias O, Gawrisch Lee AG (2003) Lipid-protein interactions in biological K (2008) Internal hydration increases during activation membranes: a structural perspective. Biochimica et of the G-protein-coupled receptor rhodopsin. J Mol Biophysica Acta (BBA) Ð Biomembranes 1612(1):1Ð Biol 381(2):478Ð486. doi:10.1016/j.jmb.2008.05.036 40. doi:10.1016/S0005-2736(03)00056-7 Gruner SM (1985) Intrinsic curvature hypothesis for Lee AG (2004) How lipids affect the activities of in- biomembrane lipid composition: a role for nonbilayer tegral membrane proteins. Biochimica et Biophysica lipids. Proc Natl Acad Sci 82(11):3665Ð3669 Acta (BBA) Ð Biomembranes 1666(1Ð2):62Ð87. doi: Hanson MA, Cherezov V, Griffith MT, Roth CB, Jaakola 10.1016/j.bbamem.2004.05.012 VP, Chien EY, Velasquez J, Kuhn P, Stevens RC (2008) Lee JY, Lyman E (2012) Predictions for cholesterol in- A specific cholesterol binding site is established by the teraction sites on the A(2A) adenosine receptor. J 2.8 angstrom structure of the human beta2-adrenergic Am Chem Soc 134(40):16512Ð16515. doi:10.1021/ receptor. Structure 16(6):897Ð905. doi:10.1016/j.str. ja307532d 2008.05.001 Liebman PA, Parker KR, Dratz EA (1987) The molecular Harold FM, Baarda JR (1967) Gramicidin, valinomycin, mechanism of visual excitation and its relation to and cation permeability of Streptococcus faecalis. J the structure and composition of the rod outer seg- Bacteriol 94(1):53Ð60 ment. Ann Rev Physiol 49(1):765Ð791. doi:10.1146/ Haswell E, Phillips R, Rees D (2011) Mechanosensi- annurev.ph.49.030187.004001 tive channels: what can they do and how do they Liu W, Chun E, Thompson AA, Chubukov P, Xu F, do it? Structure 19(10):1356Ð1369. doi:10.1016/j.str. Katritch V, Han GW, Roth CB, Heitman LH, IJz- 2011.09.005 erman AP, Cherezov V, Stevens RC (2012) Struc- Hess B, Kutzner C, van der Spoel D, Lindahl E (2008) tural basis for allosteric regulation of GPCRs by GROMACS 4: algorithms for highly efficient, load- sodium ions. Science 337(6091):232Ð236. doi:10. balanced, and scalable molecular simulation. J Chem 1126/science.1219218 Theory Comput 4(3):435Ð447. doi:10.1021/ct700301q Lyman E, Higgs C, Kim B, Lupyan D, Shelley JC, Farid Hoover WG (1985) Canonical dynamics: equilibrium R, Voth GA (2009) A role for a specific choles- phase-space distributions. Phys Rev A 31(3):1695Ð terol interaction in stabilizing the apo configuration 1697. doi:10.1103/PhysRevA.31.1695 of the human A(2A) adenosine receptor. Structure Huber T, Rajamoorthi K, Kurze VF, Beyer K, Brown MF 17(12):1660Ð1668. doi:10.1016/j.str.2009.10.010 (2002) Structure of docosahexaenoic acid-containing Marrink SJ, Risselada HJ, Yefimov S, Tieleman DP, phospholipid bilayers as studied by 2H NMR and de Vries AH (2007) The MARTINI force field: coarse molecular dynamics simulation. J Am Chem Soc grained model for biomolecular simulations. J Phys 124:298Ð309 Chem B 111(27):7812Ð7824. doi:10.1021/jp071097f Huber T, Botelho AV, Beyer K, Brown MF (2004) Marrink SJ, Periole X, Tieleman DP, de Vries AH (2010) Membrane model for the G-protein-coupled receptor Comment on “on using a too large integration time rhodopsin: hydrophobic interface and dynamical struc- step in molecular dynamics simulations of coarse- ture. Biophys J 86:2078Ð2100 grained molecular models” by M. Winger, D. Trzes- Jones E,Oliphant T, Peterson P, et al (2001Ð) SciPy: open niak, R. Baron and W. F. van Gunsteren, Phys. Chem. source scientific tools for Python. http://www.scipy. Chem. Phys., 2009, 11, 1934. Phys Chem Chem org/ Phys 12(9):2254Ð2256; author reply 2257Ð2258. doi: Kandt C, Ash WL, Tieleman DP (2007) Setting up and 10.1039/b915293h running molecular dynamics simulations of mem- Marsh D (2008) Protein modulation of lipids, and vice- brane proteins. Methods 41(4):475Ð488. doi:10.1016/ versa, in membranes. Biochimica et Biophysica Acta j.ymeth.2006.08.006 5 Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids... 93

(BBA) Ð Biomembranes 1778:1545Ð1575. doi:10. Crystal structure of rhodopsin: a G protein-coupled 1016/j.bbamem.2008.01.015 receptor. Science 289:739Ð745 Martinac B (2011) Bacterial mechanosensitive channels Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst OP as a paradigm for mechanosensory transduction. Cell (2008) Crystal structure of the ligand-free G-protein- Physiol Biochem 28(6):1051Ð1060 coupled receptor opsin. Nature 454(7201):183Ð187. Martinac B, Buechner M, Delcour AH, Adler J, Kung C doi:10.1038/nature07063 (1987) Pressure-sensitive ion channel in Escherichia Parrinello M, Rahman A (1981) Polymorphic transitions coli. Proc Natl Acad Sci 84(8):2297Ð2301 in single crystals: a new molecular dynamics method. MARTINI (2011) http://mdchemrugnl/cgmartini/ J Appl Phys 52(12):7182Ð7190. doi:10.1063/1.328693 Mitchell DC, Straume M, Miller JL, Litman BJ Periole X, Huber T, Marrink SJ, Sakmar TP (2007) (1990) Modulation of metarhodopsin formation G protein-coupled receptors self-assemble in dynam- by cholesterol-induced ordering of bilayer ics simulations of model bilayers. J Am Chem Soc lipids. Biochemistry 29(39):9143Ð9149. doi: 129(33):10126Ð10132. doi:10.1021/ja0706246 10.1021/bi00491a007 Periole X, Cavalli M, Marrink SJ, Ceruso MA (2009) Molday RS (1998) Photoreceptor membrane proteins, Combining an elastic network with a coarse-grained phototransduction, and retinal degenerative diseases. molecular force field: structure, dynamics, and in- The Friedenwald Lecture. Invest Ophthalmol Vis Sci termolecular recognition. J Chem Theory Comput 39(13):2491Ð2513 5(9):2531Ð2543. doi:10.1021/ct9002114 Monticelli L, Kandasamy S, Periole X, Larson R, Tiele- Periole X, Knepp AM, Sakmar TP, Marrink SJ, Huber man D, Marrink S (2008) The MARTINI coarse T (2012) Structural determinants of the supramolec- grained forcefield: extension to proteins. J Chem The- ular organization of G protein-coupled receptors in ory Comput 4:819Ð839 bilayers. J Am Chem Soc 134(26):10959Ð10965. doi: Mouritsen OG, Bloom M (1984) Mattress model of lipid- 10.1021/ja303286e protein interactions in membranes. Biophys J 46:141Ð Pitman MC, Grossfield A, Suits F, Feller SE (2005) Role 153 of cholesterol and polyunsaturated chains in lipid- Mouritsen OG, Bloom M (1993) Models of lipid-protein protein interactions: molecular dynamics simulation interactions in membranes. Ann Rev Biophys Biomol of rhodopsin in a realistic membrane environment. Struct 22:145Ð171 J Am Chem Soc 127(13):4576Ð4577. doi:10.1021/ Needham D, McIntosh TJ, Evans E (1988) ja042715y Thermomechanical and transition properties Romo TD, Grossfield A (2009) LOOS: an extensible of dimyristoylphosphatidylcholine/cholesterol platform for the structural analysis of simulations. bilayers. Biochemistry 27(13):4668Ð4673. doi: Conf Proc IEEE Eng Med Biol Soc 2009:2332Ð2335. 10.1021/bi00413a013 doi:10.1109/IEMBS.2009.5335065 Neuringer M (2000) Infant vision and retinal function Romo TD, Grossfield A (2012) LOOS: a lightweight in studies of dietary long-chain polyunsaturated fatty object-oriented software library. LOOS: Lightweight acids: methods, results, and implications. Am J Clin object oriented structure analysis, Grossfield Lab. Nutr 71(1 Suppl):256SÐ267S http://loos.sourceforge.net Niu SL, Mitchell DC, Litman BJ (2002) Manipulation of Sansom MS, Bond PJ, Deol SS, Grottesi A, Haider S, cholesterol levels in rod disk membranes by methyl-ˇ- Sands ZA (2005) Molecular simulations and lipid- cyclodextrin. J Bio Chem 277:20139Ð20145 protein interactions: potassium channels and other Nosé S, Klein ML (1983) Constant pressure molecular membrane proteins. Biochem Soc Trans 33(Pt 5):916Ð dynamics for molecular systems. Mol Phys 50:1055Ð 920. doi:10.1042/BST20050916 1076 Simmonds A, East J, Jones O, Rooney E, McWhirter O’Connell A, Koeppe R, Andersen O (1990) Kinet- J, Lee A (1982) Annular and non-annular binding ics of gramicidin channel formation in lipid bilay- sites on the (Ca2CC Mg2C)-ATPase. Biochimica et ers: transmembrane monomer association. Science Biophysica Acta (BBA) Ð Biomembranes 693(2):398Ð 250(4985):1256Ð1259. doi:10.1126/science.1700867 406. doi:10.1016/0005-2736(82)90447-3 Okada T, Sugihara M, Bondar AN, Elstner M, Entel P, Soubias O, Gawrisch K (2007) Docosahexaenoyl chains Buss V (2004) The retinal conformation and its envi- isomerize on the sub-nanosecond time scale. J ronment in rhodopsin in light of a new 2.2 angstrom Am Chem Soc 129(21):6678Ð6679. doi:10.1021/ crystal structure. J Mol Biol 342:571Ð583 ja068856c Olausson BES, Grossfield A, Pitman MC, Brown MF, Soubias O, Gawrisch K (2012) The role of the lipid Feller SE, Vogel A (2012) Molecular dynamics simu- matrix for structure and function of the GPCR lations reveal specific interactions of post-translational rhodopsin. Biochim Biophys Acta 1818(2):234Ð240. palmitoyl modifications with rhodopsin in membranes. doi:10.1016/j.bbamem.2011.08.034 J Am Chem Soc 134(9):4324Ð4331. doi:10.1021/ Soubias O, Teague WE, Gawrisch K (2006) Evidence for ja2108382 specificity in lipid-rhodopsin interactions. J Biol Chem Palczewski K, Kumasaka T, Hori T, Behnke CA, Mo- 281(44):33233Ð33241. doi:10.1074/jbc.M603059200 toshima H, Fox BJ, Le Trong I, Teller DC, Okada Soubias O, Niu SL, Mitchell DC, Gawrisch K T, Stenkamp RE, Yamamoto M, Miyano M (2000) (2008) Lipid-rhodopsin hydrophobic mismatch 94 J.N. Horn et al.

alters rhodopsin helical content. J Am Chem Soc metarhodopsin II formation in visual transduction. J 130(37):12465Ð12471. doi:10.1021/ja803599x Am Chem Soc 124(26):7690Ð7701 Soubias O, Teague WE, Hines KG, Mitchell DC, Wiedmann TS, Pates RD, Beach JM, Salmon A, Brown Gawrisch K (2010) Contribution of membrane elastic MF (1988) Lipid-protein interactions mediate the energy to rhodopsin function. Biophys J 99(3):817Ð photochemical function of rhodopsin. Biochemistry 824. doi:10.1016/j.bpj.2010.04.068 27:6469Ð6474 van der Spoel D, Lindahl E, Hess B, Groenhof G, Mark Wiener MC, White SH (1992) Structure of a fluid AE, Berendsen HJC (2005) GROMACS: fast, flexible, dioloeoylphosphatidylcholine bilayer determined by and free. J Comput Chem 26(16):1701Ð1718. doi:10. joint refinement of x-ray and neutron diffraction data: 1002/jcc.20291 III. Complete structure. Biophys J 61:434Ð447 Valiyaveetil FI, Zhou Y, MacKinnon R (2002) Lipids Winger M, Trzesniak D, Baron R, van Gunsteren in the structure, folding, and function of the KcsA WF (2009) On using a too large integration time K+ channel. Biochemistry 41(35):10771Ð10777. doi: step in molecular dynamics simulations of coarse- 10.1021/bi026215y grained molecular models. Phys Chem Chem Phys Wang Y, Botelho AV, Martinez GV, Brown MF (2002) 11(12):1934Ð1941. doi:10.1039/b818713d Electrostatic properties of membrane lipids coupled to Beyond Standard Molecular Dynamics: Investigating 6 the Molecular Mechanisms of G Protein-Coupled Receptors with Enhanced Molecular Dynamics Methods

Jennifer M. Johnston and Marta Filizola

Abstract The majority of biological processes mediated by G Protein-Coupled Receptors (GPCRs) take place on timescales that are not conveniently ac- cessible to standard molecular dynamics (MD) approaches, notwithstand- ing the current availability of specialized parallel computer architectures, and efficient simulation algorithms. Enhanced MD-based methods have started to assume an important role in the study of the rugged energy landscape of GPCRs by providing mechanistic details of complex receptor processes such as ligand recognition, activation, and oligomerization. We provide here an overview of these methods in their most recent application to the field.

Keywords Enhanced methods ¥ Biased simulations ¥ Metadynamics ¥ Umbrella sampling ¥ Free-energy calculations ¥ Coarse-graining ¥ Ligand recog- nition ¥ GPCR activation ¥ Molecular mechanisms ¥ Oligomerization ¥ Dimers ¥ GPCR structure ¥ GPCR dynamics ¥ GPCR function

termed GPCR signalosomes (Huber and Sakmar 6.1 Introduction 2011), are suggested to operate through a multitude of dynamic steps and allosteric GPCRs are components of complex macromolec- signaling conduits whose properties may not ular machines with multiple ligand-induced depend necessarily on their individual elements. ‘active’ states that can engender different Thus, it appears evident that a conceptual signaling outputs through association with framework strongly relying on a combination specific accessory proteins. These functionally of high-content experimental platforms and versatile macromolecular complexes, recently computational approaches is necessary to account for the complexity of these signalosomes and J.M. Johnston ¥ M. Filizola () Department of Structural and Chemical Biology, Icahn ultimately understand the relationships between School of Medicine at Mount Sinai, One Gustave L. Levy their structure, dynamics, and function (Huber Place, New York, NY 10029, USA and Sakmar 2011). e-mail: marta.fi[email protected]

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 95 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__6, © Springer ScienceCBusiness Media Dordrecht 2014 96 J.M. Johnston and M. Filizola

One way to interpret the functional versatility receptor (B2AR) (Dror et al. 2011)), these of GPCRs is in terms of their structural plasticity, hardware/software configurations are presently through system representation as energy land- the prerogative of a small group of investigators. scapes (Deupi and Kobilka 2010; Choe et al. Undoubtedly, atomistic descriptions provide 2011; Vaidehi and Kenakin 2010). Building upon the most comprehensive and complete repre- a concept put forth in 1991 by Frauenfelder sentations of a GPCR, but it is arguable that and colleagues (1991), the energy landscape of multiple, lengthy stochastic simulations of such a protein is described as a hyper-surface in a systems are not best suited to access so-called conformational space of very high dimension, “rare events” in the lives of GPCRs. Since the with a very large number of valleys (confor- number of calculations required per step of a mational sub-states) and peaks (energy barriers). MD simulation of a GPCR system scales with The valleys and peaks of such a rugged landscape the square of the number of particles, reducing cannot be classified individually, but must be the system size considerably increases the speed described by distributions. Specific energy levels with which simulations can be performed. This are no longer relevant, but are more accurately size reduction can be achieved by eliminating described in statistical terms. At any given time, the explicit representation of a component of an individual protein molecule can jump be- the system (e.g. the solvent or the membrane, as tween conformational sub-states as it navigates in Generalized Born models), or by grouping its own energy landscape, and this exploration individual atoms into interaction sites (e.g., is critically dependent on temperature (Frauen- coarse-grained bead models). Of course, such felder et al. 1991). The available X-ray crystal decisions must be taken with care and are structures of GPCRs (for a review see Katritch inevitably dependent on the nature of the question 2012) represent single, static sub-states of these that the simulation is designed to answer. In the proteins. Sections of missing density in many of cases where the fastest degrees of freedom can the structures (e.g. Wu et al. 2012) highlight the be neglected, these approaches have helped to inherent flexibility, even at low temperatures, of smooth the energy landscape of GPCRs and some regions of the receptor, particularly parts of thereby extend the range of accessible time and the intra and extracellular loops and the N- and length scales. C-termini, and remind us that the dynamic and The reduced representation of a GPCR sys- adaptable nature of the GPCR structure is a key tem may, however, still be insufficient to bridge part of their functionality. the gap between the timescales accessible to Ligands with different efficacies can modulate standard MD simulations and average experi- the GPCR energy landscape by shifting the mental timescales. We will illustrate here how conformational equilibrium towards active or both simplified physical representations and en- inactive conformations, thus eliciting different hanced MD-based methods have recently proven physiological responses (Deupi and Kobilka useful in the study of the rugged energy land- 2010; Vaidehi and Kenakin 2010). Examining scape of GPCRs by providing otherwise inac- the molecular basis of this ligand-induced cessible details of important events in the life conformational shift is not an easy task using cycle of these receptors, such as recognition of standard MD approaches due to the limited ligands, ligand-specific conformational changes, timescales they can access. Although much and oligomerization. However, we acknowledge has been accomplished by standard MD of that both the rate at which advances are being atomistically-represented GPCRs using some made in this field, and the space constraints of of the largest multiprocessor computing clusters, this chapter do not permit an entirely compre- running the most efficient MD codes currently hensive review here, but rather our goal is to available (e.g., see the work recently done to provide an overview by way of a few selected simulate ligand binding to the “2-adrenergic examples. 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 97

than rhodopsin. However, for drug discovery 6.2 Ligand Recognition in GPCRs purposes, a limitation of GPCR homology models has been that the binding pockets are not A thorough understanding of the mechanisms always big enough to accommodate the variety of by which GPCRs recognize their ligands is fun- ligands that have been demonstrated to bind to a damental to successful drug discovery. While particular receptor through other experimental multiple long timescale atomistic standard MD assays. Indeed, retinal is a small, covalently simulations (of the order of hundreds of s) bound ligand, so ligands of interest that are have recently permitted researchers to visualize, different in shape or size to retinal, and most for instance, the binding of different ligands to importantly, not covalently bound, might not be the B2AR (Dror et al. 2011), biased MD tech- expected to be a good fit to receptor models niques have been successfully employed to en- generated from a rhodopsin backbone template. hance the probability of observing such events Kimura and colleagues (2008) have developed a during shorter simulation timescales. Three stud- method for increasing the available space for ies we discuss in this section focus on the premise ligand binding within the orthosteric pocket that from knowledge of the pathways by which a of a model of a GPCR, which may be readily ligand can exit a receptor, one can infer specific transferable to other types of binding pockets for details of the binding mechanism (Gonzalez et al. other proteins. Once the binding pocket has been 2011; Selvam et al. 2012; Wang and Duan 2009). expanded using their algorithm, the endogenous The probability of spontaneous ligand exit from ligand can be docked into the pocket, and induced a binding cavity, on the timescale accessible to fit docking methods, sidechain rotamer sampling MD simulations, can be enhanced by addition of or MD methods can be used to refine the pocket an external force imposed upon the ligand, as is before virtual screening or docking of novel the case for steered MD and random acceleration ligands is performed. MD. We compare the results of these studies Conceptually, the method proposed by Kimura with those obtained by the very long timescale, and colleagues is similar to inflating a balloon unbiased simulations from Dror and colleagues inside the pocket. By slowly increasing the pres- (2011). A fourth study we discuss below focuses sure on the cavity walls during a MD simulation, on a binding event in which the ligand travels the pocket can be expanded. To achieve this, along a predetermined pathway from the bulk the authors placed several (>20) Lennard-Jones solvent, towards a bound state within the receptor (LJ) beads of small radius (0.25 Å) within the (Provasi et al. 2009). We also comment on a cavity, and these particles collided with the cavity novel method of altering the orthosteric binding walls with velocities consistent with a Maxwell pocket of receptor homology models, to permit distribution according to the temperature of the successful docking of ligands of differing shape simulation. A slow increase in pressure was im- and size to that in the template structure (Kimura plemented by increasing the radii of the particles. et al. 2008). The particles filled the pocket, and were tethered by weak (k D 1 kcal/(mol Å2)) harmonic forces to their four immediate neighbors, but these bonds 6.2.1 Exploring the Binding Site were reassigned with every increase of the par- in Homology Models ticle radius. The radius was increased in steps of 0.05Ð0.1 Å every 300Ð500 ps. The simula- Prior to the application of receptor engineering tions were performed using NAMD, with ex- innovations that have given rise to the recent plicitly represented solvent and palmitoyl-oleoyl- influx of solved crystal structures of GPCRs, phosphatidyl-choline (POPC) lipid bilayer. The homology modeling was the only tool in system was minimized and equilibrated for sev- the arsenal of the researcher wishing to eral hundreds of picoseconds prior to the pocket computationally investigate any receptor other expansion stages. During the expansion stages, 98 J.M. Johnston and M. Filizola the backbone dihedral angles of the protein were r D 0.3 state satisfied available experimental data weakly restrained (k D 2.0 kcal/(mol rad2)) to of agonist ligand binding, although required the prevent gross deformations and unwinding of the re-orientation of some side chains (Strader et al. TM helices. 1988, 1989). It seems that key advantage pro- Kimura and colleagues tested their protocol vided by the balloon expansion methodology is on three receptors; firstly, in their 2008 paper that it generates an ensemble of models, unbiased (Kimura et al. 2008), they generated an apo by the inherent dependence on the template in- state of the rhodopsin structure (PDB ID: 1LH9 troduced during the homology modeling process, (Okada et al. 2002)). By removing the retinal which may then be evaluated in the context of the ligand and simulating the receptor for a period available mutagenesis, SAR and small molecule of 4 ns, the sidechains were rearranged such pharmacophoric data. However, these expanded that the pocket was too small to accommodate models require further refinement before the re- retinal without modification. After the pocket sults of any docking studies may be considered expansion simulations, docking studies were per- any more than qualitatively accurate. formed with GLIDE (Friesner et al. 2004, 2006; Halgren et al. 2004) and a comparison of the crystal structure, the MD relaxed structure and 6.2.2 Exploring Ligand Egress the structure after expansion with the balloon Pathways with Random potential indicate that re-docking the retinal is Acceleration MD significantly improved between the relaxed struc- ture and the balloon expansion structure, but the Random acceleration MD (Wang and Duan 2007, crystal structure pose is not recovered exactly, 2009), also called random expulsion MD (Lude- with an RMSD of 7.2 Å between the re-docked mann et al. 2000) uses a randomly directed, exter- pose and the crystallographic pose. nally applied force to encourage unbinding of a Secondly, they built a homology model of ligand from the pocket of a receptor. This ensures the chemokine receptor type 2 (CCR2) based thorough exploration of the pocket, and good on a rhodopsin template, Three potent antago- sampling of possible modes of ligand egress. nists, RS-504393 (MacKerell et al. 1998), TAK- According to the formulation, the direction of the 779 (Baba et al. 1999), and a Teijin lead com- force applied to the center of mass (COM) of the pound (Shiota 1999) were docked into the recep- ligand is chosen at random, and it is maintained tor at different stages of the pocket expansion. for N steps. During these steps, the COM of the At r D 0.35 Å, a cluster of poses correlating with ligand is expected to move a minimum distance the available mutagenesis data was observed. rmin, i.e. the average velocity, , of the ligand Docking into the pockets formed with r > 0.35 Å will maintain a threshold value of at least: . yielded many more poses, owing to the increased r min ; space available in the pocket, but clustering these hviD Nt (6.1) poses did not yield any that matched with the known mutagenesis data. where t is the timestep of the MD simulation. Finally (Krystek et al. 2006), the same authors Upon encountering an unyielding portion of have built and refined a homology model of the binding site, the velocity of the ligand falls B2AR from a rhodopsin template (PDB ID: 1F88 below the threshold velocity and the trajectory is (Okada et al. 2002)) and validated this model considered to be complete, thus a new random according to the paradigm of structure activity direction is assigned. For each new trajectory, the relationships (SAR) and mutagenesis data ex- direction of the force, and thus the movement of isting at the time (the study was published in the ligand, is maintained as long as the velocity 2006, prior to the solution of the B2AR structure of the ligand exceeds the threshold value, or until (Cherezov et al. 2007)). After pocket expansion, N steps of MD have been completed. By using docking of agonists such as propranolol into the multiple trajectories, the ligand can thoroughly 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 99 probe the binding pocket until it finds a suitable from the binding site is, however, restricted by exit pathway or no egress, if appropriate. The two bulky aromatic residues (F193 on EL2 and key feature of this technique, which makes it Y3087.35 on TM7) and a salt bridge between EL2 particularly effective for the purpose of exploring and TM7, formed by D192-K3057.32. ligand binding pathways, as compared to e.g. The random acceleration MD trajectories steered MD, is that no prior knowledge of an exit suggested that the dominant exit pathway pathway is required. of carazolol from the B2AR crystallographic In 2007, Wang and Duan (2007) applied binding pocket was via the extracellular opening the random acceleration MD technique to the of the binding pocket. The salt bridge between rhodopsin crystal structure, (PDB ID: 1U19 EL2 and TM7, (D192-K3057.32), was broken (Okada et al. 2004)), embedded in a POPC during exit along this pathway. Furthermore, if bilayer and solvated with explicit water. The the salt bridge can be thought of as bisecting endogenous ligand, 11-cis-retinal, is covalently the extracellular opening of the receptor, exit via bonded to K2967.43 within a deep binding pocket this dominant pathway can be subdivided into formed by the TM domain of rhodopsin. The two, and egress was found to be approximately superscript refers to the Ballesteros-Weinstein equally distributed between both the A1 sub- generic residue numbering scheme (Ballesteros pathway (toward TM 5, 6 and 7, thick red arrow and Weinstein 1995), where the first number 1inFig.6.1), and A2 sub-pathway (toward TM (e.g., 1 in 1.48) indicates the helix, and the 2, 3 and 7, thick red arrow 2 in Fig. 6.1), of second number (e.g., 48 in 1.48) represents the salt bridge. Five additional exit pathways the residue position in that helix relative to the were observed through inter-helical clefts. Of most conserved residue in the helix (numbered these, the most statistically significant offered 50 by definition). EL2 in rhodopsin forms a an exit through transient breakage of the inter- beta-hairpin fold that completely blocks access helical interactions between TM4 and TM5. to this binding pocket from the extracellular The predominant barrier to ligand egress was side. Thirty-eight random acceleration MD presented by the interactions between the ligand simulations were performed on the rhodopsin and the polar groups within the binding pocket. system and the predominant exit pathways were During 120 ns of unbiased, standard MD sim- observed to be towards the extracellular side, ulation of the receptor in the absence of ligand, through interhelical clefts, either between TM4 the authors found that the D192-K3057.32 salt and TM5, or between TM5 and TM6. These bridge was dynamic in nature, and that a con- exit pathways involved transient breakage of formational change in F193 caused it to rotate the interhelical interactions, which reformed outward toward TM7 and forming a ‘hydropho- immediately upon complete exit of the ligand bic cluster’ between EL2 and TM7. The authors from the binding site. repeated their random acceleration MD simula- In 2009, the same authors conducted a simi- tions from this state after re-docking the cara- lar investigation on the B2AR (Wang and Duan zolol. Overall, the average ligand egress time was 2009). A total of 100 random acceleration MD slightly longer from the putative ‘ligand-free’ trajectories were performed on the B2AR crys- conformation than from the crystal structure and tal structure (PDB ID: 2RH1) (Cherezov et al. this increase was attributed to the barrier formed 2007), which was first crystallized in the inactive by the clustering of F193 and the salt bridge conformation, with carazolol (an inverse agonist) connecting TM7 and EL2. Furthermore, the pop- located in a binding pocket defined by strong ulation of the dominant pathway was shifted interactions with polar residues in TM3, TM5 and strongly toward exit via the A2 sub-pathway TM7. The second extracellular loop (EL2) forms (arrow 2 in Fig. 6.1), between ECL2 and TM2, 3, a short helix in the B2AR crystal structure, and is and 7. The authors used these results to propose a extended outward, rendering the binding pocket binding pathway for carazolol. First, the ligand slightly open to the extracellular side. Egress was found to access the receptor via the cleft 100 J.M. Johnston and M. Filizola

Fig. 6.1 Exit/entry pathways (red transparent arrows, labeled 1Ð4) for the B1AR (left)andthe B2AR (right). Top (top panels)andside (bottom panels) views. Receptor is shown in cartoon representation, colored blue (TM1) to red (H8). The crystallized antagonists cyanopindolol (in B1AR) and carazolol (in B2AR) are shown in stick representation

between TM2, TM3 and TM7 at the extracellu- Atomic Force Microscopy (AFM), by applying lar opening. Subsequently, the ligand interacted a time-dependent external force to the system with F193 as it passed through the TM7-EL2 through the atomistically fine tip, and measuring junction on the way to the bottom of the binding the mechanical resistance properties of the pocket, where it was oriented and stabilized by biomolecule. Such a process can be mimicked the polar interactions with TM3, TM5 and TM7. virtually, using steered MD. Steered MD uses The difference in results between the B2AR and the application of an external force to drive rhodopsin studies described here, alongside spec- the system towards a desired state, be it via a ulative reports of different lipid phase ligand conformational change, or a ligand-unbinding entry pathways for e.g. the event. Steered MD is a non-equilibrium (Hurst et al. 2010) and for the PAR1 (Zhang et al. technique, during which the MD pathway 2012) indicate that ligand binding pathways may is irreversible. Analysis of the position and not be general across receptor types. interactions of the dissociating ligand, and the evolution of the applied forces during a forced ligand unbinding by steered MD can 6.2.3 Forced Ligand Unbinding provide reliable qualitative insights into the by Steered MD irreversible work required for the unbinding process (Isralewitz et al. 1997; Jensen et al. Under experimental conditions, unbinding 2002; Fishelovitch et al. 2009; Yang et al. 2009). of a ligand can be monitored by means of Such information may enable researchers to 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 101 determine structural features of these receptor- 3 ns each. The force profile, as a function of ligand complexes that contribute crucially to simulation time, revealed the ease with which the ligand binding. Unlike random acceleration MD, ligand can be extracted from the receptor along the direction of the force must be predetermined, the different pathways (Gonzalez et al. 2011). To though usually this is unknown. The egress address the shortcomings of an arbitrary choice of pathway can therefore often be arbitrary, and exit pathway from the receptors, CAVER (Petrek as such the sampling of different possible modes et al. 2006) was used to establish feasible egress of ligand exit may be less comprehensive than pathways connecting the orthosteric binding with random acceleration MD. pocket with the surface of the receptor. This In steered MD, the elastic force is proportional showed two possible pathways, depicted in to the change in the spring extension, relative to Fig. 6.1, by red arrows labeled 1 and 2. Two its equilibrium length: additional, lipid phase exit pathways between TM1/7 and TM5/6 inter-helical clefts were also ! ! ! F.t/ D k vt r.t/ r 0 n (6.2) tested (red arrows 3 and 4 in Fig. 6.1). Though not found by the CAVER exploration, they have where k is the spring constant; v is the constant been implicated as entry/exit pathways for retinal velocity of pulling, mimicking the retracting can- in rhodopsin (Hildebrand et al. 2009), and an tilever; r0 and r(t) are the ligand center of mass exit pathway between TM5/6 was found to be position at initial and current time t respectively significant in the random acceleration MD study and n is the direction of the pulling vector. of rhodopsin, from Wang and Duan (2007) The potential of mean force (PMF) along the (neither of these pathways was found to be reaction coordinate was calculated by the second- significant in their investigation of B2AR (Wang order cumulant expansion of the irreversible work and Duan 2009)). In Gonzalez’s investigation, measurements (Park et al. 2003) according to: the initial force peaks for extracting ligands Z t through pathways 3 and 4 were found to be twice W.t/ D v F.t/dt (6.3) that for extraction via pathway 1 or pathway 0 2, so the lipid phase extraction channels were considered to be unfavorable for the adrenergic 1 2 2 PMF DhW i hW ihW i (6.4) receptors. For B2AR, extraction of carazolol 2kB T via pathway 1 (bounded by TM5, 6 and 7) was where is the mean work averaged from the determined to have two force peaks, indicating six trajectories, kB is Boltzmann’s constant and T breaking of interactions between ligand and is the bulk temperature. receptor along the exit pathway. The first (and Gonzalez and colleagues embedded the crystal highest) peak in the force occurred early in structure of the human B2AR (PDB ID: 2RH1) the simulations and represents extraction from (Cherezov et al. 2007) and a MODELLER the orthosteric binding pocket, breaking key (Eswar et al. 2007; Fiser and Sali 2003) generated interactions with D3.32, S5.42 and N7.39, structure of the human B1AR, based on the turkey while the second peak occurred later, and B1AR crystal structure (PDB ID: 2VT4) (Warne represents breaking interactions with D192 and et al. 2008) into POPC membranes, and then N301 in the extracellular loops EL2 and EL3 used steered MD to remove the two crystallized respectively. Extraction of cyanopindolol from ligands, cyanopindolol and carazolol, from the B1AR through the same channel showed similar binding sites of B1AR and B2AR respectively, behavior, showing not only a second barrier, through different channels (Gonzalez et al. 2011). arising from interactions with D217 in EL2, and Steered MD simulations were performed at D7.32, but also some subsequent barriers, as constant velocity of 10 Å/ns and the spring additional interactions in the extracellular loops constant was set to 250 pN/Å. Trajectories were were broken. Extractions of the ligands through repeated six times, and lasted approximately channel 2 (bounded by TM2, 3 and 7) for both 102 J.M. Johnston and M. Filizola receptors demonstrate two retention sites after the second barrier in the force profile observed the initial orthosteric pocket interactions were during Gonzalez and colleagues’ steered MD broken, before being released into the solvent. simulations (Gonzalez et al. 2011) and was at- The potentials of mean force (PMFs) of these tributed to dewetting of both the ligand and the extractions indicate channel 1 is favored for binding pocket by Dror and colleagues, noting both receptors, but only by a small amount that alprenolol loses 80 % of its hydration upon (1 kcal/mol). The PMFs also show that the binding, and 63 % of this is at the point of enter- difference between the bound and the unbound ing the pre-binding vestibule (Dror et al. 2011). states is positive, indicating that the bound state Gonzalez and colleagues present data showing is favored in all cases. that the number of water molecules within 3 Å of These pathways suggested by random accel- the ligand in their simulations increases dramati- eration MD and steered MD can be directly cally, from 10 to 30 molecules, at the point of compared with the recent entry pathways ob- encountering this second barrier to ligand egress served for similar ligands (antagonists alprenolol, (Gonzalez et al. 2011). propranolol and dihydroalprenolol) binding to The biased MD simulations are largely able to B1AR and B2AR, by Dror and colleagues, during reproduce the characteristics of binding to B2AR, 82 standard MD simulations, ranging from 1 to i.e. dominant exit through the extracellular open- 19 s in length (Dror et al. 2011). During their ing of B2AR, no or limited exit through interheli- 82 simulations, Dror and colleagues observe 21 cal pathways (Dror and colleagues observe that binding events in total. Out of 12 binding events alprenolol partitions into the bilayer but never for alprenolol to B2AR, in 6 cases alprenolol enters the receptor through an interhelical path- replicates the crystallographic pose. The authors way). The steered MD is also able to replicate note that entry is almost exclusively through the the existence of two barriers to ligand binding at cleft between ECL2 and TM5, 6 and 7 (11 out stages along the entry/egress pathway compara- of 12 binding events) i.e. pathway 1 in Fig. 6.1. ble to those observed in the unbiased simulations. In the remaining event, entry was through path- Furthermore, many of the detailed features of the way 2. Similar results were observed for binding dynamics of the ligand binding events can be cap- dihydroalprenolol to B1AR. These results are tured by both the random acceleration MD and largely in agreement with the preference noted by steered MD, including conformational changes Gonzalez and colleagues (2011), but opposite to in F218/193 and F/Y7.35 in B1AR/B2AR re- that of Wang and Duan (2009). Also, similarly spectively; changes in ligand hydration and the to Gonzalez and colleagues, but again unlike dynamic nature of the salt bridge between K305 Wang and Duan, Dror and colleagues observed and D192. The key advantage of the biased events two energetic barriers to ligand binding through is the much shorter simulations that are required pathway 1 for B2AR (Dror et al. 2011). The to obtain these details. main barrier is that presented by ligand dewetting Following the simulations described above, and orientation within the binding pocket, and Selvam and colleagues (2012)haveusedasimilar is comparable to that observed during ligand steered MD methodology to Gonzalez and col- extraction from the orthosteric pocket for both of leagues, to demonstrate that the PMF of extract- the biased MD studies. The other barrier observed ing ligands from the B1AR and the B2AR can be by Dror and colleagues occurred prior to ligand used to differentiate selective ligands from their entry into the binding pocket and was not ob- non-selective counterparts. The work performed served during the random acceleration MD simu- to extract a B1AR selective antagonist (Esmolol) lations (Wang and Duan 2009), but according to from the B1AR was greater than that required Dror and colleagues, is at least as large, if not to extract ICI-118,551 (B2AR selective) and the larger than the barrier to binding in the pocket reverse is also true: more work was required to (Dror et al. 2011). This barrier is comparable to extract ICI-118,551 from B2AR than Esmolol. 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 103

6.2.4 Exploring Ligand Binding DOP receptor was simulated for 500 ns, in Pathways Using a hydrated DPPC-cholesterol lipid bilayer, Well-Tempered Metadynamics using ten walkers. All simulations were performed using GROMACS 4.0.5 (Van der Well-tempered metadynamics (Barducci et al. Spoel et al. 2005) with PLUMED (Bonomi et al. 2008) is an enhanced sampling technique 2009). that enables more efficient exploration of To represent the binding event, two CVs were the multidimensional free energy surfaces of chosen to describe (1) the distance between the biological systems by adding Gaussian bias to a center of mass of the heavy atoms of NLX and the standard MD simulation (Laio and Parrinello center of mass of the heavy atoms of the alkaloid- 2002; Leone et al. 2010). The dynamics is binding pocket of opioid receptor and (2) the biased by a non-Markovian (history-dependent) distance between the center of mass of the DOP potential, constructed as a sum of Gaussian receptor alkaloid binding pocket and the center of “hills” localized along the trajectory in the mass of the heavy atoms of the middle residues of direction of the reaction coordinate (or collective the EL2. This second CV was selected to enable variable, CV) of interest. The accumulation enhanced conformational sampling of the EL2 re- of this biasing potential enables the simulated gion of the DOP receptor, given its uncertain pre- system to overcome high energy barriers in dicted structure (this work was performed before order to explore efficiently its free energy the crystal structure of the DOP receptor became landscape as defined by the CVs. The success of available (Granier et al. 2012)). Using these two a metadynamics study is predicated on a careful CVs, we reconstructed the free-energy surface of apriorichoice of a set of CVs to provide a the NLX binding event, determining that the non- satisfactory description of the process of interest. selective antagonist NLX exhibits a molecular In well-tempered metadynamics, the height of recognition site on the DOP receptor surface at the added Gaussian hills depends both on a a cleft formed by EL2 and EL3, and ends in a temperature scaling factor and the underlying preferred orientation into the receptor alkaloid bias, and decreases to zero once a given energy binding pocket, after visiting some less stable threshold is reached. As a result, convergence of states in between. The most stable NLX-bound the algorithm to the correct free energy profile state of DOP corresponded to a conformation can be proven rigorously, and exploration of in which NLX interacted directly with D1283.32 physically relevant regions of the conformational via a salt bridge, and with Y3087.43, W2746.48, space is ensured for complex systems. H2786.52,in broad agreement with experimental In 2009, we performed well-tempered data from mutagenesis and competition binding metadynamics to elucidate the mechanistic assays (Befort et al. 1996a, b;Lietal.1999; details of flexible-ligand, flexible-protein Mansour et al. 1997; Bot et al. 1998; Spivak et docking of naloxone (NLX), a non-selective al. 1997; Surratt et al. 1994). The ligand was antagonist for opioid receptors, from the bulk stabilized in a specific orientation through inter- water environment into the orthosteric binding actions with a number of residues on TM3, TM5, pocket of a B2AR-based homology model of TM6 and TM7, in particular: M1323.35 as well as • opioid (DOP) receptor (Provasi et al. 2009). F2185.43 and F2225.47. The recent crystal struc- Using the multiple walker method (Raiteri et ture of the DOP bound to naltrindole confirmed al. 2006), several well-tempered metadynamics some key interactions in the bound state, in par- simulations can be simultaneously performed, ticular, the interactions with Y3087.43, H2746.52 each contributing to the same history-dependent M1323.36 and D1283.32 were observed. On the bias potential; this is a key advantage of pathway to this final binding mode, NLX visited a the method, enabling the most efficient use number of different less stable alternative pockets of cluster computing resources. Thus, the and in particular, two metastable states charac- 104 J.M. Johnston and M. Filizola terized by different degrees of opening of EL2. reduced system representations and steered MD, The existence of these metastable states before as well as combinations of adiabatic biased MD binding to the orthosteric pocket is reminiscent and metadynamics techniques. We also discuss of the pathways observed for the B2AR using ligand-specific conformations obtained depend- steered MD (Gonzalez et al. 2011) and unbiased ing on the physiological response of the bound MD (Dror et al. 2011), though the intermediate ligand. In particular, we discuss applications to poses visited by NLX along the binding pathway two GPCRs, the B2AR and the 5HT2A serotonin are, as expected, not the same as those observed receptor, and their perceived success as judged by for ligand binding to B2AR. Some residue posi- virtual ligand screening and, in the case of B2AR, tions are, however, implicated to interact along when compared with the recently published crys- the entry pathway for both receptors, in particular tal structures. Y/L7.35, H/W6.58 and D300/D290 in the B2AR and DOP receptor respectively. Sampling of the NLX ligand in the bulk sol- 6.3.1 Using Guided MD to Build vent was corrected for, using a methodology put Activated GPCRs forward by the Roux lab (Roux 1999; Allen et al. 2004). Using the free energy surface from There are very few currently available X-ray the metadynamics simulation, with the collec- crystal structures of GPCRs displaying the char- tive variables described above, we obtained a acteristics of an active state, and these are: the restrained free-energy profile, following the pro- ligand free opsin crystal structures at either low tocol and equations described in (Provasi et al. pH (PDB ID: 3CAP (Park et al. 2008)) or in 2009). Our calculated equilibrium constant, of complex with a synthetic peptide derived from Keq D 80 ˙ 13 nM, for the final bound state of carboxy terminus of the alpha-subunit of the het- NLX at DOP was remarkably close to the ma- erotrimeric G protein (PDB ID: 3DQB (Scheerer jority of reported experimental values (e.g. Toll et al. 2008)); the crystal structure of the proposed et al. 1998), and thus, this methodology offers active state of the A2A adenosine receptor (PDB great potential for describing, quantitatively, the ID: 3QAK, 2YDO (Xu et al. 2011; Lebon et al. binding events of other GPCRs. 2011)); the crystal structures of metarhodopsin II (PDB ID: 3PXO (Choe et al. 2011)) and a consti- tutively active metarhodopsin II (PDB ID: 4A4M 6.3 Activation Mechanisms (Deupi et al. 2012)); and finally the crystal struc- in GPCRs tures of B2AR in complex with the Gs protein (PDB ID: 3SN6 (Rasmussen et al. 2011b)), and Activation processes of GPCRs are known to a camelid nanobody (PDB ID: 3P0G (Rasmussen occur on timescales that are inaccessible to cur- et al. 2011a)). rent simulations using standard MD. From recent In 2011, two groups combined experimentally crystal structures, the most pronounced, common derived restraints and ligand binding data from conformational rearrangements that mark the ac- extensive literature searches with steered MD to tivation of a receptor include: breaking of the so- guide the generation of the active states of the called “ionic lock” between the E/DRY motif in B2AR (Simpson et al. 2011) and the 5HT2A re- TM3 and acidic residues in TM6, upon a large ceptor (Isberg et al. 2011), and have assessed the (i.e. 11 Å in the case of B2AR (Rasmussen et al. success of their approaches by virtual screening 2011a)) outward movement and slight rotation of of test sets of known agonists, antagonists and the intracellular end of TM6; smaller (i.e. 6Å inverse agonists, diluted among non-binders. for B2AR (Rasmussen et al. 2011a)) outward To build an initial model for the active state movement of TM5; and some smaller slightly of the B2AR, Simpson and colleagues have com- inward movements of TM3 and TM7. Here we bined the intracellular part of the opsin crystal describe activated GPCR models obtained using PDB ID: 3DQB (Scheerer et al. 2008) with the 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 105 extracellular portion of the inactive B2AR crystal colleagues modeled the whole heterotrimeric structure at 2.1 Å (PDB ID: 2RH1 (Cherezov assembly, using structural details from a number et al. 2007)) to generate a template, and used of different crystallographic structures (namely an alignment based on common motifs in the PDB IDs: 1AZS (Tesmer et al. 1997), 1GOT transmembrane regions as input for MODELLER (Lambright et al. 1996) and 1GP2 (Wall version 9v1 (Fiser and Sali 2003). Two hundred et al. 1995)). initial models were narrowed down to a single The initial model of the B2AR from Simpson model that scored well according to the MOD- and colleagues was energy minimized using ELLER objective function (measuring spatial re- the CHARMM (Brooks et al. 1983) force field straint violation) and had a low backbone RMSD and an implicitly represented membrane/solvent from the template. environment described by the Generalized Born Isberg and colleagues generated a single ho- algorithm with the simple switching function mology model of the 5HT2A based on the B2AR (GBSW) (Im et al. 2004). In the GBSW model, inactive structure, PDB ID: 2RH1 (Cherezov et the influence of the membrane is included as al. 2007). They modified this structure, to match a solvent-inaccessible, infinite planar slab of the active characteristics found in the opsin struc- low dielectric constant. A simple smoothing ture, PDB ID: 3DQB (Scheerer et al. 2008). function is included to approximate the dielectric Specifically, (1) the lower parts of TM5 and TM6 boundary between the “bulk water” and the were tilted outwards, manually; (2) the missing “membrane” (Im et al. 2003). For this study, density of IL3 in the B2AR structure was re- the membrane thickness was set at 35 Å with placed by the opsin IL3 structure, and (3) the a smoothing length of 5 Å. The use of the G’i peptide was inserted into the structure and implicit membrane can minimize problems mutated to G’s and then subsequently to the associated with complete sampling of phase 3.50 G’q subtype. Finally, (4), the rotamer of R131 space during short standard MD simulations, was set to their G protein interacting confor- since the number of explicit particles for mation, and W2866.48 was set to its presumed which pair-wise interactions are required to be active conformation, as predicted by some earlier calculated is dramatically reduced. In particular investigations (see e.g. Holst et al. 2010; Shi et for this application, this meant that the entire al. 2002, among others), but not yet observed in G protein assembly could be included without any active receptor crystals to date (see supple- the prohibitive contribution to the number of mentary figure 7a of Taddese et al. 2012). This particles from explicitly represented solvent. modified structure was then used as the input to The simulation length total was approximately MODELLER version 9v6 (Eswar et al. 2007). 175 ns. The choice to use an implicit solvent In agreement with subsequent observations model could be controversial, given previous of the activated B2AR during long timescale reports of a structural and functional role for MD simulations (Rosenbaum et al. 2011), H-bonding interactions of waters entering the which revealed that interaction with either the rhodopsin structure upon activation (Grossfield G protein or a nano-body G protein surrogate is et al. 2008), and the internal crystallographic essential to maintaining an activated receptor waters noted in many X-ray derived structures of conformation, both groups have included GPCRs. Simpson and colleagues have indirectly interactions with the G protein to complete incorporated the effects of such solvation, their activated conformations. The 3DQB through specific experimentally derived restraints opsin structure represents the apo-receptor, that would open out parts of the structure stabilized by a peptide derived from the main sufficient to permit entry of water molecules upon binding site of the heterotrimeric G protein, activation under explicit solvation conditions. the C-terminus of the ’ subunit (G’CT). Isberg and colleagues embedded their Isberg and colleagues included the G protein model of the 5HT2A receptor in an explicitly peptide, mutated to G’q while Simpson and represented DPPC membrane, with the TIP4P 106 J.M. Johnston and M. Filizola explicit water model and counterions under derived from site-directed spin labeling, disulfide the OPLS-AA 2001 (Kaminski et al. 2001) cross-linking, engineered zinc binding and site- forcefield. Their simulations were carried out directed mutagenesis experiments. The restraints using the Schrödinger implementation of the were introduced slowly, as the system was heated DESMOND software package, version 2.4 to simulation temperature, and all but the toggle originally developed by DE Shaw and associates switch restraints were maintained throughout the (Chow et al. 2006). Perhaps reflecting the 150 ns production simulation. increased number of degrees of freedom, these The success of the models, in both cases, was simulations are shorter in duration, four stages probed by virtual screening during which it was of 5 ns simulations with restraints for a total demonstrated that the predicted active models of 20 ns. The choice of DPPC may not be successfully discriminated known receptor- the best lipid for activation studies of GPCRs, selective agonists from antagonists. Simpson since Vogel and colleagues presented evidence and colleagues refined their active structure that, at least for rhodopsin, the fully saturated by flexibly docking full agonists epinephrine, chains of DPPC can inhibit the activation isoprotenerol, TA-2005 and salmeterol into transition to the receptor activated meta II state the binding pocket, and screened these refined (Vogel et al. 2004). Furthermore, the melting receptors using GLIDE (Friesner et al. 2004, temperature of DPPC is 41 ıC, and so, under 2006; Halgren et al. 2004). They were able physiological conditions, the membrane might to check all binding poses, because they be expected to be gel-like, casting some doubt used a fairly small set of ligands, comprising on the choice of this lipid as a valid membrane 29 full and partial B1/B2AR agonists, 38 mimetic. Nevertheless, Isberg and colleagues B1/B2AR antagonists and inverse agonists, ensured a fluid membrane environment for their and 56 non-peptide ligands selective for 5HT2A model, by maintaining their simulation opioid, muscarinic, neurokinin, neurotensin, conditions at T D 325 K (51 ıC). bradykinin and serotonin receptors. Docking Both groups derived sets of restraints from scores demonstrated that the active model previously published experimental information to showed a preference for B1/B2AR agonists define their activated states, and in both cases, over antagonists, inverse agonists and other during the earliest parts of the simulations, har- ligands. Active enrichment curves showed that monic restraints were applied to the backbone of the enrichment over the first 10 % of ligands was transmembrane regions of the receptor, to prevent significantly greater than expected at random. distortions of the helical structure. Isberg and Furthermore, enrichment for B2AR-selective colleagues applied pairwise harmonic distance agonists was higher than for B1AR-selective constraints, collated from mutagenesis and X- agonists, and enrichment for antagonists was ray crystallographic data, to inter-helical, ligand- only observed once all the agonists had been receptor and G protein-receptor distances. These found. Isberg and colleagues performed a larger constraints were used to drive the conforma- scale virtual screen, also using GLIDE. They tion of the 5HT2A towards a presumed activated selected 182 known 5HT2A ligands, including structure, achieved once they are satisfied. Simp- 139 agonists (112 phenylethylamine agonists son and colleagues applied six sets of harmonic and 27 others), 39 antagonists and 4 inverse distance restraints to the B2AR receptor model agonists. These were diluted with 9257 diverse during MD simulations of 150 ns in production compounds from the ZINC database (Irwin et al. length, to drive the conformation towards an 2012). GLIDE docking scores demonstrated that activated state. The basis of these restraints was the 29 highest ranked hits were phenylethylamine from experimental evidence sourced from the agonists, and these were the best scoring class of available literature on both B2AR and other class molecules by a large margin, with a mean G-score A GPCRs (in particular, the muscarinic recep- 1.5 lower than other agonists. Phenylethylamine tor and rhodopsin). The sets of restraints were antagonists also scored better than other agonists, 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 107 other antagonists, and other ligands. The virtual can be thought of as a direct consequence of screenings seem to demonstrate the success the simplification or averaging over interactions of modeling the active state of both receptors, collated from different structural/biophysical using restraints derived from experimental data. sources introduced by the homology modeling However, the subsequent solution of the active process, but can also be attributed to the static state crystal structures of B2AR (Rasmussen nature of a crystallographic structure versus et al. 2011a, b; Rosenbaum et al. 2011) has the dynamic manner in which the model was enabled direct comparison between the B2AR constructed. crystal and the active model, and Taddese and colleagues (2012) have discussed this comparison in a brief paper in 2012. Overall, 6.3.2 Combined Adiabatic Biased they report an average RMSD of 1.3 Å in the MD with Metadynamics invariant TM region (i.e. residues that have to Explore Activation the same environment when compared over a Pathways number of structures) between the active model from Simpson and colleagues (2011) and the As mentioned earlier in this chapter, and as active B2AR structure. The overall RMSD is observed in experimental studies, the value of perhaps less explanatory than the residue-by- GPCRs as pharmaceutical targets is enhanced residue variation between the B2AR model by the concept of “functional selectivity”, or and the B2AR crystal structure, which ranges “biased agonism”, a concept wherein a single from 1.0 Å in the TM region, to 7.2 Å in the ligand might display different efficacies at a loops. The larger value of RMSD for the loop single receptor isoform depending on the effector regions is not unexpected owing to their high pathway coupled to that receptor (Rives et al. mobility. The active B2AR model was based 2012). A simple mechanistic explanation for this on an active, G protein bound state of opsin, phenomenon can be derived from the dynamic and the difference between the opsin template equilibrium of GPCRs between different and the active B2AR crystal structure is 1.7 Å conformational sub-states. Different ligands or over the invariant TM region. The active model changes in protein-protein interactions (such shows good agreement with the overall position as may be observed in receptor dimerization) of the helices in the structure PDB ID: 3SN6 can shift the equilibrium population of these for TM1, TM2 and TM5, but seems more like sub-states toward conformations of the receptor the opsin template for TM3, TM4, TM6 and that are kinetically and structurally distinct. TM7. The orthosteric binding pocket appears to Evidence of different sub-states can be inferred have been successfully modeled, with an RMSD from crystallographic structures: not all of the of 1.3 Å between the residues in the active attributes usually associated with an ‘active- model and the crystal structure, and this may state’ receptor are present for active-state crystal be the reason for the strong enrichment results structures solved in the presence of different for the virtual screening. The key receptor-G agonists (Provasi et al. 2011). We have recently protein interactions were correctly predicted developed a general computational strategy from the opsin-G protein complex template, that combines adiabatic biased MD with path with 27/38 residues on the receptor and 20/34 CV metadynamics, which we have used to residues on the G protein surface correctly characterize possible metastable active states of predicted. Overall the modeling moderately rhodopsin (Provasi and Filizola 2010); to discern successfully reproduced the nature of the the effects of ligands with variable efficacies on activated characteristics observed in the crystal the conformation of the B2AR receptor (Provasi structure, but the atomic resolution specificity of et al. 2011) and, finally, to offer a mechanistic the interactions seen in the single conformation explanation of allosteric effects of ligand binding of the X-ray structure was not reproduced. This in a GPCR dimer of the 5HT2A serotonin receptor 108 J.M. Johnston and M. Filizola

Fig. 6.2 (a) Multidimensional scaling representation of inactive and active states. The receptor is shown in grey multiple adiabatic biased MD trajectories clustered ac- cartoon representation, in (b)and(c), except for TM5 and cording to RMSD. Representative structures selected from TM6, which show the progress from inactive to active, in each cluster (shown in (b): side view and (c): intracellu- colors corresponding to the clusters in (a) lar view) to define a homogenous pathway between the and the glutamate receptor (mGluR2) observed is bigger than the minimum value previously by our experimental colleagues (Fribourg et al. achieved during the trajectory. The adiabatic bi- 2011). ased MD trajectories were short (e.g. 1Ð5 ns), In the first stage of the method, a mono- with small elastic constants (0.01 kJ/nm2) and in- dimensional pathway defining activation of the dependently drawn Maxwellian initial velocities. receptor was derived. Frames depicting the re- All of the independent adiabatic biased MD tra- ceptor in intermediate states between ‘inactive’ jectories were then pooled, and all of the sampled and ‘active’ X-ray-derived crystal structures (or conformations clustered based on the RMSD of models, in the absence of crystal structures) de- the TM domain of the receptor. Representative fined the pathway. The frames were derived from structures from these clusters were then chosen multiple (e.g. 5 or 10) statistically independent as reference frames to define the path CVs for adiabatic biased MD simulations (Marchi and further metadynamics simulations. The number Ballone 1999). Like targeted MD (Schlitter et of clusters, and hence frames, was between 4 and al. 1994), adiabatic biased MD drives the system 10, depending on the nature of application. See towards the desired final state, by enforcing a Fig. 6.2 for an example of this. reduction in RMSD between the initial and target To obtain information about the relative stabil- conformations, to enable observations of a con- ity of the states populated by the receptor during formational transition during a single trajectory. transition from inactive to active conformations, In contrast to TMD, the adiabatic biased MD metadynamics simulations with path collective algorithm (Marchi and Ballone 1999) ensures variables (Branduardi et al. 2007) can be per- exploration of low-energy pathways by keeping formed. The CVs defining the phase space are the the total potential energy of the system constant position (s) along and the distance (z)fromthe during the MD run through application of a time- pre-determined activation pathway. The reference dependent biasing potential. The harmonic bias- frames, k, define the position of the progression ing potential is only applied when the RMSD of the system from inactive (s 0) to active 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 109

(s 1) during the pre-determined transition (CV and Filizola 2010) and NAMD 2.7 (Phillips et al. 1), and its distance (z) from it (CV 2), according 2005) with the Charmm27 force-field (MacKerell to the following equation: et al. 1998)forthe5HT2A, mGluR2 monomers and dimer (Fribourg et al. 2011). Xk 1 i 1 œd .i/ This combinatorial approach was initially val- s .R.t// D Z e TM R.t/; R k 1 idated on rhodopsin (Provasi and Filizola 2010), iD1 where the activation pathway was defined be- (6.5) tween the crystal structure of a photoactivated, deprotonated intermediate of rhodopsin (PDB ID: where • Dhd (R(i),R(i ˙ 1))i is the average dis- TM 2I37) and the low pH crystal structure of opsin tance between two adjacent frames in the transi- (PDB ID: 3CAP), the only crystal structure dis- tion pathway, the constant exponent, œ,mustbe playing characteristics of an active state avail- chosen in such a way that œ • Š 1, and Z is a able at the time of this study. Simulations were normalization factor defined by: performed for wild-type rhodopsin, embedded in a POPC membrane, with either a charged or an Xk œd .i/ 3.49 Z e TM R.t/; R (6.6) uncharged residue E134 within the E(D)RY iD1 motif, to simulate proton uptake from the bulk oc- curring in the late stage of rhodopsin activation. while Reconstruction of the rhodopsin free energy landscape indicated three common metastable 1 z .R.t// Dœ log Z (6.7) states between 2I37 and 3CAP along the adia- batic biased MD pre-calculated transition path. The free-energy of the receptor as a function Two of the identified minima are comparable to of s and z can then be reconstructed from con- active intermediates of bovine rhodopsin, char- verged well-tempered metadynamics simulations acterized by different amplitude of the outward of varying length (80Ð300 ns) (see Provasi et al. movement of TM6. This was revealed by com- 2011; Provasi and Filizola 2010; Fribourg et al. paring intra-molecular distance analysis of these 2011 for details of the simulation protocols). states with results from double electron-electron Convergence of the reconstructed free-energy can resonance spectroscopy (Altenbach et al. 2008). be assessed in two ways: firstly, by ensuring The largest predicted separation between TM3 minimal variation in the free-energy difference and TM6 was in line with data obtained for Meta between the minima around the experimental IIb from spectroscopy (Knierim et al. 2007), and endpoints with time, and secondly, in a converged for opsin from crystallography (Park et al. 2008). simulation, exploration of the FES represented by Key residues, thought to influence the activa- the CVs faces no further energy barriers, there- tion pathway, were mutated and additional sim- fore one can expect frequent re-crossing of the ulations were carried out on these receptors for values of the CVs after convergence is attained. comparison to the WT. An interaction between All simulations in each of the applications K2315.66 and E2476.30 appeared to stabilize the of this method to date have been carried out predicted active conformation with the largest using an explicit atomistic representation of all separation between TM3 and TM6, as also seen components, and the receptors were embedded in the opsin crystal structure (Park et al. 2008). in solvated POPC/10 % cholesterol membranes. Furthermore, we carried out metadynamics sim- The versatility of the method lies in the im- ulations of the K231A5.66 mutant rhodopsin with plementation using Plumed plugin (Bonomi et either a charged or an uncharged residue E1343.49 al. 2009), meaning it has been performed using within the E(D)RY motif. We hypothesized that GROMACS (Van der Spoel et al. 2005) and the removal of a polar interaction between K2315.66 OPLS-AA force-field (Kaminski et al. 2001)for and E2476.30 by replacing K2315.66 with ala- rhodopsin and B2AR (Provasi et al. 2011;Provasi nine would switch the equilibrium toward the 110 J.M. Johnston and M. Filizola predicted conformation with a smaller separation Comparison of the free energy surfaces of between TM3 and TM6. Our simulations showed the receptor activation when complexed with that the putative active conformation of rhodopsin different ligands demonstrated selective inactive with the largest separation between TM3 and or active conformations depending on their TM6 was destabilized in the presence of a neu- known elicited functional responses. tral E1343.49 mimicking the activation-dependent In the absence of ligand, the receptor was proton uptake from the bulk. found to have two low energy conformations, one In the most comprehensive study using with s 0.2, i.e. close to the inactive state, and a this methodology (Provasi et al. 2011), we second, closer to an active state, with s 0.6, sep- were able to present compelling evidence for arated by a low energy barrier. This corresponds the concept that ligands of differing efficacy well with the known basal activity of the B2AR, access structurally distinct conformations of and the inherent conformational flexibility of the the B2AR. Path CV metadynamics simulations unliganded B2AR may help to explain the diffi- were conducted for six ligands, representing a culty in obtaining crystallographic structures of full agonist (epinephrine), weak partial agonists the B2AR in the absence of ligand. (dopamine and catechol) inverse agonists (cara- The free energy surface for B2AR complexed zolol and ICI-118,551) and a neutral antagonist with alprenolol, ostensibly a neutral antagonist, (alprenolol). These different ligands were also i.e. a ligand that competitively binds in the or- compared to the unliganded receptor. Parameters thosteric binding pocket of a receptor and blocks for the ligands were obtained manually, by the binding of (and thus cellular responses to) analogy with existing molecules in the OPLS- agonists or inverse agonists, also displayed two AA force-field, while Coulomb point charges stable states. One, close to the inactive, s 0.2, were obtained from quantum chemical geometry was marginally (1 kcal/mol) more stable than optimization using Gaussian 03 and restricted the second state, at s 0.6. This type of free en- Hartree-Fock calculations with the 6-31G* basis ergy profile illustrates possible reasons for some set. The ligands were either docked into the studies having found alprenolol to behave in dif- receptor using Autodock 4.0 (Morris et al. 1996) ferent ways in different assays: most notably, as (for dopamine, catechol and epinephrine) or an inverse agonist (Elster et al. 2007; Hopkinson positioned according to their crystallographically et al. 2000) or as a partial agonist (Callaerts-Vegh determined binding modes. The activation et al. 2004). pathway was homogenously defined by ten The profiles for the inverse agonists (ligands frames, between the inactive state of the B2AR, that preferentially stabilize a receptor in an inac- represented by PDB ID: 2RH1 (Cherezov et tive conformational state), ICI-118,551 and cara- al. 2007) and the active state, represented by zolol both displayed single conformational states the nanobody stabilized structure, PDB ID: with low values of s (s < 0.2), in deep free energy 3P0G (Rasmussen et al. 2011a). Metadynamics wells. The free energy minimum, in each case, simulations for each receptor/ligand complex was approximately 4 kcal/mol lower in energy were run for approximately 300 ns until the than the next lowest minima. This corresponds difference between the free energies of the with the solution of many crystal structures in metastable states observed was converged. The inverse agonist bound conformations. activation of the receptor was also measured in Results for the agonists were less clear-cut. terms of other structurally pertinent variables, The free energy profiles for both partial (catechol by reweighting the free energy surface. These and dopamine) and full (epinephrine) agonists variables were: the distance between the each displayed a deepest free energy minimum at sidechains of the R1313.50 and E2686.30 residues the expected value of s, i.e. s 0.6 for the partial of the ionic lock; the dihedral angle of the agonists, and s > 0.9 for the full agonist. How- sidechain of W2866.48, i.e. the rotamer toggle ever, other minima, only marginally less stable switch; and the outward displacement of TM6. than these were found in all cases, at lower values 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 111 of s. A feasible explanation for this arises from methysergide, and an inverse agonist, clozapine. the absence of G protein: as demonstrated by For the monomeric 5HT2A, clozapine and Rosenbaum and colleagues (2011), during very methysergide stabilized two different inactive- long timescale MD simulations, an active confor- like conformations, while DOI stabilized an mation of the B2AR will quickly lose many of active like conformation. These experiments the active characteristics in the absence of the G were repeated for dimeric arrangements of protein or a stabilizing nanobody surrogate. the receptors, with the lowest energy ligand- The results from this study provide an atomic stabilized conformations of 5HT2A complexed resolution view of structurally distinct receptor with a ligand-free mGluR2. When complexed conformations stabilized by different ligands, and with the clozapine-stabilized 5HT2A conforma- demonstrate that ligands with varying efficacies tion, the mGluR2 conformation is shifted toward might be used to control population shifts toward an activated state. In contrast, when bound to desirable GPCR conformations. Such knowledge either the DOI-stabilized active-like 5HT2A,or might be useful to aid rational drug design: biased the methysergideÐstabilized inactive-like 5HT2A ligands may represent novel, more efficacious conformation, the conformational equilibrium of therapeutics. mGluR2 is shifted toward an inactive-like state. Finally, in conjunction with tissue and animal These results (in combination with the tissue behavioral studies, we have recently applied and animal studies) suggest that the formation of this methodology to determine some of the the heteromeric complex enables modulation of mechanistic details driving the antipsychotic the mGluR2 response to its endogenous ligand properties of drugs targeting a heteromeric by binding of different ligands to the 5HT2A. arrangement of the serotonin 5HT2A and mGluR2 Therefore, the strong 5HT2A agonist, DOI, can (Fribourg et al. 2011), in particular, why some greatly stimulate Gq signaling, while decreasing 5HT2A inhibitors (e.g. clozapine) may exhibit Gi signaling, but the inverse agonist, clozapine, neuropsychological effects while others (e.g. abolishes Gq signaling while simultaneously methysergide) do not. Oligomeric receptor stimulating Gi signaling. These results go some complexes have been shown to have distinct way to explaining the complexities surrounding signaling properties when compared to their the disparity between the neuropsychological monomeric components, and these receptors effects of 5HT2A inhibitors, some of which have been shown to form a specific hetero- demonstrate antipsychotic effects, while others oligomeric complex in mammalian brain tissue do not. A drug bound to one receptor of the that is implicated in schizophrenia. The complex heteromer was shown to influence the signaling integrates the responses from the Gq-coupled response of the partner to its endogenous ligand mGluR2 and that of the Gi-coupled 5HT2A and such mechanistic insights are invaluable to modulate signal transduction and influence to therapeutic strategy design for disorders in downstream signaling events. To investigate this which oligomeric arrangements of receptors are computationally, and to provide a structural implicated. context for the experimental observations, we built homology models of both receptors: mGluR2 was rhodopsin based, using PDB 6.4 Oligomerization ID: 1U19 for the inactive state and 3DQB Mechanisms in GPCRs for the active-like state, while 5HT2A was based on B2AR, using 2RH1 as the inactive In the previous section, we have mentioned the template and 3P0G as the active-like template. distinct properties of oligomeric arrangements To these models, we applied the methodology of some GPCRs as compared to those of as in the previous cases, comparing the free their monomeric components (Milligan 2009). energy profiles of activation for 5HT2A bound However, the ability of GPCRs to assemble into to a strong agonist, DOI, a neutral antagonist, stable, heteromeric or homomeric, oligomeric 112 J.M. Johnston and M. Filizola arrangements remains a subject of much debate in residue-level detail is ensured by the extensive (Fonseca and Lambert 2009; Lambert 2010). calibration of a large library of chemical building From a computational perspective, understanding blocks against thermodynamic data (in particular, the thermodynamics and kinetics of GPCR oil/water partition coefficients). These versatile oligomerization is not a trivial problem owing building blocks permit the force field to flexibly to (1) the large size of the complex system, represent a large range of biomolecules without and (2) the stochastic nature of the process. the need to re-parameterize the model in each In this section we survey the enhanced MD- case. The MARTINI force field is compatible based methods that have recently been used in with GROMACS (Van der Spoel et al. 2005) and combination with reduced representations of the its CG mapping of heavy atoms (including water GPCR systems to study the dynamic behavior molecules) to beads is a 4:1 ratio, except in the of dimeric/oligomeric arrangements of these case of molecules containing rings, where a map- receptors. ping of 2:1 is used. There are four main types of beads, P D polar, N D non-polar, C D apolar and charged D Q, with subtypes depending on hy- 6.4.1 Spontaneous Self-Assembly drogen bonding capacity and polarity. Each CG of Coarse-Grained bead has a mass of 72 atomic mass units (amu), Representations of GPCRs equivalent to the mass of four water molecules, except for ring structure beads, which have a The number of calculations required by a bi- mass of 45 amu. This improves efficiency and en- ological molecule scales as the square of the ables a simulation timestep of 20Ð40 fs, approx- number of particles included in its representa- imately ten times that possible with an all-atom tion, and consequently, a lengthy simulation of GROMACS simulation. Based on comparison of a large, explicitly represented GPCR, particu- diffusion rates for atomistic and MARTINI CG larly in a dimeric or oligomeric state, becomes models, the CG model affords a scaling factor particularly difficult at atomistic resolution. This of 4 for the effective time sampled, though there issue has prompted several recent efforts focused has been some recent debate surrounding the best on reducing the number of particles included in time step (Winger et al. 2009; Marrink et al. a simulation of a protein and its environment, 2009). and a number of strategies have been proposed. The earliest application of the MARTINI CG We focus here on the MARTINI coarse-grained model to GPCRs was the self-aggregation of (CG) model (Marrink et al. 2008), one of several rhodopsin monomers in four different explicit, “bead models” that have been devised to coarsely CG, lipid membrane environments (Periole et al. represent biomolecules (see Tozzini 2005, 2010 2007). Systems with 16 CG rhodopsin monomers for recent reviews). (based on the 1L9H structure (Okada et al. 2002)) The MARTINI forcefield has proven to be a at a protein:lipid ratio of 1:100, were simulated successful means of studying oligomerization for for up to 8 s. Such time scales would be very GPCRs and has been used in extensive studies difficult to reach in an atomistic representation. of oligomerization for rhodopsin (Periole et al. The results of these simulations indicate a local- 2007, 2012) opioid receptors (Johnston et al. ized membrane adaptation to the presence of the 2011; Provasi et al. 2010) and the B1AR and receptor, supposedly driven by the motivation to B2AR (Johnston et al. 2012). Developed by Mar- overcome the hydrophobic mismatch between the rink and co-workers, the MARTINI force-field length of the hydrophobic part of the monomeric (Marrink et al. 2004, 2007, 2008; Monticelli et receptor and the equilibrium hydrophobic bi- al. 2008; Lopez et al. 2009) offers coarse-grained layer thickness. Hydrophobic mismatch has been (CG) representations of proteins, several lipid shown to promote self-assembly of rhodopsin types and cholesterol. Close comparability to reconstituted in membrane bilayers (Kusumi and both experimental and atomistic MD approaches Hyde 1982; Ryba and Marsh 1992). The details 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 113 of such phenomena could not be modeled in GPCRs. Two studies of dimerization of a simulation using an implicit representation of the DOP receptor were performed using a membrane, like the generalized Born method established methodologies, firstly umbrella (Im et al. 2004) described earlier in this chapter. sampling, from which we derived the PMF of Periole and colleagues simulated rhodopsin a dimerization event (Provasi et al. 2010), and in different phosphocholine (PC) lipid environ- secondly, a metadynamics study in which we ments under non-biased conditions, and they ob- established the most favorable orientation of served a clear dependence on chain length. Bi- the individual protomers involved in different layer adaptation was manifest as local thickening dimeric arrangements (Johnston et al. 2011), i.e. near helices TM2, TM4, and TM7 in (C12:0)2PC, comprised of different contact interfaces, of the (C16:1)2PC, and (C20:1)2PC bilayers and as lo- DOP receptor. The computational results of this cal thinning near helices TM1/H8 and TM5/TM6 study compared favorably with inferences from in (C20:1)2PC and (C20:0)2PC. The results in- cysteine crosslinking experiments, supporting the dicated a higher propensity for protein:protein role of specific residues at the interfaces. contact interactions in (C16:1)2PC. Clear reorga- To characterize dimerization for the DOP re- nization and increase in scope of the rhodopsin ceptor, we performed biased MD simulations dimer interfaces was observed during the simu- of a CG representation of a homo-dimeric ar- lations, indicating a search for a shape comple- rangement of this receptor in an explicitly CG mentarity that maximized the buried accessible represented, POPC:10 % cholesterol-water en- surface area. The number of contact interfaces vironment. Since these studies were conducted was higher in (C12:0)2PC and (C16:1)2PC, where prior to the crystallographic solution of the opioid strong forces were introduced by the hydrophobic receptor structures in 2012 (Wu et al. 2012; mismatch, than in (C20:1)2PC and (C20:0)2PC, Granier et al. 2012; Thompson et al. 2012; Man- where the forces were more balanced. Three glik et al. 2012), an all-atom structural model of contact interfaces were clearly visible on the 6Ð the DOP receptor protomer from Mus musculus, 8-s time scale in (C20:1)2PC; these included was generated by homology modeling, using the previously suggested homo- and heterodimeriza- crystal structure of the B2AR at 2.4 Å resolu- tion interfaces in rhodopsin and other GPCRs, tion (PDB ID: 2RH1) (Cherezov et al. 2007)in involving the exposed surfaces of the helices MODELLER 9v3 (Eswar et al. 2007), using the TM1/TM2/H8, TM4/TM5, and TM6/TM7. The same strategy we recently reported in the liter- authors concluded that hydrophobic mismatch ature for the human DOP receptor (Provasi et al. drives self-assembly of rhodopsin into liquid- 2009). The loop regions were built ab initio using like structures with short-range order, and that ROSETTA 2.2 (Wang et al. 2007). A pair of the the interactions may not be dictated by specific resultant DOP receptor models was placed facing residue-residue contacts at the contact surfaces. one another at a putative symmetrical interface, involving residue 4.58, inferred from cysteine cross-linking data on this and other GPCRs (see 6.4.2 Combining CG Representation e.g. Guo et al. 2005, 2008). and Biased MD to Investigate In an effort to improve the stability of the GPCR Dimerization secondary structure of the CG representation of the receptor, we combined an elastic By combining the MARTINI reduced representa- network model (ENM) with the MARTINI CG tion of the system with biased MD techniques, representation, using a method developed and it has been possible to make predictions about termed ELNEDIN by Periole and colleagues the relative strength of dimers formed at different (2009). ENMs are ideally suited to preserve the interfaces in an explicit membrane environment. secondary or tertiary structure of biomolecules, We pioneered the use of a free energy since they are structure derived, and therefore approach to characterize oligomerization of introduce an intrinsic bias toward the structure 114 J.M. Johnston and M. Filizola

Fig. 6.3 (a) Definition of collective variables (CVs) used 121 and Gly51 (shown as red spheres) on each protomer, in studies of dimers from the Filizola lab (Johnston et al. thus the variables are consistent across different interface 2011, 2012; Provasi et al. 2010). CV1 is the distance r arrangements. They describe the distance between the between the COMs of the protomers (CA)and(CB); CV2 receptors, d; the tilt of long axis of each receptor relative is the rotational angle  A, defined by the projection on to the receptor-receptor direction ™1and™2; the rotation to the plane of the membrane of the COM of the TMi of the receptors around their long axis (parallel to the 0 0 of A,theCA,andtheCB, and CV3 corresponded to the membrane normal), ¥1, ¥3or™1 and ™2 , the relative 0 equivalent rotational angle  B. Shown for the B2AR at a orientation of the receptor’s long axis, ¥2. d is the TM1/H8 interface, TM1 and H8 are colored black.The interfacial receptor distance and is defined as the distance, CVs are defined according to the interface of interest. d, between the receptors from which the distance at the (b) Definitions of the virtual bond algorithm used by minimum of the PMFs was subtracted, see supplementary Periole and colleagues in their recent publication (Periole material from (Periole et al. 2012). Once again, TM1 and et al. 2012). The axes (black dotted lines in Fig. 6.3b) H8 are colored black are anchored by the backbone CG beads of Cys 187, Gly upon which they are established. In a novel secondary structure (e.g. coil, bend, or turn), 2 extension to the ELNEDIN method, we applied a force constant of KSPRING D 250 kJ/mol/nm a secondary-structure-dependent construct to was applied. A comparison of the RMSF for models of the DOP receptor dimer (Provasi the C’ beads for the CG representation with et al. 2010). The strength of the force constant an equivalent atomistic simulation indicated a for the harmonic restraint, KSPRING,was qualitative agreement, permitting the secondary determined by the secondary structure of each structure of the helices to be maintained, without of the residues. If the residue was determined compromising the flexibility of the loop regions. to have a defined secondary structure (by We used the CVs illustrated in Fig. 6.3ato DSSP (Kabsch and Sander 1983)), e.g. ’- describe the relative position of interacting DOP helix as in the case of the TM regions of receptor protomers A and B during the simula- the DOP receptor, then a force constant of tions: CV1 represented the distance r between 2 KSPRING D 1,000 kJ/mol/nm was applied. For the COM of protomers A (CA) and B (CB); CV2 a sequence of >2 residues with undefined described the rotational angle  A, defined by the 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 115 projection on to the plane of the membrane of potential. The advantage of umbrella sampling, the COM of the TM4 of A,theCA, and the considered to be key from a computational per- CB, and CV3 corresponded to the equivalent spective, is that the relative independence of the rotational angle  B. To allow exploration of an windows from one another (though this should be experimentally-supported TM4 interface of DOP considered and confirmed when recombining the receptor homodimers involving position 4.58 in windows) allows the biased simulations along the a reasonable timescale, we limited the sampling length of — to be conducted in parallel on large ı of the two rotational angles  A and  B to a 25 computer clusters. interval, using steep repulsive potentials. To obtain the PMF, (W( )) for the range of The system setup was the same for both that is of interest, the simulations from each of the of the studies (Johnston et al. 2011;Provasi window must be unbiased and combined. Thus, et al. 2010). All simulations were performed it is crucial that the histogrammed variations in using GROMACS (Van der Spoel et al. 2005) the measured variable obtained from each of the incorporating the Plumed plugin (Bonomi et al. windows overlap sufficiently such that the whole 2009).Inthefirstofthesetwo,weusedumbrella of the coordinate space is covered. sampling to define the PMF of dimerization as a Roux offers a thorough description of the function of the separation between the centers of derivation in (Roux 1995), but briefly, the biased mass of the two protomers. distribution function, < ( )>(i) as obtained from Umbrella sampling was introduced in 1977 the ith biased ensemble, is: by Torrie and Valleau (1974), and has been con- 1 wi . /=kB T wi . /=kB T sidered the prototypic biased MD technique for h . /i.i/ D e h . /ihe i improving sampling of a PMF as defined by (6.10) the Kirkwood equation (Kirkwood 1935)(6.8), from the average distribution, < ( )>, along a where kBT is the Boltzmann factor and the unbi- predefined reaction coordinate, : ased PMF from the ith biased ensemble is: h . /i h . /i.i/ W . / D W . / kB T ln (6.8) Wi . / D W . / kB T ln h . /i h . /i

wi . / C Fi (6.11) where kBT is the Boltzmann factor, and * and

W( *) are arbitrary constants. In this method, the * and W( *) are arbitrary constants. Fi is an un- reaction coordinate is divided into windows, and determined constant representing the free energy a biasing potential (w( )) is introduced to tether associated with introducing the biasing potential: the system to the centre of each window. The . / biasing potential acts to restrict the variations of eFi =KB T Dhewi =kB T i (6.12) the variable in order to ensure enhanced configu- rational sampling within each of the independent Most commonly, the Weighted Histogram windows, and as a result, along the entirety of the Analysis Method, or WHAM (Kumar et al. 1992, reaction coordinate when all of the windows are 1995) is used to derive the PMF from the output combined. The biased simulations have potential of the simulations. This probably reflects the fact energy [U(R) C wi(—)], (where R is the system that the Grossfield lab (University of Rochester, coordinates) and wi(—) is a harmonic potential of NY) has created a very user-friendly and freely the form: distributed implementation of WHAM, but other . unbiasing techniques e.g. MBAR (Shirts and 1 2 wi . / D 2 K. i / (6.9) Chodera 2008) have proven to be equally useful. WHAM focuses on optimizing the estimate of centered on the successive values of i for each the coordinate-dependent unbiased distribution window. K represents the force constant of the function as a weighted sum over all of the data 116 J.M. Johnston and M. Filizola from the biased simulations, and determining which were separated from each other by a the functional form of the weight factors that transition state at rTS1 D 3.28 nm, and from minimize the statistical error. The derivation of the monomeric state at large values of the the WHAM equations can be found in (Kumar et separation (r 4.90 nm), by a transition state at al. 1992) but here, in short, the key expressions rTS2 D 3.75 nm. In D1, the structure indicated that can be distilled down to: TM4 from each protomer inserts into a groove on the surface of the opposite protomer, formed N Xw by helices TM2, the C-terminal half of TM3, and n . / i h ii TM4. As expected from the energetic similarity, iD1 . / the D2 structure was similar to D1 in the overall h iD N (6.13) Xw orientation of the protomers, but corresponded n eŒwj . /Fj =kB T j to a slightly less compact (r D 3.40 nm), j D1 Z and slightly asymmetric arrangement of the wj . /=kB T protomers. Fj DkB T ln h . /ie d Using the derived free-energy surface of (6.14) DOP receptor homodimers, and the formalism described by Roux and co-workers in Allen et where Nw is the number of windows; ni is the al. (2004) and Roux (1999), we were able to number of independent data points, used to con- calculate several observable quantities from our struct the biased distribution function, in the ith simulations. The dimerization constant for the b window at specific value of ; < ( )>i is the identified lowest-energy DOP homodimer was 2 biased distribution function in the ith window; KD D 1.02 m (see details of the derivation wj( ) is the window potential of the jth window, in Johnston et al. 2012). From this calculated at specific ; Fj is the (unknown and to be deter- KD, and in combination with a diffusion 2 mined) free energy constant. The unbiased dis- coefficient DT value of 0.08 mm /s determined tribution function, < ( )> is dependent on fFjg, experimentally for the mu-opioid (MOP) receptor and thus the simultaneous Eqs. (6.13) and (6.14) (Sauliere-Nzeh et al. 2010), and also in line must be solved iteratively, until a consistent solu- with a value of 0.1 mm2/s obtained for several tion is obtained for both of them. This unbiased other GPCRs (Barak et al. 1997; Hegener et al. distribution function may then be substituted into 2004; Henis et al. 1982; Poo and Cone 1973), the Kirkwood equation (Kirkwood 1935), (6.8), we obtained an estimate of a few seconds for the to obtain the PMF along the coordinate —. half-time of DOP receptor dimers at a contact For our first umbrella sampling simulations interface comprised of TM4 in a lipid bilayer of GPCRs (Provasi et al. 2010), 43 independent mimetic. We repeated these calculations in a windows with values of the reaction coordinate second simulation (Johnston et al. 2011), for in question, i.e. protomer separation, r, between a contact interface simultaneously involving r1 D 3.0 nm and r43 D 4.90 nm were simulated contact at both TM4 and TM5 of each protomer. for 300 ns, using harmonic restraints on the dis- The comparison revealed that the TM4 interface tance. The distribution p(r) of the separation was was marginally more stable than the interface harvested for the last 250 ns, and the resulting involving both TM4 and TM5, indicating that the probability distributions were combined using stability of the dimer pair is dependent on the WHAM to derive the free energy as a function region of contact between the protomers. These of the separation between the protomers. calculated lifetimes of DOP receptor homodimers The free energy surface identified two are consistent with the transient, sub-second to different, yet energetically very close, homo- millisecond association inferred by recent single- dimeric states of the DOP receptor (D1 and molecule studies of GPCRs (Hern et al. 2010; D2) involving the TM4 interface, energetically Kasaietal.2011). Whether such a lifetime for stabilized relative to the monomeric state, DOP receptor homodimers and other GPCRs 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 117 within the membrane has implications on the 6.4.3 Relative Stability of Dimer functional role of receptor complexes and/or the Interfaces in GPCRs specificity of their interactions requires more in-depth investigation. In 2012, two papers more comprehensively ad- In a second study (Johnston et al. 2011), us- dressing dimerization of rhodopsin (Periole et al. ing this same system setup, we performed a 2012) and the B1/B2ARs (Johnston et al. 2012) well-tempered metadynamics study, to better ex- were published almost simultaneously. We have plore the TM4-TM4 contact interface of the pro- recently combined the two techniques that we tomers in this same dimeric arrangement, and have used in the past to investigate the DOP re- compare it with the interface comprising con- ceptor, i.e. umbrella sampling and metadynamics, tact at both TM4 and TM5 of each protomer. into a single method to comprehensively charac- The well-tempered metadynamics implementa- terize dimerization of the B1AR and B2AR, at tion is identical to that described earlier in this contact interfaces previously documented to have chapter. We restricted exploration of CV1, r, to physiological relevance for a number of GPCRs r D 3.80 nm, i.e. the value of the transition state by experimental techniques. We have focused on observed in the umbrella sampling simulations contact interfaces involving transmembrane he- (Provasi et al. 2010), using a steep repulsive po- lices TM1/H8 and TM4, and have used umbrella tential, and applied a Gaussian biasing potential sampling to explore the FES as a function of to ensure thorough exploration of CV2 and CV3, protomer:protomer separation as in (Provasi et ı angles  A and  B, within a 25 interval for al. 2010), while simultaneously employing meta- the contact interface involving TM4 only, and dynamics as in (Johnston et al. 2011) to ensure ı a 17 interval for the more symmetric inter- complete sampling of the angle space available face involving TM4 and TM5 simultaneously. to the protomers at each interface. The missing The FES as a function of angles  A and  B, segments of the crystal structure of the B2AR for each of the two contact interfaces (TM4, (PDB ID: 2RH1 (Cherezov et al. 2007)) were and TM4/5) was reconstructed from the history generated using Rosetta (Wang et al. 2007), and a of the applied bias. We extracted a structure homology model of the human B1AR was built from the minima of each of the resulting 2- using the turkey B1AR crystal structure (PDB dimensional free energy surfaces, and proceeded ID: 2VT4 (Warne et al. 2008)). The system for to characterize the residues representing sym- simulation was constructed in an identical fashion metric contacts between the two protomers in to that for the DOP receptor simulations, with the minimum energy dimeric structures. The re- both the protein and the environment represented sults of the calculated relative stability of the in the CG MARTINI forcefield. two interfaces were in line with inferences from Periole and colleagues (2012) extended their cysteine cross-linking studies. The key residue long-timescale simulations of rhodopsin in 4.58 defining the TM4 interface was V181 , and (C20:1)2PC bilayer (Periole et al. 2007), to was found to have a C“-C“ separation of <7Å multiple repetitions on the 100’s of s timescale, after conversion to an atomistic resolution repre- and with a larger bilayer, encompassing 64 sentation in the OPLS-AA forcefield (Kaminski rhodopsin receptors, rather than the original et al. 2001; Jorgensen et al. 1996), minimiza- 16 used in the previous study. The dimerization tion and short equilibration (1 ns) simulation in events were clustered and suggested that the most GROMACS (Van der Spoel et al. 2005). For commonly occurring events involved contact the TM4/TM5 interface, the cysteine cross-linked interfaces also comprised of TM4, TM4/5 and residue, T2135.38, was found within a C“-C“ TM1/H8. distance of 11 Å between the opposing pro- Despite the long timescale and the multiple tomers. receptors, Periole and colleagues found that the 118 J.M. Johnston and M. Filizola number of individual dimerization events was in- individual methods, on a simple two-dimensional sufficient to provide reliable statistics from which toy system (Johnston et al. 2012). to calculate free energy estimates for the dimer- The similarities between the results of these ization of rhodopsin. Accordingly, they included two independent studies are striking. Most no- umbrella sampling simulations to establish five tably, both studies find that the most stable of PMFs for this dimerization event at different the interfaces investigated, by a significant mar- interfaces observed from the self-assembly sim- gin, was that involving TM1 and H8. It is dif- ulations, using the virtual bond algorithm (VBA) ficult to compare the exact nature of the inter- defined in Fig. 6.3b. Their VBA defined 8 re- faces, since firstly, the protomeric arrangement straints, but in general, the ones found to be most at this interface is not identical between the two useful were dihedral angles ¥1and ¥3 and dis- investigations, and secondly, the measurement tance, d. Periole and colleagues use these CVs to of the separation between the protomers is de- investigate PMFs for dimerization at symmetric fined by the interfacial distance in the study of interfaces involving TM4, TM5, TM4 and TM5, rhodopsin, while it is between the centers of and TM1 and H8. They also considered an asym- mass of the helical bundles in the study of the metric interface composed of TM4 interacting B1/B2ARs. Nevertheless, the similarity of the with TM6 on the adjacent protomer. We only result between the two independent investiga- investigated symmetric interfaces, comprised of tions is compelling, particularly when the small TM4 and TM1/H8 that have been routinely sug- contact surface area between the protomers is gested to have physiological relevance through considered. The TM4 interface was found to be biophysical experiments. It must also be noted the “weakest” or least stable for both rhodopsin that the arrangement of the helices in TM1/H8 is and the B1/B2ARs (no PMF was presented for similar but not identical between the two studies. TM4/5 for the B1/B2ARs). The PMF in the The combination of simulation length per win- rhodopsin simulation (Periole et al. 2012)shows dow and the number of windows was different remarkable qualitative similarity to that found between the two studies. Periole and colleagues for the same interface in our first umbrella sam- used a small number of windows, with a weak pling study of the DOP receptor (Provasi et al. force constant (500Ð5,000 kJ mol1 nm2 on r, 2010). Both PMFs show two minima, separated and 300 kJ mol1 nm2 on ¥1 and ¥3), and a long by a small transition state, with the minimum simulation length, reaching up to 20 s (effective at larger separation being slightly shallower than timescale) in some cases. In our work, we used that in which the protomers are tightly bound many shorter windows, centered at intervals of together. Periole and colleagues suggest that the r of only 0.05 nm. We also used a large force shallower minimum corresponds to a protomeric constant, k D 10,000 kJ mol1 nm2 for the har- arrangement in which a small number of lipids monic restraint to tether the measured value of the “lubricate” the contact surface between the two distance to the center of each window. The angles protomers. Both PMFs show a small barrier for  A and  B as defined in Fig. 6.3a were restricted dimerization, between the monomeric state and by steep harmonic potentials, as described in pre- the shallower minimum for the TM4 interface vious studies (Johnston et al. 2011; Provasi et al. for both rhodopsin and DOP receptor. No such 2010), but were explored using the metadynamics barrier is noted for TM1/H8 for rhodopsin or the techniques as described in (Johnston et al. 2011). B1/B2ARs, (no TM1/H8 interface has been in- The simulation length of each window was 1 s, vestigated yet for the DOP receptor). Periole and which was longer than in the previous umbrella colleagues suggest this barrier may be related to sampling study (in which it was only 300 ns) difficulties in delipidation of the surfaces between (Provasi et al. 2010), in order to permit a thorough the protomers. A significant difference between exploration of the angle range using the Gaus- the studies from these two labs is the choice of sian bias. This novel combination methodology lipid membrane mimetic for the environment of was validated by favorable comparison with the the proteins. This is particularly pertinent because 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 119

Periole and colleagues found that de-lipidation of this by performing their umbrella sampling some of the contact interfaces was very challeng- experiments for this interface both with and ing, even on the (effective) time scale of 20 s. without palmitoylation. They found that there Following on from their earlier studies (Periole was no significant difference between the et al. 2007), Periole and colleagues chose the PMF profiles calculated for both cases and longer-tailed lipid mimetic of those tested, i.e. thus speculate that the palmitoylation does not C(20:1)2 PC, since this was found to influence contribute to the strength of the contact interface dimerization the least. Atomistically, this would between TM1 and H8 on adjacent protomers. represent 1,2, di(D-cis-eicosanoyl)-sn-glycero-3- Both works offer putative physiological ex- phosphocholine. In contrast, we used a shorter- planations for the striking difference in relative tailed lipid membrane mimetic, representing a strength of interfaces involving different trans- POPC lipid, with a 10 % cholesterol component. membrane regions. We have proposed that the While it is acknowledged that the presence of strength of the TM1/H8 interface, and its sub- cholesterol might reduce the mobility of lipids sequently calculated “long” lifetime (of the or- and proteins in the membrane particularly in der of minutes), might be reconciled with the terms of exchange at the contact interface, plots sub-second lifetimes reported for the muscarinic depicting the residency time at the interface (in receptor (Hern et al. 2010) and the N-formyl terms of percentage of the trajectory spent within peptide receptor (Kasai et al. 2011), through a 15 Å of the interface helices) of all lipids and model wherein a stable dimer is formed at a con- cholesterol in every simulation we ran indicated tact interface involving TM1/H8, and this diffuses that 75 % of the cholesterol spent less than 60 % through the membrane interacting with other sta- of the trajectory within 15 Å of the interface and ble TM1/H8 dimers at other, weaker, interfaces. 75 % of lipids spent less than 20 % of their time The interactions at the other weaker interfaces are at the interface, while the median time at the shorter lived, and these may account for the ob- interface for both lipids and cholesterol was much servations of the single molecule experiments. A lower than 10 % of the trajectory. These statistics dimer/tetramer model of oligomerization would suggested adequate lipid exchange. also be comparable to the inferences the Scarlata In both studies, the contact interface between lab in collaboration with us derived (Golebiewska the two protomers at TM4 is more extensive than et al. 2011) from the diffusion rates and number for the TM1/H8 interface, and Periole and col- and brightness measurements from fluorescence leagues suggest that this means that the strength imaging of tagged MOP and DOP receptors in of the interaction is not dependent on the buried live cells after chronic morphine treatment. surface area between the protomers for these Periole and colleagues compare their results receptors, as is the case for soluble proteins. to 2-dimensional rows of dimers of rhodopsin, as Several post-translational modifications are suggested by AFM studies of rhodopsin in native known to occur for rhodopsin, and in particular, disk membranes (Fotiadis et al. 2003). To further palmitoylation at cysteine residues 322 and test this hypothesis, they built and simulated a 323, has been predicted to play a role in system of rows of TM1/H8 dimers interacting anchoring the receptor to its native membrane in a lipid-mediated or “lubricated” manner with environment and specific interactions between adjacent rows of receptors, matching the struc- the palmitoylate chain and the protein may tural information derived from the AFM study have a functional relevance for the dark state of (Fotiadis et al. 2003). Their system remained rhodopsin (Olausson et al. 2012). The location of stable after 16 s of simulation (Periole et al. these palmitoylated residues, towards the end 2012). of the amphipathic helix 8, means that they Other information from structural studies could feasibly form part of a contact interface supports the TM1/H8 interaction as a contact in- between two protomers involving TM1 and terface with putative relevance. Two-dimensional H8. Periole and colleagues have accounted for electron crystallography of rhodopsin (Schertler 120 J.M. Johnston and M. Filizola and Hargrave 1995) and of metarhodopsin I Ballesteros JA, Weinstein H (1995) Integrated methods (Ruprecht et al. 2004) and X-ray crystallography for the construction of three-dimensional models and of rhodopsin (Salom et al. 2006), opsin (Park computational probing of structure-function relations in G protein-coupled receptors. In: Sealfon SC, Conn and Schulten 2004) and metarhodopsin II (Choe PM (eds) Methods in neurosciences, vol 25. Aca- et al. 2011) have hinted at this interaction, as demic, San Diego, pp 366Ð428 have the most recent KOP (Wu et al. 2012) and Barak LS, Ferguson SSG, Zhang J, Martenson C, Meyer MOP (Manglik et al. 2012) receptor X-ray crystal T, Caron MG (1997) Internal trafficking and surface mobility of a functionally intact beta(2)-adrenergic structures, although the physiological relevance receptor-green fluorescent protein conjugate. Mol of these arrangements remain to be demonstrated, Pharmacol 51:177Ð184 in spite of recent experimental studies (Knepp et Barducci A, Bussi G, Parrinello M (2008) Well-tempered metadynamics: a smoothly converging and tunable al. 2012) leaning in the same direction. free-energy method. Phys Rev Lett 100:020603 Befort K, Tabbara L, Bausch S, Chavkin C, Evans C, Kieffer B (1996a) The conserved aspartate residue 6.5 Conclusions in the third putative transmembrane domain of the delta-opioid receptor is not the anionic counterpart Throughout this chapter, we have discussed the for cationic opiate binding but is a constituent of the receptor binding site. Mol Pharmacol 49:216Ð223 ways in which biased MD methods are increas- Befort K, Tabbara L, Kling D, Maigret B, Kieffer BL ingly becoming invaluable tools to investigate, at (1996b) Role of aromatic transmembrane residues of the detailed molecular level, key events in the the delta-opioid receptor in ligand recognition. J Biol lifetime of GPCRs that take place on timescales Chem 271:10161Ð10168 Bonomi M, Branduardi D, Bussi G, Camilloni C, Provasi in nature that are not routinely accessible to stan- D, Raiteri P, Donadio D, Marinelli F, Pietrucci F, dard approaches. As these techniques become Broglia RA et al (2009) PLUMED: a portable plugin more theoretically comprehensive and computa- for free-energy calculations with molecular dynamics. Comput Phys Commun 180:1961Ð1972 tional power becomes greater, the combination Bot G, Blake AD, Li SX, Reisine T (1998) Mutagenesis of these two factors will progressively reveal of a single amino acid in the rat mu-opioid recep- important mechanistic details of events in the tor discriminates ligand binding. J Neurochem 70: lives of GPCR signalosomes. 358Ð365 Bowers KJ, Chow E, Xu H, Dror RO, Eastwood MP, Gregersen BA, Klepeis KL, Kolossváry I, Moraes MA, Acknowledgements The authors’ work on GPCRs Sacerdoti FD, Salmon JK, Shan Y, Shaw DE (2006) is currently supported by NIH grants DA026434 and Scalable algorithms for molecular dynamics simula- DA034049. Their computations are run, in part, on tions on commodity clusters. In: Proceedings of the resources available through the Scientific Computing ACM/IEEE conference on supercomputing (SC06), Facility at Icahn School of Medicine at Mount Sinai, Tampa, 11Ð17 November 2006 and in part on advanced computing resources provided by Branduardi D, Gervasio FL, Parrinello M (2007) From A Texas Advanced Computing Center through MCB080077. to B in free energy space. J Chem Phys 126:054103 Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M (1983) Charmm Ð a pro- gram for macromolecular energy, minimization, and References dynamics calculations. J Comput Chem 4:187Ð217 Callaerts-Vegh Z, Evans KL, Dudekula N, Cuba D, Knoll Allen TW, Andersen OS, Roux B (2004) Energetics of ion BJ, Callaerts PF, Giles H, Shardonofsky FR, Bond conduction through the gramicidin channel. Proc Natl RA (2004) Effects of acute and chronic administration Acad Sci USA 101:117Ð122 of beta-adrenoceptor ligands on airway function in a Altenbach C, Kusnetzow AK, Ernst OP, Hofmann KP, murine model of asthma. Proc Natl Acad Sci USA Hubbell WL (2008) High-resolution distance mapping 101:4948Ð4953 in rhodopsin reveals the pattern of helix movement due Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen to activation. Proc Natl Acad Sci USA 105:7439Ð7444 SGF, Thian FS, Kobilka TS, Choi HJ, Kuhn P, Weis Baba M, Nishimura O, Kanzaki N, Okamoto M, Sawada WI, Kobilka BK et al (2007) High-resolution crystal H, Iizawa Y, Shiraishi M, Aramaki Y, Okonogi K, structure of an engineered human beta(2)-adrenergic Ogawa Y et al (1999) A small-molecule, nonpep- G protein-coupled receptor. Science 318:1258Ð1265 tide CCR5 antagonist with highly potent and selec- Choe HW, Kim YJ, Park JH, Morizumi T, Pai EF, Krauss tive anti-HIV-1 activity. Proc Natl Acad Sci USA N, Hofmann KP, Scheerer P, Ernst OP (2011) Crystal 96:5698Ð5703 structure of metarhodopsin II. Nature 471:651Ð655 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 121

Chow E, Xu H, Dror RO, Eastwood MP, Gregersen Golebiewska U, Johnston JM, Devi L, Filizola M, Scarlata BA, Klepeis JL, Kolossvary I, Moraes MA, Sacerdoti S (2011) Differential response to morphine of the FD, Salmon JK et al (2006) In: Scalable algorithms oligomeric state of mu-opioid in the presence of delta- for molecular dynamics simulation on commodity opioid receptors. Biochemistry 50:2829Ð2837 clusters Gonzalez A, Perez-Acle T, Pardo L, Deupi X (2011) Deupi X, Kobilka BK (2010) Energy landscapes as a tool Molecular basis of ligand dissociation in beta- to integrate GPCR structure, dynamics, and function. adrenergic receptors. PLoS One 6:e23815 Physiology (Bethesda) 25:293Ð303 Granier S, Manglik A, Kruse AC, Kobilka TS, Thian Deupi X, Edwards P, Singhal A, Nickle B, Oprian D, FS, Weis WI, Kobilka BK (2012) Structure of the Schertler G, Standfuss J (2012) Stabilized G pro- delta-opioid receptor bound to naltrindole. Nature tein binding site in the structure of constitutively 485:400Ð404 active metarhodopsin-II. Proc Natl Acad Sci USA Grossfield A, Pitman MC, Feller SE, Soubias O, Gawrisch 109:119Ð124 K (2008) Internal hydration increases during activation Dror RO, Pan AC, Arlow DH, Borhani DW, Maragakis P, of the G-protein-coupled receptor rhodopsin. J Mol Shan Y, Xu H, Shaw DE (2011) Pathway and mecha- Biol 381:478Ð486 nism of drug binding to G-protein-coupled receptors. Guo W, Shi L, Filizola M, Weinstein H, Javitch JA (2005) Proc Natl Acad Sci USA 108:13118Ð13123 Crosstalk in G protein-coupled receptors: changes at Elster L, Elling C, Heding A (2007) Bioluminescence the transmembrane homodimer interface determine resonance energy transfer as a screening assay: fo- activation. Proc Natl Acad Sci USA 102:17495Ð17500 cus on partial and inverse agonism. J Biomol Screen Guo W, Urizar E, Kralikova M, Mobarec JC, Shi L, 12:41Ð49 Filizola M, Javitch JA (2008) Dopamine D2 receptors Eswar N, Webb B, Marti-Renom MA, Madhusudhan form higher order oligomers at physiological expres- MS, Eramian D, Shen MY, Pieper U, Sali A (2007) sion levels. EMBO J 27:2293Ð2304 Comparative protein structure modeling using MOD- Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye ELLER. Curr Protoc Protein Sci, Chapter 2:Unit 2 9 LL, Pollard WT, Banks JL (2004) Glide: a new ap- Fiser A, Sali A (2003) MODELLER: generation and proach for rapid, accurate docking and scoring. 2. refinement of homology-based protein structure mod- Enrichment factors in database screening. J Med Chem els. In Macromolecular crystallography, Pt D. Method 47:1750Ð1759 Enzymol 374: 461-C Hegener O, Prenner L, Runkel F, Baader SL, Kappler J, Fishelovitch D, Shaik S, Wolfson HJ, Nussinov R (2009) Haberlein H (2004) Dynamics of beta(2)-adrenergic Theoretical characterization of substrate access/exit receptor Ð ligand complexes on living cells. Biochem- channels in the human cytochrome P450 3A4 enzyme: istry 43:6190Ð6199 involvement of phenylalanine residues in the gating Henis YI, Hekman M, Elson EL, Helmreich EJM mechanism. J Phys Chem B 113:13018Ð13025 (1982) Lateral motion of beta-receptors in membranes Fonseca JM, Lambert NA (2009) Instability of a class of cultured liver-cells. Proc Natl Acad Sci USA A G protein-coupled receptor oligomer interface. Mol 79:2907Ð2911 Pharmacol 75:1296Ð1299 Hern JA, Baig AH, Mashanov GI, Birdsall B, Corrie JET, Fotiadis D, Liang Y, Filipek S, Saperstein DA, Engel Lazareno S, Molloy JE, Birdsall NJM (2010) For- A, Palczewski K (2003) Atomic-force microscopy: mation and dissociation of M-1 muscarinic receptor rhodopsin dimers in native disc membranes. Nature dimers seen by total internal reflection fluorescence 421:127Ð128 imaging of single molecules. Proc Natl Acad Sci USA Frauenfelder H, Sligar SG, Wolynes PG (1991) The 107:2693Ð2698 energy landscapes and motions of proteins. Science Hildebrand PW, Scheerer P, Park JH, Choe HW, Piechnick 254:1598Ð1603 R, Ernst OP, Hofmann KP, Heck M (2009) A ligand Fribourg M, Moreno JL, Holloway T, Provasi D, Baki channel through the G protein coupled receptor opsin. L, Mahajan R, Park G, Adney SK, Hatcher C, Eltit PLoS One 4:e4382 JM et al (2011) Decoding the signaling of a GPCR Holst B, Nygaard R, Valentin-Hansen L, Bach A, Engel- heteromeric complex reveals a unifying mechanism of stoft MS, Petersen PS, Frimurer TM, Schwartz TW action of antipsychotic drugs. Cell 147:1011Ð1023 (2010) A conserved aromatic lock for the tryptophan Friesner RA, Banks JL, Murphy RB, Halgren TA, rotameric switch in TM-VI of seven-transmembrane Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley receptors. J Biol Chem 285:3973Ð3985 M, Perry JK et al (2004) Glide: a new approach Hopkinson HE, Latif ML, Hill SJ (2000) Non-competitive for rapid, accurate docking and scoring. 1. Method antagonism of beta(2)-agonist-mediated cyclic AMP and assessment of docking accuracy. J Med Chem accumulation by ICI 118551 in BC3H1 cells en- 47:1739Ð1749 dogenously expressing constitutively active beta(2)- Friesner RA, Murphy RB, Repasky MP, Frye LL, Green- adrenoceptors. Br J Pharmacol 131:124Ð130 wood JR, Halgren TA, Sanschagrin PC, Mainz DT Huber T, Sakmar TP (2011) Escaping the flatlands: new (2006) Extra precision glide: docking and scoring approaches for studying the dynamic assembly and incorporating a model of hydrophobic enclosure for activation of GPCR signaling complexes. Trends Phar- protein-ligand complexes. J Med Chem 49:6177Ð6196 macol Sci 32:410Ð419 122 J.M. Johnston and M. Filizola

Hurst DP, Grossfield A, Lynch DL, Feller S, Romo Knepp AM, Periole X, Marrink SJ, Sakmar TP, Huber TD, Gawrisch K, Pitman MC, Reggio PH (2010) A T (2012) Rhodopsin forms a dimer with cytoplasmic lipid pathway for ligand binding is necessary for a helix 8 contacts in native membranes. Biochemistry cannabinoid G protein-coupled receptor. J Biol Chem 51:1819Ð1821 285:17954Ð17964 Knierim B, Hofmann KP, Ernst OP, Hubbell WL (2007) Im WP, Lee MS, Brooks CL (2003) Generalized born Sequence of late molecular events in the activa- model with a simple smoothing function. J Comput tion of rhodopsin. Proc Natl Acad Sci USA 104: Chem 24:1691Ð1702 20290Ð20295 Im W, Feig M, Brooks CL (2004) An implicit membrane Krystek SR Jr, Kimura SR, Tebben AJ (2006) Modeling generalized born theory for the study of structure, sta- and active site refinement for G protein-coupled bility, and interactions of membrane proteins (vol 85, receptors: application to the beta-2 adrenergic pg 2900, 2003). Biophys J 86:3330Ð3330 receptor. J Comput Aided Mol Des 20:463Ð470 Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman Kumar S, Bouzida D, Swendsen RH, Kollman PA, RG (2012) ZINC: a free tool to discover chemistry for Rosenberg JM (1992) The weighted histogram biology. J Chem Inf Model 52:1757Ð1768 analysis method for free-energy calculations on Isberg V, Balle T, Sander T, Jorgensen FS, Gloriam DE biomolecules.1. The method. J Comput Chem (2011) G protein- and agonist-bound serotonin 5- 13:1011Ð1021 HT2A receptor model activated by steered molecular Kumar S, Rosenberg JM, Bouzida D, Swendsen RH, Koll- dynamics simulations. J Chem Inf Model 51:315Ð325 man PA (1995) Multidimensional free-energy calcula- Isralewitz B, Izrailev S, Schulten K (1997) Binding path- tions using the weighted histogram analysis method. J way of retinal to bacterio-opsin: a prediction by molec- Comput Chem 16:1339Ð1350 ular dynamics simulations. Biophys J 73:2972Ð2979 Kusumi A, Hyde JS (1982) Spin-label saturation-transfer Jensen MO, Park S, Tajkhorshid E, Schulten K (2002) electron-spin resonance detection of transient associ- Energetics of glycerol conduction through aquaglyc- ation of rhodopsin in reconstituted membranes. Bio- eroporin GlpF. Proc Natl Acad Sci USA 99:6731Ð6736 chemistry 21:5978Ð5983 Johnston JM, Aburi M, Provasi D, Bortolato A, Urizar Laio A, Parrinello M (2002) Escaping free-energy min- E, Lambert NA, Javitch JA, Filizola M (2011) ima. Proc Natl Acad Sci USA 99:12562Ð12566 Making structural sense of dimerization interfaces Lambert NA (2010) Gpcr dimers fall apart. Sci Signal of delta opioid receptor homodimers. Biochemistry 3:pe12 50:1682Ð1690 Lambright DG, Sondek J, Bohm A, Skiba NP, Hamm Johnston JM, Wang H, Provasi D, Filizola M (2012) HE, Sigler PB (1996) The 2.0 A crystal structure of Assessing the relative stability of dimer interfaces a heterotrimeric G protein. Nature 379:311Ð319 in g protein-coupled receptors. PLoS Comput Biol Lebon G, Warne T, Edwards PC, Bennett K, Langmead 8:e1002649 CJ, Leslie AG, Tate CG (2011) Agonist-bound adeno- Jorgensen WL, Maxwell DS, TiradoRives J (1996) Devel- sine A2A receptor structures reveal common features opment and testing of the OPLS all-atom force field of GPCR activation. Nature 474:521Ð525 on conformational energetics and properties of organic Leone V, Marinelli F, Carloni P, Parrinello M (2010) liquids. J Am Chem Soc 118:11225Ð11236 Targeting biomolecular flexibility with metadynamics. Kabsch W, Sander C (1983) Dictionary of pro- Curr Opin Struct Biol 20:148Ð154 tein secondary structure Ð pattern-recognition of Li JG, Chen CG, Yin JL, Rice K, Zhang Y, Matecka hydrogen-bonded and geometrical features. Biopoly- D, de Riel JK, DesJarlais RL, Liu-Chen LY (1999) mers 22:2577Ð2637 Asp147 in the third transmembrane helix of the rat mu Kaminski GA, Friesner RA, Tirado-Rives J, Jorgensen opioid receptor forms ion-pairing with morphine and WL (2001) Evaluation and reparametrization of the naltrexone. Life Sci 65:175Ð185 OPLS-AA force field for proteins via comparison with Lopez CA, Rzepiela AJ, de Vries AH, Dijkhuizen L, accurate quantum chemical calculations on peptides. Hunenberger PH, Marrink SJ (2009) Martini coarse- J Phys Chem B 105:6474Ð6487 grained force field: extension to carbohydrates. J Chem Kasai RS, Suzuki KG, Prossnitz ER, Koyama-Honda Theor Comput 5:3195Ð3210 I, Nakada C, Fujiwara TK, Kusumi A (2011) Full Ludemann SK, Lounnas V, Wade RC (2000) How do characterization of GPCR monomer-dimer dynamic substrates enter and products exit the buried active site equilibrium by single molecule imaging. J Cell Biol of cytochrome P450cam? 1. Random expulsion molec- 192:463Ð480 ular dynamics investigation of ligand access channels Katritch V, Cherezov V, Stevens RC (2012) Structure- and mechanisms. J Mol Biol 303:797Ð811 function of the G protein-coupled receptor superfam- MacKerell AD, Bashford D, Bellott M, Dunbrack RL, ily. Annu Rev Pharmacol Toxicol 53:531Ð556 Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S Kimura SR, Tebben AJ, Langley DR (2008) Expanding et al (1998) All-atom empirical potential for molecular GPCR homology model binding sites via a balloon modeling and dynamics studies of proteins. J Phys potential: a molecular dynamics refinement approach. Chem B 102:3586Ð3616 Proteins 71:1919Ð1929 Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen Kirkwood JG (1935) Statistical mechanics of fluid mix- JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, tures. J Chem Phys 3:300Ð313 Granier S (2012) Crystal structure of the micro-opioid 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 123

receptor bound to a morphinan antagonist. Nature Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst OP 485:321Ð326 (2008) Crystal structure of the ligand-free G-protein- Mansour A, Taylor LP, Fine JL, Thompson RC, Hov- coupled receptor opsin. Nature 454:183Ð187 ersten MT, Mosberg HI, Watson SJ, Akil H (1997) Periole X, Huber T, Marrink SJ, Sakmar TP (2007) Key residues defining the mu-opioid receptor bind- G protein-coupled receptors self-assemble in dynam- ing pocket: a site-directed mutagenesis study. J Neu- ics simulations of model bilayers. J Am Chem Soc rochem 68:344Ð353 129:10126Ð10132 Marchi M, Ballone P (1999) Adiabatic bias molecular Periole X, Cavalli M, Marrink SJ, Ceruso MA (2009) dynamics: a method to navigate the conformational Combining an elastic network with a coarse-grained space of complex molecular systems. J Chem Phys molecular force field: structure, dynamics, and in- 110:3697Ð3702 termolecular recognition. J Chem Theor Comput Marrink SJ, de Vries AH, Mark AE (2004) Coarse grained 5:2531Ð2543 model for semiquantitative lipid simulations. J Phys Periole X, Knepp AM, Sakmar TP, Marrink SJ, Huber T Chem B 108:750Ð760 (2012) Structural determinants of the supramolecular Marrink SJ, Risselada HJ, Yefimov S, Tieleman DP, de organization of G protein-coupled receptors in bilay- Vries AH (2007) The MARTINI force field: coarse ers. J Am Chem Soc 134:10959Ð10965 grained model for biomolecular simulations. J Phys Petrek M, Otyepka M, Banas P, Kosinova P, Koca J, Chem B 111:7812Ð7824 Damborsky J (2006) CAVER: a new tool to explore Marrink SJ, Fuhrmans M, Risselada HJ, Periole X (2008) routes from protein clefts, pockets and cavities. BMC The MARTINI forcefield. In: Voth G (ed) Coarse Bioinformatics 7:316 graining of condensed phase and biomolecular sys- Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid tems. CRC Press, Boca Raton E, Villa E, Chipot C, Skeel RD, Kale L, Schulten K Marrink SJ, Periole X, Tieleman DP, de Vries AH (2005) Scalable molecular dynamics with NAMD. J (2009) Comment on “On using a too large integra- Comput Chem 26:1781Ð1802 tion time step in molecular dynamics simulations of Poo MM, Cone RA (1973) Lateral diffusion of rhodopsin coarse-grained molecular models” by M. Winger, D. in the visual receptor membrane. J Supramol Struct Trzesniak, R. Baron and W. F. van Gunsteren, Phys 1:354 Chem Chem Phys 11:1934. Phys Chem Chem Phys Provasi D, Filizola M (2010) Putative active states of 12:2254Ð2256; author reply 2257Ð2258 a prototypic G-protein-coupled receptor from biased Milligan G (2009) G protein-coupled receptor hetero- molecular dynamics. Biophys J 98:2347Ð2355 dimerization: contribution to pharmacology and func- Provasi D, Bortolato A, Filizola M (2009) Explor- tion. Br J Pharmacol 158:5Ð14 ing molecular mechanisms of ligand recognition by Monticelli L, Kandasamy SK, Periole X, Larson RG, opioid receptors with metadynamics. Biochemistry Tieleman DP, Marrink SJ (2008) The MARTINI 48:10020Ð10029 coarse-grained force field: extension to proteins. Provasi D, Johnston JM, Filizola M (2010) Lessons from J Chem Theor Comput 4:819Ð834 free energy simulations of delta-opioid receptor ho- Morris GM, Goodsell DS, Huey R, Olson AJ (1996) Dis- modimers involving the fourth transmembrane helix. tributed automated docking of flexible ligands to pro- Biochemistry 49:6771Ð6776 teins: parallel applications of AutoDock 2.4. J Comput Provasi D, Artacho MC, Negri A, Mobarec JC, Filizola M Aided Mol Des 10:293Ð304 (2011) Ligand-induced modulation of the free-energy Okada T, Fujiyoshi Y, Silow M, Navarro J, Landau landscape of G protein-coupled receptors explored EM, Shichida Y (2002) Functional role of inter- by adaptive biasing techniques. PLoS Comput Biol nal water molecules in rhodopsin revealed by x- 7:e1002193 ray crystallography. Proc Natl Acad Sci USA 99: Raiteri P, Laio A, Gervasio FL, Micheletti C, Parrinello M 5982Ð5987 (2006) Efficient reconstruction of complex free energy Okada T, Sugihara M, Bondar AN, Elstner M, Entel P, landscapes by multiple walkers metadynamics. J Phys Buss V (2004) The retinal conformation and its envi- Chem B 110:3533Ð3539 ronment in rhodopsin in light of a new 2.2 Angstrom Rasmussen SGF, Choi HJ, Fung JJ, Pardon E, Casarosa crystal structure. J Mol Biol 342:571Ð583 P, Chae PS, DeVree BT, Rosenbaum DM, Thian FS, Olausson BE, Grossfield A, Pitman MC, Brown MF, Kobilka TS et al (2011a) Structure of a nanobody- Feller SE, Vogel A (2012) Molecular dynamics simu- stabilized active state of the beta(2) adrenoceptor. lations reveal specific interactions of post-translational Nature 469:175Ð180 palmitoyl modifications with rhodopsin in membranes. Rasmussen SG, DeVree BT, Zou Y, Kruse AC, Chung J Am Chem Soc 134:4324Ð4331 KY, Kobilka TS, Thian FS, Chae PS, Pardon E, Park S, Schulten K (2004) Calculating potentials of mean Calinski D et al (2011b) Crystal structure of the force from steered molecular dynamics simulations. beta2 adrenergic receptor-Gs protein complex. Nature J Chem Phys 120:5946Ð5961 477:549Ð555 Park S, Khalili-Araghi F, Tajkhorshid E, Schulten K Rives ML, Rossillo M, Liu-Chen LY, Javitch JA (2012) 60- (2003) Free energy calculation from steered molecu- Guanidinonaltrindole (60-GNTI) is a G protein-based lar dynamics simulations using Jarzynski’s equality. kappa-opioid receptor agonist that inhibits arrestin J Chem Phys 119:3559 recruitment. J Biol Chem 287:27050Ð27054 124 J.M. Johnston and M. Filizola

Rosenbaum DM, Zhang C, Lyons JA, Holl R, Aragao D, activation of mu-opioid receptors mutated at a his- Arlow DH, Rasmussen SGF, Choi HJ, DeVree BT, tidine residue lining the opioid binding cavity. Mol Sunahara RK et al (2011) Structure and function of Pharmacol 52:983Ð992 an irreversible agonist-beta(2) adrenoceptor complex. Strader CD, Sigal IS, Candelore MR, Rands E, Hill WS, Nature 469:236Ð240 Dixon RA (1988) Conserved aspartic acid residues Roux B (1995) The calculation of the potential of mean 79 and 113 of the beta-adrenergic receptor have force using computer-simulations. Comput Phys Com- different roles in receptor function. J Biol Chem mun 91:275Ð282 263:10267Ð10271 Roux B (1999) Statistical mechanical equilibrium theory Strader CD, Candelore MR, Hill WS, Sigal IS, Dixon RA of selective ion channels. Biophys J 77:139Ð153 (1989) Identification of two serine residues involved Ruprecht JJ, Mielke T, Vogel R, Villa C, Schertler GF in agonist activation of the beta-adrenergic receptor. J (2004) Electron crystallography reveals the structure Biol Chem 264:13572Ð13578 of metarhodopsin I. EMBO J 23:3609Ð3620 Surratt CK, Johnson PS, Moriwaki A, Seidleck BK, Ryba NJP, Marsh D (1992) Protein rotational diffu- Blaschak CJ, Wang JB, Uhl GR (1994) Mu opiate sion and lipid protein interactions in recombinants of receptor Ð charged transmembrane domain amino- bovine rhodopsin with saturated diacylphosphatidyl- acids are critical for agonist recognition and intrinsic cholines of different chain lengths studied by conven- activity. J Biol Chem 269:20548Ð20553 tional and saturation-transfer electron-spin-resonance. Taddese B, Simpson LM, Wall ID, Blaney FE, Kid- Biochemistry 31:7511Ð7518 ley NJ, Clark HS, Smith RE, Upton GJ, Gouldson Salom D, Lodowski DT, Stenkamp RE, Le Trong I, PR, Psaroudakis G et al (2012) G-protein-coupled Golczak M, Jastrzebska B, Harris T, Ballesteros JA, receptor dynamics: dimerization and activation mod- Palczewski K (2006) Crystal structure of a photoac- els compared with experiment. Biochem Soc Trans tivated deprotonated intermediate of rhodopsin. Proc 40:394Ð399 Natl Acad Sci USA 103:16123Ð16128 Tesmer JJ, Sunahara RK, Gilman AG, Sprang SR (1997) Sauliere-Nzeh AN, Millot C, Corbani M, Mazeres S, Crystal structure of the catalytic domains of adenylyl Lopez A, Salome L (2010) Agonist-selective dynamic cyclase in a complex with Gsalpha.GTPgammaS. Sci- compartmentalization of human Mu opioid receptor as ence 278:1907Ð1916 revealed by resolutive FRAP analysis. J Biol Chem Thompson AA, Liu W, Chun E, Katritch V, Wu H, 285:14514Ð14520 Vardy E, Huang XP, Trapella C, Guerrini R, Calo G Scheerer P, Park JH, Hildebrand PW, Kim YJ, Krauss et al (2012) Structure of the nociceptin/orphanin FQ N, Choe HW, Hofmann KP, Ernst OP (2008) Crystal receptor in complex with a peptide mimetic. Nature structure of opsin in its G-protein-interacting confor- 485:395Ð399 mation. Nature 455:497Ð502 Toll L, Berzetei-Gurske IP, Polgar WE, Brandt SR, Schertler GF, Hargrave PA (1995) Projection structure of Adapa ID, Rodriguez L, Schwartz RW, Haggart D, frog rhodopsin in two crystal forms. Proc Natl Acad O’Brien A, White A et al (1998) Standard binding Sci USA 92:11578Ð11582 and functional assays related to medications develop- Schlitter J, Engels M, Kruger P (1994) Targeted ment division testing for potential cocaine and opiate molecular-dynamics Ð a New approach for searching narcotic treatment medications. NIDA Res Monogr pathways of conformational transitions. J Mol Graph 178:440Ð466 12:84Ð89 Torrie GM, Valleau JP (1974) Monte-Carlo free-energy Selvam B, Wereszczynski J, Tikhonova IG (2012) Com- estimates using Non-Boltzmann sampling Ð applica- parison of dynamics of extracellular accesses to the tion to subcritical Lennard-Jones fluid. Chem Phys beta(1) and beta(2) adrenoceptors binding sites un- Lett 28:578Ð581 covers the potential of kinetic basis of antagonist Tozzini V (2005) Coarse-grained models for proteins. selectivity. Chem Biol Drug Des 80:215Ð226 Curr Opin Struct Biol 15:144Ð150 Shi L, Liapakis G, Xu R, Guarnieri F, Ballesteros JA, Tozzini V (2010) Multiscale modeling of proteins. Ac- Javitch JA (2002) Beta2 adrenergic receptor activa- counts Chem Res 43:220Ð230 tion. Modulation of the proline kink in transmem- Vaidehi N, Kenakin T (2010) The role of conformational brane 6 by a rotamer toggle switch. J Biol Chem ensembles of seven transmembrane receptors in func- 277:40989Ð40996 tional selectivity. Curr Opin Pharmacol 10:775Ð781 Shiota T (1999) Cyclic amine derivatives and their use as Van der Spoel D, Lindahl E, Hess B, Groenhof G, Mark drugs. US Patent 1999 AE, Berendsen HJC (2005) GROMACS: fast, flexible, Shirts MR, Chodera JD (2008) Statistically optimal anal- and free. J Comput Chem 26:1701Ð1718 ysis of samples from multiple equilibrium states. J Vogel R, Ruprecht J, Villa C, Mielke T, Schertler GF, Chem Phys 129:124105 Siebert F (2004) Rhodopsin photoproducts in 2D crys- Simpson LM, Wall ID, Blaney FE, Reynolds CA (2011) tals. J Mol Biol 338:597Ð609 Modeling GPCR active state conformations: the Wall MA, Coleman DE, Lee E, Iniguez-Lluhi JA, Posner beta(2)-adrenergic receptor. Proteins 79:1441Ð1457 BA, Gilman AG, Sprang SR (1995) The structure of Spivak CE, Beglan CL, Seidleck BK, Hirshbein LD, the G protein heterotrimer Gi alpha 1 beta 1 gamma 2. Blaschak CJ, Uhl GR, Surratt CK (1997) Naloxone Cell 83:1047Ð1058 6 Beyond Standard Molecular Dynamics: Investigating the Molecular Mechanisms of G... 125

Wang T, Duan Y (2007) Chromophore channeling in the Wu H, Wacker D, Mileni M, Katritch V, Han GW, Vardy G-protein coupled receptor rhodopsin. J Am Chem E, Liu W, Thompson AA, Huang XP, Carroll FI et al Soc 129:6970Ð6971 (2012) Structure of the human kappa-opioid receptor Wang T, Duan Y (2009) Ligand entry and exit path- in complex with JDTic. Nature 485:327Ð332 ways in the beta(2)-adrenergic receptor. J Mol Biol Xu F, Wu H, Katritch V, Han GW, Jacobson KA, Gao 392:1102Ð1115 ZG, Cherezov V, Stevens RC (2011) Structure of an Wang C, Bradley P, Baker D (2007) Protein-protein dock- agonist-bound human A2A adenosine receptor. Sci- ing with backbone flexibility. J Mol Biol 373:503Ð519 ence 332:322Ð327 Warne T, Serrano-Vega MJ, Baker JG, Moukhamet- Yang LJ, Zou J, Xie HZ, Li LL, Wei YQ, Yang zianov R, Edwards PC, Henderson R, Leslie AGW, SY (2009) Steered molecular dynamics simulations Tate CG, Schertler GFX (2008) Structure of a reveal the likelier dissociation pathway of imatinib beta(1)-adrenergic G-protein-coupled receptor. Nature from its targeting kinases c-Kit and Abl. PLoS One 454:486Ð491 4:e8470 Winger M, Trzesniak D, Baron R, van Gunsteren Zhang C, Srinivasan Y, Arlow DH, Fung JJ, Palmer WF (2009) On using a too large integration time D, Zheng Y, Green HF, Pandey A, Dror RO, step in molecular dynamics simulations of coarse- Shaw DE et al (2012) High-resolution crystal struc- grained molecular models. Phys Chem Chem Phys ture of human protease-activated receptor 1. Nature 11:1934Ð1941 492:387Ð392 Part III GPCR-Focused Rational Design and Mathematical Modeling From Three-Dimensional GPCR Structure to Rational Ligand 7 Discovery

Albert J. Kooistra, Rob Leurs, Iwan J.P. de Esch, and Chris de Graaf

Abstract This chapter will focus on G protein-coupled receptor structure-based virtual screening and ligand design. A generic virtual screening workflow and its individual elements will be introduced, covering amongst others the use of experimental data to steer the virtual screening process, ligand binding mode prediction, virtual screening for novel ligands, and rational structure-based virtual screening hit optimization. An overview of recent successful structure-based ligand discovery and design studies shows that receptor models, despite structural inaccuracies, can be efficiently used to find novel ligands for GPCRs. Moreover, the recently solved GPCR crystal structures have further increased the opportunities in structure-based ligand discovery for this pharmaceutically important protein family. The current chapter will discuss several challenges in rational ligand discovery based on GPCR structures including: (i) structure-based identification of ligands with specific effects on GPCR mediated signaling pathways, and (ii) virtual screening and structure-based optimization of fragment-like molecules.

Keywords Drug design ¥ In Silico methods ¥ Virtual screening ¥ Docking ¥ Crystal structures

7.1 From In Crystallo to In Silico: GPCR Structures for Rational Ligand Discovery

A.J. Kooistra ¥ R. Leurs ¥ I.J.P. de Esch ¥ C. de Graaf () Knowledge of the three-dimensional structure Division of Medicinal Chemistry, Faculty of Sciences, of G protein-coupled receptors (GPCRs) pro- Amsterdam Institute for Molecules, Medicines and vides important insights into receptor function Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands and receptor-ligand interactions. This informa- e-mail: [email protected] tion is key for the in silico rational discovery

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 129 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__7, © Springer ScienceCBusiness Media Dordrecht 2014 130 A.J. Kooistra et al.

a b Overlay of GPCR structures Overlay of the co-crystallized ligands

AA2AR ADRB2 CXCR4 DRD3 6.55 HRH1 5.42 2.63

7.39

5.46 3.32 c 2D structures of the co-crystallized ligands

O OH H NH NH N N N O S N O N N N N N N H N O HO N O H S Cl N NH2 OH ZM241385 (S)-carazolol IT1t (S)-eticlopride Doxepin (AA2AR) (ADRB2) (CXCR4) (DRD3) (HRH1)

Fig. 7.1 (a) Structural alignment of multiple GPCR (Cherezov et al. 2010)) is depicted as a light yellow rib- crystal structures (PDB-codes AA2AR:3EML (Jaakola bon. Residue positions are indicated using the Ballesteros et al. 2008), ADRB2:2RH1 (Cherezov et al. 2007), Weinstein numbering scheme (Ballesteros and Weinstein CXCR4:3ODU (Wu et al. 2010), DRD3:3PBL (Chien 1995) and both the ligands and selected residues are et al. 2010), HRH1:3RZE (Shimamura et al. 2011)) high- colored according to the coloring scheme depicted in a. light the highly conserved TM fold (b)Close-upoftheco- (c) 2D structures of the co-crystallized ligands depicted crystallized ligands within the conserved binding site. The in b backbone of carazolol bound ADRB2 (PDB-code: 2RH1 of new bioactive molecules that can target this 2012), the lipid S1PR1 receptor (Hanson et al. family of pharmaceutically relevant drug targets 2012), the • (OPRD) (Granier et al. 2012), (Congreve et al. 2011; de Graaf and Rognan (OPRM) (Manglik et al. 2012), › (OPRK) (Wu 2009). After the first GPCR crystal structure et al. 2012), and nociceptin (OPRX) (Thompson of bovine rhodopsin in 2000 (Palczewski et al. et al. 2012) opioid receptors, and the protease- 2000), the first crystal structures of druggable activated receptor 1 (PAR1) (White et al. 2012). GPCRs have been solved only in the past 6 years These GPCR crystal structures (Salon et al. 2011; (Salon et al. 2011), including beta1 (ADRB1) Katritch et al. 2012) offer unique opportunities to (Warne et al. 2008) and beta2 (ADRB2) (Chere- push the limits of structure-based rational ligand zov et al. 2007) adrenergic receptors, dopamine discovery and design (Congreve et al. 2011; Sa- D3 receptor (DRD3) (Chien et al. 2010), his- lon et al. 2011), and offer higher resolution tem- tamine H1 receptor (HRH1) (Shimamura et al. plates for modeling the structures of GPCRs for 2011), and muscarinic M2 (Haga et al. 2012) and which crystal structures have not yet been solved M3 (Kruse et al. 2012) receptors, the adenosine (Katritch et al. 2012; Stevens et al. 2012;Ro- A2A receptor (AA2AR) (Jaakola et al. 2008), the driguez and Gutierrez-de-Teran 2013). It should chemokine CXCR4 receptor (Wu et al. 2010), however be noted that modeling of GPCRs with the neurotensin 1 receptor (NTR1) (White et al. low homology to the currently available GPCR 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 131 crystal structures (e.g., class B (Miller et al. small molecules as well as peptides (Fig. 7.1c) 2011; Vohra et al. 2013; de Graaf et al. 2011a) (Surgand et al. 2006; Lagerstrom and Schioth and class C GPCRs (Petrel et al. 2004;Mal- 2008). herbe et al. 2006; Gloriam et al. 2011)) still remains a difficult task in which experimental data are of utmost importance to restrict the 7.2 Hierarchical Workflow for number of possible models (de Graaf and Rognan GPCR Structure-Based 2009; Roumen et al. 2011). The challenges of Ligand Discovery GPCR modeling has been for example demon- strated in recent community wide competitions to This section will describe the steps along a hi- predict GPCR crystal structures (GPCR DOCK erarchical virtual screening workflow applied in 2008 (Katritch et al. 2010a; Michino and Abola many GPCR virtual screening studies (Fig. 7.2). 2009), and GPCR DOCK 2010 (Roumen et al. Although the different steps can be generally 2011; Kufareva et al. 2011)) and GPCR mod- applied to other protein targets, it should be noted eling methods have been described in several that incorporation of target specific information reviews (de Graaf and Rognan 2009; de Graaf can highly increase the chance on success. In et al. 2011a; Cavasotto 2011; Kooistra et al. step 1 the initial compound library (Moura Bar- 2013). The current chapter gives an overview bosa and Del Rio 2012) can be filtered to re- of the challenges and opportunities in structure- move undesirable compounds that contain chem- based discovery of GPCR ligands and focuses ical moieties that can interfere with experimental in particular on the developments in the pe- validation assays by forming aggregates, chemi- riod over the past 5 years in which the first cally reacting with proteins or directly interfere druggable GPCR crystal structures have been in assay signaling (Baell and Holloway 2010). used for rational GPCR ligand design. Different Furthermore one can decide to exclude com- steps along the virtual screening workflow will pounds that contain scaffolds associated with be discussed in Sect. 7.2. An overview of sev- toxicity (Muegge 2003) or that have poor oral eral successful structure-based ligand discovery bioavailability (Lipinski et al. 2001). In most studies in Sect. 7.3 shows that GPCR models, reported GPCR-based virtual screening studies, despite structural inaccuracies, can be efficiently such pre-filters are applied to construct the chem- used to find novel ligands for GPCRs. Moreover, ical library to be screened (Table 7.1). Additional the recently solved GPCR crystal structures have filters, based on the properties (e.g., molecu- further increased the opportunities in structure- lar weight, number of rotatable bonds, number based discovery of small molecule ligands for of rings, hydrogen bond donor/acceptor counts, this pharmaceutically important protein family. number of positively/negatively charged atoms, All crystal structures clearly show the conserved etc.) of a set of known actives (step 1 of Fig. 7.2), heptahelical fold in the TM domain (Fig. 7.1a) as are applied in most GPCR VS studies as well well as for the intracellular helix 8. (Table 7.1) to obtain a more focused chemical The loops on the other hand differ highly be- database. Substructures (Varady et al. 2003;Ev- tween GPCRs in sequence composition, length, ers and Klabunde 2005; Evers and Klebe 2004; and (secondary) structure. Apart from that they Kellenberger et al. 2007), chemical similarity de- have also shown to be harder to resolve in crystal scriptors (Evers and Klebe 2004; Cavasotto et al. structures as the electron density is distorted in 2008; Tikhonova et al. 2008) and 3D-shape sim- this region, as is the case for ECL2 in, for exam- ilarity or pharmacophore models (de Graaf et al. ple, the HRH1 structure as well as some AA2AR 2011a; Varady et al. 2003; Evers and Klabunde crystal structures. Within the conserved 7 TM 2005; Evers and Klebe 2004; Cavasotto et al. helices and below the extracellular loops lies 2008) (Table 7.2) derived from known ligands the conserved orthosteric binding site of GPCRs can be used to narrow down the number of com- (Fig. 7.1b) which can bind a plethora of different pounds to be handled during conformational sam- 132 A.J. Kooistra et al.

Fig. 7.2 Structure-based Initial cpd database virtual screening (SBVS) workflow (see Sect. 7.2 for description of the Focused database creation individual steps, Sect. 7.3 and Tables 7.1 and 7.2 for Reactive groups Filtering details of SBVS runs 2D/3D similarity against specific GPCRs, and Sect. 7.4 for the Novelty filter 3D pharmacophore discussion of step 6 observed in reported SBVS studies) Conformational Sampling

Receptor-ligand mismatch Known ligand(s) & Ranking Postprocessing

2D/3D similarity

3D pharmacophore

Visual inspection Final selection

Experimental validation Design

SB optimization pling (step 2) even further. These methods can ferent automated docking programs and scor- also be used inversely in order to limit the focused ing functions based on different physicochemical database to novel ligands, i.e. small molecules approximations are available (Moitessier et al. that have a low similarity (as assessed by the ap- 2008) (Sect. 7.2). In some GPCR SBVS studies plied method) compared to known ligands (avail- docking based screening simulations have been able in target annotated chemical databases like guided by pharmacophore constraints (Evers and WomBat (Olah et al. 2007), BindingDB (Liu Klabunde 2005; Evers and Klebe 2004;Sirci et al. 2007) and ChEMBL (Gaulton et al. 2012)). et al. 2012). Pharmacophore models and/or exclu- However, more often such a filter to identify sion constraints derived from structural models of novel scaffolds is performed (although computa- receptor-ligand complexes (or the receptor alone) tionally more expensive) after docking the com- can be used as an alternative structure-based pounds in the database. virtual screening approach to molecular docking In step 2 the chemical database is automati- simulations, as demonstrated in several success- cally docked in the receptor model. Many dif- ful GPCR SBVS campaigns to discover ligands 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 133 of CA3R (Klabunde et al. 2009), CNR2 (Salo prioritize scaffolds rather than individual com- et al. 2005), FFAR1 (Tikhonova et al. 2008), pounds in screening ranking lists. Sampling a few FPR1R (Edwards et al. 2005), MCHR1 (Cava- representative analogues for each scaffold usually sotto et al. 2008), HRH3 (Sirci et al. 2012). enables a selection of chemically dissimilar com- In step 3, the docking poses are post-processed pounds for biological evaluation (Kellenberger and ranked. Many SBVS investigations have et al. 2007). Finally, the selected docking poses only employed docking scoring functions to rank and compounds should be visually inspected in docking poses, but more and more structure- step 4 for the ultimate selection: no algorithm yet based in silico screening protocols include outperforms the brain of an experienced modeler additional filters to post-process docking results for such a task. (Sect. 7.2), because the scoring accuracy of Although this chapter mainly focuses on docking-scoring combinations is very much SBVS studies (and primarily docking-based dependent on physicochemical details of target- virtual screening campaigns) it should be noted, ligand interactions and fine details of the protein however, that there are alternative (or com- structure. Strategies to overcome these problems plementary) screening methods that have been are: (i) the use of consensus scoring strategies successfully applied for the discovery of novel (Evers and Klabunde 2005; Bissantz et al. 2003; ligands (e.g. ligand-based similarity (2D and Chen et al. 2007; Kiss et al. 2008), (ii) topological 3D), and receptor/ligand-based pharmacophore filters to filter out poses exhibiting steric or searches (Kellenberger et al. 2007; Cavasotto electrostatic mismatches between the ligand and et al. 2008; Tikhonova et al. 2008; Klabunde the binding site (de Graaf et al. 2011a), (iii) et al. 2009; Salo et al. 2005; Edwards et al. receptor-ligand interaction post-processing filters 2005)). For 3D similarity and receptor-based (Tables 7.1 and 7.2) (Carlsson et al. 2010, 2011; pharmacophore searches the receptor-bound Langmead et al. 2012) and/or (iv) receptor-ligand conformation of reference compounds can interaction fingerprint (IFP) scoring methods to be derived from docking simulations in the select and rank poses based on binding mode receptor model. The obtained receptor-ligand similarity with reference ligand poses (de Graaf complexes are used to derive the information and Rognan 2008, 2009; de Graaf et al. 2008, for the 3D-similarity searches or the creation 2011a, b) (Sect. 7.2,Figs.7.4 and 7.5). of pharmacophore models. Alternatively, pure Before selecting the final compounds (step receptor-based pharmacophore models, extracted 4), the novelty of the discovered hits can be from ligand-receptor interaction hot spots in assessed by 2D or 3D similarity searches against the binding pocket, can be used even in the previously known ligands (as discussed earlier), absence of a ligand (Sanders et al. 2011, 2012; as performed in recent virtual screening studies Barillari et al. 2008), as successfully applied against e.g. AA2AR (Carlsson et al. 2010;Ka- for the discovery of CA3R ligands (incl. 45) tritch et al. 2010b), ADRB2 (Kolb et al. 2009), (Klabunde et al. 2009). Frequently GPCR DRD3 (Carlsson et al. 2011), and HRH1 (de homology-model-based virtual screening studies Graaf et al. 2011b) (Table 7.2). Moreover, 2D combine multiple screening methods and filtering and 3D ligand-based similarity searches of the steps into a hierarchical workflow. In general, screening database against the reference ligand large chemical database are first filtered by used to refine the receptor (or present in the application of a 2D and/or 3D pharmacophore receptor co-crystal structure) can be performed to model after which the focused library is subjected demonstrate the strength of the SBVS approach to molecular docking simulations (Table 7.1, (de Graaf et al. 2011a, b). When too many ligands Fig. 7.1). This type of hierarchical SBVS has are retrieved along the VS funnel, it is generally frequently been applied to bRho-based homology wise to cluster virtual hits by chemical diver- models in order to find novel ligands (de Graaf sity before visual inspection. Compounds can be and Rognan 2009). Apart from the hierarchical classified by their chemical scaffold in order to approach also integrated approaches are known 134 A.J. Kooistra et al. in which pharmacophore constraints guide the 2011; Langmead et al. 2012;Kolbetal.2012;Lin conformational sampling of docking simulations, et al. 2012; Istyastono 2012; Blättermann et al. as applied in virtual screening campaigns against 2012; Kim et al. 2012; Mysinger et al. 2012; ADA1A (Evers et al. 2005) and NK1R (Evers Renault et al. 2012; Costanzi et al. 2012; Becker and Klebe 2004). In general the more recent et al. 2004; Engel et al. 2008; Liu et al. 2008)) crystal structure-based SBVS campaigns do efficiently used to find new ligands. Moreover, not apply ligand-based pharmacophores but the new GPCR crystal structures allow for the instead use physicochemical property filters creation of higher resolution homology models as (e.g., heavy atom count, number of rings, more templates are available. hydrophobicity) and emphasize experimentally As described in Sect. 7.2, many of the supported receptor-ligand interactions (e.g., the reported GPCR-structure-based SBVS cam- essential ionic interaction with D3.32 for HRH1 paigns included a customized hierarchical (de Graaf et al. 2011b)) to score and select virtual screening workflow. Table 7.1 gives an docking poses (Tables 7.1 and 7.2). overview of recent prospective structure-based After these final steps of the virtual screening VS studies against GPCR models (since our process a highly diminished subset of the original review of 2009 (de Graaf and Rognan 2009)). compound database (typically between 0.001 and Table 7.2 presents the molecular structures of 0.05 % of the original database, Table 7.1)is representative agonists as well as antagonists obtained and their (commercial) availability is identified by prospective SBVS studies in checked. The available compounds are subse- GPCR models and crystal structures. Figure 7.3 quently obtained and experimentally validated shows the hit rates, ligand efficiency and (step 5). Many times the experimentally vali- size of experimentally validated hits in these dated hits of these virtual screening studies are in silico screening campaigns. It should be not further investigated, however more and more noticed that SBVS often yields new chemical often the hits are only a starting point and used scaffolds (Table 7.2) that still contain essential for further (structure-based) optimization (step 6, functional groups like positively or negatively Sect. 7.4). charged atoms that are used as substructure or pharmacophore filters/constraints to set up the initial ligand database or score/rank docking 7.3 In Silico Structure-Based poses (Table 7.1). Most of the in silico screening GPCR Ligand Discovery studies have focused on bioaminergic receptors (Varady et al. 2003; Evers and Klabunde 2005; The recent crystal structure determinations of Carlsson et al. 2011; de Graaf et al. 2011b;Kolb druggable class A GPCRs (Katritch et al. 2013) et al. 2009; Becker et al. 2004), but several have now opened up excellent new opportunities successful prospective SBVS campaigns are to push forward the limit of crystal structure- reported also for other rhodopsin-like GPCRs based GPCR ligand discovery (Carlsson et al. (adenosine (Carlsson et al. 2010; Langmead 2010, 2011; Langmead et al. 2012; de Graaf et al. 2012; Katritch et al. 2010b), brain-gut et al. 2011b; Katritch et al. 2010b;Kolbetal. peptide (Cavasotto et al. 2008; Engel et al. 2009; Topiol and Sabio 2008; Sabio et al. 2008). 2008), chemoattractant (Blättermann et al. 2012), It should be noted, however, that despite their chemokine (Kellenberger et al. 2007; Kim et al. possible structural inaccuracies GPCR homology 2012; Becker et al. 2004; Liu et al. 2008), lipid models can also be (and have been (de Graaf et al. (Salo et al. 2005), peptide (Evers and Klebe 2011a, b; Varady et al. 2003; Evers and Klabunde 2004; Klabunde et al. 2009; Edwards et al. 2005; Evers and Klebe 2004; Kellenberger et al. 2005), purine receptors (Tikhonova et al. 2008)). 2007; Cavasotto et al. 2008; Tikhonova et al. Only recently the first prospective SBVS studies 2008; Sirci et al. 2012; Klabunde et al. 2009; Salo targeting the allosteric TM cavity of class B et al. 2005; Edwards et al. 2005; Carlsson et al. GPCRs has been reported (de Graaf et al. 2011a) 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 135 ) ), ) ), ) ) ) ) ) ) ) 2012 2012 ) ) ) ) 2012 2012 2012 ) ) 2010 2011 2011 2011b ) 2010b (continued) 2012 2012 2009 2012 2012 2012 2013 i 2012 Congreve et al. ( and Andrews and Benjamin ( (tested) References h a Hits g Prospective Prediction Initial db 2,200,000 8 ant (39)14,000,000 7545,000 ant (20) Kolb et al. ( 20 lig Carlsson (230) et al. ( Langmead et al. ( 10,000,000 63,300,000 ant (25)3,300,000 6 ant (26)108,790 Kolb et 5 al. ant ( (25)771,219 Carlsson et al. 19 ( ant43,326 (26) Carlsson et 18 al. lig ( (29) de Graaf et 6 al. lig ( (23) Sirci et al. ( 350,000 Istyastono ( 3,300,000 1 ant (32) 1 lig (24) Kim et al.420,000 ( Mysinger et al. ( 4 lig (23) Mysinger et al. ( f 5.46 5.46 7.39 C C /E 45.53 5.43 5.43 E S S 6.58 C C C /D 6.55 6.55 6.55 3.32 3.32 3.32 3.32 3.32 3.32 4.60 7.39 7.39 Interaction filter N D D D m LB) D e C IFP D score m m m C C (Re)scoring d ad Score S7.39 250,675 13 lig (97) Renault et al. ( C Conf. Sampling c dl ad Score N C Focused database creation ll ad Score E Template ADRB1 ADRB2 AA2AR Overview of prospective structure-based virtual screening (SBVS) against GPCR models and crystal structures since 2009 b Table 7.1 Receptor Adenosines AA1RAA2ARAA2ARAA2AR AA2AR X-ray X-ray llAmines ADRB1 dl5HTR2A CNS dlADRB2DRD3DRD3 ADRB2HRH1 X-rayHRH3 fda adHRH4 ADRB1/2 adChemoattractants dl X-ray dl adOXER1 HRH1Chemokines HRH1 dlCXCR4 Score ADRB2 flCXCR4 LE/chem/score ad fl CXCR4 fl Ð Score fl de ad novo ad N CXCR4 bRho Lipids none ad RescoringCNR2 adPeptides 4,300,000 X-ray 3D Clust ad Score D3.32 23 ant (56) ad ADRB2 ll Katritch et Score al. ad ( Score fl Score 1,430 (SB IFP Score 1 lig (6) Score ad D Lin et al. ( Ð 2D D Score 1,047 E 1 lig (10) Blättermann et al. ( 136 A.J. Kooistra et al. ) ) 2009 2011a ) i 2012 ), Sum et al. ), and Costanzi 2008 2007 ( ( et al. ( nal topological/chemical roperties), 1D (physicochemical pharmacophore)) used to compile (tested) References h Hits g ProspectiveInitial db Prediction 1,900,000 3 lig (26) de Graaf et al. ( f 2.53 Interaction filter e IFP K clust. 250,675 3 lig (110) Tikhonova et al. C C (Re)scoring d Conf. Sampling ) 2006 c clust. 3D 3D 3D) ad Score C ) for overview of successful prospective GPCR structure-based virtual screening campaigns before 2009 (Comprising campaigns against C Focused database creation 2009 Template (continued) b Receptors are clustered according to Surgand et al. ( Conformer search method: ((H-bond) constr(ained)) automated docking (ad), protein-based or docked ligand-based 3D pharmacophore search (3D) Prospective validation: initial database (db) Number of experimentally confirmed hits with detectable affinity/activity (of the total number of tested compounds) Only structure-based virtual screening studies targeting the TM domain areConsecutive included filters (dl (drug-like physicochemical properties), ll (lead-like physicochemical properties), fl (fragment-like physicochemical p Method to score, rank and/or filter conformers: clust. (scaffold clustering), (c-)score ((consensus) docking scoring function), 2D (two-dimensio Key interactions with the listed residues were used to filter the docking poses References to homology modeling, virtual screening, and structure-based ligand optimization studies are provided Table 7.1 Receptor C3AR1Purines P2RY1 AT1R bRho noneSecretin GLR 3D CRFR1 3D ll( clust. Ð In-house 4 ago (157) Klabunde et al. ( database for docking/3D conformer search similarity/pharmcoaphoric features/sub-groups), 3D (three-dimensional pharmacophore) properties known ligands), 2D (two-dimensional topological/chemical similarity/pharmacophoric features/sub groups), 3D (three-dimensional bioaminergic, brain-gut peptide, chemokine, lipid, peptide, and purine receptors) See de Graaf and Rognana ( b c d e f g h i 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 137 0.3 D N6.55 E45.53 ) N (continued) + ) 2012 i, A1-A2A NH 2 O N pK NH 2010  N H 1.8 b b 2 D O N NH HN O N H N N 6.4 (ant) 6.7 (ant) 9.0 (ant) HO i, A2A-A1 (Langmead et al. N D D D HN i i i pK (Carlsson et al. F pK  10b 8 O N N ) N ) ) 2012 N 2012 N6.55 N6.55 S 2 5 2010b not active on A2A pK N H 1 1 N b b NH 2 N6.5 O 2 D D N N N NH NH O HO N 5.5 (ant) 7.5 (ant) 8.5 (ant)8.5 (ant) pK N i, A2A-A1 i, A2A-A1 (Langmead et al. N (Congreve et al. N D D D D N N i i i i S pK pK (Katritch et al. N 11c pK N  7 10a  N C - 3 ) F O + O N ) 2012 O O 2010b 1 b 2 2 N6.55 N N H D N N O NH NH N N N N N HN 6.2 (ant) pK 7.5 (ant) 8.1 (ant) pK N H N i, A2A-A1 N (Congreve et al. D D D N Cl i i i N pK (Katritch et al. O Bu  11b 6 N ) ) N S 2012 2012 ) 2 O O N6.55 1.2 -0.9 NH D D N 2010b N 8.9 (ant) pK N O 8.5 (ant) pK (Congreve et al. D i, A2A-A1 O i, A2A-A1 i D (Langmead et al. i Cl pK pK 11a ) 9b   N6.55 , Katritch et al. O HO 2012 ) 2 ) N N6.55 NH 2010 2012 N N 2012 N N6.55 OH N S 2 N 1 N N NH N N D HN Representative ligands obtained in structure-based virtual screening (and design) studies against GPCR homology models and X-ray structures O O N N HO (model/X-ray) N 1(ref) 2(ref) 3 4 (model)(Kolb et al. (Carlsson et al. HN 8.5 (ant) pK 8.5 (ant)9.0 (ant) pK 5.7 (ant) pK pK N OH i, A2A-A1 (Langmead et al. D D D D (Langmead et al. i i i O pK S pK pKi Ð  pK Table 7.2 AA1R AA2AR HO 5(ref) O 9a 10a pK 138 A.J. Kooistra et al. N N N + 2 N H N + 2 b b b b N H N N D3.32 O D3.32 S N + HN N 7.0 (ant) 6.0 (ant) O 5.9 (ant) 6.4 (ant) N 2 D D H i i D D i i F K pK D3.32 D3.32 K Cl Cl + N H O N OH N + 2 N b b b N H D3.32 O N N H N O D3.32 + 2 + N H O 5.7 (ant) 7.1 (ant) 6.3 (ant) pK 7.2 (ant) Cl NH D D D D i i i i D3.32 Cl O D3.32 pK pK pK pK Cl Cl Cl OH OH + O O 2 N H O + D3.32 b b b b O O NH O D3.32 HO Cl + 6.5 (ant) 5.8 (ant) 8.0 (ant) 8.2 (ant) N N NH O 2 2 H D D D D H + i i i i D3.32 pK Cl pK D3.32 ) 2011 ) ) ) 2011b 2009 2012 O S NH Cl N HO O N b b O O N H + (continued) NH 12 (ref)1618 (ref) 13 17a 14 19 17b 15 20 17c 21 (X-ray) (Kolb et al. + (X-ray) (de Graaf et al. (model) (Sirci et al. 22 (ref) 23 24 25 OH + (model/X-ray) (Carlsson et al. HO 9.7 (ant) 6.7 (ant) 10.0 (ant)9.7 (ant) pK pK H N N 2 HN D3.32 D D D D H i i i i + S Table 7.2 DRD3 D3.32 ADRB2 D3.32 HRH1 D3.32 HRH3 pK pK pK pK 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 139 N Cl N N 2 H b N H N N H N S (continued) NH N 5.9 5.2 (ant) D D N i i N D3.32 ) 2012 + HN (Kim et al. b a D2.63 NH Cl 5.2 (ant) 3 D CF N 50 Cl N H 39b (analogue) O ) D3.32 N H + HN 2012 N O O N O NN D2.63 5.3 (ant) IC NH N N 5.96.97.3 pK pK D N (Kim et al. D D D N 50 N i i i NH D3.32 O pK O 39a NH NH ) HN N 2012 Cl NH E7.39 N NH b D6.58 N N (Kim et al. S D4.60 N 7.2 (ant) pIC NH 7.7 (ago)8.1 (ant) pK 9.3 (ant) pK D N N D D D 38 (ref) NN 50 i i i HN D3.32 D3.32 HN ) ) ) 2012 2012 F 2012 Cl + E7.39 NH O S NH N H N N N O S (model) (Lin et al. 34 (ref) 35 (ref) 36 N (model/X-ray) D3.32 8.1 pIC (Mysinger et al. O (model) (Istyastono 30 (ref) 31 (ref) 32 33 NH 9.0 (ago)7.8 (biased ago) pK pK 8.5 (ant) pK HN D N + 26 (ref) 27 (ref) 28 29 N D D D O 50 i i i N D2.63 HN pIC HRH4 D3.32 5HTR2A CXCR4 37 (ref) pK pK pK 140 A.J. Kooistra et al. S N R3.36 N O N S b S F S7.39 S N R2.60 N O NH O O sed virtual screening reference 8.5 (ago) HN ional groups in the ligands) and 4.6 (ant) N reference ligands/binding modes O ) D NH D 50 50 2012 ) ) + N H NCS O O 2011a C6.47 - 2013 K2.53 E5.46 N H O N N R3.36 F O (de Graaf et al. S7.39 O N F F S OH 9.0 (ago) pEC (model) (Blättermann et al. N H 7.5 (ant) pIC (model) (Yrjola et al. O NH D O O D K7.32 50 50 N N R2.60 N N OXER1 - (biased ago) - (biased ant) K2.53 HO 42 (ref)CNR2 43 GLR (model) 9 ). The experimentally determined affinity/activity parameters and function (full/partial agonist ) NH E7.3 7.1 O 2012 N b HO ,Table N N 7.2 6.5 (ago) pEC Cl Cl 6.5 (ant) N N D2.63 4.9 (ant) pIC D D 50 D 50 (Mysinger et al. i N H Cl 41 O Q7.36 N H H3.33 O S O N H R7.39 H6.52 ) ) O N 2009 2012 NH ) 2 Q7.36 9 N S N K5.4 2012 E7.3 NH N N (Costanzi et al. N I N 8.1 (ago) pEC N N N D H3.33 (continued) - O O 48 (ref) 49 50 (ref) 51 OH O 3.6 (ant) pIC O (model) (Klabunde et al. P P NH (AT1) O 9.0 (ant) pK - O H6.52 D HN - N O Cl O 44 (ref) 45 46 (ref) 47 - 50 D 50 N (Mysinger et al. i N 2 H Novelty of the hit was explicitly assessed Analogue of SBVS hit: known antimalarial drug chloroquine pIC Table 7.2 D2.63 40 CA3R1 S3.29 P2RY1 (model) R7.39 pEC pK (ago) and/or inverse agonist/antagonist (ant)) of reference and discovered ligands are indicated. Homology model-based and crystal structure-ba compounds as well as hits are highlighted in blue and red respectively docking constraints/post-processing interaction filters (indicatedand by used dotted to lines select to hits complementary along residue(s) the in virtual the screening receptor) work are flow derived (Fig. from H-bond donor/positive ionizable (blue), H-bond acceptor/negatively ionizable (red) or both (magenta) pharmacophore features (assigned to funct a b 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 141 and SBVS studies against class C GPCRs have 19 (Table 7.2), was later corroborated by crystal- so far only focused on the orthosteric N-terminal lization studies. Although this is not unexpected binding site (Venus Fly trap (Triballeau et al. given its chemical similarity to the reference 2008)). The final sections will discuss two of the ligand used (carazolol, 18), this demonstrates that important challenges in SBVS for GPCR ligands: GPCR structure-based virtual screening can yield structure-based optimization of virtual screening not only new ligands, but also suitable starting hits (Sect. 7.4), and selective virtual screening for points for structure-based hit optimization. Also specific ligand types (Sect. 7.5). several chemically novel ADRB2 ligands were SBVS campaigns against the first crystal reported (e.g., 21), which is particularly chal- structures of druggable GPCRs (ADRB2, lenging for ADRB2, a receptor with a relatively AA2AR, DRDR3, and HRH1) have resulted in low ligand diversity (van der Horst et al. 2009). relatively high hit rates (Fig. 7.3,upto73%hit In fact, in 2008 Topiol et al. discovered sev- rate (de Graaf et al. 2011b)). It should however eral submicromolar affinity ligands based on the be noticed that high hit rates have not only ADRB2 crystal structure that were chemically been obtained based on docking studies against very close to known ADRB2 ligands (Topiol and GPCR crystal structures, but also successful Sabio 2008; Sabio et al. 2008). SBVS studies with high hit rates (>20 %) The ADRB2 crystal structure has been used in have been reported based on GPCR homology several studies as a template for the creation models (Fig. 7.3) (Varady et al. 2003;Evers of homology models. Istyastono et al., for et al. 2005). Moreover, in a recent comparative example, used the ADRB2 structure to generate virtual screening comparable high hit rates were JNJ7777120 (cpd. 30) and VUF10497 (cpd. obtained for prospective virtual screening runs 31) bound HRH4 homology models (Istyastono against the recently solved dopamine D3 receptor 2012). These models were subsequently used for (DRD3) crystal structure and a previously VS purposes. 23,112 fragment-like molecules constructed DRD3 homology model (Carlsson were obtained from the ZINC database after et al. 2011). A different result was obtained in a filtering out the compounds containing reactive comparative CXCR4 SBVS study in which the groups. After docking all compounds, the screen against the CXCR4 crystal structure was highest-ranking (according to the IFP score) more successful. SBVS studies using homology and novel (according to their low 2D similarity models are highly dependent on the applied to known HRH4 ligands from the ChEMBL) method as well as the homology model used compounds were selected and clustered using a as recently demonstrated by Kolb et al. (2012) 2D ligand-based fingerprint. From each cluster who created four AA1R homology models with the top compound was selected and in total 164 highly varying hit-rates. compounds were remaining for visual inspection. The first crystal structure-based virtual screen- During the visual inspection compounds with ing study for a druggable GPCR was reported for an ionic interaction with D3.32 and a polar ADRB2 (Cherezov et al. 2007). Kolb et al. (2009) moiety near E5.46 were prioritized resulting in performed docking simulations of 972,608 lead- the purchase of 23 compounds. Six of the 23 like compounds against the ADRB2 crystal struc- compounds (including hits 32 and 33)were ture. They selected 25 compounds for experimen- shown to have affinities in the range of 0.14Ð tal testing from the 500 top-ranking molecules 6.3 M (Istyastono 2012). based on chemical clustering, visual inspection, Two successful docking-based virtual screen- and favorable interaction energies with D3.32 (Ta- ing studies against the AA2AR crystal structure bles 7.1 and 7.2). 6 of the 25 hits had detectable have been performed (Carlsson et al. 2010; binding affinity for ADRB2 (Ki values rang- Katritch et al. 2010b), yielding diverse sets of ing from 4 to 0.009 M) and were character- novel ligands (6–8, 9a, 10a, Table 7.2). Katritch ized as inverse agonists. Interestingly, the pre- et al. selected 56 high ranking compounds from dicted binding mode of the highest affinity hit docking based virtual screening runs of four 142 A.J. Kooistra et al. million compounds against the AA2AR crystal (P2RY1). The ligand and receptor based phar- structure and discovered 23 new AA2AR ligands macophore screen was performed on a set of with Ki values between 0.032 and 10 M (incl. 6, 133,999 compounds that were selected from a 7) (Katritch et al. 2010b). Interestingly, specific set of 250,675 compounds using physicochem- water molecules in the AA2AR binding site were ical filtering. The resulting 362 hit compounds included in the docking simulations, because were subsequently clustered and from each clus- retrospective virtual screening evaluations gave ter one unique compound was selected. The re- better results with water than without consider- sulting 110 compounds were subsequently vali- ation of water (Katritch et al. 2010b). Carlsson dated in vitro for their binding affinity for P2RY1, et al. emphasized favorable H-bond interactions yielding multiple hits (exact number was not with N6.55, an important AA2AR ligand binding reported). One of the hit compounds (with a residue (Kim et al. 1995; Jaakola et al. 2010; reported binding affinity of 13 M) was fur- Zhukov et al. 2011), in their scoring protocol to ther explored through use of an analogue search rank the docking poses of 1.4 million cpds in and SAR studies (Sect. 7.4). This work resulted the AA2AR crystal structure by increasing the in new insights in the binding mode properties dipole moment of the side chain amide group but of this non-nucleotide P2RY1 antagonist series without taking water into account (Carlsson et al. (Costanzi et al. 2012). 2010). Using this experimentally guided SBVS Carlsson et al. compared the SBVS perfor- approach, 7 of the 20 selected hits were exper- mance of the dopamine D3 receptor (DRD3) imentally confirmed as novel AA2AR ligands, crystal structure and an ADRB2-based homology with Ki values ranging from 0.2 to 9 M (incl. model of DRD3 (that was constructed prior to the 8) (Carlsson et al. 2010). Recently an ADRB1- release of the DRD3 X-ray structure) (Carlsson based AA2AR homology model (Zhukov et al. et al. 2011). 26 and 25 of the highest ranking 2011) (that correctly predicted the binding molecules with strong electrostatic interactions mode of ZM241385 prior to publication of the with D3.32 (Tables 7.1 and 7.2) were selected for AA2AR-ZM243185 crystal structure (Jaakola experimental testing from 3.3 million molecules et al. 2008)) was successfully used to discover docked in the DRD3 homology model and DRD3 new AA2AR ligands (Langmead et al. 2012). crystal structure, respectively. Interestingly, both The overall hit rate of this in silico screening models performed equally well in terms of virtual campaign was somewhat lower than the AA2AR screening hit rate (Fig. 7.3). 6 of the homology crystal structure-based screening runs (Fig. 7.3), model based hits had affinities ranging from 0.2 but yielded a diverse set of chemically novel to 3.1 M (incl. hits 15, 16, 17a, Table 7.2), while ligands (e.g., 9a, 10a) (Langmead et al. 2012), 5 of the X-ray structure based hits had affinities that were subsequently optimized by structure- ranging from 0.3 to 3.0 M (incl. hits 13, 14). based design to improve affinity and adenosine Ligand 17 from the homology model screen was receptor selectivity (Sect. 7.4) (Langmead et al. optimized for affinity to 81 nM (Sect. 7.4) (Carls- 2012; Congreve et al. 2012). Both receptor son et al. 2011). model construction as ligand optimization was A customized structure-based virtual screen- driven by experimental (mutagenesis/biophysical ing protocol against the histamine H1 receptor mapping (Zhukov et al. 2011) of receptor-ligand (HRH1) crystal structure was used to dock and interaction hotspots) and computational analysis score a database of 108,790 fragment-like com- (identification of thermodynamically unstable pounds (heavy atoms 22) containing a basic water molecules that can be displaced by the moiety (de Graaf et al. 2011b). The method ligand) (Langmead et al. 2012; Congreve et al. combined molecular docking simulations with a 2012; Andrews and Benjamin 2013). protein-ligand interaction fingerprint (IFP) scor- With the antagonist bound AA2AR crystal ing method (Fig. 7.4). The optimized in sil- structure as a template, Costanzi et al. (2012) ico screening approach was successfully applied created a homology model of the P2Y1 receptor to identify a chemically diverse set of novel 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 143 fragment-like (22 heavy atoms) HRH1 ligands against the homology model and crystal structure with an exceptionally high hit rate of 73 %. Of the respectively. The binding modes of the top 500 26 tested fragments, 19 compounds had affinities compounds from each screen were visually ranging from 10 M to 6 nM (incl. hits 23– inspected. Based on availability, internal energy, 25) This study shows the potential of in silico unsatisfied polar interactions, correctness of the screening against GPCR crystal structures to ex- protonation state and hit diversity 24 and 23 plore novel, fragment-like GPCR ligand space (de compounds were selected for the crystal structure Graaf et al. 2011b). and homology model screening respectively. The HRH1 crystal structure also allowed Experimental validation yielded 1 and 4 hit Sirci et al. to create high-resolution homology compounds (a hit rate of 4 and 17 %) respectively models of HRH3 (Sirci et al. 2012). Multiple and showed affinities ranging from 0.31 M(hit MD snapshots of the homology models were 41) to 225 M(hit40). This study underlined subjected to retrospective validation using the impact of the available templates and the experimental data from a VU-MedChem experimental knowledge for the creation of fragment library screen against HRH3. The homology models (Mysinger et al. 2012). snapshot with the highest early enrichment was Kim et al. (2012) constructed a CXCR4 homo- selected from both the methimepip-based (cpd. logy model based on binding mode of AMD3100 26) and the VUF5228-based homology model. (cpd. 38), and validated this model using pub- After a retrospective comparison of both models lished site-directed mutagenesis data. 350,000 using docking and FLAP-linear discriminant compounds from the open NCI database were analysis (FLAP-LDA) the latter in combination docked in three conformations of the homology with the methimepip-based homology model was model. From the docked compounds the neutral shown to be superior in the retrieval of active molecules with a buried surface area of at least fragment-like compounds. Multiple different 80 % in close proximity to the acidic residues ligand-based FLAP models were build (based D4.60,D6.58 and E7.39 were selected, excluding on amongst others reference compound 27)in 90 % of the small molecules. For the remaining parallel as well and were also retrospectively molecules the binding energies were calculated validated. Subsequently a prospective screening and the top 200 compounds in each receptor of 156,090 fragment-like compounds against conformation were visually inspected, resulting both the best structure-based and ligand-based in a subset of 50 compounds of which 32 were FLAP-LDA models was performed. From the procured and experimentally validated. One hit highest scoring compounds for both the ligand was obtained (cpd. 39a) which was further inves- and structure-based model 21 compounds were tigated because of its close similarity to known purchased as well as 8 from the highest scoring anti-malarial drugs (Sect. 7.4)(Kimetal.2012). compounds from only the ligand-based model. Blättermann et al. (2012) used the CXCR4 Experimental validation pointed out a remarkably crystal structure (Wu et al. 2010) to construct high hit rate of 62 % as 18 of the 29 selected a homology model of an oxoeicosanoid recep- compounds were found to be active. This study tor 1 (OXER1) homology model. Commercially showed that the combined use of ligand-based available compounds from the ZINC database and structure-based models with a thorough were filtered based on physicochemical proper- retrospective validation can be the key to a ties, resulting in a subset of 1,047 compounds. successful VS (Sirci et al. 2012). This subset was docked into the OXER1 ho- Mysinger et al. used a chemokine receptor mology model and from the top 100 (as ranked CXCR4 crystal structure and a CXCR4 by AutoDock), 10 compounds were visually se- homology model (constructed before the release lected and tested for OXER1 mediated Ca2C of the crystal structure) in a comparative virtual release-inhibition. One of the 10 compounds (hit screening study (Mysinger et al. 2012). 3.3 and 43) showed significant inhibition and was fur- 4.2 million lead-like compounds were docked ther investigated. Ligand deconstruction of this 144 A.J. Kooistra et al. a

b

Fig. 7.3 Hit rate versus size (heavy atom count) (a)and C-C chemokine receptor type 5 (CCR5 (Kellenberger hit rate versus ligand efficiency (b) of hits identified et al. 2007)), cannabinoid receptor 2 (CNR2_1 (Salo in prospective structure-based virtual screening studies et al. 2005), CNR2_2 (Renault et al. 2012)), chemokine against GPCR crystal structures (red) and homology mod- receptor CXCR4(CXCR4_1 (Kim et al. 2012), CXCR4_2 els (blue). Open circles indicate a SBVS studies published (Mysinger et al. 2012)), (DRD3_1 before 2009, studies since 2009 are indicated with a (Varady et al. 2003) and DRD3_2 (Carlsson et al. 2011) closed square. The bars shown indicate the minimum and DRD3_3 (Carlsson et al. 2011)), free fatty acid and maximum heavy atom and ligand efficiency count receptor 1 (FFAR1 (Tikhonova et al. 2008)), formyl pep- for all hits of each SBVS respectively. The labels indi- tide receptor 1 (FPR1 (Edwards et al. 2005)), glucagon cate the screening on the following receptors adenosine receptor (GLR (de Graaf et al. 2011a)), histamine receptor A1 (A1A), adenosine A2A receptor (A2A_1 (Carlsson H1 (HRH1 (de Graaf et al. 2011b)), histamine recep- et al. 2010), A2A_2 (Katritch et al. 2010b), A2A_3 tor H3 (HRH3 (Sirci et al. 2012)), histamine receptor (Langmead et al. 2012)), adrenergic alpha-1a receptor H4 (HRH4_1 (Kiss et al. 2008), HRH4_2 (Istyastono (ADA1 (Evers and Klabunde 2005)), adrenergic beta- 2012)), 5-hydroxytryptamine receptor 2A (5HTR2A (Lin 2 receptor (ADRB2 (Kolb et al. 2009)), complement et al. 2012)), melanin-concentrating hormone receptor 1 component 3a receptor 1 (C3A (Klabunde et al. 2009)), (MCH1 (Cavasotto et al. 2008)), neurokinin 1 receptor 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 145 hit compound yielded no substructures with any ture on virtual screening, and the challenges in binding affinity for OXER1 implying that the virtual screening for receptor subtype selective complete molecule, as discovered, was needed ligands. in order to successfully bind to OXER1 and Virtual screening studies can also highlight inhibit the CA2C flux. Further functional assays surprising cross-pharmacology as was demon- highlighted that this compound was biased and strated in the VS study by Lin et al. (2012)A inhibited G“” but not G’i signaling (Blättermann 5HTR2A homology model was created based on et al. 2012). the timolol bound ADRB2 crystal structure (Han- TheworkbyKolbetal.(2012) discussed ear- son et al. 2008). Using MD, ten induced-fit mod- lier pushed the boundaries of virtual screening as els for both ketanserin and cyproheptadine bound they tried to screen for subtype selective ligands 5HTR2A were generated. These 20 snapshots and evaluate the impact of homology model in- were subsequently used to include protein flexi- accuracies/differences. Four different AA1R ho- bility and afterwards subjected to a retrospective mology models based were created based on validation. The ketanserin (cpd. 34) and cypro- the antagonist bound AA2AR crystal structure. heptadine (cpd. 35) bound models with the best These models were subjected to docking simula- early enrichment were selected for a prospective tions of 2.2 million lead-like compounds from the screening study. A filtered (100 MW 600) ZINC database. Within this docking simulation compound library, comprising 1,430 FDA ap- the conserved interaction with N2546.55 was ac- proved drugs, was screened against the two mod- centuated by amplifying this term. The top 500 of els using docking and a MM-GB/SA refinement the docking results from each of the models was and rescoring procedure. The top 200 hits were inspected to filter out molecules with unsatisfied filtered based on an essential hydrogen bond hydrogen bond donors and acceptors, incorrect filter with D3.32 and chemical novelty assessment protonation states, unlikely binding modes or based on comparisons with known 5HTR2A lig- highly strained conformation which resulted in ands in ChEMBL (Gaulton et al. 2012) and Drug- the purchase of 39 compounds in total. All 39 Bank (Knox et al. 2011) and SEA predictions compounds were tested for their AA1R, AA2AR (Keiser et al. 2007). Of the six compounds that and AA3R affinity in order to assess their se- were selected for experimental validation, the ki- lectivity. 8 of the compounds showed activity nase inhibitor sorafenib (hit 36) showed a binding for the AA1R receptor (incl. hits 3 and 4), but affinity of 1,959 nM for 5HTR2A. This com- surprisingly, 15 and 14 compounds were active pound 36 was also tested on the other HT receptor on the AA2AR and AA3R respectively. In total subtypes and found to be active against all with an 20 of the 39 compounds were active on one (or affinity ranging from 56 to 7,071 nM. It should be more) of these three receptors. Interestingly, not noted that although sorafenib is not able to form all homology models performed equally well; the conserved salt bridge with D3.32, the urea one of the models did not yield any hits and moiety of sorafenib is seemingly able to provide another model accounted for half of the active a strong hydrogen bond to D3.32 thereby replacing compounds (but only one of the AA1R active the necessity for a basic moiety. compounds, namely hit 3). This study highlighted Renault et al. (2012) used a combined once more the effect of small structural differ- approach of a trained ligand-based filter ences in the receptor model or crystal struc- and an automated docking approach for the

J Fig. 7.3 (continued) (NK1R (Evers and Klebe 2004)), ity purposes. Only hits for which: (i) binding affinity oxoeicosanoid receptor 1 (OXER1 (Blättermann et al. Ki/IC50/EC50/KB 15 M was experimentally deter- 2012)), P2Y purinoceptor 1 (P2RY1 (Costanzi et al. mined; and (ii) for which a molecular structure was 2012)) and transferrin receptor 1 (TRFR1 (Engel reported, are included in the analysis. The affinity of the et al. 2008)). The maximum heavy atom count of OXER1 hit compound was not quantified (Blättermann FPR1 (Lipinski et al. 2001) is not shown for clar- et al. 2012) and is therefore only included in a 146 A.J. Kooistra et al. identification of novel ligands for the cannabinoid be used to identify novel class B non-competitive receptor 2 (CN2R). A database of 5,513,820 ligands (de Graaf et al. 2011a). compounds was reduced using physicochemical filtering resulting to a subset of 3,495,595 compounds. This dataset was subjected to a 7.4 In Silico Guidance: trained 2D-based Bayesian classifier (based on Optimization of Virtual 90 compounds with measured affinity for CN2R) Screening Hits that further reduced the compound set to 209,442 compounds. Docking in an active-state CN2R The identification of active compounds for a homology model (using carazolol-bound (18) specific target using VS is only the first step inactivate ADRB2 (Cherezov et al. 2007)as in the search for novel leads with the targeted a template) with an interaction filter on S7.39 affinity, selectivity and/or pharmacological prop- resulted in the selection of 1,385 compounds of erties and is often followed by SAR exploration, which 150 were selected and 149 were available site-directed mutagenesis studies and structure- for purchase. Due to solvation issues only 97 of guided optimization. True rational prospective the compounds could be tested which resulted in structure-based hit optimizations, however, re- the confirmation of 13 hits with affinities ranging main relatively scarce (Congreve et al. 2011; from 2.3 nM (hit 47)to71M. Tautermann 2011). Nevertheless some of the new De Graaf et al. (2011a) constructed a homol- hits identified in GPCR structure-based virtual ogy model of the class B GPCR GLR based screening campaigns (Table 7.2) have been opti- on a validated structural model of the CRFR1 mized to improve ligand affinity (Cavasotto et al. receptor. For this representative class B GPCR 2008; Liu et al. 2008), receptor selectivity (Lang- numerous experimental ligand and receptor data mead et al. 2012; Congreve et al. 2012; Becker were available to guide the modeling procedure et al. 2006), pharmacokinetic properties (Con- and to validate the refined model with retrospec- greve et al. 2012), and to explore SAR to vali- tive virtual screening studies (see Sect. 7.2). A date predicted binding modes (Tikhonova et al. database of 1.9 million commercially available 2008; Lin et al. 2012; Engel et al. 2008; Wacker drug-like compounds was screened for chemical et al. 2010). Some of these hit optimization stud- similarity to existing GLR noncompetitive antag- ies have been driven by structure-based design onists and docked to the transmembrane cavity (Langmead et al. 2012; Congreve et al. 2012;Lin of the GLR homology model. 23 compounds et al. 2012; Becker et al. 2006; Wacker et al. were selected based on binding mode similar- 2010). ity to the protein-ligand interaction fingerprints One of the most frequently observed steps of the docking poses of two different reference in a VS hit exploration is the search for close ligands (known GLR ligands 65 and L-168,049) analogues of one or more hit compounds (Carls- in the GLR homology model. Two of the 23 son et al. 2010, 2011; Langmead et al. 2012; compounds inhibited the effect of glucagon in Kim et al. 2012; Kiss et al. 2012) or to syn- a dose-dependent manner (both from the hit list thesize a small series around one or more hit ranked according to IFP similarity to the ref- compounds (Costanzi et al. 2012;Yrjolaetal. erence binding mode of 50). Interestingly, one 2013) (Table 7.2). The experimental validation in silico hit that was inactive at the GLR was of compound 8, an AA2AR-selective VS hit shown to bind to GLP-1R and potentiate the from Carlsson et al. (2010), was followed by an response to the endogenous GLP-1 ligand. This analogue search which yielded 5 equally selective illustrates the strength of using two alternative analogues but all with a lower affinity. Lang- binding mode hypotheses in prospective SBVS mead et al. (2012) also performed an analogue studies. Although the potencies of the ligands search for one of their AA2AR structure-based are still modest, this study showed for the first virtual screening hits (9a). This search yielded time that structure-based approaches can indeed three interesting analogues, amongst them was a 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 147 selective (16-fold over AA1R) and highly potent a slight selectivity towards DRD3 over DRD2 (pKi D 8.5) compound (9b) and an even more in contrast to the 3,5-dichlorophenyl derivatives selective (>100-fold over AA1R) but slightly less as they showed comparable affinity towards both potent (pKi D 7.8) compound. Compound 10a, receptors. Reduction of the 2-methylpropane-1,3- the compound with the highest LE (LE of 0.53, diol tail into a 2-methylpropane-1-ol (17b) and see Fig. 7.3b), and a chromone-containing com- amethyl(17c) yielded submicromolar affinity pound were also further investigated with a ligand compounds (for DRD2 and DRD3) with high deconstruction study and a hit-to-lead synthesis ligand efficiencies (LE of 0.38, 0.47 and 0.57 for program. 17a, 17b and 17c respectively). The SAR obtained from the deconstruction Lin et al. discovered that the kinase-inhibitor of compound 10a highlighted that the phenol in sorafenib (36) is also a 5HTR antagonist based combination with the amino functionality was the on a SBVS study against a 5HTR2A homology core scaffold for high potency ligands. Subse- model. Based on the predicted binding mode of quently the propenyl-thiophene was reduced to sorafenib in 5HTR2A multiple analogues were an isopropyl, which yielded a simplified com- designed and two of them were synthesized to pound with only 17 heavy atoms and a pKi of further validate this surprising discovery. The re- 7.9 (resulting in a LE of 0.64). Further substi- placement of the aromatic nitrogen with a carbon tutions yielded amongst others a highly potent slightly improved the binding affinity indicating (pKi D 9.0) and highly selective (59-fold over that this atom was not engaged in a polar interac- AA1R) compound (10b) by introducing a n- tion with the receptor. However, the subsequent methylpiperazine moiety. Optimization of these addition of a methyl group on the nitrogen atom triazine compounds (10a and 10b) was contin- of the amide resulted in a more than ninefold ued in a rational structure-based optimization loss of affinity which again indicated that the study (Congreve et al. 2012; Andrews and Ben- hypothesized binding mode was correct in which jamin 2013). Instead of 1,3,5-triazines deriva- this nitrogen was engaged in hydrogen bonding tives, 1,2,4-triazine derivatives were further ex- with L7.35. plored by combined SPR, SDM, in silico binding Kim et al. (2012) obtained a comparably mode studies and in silico thermodynamic sta- surprising result in their SBVS study against a bility calculations of water molecules. Three of CXCR4 homology model in which one hit (39a) the designed compounds (11a, 11b and 11c)were was identified that was very closely related to also successfully crystallized in AA2AR and cor- known anti-malarial drugs. The obvious next step roborated the modeling approach (Congreve et al. was to also test these compounds (chloroquine 2012; Andrews and Benjamin 2013). Also the (39b), hydroxychloroquine and quinacrine) and proposed binding mode of the ADRB2 antagonist they were all found to be active against CXCR4. 17, discovered by ADRB2 crystal structure-based Apparently the replacement of the methoxy in silico screening (Kolb et al. 2009), was later moiety with a chlorine atom (the sole difference experimentally validated by a co-crystal structure between hit compound 39a and anti-malarial of ADRB2 and 19 (PDB-code: 3NY9) (Wacker drug chloroquine 39b) did not have a significant et al. 2010). impact of the affinity and cell migration, thereby The SBVS study performed by Carlsson opening up new repurposing possibilities for et al. against both a homology model and a anti-malarial drugs. crystal structure yielded 11 new compounds. The most potent of the three hits (compound The compound with the most novel scaffold 49) from the structure-based pharmacophore (compound 17a) was further explored by the screen against P2RY1 was explored by both the purchase of 20 analogues. Most of the analogues purchase of commercially available analogues showed the submicromolar affinity for DRD3 as well as the synthesis of multiple analogues. and (sub)micromolar affinity for DRD2. The Synthesis of the analogues focused on the 2,5-dichlorophenyl derivatives appeared to have redecoration of the phenyl ring (from the 3,4- 148 A.J. Kooistra et al. dichlorophenyl moiety in 49) with different small fragments can adopt a larger variety of substituents on all positions. However, all synthe- binding modes in different protein (sub)pockets sized analogues showed a lower binding affinity (Leach and Hann 2011; Loving et al. 2010). as well as lower inhibition of the Ca2C efflux Furthermore scoring functions used to estimate than hit 49. Moreover, the replacement of the the binding affinity and determine binding modes chlorine atoms with fluorine atoms or a methoxy in molecular docking simulations are not trained moiety resulted in the complete loss of binding for ranking (the poses of) small fragment-like affinity for P2RY1. Two analogues that were compounds (Loving et al. 2010; Deng et al. 2004; purchased contained different substituents for the Marcou and Rognan 2007). Virtual fragment 3,4-dimethylisoxazole moiety of hit 49, and had screening with a molecular interaction fingerprint a slightly higher binding affinity (Ki D 6Ð10 M) (IFP (Deng et al. 2004; Marcou and Rognan than the initial hit (13 M) (Costanzi et al. 2012). 2007)) is a suitable alternative approach to select Tosh et al. (2012) optimized adenosine 50- ligands that can make specific interactions in carboxamide derivatives for AA2AR by com- specific binding pockets (de Graaf et al. 2011a, b; bining binding mode predictions with in silico de Graaf and Rognan 2008). Validated fragments fragment replacement and redocking techniques. can then efficiently be used as starting points The initial binding modes of some adenosine 50- for ligand optimization strategies by fragment carboxamide derivatives suggested optimization growing, linking, or merging (Loving et al. 2010; of these compounds towards W6.48. By manual de Kloe et al. 2009) of fragments originating fragment replacement several optimizations were from different IFP ranking lists. proposed and an automated fragment replace- ment resulted in 2,000 proposals. The proposed compounds were energy optimized in the binding pocket from which the top 100 compounds were 7.5 Fitting Functional Selectivity: selected for redocking and rescoring. 23 com- Selective Screening for pounds were subsequently selected and synthe- Ligands with Specific sized for further testing. This resulted in 2 inac- Functional Effects tive compounds and 22 compounds with highly differing affinity and selectivity profiles, but most The growing number of available GPCR crys- of them had submicromolar affinity for AA2AR tal structures has not only provided insights in and were full or partial agonists at AA1R. receptor-ligand binding modes, but has also in- The overview in Fig. 7.3a shows that several of creased our understanding in the activation and the GPCR ligands identified in SBVS campaigns signaling mechanism of GPCRs. For the bRho, are relatively small (22 heavy atoms), this AA2AR, ADRB1, and ADRB2 multiple GPCR offers new possibilities for a fragment-based drug crystal structures are available comprising inverse design approach. In fact, in the first SBVS study agonist, antagonist and (biased) agonist bound against the HRH1 (de Graaf et al. 2011b) crystal states as well as agonist bound and G-protein structure successfully retrieved many novel lig- (mimic) coupled states (Katritch et al. 2013). Al- ands with good ligand efficiency (LE) (Hopkins though many VS studies are based on the inactive et al. 2004) from a library of fragment-like (inverse agonist/antagonist bound) state models molecules (incl. 23 (LE D 0.57), 24 (LE D 0.45) with which many new antagonist were found (de and 25 (LE D 0.44)) (de Graaf et al. 2011b). Also Graaf and Rognan 2009; Kooistra et al. 2013), SBVS runs against, for example, AA2AR have these inactive state models have also been shown resulted in fragment-like hits with good ligand ef- to be useful in the search for agonists (de Graaf ficiency (9 hits with LE between 0.3 and 0.5 incl. et al. 2011a; Varady et al. 2003; Kellenberger 7 (LE D 0.34)) (Katritch et al. 2010b). It should et al. 2007; Tikhonova et al. 2008; Salo et al. be noted, however, that structure-based optimiza- 2005; Kiss et al. 2008). In some of these studies tion of fragments can be challenging because (Varady et al. 2003; Tikhonova et al. 2008; Salo 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 149 et al. 2005), the initial ground state model was similar decoys (de Graaf and Rognan 2008)were refined by true agonists, but in other cases (de automatically docked in one chain of each of Graaf et al. 2011a; Kellenberger et al. 2007)the the crystal structures. The docking poses of all inactive model was even refined by antagonists. compounds were subsequently post-processed The other way around, agonist-biased models with IFP and the enrichment factor at 1 % false have also been successfully applied to find antag- positive rate (EF1 %) was determined for both onists (Kiss et al. 2008). the agonists and the antagonists (or inverse Thereare currently 23 crystal structures agonists) versus the decoys (Fig. 7.5a). All deposited in the PDB for the beta-adrenergic agonist-bound crystal structures (indicated by red receptors (12 ADRB1 and 11 ADRB2 structures) dots in Fig. 7.5a) showed a preference towards (Warne et al. 2008, 2011, 2012; Cherezov et al. agonists over antagonists (the dots are located 2007; Wacker et al. 2010; Moukhametzianov above the black dashed line) except for the et al. 2011; Rasmussen et al. 2007, 2011a, b; biased agonist bucindolol-bound structure (PDB- Rosenbaum et al. 2007, 2011). Most of them code: 4AMI) and vice versa for the antagonist (15) are in the inactive antagonist-bound (or bound crystal structures (indicated by blue dots inverse agonist) state, six in the inactive (biased) in Fig. 7.5a). The full ROC curves of the crystal agonist-bound state and two in the agonist and structures showing the highest selectivity (EF1% G-protein (mimic) coupled states and all show for agonist versus EF1 % for antagonists) towards a well-conserved mode of binding. Despite this antagonists and agonist are shown in Fig. 7.5b, c similar binding mode co-crystallized antagonists, respectively. inverse agonists and full/partial agonists show The retrospective virtual screening results pre- specific interactions with the receptor binding site sented in Fig. 7.5a show that the unique combi- (Tang et al. 2012). IFP analysis for these crystal nation of the IFP of the bound ligand and the ori- structures highlights for example the agonist entation of the residues in the pocket determines specific hydrogen-bond interaction with S5.46 the selectivity as can be seen from the difference (Costanzi and Vilar 2012). In 2008 de Graaf and in the retrieval rate of e.g. both carazolol-bound Rognan showed in a retrospective study that by structures (PBB-codes: 2YCW and 2RH1). It adjusting the rotameric states of S5.43 and S5.46 should furthermore be noted that both the isopre- the inactive state ADRB2 structure bound by naline (PDB-code: 2Y03) and BI-167107 (PDB- the inverse agonist carazolol (PDB-code: 2RH1 code: 3P0G) bound structures are highly selective (Cherezov et al. 2007)) could be used for the towards agonists. These ligands are different in selective retrieval of partial and full agonists size but share the same key interactions with the (de Graaf and Rognan 2008). The more recent ADRB2 binding site (Fig. 7.4c), including the agonist-bound ADRB2 crystal structures have hydrogen bond with S5.46. validated these rotameric changes, e.g. the BI- In summary, SBVS campaigns can in principle 167107 (and nanobody) bound ADRB2 structure be biased towards ligands with a specific pharma- (PDB-code: 3P0G, Fig. 7.4a) (Rasmussen cological effect by carefully selecting a crystal et al. 2011a) versus the carazolol bound structure and reference ligand (or carefully con- ADRB1 structure (PDB-code: 2YCW, Fig. 7.4b) structing a homology model) for molecular dock- (Moukhametzianov et al. 2011). ing simulations and combining it with selective In a retrospective docking study against all interaction filters. IFP analyses can be used to ligand-bound (21) ADRB1 and ADRB2 crystal identify and emphasize key interactions that are structures were tested for their selectivity in a linked to specific pharmacological effects. This SBVS towards antagonist/inverse agonists and information can ultimately be used to rescore agonists. A set of compounds comprising 24 docking poses and identify novel chemotypes agonists (Baker 2010), 25 antagonist/inverse ag- with similar interaction profiles and desired func- onists (Baker 2005) and 980 physicochemically tional effects. 150 A.J. Kooistra et al.

a b

c

Fig. 7.4 Binding mode of full agonist BI-167107 (a, BI-167107 (a) and carazolol (b) are compared with the PDB-code: 3P0G) (Rasmussen et al. 2011a)andinverse also highly discriminating (see Fig. 7.5) binding modes of agonist carazolol (b, PDB-code: 2YCW) (Moukhamet- co-crystallized partial inverse agonist iodocyanopindolol zianov et al. 2011) in the active agonist-bound of ADRB2 (PDB-code: 2YCZ) (Moukhametzianov et al. 2011)and and inactive antagonist-bound state of ADRB1, respec- full agonist isoprenaline (PDB-code: 2Y04) (Warne et al. tively. In panel c the interaction fingerprints (IFP, only key 2011) residues are shown) of the binding modes of crystallized

7.6 Exploring Novel of the obtained ligands is decreasing (Fig 7.3a) GPCR-Ligand Interaction and simultaneously their ligand efficiency is Space in More Detail increasing (Fig. 7.3b). Several discussed studies have demonstrated that insights in GPCR- The presented overview shows that structure- ligand interactions obtained from combined based GPCR ligand discovery is growing in experimental and in silico modeling techniques popularity and success. Many virtual screening can be used to enable fragment-based drug studies have resulted in the discovery of novel discovery and design against GPCRs (Sirci ligands for diverse class A GPCRs and recently et al. 2012; de Graaf et al. 2011b; Istyastono also the first class B SBVS has been reported 2012). Despite this progress, the sometimes low (de Graaf et al. 2011a). As opposed to the affinity of small fragments can be a problem more recent computer-aided ligand discovery as the difference in size of the endogenous and design studies, previous studies were mostly ligands between GPCRs (e.g. neurotransmitters focused on the identification of larger ligands versus chemokines) influences the shape of (Fig. 7.3a). However, the focus is currently the orthosteric binding pocket and thereby its gradually shifting towards smaller ligands recognition of fragments-like ligands (de Kloe (fragments) with high ligand efficiency (de Graaf et al. 2009). Therefore a thorough retrospective et al. 2011b; Katritch et al. 2010b; Hopkins validation is pivotal for a successful SBVS et al. 2004). This fragment-based drug discovery for fragment-like molecules, which emphasizes approach (de Graaf et al. 2013) is starting to the need for GPCR-target annotated fragment have an impact on the targeted compounds in libraries with true actives and true inactives (de SBVS studies, as the average heavy atom count Graaf et al. 2013). 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 151

2YCW a OH O b Agonists Antagonists N OH NH H Random OH O OH

N H OH rate (%) rate NH

3P0G TP

3PDS 20 40 60 80 100 N O H

OH 0 5.0 0.1 0.5 1.0 10.0 50.0 100.0 3SN6 FP rate (%) N 3P0G c EF1% Agonists NH I

2RH1 N O H OH 4AMJ

2Y01 (%) rate

3NY9 TP 04AMI 20 2YCX 40 60 802YCW 100

0 20 40 60 80 100 0 20406080100 1.2 EF1% Antagonists 0.1 0.5 1.0 5.0 10.0 50.0 FP rate (%) 100.0

Fig. 7.5 Overall enrichment at 1 % false positive rate molecules (de Graaf and Rognan 2008)(a) using IFP (FP-rate) for the retrieval of 25 partial/full agonists (de scoring (see Fig. 7.4c). Full ROC curves visualizing the Graaf and Rognan 2008; Baker 2010) and 24 inverse retrieval rate (TP-rate) of agonists (red) and antagonists agonists/antagonists of ADRB1/ADRB2 (de Graaf and (blue) in the best antagonist crystal structure (b) and best Rognan 2008; Baker 2005) over a set of 980 decoy agonist crystal structure (c,legendshowninb)

The increasing number of GPCR crystal struc- structure to kinetic properties, thereby changing tures (Katritch et al. 2013) has brought a wealth the focus from solely affinity-based optimizations of detailed structural insights into the molecu- to optimization of kinetic properties (Tresadern lar recognition mechanisms of (small) ligands. et al. 2011; Miller et al. 2012). As discussed in This has opened up new possibilities for (high Sect. 7.5 the (relative) plethora of crystal struc- resolution) homology modeling as they facili- tures of the adrenergic receptors including the tate better templates and insights in regions with first (biased) agonist bound (active and inactive) relatively high structural variability (e.g., extra- GPCR crystal structures show how subtle differ- cellular loops) as well as the structural features ences in the binding pocket can accommodate ag- of sequence-specific motifs (e.g. the helical kink onist binding, but also the molecular mechanisms induced by the TXP motif in CXCR4 (Wu et al. of G-protein binding and activation (Nygaard 2010)). The new GPCR crystal structures in com- et al. 2013). A better understanding of the ligand bination with molecular dynamics simulations binding modes (Katritch and Abagyan 2011), give insights into receptor flexibility and (po- binding pockets (Rosenkilde et al. 2010), and tential) ligand access and exit channels (Kruse conformational changes (Liu et al. 2012; Bokoch et al. 2012; Zhang et al. 2012;Droretal.2011). et al. 2010) associated with (specific) signaling The association and dissociation pathways re- pathways will allow the development of selective vealed by computational simulations in combina- SBVS strategies for agonists over antagonists tion with experimental studies (e.g. mutagenesis (or vice versa) or ultimately even for biased data and biophysical measurements (Congreve ligands. et al. 2012)) can in time be used to relate ligand 152 A.J. Kooistra et al.

tion of the extracellular surface of a G-protein-coupled References receptor. Nature 463(7277):108Ð112. doi:10.1038/na- ture08650 Andrews SP, Benjamin T (2013) Stabilised G protein- Carlsson J, Yoo L, Gao ZG, Irwin JJ, Shoichet BK, Ja- coupled receptors in structure-based drug design: a cobson KA (2010) Structure-based discovery of A2A case study with adenosine A2A receptor. MedChem- adenosine receptor ligands. J Med Chem 53(9):3748Ð Comm 4(1):52Ð67. doi:10.1039/c2md20164j 3755. doi:10.1021/jm100240h Baell JB, Holloway GA (2010) New substructure filters Carlsson J, Coleman RG, Setola V, Irwin JJ, Fan H, for removal of pan assay interference compounds Schlessinger A, Sali A, Roth BL, Shoichet BK (2011) (PAINS) from screening libraries and for their ex- Ligand discovery from a dopamine D3 receptor ho- clusion in bioassays. J Med Chem 53(7):2719Ð2740. mology model and crystal structure. Nat Chem Biol doi:10.1021/jm901137j 7(11):769Ð778. doi:10.1038/nchembio.662 Baker JG (2005) The selectivity of beta-adrenoceptor Cavasotto CN (2011) Homology models in antagonists at the human beta1, beta2 and beta3 docking and high-throughput docking. adrenoceptors. Br J Pharmacol 144(3):317Ð322. Curr Top Med Chem 11(12):1528Ð1534. doi:10.1038/sj.bjp.0706048 doi:10.2174/156802611795860951 Baker JG (2010) The selectivity of beta-adrenoceptor Cavasotto CN, Orry AJ, Murgolo NJ, Czarniecki MF, agonists at human beta1-, beta2- and beta3- Kocsi SA, Hawes BE, O’Neill KA, Hine H, Burton adrenoceptors. Br J Pharmacol 160(5):1048Ð1061. MS, Voigt JH, Abagyan RA, Bayne ML, Monsma FJ Jr doi:10.1111/j.1476-5381.2010.00754.x (2008) Discovery of novel chemotypes to a G-protein- Ballesteros JA, Weinstein H (1995) Integrated methods coupled receptor through ligand-steered homology for the construction of three-dimensional models and modeling and structure-based virtual screening. J Med computational probing of structure-function relations Chem 51(3):581Ð588. doi:10.1021/jm070759m of G protein-coupled receptors. Methods Neurosci Chen JZ, Wang J, Xie XQ (2007) GPCR structure- 25:366Ð428. doi:10.1016/S1043-9471(05)80049-7 based virtual screening approach for CB2 antago- Barillari C, Marcou G, Rognan D (2008) Hot-spots- nist search. J Chem Inf Model 47(4):1626Ð1637. guided receptor-based pharmacophores (HS-Pharm): doi:10.1021/ci7000814 a knowledge-based approach to identify ligand- Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen anchoring atoms in protein cavities and prioritize SG, Thian FS, Kobilka TS, Choi HJ, Kuhn P, Weis WI, structure-based pharmacophores. J Chem Inf Model Kobilka BK, Stevens RC (2007) High-resolution crys- 48(7):1396Ð1410. doi:10.1021/ci800064z tal structure of an engineered human beta2-adrenergic Becker OM, Marantz Y, Shacham S, Inbal B, Heifetz G protein-coupled receptor. Science 318(5854):1258Ð A, Kalid O, Bar-Haim S, Warshaviak D, Fichman 1265. doi:10.1126/science.1150577 M, Noiman S (2004) G protein-coupled receptors: in Cherezov V, Abola E, Stevens RC (2010) Recent progress silico drug discovery in 3D. Proc Natl Acad Sci U S A in the structure determination of GPCRs, a mem- 101(31):11304Ð11309. doi:10.1073/pnas.0401862101 brane protein family with high potential as phar- Becker OM, Dhanoa DS, Marantz Y, Chen D, Shacham maceutical targets. Methods Mol Biol 654:141Ð168. S, Cheruku S, Heifetz A, Mohanty P, Fichman M, doi:10.1007/978-1-60761-762-4_8 Sharadendu A, Nudelman R, Kauffman M, Noiman S ChienEY,LiuW,ZhaoQ,KatritchV,HanGW,Hanson (2006) An integrated in silico 3D model-driven discov- MA, Shi L, Newman AH, Javitch JA, Cherezov V, ery of a novel, potent, and selective amidosulfonamide Stevens RC (2010) Structure of the human dopamine 5-HT1A agonist (PRX-00023) for the treatment of D3 receptor in complex with a D2/D3 selective antag- anxiety and depression. J Med Chem 49(11):3116Ð onist. Science 330(6007):1091Ð1095. doi:10.1126/sci- 3135. doi:10.1021/jm0508641 ence.1197410 Bissantz C, Bernard P, Hibert M, Rognan D (2003) Congreve M, Langmead CJ, Mason JS, Marshall FH Protein-based virtual screening of chemical databases. (2011) Progress in structure based drug design for G II. Are homology models of G-protein coupled protein-coupled receptors. J Med Chem 54(13):4283Ð receptors suitable targets? Proteins 50(1):5Ð25. 4311. doi:10.1021/jm200371q doi:10.1002/prot.10237 Congreve M, Andrews SP, Dore AS, Hollenstein K, Hur- Blättermann S, Peters L, Ottersbach PA, Bock A, Konya rell E, Langmead CJ, Mason JS, Ng IW, Tehan B, V, Weaver CD, Gonzalez A, Schroder R, Tyagi R, Zhukov A, Weir M, Marshall FH (2012) Discovery of Luschnig P, Gab J, Hennen S, Ulven T, Pardo L, Mohr 1,2,4-triazine derivatives as adenosine A(2A) antago- K, Gutschow M, Heinemann A, Kostenis E (2012) A nists using structure based drug design. J Med Chem biased ligand for OXE-R uncouples G alpha and G 55(5):1898Ð1903. doi:10.1021/jm201376w beta gamma signaling within a heterotrimer. Nat Chem Costanzi S, Vilar S (2012) In silico screening for agonists Biol 8(7):631Ð638. doi:10.1038/nchembio.962 and blockers of the beta(2) adrenergic receptor: im- Bokoch MP, Zou Y, Rasmussen SG, Liu CW, Nygaard plications of inactive and activated state structures. J R, Rosenbaum DM, Fung JJ, Choi HJ, Thian FS, Comput Chem 33(5):561Ð572. doi:10.1002/jcc.22893 Kobilka TS, Puglisi JD, Weis WI, Pardo L, Prosser RS, Costanzi S, Santhosh Kumar T, Balasubramanian R, Mueller L, Kobilka BK (2010) Ligand-specific regula- Kendall Harden T, Jacobson KA (2012) Virtual screen- 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 153

ing leads to the discovery of novel non-nucleotide ful virtual screening for antagonists of the alpha1A P2Y(1) receptor antagonists. Bioorg Med Chem adrenergic receptor. J Med Chem 48(4):1088Ð1097. 20(17):5254Ð5261. doi:10.1016/j.bmc.2012.06.044 doi:10.1021/jm0491804 de Graaf C, Rognan D (2008) Selective structure-based Evers A, Klebe G (2004) Successful virtual screening for a virtual screening for full and partial agonists of the submicromolar antagonist of the neurokinin-1 receptor beta2 adrenergic receptor. J Med Chem 51(16):4978Ð based on a ligand-supported homology model. J Med 4985. doi:10.1021/jm800710x Chem 47(22):5381Ð5392. doi:10.1021/jm0311487 de Graaf C, Rognan D (2009) Customizing G protein- Evers A, Hessler G, Matter H, Klabunde T (2005) Virtual coupled receptor models for structure-based vir- screening of biogenic amine-binding G-protein cou- tual screening. Curr Pharm Des 15(35):4026Ð4048. pled receptors: comparative evaluation of protein- and doi:10.2174/138161209789824786 ligand-based virtual screening protocols. J Med Chem de Graaf C, Foata N, Engkvist O, Rognan D (2008) 48(17):5448Ð5465. doi:10.1021/jm050090o Molecular modeling of the second extracellular loop Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies of G-protein coupled receptors and its implication on M, Hersey A, Light Y, McGlinchey S, Michalovich structure-based virtual screening. Proteins 71(2):599Ð D, Al-Lazikani B, Overington JP (2012) ChEMBL: 620. doi:10.1002/prot.21724 a large-scale bioactivity database for drug discovery. de Graaf C, Rein C, Piwnica D, Giordanetto F, Rog- Nucleic Acids Res 40(Database issue):D1100ÐD1107. nan D (2011a) Structure-based discovery of al- doi:10.1093/nar/gkr777 losteric modulators of two related class B G-protein- Gloriam DE, Wellendorph P, Johansen LD, Thomsen AR, coupled receptors. ChemMedChem 6(12):2159Ð2169. Phonekeo K, Pedersen DS, Brauner-Osborne H (2011) doi:10.1002/cmdc.201100317 Chemogenomic discovery of allosteric antagonists at de Graaf C, Kooistra AJ, Vischer HF, Katritch V, Kuijer the GPRC6A receptor. Chem Biol 18(11):1489Ð1498. M, Shiroishi M, Iwata S, Shimamura T, Stevens RC, de doi:10.1016/j.chembiol.2011.09.012 Esch IJ, Leurs R (2011b) Crystal structure-based vir- Granier S, Manglik A, Kruse AC, Kobilka TS, Thian tual screening for fragment-like ligands of the human FS, Weis WI, Kobilka BK (2012) Structure of the histamine H(1) receptor. J Med Chem 54(23):8195Ð delta-opioid receptor bound to naltrindole. Nature 8206. doi:10.1021/jm2011589 485(7398):400Ð404. doi:10.1038/nature11111 de Graaf C, Vischer HF, de Kloe GE, Kooistra AJ, Ni- Haga K, Kruse AC, Asada H, Yurugi-Kobayashi T, Shi- jmeijer S, Kuijer M, Verheij MH, England PJ, van roishi M, Zhang C, Weis WI, Okada T, Kobilka BK, Muijlwijk-Koezen JE, Leurs R, de Esch IJ (2013) Haga T, Kobayashi T (2012) Structure of the hu- Small and colorful stones make beautiful mosaics: man M2 muscarinic acetylcholine receptor bound to fragment-based chemogenomics. Drug Discov Today an antagonist. Nature 482:547Ð551. doi:10.1038/na- 18(7Ð8):323Ð330. doi:10.1016/j.drudis.2012.12.003 ture10753 de Kloe GE, Bailey D, Leurs R, de Esch IJ (2009) Trans- Hanson MA, Cherezov V, Griffith MT, Roth CB, Jaakola forming fragments into candidates: small becomes big VP, Chien EY, Velasquez J, Kuhn P, Stevens RC in medicinal chemistry. Drug Discov Today 14(13Ð (2008) A specific cholesterol binding site is es- 14):630Ð646. doi:10.1016/j.drudis.2009.03.009 tablished by the 2.8 A structure of the human Deng Z, Chuaqui C, Singh J (2004) Structural interac- beta2-adrenergic receptor. Structure 16(6):897Ð905. tion fingerprint (SIFt): a novel method for analyzing doi:10.1016/j.str.2008.05.001 three-dimensional protein-ligand binding interactions. Hanson MA, Roth CB, Jo E, Griffith MT, Scott FL, J Med Chem 47(2):337Ð344. doi:10.1021/jm030331x Reinhart G, Desale H, Clemons B, Cahalan SM, Dror RO, Pan AC, Arlow DH, Borhani DW, Maragakis P, Schuerer SC, Sanna MG, Han GW, Kuhn P, Rosen Shan Y, Xu H, Shaw DE (2011) Pathway and mecha- H, Stevens RC (2012) Crystal structure of a lipid G nism of drug binding to G-protein-coupled receptors. protein-coupled receptor. Science 335(6070):851Ð855. Proc Natl Acad Sci U S A 108(32):13118Ð13123. doi:10.1126/science.1215904 doi:10.1073/pnas.1104614108 Hopkins AL, Groom CR, Alex A (2004) Edwards BS, Bologa C, Young SM, Balakin KV, Prossnitz Ligand efficiency: a useful metric for lead ER, Savchuck NP, Sklar LA, Oprea TI (2005) Inte- selection. Drug Discov Today 9(10):430Ð431. gration of virtual screening with high-throughput flow doi:10.1016/S1359-6446(04)03069-7 cytometry to identify novel small molecule formylpep- Istyastono EP (2012) Computational studies of histamine tide receptor antagonists. Mol Pharmacol 68(5):1301Ð H4 receptor-ligand interactions. VU University Ams- 1310. doi:10.1124/mol.105.014068 terdam, Amsterdam. ISBN 978-90-8570-994-7 Engel S, Skoumbourdis AP, Childress J, Neumann S, Jaakola VP, Griffith MT, Hanson MA, Cherezov V, Deschamps JR, Thomas CJ, Colson AO, Costanzi S, Chien EY, Lane JR, Ijzerman AP, Stevens RC (2008) Gershengorn MC (2008) A virtual screen for diverse The 2.6 angstrom crystal structure of a human A2A ligands: discovery of selective G protein-coupled re- adenosine receptor bound to an antagonist. Science ceptor antagonists. J Am Chem Soc 130(15):5115Ð 322(5905):1211Ð1217. doi:10.1126/science.1164772 5123. doi:10.1021/ja077620l Jaakola VP, Lane JR, Lin JY, Katritch V, Ijzerman AP, Evers A, Klabunde T (2005) Structure-based drug dis- Stevens RC (2010) Ligand binding and subtype se- covery using GPCR homology modeling: success- lectivity of the human A(2A) adenosine receptor: 154 A.J. Kooistra et al.

identification and characterization of essential amino G-protein-coupled receptors and their application in acid residues. J Biol Chem 285(17):13032Ð13044. virtual screening. J Med Chem 52(9):2923Ð2932. doi:10.1074/jbc.M109.096974 doi:10.1021/jm9001346 Katritch V, Abagyan R (2011) GPCR agonist binding KnoxC,LawV,JewisonT,LiuP,LyS,FrolkisA,Pon revealed by modeling and crystallography. A, Banco K, Mak C, Neveu V, Djoumbou Y, Eisner Trends Pharmacol Sci 32(11):637Ð643. R, Guo AC, Wishart DS (2011) DrugBank 3.0: a doi:10.1016/j.tips.2011.08.001 comprehensive resource for ‘omics’ research on drugs. Katritch V, Rueda M, Lam PC, Yeager M, Abagyan Nucleic Acids Res 39(Database issue):D1035ÐD1041. R (2010a) GPCR 3D homology models for ligand doi:10.1093/nar/gkq1126 screening: lessons learned from blind predictions of Kolb P, Rosenbaum DM, Irwin JJ, Fung JJ, Kobilka adenosine A2a receptor complex. Proteins 78(1):197Ð BK, Shoichet BK (2009) Structure-based 211. doi:10.1002/prot.22507 discovery of beta2-adrenergic receptor ligands. Katritch V, Jaakola VP, Lane JR, Lin J, Ijzer- Proc Natl Acad Sci U S A 106(16):6843Ð6848. man AP, Yeager M, Kufareva I, Stevens RC, doi:10.1073/pnas.0812657106 Abagyan R (2010b) Structure-based discovery of Kolb P, Phan K, Gao ZG, Marko AC, Sali A, Jacobson novel chemotypes for adenosine A(2A) receptor KA (2012) Limits of ligand selectivity from dock- antagonists. J Med Chem 53(4):1799Ð1809. doi: ing to models: in silico screening for A(1) adeno- 10.1021/jm901647p sine receptor antagonists. PLoS One 7(11):e49910. Katritch V, Cherezov V, Stevens RC (2012) Diver- doi:10.1371/journal.pone.0049910 sity and modularity of G protein-coupled recep- Kooistra AJ, Roumen L, Leurs R, de Esch IJP, de tor structures. Trends Pharmacol Sci 33(1):17Ð27. Graaf C (2013) From heptahelical bundle to hits doi:10.1016/j.tips.2011.09.003 from the Haystack: structure-based virtual screening Katritch V, Cherezov V, Stevens RC (2013) Structure- for GPCR ligands. Methods Enzymol 522:279Ð336. function of the g protein-coupled receptor super- doi:10.1016/B978-0-12-407865-9.00015-7 family. Annu Rev Pharmacol Toxicol 53:531Ð556. Kruse AC, Hu J, Pan AC, Arlow DH, Rosenbaum doi:10.1146/annurev-pharmtox-032112-135923 DM, Rosemond E, Green HF, Liu T, Chae PS, Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin DrorRO,ShawDE,WeisWI,WessJ,Kobilka JJ, Shoichet BK (2007) Relating protein pharmacology BK (2012) Structure and dynamics of the M3 mus- by ligand chemistry. Nat Biotechnol 25(2):197Ð206. carinic acetylcholine receptor. Nature 482:552Ð556. doi:10.1038/nbt1284 doi:10.1038/nature10753 Kellenberger E, Springael JY, Parmentier M, Hachet- Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan Haas M, Galzi JL, Rognan D (2007) Identification R (2011) Participants, G.D.: status of GPCR modeling of nonpeptide CCR5 receptor agonists by structure- and docking as reflected by community-wide GPCR based virtual screening. J Med Chem 50(6):1294Ð dock 2010 assessment. Structure 19(8):1108Ð1126. 1303. doi:10.1021/jm061389p doi:10.1016/j.str.2011.05.012 Kim J, Wess J, van Rhee AM, Schoneberg T, Jacobson Lagerstrom MC, Schioth HB (2008) Structural diversity KA (1995) Site-directed mutagenesis identifies of G protein-coupled receptors and significance for residues involved in ligand recognition in the drug discovery. Nat Rev Drug Discov 7(4):339Ð357. human A2a adenosine receptor. J Biol Chem doi:10.1038/nrd2518 270(23):13987Ð13997. doi:10.1074/jbc.270.23. Langmead CJ, Andrews SP, Congreve M, Errey JC, 13987 Hurrell E, Marshall FH, Mason JS, Richardson CM, Kim J, Yip ML, Shen X, Li H, Hsin LY, Labarge S, Robertson N, Zhukov A, Weir M (2012) Identifi- Heinrich EL, Lee W, Lu J, Vaidehi N (2012) Iden- cation of novel adenosine A2A receptor antagonists tification of anti-malarial compounds as novel antag- by virtual screening. J Med Chem 55(5):1904Ð1909. onists to chemokine receptor CXCR4 in pancreatic doi:10.1021/jm201455y cancer cells. PLoS One 7(2):e31004. doi:10.1371/jour- Leach AR, Hann MM (2011) Molecular complexity nal.pone.0031004 and fragment-based drug discovery: ten years Kiss R, Kiss B, Konczol A, Szalai F, Jelinek I, Laszlo V, on. Curr Opin Chem Biol 15(4):489Ð496. Noszal B, Falus A, Keseru GM (2008) Discovery of doi:10.1016/j.cbpa.2011.05.008 novel human ligands by large- Lin X, Huang XP, Chen G, Whaley R, Peng S, Wang scale structure-based virtual screening. J Med Chem Y, Zhang G, Wang SX, Wang S, Roth BL, Huang 51(11):3145Ð3153. doi:10.1021/jm7014777 N (2012) Life beyond kinases: structure-based dis- Kiss GN, Fells JI, Gupte R, Lee SC, Liu J, Nusser N, covery of sorafenib as nanomolar antagonist of Lim KG, Ray RM, Lin FT, Parrill AL, Sumegi B, 5-HT receptors. J Med Chem 55(12):5749Ð5759. Miller DD, Tigyi G (2012) Virtual screening for LPA2- doi:10.1021/jm300338m specific agonists identifies a nonlipid compound with Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (2001) antiapoptotic actions. Mol Pharmacol 82(6):1162Ð Experimental and computational approaches to esti- 1173. doi:10.1124/mol.112.079699 mate solubility and permeability in drug discovery and Klabunde T, Giegerich C, Evers A (2009) Sequence- development settings. Adv Drug Deliv Rev 46(1Ð3):3Ð derived three-dimensional pharmacophore models for 26. doi:10.1016/S0169-409X(00)00129-0 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 155

Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK (2007) receptor. Proc Natl Acad Sci U S A 108(20):8228Ð BindingDB: a web-accessible database of experi- 8232. doi:10.1073/pnas.1100185108 mentally determined protein-ligand binding affinities. Moura Barbosa AJ, Del Rio A (2012) Freely accessi- Nucleic Acids Res 35(Database issue):D198ÐD201. ble databases of commercial compounds for high- doi:10.1093/nar/gkl999 throughput virtual screenings. Curr Top Med Chem Liu Y, Zhou E, Yu K, Zhu J, Zhang Y, Xie X, Li J, 12(8):866Ð877. doi:10.2174/156802612800166710 Jiang H (2008) Discovery of a novel CCR5 antagonist Muegge I (2003) Selection criteria for drug-like lead compound through fragment assembly. Molecules compounds. Med Res Rev 23(3):302Ð321. 13(10):2426Ð2441. doi:10.3390/molecules13102426 doi:10.1002/med.10041 Liu JJ, Horst R, Katritch V, Stevens RC, Wuthrich K Mysinger MM, Weiss DR, Ziarek JJ, Gravel S, Doak (2012) Biased signaling pathways in beta2-adrenergic AK, Karpiak J, Heveker N, Shoichet BK, Volk- receptor characterized by 19F-NMR. Science man BF (2012) Structure-based ligand discovery for 335(6072):1106Ð1110. doi:10.1126/science.1215802 the protein-protein interface of chemokine receptor Loving K, Alberts I, Sherman W (2010) Com- CXCR4. Proc Natl Acad Sci U S A 109(14):5517Ð putational approaches for fragment-based and de 5522. doi:10.1073/pnas.1120431109 novo design. Curr Top Med Chem 10(1):14Ð32. Nygaard R, Zou Y, Dror RO, Mildorf TJ, Arlow DH, Man- doi:10.2174/156802610790232305 glik A, Pan AC, Liu CW, Fung JJ, Bokoch MP, Thian Malherbe P, Kratochwil N, Muhlemann A, Zenner FS, Kobilka TS, Shaw DE, Mueller L, Prosser RS, MT, Fischer C, Stahl M, Gerber PR, Jaeschke Kobilka BK (2013) The dynamic process of beta(2)- G, Porter RH (2006) Comparison of the bind- adrenergic receptor activation. Cell 152(3):532Ð542. ing pockets of two chemically unrelated allosteric doi:10.1016/j.cell.2013.01.008 antagonists of the mGlu5 receptor and identifica- Olah H, Rad R, Ostopovici L, Bora A, Hadaruga tion of crucial residues involved in the inverse N, Hadaruga D, Moldovan R, Fulias A, Mracec agonism of MPEP. J Neurochem 98(2):601Ð615. M, Oprea TI (2007) WOMBAT and WOMBAT- doi:10.1111/j.1471-4159.2006.03886.x PK: bioactivity databases for lead and drug dis- Manglik A, Kruse AC, Kobilka TS, Thian FS, Mathiesen covery. In: Schreiber SL, Kapoor T, Wess G (eds) JM, Sunahara RK, Pardo L, Weis WI, Kobilka BK, Chemical biology: from small molecules to systems Granier S (2012) Crystal structure of the micro-opioid biology and drug design. Wiley-VCH, New York, receptor bound to a morphinan antagonist. Nature pp 760Ð786. doi:10.1002/9783527619375.ch13b 485(7398):321Ð326. doi:10.1038/nature10954 Palczewski K, Kumasaka T, Hori T, Behnke CA, Mo- Marcou G, Rognan D (2007) Optimizing fragment toshima H, Fox BA, Le Trong I, Teller DC, Okada and scaffold docking by use of molecular interac- T, Stenkamp RE, Yamamoto M, Miyano M (2000) tion fingerprints. J Chem Inf Model 47(1):195Ð207. Crystal structure of rhodopsin: a G protein-coupled re- doi:10.1021/ci600342e ceptor. Science 289(5480):739Ð745. doi:10.1126/sci- Michino M, Abola E, Participants GD, Brooks CL 3rd, ence.289.5480.739 Dixon JS, Moult J, Stevens RC (2009) Community- Petrel C, Kessler A, Dauban P, Dodd RH, Rognan D, Ruat wide assessment of GPCR structure modelling and M (2004) Positive and negative allosteric modulators ligand docking: GPCR dock 2008. Nat Rev Drug of the Ca2C-sensing receptor interact within overlap- Discov 8(6):455Ð463. doi:10.1038/nrd2877 ping but not identical binding sites in the transmem- Miller LJ, Chen Q, Lam PC, Pinon DI, Sexton brane domain. J Biol Chem 279(18):18990Ð18997. PM, Abagyan R, Dong M (2011) Refinement of doi:10.1074/jbc.M400724200 glucagon-like peptide 1 docking to its intact recep- Rasmussen SG, Choi HJ, Rosenbaum DM, Kobilka TS, tor using mid-region photolabile probes and molec- Thian FS, Edwards PC, Burghammer M, Ratnala VR, ular modeling. J Biol Chem 286(18):15895Ð15907. Sanishvili R, Fischetti RF, Schertler GF, Weis WI, doi:10.1074/jbc.M110.217901 Kobilka BK (2007) Crystal structure of the human Miller DC, Lunn G, Jones P, Sabnis Y, Davies NL, beta2 adrenergic G-protein-coupled receptor. Nature Driscoll P (2012) Investigation of the effect of molec- 450(7168):383Ð387. doi:10.1038/nature06325 ular properties on the binding kinetics of a ligand Rasmussen SG, Choi HJ, Fung JJ, Pardon E, Casarosa P, to its biological target. MedChemComm 3:449Ð452. Chae PS, Devree BT, Rosenbaum DM, Thian FS, Ko- doi:10.1021/ci200088d bilka TS, Schnapp A, Konetzki I, Sunahara RK, Gell- Moitessier N, Englebienne P, Lee D, Lawandi J, Corbeil man SH, Pautsch A, Steyaert J, Weis WI, Kobilka BK CR (2008) Towards the development of universal, fast (2011a) Structure of a nanobody-stabilized active state and highly accurate docking/scoring methods: a long of the beta(2) adrenoceptor. Nature 469(7329):175Ð way to go. Br J Pharmacol 153(Suppl 1):S7ÐS26. 180. doi:10.1038/nature09648 doi:10.1038/sj.bjp.0707515 Rasmussen SG, DeVree BT, Zou Y, Kruse AC, Chung KY, Moukhametzianov R, Warne T, Edwards PC, Serrano- Kobilka TS, Thian FS, Chae PS, Pardon E, Calinski Vega MJ, Leslie AG, Tate CG, Schertler GF (2011) D, Mathiesen JM, Shah ST, Lyons JA, Caffrey M, Two distinct conformations of helix 6 observed Gellman SH, Steyaert J, Skiniotis G, Weis WI, Suna- in antagonist-bound structures of a beta1-adrenergic hara RK, Kobilka BK (2011b) Crystal structure of the 156 A.J. Kooistra et al.

beta2 adrenergic receptor-Gs protein complex. Nature perspective: the benefits and challenges of protein 477(7366):549Ð555. doi:10.1038/nature10361 structure-based pharmacophore modeling. Med Chem Renault N, Laurent X, Farce A, El Bakali J, Mansouri Comm 3(1):28Ð38. doi:10.1021/ci200088d R, Gervois P, Millet R, Desreumaux P, Furman C, Shimamura T, Shiroishi M, Weyand S, Tsujimoto H, Chavatte P (2012) Virtual screening of CB(2) re- Winter G, Katritch V, Abagyan R, Cherezov V, Liu ceptor agonists from Bayesian network and high- W, Han GW, Kobayashi T, Stevens RC, Iwata S throughput docking: structural insights into agonist- (2011) Structure of the human histamine H1 recep- modulated GPCR features. Chem Biol Drug Des. tor complex with doxepin. Nature 475(7354):65Ð70. doi:10.1111/cbdd.12095 doi:10.1038/nature10236 Rodriguez D, Gutierrez-de-Teran H (2013) Sirci F, Istyastono EP, Vischer HF, Kooistra AJ, Nijmei- Computational approaches for ligand discovery jer S, Kuijer M, Wijtmans M, Mannhold R, Leurs and design in class-A G protein-coupled R, de Esch IJ, de Graaf C (2012) Virtual fragment receptors. Curr Pharm Des 19(12):2216Ð2236. screening: discovery of histamine h(3) receptor lig- doi:10.2174/1381612811319120009 ands using ligand-based and protein-based molecular Rosenbaum DM, Cherezov V, Hanson MA, Ras- fingerprints. J Chem Inf Model 52(12):3308Ð3324. mussen SG, Thian FS, Kobilka TS, Choi HJ, Yao doi:10.1021/ci3004094 XJ, Weis WI, Stevens RC, Kobilka BK (2007) Stevens RC, Cherezov V, Katritch V, Abagyan R, Kuhn GPCR engineering yields high-resolution structural P, Rosen H, Wuthrich K (2012) The GPCR net- insights into beta2-adrenergic receptor function. work: a large-scale collaboration to determine human Science 318(5854):1266Ð1273. doi:10.1126/science. GPCR structure and function. Nat Rev Drug Discov 1150609 12(1):25Ð34. doi:10.1038/nrd3859 Rosenbaum DM, Zhang C, Lyons JA, Holl R, Aragao Sum CS, Tikhonova IG, Neumann S, Engel S, Raaka BM, D, Arlow DH, Rasmussen SG, Choi HJ, Devree Costanzi S, Gershengorn MC (2007) Identification of BT, Sunahara RK, Chae PS, Gellman SH, Dror RO, residues important for agonist recognition and activa- Shaw DE, Weis WI, Caffrey M, Gmeiner P, Ko- tion in GPR40. J Biol Chem 282(40):29248Ð29255. bilka BK (2011) Structure and function of an irre- doi:10.1074/jbc.M705077200 versible agonist-beta(2) adrenoceptor complex. Nature Surgand JS, Rodrigo J, Kellenberger E, Rognan 469(7329):236Ð240. doi:10.1038/nature09665 D (2006) A chemogenomic analysis of the Rosenkilde MM, Benned-Jensen T, Frimurer transmembrane binding cavity of human G- TM, Schwartz TW (2010) The minor binding protein-coupled receptors. Proteins 62(2):509Ð538. pocket: a major player in 7TM receptor doi:10.1002/prot.20768 activation. Trends Pharmacol Sci 31(12):567Ð574. Tang H, Wang XS, Hsieh JH, Tropsha A (2012) Do crystal doi:10.1016/j.tips.2010.08.006 structures obviate the need for theoretical models of Roumen L, Sanders MP, Vroling B, de Esch IJ, de Vlieg GPCRs for structure based virtual screening. Proteins. J, Leurs R, Klomp JP, Nabuurs SB, de Graaf C doi:10.1002/prot.24035 (2011) In silico veritas: the pitfalls and challenges of Tautermann CS (2011) The use of G-protein coupled re- predicting GPCR-ligand interactions. Pharmaceuticals ceptor models in lead optimization. Future Med Chem 4(9):1196Ð1215. doi:10.1021/ci200088d 3(6):709Ð721. doi:10.4155/fmc.11.24 Sabio M, Jones K, Topiol S (2008) Use of the X- Thompson AA, Liu W, Chun E, Katritch V, Wu H, ray structure of the beta2-adrenergic receptor for Vardy E, Huang XP, Trapella C, Guerrini R, Calo G, drug discovery. Part 2: identification of active com- Roth BL, Cherezov V, Stevens RC (2012) Structure pounds. Bioorg Med Chem Lett 18(20):5391Ð5395. of the nociceptin/orphanin FQ receptor in complex doi:10.1016/j.bmcl.2008.09.046 with a peptide mimetic. Nature 485(7398):395Ð399. Salo OM, Raitio KH, Savinainen JR, Nevalainen T, doi:10.1038/nature11085 Lahtela-Kakkonen M, Laitinen JT, Jarvinen T, Poso Tikhonova IG, Sum CS, Neumann S, Engel S, Raaka A (2005) Virtual screening of novel CB2 ligands BM, Costanzi S, Gershengorn MC (2008) Discovery using a comparative model of the human cannabi- of novel agonists and antagonists of the free fatty acid noid CB2 receptor. J Med Chem 48(23):7166Ð7171. receptor 1 (FFAR1) using virtual screening. J Med doi:10.1021/jm050565b Chem 51(3):625Ð633. doi:10.1021/jm7012425 Salon JA, Lodowski DT, Palczewski K (2011) The signif- Topiol S, Sabio M (2008) Use of the X-ray struc- icance of G protein-coupled receptor crystallography ture of the Beta2-adrenergic receptor for drug dis- for drug discovery. Pharmacol Rev 63(4):901Ð937. covery. Bioorg Med Chem Lett 18(5):1598Ð1602. doi:10.1124/pr.110.003350 doi:10.1016/j.bmcl.2008.01.063 Sanders MP, Verhoeven S, de Graaf C, Roumen L, Vrol- Tosh DK, Phan K, Gao ZG, Gakh AA, Xu F, De- ing B, Nabuurs SB, de Vlieg J, Klomp JP (2011) florian F, Abagyan R, Stevens RC, Jacobson KA, Snooker: a structure-based pharmacophore generation Katritch V (2012) Optimization of adenosine 50- tool applied to class A GPCRs. J Chem Inf Model carboxamide derivatives as adenosine receptor ago- 51(9):2277Ð2292. doi:10.1021/ci200088d nists using structure-based ligand design and frag- Sanders MP, McGuire R, Roumen L, de Esch IJ, de Vlieg ment screening. J Med Chem 55(9):4297Ð4308. J, Klomp JP, de Graaf C (2012) From the protein’s doi:10.1021/jm300095s 7 From Three-Dimensional GPCR Structure to Rational Ligand Discovery 157

Tresadern G, Bartolome JM, Macdonald GJ, Langlois X The structural basis for agonist and partial ago- (2011) Molecular properties affecting fast dissociation nist action on a beta(1)-adrenergic receptor. Nature from the D2 receptor. Bioorg Med Chem 19(7):2231Ð 469(7329):241Ð244. doi:10.1038/nature09746 2241. doi:10.1016/j.bmc.2011.02.033 Warne T, Edwards PC, Leslie AG, Tate CG (2012) Crystal Triballeau N, Van Name E, Laslier G, Cai D, Pail- structures of a stabilized beta1-adrenoceptor bound to lard G, Sorensen PW, Hoffmann R, Bertrand HO, the biased agonists bucindolol and carvedilol. Struc- Ngai J, Acher FC (2008) High-potency olfac- ture 20(5):841Ð849. doi:10.1016/j.str.2012.03.014 tory receptor agonists discovered by virtual high- White JF, Noinaj N, Shibata Y, Love J, Kloss B, Xu throughput screening: molecular probes for receptor F, Gvozdenovic-Jeremic J, Shah P, Shiloach J, Tate structure and olfactory function. Neuron 60(5):767Ð CG, Grisshammer R (2012) Structure of the agonist- 774. doi:10.1016/j.neuron.2008.11.014 bound neurotensin receptor. Nature 490(7421):508Ð van der Horst E, Okuno Y, Bender A, IJzerman AP 513. doi:10.1038/nature11558 (2009) Substructure mining of GPCR ligands reveals Wu B, Chien EY, Mol CD, Fenalti G, Liu W, Ka- activity-class specific functional groups in an un- tritch V, Abagyan R, Brooun A, Wells P, Bi FC, biased manner. J Chem Inf Model 49(2):348Ð360. Hamel DJ, Kuhn P, Handel TM, Cherezov V, Stevens doi:10.1021/ci8003896 RC (2010) Structures of the CXCR4 chemokine Varady J, Wu X, Fang X, Min J, Hu Z, Levant B, Wang S GPCR with small-molecule and cyclic peptide antago- (2003) Molecular modeling of the three-dimensional nists. Science 330(6007):1066Ð1071. doi:10.1126/sci- structure of dopamine 3 (D3) subtype receptor: dis- ence.1194396 covery of novel and potent D3 ligands through a Wu H, Wacker D, Mileni M, Katritch V, Han GW, hybrid pharmacophore- and structure-based database Vardy E, Liu W, Thompson AA, Huang XP, Carroll searching approach. J Med Chem 46(21):4377Ð4392. FI, Mascarella SW, Westkaemper RB, Mosier PD, doi:10.1021/jm030085p Roth BL, Cherezov V, Stevens RC (2012) Struc- Vohra S, Taddese B, Conner AC, Poyner DR, Hay ture of the human kappa-opioid receptor in com- DL, Barwell J, Reeves PJ, Upton GJ, Reynolds CA plex with JDTic. Nature 485(7398):327Ð332. doi: (2013) Similarity between class A and class B G- 10.1038/nature10939 protein-coupled receptors exemplified through calci- Yrjola S, Kalliokoski T, Laitinen T, Poso A, Parkkari T, tonin gene-related peptide receptor modelling and mu- Nevalainen T (2013) Discovery of novel cannabinoid tagenesis studies. J R Soc Interface 10(79):20120846. receptor ligands by a virtual screening approach: fur- doi:10.1098/rsif.2012.0846 ther development of 2,4,6-trisubstituted 1,3,5-triazines Wacker D, Fenalti G, Brown MA, Katritch V, Abagyan as CB2 agonists. Eur J Pharm Sci 48(1Ð2):9Ð20. R, Cherezov V, Stevens RC (2010) Conserved bind- doi:10.1016/j.ejps.2012.10.020 ing mode of human beta2 adrenergic receptor in- Zhang C, Srinivasan Y, Arlow DH, Fung JJ, Palmer verse agonists and antagonist revealed by X-ray crys- D, Zheng Y, Green HF, Pandey A, Dror RO, tallography. J Am Chem Soc 132(33):11443Ð11445. Shaw DE, Weis WI, Coughlin SR, Kobilka BK doi:10.1021/ja105108q (2012) High-resolution crystal structure of human Warne T, Serrano-Vega MJ, Baker JG, Moukhametzianov protease-activated receptor 1. Nature 492(7429):387Ð R, Edwards PC, Henderson R, Leslie AG, Tate CG, 392. doi:10.1038/nature11701 Schertler GF (2008) Structure of a beta1-adrenergic G- Zhukov A, Andrews SP, Errey JC, Robertson N, protein-coupled receptor. Nature 454(7203):486Ð491. Tehan B, Mason JS, Marshall FH, Weir M, Con- doi:10.1038/nature07101 greve M (2011) Biophysical mapping of the adeno- Warne T, Moukhametzianov R, Baker JG, Nehme R, sine A2A receptor. J Med Chem 54(13):4312Ð4323. Edwards PC, Leslie AG, Schertler GF,Tate CG (2011) doi:10.1021/jm2003798 Mathematical Modeling of G Protein-Coupled Receptor Function: 8 What Can We Learn from Empirical and Mechanistic Models?

David Roche, Debora Gil, and Jesús Giraldo

Abstract Empirical and mechanistic models differ in their approaches to the analysis of pharmacological effect. Whereas the parameters of the former are not physical constants those of the latter embody the nature, often complex, of biology. Empirical models are exclusively used for curve fitting, merely to characterize the shape of the E/[A] curves. Mechanistic models, on the contrary, enable the examination of mechanistic hypotheses by parameter simulation. Regretfully, the many parameters that mechanistic models may include can represent a great difficulty for curve fitting, representing, thus, a challenge for computational method development. In the present study some empirical and mechanistic models are shown and the connections, which may appear in a number of cases between them, are analyzed from the curves they yield. It may be concluded that systematic and careful curve shape analysis can be extremely useful for the understanding of receptor function, ligand classification and drug discovery, thus providing a common language for the communication between pharmacologists and medicinal chemists.

Keywords “-arrestin ¥ Biased agonism ¥ Curve fitting ¥ Empirical modeling ¥ Evo- lutionary algorithm ¥ Functional selectivity ¥ G protein ¥ GPCR ¥ Hill coefficient ¥ Intrinsic efficacy ¥ Inverse agonism ¥ Mathematical model- ing ¥ Mechanistic modeling ¥ Operational model ¥ Parameter optimiza- tion ¥ Receptor dimer ¥ Receptor oligomerization ¥ Receptor constitutive activity ¥ Signal transduction ¥ Two-state model

D. Roche ¥ J. Giraldo () Laboratory of Systems Pharmacology and Bioinformatics, Institut de Neurociències and Unitat de D. Gil Bioestadística, Universitat Autònoma de Barcelona, Computer Vision Center, Computer Science Department, 08193 Bellaterra, Spain Universitat Autònoma de Barcelona, 08193 Bellaterra, e-mail: [email protected] Spain

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 159 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__8, © Springer ScienceCBusiness Media Dordrecht 2014 160 D. Roche et al.

Yet, what do we understand by a mathematical Abbreviations model in pharmacology? A pharmacological ef- fect is mathematically described by an equation, AS3D asymmetric/symmetric three-state E D f([A]), relating the pharmacological effect, dimer receptor model E, to drug concentration, [A], through an analytic EA evolutionary algorithm function, f, which depends on some parameters. GPCR G protein-coupled receptor The definition of these model parameters deter- mines the approach we are using for modeling the experimental data. In this context, there are 8.1 Biological Reality two types of approaches: empirical and mecha- and the Model: Some nistic. Empirical approaches aim at describing the Introductory Words function profile and, thus, model parameters are set to empirically fit experimental data. Although Knowledge of biological systems can be gained very sensible from a mathematical point of view, by examining the physiological effects that they a main concern is that empirical parameters lack produce under particular experimental conditions of any physical interpretation. An alternative ap- (Fig. 8.1). These effects are the result of the trans- proach is to identify parameters conveying part duction by the system of the signal embodied in of the biological information the function is in- the molecular structure of a drug, where for drug tended to represent by using mechanistic models we mean, in a general sense, any substance able (Giraldo et al. 2002). to perturb a biological system in a concentration- Although this chapter is mostly devoted to dependent fashion. Because the true transduc- mechanistic models, a discussion on empirical tion function is generally unknown an estimate ones will be included to show that, apart from must be obtained. These functional estimates, the widely used Hill equation, other models can expressed as mathematical models, are useful for be of interest. Our discussion focuses on a com- the characterization of the biological system and parative analysis between empirical and mech- for the classification and discovery of new drugs. anistic models in order to identify parallelisms

x=[drug] DrugDrug BiologicalBiological System System

Drug: Substance A signal embodied in the drug structure is able to perturb a f(x)? biological system transducted into a leading to a physiological effect physiological effect PhysiologicalPhysiological Effect Effect

True f(x)is unknown An f(x) estimate must be obtained

• Characterization of the biological system

• Classification and discovering of new drugs

Fig. 8.1 Schematic representation of the drug action pro- estimate must be obtained from E/[A] experimental data. cess. The perturbation that the drug exerts on the bio- The f(x) estimate can be used for both characterization of logical system can, in theory, be described by a “true” the biological system and classification and design of new theoretical function ¥(x) of all the parameters present drugs in the system. Because this function is unknown an f(x) 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 161 and coincidences that might lead to new ideas for equal to zero. An important remark is that the debate and analysis. location of the point of inflection serves to assess the curve degree of symmetry. In particular, an E/x curve is symmetric if the point of inflection 8.2 What Can We Learn matches the mid-point, xI D x50, and asymmetric from Empirical Models? if it does not, xI ¤ x50 (Fig. 8.2). The above geometric features are obtained Customarily, drugs and receptors are compared after fitting the mathematical equation E D f([A]) and classified by monitoring the pharmacologi- to the experimental E/x plots. Among existing cal effect the receptor produces with increasing E D f(x) functions for data fitting, the Hill equa- [A]. The E/[A] data are commonly depicted in tion is, without doubt, the most widely used a semi-logarithmic scale, E/x with x D log[A], function in pharmacology. typically leading to sigmoid-shaped plots. To properly compare the experimental scatter plots, E D f(x) functions are fitted to the data points and 8.2.1 The Hill Equation: A Model for their parameters estimated. The parameter esti- Symmetric mates allow the quantification of the geometric Concentration-Effect Curves characteristics of the E/x curves. If we assume that basal response is absent (E D 0for[A]D 0), The Hill equation (Hill 1913) is the three- then E/x curves can be geometrically described parameter Eq. 8.1. by four quantities: the upper asymptote (maxi- mum response), the mid-point (curve location), a E D . / (8.1) the mid-point slope (steepness) and the inflection 1 C 10m xb x point (symmetry of the curve) (Fig. 8.2). Each of these quantities can be mathematically defined where x D log ŒA and m > 0, m being the Hill and have a geometric meaning but also a pharma- coefficient. cological interpretation. The upper asymptote is a, the mid-point is dE x50 D x , the mid-point slope is D The upper asymptote, top, reflects the efficacy b dx xDx50 amln 10 0:576 of the agonist-receptor system and is defined 4 D am (notice that for m D 1, rect- as the value towards the effect tends as [A] angular hyperbola, the mid-point slope is equal increases, Top D lim E. The mid-point, x50, to 0.576 after normalization by dividing by a) x!1 measures the agonist potency and is defined as and the point of inflection is xI D xb. Because the x for half the top. The mid-point slope is there is an identity between the inflection point the value of the slope of the E/x curve at the and the mid-point, the Hill equation produces mid-point, dE , and displays the sensitivity symmetric curves in all cases and, therefore, is dx xDx50 of the system to small changes in agonist con- not appropriate for fitting asymmetric E/x data. centration. Rectangular hyperbolic curves give a typical mid-point slope of 0.576 in the case they are normalized (the derivative is divided by 8.2.2 But What to Do top), while non-hyperbolic curves can be steep with Asymmetric ( dE > 0:576) or flat ( dE < 0:576). Concentration-Effect Curves? dx xDx50 dx xDx50 The point of inflection, xI, is a point on a curve at which the curvature changes from convex to There are several empirical models capable of concave or vice versa. For an E/x curve, this is a dealing with asymmetric E/x data: the Richards point at which the first derivative of the function model, the Gompertz model and the modified Hill is a maximum whereas the second derivative is equation (Giraldo et al. 2002). 162 D. Roche et al. ab 100 100 Upper asymptote: Efficacy 80 80 Mid-point: 60 60 Potency Effect Effect 40 40

20 20

0 0

-10 -8 -6 -4 -10 -8 -6 -4 log [A] log [A] cd 100 100

80 80 Mid-point slope: 60 60 Point of inflection: Sensitivity Asymmetry

Effect 40 Effect 40

20 20

0 0

-10 -8 -6 -4 -10 -8 -6 -4 log [A] log [A]

Fig. 8.2 Geometric parameters characterizing E/[A] (d). The point of inflection (point at which the curvature curves. The data are represented using logarithm values of the curve changes from convex to concave or vice for the X axis. For simplicity it is assumed that basal versa; this geometric parameter determines the symmetry response is not present and then the effect for [A] D 0is or asymmetry of the curve depending on whether the 0. (a) The upper asymptote (the value towards the effect inflection point matches or not the mid-point). The solid tends as [A] increases; this geometric parameter reflects curve is symmetric: the mid-point and the inflection point the efficacy of the system). (b). The mid-point (the value are coincident. The dashed curve is asymmetric: the mid- of log[A] for half maximum effect; this geometric param- point and the inflection point are not coincident (open and eter reflects the potency of the system). (c). The mid-point solid circles are used for the mid-point and the inflection slope (this geometric parameter reflects the sensitivity of point, respectively) the system to small changes in [A] around the mid-point).

1 8.2.2.1 The Richards Model amln 10s 1 1= is dE D 2 s , and the point of The Richards model (Richards 1959) is a gen- dx xDx50 2 1 eralization of the Hill equation by including an inflection is xI D xb C m log s. additional parameter (Eq. 8.2). The new parameter, s, allows for asymmetry. If s D 1, Eq. 8.2 is equivalent to Eq. 8.1 and a the theoretical curve is symmetric. Consistently E D (8.2) m.x x/ s 1 C 10 b with this feature, we see that if s D 1 then xI D x50 D xb. However, if s ¤ 1 then xI ¤ x50, and with s > 0. the theoretical curve is asymmetric. Interestingly, The upper asymptote is a, the mid-point is for s ¤ 1, the degree of asymmetry of the curve, 1 1=s 50 2 1 x50 x D xb m log , the mid-point slope measured as the difference between xI and , 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 163 relies on both s and m parameters, and xI x50 D an alternative to model experimental asymmetric 1 1=s 2 1 > < 50 m log s .Ifs 1 then xI x , curves when the Richards function fails in curve the point of inflection is located before the mid- fitting. point, whereas if s < 1 then xI > x50, the point of inflection is placed after the mid-point. We 8.2.2.3 The Modified Hill Equation see that for a given s value the degree of asym- The modified Hill equation (Sips 1948, 1950) metry decreases as the parameter m increases. (see (Boeynaems and Dumont 1980; Giraldo et In addition, it has been shown (Giraldo et al. al. 2002) for discussion) is defined as: 2002) that the degree of asymmetry is higher a < for Richards equations with s 1 than for those E D p (8.4) . 1 C 10xb x/ equations with s > 1. The Richards equation is able to model the The modified Hill equation is equivalent to degree of asymmetry present in E/x data by the the Richards function if the m parameter of the inclusion of the s parameter into the modeling latter model is fixed to 1. Consequently, the same equation. However, by doing so, the parameter happens for the geometric features of the E/x fitting becomes an overparameterized model aris- curves. The upper asymptote is a, the mid-point ing from dependency among parameters, which 21=p 1 is x50 D xb log , the mid-point slope 1 introduces additional difficulties in data fitting aln 10p 1 1= is dE D 2 p , and the point of (Van Der Graaf and Schoemaker 1999)(seeAp- dx xDx50 2 pendix for a discussion on data fitting). To ac- inflection is xI D xb C log p. We see that the count for asymmetry without increasing the num- parameter p contributes to both the steepness and ber of parameters of the Hill equation led to the the asymmetry of the curves. We have defined the proposal of two new functions, the Gompertz degree of asymmetry of a curve as the difference model and the modified Hill equation. between xI and x50. It can be seen that xI x50 D log p 21=p 1 . The modified Hill equation x 8.2.2.2 The Gompertz Model is symmetric, xI D 50 D xb, only for p D 1. If p < 1 then x > x50 whereas if p > 1 then The Gompertz function (Gompertz 1825) may be I x < x50. written as I Yet, in general and reducing the set of em- a pirical models to the ones revised herein, how E D . / (8.3) e10m xI x can we decide which empirical model follows the experimental data? The Hill equation, the Gom- The upper asymptote is a, the mid-point is pertz model and the modified Hill equation are 1 50 . 2/ x D xI m log ln , the mid-point slope is nested within the Richards model and, therefore, dE D amln 10ln 2 D 0:798 a m, and dx xDx50 2 the improvement in fitting that the four-parameter the point of inflection is the parameter xI.The Richards model gains relative to the other three- Gompertz model is inherently asymmetric with parameter models can statistically be tested by an xI < x50 for any value of m. The inflection extra sum-of-squares F-test (Giraldo et al. 2002). a point yields a fixed response value of E D e , In this context, the geometric features of the curve lower than the response value for the mid-point, shape can be quantitatively assessed. a E D 2 , independently of the raw data. It can beshown(Giraldoetal. 2002) that, analogously to the Richards function, xI tends to x50 as the 8.2.3 Generalizing the Hill parameter m increases. In addition, it can be Coefficient for Empirical proved that the Richards function tends to the Models Gompertz function as the asymmetric parameter s of the Richards function increases (Giraldo et The Hill equation (Eq. 8.1) can be re-written as al. 2002). Thus, the Gompertz function can be E log aE D m x m xb, which is known as 164 D. Roche et al. the Hill plot (Colquhoun 1971). The Hill plot is a be applied to a general tetrameric receptor R straight line of slope m, commonly known as the that is activated after all the binding sites are Hill coefficient and denoted nH. From the latter occupied. The mechanistic model is equation the Hill coefficient can be expressed E K1=4 2K2=3 3K3=2 4 4 dlog aE K KE as nH D dx , which in turn can be written R • AR • A2R • A3R • A4R • a dE (Giraldo 2003)asnH D .aE/Eln 10 dx .This A4R expression for nH can be applied to any E/x model and thus it constitutes a practical way of where K ,K,K and K are the microscopic generalizing the Hill coefficient. The parameter 1 2 3 4 equilibrium dissociation constants and KE D is independent of x when E/x corresponds to Œ  A4R is the equilibrium constant for the open- the Hill equation, however this may not hap- ŒA4R ing/activation (ion channel/receptor) reaction. By pen in general. Accordingly, its value at the defining the effect as the proportion of receptors mid-point (Barlow and Blake 1989; Giraldo et in the open/active state, the following expression al. 2002; Giraldo 2003) may solve the problem was obtained (Eq. 8.5). 4 ŒA KE 4 dE E D dx x50 4 Œ  6 Œ 2 nH50 D 10 (8.5) K1K2K3K4 C K2K3K4 A C K3K4 A a ln 3 4 C4K4ŒA C ŒA .1 C KE/ Equation 8.5 is the expression for the calcu- lation of a generalized Hill coefficient under any By supposing that the efficacy is very low empirical model. We see that is related with the (KE 1) and no cooperativity between the bind- slope (derivative) of the E/x curve at the mid- ing sites (Ki D K), the previous mechanistic equa- point and is normalized because it is divided by tion simplifies to the asymptote a. Thus, as expressed, the Hill coefficient at the mid-point can be used as an KE KE E D 4 D 4 index of the sensitivity of the pharmacological .1 10logKx/ 1 C K C system to small changes in agonist concentration. ŒA Applying Eq. 8.5 to the above discussed empir- where x D log[A]. The latter equation corre- ical models yields, for the nH50 parameter, the sponds to a Richards model (Eq. 8.2) with 2 1 1 2 2 2 values of m, m s 21=s , m ln and a D KE,mD 1, xb D logK, and s D 4ora 1 1 Modified Hill equation with p D 4, with s and p 21=p under the Hill, Richards, Gompertz and modified Hill equation, respectively (Giraldo p defining the molecularity of the process. Thus, 2003). we see that empirical models (phenotype) may reflect some of the features that characterize mechanistic models (genotype). In addition, 8.2.4 Connecting Empirical these values of s and p ¤ 1 are indicative of and Mechanistic Models – The asymmetry and may represent an example of Asymmetry of the Curves the necessity of expanding the set of empirical and the Hill Coefficient models and consider that in some cases other models apart from the symmetric Hill model In an earlier work (Giraldo 2003) a connection are needed. Applying the definition of the between empirical and mechanistic models was generalized Hill coefficient (Eq. 8.5), a value 2 4 1 1 1:27 shown by using a mechanistic model taken from of nH50 D 21=4 D is obtained, the ion channel field, a ligand-gated ion channel showing the influence of asymmetry in the slope with 4 binding sites. Nevertheless, the model can at the mid-point. 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 165

We have shown an example in which a mech- can have negative efficacy and later confirmed anistic model may be expressed as an empirical from mutated GPCRs displaying functional model under particular mechanistic features. It response in the absence of agonist (Kjelsberg can be hypothesized that if a systematic analysis et al. 1992; Samama et al. 1993). These mutations of mechanisms were done in a particular biolog- were performed at the C-terminal portion of ical/pharmacological research area and the corre- the third intracellular loop, a region involved sponding set of empirical equations were identi- in receptor-G protein coupling, of ’1- and fied then we could proceed in the inverse order “2-adrenergic receptors. Mutated receptors and try to propose some mechanistic conditions showed a graded range of constitutive activity from the application of one or other empirical and a higher affinity for agonists, indicating model. For instance, in the case of GPCRs asym- that mutations released certain intramolecular metric curves are found when total receptor and constraints that restrained the receptor in its total G protein concentrations are not negligible inactive conformation and induced receptor one relative to the other ((Giraldo et al. 2002) and conformations that mimicked the active state references therein). This indicates that the stoi- of the wild type receptor (Kjelsberg et al. 1992; chiometry of the biological species in a proposed Samama et al. 1993). mechanistic GPCR system must be consistent Basal response can be added as an ad hoc with the symmetry of the experimental curves. parameter in any of the above described empirical Likewise many other molecular properties asso- models, for instance in the Hill equation ciated to GPCR function will be commented from the models presented on the next section. Top Bottom E D Bottom C . / (8.6) 1 C 10m xb x

8.3 What Can We Learn Fitting Eq. 8.6 to experimental data displaying from Mechanistic Models? basal response would allow the estimation of this property through the bottom parameter. However, We have seen that there is a relationship between because Eq. 8.6 is purely empirical the basal the mechanism underlying experimental data and response estimate cannot be interpreted in terms the shape of the curves they produce. Empirical of the receptor system. If this is our aim then models may be powerful enough to reveal that a mechanistic model is needed. The simplest “something” at the biological level is happening mechanistic model accounting for receptor con- if this affects any of the geometric characteristics stitutive activity is the two-state model of agonist of the curves. Yet, to properly analyze mecha- action. nistic hypotheses mechanistic models need to be The two-state model ((del Castillo and Katz used. On the following section a collection of 1957; Monod et al. 1965) for ionic channels and mechanistic models of increasing complexity for (Karlin 1967; Thron 1973; Colquhoun 1973)for the analysis of GPCR function will be shown and receptors; see (Leff 1995) for review) includes discussed. receptor constitutive activity by considering an equilibrium between two receptor states, an in- active R, and an active R*. The model (Box 8.1) 8.3.1 GPCRs Present Constitutive contains four receptor species, the free receptor Activity: The Two-State Model R and R* and the corresponding ligand-bound AR and AR* molecular entities. Because the four Constitutive activity is a general property of chemical equilibria are arranged in a thermody- GPCRs that leads to basal response in the absence namic cycle the four equilibrium constants reg- of an agonist. This was first observed (Costa ulating the relative populations between species and Herz 1989) in experiments with • opioid are reduced to three independent constants L, receptors showing that competitive antagonists K and ’. The parameter L measures the con- 166 D. Roche et al. stitutive activity of the receptor; the equilibrium association constant K relates to the mutual affin- The fraction of active receptors ity between the ligand and the inactive receptor Œ  .1 ’ Œ / species; and ’ quantifies the intrinsic efficacy of R Active L C K A f D Œ  D 1 Œ .1 ’ / the ligand. The parameter ’ appears in two of the R T C L C K A C L equilibria, the one containing the binding of the Œ  Œ  Œ  agonist to the active receptor (’K) and the one where R Active D R C AR Œ  Œ  Œ  Œ  Œ  measuring the induction of the activated receptor and R T D R C AR C R C AR state in the ligand-receptor complex (’L). Either Geometric descriptors of the curves by the first equilibrium (conformational selec- ¥ Left asymptote (Basal response: f for tion) or the second (conformational induction) [A] D 0) the parameter ’ determines the agonist profile (full/partial agonist, neutral antagonist or inverse 1 Basal D agonist) of the ligand. 1 1 C L

Box 8.1: The Two-State Receptor Model ¥ Right asymptote, the asymptotic f value as [A] increases (Top: lim f ) ŒA!1 L R R* 1 Top D 1 1 C ’L

K αK ¥ The mid-point, the [A] value for half maximum f value 1 C L ŒA50 D αL K .1 C ’L/ AR AR* [A50] is lower, equal and greater than The equilibrium constants of the model 1/K for agonists (’>1), neutral antago- nists (’ D 1) and inverse agonists (’<1), L ŒR  R ! RI L D respectively. ŒR In this modeling approach the pharmacologic K ŒAR A C R ! ARI K D response is analyzed by means of the fraction of ŒAŒR active receptors (Eq. 8.7), which is obtained ’L ŒAR  by applying the law of mass action. In the AR ! ARI ’L D ŒAR previous empirical models the geometric features of the E/[A] curves accounting for the efficacy ŒRŒAR  ’ D (maximum response) and potency (location of ŒRŒAR h i the curve along the X-axis) of the drug-receptor pair were characterized by parameters lacking ’K AR A C R ! AR I ’K D physical meaning. In a mechanistic model these Œ  A R geometric properties are expressed in terms h i of the equilibrium constants present in the ŒR AR model providing a biological context to the ’ D R ŒAR estimated parameters. In the two-state model of agonism (Box 8.1) the maximum response 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 167

α ŒR L .1 C ’K ŒA/ f D Active D (8.7) α ŒR 1 C L C K ŒA.1 C ’L/ α T α α As it has been recently discussed (Giraldo 2010), the concept of neutral antagonism is more a pharmacologic concept than a real property as, probably, many neutral antagonists are partial or inverse agonists whose positive or negative effi- cacies are too small to be detected with the cur- rent technology. As many classical equations of pharmacology were deduced assuming absence of basal response a revision of these equations is worth to be considered. In this context, an adap- Fig. 8.3 The two-state model of agonism. Simulations tation of the Schild and Cheng-Prusoff methods are performed using Eq. 8.7 with L D 10 1;KD 106; and ’ D 103, 10, 1, 0.5 and 101. The basal response for binding affinity estimation has been used for is E D L/(1 C L) D 0.0909. Full agonism, neutral antag- modeling inverse agonists in the two-state model onism and inverse agonism is obtained with ’ greater, (Giraldo et al. 2007). equal and lower than 1, respectively, with corresponding effect values greater, equal and lower than basal response, Remarkably, the two-state model of agonism respectively has been used (Giraldo 2004) for the simulation of the different actions that a mutation may pro- duce on receptor function thereby providing a (Top) is determined by two constants, L, which theoretical framework for a quantitative analysis determines in turn the basal response, and ’, of receptor mutating effects. the intrinsic efficacy of the ligand in the receptor under study, whereas the potency ([A50]) includes in addition an affinity term (K). 8.3.1.1 Seeing the Two-State Model as As mentioned above, the model can explain an Empirical Model the behavior of full agonists, partial agonists, A main concern is whether the two-state model neutral antagonists and inverse agonists by the matches any of the previously discussed empir- variation of the parameter ’. Figure 8.3 illustrates ical models and, if this were the case, what is this feature, five values of ’ (103, 10, 1, 0.5, the relation between the two fittings. It can be 101) were used, with constant values for L shown that Eq. 8.8 is equivalent to the Hill equa- 1 6 (10 ) and K (10 ). For ’ D 1, the effect is kept tion 8.6 with Bottom D c/b, Top D a, xb D log(b) constant and coincident with the basal response and m D 1. (L/(1 C L) D 0.0909), meaning that the ligand (neutral antagonist) is not able to change the c C a ŒA E D (8.8) constitutive receptor effect. For ’>1 (agonists) b C ŒA and ’<1 (inverse agonists), asymptotic curves with effects greater and lower than the basal Equation 8.8 corresponds exactly to the mech- responses are obtained. Full and partial agonism anistic two-state model given by Eq. 8.7.Itfol- (either with top close to one or significantly lows that if the Hill equation with m D 1 accu- lower, respectively) is obtained with high and rately fits experimental data then the empirical low ’>1 values, respectively. As lower is ’<1, Hill model is compatible with a two-state mech- higher is the capacity of the ligand to decrease anistic model. On the contrary, if a Hill equation the basal response (inverse agonism) with the with m D 1 does not correctly fit curve data other limit being E D 0. mechanistic models should be tested. 168 D. Roche et al.

8.3.2 GPCRs Arrange into Dimers models we see that the upper cycle of the latter is the same as the cycle of the former. The lower The monomeric/dimeric nature of GPCRs has cycle of the two-state dimer model includes the been the subject of passionate debate (Chabre binding of the second ligand to either the inactive and le Maire 2005; James et al. 2006; Bouvier or the active state of the singly-bound receptor et al. 2007). Thus, whereas on one hand there is and the conformational interconversion between experimental evidence indicating that a monomer the inactive and active states of the fully occupied receptor is able to couple to G proteins and trans- dimer receptor. Two new constants appear in the mit external signals inside the cell (Meyer et al. model, and “. measures the binding cooper- 2006; Whorton et al. 2007, 2008; Bayburt et al. ativity between the two bound molecules in the 2007; White et al. 2007; Jastrzebska et al. 2004; (R2) inactive receptor state and “ the differential Ernstetal. 2007; Kuszak et al. 2009; Rasmussen affinity of the second ligand for the active recep- et al. 2011) on the other hand receptor dimer- tor relative to the inactive one (conformational ization/higher oligomerization has been shown selection concept) or the differential capacity of to be a inherent property of GPCRs (Rios et al. activation of the doubly-bound receptor relative 2001; Milligan 2004; Terrillon and Bouvier 2004; to the singly-bound one (conformational induc- Jastrzebska et al. 2006; Carrillo et al. 2004; Ferre tion concept). Whereas ’ measures the intrinsic et al. 2010; Lohse 2010; Pin et al. 2007). efficacy of the singly-bound ligand to activate Yet, in the context of the present study, the the receptor relative to the free receptor itself point is what experimental findings can and “ measures the intrinsic efficacy of the doubly- cannot be described by the different mechanistic bound ligand to activate the receptor relative to models currently at pharmacologists’ disposition. the singly-bound one. Many models including receptor dimerization As in the monomeric two-state model the can be found in the literature, developed to pharmacologic response is analyzed by the frac- explain E/[A] curve shapes other than that tion of active receptors (Eq. 8.9), which in this produced by the Hill equation with m D 1 (receptor dimer) case becomes a ratio between (Karlin 1967; Colquhoun 1973; Wells 1992; two second-degree polynomials. The empirical Wreggett and Wells 1995; Chidiac et al. 1997; equation 8.9 turns into mechanistic when the ai Armstrong and Strange 2001; Christopoulos parameters are replaced by the equilibrium con- and Kenakin 2002; Urizar et al. 2005; Albizu stants of the model (Box 8.2). The basal response et al. 2006;Roviraetal. 2009). Thus, it seems depends only on the L parameter whereas ’ and important to characterize mechanistically what is “ determine the maximum response. The potency found experimentally. The following selection of of the agonist measured by [A50] includes in addi- representative models will be discussed. tion the binding constant K and the parameter .

Œ. / Œ  Œ 2 8.3.2.1 The Two-State Dimer Receptor RR Active a1 C a2 A C a3 A f D Œ. / D 2 (8.9) Model RR T a4 C a5 ŒA C ŒA The two-state dimer receptor model (Franco et al. 2005, 2006) is presented in Box 8.2. The model The interaction between the two binding sites is formally similar to the two-state model of causes the intricacy of the functional process and agonism (Box 8.1); the receptor is constitutively results in the E/[A] curve reshaping. Steeper and active and in equilibrium between two receptor flatter curves than those corresponding to the species, one inactive and the other active, but two-state monomeric model (empirical Hill equa- with the presence of two binding sites instead tion with a Hill coefficient of 1) and intermediate- of a single one. Comparing the thermodynamic plateau curves may be obtained as a result of the cycles of the monomeric and dimeric two-state cooperativity between the two binding sites. 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 169

Box 8.2: The Two-State Dimer Receptor Œ. / . / . / where, RR Active D RR C RR A Model . / C RR AA Œ. / Œ. / Œ. /  and RR T D RR C RRA Œ. /  . / . / L C RR AA C RR C RR A . / 2A+(R2) 2A+(R2)* C RR AA

L K αK a1 D K2 .1 C ’“L/ α L ’L A+A(R2) A+A(R2)* a2 D K .1 C ’“L/

μK μβK ’“L a3 D 1 C ’“L αβL 1 C L A (R ) A (R )* a4 D 2 2 2 2 K2 .1 C ’“L/ The equilibrium constants of the model 1 C ’L a5 D .RR/ K .1 C ’“L/ . / L . / RR ! RR I L D Œ. / RR The geometric descriptors of the curves

K Œ.RR/  ¥ Left asymptote: Basal activity A C .RR/ ! .RR/ I K D A A ŒAŒ.RR/ a1 1 f D D 1 ’L Œ D0 a4 1 . / . / for A C L RR A ! RR AI ’ . / K . / ¥ Right asymptote: Efficacy A C RR ! RR AI . / Œ. / 1 RR A RR ’ D lim f D a3 D 1 . / Œ. /  ŒA!1 1 C RR RR A ’“L . / K . / A C RR A ! RR AAI ¥ The mid-point, the [A] value for half Œ.RR/ Œ.RR/ maximum f value D AA Œ.RR/ Œ.RR/  p A A b ˙ b2 4ac ’“ ŒA50 D . / L . / 2a RR AA ! RR AAI

“K With a D a a a ;bD a a C a a a . / . / 1 3 4 1 5 3 4 5 A C RR A ! RR AAI 2a2a4;cDa4(a1 C a3a4) where the ˙ .RR/ Œ.RR/  sign depends on A being an agonist or an “ D AA A Œ. /  . / inverse agonist. RR AA RR A

The fraction of active receptors Figure 8.4 shows two E/[A] curves resulting from the application of the two-state dimer re- Œ. / Œ  Œ 2 “ 3 3 RR Active a1 C a2 A C a3 A ceptor model. Two values of (10 ,10 )were f D Œ. / D 2 used, with constant values of L (101), K (106), RR T a4 C a5 ŒA C ŒA ’ (103) and (1/4). The receptor is constitutive 170 D. Roche et al.

Karnik 2005; Kobilka and Deupi 2007). The β β differential ligand-capacity for the selection of pathway-specific receptor active conformations has led to the concept of functional selectivity or biased agonism and, as a consequence, to the concept of pluridimensional efficacy (Clarke and Bond 1998; Kenakin 2002, 2007; Urban et al. 2006; Galandrin and Bouvier 2006; Violin and Lefkowitz 2007; Rajagopal et al. 2011; Kenakin et al. 2012; Reiter et al. 2012). These multiple active conformations can be associated with multiple G protein-dependent and independent pathways, where the latter are governed by Fig. 8.4 The two-state dimer receptor model of agonism. “-arrestins, tyrosine kinases and PDZ-domain Simulations were performed using Eq. 8.9 with L D 101; K D 106; ’ D 103; D 1/4; and “ D 103 and 103.For containing proteins (Sun et al. 2007). The impact “ D 103 a sigmoid curve is obtained whereas for “ D 103 that the multiplicity of signaling involving the a bell-shape curve results whose right asymptote value is wide spectrum of accessory proteins may have the same as the basal response because ’ and “ cancel each in drug development and therapeutics has been other remarked (Mailman 2007; Rajagopal et al. 2010). active (L > 1). The ligand is an agonist (’>1) 8.3.3.1 Distinguishing the Protomers for the first binding site and shows no binding Within a Dimer: The cooperativity for the inactive receptor ( D 1/4). Symmetric/Asymmetric The value of “ reflects the capacity of activa- Three-State Dimer Receptor tion of the fully occupied receptors relative to Model the singly occupied ones. For “ D 103 a steep Both the two-state and the two-state dimer model sigmoid curve is obtained whereas for “ D 103 have been extended to include an additional ac- a bell-shape curve results whose right asymptote tive receptor state, R** and (RR)**, respectively, [1/(1 C 1/(’“L))] reaches the value of the basal to account for an additional signaling pathway response (1/1 C 1/L) because ’ and “ cancel each (Leff et al. 1997;Breaetal. 2009). Notice- other. ably, a three-state dimer receptor model, includ- ing one inactive, (RR), and two active, (RR)* and (RR)**, receptor states, was able to explain the 8.3.3 GPCRs May Signal Through different behavior of typical and atypical 5-HT2A Multiple Pathways: The Issue receptor antagonists with respect to arachidonic of Functional Selectivity acid and inositol phosphate pathways (Brea et al. 2009). Both the two-state and the two-state dimer It is worth noting that, both the two-state receptor models are on-off switch models, one (Franco et al. 2005, 2006) and the three-state single state for the inactive receptor (R and (RR), (Brea et al. 2009) dimer receptor model consider respectively) and one single state for the active the active receptor states as global entities and receptor (R* and (RR)*, respectively). Clearly, the two protomers within an active state identical. this is an extreme simplification. The very However, protomers within dimer active states flexible nature of GPCRs (as proteins) enables do not need to be identical (e.g. for asymmet- the presence of multiple receptor conformations, ric active dimer state) as described by the new and nature has wisely chosen some of them asymmetric/symmetric three-state dimer receptor for signaling specific pathways (Perez and model, AS3D (Rovira et al. 2010)(Box8.3). 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 171

Box 8.3: The Asymmetric/Symmetric ŒRRt D ŒRR C ŒRAR C ŒRARA Three-State Dimer Receptor Model C R R C R AR C RAR C R ARA C R R C R AR C R AR A

K1K2X1 a1 D 1 C 2Y5 C Y5Y6

K2 .Y1 C Y2/ a2 D 1 C 2Y5 C Y5Y6 Y5 a3 D 1 C 2Y5 C Y5Y6 K1K2X1X2 b1 D 2 .1 C 2Y5 C Y5Y6/ The equilibrium constants of the model K2Y1Y3 Œ  Œ  b2 D 1 2 2 R R X2 R R C Y5 C Y5Y6 X1 D Œ  2 D Œ  RR R R Y5Y6 b3 D 1 Œ Œ  Œ Œ  2 .1 C 2Y5 C Y5Y6/ K A RR 2 A RAR 2 D Œ  K2 D Œ  RAR RARA K1K2 .1 C 2X1 C X1X2/ a4 D b4 D ŒRAR  ŒR AR 1 C 2Y5 C Y5Y6 Y1 D Y2 D ŒRAR ŒRAR 2K2 .1 C Y1 C Y2 C Y1Y3/ a5 D b5 D 1 2 ŒR AR  ŒR AR  C Y5 C Y5Y6 Y3 D Y4 D ŒR R ŒR R A A The geometric descriptors of the curves Œ  Œ  2 R ARA Y6 R AR A ¥ Left asymptote: Basal activity Y5 D D ŒRARA 2 ŒR ARA a1 1 fRR D 2 D The fraction of active receptors Œ D0 4 1 1 for A a 1 2 C 2 X1 C X 2 ŒR R a1 C a2 ŒA C a3ŒA 1 t b1 fR R D D 2 2 Œ  Œ  Œ 2 fR R D D 2 1 RRt a4 C a5 A C A Œ D0 b4 1 for A C X2 C X1X2 2 ŒR R  b1 C b2 ŒA C b3ŒA t 2 fR R D D 2 ¥ Right asymptote: Efficacy ŒRRt b4 C b5 ŒA C ŒA 1 2 where, lim fR R D a3 D Œ !1 1 1 A 1 C C Y6 2 Y5 RR D RR C R R C R R t A A 1 2 lim fR R D b3 D 2 1 C R ARA Œ !1 1 A C Y6 C Y5Y6 R R t D R R C R AR C R AR A

(continued) 172 D. Roche et al.

the agonist decreased the signaling whereas the Box 8.3 (continued) inverse agonist enhanced it (Han et al. 2009). This is consistent with the agonist promoting ¥ The mid-point, the [A] value for half an alternative R*R* conformation whereas the maximum f or f values R*R R*R* inverse agonist stabilizing the R*R conformation p b ˙ b2 4ac previously promoted by the first agonist (Rovira ŒA50 D et al. 2010). Importantly, not only a conceptual 2a or qualitative analysis is provided by the model With a D a1 a3a4;b D a3a4a5 but also a quantitative description. 2a2a4 C a1a5; and c Da4 .a1 a3a4/ in Following the rationale of the previous mech- the case of fR*R and the same expressions anistic models, the pharmacologic responses of but replacing ai by bi in the case of fR*R*. the two (G protein-dependent and G protein- The ˙ sign in [A50] equation results for the independent) functional pathways are analyzed possibility of A being either a positive or an by the fractional receptor populations of the cor- inverse agonist. responding asymmetric and symmetric receptor species (Eqs. 8.10 and 8.11) (Rovira et al. 2010).

Following several experimental studies 2 a1 C a2 ŒA C a3ŒA (Damian et al. 2006; Bayburt et al. 2007; 2 fR R D 2 (8.10) Mancia et al. 2008; Pin et al. 2004;Xuet a4 C a5 ŒA C ŒA al. 2004; Hlavackova et al. 2005; Goudet et 2 al. 2005), AS3D assumes that the asymmetric b1 C b2 ŒA C b3ŒA 2 fR R D 2 (8.11) active receptor state (R*R) governs a G protein- b4 C b5 ŒA C ŒA dependent pathway whereas the symmetric active receptor state (R*R*) regulates a G protein- As in the two-state dimer receptor model the independent pathway. It is worth noting that both pharmacologic response for each of the signaling cis and trans activation, determined by Y2 and pathways is a ratio between two second-degree Y1 equilibrium constants (chemical equilibrium polynomials. The empirical equations 8.10 and scheme in Box 8.3), are included in the AS3D 8.11 turn into mechanistic when the ai and bi model allowing the simulation of mechanistic parameters are replaced by the equilibrium con- processes under equilibrium conditions that may stants of the model (Box 8.3). depend on the receptor system. Figure 8.5 shows a simulation of the There are particular pharmacological sub- asymmetric R*R and symmetric R*R* pathways 1 tleties that AS3D can represent. For example, an by using Eqs. 8.10 and 8.11 with X1 D 10 ; 8.5 4 3 asymmetric R*R active receptor state allows X2 D 1; K1 D 10 ;K2 D 10 ;Y1 D 10 ; 2 2 to describe findings apparently paradoxical, Y2 D 1; Y3 D 10 ;Y5 D 1 and Y6 D 10 , where such as an inverse agonist (with preferential K1 and K2 are equilibrium dissociation constants affinity for the R molecular entity) binding to for the binding of the first and second ligand the protomer R in the R*R active dimer receptor to the inactive receptor, respectively, Xi are species (if its concentration is significant) and equilibrium constants for the induction of contributing to an increase of the G protein- activated states for the free receptor and Yi dependent functional response. Interestingly, this are equilibrium constants for the induction of was recently found in a study involving dopamine activated states for the occupied receptor. The class A dimers: maximal functional response curve for the R*R-G protein-dependent signaling was achieved by agonist binding to a single pathway is a bell-shape curve because, after protomer; this is compatible with the proposal reaching a maximum, the concentrations of the of an asymmetric R*R active species. However, species associated with this pathway decrease addition of either an agonist or an inverse agonist as the concentrations of the R*R* associated- to the second protomer had opposite effects; species of the G protein-independent pathway 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 173

1.0 the binding of the agonist to the receptor, which R*R is governed by an equilibrium constant, and the R*R* 0.8 R*R + R*R* transduction of receptor occupancy into response, for which a logistic function was proposed. The 0.6 final E/[A] equation (8.12) is the result of com-

Effect 0.4 bining both expressions. E £nŒAn 0.2 E D m (8.12) n n n .KA C ŒA/ C £ ŒA 0.0 -10 -8 -6 -4 We have considered the operational model x = log [A] as hybrid because incorporates mechanism through the binding equilibrium (the equilibrium Fig. 8.5 The three-state asymmetric/symmetric dimer re- dissociation constant K ) and empiricism by ceptor model of agonism. Simulations were performed A 1 proposing a transduction function, whose form using Eqs. 8.10 and 8.11 with X1 D 10 ;X2 D 1; 8.5 4 3 2 K1 D 10 ;K2 D 10 ;Y1 D 10 ;Y2 D 1; Y3 D 10 ; cannot necessarily be logistic in all cases and D D 2 Y5 1andY6 10 . The curve for the R*R signaling which includes two terms, Em and KE, which are pathway is a bell-shape curve because after reaching a not the product of a mechanistic development. E maximum the concentrations of the species associated m with this pathway decrease as the concentrations of the is defined as the maximum effect that the receptor R*R* associated-species increase. The sum of the two system may yield (not confound with the max- effects, which may reflects the convergence of the two imum effect (Top) that the receptor may reach pathways at some downstream signaling point, displays a from the binding of a particular agonist) and K , biphasic curve with an intermediate plateau E the concentration of bound receptor for half Em and then an index of the intrinsic efficacy of the increase. A biphasic curve with an intermediate agonist for a particular receptor. The parameter £ plateau results from the sum of the two effects, in Eq. 8.12 is the ratio between the total receptor which may reflect the convergence of the two concentration and KE and reflects the operational pathways at some downstream signaling point. efficacy of the agonist-receptor pair.

Box 8.4: The Operational Model of Agonism 8.4 The Operational Model of Agonism: A Hybrid Between Empirical A. The original proposal: Receptor and Mechanistic Models constitutive activity is not included

KA In the previous sections various examples of em- A C R ! AR ! E pirical and mechanistic models were described. The latter ones were characterized by the depen- The equilibrium constant of the model dent variable being attached to the fraction of ac- ŒAŒR tive receptors. However, the measured functional KA D Œ  responses resulting from GPCR activation may AR reside at some downstream point in the signaling The function for the transduction of re- pathway. The operational model was proposed ceptor occupation into response aiming to circumvent the complexity of the signal transduction process while including the essential n EmŒAR information of the ligand-receptor recognition E D n n KE C ŒAR stage (Black and Leff 1983). As it can be seen in Box 8.4A the models consists of two steps, (continued) 174 D. Roche et al.

Box 8.4 (continued) ŒAŒR KA D Œ  The E/[A] functional response AR

E £nŒAn The generation of stimulus by the recep- E D m n n n tor .KA C ŒA/ C £ ŒA Œ  © Œ  ŒR0 S D R C AR where £ D and ŒR0 D ŒR C ŒAR. KE Geometric descriptors of the curves The function for the transduction of re- ¥ Left asymptote (Basal response: f for ceptor stimulus into response [A] D 0) E Sn E D m 0 n n Basal D KE C S

¥ Right asymptote, the asymptotic f value The E/[A] functional response

as [A] increases (Top: lim f ) n ŒA!1 E ¦n.K C © ŒA/ E D m A n n n .KA C ŒA/ C ¦ .KA C © ŒA/ Em Top D 1 1 ŒR  C £n With ¦ D T and ŒR D ŒR C ŒAR KE T ¥ The mid-point, the [A] value for half Geometric descriptors of the curves maximum f value ¥ Left asymptote (Basal response: f for [A] D 0) KA ŒA50 D 1 Em .2 C £n/ n 1 Basal D 1 1 C ¦n In the particular case of n D 1 ¥ Right asymptote, the asymptotic f value Em£ŒA as [A] increases (Top: lim f ) E D ŒA!1 KA C .1 C £/ŒA E Em Top D m Top D 1 1 1 1 C n n C £ © ¦

KA ¥ The mid-point, the [A] value for half ŒA50 D 1 C £ maximum effect

B. An extension of the model: Receptor ŒA50 D p p constitutive activity is included n n n n n n n n KA 1 C © C 2© ¦ 2 C ¦ C © ¦ p p © n 2 ¦n ©n¦n n 1 ©n 2©n¦n R ! E C C C C In the particular case of n D 1 K A C R !A AR ! E Em¦ .KA C © ŒA/ E D .1 ¦/ .1 ©¦/Œ  The equilibrium constant of the model KA C C C A

(continued) 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 175

system indicating that the ligand is a full agonist; Box 8.4 (continued) the other two curves with top significantly lower E than Em correspond to partial agonists. Basal D m 1 1 C ¦ E 8.4.1 The Operational Model Top D m 1 C 1 of Agonism Including ©¦ Receptor Constitutive Activity .1 C ¦/ KA ŒA50 D 1 C ©¦ The property of receptor constitutive activity has been recently added to the operational model of agonism (Slack and Hall 2012) (see also a pre- The operational model as it was originally vious analysis (Hall 2006) of the corresponding proposed (Black and Leff 1983) did not contem- author of the former paper by using a limiting plate receptor constitutive activity and thus basal case of the ternary complex model (De Lean et al. response is equal to 0. The top value depends on 1980)). The model considers that the observed Em, the maximum effect of the receptor system, effect is a logistic function of the stimulus gen- £ , the operational efficacy (an agonist-receptor erated by the receptor with a maximum effect Em dependent term), and n, the parameter enabling and a transducer constant of stimulus into effect E/[A] curves to be steeper or flatter than rect- KE; and that total receptor stimulus arises from angular hyperbola. The [A50] value, a parameter the sum of that generated by the free receptors reflecting the potency of the agonist, includes and that generated by the occupied ones. A pro- in its definition KA, the equilibrium dissociation portionality constant ">0 is used for the stim- constant of the ligand-receptor complex. ulus generated by the occupied receptors. The Figure 8.6 shows a simulation with Em D 100; value of " measures the capability of the ligand- 6 £ n D 1; KA D 10 ; and D 100, 5 and 1, which bound receptor to generate a stimulus relative to may represent three ligands with the same affinity the free receptor: ">1, indicates that the ligand- for the receptor but different intrinsic efficacy. bound generates greater stimulus than the free The asymptotic maxima of the curves decrease receptor and "<1, a smaller one. The E/[A] as £ decreases. For the highest £ value the top relationship is given by Eq. 8.13. of the curve (E D 99) is close to the Em of the E ¦n.K C © ŒA/n E D m A (8.13) n n n .KA C ŒA/ C ¦ .KA C © ŒA/ τ τ where ¦ D [R] /K measures the coupling τ T E efficiency of the signal transduction system. The parameters [R]T, total receptor concentration, and KA, agonist-receptor equilibrium dissociation constant, are defined as in the original operational model (Black and Leff 1983). The parameter ¦ resembles the earlier defined operational efficacy £, but because the meaning is not exactly the same a different symbol was used (Slack and Hall 2012). Figure 8.7 shows a simulation with Em D 100; 6 Fig. 8.6 The operational model of agonism. Simulations n D 1; KA D 10 ; ¦ D 0.5 and " D 100, 10, 1 were performed using Eq. 8.12 with Em D 100; n D 1; 1 6 and 10 assuming four ligands with different KA D 10 ;and£ D 100, 5 and 1. Decreasing the £ values leads to curves with lower asymptotic maxima, 99, 83.3 intrinsic efficacies. The receptor is constitutively and 50, respectively active with a basal response [Em/(1 C 1/¦)] of 176 D. Roche et al.

of an agonist compared to a reference agonist ε for a given pathway j . Finally, to determine ε 1 ε the bias between two pathways (j1 and j2)fora ε given agonist relative to a reference agonist the  .£= /  .£= / quantity log KA j1j2 D log KA j1  .£= / log KA j2 was suggested (Kenakin et al. 2012). From this bias parameter, the authors ex- pected that a scale for the quantification of ligand selectivity between signaling pathways could be constructed. It is worth emphasizing that for the scale to be useful, the differences in receptor densities between cell types should be cancelled. In this case, the bias factor would reflect only the Fig. 8.7 The operational model of agonism with receptor differences in efficacy and affinity of the ligands constitutive activity. Simulations were performed with 6 associated with their selectivity for the pathways. Em D 100; n D 1; KA D 10 ; ¦ D 0.5 and " D 100, 10, 1 and 101 assuming four ligands with different intrinsic It is important to note that quantification of bias, efficacy. The basal response is determined by ¦.The as proposed in this study (Kenakin et al. 2012), " intrinsic efficacy of the agonists featured by determines can be extremely important in the communication the height of the top relative to the basal response. The receptor is constitutively active with a basal response between pharmacologists and medicinal chemists of 33.3 and top responses of 98.0 (full agonist), 83.3 and, consequently, in guiding drug discovery in a (partial agonist), 33.3 (neutral antagonist) and 4.8 (inverse more productive and safer direction. agonist) are obtained

33.3 and top responses [Em/(1 C 1/("¦))]of98.0 (full agonist), 83.3 (partial agonist), 33.3 (neutral 8.5 Concluding Remarks antagonist) and 4.8 (inverse agonist). In the present study, a selection of empirical and mechanistic models has been shown and 8.4.2 Quantification of Functional their most salient features analyzed in detail. In Selectivity by the Operational addition, the operational model (a hybrid between Model of Agonism empirical and mechanistic models) has been in- cluded. The pros and cons of these approaches Functional selectivity can be accounted for by have been commented and the connection be- the operational model of agonism (for instance, tween them through the shape of the curves that in its original form (Black and Leff 1983)) by they produce discussed. In this regard, it is worth assuming that the operational efficacy parameter noting that a mechanistic model can be very £ of an agonist-receptor pair varies with the useful for simulation or for fitting experimental signaling pathways (Kenakin et al. 2012). Thus, data; however, some caution is needed because assuming in Eq. 8.12 that the parameters Em and different mechanistic models may produce sim- n are cell-specific (shared by all agonists for a ilar curves and be apparently well suited for common receptor through a given pathway), KA reality description, being possible that the chosen is ligand-receptor specific and £ is both ligand- mechanistic proposal be wrong (see, for example and pathway-specific, a transduction coefficient comments in (Giraldo 2008)). A careful com- log(£/KA) was defined to characterize agonism parison of all possible models appropriate for a for a given pathway (Kenakin et al. 2012). A nor- particular experimental situation should always  .£= / malized transduction coefficient, log KA j1 , be made in order to find the true model with the was proposed to quantify the relative efficiency true explanation. 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 177

Acknowledgements This work was supported by the In the case of pharmacological models, the Spanish projects SAF2010-19257, TIN2009-13618, dependency between the concentration and the TIN2012-33116, Fundació La Marató de TV3 (110230) effect given by E D f([A]) is non-linear with re- and RecerCaixa 2010ACUP 00378. The 2nd author has been supported by the Ramón y Cajal Program of the spect the model parameters and, thus, Eq. 8.14 is Spanish Ministry of Economy and Competitiveness. known as non-linear regression.

A.1 Appendix A.1.2 Traditional Gradient-Based Nonlinear Regression A.1.1 Fitting Mathematical Models Algorithms Versus Stochastic to Experimental Data Approaches

In a pharmacological assay, the input agonist Given that the best set of parameter estimates concentration, [A], and its associated output must be a minimum of the energy given by Eq. effect, E, are recorded under particular 8.14, they can be found by means of optimization experimental conditions. Experimental data are techniques. There are two main families of opti- ::: given by a list of varying effects, Ei,iD 1, ,N, mization strategies: local and global. obtained using different agonist concentrations, Local approaches search for the closest lo- ::: [A]i,iD 1, ,N. A mathematical model of cal minimum given an initial seed point. They pharmacological effect is a mathematical mainly rely on the gradient of the function to be equation E D f([A]), in which E represents the minimized and update the initial guess using the pharmacological effect, [A] is the concentration direction of this gradient because is the direction of the drug and f is a mathematical function of maximum decrease of the energy (gradient containing a number of parameters. descent). Gradient descent methods are efficient The problem of fitting a mathematical model minimizers as far as the energy function can be to a given experimental data list consists in find- differentiated with respect the parameters (which ing the model parameters that produce the func- is always the case in a least squares fitting of tion E D f([A]) best approximating the experi- pharmacological models) and they provide an ::: mental pairs ([A]i ,Ei), i D 1, N in the sense optimal solution as far as the energy function that the model values are as close as possi- is convex (i.e. has a unique local minimum). In ble to the experimental measurements. In other the case of multiple minima (which is a typical words, the differences jf([A]i)-Eij are as small case for mechanistic pharmacological models), as possible. In mathematical terms this is ex- the gradient descent strongly depends on the pressed in the following sum of squares which initial seed. This limitation can be partially over- exclusively depends on the values of the model come by running the gradient descent algorithm parameters: using different initial seed points. This solution has two main shortcomings. First it increases XN 2 the computational time of the minimization pro- Energy.Parameters/ D jf .ŒA / Eij i cess and, second, given that there is no a-priory iD1 (8.14) knowledge on the number and distribution of the local minima, there is not a clear strategy for which should be minimum for the optimal model choosing an initial set of seeds guaranteeing that parametric values (least squares fitting). We the optimal solution (global minimum) will be would like to note that such set of optimal reached. parameters produce a curve that visually fits Global approaches are a way of searching for the profile of the scatter plot given by the global optimal solutions (global minimum) with- experimental pairs ([A]i,Ei), i D 1, :::N. out the need of a special definition of the initial 178 D. Roche et al.

TWO FUNDAMENTAL FORCES STOP CRITERIUM Lead to improving fitness values in consecutive populations

Selection NEW SELECTION POPULATION Acts as a force pushing quality

INITIAL EVOLUTIONARY POPULATION ALGORITHM Variation operations SURVIVOR SELECTION RECOMBINATION Exploration and exploitation of new MUTATION possible solutions

Fig. A.1 Scheme of the mechanisms ruling an Evolu- Recombination and mutation combine and perturb the tionary Algorithm (EA). EAs maintain a population of individuals (exploration). Finally, the survival step decides possible solutions and their fitness on the search space. who survives in the new population. This process iterates In each iteration, selection determines which individu- until the stop criterion occurs als are chosen to produce new solutions (exploitation). set of seeds. For special search spaces and cost general heuristics for exploration. Recombination functions, there is a solid mathematical theory produces new individuals in combining the infor- ensuring convergence of some global algorithms mation contained in the parents and offsprings are (simulated annealing (Ashyraliyev et al. 2009)). mutated by small perturbations with low proba- Although, there is no proof of convergence for bility. Finally, survival step decides who survives the general case. Evolutionary Algorithms (EAs) (among parents and offsprings) to form the new have proven their ability for optimizing non- population. This process iterates until stop crite- analytic multi-modal functions in a wide range of rion occurs. Figure A.1 outlines the scheme of a real life problems, such as parameter estimation standard EA. (Ravikumar and Panigrahi 2010), pattern and text recognition (Rizki et al. 2002) and image processing (Cagnoni et al. 2008).EAsareaclass References of stochastic optimization methods that simulate the process of natural evolution. Unlike gradient Albizu L, Balestre MN, Breton C et al (2006) Probing descent methods that evolve a single initial value the existence of G protein-coupled receptor dimers by positive and negative ligand-dependent cooperative each time, EAs maintain a population of possible binding. Mol Pharmacol 70:1783Ð1791 solutions that evolve according to rules of selec- Armstrong D, Strange PG (2001) Dopamine D2 receptor tion and other operators, such as recombination dimer formation: evidence from ligand binding. J Biol and mutation (Holland 1975). Chem 276:22621Ð22629 Ashyraliyev M, Fomekong-Nanfack Y, Kaandorp JA, Each individual in the population receives a Blom JG (2009) Systems biology: parameter estima- measure of its fitness in the environment. Selec- tion for biochemical models. FEBS J 276:886Ð902 tion focuses attention on high fitness individu- Barlow R, Blake JF (1989) Hill coefficients and the als, thus, exploiting the available fitness infor- logistic equation. Trends Pharmacol Sci 10:440Ð441 Bayburt TH, Leitz AJ, Xie G, Oprian DD, Sligar SG mation. Selection determines which individuals (2007) Transducin activation by nanoscale lipid bilay- are chosen to produce offsprings. Recombination ers containing one and two . J Biol Chem and mutation perturb those individuals, providing 282:14875Ð14881 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 179

Black JW, Leff P (1983) Operational models of pharma- Ferre S, Woods AS, Navarro G, Aymerich M, Lluis C, cological agonism. Proc R Soc Lond B Biol Sci 220: Franco R (2010) Calcium-mediated modulation of the 141Ð162 quaternary structure and function of adenosine A2A- Boeynaems JM, Dumont JE (1980) Outlines of re- dopamine D2 receptor heteromers. Curr Opin Pharma- ceptor theory. Elsevier/North-Holland, Amsterdam, col 10:67Ð72 pp 73Ð75 Franco R, Casadó V, Mallol J et al (2005) Dimer-based Bouvier M, Heveker N, Jockers R, Marullo S, Milligan model for heptaspanning membrane receptors. Trends G (2007) BRET analysis of GPCR oligomerization: Biochem Sci 30:360Ð366 newer does not mean better. Nat Methods 4:3Ð4 Franco R, Casadó V, Mallol J et al (2006) The two-state Brea J, Castro M, Giraldo J et al (2009) Evidence dimer receptor model: a general model for receptor for distinct antagonist-revealed functional states of dimers. Mol Pharmacol 69:1905Ð1912 5-hydroxytryptamine(2A) receptor homodimers. Mol Galandrin S, Bouvier M (2006) Distinct signaling profiles Pharmacol 75:1380Ð1391 of beta1 and beta2 adrenergic receptor ligands toward Cagnoni S, Lutton E, Olague G (2008) Genetic and adenylyl cyclase and mitogen-activated protein kinase evolutionary computation for image processing and reveals the pluridimensionality of efficacy. Mol Phar- analysis. Hindawi Publishing Corporation, New York macol 70:1575Ð1584 Carrillo JJ, Lopez-Gimenez JF, Milligan G (2004) Giraldo J (2003) Empirical models and Hill coefficients. Multiple interactions between transmembrane helices Trends Pharmacol Sci 24:63Ð65 generate the oligomeric alpha1b-adrenoceptor. Mol Giraldo J (2004) Agonist induction, conformational selec- Pharmacol 66:1123Ð1137 tion, and mutant receptors. FEBS Lett 556:13Ð18 Chabre M, le Maire M (2005) Monomeric G-protein- Giraldo J (2008) On the fitting of binding data when coupled receptor as a functional unit. Biochemistry receptor dimerization is suspected. Br J Pharmacol 44:9395Ð9403 155:17Ð23 Chidiac P, Green MA, Pawagi AB, Wells JW (1997) Giraldo J (2010) How inverse can a neutral antagonist be? Cardiac muscarinic receptors. Cooperativity as the Strategic questions after the rimonabant issue. Drug basis for multiple states of affinity. Biochemistry 36: Discov Today 15:411Ð415 7361Ð7379 Giraldo J, Vivas NM, Vila E, Badia A (2002) Assess- Christopoulos A, Kenakin T (2002) G protein-coupled ing the (a)symmetry of concentration-effect curves: receptor allosterism and complexing. Pharmacol Rev empirical versus mechanistic models. Pharmacol Ther 54:323Ð374 95:21Ð45 Clarke WP, Bond RA (1998) The elusive nature of intrin- Giraldo J, Serra J, Roche D, Rovira X (2007) Assess- sic efficacy. Trends Pharmacol Sci 19:270Ð276 ing receptor affinity for inverse agonists: Schild and Colquhoun D (1971) Lectures on biostatistics. Oxford Cheng-Prusoff methods revisited. Curr Drug Targets 8: University Press, London, pp 361Ð364 197Ð202 Colquhoun D (1973) The relationship between classical Gompertz B (1825) On the nature of the function expres- and cooperative models for drug action. In: Rang HP sive of the law of human mortality. Philos Trans R Soc (ed) A symposium on drug receptors. University Park Lond 36:513Ð585 Press, Baltimore, pp 149Ð182 Goudet C, Kniazeff J, Hlavackova V et al (2005) Asym- Costa T, Herz A (1989) Antagonists with negative intrinsic metric functioning of dimeric metabotropic glutamate activity at delta opioid receptors coupled to GTP- receptors disclosed by positive allosteric modulators. J binding proteins. Proc Natl Acad Sci U S A 86: Biol Chem 280:24380Ð24385 7321Ð7325 Hall DA (2006) Predicting dose-response curve behav- Damian M, Martin A, Mesnier D, Pin JP, Baneres ior: mathematical models of allosteric receptor-ligand JL (2006) Asymmetric conformational changes in interactions. In: Bowery NG (ed) Allosteric receptor a GPCR dimer controlled by G-proteins. EMBO J modulation in drug targeting. Taylor & Francis, New 25:5693Ð5702 York, pp 39Ð77 De Lean A, Stadel JM, Lefkowitz RJ (1980) A ternary Han Y, Moreira IS, Urizar E, Weinstein H, Javitch JA complex model explains the agonist-specific (2009) Allosteric communication between protomers binding properties of the adenylate cyclase- of dopamine class A GPCR dimers modulates activa- coupled “-adrenergic receptor. J Biol Chem 255: tion. Nat Chem Biol 5:688Ð695 7108Ð7117 Hill AV (1913) The combinations of haemoglobin with del Castillo J, Katz B (1957) Interaction at end-plate oxygen and with carbon monoxide. Biochem J 7: receptors between different choline derivatives. Proc 471Ð480 R Soc Lond B 146:369Ð381 Hlavackova V, Goudet C, Kniazeff J et al (2005) Evi- Ernst OP, Gramse V, Kolbe M, Hofmann KP, Heck dence for a single heptahelical domain being turned M (2007) Monomeric G protein-coupled receptor on upon activation of a dimeric GPCR. EMBO J 24: rhodopsin in solution activates its G protein transducin 499Ð509 at the diffusion limit. Proc Natl Acad Sci U S A Holland JH (1975) Adaptation in natural and artificial 104:10859Ð10864 systems. University Michigan Press, Ann Arbor 180 D. Roche et al.

James JR, Oliveira MI, Carmo AM, Iaboni A, Davis SJ Perez DM, Karnik SS (2005) Multiple signaling states (2006) A rigorous experimental framework for de- of G-protein-coupled receptors. Pharmacol Rev 57: tecting protein oligomerization using bioluminescence 147Ð161 resonance energy transfer. Nat Methods 3:1001Ð1006 Pin JP, Kniazeff J, Binet V et al (2004) Activation Jastrzebska B, Maeda T, Zhu L et al (2004) Functional mechanism of the heterodimeric GABA(B) receptor. characterization of rhodopsin monomers and dimers in Biochem Pharmacol 68:1565Ð1572 detergents. J Biol Chem 279:54663Ð54675 Pin JP, Neubig R, Bouvier M et al (2007) International Jastrzebska B, Fotiadis D, Jang GF, Stenkamp RE, Engel Union of Basic and Clinical Pharmacology. LXVII. A, Palczewski K (2006) Functional and structural Recommendations for the recognition and nomencla- characterization of rhodopsin oligomers. J Biol Chem ture of G protein-coupled receptor heteromultimers. 281:11917Ð11922 Pharmacol Rev 59:5Ð13 Karlin A (1967) On the application of “a plausible model” Rajagopal S, Rajagopal K, Lefkowitz RJ (2010) Teaching of allosteric proteins to the receptor for acetylcholine. old receptors new tricks: biasing seven-transmembrane J Theor Biol 16:306Ð320 receptors. Nat Rev Drug Discov 9:373Ð386 Kenakin T (2002) Drug efficacy at G protein-coupled Rajagopal S, Ahn S, Rominger DH et al (2011) Quantify- receptors. Annu Rev Pharmacol Toxicol 42:349Ð379 ing ligand bias at seven-transmembrane receptors. Mol Kenakin T (2007) Collateral efficacy in drug discovery: Pharmacol 80:367Ð377 taking advantage of the good (allosteric) nature of Rasmussen SG, DeVree BT, Zou Y et al (2011) Crystal 7TM receptors. Trends Pharmacol Sci 28:407Ð415 structure of the beta(2) adrenergic receptor-Gs protein Kenakin T, Watson C, Muniz-Medina V, Christopoulos complex. Nature 477:549Ð557 A, Novick S (2012) A simple method for quantifying Ravikumar V, Panigrahi BK (2010) Comparative study functional selectivity and agonist bias. ACS Chem of evolutionary computing methods for parameter es- Neurosci 3:193Ð203 timation of quality signals. Int J Appl Evol Comput Kjelsberg MA, Cotecchia S, Ostrowski J, Caron MG, 1:28Ð59 Lefkowitz RJ (1992) Constitutive activation of the Reiter E, Ahn S, Shukla AK, Lefkowitz RJ (2012) Molec- alpha 1B-adrenergic receptor by all amino acid sub- ular mechanism of “-arrestin-biased agonism at seven- stitutions at a single site. Evidence for a region transmembrane receptors. Annu Rev Pharmacol Toxi- which constrains receptor activation. J Biol Chem 267: col 52:179Ð197 1430Ð1433 Richards FJ (1959) A flexible growth function for empiri- Kobilka BK, Deupi X (2007) Conformational complexity cal use. J Exp Bot 10:290Ð300 of G-protein-coupled receptors. Trends Pharmacol Sci Rios CD, Jordan BA, Gomes I, Devi LA (2001) G-protein- 28:397Ð406 coupled receptor dimerization: modulation of receptor Kuszak AJ, Pitchiaya S, Anand JP, Mosberg HI, Walter function. Pharmacol Ther 92:71Ð87 NG, Sunahara RK (2009) Purification and functional Rizki MM, Zmuda MA, Tamburino LA (2002) Evolving reconstitution of monomeric mu-opioid receptors: al- pattern recognition systems. IEEE Trans Evol Comput losteric modulation of agonist binding by Gi2. J Biol 6:594Ð609 Chem. doi:10.1074/jbc.M109.026922 Rovira X, Vivo M, Serra J, Roche D, Strange PG, Giraldo Leff P (1995) The two-state model of receptor activation. J (2009) Modelling the interdependence between the Trends Pharmacol Sci 16:89Ð97 stoichiometry of receptor oligomerization and ligand Leff P, Scaramellini C, Law C, McKechnie K (1997) A binding for a coexisting dimer/tetramer receptor sys- three-state receptor model of agonist action. Trends tem. Br J Pharmacol 156:28Ð35 Pharmacol Sci 18:355Ð362 Rovira X, Pin JP, Giraldo J (2010) The asymmet- Lohse MJ (2010) Dimerization in GPCR mobility and ric/symmetric activation of GPCR dimers as a possi- signaling. Curr Opin Pharmacol 10:53Ð58 ble mechanistic rationale for multiple signalling path- Mailman RB (2007) GPCR functional selectivity has ways. Trends Pharmacol Sci 31:15Ð21 therapeutic impact. Trends Pharmacol Sci 28: Samama P, Cotecchia S, Costa T, Lefkowitz RJ (1993) A 390Ð396 mutation-induced activated state of the “2-adrenergic Mancia F, Assur Z, Herman AG, Siegel R, Hendrick- receptor. Extending the ternary complex model. J Biol son WA (2008) Ligand sensitivity in dimeric associ- Chem 268:4625Ð4636 ations of the serotonin 5HT2c receptor. EMBO Rep 9: Sips R (1948) On the structure of a catalyst surface. 363Ð369 J Chem Phys 16:490Ð495 Meyer BH, Segura JM, Martinez KL et al (2006) FRET Sips R (1950) On the structure of a catalyst surface. II. imaging reveals that functional neurokinin-1 receptors J Chem Phys 18:1024Ð1026 are monomeric and reside in membrane microdomains Slack R, Hall D (2012) Development of operational mod- of live cells. Proc Natl Acad Sci U S A 103:2138Ð2143 els of receptor activation including constitutive recep- Milligan G (2004) G protein-coupled receptor dimeriza- tor activity and their use to determine the efficacy of tion: function and ligand pharmacology. Mol Pharma- the chemokine CCL17 at the CC chemokine receptor col 66:1Ð7 CCR4. Br J Pharmacol 166:1774Ð1792 Monod J, Wyman J, Changeux JP (1965) On the nature Sun Y, McGarrigle D, Huang XY (2007) When a G of allosteric transitions: a plausible model. J Mol Biol protein-coupled receptor does not couple to a G pro- 12:88Ð118 tein. Mol Biosyst 3:849Ð854 8 Mathematical Modeling of G Protein-Coupled Receptor Function: What Can We Learn... 181

Terrillon S, Bouvier M (2004) Roles of G-protein-coupled interactions. A practical approach. Oxford University receptor dimerization. EMBO J 5:30Ð34 Press, Oxford, pp 289Ð395 Thron CD (1973) On the analysis of pharmacological White JF, Grodnitzky J, Louis JM et al (2007) Dimer- experiments in terms of an allosteric receptor model. ization of the class A G protein-coupled neurotensin Mol Pharmacol 9:1Ð9 receptor NTS1 alters G protein interaction. Proc Natl Urban JD, Clarke WP, von Zastrow M et al (2006) Acad Sci U S A 104:12199Ð12204 Functional selectivity and classical concepts of Whorton MR, Bokoch MP, Rasmussen SG et al (2007) quantitative pharmacology. J Pharmacol Exp Ther A monomeric G protein-coupled receptor isolated in 320:1Ð13 a high-density lipoprotein particle efficiently activates Urizar E, Montanelli L, Loy T et al (2005) Glycopro- its G protein. Proc Natl Acad Sci U S A 104:7682Ð tein hormone receptors: link between receptor homod- 7687 imerization and negative cooperativity. EMBO J 24: Whorton MR, Jastrzebska B, Park PS et al (2008) Efficient 1954Ð1964 coupling of transducin to monomeric rhodopsin in a Van Der Graaf PH, Schoemaker RC (1999) Analysis phospholipid bilayer. J Biol Chem 283:4387Ð4394 of asymmetry of agonist concentration-effect curves. Wreggett KA, Wells JW (1995) Cooperativity manifest in J Pharmacol Toxicol Methods 41:107Ð115 the binding properties of purified cardiac muscarinic Violin JD, Lefkowitz RJ (2007) Beta-arrestin-biased lig- receptors. J Biol Chem 270:22488Ð22499 ands at seven-transmembrane receptors. Trends Phar- Xu H, Staszewski L, Tang H, Adler E, Zoller M, Li X macol Sci 28:416Ð422 (2004) Different functional roles of T1R subunits in Wells JW (1992) Analysis and interpretation of binding the heteromeric taste receptors. Proc Natl Acad Sci U at equilibrium. In: Hulme EC (ed) Receptor-ligand S A 101:14258Ð14263 Part IV Bioinformatics Tools and Resources for GPCRs GPCR & Company: Databases and Servers for GPCRs and Interacting 9 Partners

Noga Kowalsman and Masha Y. Niv

Abstract G-protein-coupled receptors (GPCRs) are a large superfamily of mem- brane receptors that are involved in a wide range of signaling pathways. To fulfill their tasks, GPCRs interact with a variety of partners, including small molecules, lipids and proteins. They are accompanied by different proteins during all phases of their life cycle. Therefore, GPCR interactions with their partners are of great interest in basic cell-signaling research and in drug discovery. Due to the rapid development of computers and internet communi- cation, knowledge and data can be easily shared within the worldwide research community via freely available databases and servers. These provide an abundance of biological, chemical and pharmacological infor- mation. This chapter describes the available web resources for investigating GPCR interactions. We review about 40 freely available databases and servers, and provide a few sentences about the essence and the data they supply. For simplification, the databases and servers were grouped under the following topics: general GPCRÐligand interactions; particular families of GPCRs and their ligands; GPCR oligomerization; GPCR interactions with intracellular partners; and structural information on GPCRs. In conclusion, a multitude of useful tools are currently available. Summary tables are provided to ease navigation between the numerous and partially overlapping resources. Suggestions for future enhancements of the online tools include the addition of links from general to specialized databases and enabling usage of user-supplied template for GPCR struc- tural modeling.

N. Kowalsman ¥ M.Y. Niv () Institute of Biochemistry, Food Science and Nutrition, Dynamics, The Hebrew University of Jerusalem, The Robert H. Smith Faculty of Agriculture, Food and Jerusalem, Israel Environment and The Fritz Haber Center for Molecular e-mail: [email protected]

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 185 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__9, © Springer ScienceCBusiness Media Dordrecht 2014 186 N. Kowalsman and M.Y. Niv

Keywords G protein-coupled receptors ¥ Servers ¥ Databases ¥ Models ¥ Bitter taste receptors ¥ Protein-protein interactions ¥ Ligand sets ¥ Membrane mismatch ¥ Oligomerization ¥ Repository ¥ Interface ¥ Curation ¥ Xray crystallography ¥ Homology ¥ Partners ¥ Database cross referencing

Abbreviations endoplasmic reticulum (ER), maturation in the Golgi apparatus, and transport to the cell surface, 2D 2-dimensional followed by signal transduction and receptor 3D 3-dimensional internalization. At each of these stages, they are 5-HT 5-hydroxytryptamine accompanied by specialized interacting protein BLAST Basic Local Alignment Search partners and small molecules (Maurice et al. Tool 2011; Bockaert et al. 2010). The repertoire of CPI chemicalÐprotein interaction GPCR-interacting proteins (GIPs) is large and GIPs GPCR-interacting proteins varied. It includes heat-shock proteins, kinases GpHR glycoprotein-hormone receptor (for example GRKs), GPCR-associated sorting GPCR G-protein-coupled receptor proteins (GASPs), PSD-95/Discs-large/ZO-1 GDP guanosine diphosphate (PDZ) domain-containing proteins, G-protein GTP guanosine-50-triphosphate ’ subunit and many more (Maurice et al. 2011; KEGG Kyoto Encyclopedia of Genes and Magalhaes et al. 2012), see Fig. 9.1. Genomes The involvement of GPCRs in numerous OR pathways and pathological conditions makes PDZ PSD-95/Discs-large/ZO-1 them one of the most important classes of QSAR quantitative structureÐactivity rela- pharmacological targets (Overington et al. 2006; tionship Schlyer and Horuk 2006). Many pathological TM transmembrane conditions have been shown to be connected to UniProtKB Universal Protein Resource interactions between GPCRs and other proteins. Knowledgebase For example, the interaction between serotonin receptor 5-hydroxytryptamine 1B (5-HT1B) and an altered form of p11 protein has been shown 9.1 Introduction to be related to depression (Bockaert et al. 2010). The interaction between the PDZ protein G-protein-coupled receptors (GPCRs) constitute MUPP1 and gamma-aminobutyric acid receptor a large superfamily of membrane receptors B (GABAB) fine-tunes receptor signaling and that are involved in a wide range of signaling is potentially involved in diseases such as pathways, celebrated in the 2012 Nobel Prize epilepsy (Magalhaes et al. 2012). Serotonin 5- in Chemistry, awarded jointly to Robert J. HT2A receptors (2AR) Ð metabotropic glutamate Lefkowitz and Brian K. Kobilka for “for studies receptor 2 (mGluR2) complex has been suggested of G-protein-coupled receptors”. GPCRs are to be involved in the altered cortical processes of activated by diverse ligands, including amines, schizophrenia (Gonzalez-Maeso et al. 2008). It odorants, fatty acids, peptides and neurotrans- is therefore not surprising that the interactions mitters (Congreve et al. 2011a; Jacoby et al. between GPCRs and other molecules, such 2006). These receptors undergo several stages in as natural ligands, synthetic agonists and their life cycle, starting from biosynthesis in the antagonists, other GPCRs, G-proteins and 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 187

Fig. 9.1 Schematic view of a prototypical GPCR and in blue and purple lines, the ligands are depicted as cyan some possible interacting partners. This cartoon shows the circles, the G-protein units are shown as red, yellow and main interactions of a GPCR: ligand binding, oligomeriza- orange ellipses, the effectors are marked as pink hexagons tion with other GPCRs, coupling to G-proteins (that inter- and the additional GIPs are shown as green pentagons. act with additional effectors) and binding with additional The membrane is illustrated as a large yellow rectangle GIPs from various protein families. The GPCRs are shown

GIPs, are key aspects of GPCR-targeted drug In this chapter, we focus on freely available development. Moreover, taking into account databases and servers that supply information on GPCR interactions may help in designing more GPCR interactions. We organize these resources specific, and perhaps more potent drugs. For in the following order: (i) general receptorÐligand instance, synthetic bivalent and multivalent drugs and GPCRÐligand interaction databases, (ii) for GPCR oligomers represent attempts in this GPCRÐligand interaction databases dedicated direction (Shonberg et al. 2011). to particular GPCR families, (iii) databases The rapid development of the internet and in- focused on GPCR oligomerization, (iv) databases crease in information-sharing resources (such as dedicated to GPCR interactions with GIPs and electronic journals, databases and servers) make signaling pathways, and (v) databases and servers it easier for research groups and organizations to addressing structural information on GPCRs. disseminate data to the research community. This We conclude by outlining the challenges and leads also to an increase in the freely available opportunities in establishing, maintaining, and data on GPCRs. A vast repertoire of web-based using GPCRÐpartner databases and servers, databases and servers is being developed in this and highlight topics that, in our opinion, could field, raising the need to highlight and organize further enhance GPCR interactions-related online these data to make it easily accessible. tools. 188 N. Kowalsman and M.Y. Niv

screening protocol. PubChem and PubChem 9.2 Databases and Servers BioAssay are linked (Wang et al. 2012). PDSP Related to GPCR Ligands Ki contains more than 55,000 receptor K values. and Interactions i In addition to the Ki values it also supplies references to the relevant experiment (Roth et 9.2.1 General GPCR Ligands al. 2000). BindingDB focuses on interactions of and GPCR-Ligand Interactions proteins considered to be drug targets (GPCRs included) with small molecules. It contains more GPCRÐligand interactions are of major impor- than 900,000 entries of binding data, for more tance for GPCR-targeted drug discovery and de- than 6,000 protein targets and almost 400,000 velopment. Indeed, recent studies have provided small molecules and includes all data from PDSP a vast abundance of data on GPCRs and their Ki database (Liu et al. 2007). BindingDB allows ligands. broadening the information on GPCRs by linking Several freely available databases offer infor- to different sources, including IUPHAR-DB (see mation on a wide range of GPCRÐligand inter- below). SuperTarget II holds information on actions. We begin by describing databases that the relations between drugs and their protein are general to many protein families, including targets (including GPCRs), such as drug-related GPCRs, and proceed to databases and servers information, adverse drug effects and more. It dedicated to GPCRs and GPCRÐligand interac- currently contains over 6,000 target proteins, tions. The databases reviewed in this chapter, and annotated with more than 330,000 relations to their url addresses are summarized in Table 9.1. almost 200,000 compounds (Hecker et al. 2012). Databases such as PubChem (Bolton et The Therapeutic Target Database (TTD) al. 2008), PubChem BioAssay (Wang et al. aims to facilitate target-oriented drug discovery. 2012), Psychoactive Drug Screening Program The TTD database connects proteins (GPCRs in- Ki (PDSP Ki) (Roth et al. 2000), BindingDB cluded) with pathological conditions and known (Liu et al. 2007) and SuperTarget II (Hecker drugs. TTD contains about 2,000 targets and et al. 2012), supply information on many close to 18,000 drugs (Chen et al. 2002; Zhu et ligands of numerous receptors, including GPCRs. al. 2012). It links to the Kyoto Encyclopedia of These databases include links to information on Genes and Genomes (KEGG) pathway database bioassays, Ki and IC50 values and more. which provides information on the pathways in PubChem is a very large database based on which the proteins are involved (see Sect. 9.2.4). highly automated processing that contains more For example, searching the serotonin receptor 5- than 30 million compounds, some of which are HT1A (Universal Protein Resource Knowledge- GPCR ligands. For each compound, PubChem base (UniProtKB) entry P08908) retrieves a table provides information on basic physical and with an entry on 5-HT1A listing several diseases chemical properties (such as molecular weight, and drugs entries. Each entry leads to a new infor- hydrophobicity (AlogP), molecular formula), mation page. The 5-HT1A information page con- 2-dimensional (2D) and 3-dimensional (3D) tains synonyms, list of eight related diseases, lists structures that can be downloaded, synonyms, of known related drugs, quantitative structureÐ related literature, classification, patents, a list of activity relationship (QSAR) models, agonists vendors, and links to other (structurally related) and antagonists and more. compounds (Bolton et al. 2008). Many of the IUPHAR-DB and GPCRDB provide general other databases mentioned in this chapter supply information on GPCRs. Here we focus on data links to PubChem. The PubChem BioAssay provided on GPCR interactions with various lig- database holds records on bioactivity screens ands. IUPHAR-DB contains curated information of chemical substances. It provides searchable on protein superfamilies that are major biolog- data from each bioassay, including information ical targets of licensed medicinal drugs. This on the conditions and notes specific to a particular includes almost 360 GPCRs from human and 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 189 ) ) ), , ) ) ) ), , ) ) ) 2009 2012 2008 2006 2012 2011 2012 2002 1998 2000 2012 re free for 2007 ) ), Vroling et al. ) 2011 Roth et al. ( Harmar et al. ( Horn et al. ( 2003 ( Bolton et al. ( Wang et al. ( Liu et al. ( Bellis et al. ( Hecker et al. ( Zhu et al. ( Okuno et al. ( 2008 Chen et al. ( Cheng et al. ( values i for a large number ofGPCRs, drugs ion and channels, drug transporters candidates and (for enzymes). Supplies pharmacological, pathophysiological, chemical, genetic, functional and anatomical information for GPCRs and ion channels. Contains sequences, ligand binding constants, oligomerization data and mutations; computationally derived data, i.e. multiple sequencehomology alignments models. and Contains information on millions ofcompounds, small including their biological activities. Contains published and internally derived K Holds measured binding affinities for(including proteins GPCRs) and small-molecules. Contains information on the ligandsconstants and for binding family A GPCRsthe and properties basic of statistics the on ligands. Contains information on drug protein(GPCRs targets included), for example, drug-related information and adverse drug effects,ontologies, gene pathways and more. Enables ligand-based searches and searchesbinding for patterns of similar GPCRS to sets of ligands. pathological conditions and known drugs. it will bind, if at all. if if 50 50 if 50 50 50 50 50 possible possible available Information on binding constants Additional information References Ki, Kd and IC Not directly Provides information on the GPCRs and ligands. Various species Ki,Various species Kd, IC Ki, Kd, IC Various species Ki, Kd, IC HumanVarious species None Ki, Kd, IC Human and rodents Various species Connects proteins (including GPCRs) with Ki, Kd and IC Various species Ki, Kd, IC Human, mouse and rat Human Not relevant A server that predicts given a ligand to which GPCR http://pubchem. ncbi.nlm.nih.gov http://pdsp.med.unc. edu/pdsp.php http://www. bindingdb.org/bind/ index.jsp http://xin.cz3.nus. edu.sg/group/ttd/ttd. asp http://bioinf-apache. charite.de/ supertarget_v2/ http://www.iuphar- db.org http://www.gpcr. org/7tm https://www.ebi.ac. uk/chembl/sarfari/ gpcrsarfari http://pharminfo. pharm.kyoto-u.ac. jp/services/glida/ http://www.lmmd. org/online_services/ cpi_predictor/ Information on GPCR ligands and general GPCRÐligand interactions can be obtained via these databases Table 9.1 DatabasePubChem and PubChem BioAssay PDSP Ki URLBindingDB TTD Species SuperTarget II IUPHAR-DB GPCRDB GPCR SARfari GLIDA CPI-Predictor The data from all these databases can be downloaded. Some parts of the data in GPCR IUPHAR-DB cannot be downloaded but can be given upon request and they a academic users. GLIDA is deposited in PubChem and can be downloaded from there 190 N. Kowalsman and M.Y. Niv rodent sources and thousands of their ligands. It A multitarget quantitative structureÐactivity integrates pharmacological (including the Ki and relationship (mt-QSAR) method was developed IC50 values of ligands, when available), patho- recently using a dataset of about 82,000 physiological, chemical, genetic, functional and chemical-protein interaction pairs between anatomical information (Harmar et al. 2009). almost 51,000 compounds and 136 GPCRs GPCRDB specializes in GPCRs from different and is available via the mt-QSAR-based web species. It contains data on sequences, ligand- server named CPI-Predictor. The user inserts binding constants, oligomerization and mutations the compound of interest (either by structure from more than 12,000 experiments. In addi- or in SMILES format) and the server returns tion, it stores computationally derived data, such target GPCRs predicted to bind this compound. as multiple sequence alignments and homology This predictor has the potential to facilitate models (Horn et al. 1998, 2003; Vroling et al. applications in network pharmacology and drug 2011). repositioning (Cheng et al. 2012). GPCR SARfari (part of ChEMBL database) and GPCR-Ligand Database (GLIDA) are dedicated to interactions between GPCRs and 9.2.2 Ligand Interactions their ligands. GPCR SARfari integrates target, of Particular GPCR Families 3D structure, compounds of experimental relevance and screening data (including binding In addition to the general GPCRÐligand constants) in a single chemogenomics database databases, databases and servers specialized in for class A GPCRs. It provides information specific GPCR families have been established on the molecular weight, AlogP, polar surface (summarized in Table 9.2). The dedicated efforts area and affinities of the compounds that bind and manual curation of some of these sites a specific GPCR. In total, GPCR SARfari provide data that are unavailable in the more contains information on over 142,000 compounds general resources, justifying the existence of (Bellis et al. 2011). GLIDA holds thousands these specialized databases alongside the more of human, rat and mouse GPCR entries and general ones. tens of thousands of ligand entries. It supplies Glycoprotein-Hormone Receptor (GpHR) general information on GPCRs and their ligands Information System (GRIS) (Van Durme et and links to additional information databases al. 2006) and Sequence-Structure-Function- such as GPCRDB, IUPHAR-DB and PubChem. Analysis of Glycoprotein Hormone Receptors GLIDA allows a user-friendly ligand-based (SSFA-GPHR) (Kleinau et al. 2007)are search of the database that is based on clustering dedicated to GpHRs, including thyrotropin of ligand structures. In addition, it can create receptor (TSHR), follitropin receptor (FSHR) correlation maps between similar GPCRs and and lutropin/choriogonadotropin receptor their ligands (Okuno et al. 2006, 2008). Each (LHR/CGR). GRIS contains GpHR sequences, database contains information that is not found models, and information about mutations and in the other: GPCR SARfari supplies binding their phenotypes. It allows the users to perform constants and allows filtering of the search results residue and rotamer analyses using the 3D based on preferred values of activity types (such models and to predict the structural effects as LogIC50, Binding, activity percent and many of mutations in silico (Van Durme et al. more). GLIDA provides interaction-correlation 2006). SSFA-GPHR integrates large amounts maps between similar GPCRs and their ligands. of information on mutations in GpHRs and For example, for 5-HT1A, GPCR SARfari returns uses 2D snake-plot diagrams and 3D models more than 3,600 binding compounds and more to supply information on the mutation’s location. than 5,000 entries on bioactivity data for ligands, In addition, it includes information on the effects whereas GLIDA supplies data on 1,110 agonists of mutations on receptor expression, binding and and 604 antagonists. activity (Kleinau et al. 2007). 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 191 ) ) ) ) ) , ) 2007 2011 2009 2012 2010 2002 2011 ownloaded ) ), Safran et al. ) ) Scent, SuperSweet 2006 2004 2003 2003 Van Durme et al. ( Kleinau et al. ( Crasto et al. ( Olender et al. ( ( Dunkel et al. ( Ahmed et al. ( Wiener et al. ( sequences, models, mutations and their phenotypes. GpHRs, including their effects onexpression, GpHR binding receptor and activity. and ORDB link between odorantsORDB and was olfactory developed receptors. over the yearsrange and of now chemosensory includes receptors. a Provides information on the genomiccluster localization organization and of 855 humangenes. olfactory receptor searched by structure, similarity orentry properties. includes Each basic scent details and supplier information. Provides homology models of thepredicted sweet complexes receptor with and the its molecules. Compounds can be searched bySupplies structure, 2D properties, snake-plot etc. representations of the receptors. Provides response profile of odor receptors and odorants. Galizia et al. ( Human and ratVarious speciesDrosophila melanogaster Integrates large amounts of information on mutations in Human and mouse Supplies basic information on the odorants andHuman. receptors, Supply orthologues in chimpanzee, dog, opossum and platypus Various species Predicts olfactory receptors for small molecules.HumanHuman models A database Liu for et volatile al. scents. ( The compounds can be Dedicated to sweet and Links predicted-to-be bitter sweet compounds molecules. and bitter taste receptors. Various species Contains heterogeneous data on GpHRs such as (the site is http://gris.ulb.ac.be/ http://www.ssfa-gphr.de/ http://senselab.med.yale.edu/ odordbhttp://senselab.med. yale.edu/ordb http://neuro.uni-konstanz.de/ DoOR/default.html http://mdl.shsmu.edu.cn/ ODORactor/ http://genome.weizmann.ac.il/ horde/ http://bioinf-applied.charite. de/superscent/index.php?site= home http://bioinf-applied.charite. de/sweet http://bitterdb.agri.huji.ac.il/ bitterdb/ not always accessible) Databases for ligand interactions of particular GPCR families Table 9.2 DatabaseGRIS URLSSFA-GPHR OdorDB, ORDB DoOR ODORactor SpeciesHORDE SuperScent SuperSweet Additional informationBitterDB References The data from all these databases can be downloaded except in the cases: OdorDB and ORDB. Structures and sequences can be downloaded from GRIS. In Super and BitterDB some of the data on each molecule can be downloaded separately. In SSFA-GPHR the search results as 2D Snake plot or 3D homology model can be d 192 N. Kowalsman and M.Y. Niv

Several specialized databases are devoted to sweet molecules, extracted from the literature and sensory receptors, including OdorDB, Olfactory publicly available databases such as PubChem Receptor Database (ORDB) (Crasto et al. 2002, and PDB. This dataset was expanded by using 2003), Database of Odor Responses (DoOR) similarity search methods, hence it also contains (Galizia et al. 2010), ODORactor (Liu et al. molecules that are predicted, but not proven, to 2011), The Human Olfactory Data Explorer be sweet. In addition, SuperSweet contains a (HORDE) database (Olender et al. 2004; Safran homology model of the T1R3-T1R2 sweet recep- et al. 2003) SuperScent (Dunkel et al. 2009), tor dimer. For some of the sweet compounds, SuperSweet (Ahmed et al. 2011) and BitterDB the server provides docking poses in the model (Wiener et al. 2012). structure (Ahmed et al. 2011). The database for odor molecules, OdorDB, BitterDB was developed and is maintained by and the database for olfactory receptors (ORs), our group (Wiener et al. 2012)(http://bitterdb. ORDB, are related and link odor ligands to the agri.huji.ac.il/bitterdb/). This database includes a ORs they activate. ORDB was expanded and collection of known bitter compounds, gathered contains additional chemosensory receptors, for from the literature and from Merck index, and example taste papilla receptors. The chemosen- their 25 associated human bitter taste receptors. sory genes and proteins can be searched using It encourages users to add their own information a variety of classification entries such as organ- on bitter or tasteless molecules. BitterDB pro- ism, tissue, strain and the labs that do functional vides several ways to investigate the bitter world: analysis on the receptor. The odors in OdorDB search for bitter compounds by different physico- can be searched by different traits (for example chemical properties, search for bitter molecules cyclic structures, functional groups). It also lists with structures similar to a query compound, the receptors found to be involved with each odor cross-link between a bitter compound and the (Crasto et al. 2002, 2003). DoOR database sup- receptor that binds it, option to use Basic Local plies the response profile of ORs in Drosophila. Alignment Search Tool (BLAST) against human It allows searching by receptor or by odorant bitter receptor sequences and more. It supplies and retrieves the normalized response graphs of basic information on the bitter compounds and each. DoOR provides a downloadable map of receptors and a 2D snake-plot representation of the Drosophila antennal lobe’s physiological re- the receptors. sponse to an odorant (Galizia et al. 2010). The server ODORactor predicts whether a molecule (inserted in SMILES format or via drawing its 9.2.3 GPCR Oligomerization 2D structure) is an odorant, and its possible as- sociated ORs. The server provides links to other In the last decade, it has become clear that many data sources for information about the receptors GPCRs may function as homo or heterodimers (Liu et al. 2011). HORDE is a database of human or higher order oligomers (Gonzalez-Maeso et OR genes, supplying information on genomic and al. 2008; Gorinski et al. 2012; Lohse 2010). cluster organizations of 855 human OR genes Oligomerization has implications for trafficking, and their orthologues in other mammals (Olender signaling and pharmacology of many members et al. 2004; Safran et al. 2003). The SuperScent of the GPCR family (Lohse 2010; Rediger et al. database contains more than 2,000 volatile scents. 2011) thus increasing the importance of GPCR The scents are classified and can be searched by oligomerization for general understanding and structure, or by properties such as name, molecu- drug discovery. lar weight, chemical group and more. Each scent Two main web resources deal with GPCR entry includes basic details and supplier informa- oligomerization: the GPCR Oligomerization tion and is linked to the relevant PubChem entry Knowledge Base Project (GPCR-OKB) and for additional information (Dunkel et al. 2009). the GPCRs Interaction Partners (GRIP) server SuperSweet is a database for natural and artificial and database (Table 9.3). 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 193

Table 9.3 Databases and servers dealing with GPCR oligomerization Database URL Species Additional information References GPCR-OKB http://data.gpcr-okb. Various Contains information about GPCR Khelashvili org/gpcr-okb/ species oligomers, including phenotypic changes, et al. (2010) evidence for physiological/clinical relevance, effects of oligomer-specific ligands and proposed mechanisms of activation. http://grip.cbrc.jp/ Not relevant Predicts interfaces for oligomerization Nemoto et al. GRIP server GRIP/ sites based on sequence alignment of the (2009) query GPCR and its homologs, and a template structure (either rhodopsin or “2 adrenergic receptor). http://grip.cbrc.jp/ Uses human Holds information on GPCR Nemoto et al. GRIP GDB/index.html sequences as oligomerization and suggested interfaces (2011) database representative based on published experiments and GRIP server predictions. http://memprotein.org/ Not relevnt A novel approach that can predict Mondal et al. CTMDapp new-continuum- oligomerization interfaces on membrane (2011) Downloadable molecular-dynamics- proteins monomers, GPCRs included. The executable ctmd-software-made- predictions are based on energetic cost available-by- calculations of the hydrophobic computational- mismatch-driven deformation in modeling-core membrane bilayers generated by multihelical membrane proteins. http://thebiogrid.org/ Various Contains genetic and protein interaction Stark et al. BioGRID species data (GPCRs included) from various (2006) species, including humans. For a specific protein, the database supplies a list of proteins that have been found to interact with it and the appropriate reference. http://www.hprd.org/ Human Holds information on domain architecture, Keshava HPRD post-translational modifications, Prasad et al. interaction networks and disease (2009) association for each protein in the human proteome (including GPCRs). The database supplies vast information on a query protein, including a list of interacting partners (some of them additional GPCRs). The data from these databases except GRIP database can be downloaded. HPRD is free for academic users

GPCR-OKB compiles data on oligomerization activation. For additional data on the protomers, for all types of GPCRs. It allows browsing, it links to GPCRDB (Khelashvili et al. visualization, and structured querying of 2010). computational and experimental information on The GRIP database (Nemoto et al. 2011)in- GPCR oligomerization. The database provides cludes two sections: an experimental informa- information on the oligomer, including the tion section which contains experimentally iden- proposed interface (at TM and residue levels), tified GPCR oligomers, and their experimentally phenotypic changes, in-vivo evidence and suggested interfaces for the oligomerization, if biological relevance. In addition, GPCR-OKB known; the second section contains predictions presents the effects of ligands that are oligomer- on GPCR oligomerization made by the GRIP specific and their proposed mechanisms of server. The prediction relies on multiple sequence 194 N. Kowalsman and M.Y. Niv alignment-based conservation scores and their of proteins that have been found to interact with mapping on the surface of the template structure it, along with the relevant reference (Stark et al. of either rhodopsin or “2 adrenergic receptor. 2006). The output includes a model structure and a HPRD contains information regarding domain list of interacting residues (Nemoto et al. 2009). architecture, post-translational modifications, in- The user can choose to view results based on teraction networks and disease association for experiments, predictions or both. It is known each protein in the human proteome. Querying that 5HT1A can form homo and hetero oligomers for a specific GPCR returns a list of interacting (Lukasiewicz et al. 2007; Renner et al. 2012; partners (some of them other GPCRs) and the Woehler et al. 2009) and we have recently identi- relevant references. HPRD is manually curated fied, using computational and experimental meth- (Keshava Prasad et al. 2009). ods, residues involved in 5HT1A homodimeriza- BioGRID supplies five entries of interacting tion (Gorinski et al. 2012). GPCR-OKB finds partners for 5-HT1A, but only when using the syn- seven entries for 5-HT1A oligomerization either onym HTR1A, Three of these are other GPCRs. with another 5HT1A or additional GPCRs. No Searching HPRD for 5-HT1A (using the name entries for 5HT1A oligomerization were found in 5-hydroxytryptamine receptor, 1A) returns eight the GRIP database. entries of interacting partners, including 5HT1A Recently, a new Continuum-Molecular itself. Dynamics (CTMD) software tool was developed and made available for download (Mondal et al. 2011). This novel tool energetically 9.2.4 Intracellular Partners quantifies the hydrophobic mismatch-driven and Signaling Pathways deformation in membrane bilayers generated by multihelical membrane proteins. The method During their life cycle, GPCRs are accompanied accounts for both the membrane remodeling by a range of specialized GIPs that assist nascent energy and the energy contribution from any receptors in proper folding, target them to the incomplete alleviation of the hydrophobic appropriate subcellular compartments and help mismatch by membrane remodeling and can them fulfill their signaling tasks (Maurice et al. thus predict optimal monomeric orientations 2011), see Fig. 9.1. GIPs have an important role within oligomers. Results may also explain how in regulating receptorÐligand interaction. The in- distinct ligand-induced conformations of GPCRs teraction between GPCRs and GIPs creates larger result in differential effects on the membrane GPCR-associated protein complexes, placing the environment. The CTMD software executable GPCRs at the center of protein networks that par- can be downloaded from the Membrane Protein ticipate in the dynamic regulation of the receptor Structural Dynamics Consortium (MPSDC) (Daulat et al. 2009; Ritter and Hall 2009). Gateway http://memprotein.org/. GIPs databases and servers deal with G- The Biological General Repository for proteins (Elefsinioti et al. 2004; Satagopam et Interaction Datasets (BioGRID) (Stark et al. al. 2010; Theodoropoulou et al. 2008) and PDZ 2006) and the Human Protein Reference domains (Beuming et al. 2005), and these are Database (HPRD) (Keshava Prasad et al. summarized in Table 9.4 and described below. 2009) hold information on protein interactions, The heterotrimeric G-protein complex is vital including GPCRÐGPCR interactions. BioGRID for GPCR signal transduction. It is composed contains genetic and protein interaction data from of ’, “, and ” subunits, each containing several various species, including humans. It stores over types (in humans there are 21 G’,6G“ and 12 500,000 curated interactions taken from datasets G” subunits) (Oldham and Hamm 2008). The and individual focused studies, derived from over interaction pattern of GPCRs with G-proteins is 30,000 publications. The user can search for a complex: some GPCRs can bind several types of specific GPCR and obtain, for each species, a list G-proteins, yielding various signal-transduction 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 195

Table 9.4 Databases of GPCRs interactions with other proteins and signal-transduction Database URL Species Additional information References gpDB http://biophysics.biol. Various Provides information on the coupling Elefsinioti uoa.gr/gpDB/ species between GPCR and G-protein ’ subunit et al. (2004) and the interactions between the G-protein units and different effectors. Last updated in 2008. http://biophysics.biol. Human Provides information on GPCRÐG-protein Satagopam Human-gpDB uoa.gr/human_gpdb/ interactions and coupling between GPCR et al. (2010) and G-protein ’ subunit and the interactions between the G-protein units and different effectors. Allows 2D and 3D visualization of the connections between GPCRs, related drugs, G-proteins and effectors. http://athina.biol.uoa. Not Predicts the G-protein families that couple Sgourakis et PRED- gr/bioinformatics/ relevant to a specific GPCR. al. (2005) COUPLE PRED-COUPLE2/ 2.00 http://icb.med.cornell. Various Provides information on more than 300 Beuming et al. PDZBase edu/services/pdz/start species interactions involving PDZ domains, some (2005) of them with GPCRs. Information is obtained from in-vivo or in-vitro experiments. http://www.reactome. Various REACTOME is a database of pathways Croftetal. REACTOME org/ReactomeGWT/ species and reactions in human biology. Allows (2011) entrypoint.html detailed visualization and navigation of biological pathways. http://www.genome.jp/ Various Links between different KEGG databases. Kanehisa and KEGG kegg/pathway.html species Contains genomic information, pathways Goto (2000), pathway of pathological conditions and synthesis Kanehisa et database of drugs. al. (2012) http://www.signaling- Various Dedicated to proteins involved in Li et al. The UCSD gateway.org/molecule/ species signaling (GPCRs included). Each protein (2002), Signaling entry contains interacting partners and Saunders et al. Gateway other information. (2008) The data can be downloaded from all databases except gpDB and human-gpDB. For downloading the entire KEGG dataset academic users can subscribe to the KEGG ftp site (managed by NPO Bioinformatics, Japan) and non-academic users requires a commercial license agreement

responses; and the same type of G-protein can information on the interacting partners. The bind to different GPCRs. In addition, the G- human-gpDB allows 2D and 3D visualization protein’s subunits activate or inhibit a variety of the network of connections between GPCRs, of effector proteins that modulate the cellular related drugs, G-proteins and effectors. It also signal-transduction functions (Bosier and Her- offers links to many additional GPCR web mans 2007; Wettschureck and Offermanns 2005). resources (Satagopam et al. 2010). Two databases are available for G-protein The server PRED-COUPLE 2.00 predicts complexes: the general gpDB (Elefsinioti et coupling specificity of GPCRs to G-protein al. 2004; Theodoropoulou et al. 2008) and the families. The PRED-COUPLE algorithm more recent Human-gpDB (Satagopam et al. combines a data-mining exploratory approach 2010). Both provide information on the coupling with the discriminative power of profile hidden between GPCR and G-protein ’ subunit and on Markov models. The method was trained using the interactions between the G-protein subunits two datasets of GPCRs with known non- and different effectors, and supply additional promiscuous binding to G’i/G’o,G’q/G’11 196 N. Kowalsman and M.Y. Niv or G’s and validated on two other datasets of tion’ includes many biological events that entail GPCRs. The user inserts a GPCR sequence changes in state, such as binding, activation, and gets a normalized probability score (from translocation and degradation, in addition to clas- 0 to 1) for coupling to each one of the four sical biochemical reactions). It allows detailed G-protein families. A higher score represents a visualization and navigation of known pathways. higher coupling probability. Probability scores It also returns inferred reactions for orthologous below 0.3 are considered negative predictions proteins of over 20 non-human species (Croft et (Sgourakis et al. 2005). For example, the al. 2011). The KEGG pathway database is part PRED-COUPLE retrieves only G’i/G’o as of the KEGG integrated database resource, which having significant probability for coupling to consists of 17 categorized databases (Kanehisa the human 5-HT1A receptor. G’i/G’o is indeed and Goto 2000; Kanehisa et al. 2012). KEGG the main coupling G-protein ’ subunit of 5-HT1A pathway links the different KEGG databases, thus (Pucadyil et al. 2005). providing information on related drugs and addi- PDZ domains consist of 80Ð90 amino acids tional, mostly genomic, information on GPCRs. that usually bind to a specific peptide sequence It allows examination of the pathways of patho- at the C terminus of their interacting proteins. logical conditions and diseases, showing the roles Many GPCRs contain a PDZ-binding motif at of different proteins, including GPCRs, in those their C terminus that allows them to bind PDZ conditions (Kanehisa and Goto 2000; Kanehisa proteins. Through these interactions, the PDZ et al. 2012). The UCSD Signaling Gateway pro- proteins participate in GPCR regulation of sig- vides information on more than 8,000 proteins naling, trafficking, and function (Magalhaes et (GPCRs included) involved in cellular signaling. al. 2012; Romero et al. 2011). PDZBase is a The data on each protein is represented by a specific database for interactions involving PDZ “Molecule Page”. The molecule page is divided domains. The database can be searched by PDZ into two main parts; one contains information name, accession code, ligand accession code, contributed by authors, such as a short review on specific sequence motif in either the PDZ or the the main functional and biological properties of ligand, and specific residue in the PDZ domain. the molecule, including key citations, an interac- PDZBase contains more than 300 interactions tive overview of the different states of the protein obtained from either in-vivo (coimmunoprecip- and the transitions between them, a list of known itation) or in-vitro (GST-fusion or related pull- functions and more. The second part holds data down) experiments (Beuming et al. 2005). acquired automatically from public databases, for BioGRID and HPRD (mentioned in Sect. 9.2.3) example sequence information, links with infor- provide references for interactions between mation on interacting partners, graphical repre- GPCRs and other proteins, such as G-proteins, sentations of protein domains and motifs, links to PDZ domain-containing proteins and additional pathway information in KEGG and REACTOME GIPs (Stark et al. 2006; Keshava Prasad et al. (Lietal.2002; Saunders et al. 2008). 2009). As an example, the REACTOME server re- General information on GPCR-signaling path- sults in 12 entries of reaction, pathway, com- ways and the molecules which make up these plex, or protein connected to 5-HT1A for the 5- pathways can be found in the REACTOME HT1A receptor query. Each entry consists of a server (Croft et al. 2011), the Kyoto Encyclope- few lines of information and links to an infor- dia of Genes and Genomes (KEGG) pathway mation page that includes a pathway scheme of database (Kanehisa and Goto 2000; Kanehisa et that reaction, pathway, etc. Querying the KEGG al. 2012)ortheUCSD Signaling Gateway (Li pathway database on 5-HT1A, results in an en- et al. 2002; Saunders et al. 2008). REACTOME try with a thumbnail map and information on is a curated database of pathways and reactions serotonergic synapses. The map is interactive and (pathway steps) in human biology (where ‘reac- allows obtaining more information specific for 5- 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 197

HT1A-F and other serotonin receptors. Searching and evaluated (Mobarec et al. 2009; Yarnitzky et the UCSD Signaling Gateway for 5-HT1A, results al. 2010; Costanzi 2012; Levit et al. 2012). in two entries: one for the mouse receptor and In some cases, there have been successes in one for the human receptor. Each entry links to predicting interactions with ligands using GPCR a “molecule page” with information on the re- models (Kufareva et al. 2011; Levit et al. 2011; ceptor and to annotated data, including pathways. Provasi et al. 2009) and in numerous cases, SuperTarget II (mentioned in Sect. 9.2.2)also homology-based virtual screening campaigns provides pathway information and is linked to the have yielded high hit rates and contributed to drug KEGG pathway database (Hecker et al. 2012). discovery (Evers and Klebe 2004; Congreve et Searching SuperTarget II for 5-HT1A results in al. 2011b; Vilar et al. 2011; Carlsson et al. 2011). target details page that contains a link to KEGG Here we review freely available web resources pathways related information, list of drugs related for experimentally solved and computationally to 5-HT1A, synonyms, similar targets and list of modeled GPCR structures (summarized in proteins known to bind 5-HT1A. Each of these can Table 9.5). open into a separate page. The main universal depository of X-ray- and NMR-solved structures of macromolecules is the Protein Data Bank (PDB) (Berman et al. 9.2.5 Structural Information 2000). A user can create a personalized PDB on GPCRs (MyPDB) account to save previous searches, get alerts on new structure releases, save notes on Structural information on GPCRs is invaluable structures and more. Solved GPCR structures for GPCR signaling research and drug design can be retrieved from PDB or alternatively (Shoichet and Kobilka 2012; Granier and Kobilka from the Membrane Proteins of Known 3D 2012). GPCRs share a seven-transmembrane Structure database, http://blanco.biomol.uci. (TM) helical fold but the structure and length edu/mpstruc. The latter database provides useful of their N and C termini and loops vary greatly. information on integral membrane proteins to For many years, challenges in expressing and which crystallographic, or sometimes NMR crystallizing GPCRs prevented deciphering their structures, have been determined to a resolution structure. However, the last few years have sufficient to identify TM helices of helix-bundle witnessed great breakthroughs in GPCR X-ray membrane proteins (typically up to 4Ð4.5 Å). crystallography, paving the way for solving the The Orientations of Proteins in Membranes structures of 17 GPCRs, some in their active (OPM) database extracts the structural data state (bound to agonists), some in their inactive from the PDB and predicts the orientation of state (bound to antagonists) and some captured the protein (GPCRs included) in the membrane in both (Shoichet and Kobilka 2012; Katritch at the residue level and the tilt of the protein in et al. 2012; Jacobson and Costanzi 2012). Most degrees. The proteins are classified by species recently, an active structure of peptide-bound and localization in membrane types (Lomize et neurotensin receptor (White et al. 2012) and al. 2006). an NMR structure of chemokine receptor were An important resource is UniProt (http://www. solved (Park et al. 2012). .org/) (The UniProt Consortium 2012). Despite the outstanding progress in GPCR X- UniProt is composed from several databases that ray crystallography, structures were solved so far provide information on protein sequence and an- for a small portion of all GPCR types. Thus, notation. This information is very valuable for modeling the structures of additional GPCRs, protein modeling, including GPCRs. to be used in research and drug development, Several databases that were mentioned in the remains indispensable. With X-ray structures on previous sections provide 3D models for visu- the rise, the models can now be better assessed alizations of query proteins (GPCRDB, GPCR 198 N. Kowalsman and M.Y. Niv

Table 9.5 Databases and servers holding GPCR structures Database URL Additional information References PDB http://www.rcsb.org/ The main universal depository of X-ray- and Berman et al. pdb/home/home.do NMR-solved macromolecule structures. (2000) http://blanco.biomol. Provides information about integral membrane None available Membrane Proteins uci.edu/mpstruc proteins whose structures have been of Known 3D determined. Structure database http://opm.phar. Predicts the orientation of the proteins in the Lomize et al. OPM umich.edu/ membrane for membrane protein structures (2006) found in the PDB. http://cssb.biology. A repository of homology models for all 907 Zhang et al. Human GPCR gatech.edu/skolnick/ human GPCRs. (2006) models repository files/gpcr/gpcr.html http://zhanglab.ccmb. A resource for modeling GPCRs. Composed Zhang and Zhang GPCRRD med.umich.edu/ of three main parts: database of experimental (2010) GPCRRD/ restraints that can be used in the modeling, GPCR-ITASSER modeling server and 1,028 pre-computed models for human GPCRs. http://www.ssfa-7tmr. Provides homology models of the TM helices Worth et al. GPCR-SSFE de/ssfe/ for more than 5,000 family A GPCRs and the (2009) templates used for modeling the TMs. http://gpcr.usc.es/ A server for GPCR modeling that aims to Rodriguez et al. GPCR-ModSim integrate: multiple sequence alignment, (2012) modeling, loop refinement and global refinement stages. http://genome.jouy. A modeling server devoted to olfactory Launay et al. GPCRautomodel inra.fr/GPCRautomdl/ receptor modeling and docking odorants to (2012) cgi-bin/welcome.pl these models. The server can model other types of GPCRs as well. http://cavasotto-lab. Sets of ligands for GPCR targets and the Gatica and GLL and GDD net/Databases/GDD/ decoys produced for these ligands. Cavasotto (2012) http://evfold.org/ A program for membrane protein folding. The Hopf et al. EVfold_membrane transmembrane/ models created by this program can be (2012) downloaded and the software code for creating evolutionary constraints can be obtained from the authors.

SARfari, SSFA-GPHR, GRIS and SuperSweet). GPCRs. The GPCRRD contains three main parts: Resources that are specifically dedicated to pro- (i) the GPCR experimental restraints database. viding models or servers for modeling GPCRs are This contains experimental results that have described next. been systematically collected from the literature A repository of homology models for human and other GPCR-related web resources and GPCRs (Zhang et al. 2006) was recently updated can be used as restraints in modeling; (ii) the and now contains almost 1,000 human GPCR GPCR-ITASSER, which is a GPCR-specialized model structures combined with virtual screening version of the I-TASSER server, and (iii) pre- of the ZINC (Irwin et al. 2012) database computed models of 1,028 putative human (Zhou and Skolnick 2012). The I-TASSER GPCRs (from 2010) using GPCR-ITASSER. The server enables modeling of protein structures user can either search or browse the databases for (including GPCRs) using multiple templates restraints and models (Zhang and Zhang 2010). (Zhang 2008). The recent GPCR Research The GPCR-Sequence-Structure-Feature- Database (GPCRRD) is designed to facilitate Extractor (GPCR-SSFE) database provides the construction of 3D structural models of initial homology models of the TM helices 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 199 for more than 5,000 family A GPCRs from enforcing ligandÐdecoy chemical dissimilarities. various species. The templates are selected for The docking performance of the GDD was each TM separately. The models and sequence evaluated on 19 GPCRs, showing a marked alignments can be viewed and downloaded, decrease in enrichment compared to bias- as can the templates that were used to create uncorrected decoy sets. Both the GLL and GDD the TM helices. The site also provides a link are freely available to the scientific community for loop modeling (Worth et al. 2009). The (Gatica and Cavasotto 2012). new GPCR Modeling and Simulation toolkit A new algorithm for membrane pro- (GPCR-ModSim) server aims to integrate all tein folding has been recently published, stages of modeling: multiple sequence alignment, EVfold_membrane, with very impressive modeling, loop refinement and global refinement success in de-novo 3D-structure predictions (using molecular dynamics). GPCR-ModSim (Hopf et al. 2012). The authors blind-tested allows the user to decide on every step involved in the EVfold_membrane protocol on ’-helical the modeling pipeline: selection of the template transmembrane proteins representing 23 diverse from lists of either “active” or “inactive” and families with known structures. In 21 of the cases “inactive-like” structures, manual editing of the the C’-rmsd was 2.6Ð4.8 Å for more than 70 % query-template alignment, selection of the most of the length (Hopf et al. 2012). There is no appropriate 3D-model, the definition of the loop server for EVfold_membrane, but the receptors regions to be refined and, if the user wishes, models, the 3D coordinates and the evolutionary global refinement. The user can choose several constraints used in the paper can be downloaded. templates and based on each of them alone the The software code for the evolutionary constraint server will create homology models (Rodriguez calculation can be obtained upon request. This et al. 2012). program may help in modeling class B and class GPCRautomodel is another modeling server C GPCRs, which is challenging due to the lack of for GPCRs. It was created mainly for modeling X-ray structures of the TM part for these families. ORs but it can model other GPCRs as well. The user can decide between seven GPCR templates for the modeling. The server also enables docking 9.3 Discussion and Outlook of small molecules to the resulting GPCR model. For each set of models generated, one template A large variety of freely available databases and structure is used. The website supplies a library servers supply information on GPCR interactions of odorant compounds that can be docked but and binding partners. These databases provide the user can also insert his or her own molecule an important tool in understanding biological (Launay et al. 2012). Note that to use GPCR- pathways, chemoinformatics and development of ModSim and GPCRautomodel (ligand-binding new drugs, and in enabling “systems chemistry” prediction), the user must first get a MODELLER alongside the more established “systems biol- (Sali and Blundell 1993) license key, which is ogy”. The value of these resources is reflected freely available. Both servers have a link to the in the significant number of publications citing MODELLER registration site. these databases and servers during 2011Ð2012 Of significance for docking studies, some (Fig. 9.2). chemical libraries may have a strong bias toward Interestingly, the specialized GPCR interac- GPCR-like ligands because GPCRs are important tion databases supply links to the more general drug targets (Kolb et al. 2009). Recently, a and typically bigger ones, but the opposite does GPCR ligand library (GLL) for 147 targets, not hold true. In many cases, the specialized selecting 39 decoy molecules for each ligand, databases were created by dedicated groups for was collected in the GPCR Decoy Database their own research and for disseminating data (GDD). Decoys were chosen to ensure a ligandÐ to other groups. These specialized resources can decoy similarity of six physical properties, while contribute valuable insights and information and 200 N. Kowalsman and M.Y. Niv

Fig. 9.2 Number of publications citing the papers that in the years 2011Ð2012 (up to October). The numbers describe the databases and servers reviewed in this chap- of citations were taken from the Web Of KnowledgeSM ter, during 2011Ð2012. The figure shows the total number (Thomson Reuters) using the “Create citations report” of citations for papers connected to the databases and option servers discussed in Sects. 9.2.2, 9.2.3, 9.2.4,and9.2.5 are often manually curated. In our view, intro- Computational screening and modeling are ducing links from the general databases to the often used by the research community and specialized ones would be useful for the GPCR drug discovery in general, and in the GPCRs community. Very recently, links from the large field in particular. Computational resources and widely used ZINC database of compounds are particularly powerful in cases where (Irwin et al. 2012) to the specialized BitterDB experimental results are limited. For example, database of bitter compounds were introduced. the number of solved GPCR structures is Among the specialized databases, only few are less than 2 % of all GPCR types. Modeling dedicated to GIPs. It is known that GPCRs in- of GPCRs helps researchers in cases where teract with many different GIPs throughout their there is no available structure. However, the life cycle and these interactions are vital for the user must bear in mind that the models are maturation and proper signaling of the GPCRs. only predictions, and are not necessarily fully We believe that the research community and drug accurate. Therefore, the structural models must discovery can greatly benefit from additional web be validated prior to their use, for example resources on specialized GIPs and we hope that by checking structural characteristics such as in the future, more websites on GIPs and GPCR Ramachandran plots (Ramachandran et al. 1963), effectors will be developed. i.e. using PROCHECK (Laskowski et al. 1993) 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 201 and by comparing residues that are known to a unified system for describing these databases be important in GPCRs against solved GPCR in a single, centralized location (Gaudet et al. structures (Levit et al. 2012), The most powerful 2011). Having a united platform for all databases confirmation of a model’s accuracy stems from and servers will make them even more accessible comparisons to experimental data. This may to users from all fields. However, this kind of include indirect inference from mutagenesis and platform should be practical in the sense that it subsequent binding or functional assays (Levit should allow small labs that do not have a lot et al. 2012), from screening enrichment data of resources to participate and contribute their or, ultimately, comparing against newly solved knowledge and databases to it. Furthermore, it structure (Kufareva et al. 2011). is not yet clear how partial overlaps and partial As repositories of GPCR models rapidly be- redundancies will be treated. It seems that the come outdated as new templates are being pub- best approach a user can currently take is to check lished every few months, it is usually advisable the relevance and usefulness of each individual to create new models, rather than relying on pre- resource for his or her particular research needs. computed ones. However, for comparative stud- We hope that this review and the summary in ies that call for many (not necessarily fine-tuned) Tables 9.1, 9.2, 9.3, 9.4, and 9.5 will make this structures underlie the need for pre-computed task an easier one. models, and the high number of models reposito- ries’ citations, 108 for (Zhang et al. 2006) and 26 Acknowledgments We thank Dr. Tali Yarnitzky for crit- for (Worth et al. 2009), illustrate their extensive ical reading of the manuscript and Ayelet Kowalsman usage by the scientific community. for helping with the figures. Grants from the Deutsche With the increasing number of specialized Forschungsgemeinschaft (DFG) ME 1024/8and from the Nutrigenomics and Functional Foods Research Center at modeling servers, community-wide experiments The Robert H. Smith Faculty of Agriculture, Food and for the modeling of GPCRs, such as GPCR Dock Environment, The Hebrew University of Jerusalem are (Kufareva et al. 2011), are highly desirable and gratefully acknowledged. will help evaluate and further improve the differ- ent modeling approaches. Moreover, to increase the accuracy of the References model predictions using the different modeling algorithms, it is advantageous to allow users Ahmed J, Preissner S, Dunkel M, Worth CL, Eckert A, Preissner R (2011) SuperSweet Ð a resource on natural to introduce GPCR templates of choice in the and artificial sweetening agents. Nucleic Acids Res modeling server, as more and more GPCR 39:D377ÐD382 structures become available. Bellis LJ, Akhtar R, Al-Lazikani B, Atkinson F, Bento AP et al (2011) Collation and data-mining of literature The abundance of databases and servers re- bioactivity data for drug discovery. Biochem Soc Trans sults in repetitions and partial redundancy. For 39:1365Ð1370 example, a recently published analysis of com- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN mercial and public bioactivity databases (includ- et al (2000) The Protein Data Bank. Nucleic Acids Res 28:235Ð242 ing PDSP Ki and PubChem) showed that a large Beuming T, Skrabanek L, Niv MY, Mukherjee P, We- proportion of the information is shared by differ- instein H (2005) PDZBase: a protein-protein interac- ent databases; however, the data overlap between tion database for PDZ-domains. Bioinformatics 21: these databases is not complete (Tiikkainen and 827Ð828 Bockaert J, Perroy J, Becamel C, Marin P, Fagni L (2010) Franke 2012). We saw a similar picture for GPCR GPCR interacting proteins (GIPs) in the nervous sys- SARfari and GLIDA databases. tem: Roles in physiology and pathologies. Annu Rev To simplify database usage, initiatives such Pharmacol Toxicol 50:89Ð109 as BioDBCore are calling to create uniform and Bolton E, Wang Y, Thiessen PA, Bryant SH (2008) Pub- Chem: Integrated Platform of Small Molecules and centralized database descriptions (Gaudet et al. Biological Activities. In: Wheeler RA, Spellmeyer 2011). The BioDBCore initiative calls for stan- DC (eds) Annual reports in computational chemistry. dardization across the biological databases using Elsevier Science, Amsterdam, pp 217Ð241 202 N. Kowalsman and M.Y. Niv

Bosier B, Hermans E (2007) Versatility of GPCR Gaudet P, Bairoch A, Field D, Sansone SA, Taylor C et recognition by drugs: from biological implications al (2011) Towards BioDBcore: a community-defined to therapeutic relevance. Trends Pharmacol Sci 28: information specification for biological databases. Nu- 438Ð446 cleic Acids Res 39:D7ÐD10 Carlsson J, Coleman RG, Setola V, Irwin JJ, Fan H et al Gonzalez-Maeso J, Ang RL, Yuen T, Chan P, Weisstaub (2011) Ligand discovery from a dopamine D3 receptor NV et al (2008) Identification of a serotonin/glutamate homology model and crystal structure. Nat Chem Biol receptor complex implicated in psychosis. Nature 7:769Ð778 452:93Ð97 Chen X, Ji ZL, Chen YZ (2002) TTD: Therapeutic Target Gorinski N, Kowalsman N, Renner U, Wirth A, Reinartz Database. Nucleic Acids Res 30:412Ð415 MT et al (2012) Computational and Experimental Cheng F, Zhou Y, Li J, Li W, Liu G, Tang Y (2012) Pre- Analysis of the TM4/TM5 Dimerization Interface diction of chemical-protein interactions: multitarget- of the Serotonin 5-HT1A Receptor. Mol Pharm 82: QSAR versus computational chemogenomic methods. 448Ð463 Mol Biosyst 8:2373Ð2384 Granier S, Kobilka B (2012) A new era of GPCR structural Congreve M, Langmead CJ, Mason JS, Marshall FH and chemical biology. Nat Chem Biol 8:670Ð673 (2011a) Progress in structure based drug design Harmar AJ, Hills RA, Rosser EM, Jones M, Buneman OP for G protein-coupled receptors. J Med Chem 54: et al (2009) IUPHAR-DB: the IUPHAR database of 4283Ð4311 G protein-coupled receptors and ion channels. Nucleic Congreve M, Langmead C, Marshall FH (2011b) The use Acids Res 37:D680ÐD685 of GPCR structures in drug design. In: Neubig RR Hecker N, Ahmed J, von Eichborn J, Dunkel M, Macha (ed) Pharmacology of G protein coupled receptors, K et al (2012) SuperTarget goes quantitative: up- Advances in Pharmacology. Elsevier Inc., Amsterdam. date on drug-target interactions. Nucleic Acids Res pp 1Ð36 40:D1113ÐD1117 Costanzi S (2012) Homology modeling of class a Hopf TA, Colwell LJ, Sheridan R, Rost B, Sander C, G protein-coupled receptors. Methods Mol Biol Marks DS (2012) Three-Dimensional Structures of 857:259Ð279 Membrane Proteins from Genomic Sequencing. Cell Crasto C, Marenco L, Miller P, Shepherd G (2002) Ol- 149:1607Ð1621 factory Receptor Database: a metadata-driven auto- Horn F, Weare J, Beukers MW, Horsch S, Bairoch A mated population from sources of gene and protein et al (1998) GPCRDB: an information system for sequences. Nucleic Acids Res 30:354Ð360 G protein-coupled receptors. Nucleic Acids Res 26: Crasto CJ, Liu N, Shepherd GM (2003) Databases for the 275Ð279 Functional Analyses of Olfactory Receptors. In: Kot- Horn F, Bettler E, Oliveira L, Campagne F, Cohen FE, ter R (ed) Neuroscience databases: a practical guide. Vriend G (2003) GPCRDB information system for Kluwer Academic Publishers, Düsseldorf, pp 37Ð50 G protein-coupled receptors. Nucleic Acids Res 31: Croft D, O’Kelly G, Wu G, Haw R, Gillespie M et 294Ð297 al (2011) Reactome: a database of reactions, path- Irwin JJ, Sterling T, Mysinger MM, Bolstad ES, Coleman ways and biological processes. Nucleic Acids Res 39: RG (2012) ZINC: A Free Tool to Discover Chemistry D691ÐD697 for Biology. J Chem Inf Model 52:1757Ð1768 Daulat AM, Maurice P, Jockers R (2009) Recent Jacobson KA, Costanzi S (2012) New insights for drug methodological advances in the discovery of GPCR- design from the X-ray crystallographic structures of associated protein complexes. Trends Pharmacol Sci GPCRs. Mol Pharm 82:361Ð371 30:72Ð78 Jacoby E, Bouhelal R, Gerspacher M, Seuwen K (2006) Dunkel M, Schmidt U, Struck S, Berger L, Gruening B The 7 TM G-Protein-Coupled Receptor Target Family. et al (2009) SuperScent Ð a database of flavors and ChemMedChem 1:760Ð782 scents. Nucleic Acids Res 37:D291ÐD294 Kanehisa M, Goto S (2000) KEGG: kyoto encyclope- Elefsinioti AL, Bagos PG, Spyropoulos IC, Hamodrakas dia of genes and genomes. Nucleic Acids Res 28: SJ (2004) A database for G proteins and their interac- 27Ð30 tion with GPCRs. BMC Bioinformatics 5:208 Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M Evers A, Klebe G (2004) Successful virtual screening for a (2012) KEGG for integration and interpretation of submicromolar antagonist of the neurokinin-1 receptor large-scale molecular data sets. Nucleic Acids Res based on a ligand-supported homology model. J Med 40:D109ÐD114 Chem 47:5381Ð5392 Katritch V, Cherezov V, Stevens RC (2012) Diversity and Galizia CG, Munch D, Strauch M, Nissler A, Ma S (2010) modularity of G protein-coupled receptor structures. Integrating heterogeneous odor response data into a Trends Pharmacol Sci 33:17Ð27 common response model: A Door to the complete Keshava Prasad TS, Goel R, Kandasamy K, Keerthiku- olfactome. Chem Senses 35:551Ð563 mar S, Kumar S et al (2009) Human Protein Refer- Gatica EA, Cavasotto CN (2012) Ligand and decoy sets ence DatabaseÐ2009 update. Nucleic Acids Res 37: for docking to G protein-coupled receptors. J Chem D767ÐD772 Inf Model 52:1Ð6 9 GPCR & Company: Databases and Servers for GPCRs and Interacting Partners 203

Khelashvili G, Dorff K, Shan J, Camacho-Artacho M, Maurice P, Guillaume JL, Benleulmi-Chaachoua A, Skrabanek L et al (2010) GPCR-OKB: the G Protein Daulat AM, Kamal M, Jockers R (2011) GPCR- Coupled Receptor Oligomer Knowledge Base. Bioin- interacting proteins, major players of GPCR function. formatics 26:1804Ð1805 Adv Pharmacol 62:349Ð380 Kleinau G, Brehm M, Wiedemann U, Labudde D, Leser Mobarec JC, Sanchez R, Filizola M (2009) Modern U, Krause G (2007) Implications for molecular mech- homology modeling of G-protein coupled receptors: anisms of glycoprotein hormone receptors using a new which structural template to use?. J Med Chem sequence-structure-function analysis resource. Mol 52:5207Ð5216 Endocrinol 21:574Ð580 Mondal S, Khelashvili G, Shan J, Andersen OS, Wein- Kolb P, Rosenbaum DM, Irwin JJ, Fung JJ, Kobilka stein H (2011) Quantitative modeling of membrane BK, Shoichet BK (2009) Structure-based discovery of deformations by multihelical membrane proteins: ap- beta2-adrenergic receptor ligands. Proc Natl Acad Sci plication to G-protein coupled receptors. Biophys J USA 106:6843Ð6848 101:2092Ð2101 Kufareva I, Rueda M, Katritch V, Stevens RC, Abagyan Nemoto W, Fukui K, Toh H (2009) GRIP: a server R, GPCR Dock Participants (2011) Status of GPCR for predicting interfaces for GPCR oligomerization. J Modeling and Docking as Reflected by Community- Recept Signal Transduct Res 29:312Ð317 wide GPCR Dock 2010 Assessment. Structure 19: Nemoto W, Fukui K, Toh H (2011) GRIPDB Ð G protein 1108Ð1126 coupled Receptor Interaction Partners DataBase. J Laskowski RA, Macarthur MW, Moss DS, Thornton JM Recept Signal Transduct Res 31:199Ð205 (1993) Procheck Ð a Program to Check the Stereo- Okuno Y, Yang J, Taneishi K, Yabuuchi H, Tsujimoto chemical Quality of Protein Structures. J Appl Crys- G (2006) GLIDA: GPCR-ligand database for chem- tallogr 26:283Ð291 ical genomic drug discovery. Nucleic Acids Res 34: Launay G, Teletchea S, Wade F, Pajot-Augy E, Gibrat D673ÐD677 JF, Sanz G (2012) Automatic modeling of mammalian Okuno Y, Tamon A, Yabuuchi H, Niijima S, Minowa olfactory receptors and docking of odorants. Protein Y et al (2008) GLIDA: GPCRÐligand database for Eng Des Sel 25:377Ð386 chemical genomics drug discoveryÐdatabase and tools Levit A, Yarnitzky T, Wiener A, Meidan R, Niv MY update. Nucleic Acids Res 36:D907ÐD912 (2011) Modeling of human prokineticin receptors: Oldham WM, Hamm HE (2008) Heterotrimeric G protein interactions with novel small-molecule binders and activation by G-protein-coupled receptors. Nat Rev potential off-target drugs. PLoS One 6:e27990 Mol Cell Biol 9:60Ð71 Levit A, Barak D, Behrens M, Meyerhof W, Niv MY Olender T, Feldmesser E, Atarot T, Eisenstein M, Lancet (2012) In: Vaidehi N, Klein-Seetharaman J (eds) Ho- D (2004) The olfactory receptor universeÐfrom whole mology model-assisted elucidation of binding sites in genome analysis to structure and evolution. Genet Mol GPCRs: methods and protocols, Methods in molec- Res 3:545Ð553 ular biology. Springer Science C Business Media, Overington JP, Al-Lazikani B, Hopkins AL (2006) How pp 179Ð205 many drug targets are there?. Nat Rev Drug Discov Li J, Ning Y, Hedley W, Saunders B, Chen Y et al (2002) 5:993Ð996 The Molecule Pages database. Nature 420:716Ð717 Park SH, Das BB, Casagrande F, Tian Y, Nothnagel HJ et Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK (2007) al (2012) Structure of the chemokine receptor CXCR1 BindingDB: a web-accessible database of experimen- in phospholipid bilayers. Nature 491:779Ð783 tally determined protein-ligand binding affinities. Nu- Provasi D, Bortolato A, Filizola M (2009) Explor- cleic Acids Res 35:D198ÐD201 ing molecular mechanisms of ligand recognition by Liu X, Su X, Wang F, Huang Z, Wang Q et al (2011) opioid receptors with metadynamics. Biochemistry ODORactor: a web server for deciphering olfactory 48:10020Ð10029 coding. Bioinformatics 27:2302Ð2303 Pucadyil TJ, Kalipatnapu S, Chattopadhyay A (2005) Lohse MJ (2010) Dimerization in GPCR mobility and The serotonin1A receptor: a representative member of signaling. Curr Opin Pharmacol 10:53Ð58 the serotonin receptor family. Cell Mol Neurobiol 25: Lomize MA, Lomize AL, Pogozheva ID, Mosberg HI 553Ð580 (2006) OPM: orientations of proteins in membranes Ramachandran GN, Ramakrishnan C, Sasisekharan V database. Bioinformatics 22:623Ð625 (1963) Stereochemistry of polypeptide chain config- Lukasiewicz S, Blasiak E, Faron-Gorecka A, Polit A, urations. J Mol Biol 7:95Ð99 Tworzydlo M et al (2007) Fluorescence studies of Rediger A, Piechowski CL, Yi CX, Tarnow P, Strot- homooligomerization of adenosine A2A and serotonin mann R et al (2011) Mutually opposite signal mod- 5-HT1A receptors reveal the specificity of receptor ulation by hypothalamic heterodimerization of ghre- interactions in the plasma membrane. Pharmacol Rep lin and melanocortin-3 receptors. J Biol Chem 286: 59:379Ð392 39623Ð39631 Magalhaes AC, Dunn H, Ferguson SS (2012) Regula- Renner U, Zeug A, Woehler A, Niebert M, Dityatev A tion of GPCR activity, trafficking and localization et al (2012) Heterodimerization of serotonin receptors by GPCR-interacting proteins. Br J Pharmacol 165: 5-HT1A and 5-HT7 differentially regulates receptor 1717Ð1736 signalling and trafficking. J Cell Sci 125:2486Ð2499 204 N. Kowalsman and M.Y. Niv

Ritter SL, Hall RA (2009) Fine-tuning of GPCR activity Tiikkainen P, Franke L (2012) Analysis of commercial by receptor-interacting proteins. Nat Rev Mol Cell and public bioactivity databases. J Chem Inf Model Biol 10:819Ð830 52:319Ð326 Rodriguez D, Bello X, Gutierrez-de-Teran H (2012) Van Durme J, Horn F, Costagliola S, Vriend G, Vassart of G Protein-Coupled Receptors G (2006) GRIS: glycoprotein-hormone receptor infor- Through the Web. Mol Infor 31:334Ð341 mation system. Mol Endocrinol 20:2247Ð2255 Romero G, von Zastrow M, Friedman PA (2011) Role of Vilar S, Ferino G, Phatak SS, Berk B, Cavasotto CN, PDZ proteins in regulating trafficking, signaling, and Costanzi S (2011) Docking-based virtual screening function of GPCRs: means, motif, and opportunity. for ligands of G protein-coupled receptors: not only Adv Pharmacol 62:279Ð314 crystal structures but also in silico models. J Mol Roth BL, Lopez E, Patel S, Kroeze WK (2000) The Graph Model 29:614Ð623 multiplicity of serotonin receptors: Uselessly diverse Vroling B, Sanders M, Baakman C, Borrmann A, Ver- molecules or an embarrassment of riches?. Neurosci- hoeven S et al (2011) GPCRDB: information system entist 6:252Ð262 for G protein-coupled receptors. Nucleic Acids Res Safran M, Chalifa-Caspi V,Shmueli O, Olender T, Lapidot 39:D309ÐD319 M et al (2003) Human Gene-Centric Databases at Wang Y, Xiao J, Suzek TO, Zhang J, Wang J et al (2012) the Weizmann Institute of Science: GeneCards, UDB, PubChem’s BioAssay Database. Nucleic Acids Res CroW 21 and HORDE. Nucleic Acids Res 31:142Ð146 40:D400ÐD412 Sali A, Blundell TL (1993) Comparative protein mod- Wettschureck N, Offermanns S (2005) Mammalian G elling by satisfaction of spatial restraints. J Mol Biol proteins and their cell type specific functions. Physiol 234:779Ð815 Rev 85:1159Ð1204 Satagopam VP, Theodoropoulou MC, Stampolakis CK, White JF, Noinaj N, Shibata Y,Love J, Kloss B et al (2012) Pavlopoulos GA, Papandreou NC et al (2010) GPCRs, Structure of the agonist-bound neurotensin receptor. G-proteins, effectors and their interactions: human- Nature 490:508Ð513 gpDB, a database employing visualization tools and Wiener A, Shudler M, Levit A, Niv MY (2012) BitterDB: data integration techniques. Database (Oxford) 2010: a database of bitter compounds. Nucleic Acids Res baq019 40:D413ÐD419 Saunders B, Lyon S, Day M, Riley B, Chenette E et al Woehler A, Wlodarczyk J, Ponimaskin EG (2009) Spe- (2008) The Molecule Pages database. Nucleic Acids cific oligomerization of the 5-HT1A receptor in the Res 36:D700ÐD706 plasma membrane. Glycoconj J 26:749Ð756 Schlyer S, Horuk R (2006) I want a new drug: G-protein- Worth CL, Kleinau G, Krause G (2009) Comparative coupled receptors in drug development. Drug Discov sequence and structural analyses of G-protein-coupled Today 11:481Ð493 receptor crystal structures and implications for molec- Sgourakis NG, Bagos PG, Papasaikas PK, Hamodrakas SJ ular models. PLoS One 4:e7011 (2005) A method for the prediction of GPCRs coupling Yarnitzky T, Levit A, Niv MY (2010) Homology modeling specificity to G-proteins using refined profile Hidden of G-protein-coupled receptors with X-ray structures Markov Models. BMC Bioinformatics 6:104 on the rise. Curr Opin Drug Discov Devel 13:317Ð325 Shoichet BK, Kobilka BK (2012) Structure-based drug Zhang Y (2008) I-TASSER server for protein 3D structure screening for G-protein-coupled receptors. Trends prediction. BMC Bioinformatics 9:40 Pharmacol Sci 33:268Ð272 Zhang J, Zhang Y (2010) GPCRRD: G protein-coupled Shonberg J, Scammells PJ, Capuano B (2011) De- receptor spatial restraint database for 3D structure sign strategies for bivalent ligands targeting GPCRs. modeling and function annotation. Bioinformatics ChemMedChem 6:963Ð974 26:3004Ð3005 Stark C, Breitkreutz BJ, Reguly T, Boucher L, Bre- Zhang Y, Devries ME, Skolnick J (2006) Structure Mod- itkreutz A, Tyers M (2006) BioGRID: a general repos- eling of All Identified G Protein-Coupled Receptors in itory for interaction datasets. Nucleic Acids Res 34: the . PLoS Comput Biol 2:e13 D535ÐD539 Zhou H, Skolnick J (2012) FINDSITE(X): A Structure- The UniProt Consortium (2012) Reorganizing the protein Based, Small Molecule Virtual Screening Approach space at the Universal Protein Resource. Nucleic Acids with Application to All Identified Human GPCRs. Mol Res 40:D71ÐD75 Pharm 9:1775Ð1784 Theodoropoulou MC, Bagos PG, Spyropoulos IC, Zhu F, Shi Z, Qin C, Tao L, Liu X et al (2012) Therapeutic Hamodrakas SJ (2008) gpDB: a database of GPCRs, target database update 2012: a resource for facilitat- G-proteins, effectors and their interactions. Bioinfor- ing target-oriented drug discovery. Nucleic Acids Res matics 24:1471Ð1472 40:D1128ÐD1136 Bioinformatics Tools for Predicting GPCRGeneFunctions 10

Makiko Suwa

Abstract The automatic classification of GPCRs by bioinformatics methodology can provide functional information for new GPCRs in the whole ‘GPCR proteome’ and this information is important for the development of novel drugs. Since GPCR proteome is classified hierarchically, general ways for GPCR function prediction are based on hierarchical classification. Various computational tools have been developed to predict GPCR functions; those tools use not simple sequence searches but more powerful methods, such as alignment-free methods, statistical model methods, and machine learn- ing methods used in protein sequence analysis, based on learning datasets. The first stage of hierarchical function prediction involves the discrimina- tion of GPCRs from non-GPCRs and the second stage involves the classi- fication of the predicted GPCR candidates into family, subfamily, andsub- subfamily levels. Then, further classification is performed according to their protein-protein interaction type: binding G-protein type, oligomer- ized partner type, etc. Those methods have achieved predictive accuracies of around 90 %. Finally, I described the future subject of research of the bioinformatics technique about functional prediction of GPCR.

Keywords Bioinformatics ¥ Function ¥ G-protein coupled receptor ¥ G-protein ¥ Dimerization ¥ Class ¥ family ¥ subfamily ¥ GPCR discrimination ¥ Hi- erarchical GPCR classification ¥ PPT Network classification ¥ Trans- membrane helix ¥ Alignment free method ¥ Machine learning method ¥ Statistical model method ¥ Motif based search ¥ Sequence similarity search

M. Suwa () Department of Chemistry and Bioscience, College of Computational Biology Research Center (CBRC), Science and Technology, Aoyama Gakuin University, National Institute of Advanced Industrial Science and 5-10-1 Fuchinobe, Chuo-ku, Sagamihara-shi, Kanagawa Technology (AIST), 4-2-7 Aomi, Kotou-ku, Tokyo, Japan 252-5258, Japan e-mail: [email protected]; [email protected]

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 205 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0__10, © Springer ScienceCBusiness Media Dordrecht 2014 206 M. Suwa

search for ligands. Newly de-orphanized GPCRs Abbreviations include the (Sakurai et al. 1998), the (Tatemoto et al. 1998), the GPCR G-protein coupled receptor melanin-concentrating hormone receptor (Saito TM Transmembrane et al. 1999), and the ghrelin receptor (Kojima TMH Transmembrane helix et al. 1999) etc. Their discovery has made a huge HMM Hidden Markov Model impact on both academe and industry (Civelli SVM Support Vector Machine et al. 2006). NN Neural Network For a long time, experimental difficulties in CA Cell Automaton structure determination and gene expression had kNN K Nearest Neighbor prevented researchers from fully understanding SOM Self-organization Map the functional mechanism and classification of PPI Protein-Protein Interaction orphan GPCRs. Nevertheless, a treasure trove of genome information of many species has been 10.1 Introduction unearthed and several structures of GPCRs have been identified (Rosenbaum et al. 2009; Kobilka and Schertler 2008; Palczewski et al. 2000;Ras- G-protein-coupled receptors (GPCRs) are mussen et al. 2011; Dorota et al. 2012). In this membrane proteins that are characterized by context, it is now possible to overview the se- seven transmembrane (TM) helices. The binding quences to understand the general rule of GPCR of ligands, including biological amines, peptides, function by using bioinformatics tools. The auto- hormones, and odorant substances etc., from matic classification of GPCRs by bioinformatics the extracellular side to GPCRs induces signal tools can provide functional information of newly transduction to the cell interior by activating determined GPCRs in the whole ‘GPCR pro- G-proteins (Gi/o,Gq/11,Gs, and G12/13) that teome’ and this information is important for the are conjugated at the cytoplasmic side of the development of novel drugs by pharmaceutical cell. This process is followed by different types companies. This review introduces several useful of signal transduction (Vauquelin and Mentzer bioinformatics tools for GPCR analysis. 2007; Kobilka 2007). GPCRs exist in almost all biological cells and abnormal events occurring in GPCR signal 10.2 Overview of GPCRs: Known transduction have been related to various serious Classes and Repertoires conditions, such as allergy, heart trouble, cancer, of GPCRs high blood pressure, and inflammation. As ap- proximately 40 % (Wise et al. 2002)ofdrugs GPCRs are classified into six classes (Class A, distributed around the world are designed to con- Class B, Class C, Class D, Class E, and Class trol the function of GPCRs, many researchers are F GPCRs) as defined in GPCRDB (Horn et al. looking into ways to reveal GPCR function. Of 1998, 2000, 2003), based on their amino acid the approximately 800 GPCR genes (Vassilatis sequences and functional similarities. et al. 2003) encoded in human, most are predicted to be olfactory receptors and the remaining 200Ð Class A: This class, often called the “rhodopsin- 300 genes are said to have the potential of being like family,” is the largest, accounting for around drug design targets. Of these, 100Ð150 are called 80 % of all GPCR genes (Fridmanis et al. ‘orphan’ GPCRs because their endogenous lig- 2007). More than half of the members of this ands are still unknown. As the discovery of a lig- class are predicted to be olfactory receptors and that activates orphan GPCRs may be directly and the remaining are receptors with known linked to new drug development, there is intense endogenous ligands and orphan receptors. competition among pharmaceutical companies to The members are classified into family level 10 Bioinformatics Tools for Predicting GPCR Gene Functions 207

(e.g., adrenergic receptor), subfamily level B1 GPCRs and Class A GPCRs, they have a (e.g., “ adrenergic receptor), and sub-subfamily common feature in the extracellular/cytoplasmic level (e.g., “2 adrenergic receptor). Although loop and the TM helices. The second subclass they show quite low sequence similarity to B2 includes more than 30 members, an the receptors of the other families, the solved example of which is CD97. The members structures (Rosenbaum et al. 2009; Kobilka of this subclass have very long N-terminal and Schertler 2008; Palczewski et al. 2000; domains with repeats of well-known protein Rasmussen et al. 2011; Dorota et al. 2012) modules, such as the EGF domain and the indicate that all Class A GPCRs share a common cadherin repeat, which may be involved in cell structure (Fig. 10.1a) having seven TM helices adhesion. Only a few are known to associate together with the eighth helix at the C terminus, with G-proteins and most of them are orphan an S-S bond between TM helix 3(TMH3) and receptors. TMH4 at the extracellular side of membrane, Class C: This includes the metabotropic gluta- a palmitoylated cysteine at the C terminal tail, mate receptor family, GABA receptors, calcium- and Class A specific conserved residues. The sensing receptors, and taste receptors (Pin et al. conserved residues in the extracellular region 2003; Brauner-Osborne et al. 2007). Several lig- affect ligand binding selectivity, whereas those in ands of Class C GPCRs have been identified. the cytoplasmic region affect coupling selectivity These receptors are characterized by a large ex- with G-proteins. Many articles have reported tracellular N-terminal domain that is often shaped the positions of conserved residues (Nygaard like a clam with approximately 600 residues to et al. 2009; Mizadegan et al. 2003;Karmiketal. which ligands bind (Fig. 10.1c). This domain is 2003; Madabushi et al. 2004; Muramatsu and connected to a TM helix by a loop that contains Suwa 2006;Wess1998; Suwa et al. 2011) and cysteine-rich domains (CRDs). It is suggested the mutation experiments of key residues that that most of the Class C GPCRs form homod- have significant influence on the selectivity of imers or heterodimers (Pin et al. 2003; Brauner- ligand binding and G-protein coupling. Here, Osborne et al. 2007). strongly conserved residues in all Class A GPCR sequences are observed in each TM helix region: Class D: This includes the receptor the DRY motif in TMH3, the NPxxY pattern in family (Eilers et al. 2005), such as STE2 and TMH7, etc. The characterization of subfamilies STE3. The amino acid sequences of these re- and their ligand binding sites is important as ceptors have seven hydrophobic domains, similar Class A GPCRs include a very large number of to Classes A, B, and C GPCRs. Therefore, it is drug-targeted receptors. believed to be a GPCR family. Class B: This class has around 70 members Class E: This includes the cAMP receptor fam- that are further divided into several subclasses ily (Eichinger and Noegel 2005; Troemel et al. (Parthier et al. 2009; Chapter et al. 2010; 1995; Prabhu and Eichinger 2006) from slime Dong et al. 2008, 2009). The first subclass mold. The amino acid sequences of these re- B1 has approximately 20 members (mainly ceptors contain seven hydrophobic domains that secretin/glucagon/VIP receptors) regulated by are considered TM helices. These receptors are peptide hormones. Those receptors activate also believed to interact with some G-proteins. adenylyl cyclase and the phosphatidyl inositol They do not have significant similarity to Class A calcium pathway by coupling to Gs and Gq GPCR sequences and are characterized by their proteins, respectively. Class B receptors have unique signatures. seven TM helices and a long N-terminal domain Class F: This class includes the frizzled/ of approximately 120 residues stabilized by smoothened family (Malbon 2004; Ruiz- several S-S bonds (Fig. 10.1b). Despite the quite Gómez et al. 2007). The frizzled family has a different amino acid sequences between subclass 208 M. Suwa

Fig. 10.1 (a) Schematic diagram of Class A GPCR structure (right) together with representative PDB structure (left): bovine rhodopsin (PDB:1f88). (b) Schematic diagram of Class B GPCR structure. This figure shows a subclass B1 GPCR (such as ). It has a long N-terminal loop that is stabilized by SÐS bonds (yellow circles represent cysteine). In the case of subclass B2 members, the N-terminal region consists of the repeat of well-known protein modules. (c) Schematic diagram of Class C GPCR structure. It has a large extracellular domain that is shaped like a clam. The loop region connecting this clam-shaped domain to the TM helix is called the cysteine-rich domains (CRDs) (yellow circles represent cysteine) 10 Bioinformatics Tools for Predicting GPCR Gene Functions 209 cysteine-rich domain that forms disulfide bridges G-protein. In general, an established way to pre- in the extracellular region, which is composed dict protein function is to classify proteins into of mainly alpha helices. This family is related to groups whose members are linked by sequence the Wnt signaling pathway. On the other hand, similarity using a conventional sequence search the smoothened family encoded by the SMO method. However, in the case of GPCRs, the gene is related to pathways participating in many function-similarity relationship is unclear. For developmental processes. example, it is suggested in ref. (Gaulton and Classes A, B, C, and F GPCRs are found in Attwood 2003) that, (1) some homologous GPCR mammals, whereas Class D GPCRs are found pairs that bind to the same ligands bind to differ- only in fungi and Class E GPCRs are exclusive to ent types of G-protein; (2) those pairs that bind Dictyostelium. These classes are further divided to the same type of G-protein bind to a different into family, subfamily, and sub-subfamily (some- ligand; and (3) some GPCR pairs bind to both the times called subtypes) based on the functions of same ligand and the same G-protein even though the GPCRs and their specific ligands. they show less than 25 % sequence similarity. Class A GPCRs bind to several types of G- Given this situation, various computational proteins (Gi/o,Gq/11,Gs,G12/13). Class B GPCRs tools have been developed to understand GPCR are classified into two types: those that bind to function. Those tools use not simple sequence Gs and those that bind to Gq/11. Class C GPCRs searches but more powerful methods, such bind to Gi/o and Gq/11. There are also “promis- as alignment-free methods, statistical model cuous” receptors that couple with multiple types methods (hidden Markov models (HMMs), etc. of G-proteins. The G-proteins of Class F GPCRs See Sect. 10.3.2.), and machine learning methods (frizzled/smoothened family) are still not known. (support vector machines (SVMs), etc. See Fredriksson et al. (2003) created a phyloge- Sect. 10.3.2.). Karchin et al. compared several netic tree of approximately 800 known human methods and suggested that SVM achieved the GPCR sequences and proposed a new taxonomy highest performance (Karchin et al. 2002). called “GRAFS,” which consisted of five main re- The aforementioned tools are applied to the ceptor groups (metabotropic glutamate receptors, hierarchical classification of GPCRs (Fig. 10.2). rhodopsin-like receptors, adhesion type recep- The first stage involves the discrimination of tors, frizzled/smoothened family, and secretin re- GPCRs from non-GPCRs. At the second stage, ceptors). This taxonomy was applied to 13 kinds the predicted GPCR candidates are classified into of eukaryotes (Fredriksson and Schiöth 2005). family, subfamily, and sub-subfamily levels. At Concerning the comprehensive GPCR gene anal- the final stage, further classification is performed ysis of individual species, many publications that according to protein-protein interaction (PPI) deal with insect (Hill et al. 2002), plant (Josefsson type: binding G-protein type and oligomerization 1999), human and mouse (Vassilatis et al. 2003; partner type. Classification accuracy is evaluated Thora et al. 2006), and human and dog (Haitina based on the known hierarchical families of et al. 2009) are available. GPCRs, which are available in several databases.

10.3 GPCR Function Prediction 10.3.2 Overview of Bioinformatics by Using Bioinformatics Algorithms Used in Protein Tools Sequence Analysis

10.3.1 Strategy for Function We present an overview of the general bioin- Prediction formatics algorithms used in GPCR sequence analysis in this section. GPCR function is characterized by two aspects: Table 10.1 summarizes the bioinformatics the type of binding ligand and the type of binding tools according to the three layers of function 210 M. Suwa

Fig. 10.2 Schematic diagram of hierarchical GPCR the first stage; (2) hierarchical classification of GPCRs at function prediction. The flow of the prediction is as the second stage; and (3) PPI (protein-protein interaction) follows: (1) GPCR discrimination from other proteins at network classification at the third stage prediction. The tool has been classified according remote relationships by constructing a position- to the classification hierarchy in Fig. 10.2. About specific scoring matrix (PSSM) through multiple the tool that has been developed specifically for alignments of sequences gathered by the repeated GPCR was listed along with the URL of the use of BLAST search. FASTA (Pearson and Lip- WEB. (Please note that there is a possibility that man 1988) is another commonly used sequence the URL will change in the future.) similarity search tool that employs heuristics Further information about the features of for rapid local alignment search. SSEARCH each tool are shown in the following text. is a search tool that uses the Smith-Waterman (Sects. 10.3.2, 10.3.3, 10.3.4, 10.3.5, and algorithm (Smith and Waterman 1981), an opti- 10.3.6) mal local alignment algorithm. These methods are used to classify GPCRs into groups whose Sequence similarity search method involves members are linked by sequence similarity. searching sequence databases by means of alignment to a query sequence. Statistical Motif-based approach: A motif is short nu- assessment scoring (e.g., occurrence score E- cleotide/amino acid sequence pattern that has value, P-value) is conducted to determine how a significant biological meaning. It is extracted well sequences in the database align to the query from a multiple sequence alignment as a con- sequence. The following are popular protein served region. The main role of the motif is to or nucleotide sequence search tools. BLAST discriminate a functional region from a query (Altschul et al. 1990) is the most commonly used sequence. Furthermore, it is often used to find a sequence similarity search tool. It uses heuristics remote homologue, as proteins in the same family to perform rapid local alignment searches. PSI- conserve functional regions although the whole BLAST (Altschul et al. 1997) can help find sequences show low similarity. 10 Bioinformatics Tools for Predicting GPCR Gene Functions 211

Table 10.1 Bioinformatics tools categorized based on functional prediction methodology Statistical model method and Motif based method AlignmentÐfree method machine learning method GPCR TMFinder (Charles et al. 2001) TMHMM (Krogh et al. 2001) discrimination SOSUI (Hirokawa et al. 1998) HMMTOP (Tusnády et al. 2001) QFC method (Kim et al. 2000) 7TMHMM (Möller et al. 2001a) TopPred (Elofson and von http://tp12.pzr.uni-rostock.de/~ Heijne 2007) moeller/7tmhmm/ SCANPI (Bernsel et al. 2008) GPCRHMM (Markus et al. 2006) http://gpcrhmm.sbc.su.se/ Hierarchical PROSITE (Sigrist GPCR Tree (Davies et al. Method of Huang et al. 2004 GPCR et al. 2010) 2008a) PRED-GPCR (Papasaikas et al. classification http://igrid-ext.cryst.bbk.ac.uk/ 2004) gpcrtree/ http://athina.biol.uoa.gr/ bioinformatics/PRED-GPCR/ PRINTS (Attwood Method of Lapnish et al. (2005) GPCRsclass (Bhasin and Raghava et al. 2012) 2004, 2005) http://www.imtech.res.in/raghava/ gpcrsclass/ GPCRpred (Bhasin and Raghava 2004, 2005) http://www.imtech.res.in/raghava/ gpcrpred/ Pfam (Finn et al. Method of 7TMRMine (Lu GPCR-CA (Xiao and Qiu 2010) 2010) et al. 2009) GPCR-SVMFS (Li et al. 2010) gpcrmotif (Gangal GPCR-Mpredictor and Kumar 2007) http://111.68.99.218/gpcr- mpredictor/ PPI Network Method of Filizola and GRIFFIN (Yabuki et al. 2005) classification Weinstein (Filizola and http://griffin.cbrc.jp Weinstein 2005) PRED-COUPLE (Sgurakis et al. 2005) http://athina.biol.uoa.gr/ bioinformatics/PRED-COUPLE2/ GRIP (Nemoto et al. 2009) http://grip.cbrc.jp/GRIP/

PROSITE (Sigrist et al. 2010) database uses form the characteristic signature of an aligned regular expression or profile expression to de- protein family. Pfam (Finn et al. 2010)isa scribe motifs. For example, the regular expression database that includes functional annotations and of a certain motif is M Ð [GS] Ð x Ð fRQg, is described by HMM (see description below). where M represents Met, [GS] represents Gly or Using the search tool HMMER (Eddy 1998), Ser, fRQg represents neither Arg nor Gln, and x the calibrated profile HMM of each family is represents any amino acid residue. This expres- used to score the sequences of all other families. sion can generate a large number of variations. InterPro (Hunter et al. 2009) is a motif database In contrast, a profile expression is described by that integrates PROSITE, PRINTS, Pfam, and the position-specific matrix of the appearance many other motif databases. probability of 20 types of amino acids (or 4 types of nucleotides). PRINTS (Attwood et al. Hidden Markov Model (HMM) (Gollery 2012) is a collection of so-called “fingerprints,” 2008): Let us imagine a sequence of signals that a group of conserved motifs taken from a mul- are output one after another. When the stochastic tiple sequence alignment. Together, the motifs process of the signal output value is determined 212 M. Suwa

Fig. 10.3 (a) Schematic diagram of HMM. This is an of the two groups (blue and red circles), in the case of a example of a statistical model that expresses the inter- three-dimensional space. As regards the points in the N- nal state transition, in which a sign (alphabets in this dimensional space, although it is difficult to imagine, it case) is output in a sequence, according to the internal is possible to choose such a hyperplane mathematically. states (A, B, C states). For example, PAB indicates the (c) Schematic diagram of neural network. A model that probability of transition from A state to B state. PA(C) changes the bond weight of each synapse by learning the indicates the probability that letter C is output from A artificial nerve system formed in the brain. In this figure, state. (b) Schematic diagram of SVM. This is a machine the input layer is the amino acid appearance frequency learning method that performs linear shape distinction of of each site that is obtained from a multiple sequence a point in N-dimensional space in two groups. We can alignment and such information is learned through the choose a distinction plane that maximizes the sum of the middle layer. The final network information is integrated minimum altitude lengths (D1 C D2) among all the points as the strength of the last output (A, B or C region) only by the present value regardless of the past transition states (A, B, C) which determine the output value, this process is called the Markov occurrence probability of each signal (alphabets process. This model has several internal states in this case) and make transitions according to that make transitions mutually within this process the probability eg. from A state and B state and output a signal value according to each state. (PAB). Because only a series of signals can be Figure 10.3a is an example of three internal observed from the outside, without knowing the 10 Bioinformatics Tools for Predicting GPCR Gene Functions 213 internal transition state, it is called as “hiding”. In case of NN application to GPCR classification, For the analysis of GPCR sequences using HMM, multiple alignment of a specific family is used as the character of each amino acid is used as the input data (Fig. 10.3c) and the learned binding signal value and transition probability of internal weight network of synapses discriminates the states and occurrence probability of characters specific family. are calculated from multiple sequence alignment of known GPCRs. Self-Organization Map (SOM) is a teacher- less NN that consists of an input layer and Support Vector Machine (SVM) is a machine an output layer (Kohonen 1990). First, an N- learning method (Cristianini and Shawe-Taylor dimensional vector (at the input layer) and several 2000) that distinguishes coordinates (X1 ::: nodes arranged on a two-dimensional plane XN) in two groups in N-dimensional space. (output layer) are given. Randomly generated For example, when this discrimination method N-dimensional vectors (dignity vectors) are is performed in the three-dimensional space linked to each node. Learning is performed by (Fig. 10.3b), several distinction planes can be calculating the Euclidean distance between the drawn between the two groups. Here, altitude input vector and the dignity vectors and then lines relative to the distinction plane are drawn each input vector is linked to the node that has from all the points in the two groups, and the the nearest dignity vector. After repeating the minimum altitude length is chosen for each renewal of dignity vectors as the averaged vector group. Thus, a straight distinction plane is composed of the input data and the predefined selected so that the sum of the minimum altitude dignity vector, the input data are categorized into lengths (margin D D1 C D2 in Fig. 10.3b) several nodes. SOM can change the nonlinear becomes the maximum. Although it is difficult relationship that exists between data of a high to imagine an N-dimensional space, the solution dimension into a two-dimensional image that of hyper plane is obtained mathematically in can simply represent a geometric relationship. this space. In the case of SVM application to When SOM is applied to GPCR classification, a GPCR classification, a GPCR is described by GPCR is described as the N-dimensional vector an N-dimensional coordinate vector composed of several parameters. of N feature elements, such as physicochemical parameters. As SVM is used to divide data into K Nearest Neighbor (kNN) (Nearest-Neighbor two classes, several discrimination planes should Methods in Learning and Vision 2005)isa be combined in order to classify GPCRs into machine learning method for classifying N- several families. dimensional vectors into several labeled classes. Using user-defined number k, the Euclidean Neural Network (NN) is a model (Bishop 1994) distances from a query vector (unlabeled vector) in which artificial neurons form a network by assign k nearest neighborhood vectors (labeled combining with artificial synapses that change vectors) sequentially. In the classification step, a the weight of binding to neighboring synapses query vector is classified to a class whose label (Fig. 10.3c). Thus, it is able to solve problems by is most frequent among the k nearest vectors. A learning data and simulating the characteristics of result may change with what kind of number the brain. Generally, NN has layers correspond- is chosen about k. When kNN is applied to ing to the input, middle, and output layers as GPCR classification, a GPCR is described as shown in Fig. 10.3c. Learning the relationship the N-dimensional coordinate vector composed between the input dataset and the correct answer of N feature elements, such as physicochemical is conducted by changing the binding weight of parameters. each synapse pair. It is well used in the discrim- ination problem in which input data have many Cellular Automaton (CA) (Wolfram 1984)isa dimensions and linear separation is impossible. collection of “colored” cells on a grid of specified 214 M. Suwa shape that evolves through a number of discrete sequence that cannot be assigned to any known time steps according to a set of rules based on family. such states as the “on” or “off” of neighboring cells. The grid can be in any finite number of dimensions. For each cell, a set of cells called its 10.3.3 GPCR Classification Database neighborhood is defined relative to the specified cell. An initial state (time t D 0) is selected by Many useful GPCR databases are available for assigning a state for each cell. A new generation the classification of GPCR sequences, including is created according to some fixed rule (generally, GPCRDB (Horn et al. 1998, 2000, 2003), a mathematical function) that determines the new IUPHAR (GPCR database) (Harmar et al. 2009), state of each cell in terms of the current state GPCR-PDTM (Hodges et al. 2002), ORDB of the cell and the states of the cells in its (Crasto et al. 2002), GPCR safari (https://www. neighborhood. Typically, the rule for updating ebi.ac.uk/chembl/sarfari/gpcrsafari), SEVENS the state of cells is the same for each cell and (Suwa and Ono 2009, 2010; Ono et al. 2005), does not change over time, and is applied to the gpDB (Theodoropoulou et al. 2008), Human- whole grid simultaneously. In the case of GPCR gpDB (Satagopam et al. 2010), GLIDA (Okuno classification, a GPCR is described as the N- et al. 2008), and GPCR-OKB (Kerashvili dimensional coordinate vector composed of N et al. 2010). GPCRDB is the most popular feature elements, and each kind of class/family and includes known GPCR sequences from corresponds to a two-dimensional CA image. UniProt and GENBANK. IUPHAR and GPCR- PDTM accumulate literature information as well Decision tree (Menzies and Hu 2003) is a de- as sequence information. ORDB focuses on cision support tool that uses a tree-like graph to the olfactory receptor, a subfamily of GPCRs. compare competing alternatives and assign oc- SEVENS integrates GPCR genes that have been currence values to those alternatives. In the tree- comprehensively predicted from the genome like graph, the leaves represent alternatives to sequences of 58 eukaryotes. Furthermore, be selected and the branches represent conjunc- unique data resources regarding the interaction tions of features that lead to those classifications. of a GPCR with G-proteins and effectors The machine learning technique for inducing a are summarized in gpDB and Human-gpDB, decision tree from data is called decision tree whereas the relationships between GPCRs and learning. More descriptive names for such tree their ligands are included in GLIDA, and GPCR models include the classification tree or the re- oligomer information is found in GPCR-OKB. gression tree. Decision trees are commonly used These databases summarize well-organized in operations research, specifically in decision GPCR genes useful as training data for tuning analysis, to help identify the most likely strategy bioinformatics tools. to reach a goal. When used for GPCR family clas- sification, the leaves represent each family and the branches indicate some conserved character 10.3.4 Tools for GPCR Discrimination for each GPCR family. from Other Proteins

Alignment-free method (sometimes called the Predicting the TM helix region is the most funda- physicochemical method) does not require the mental operation in the discrimination of mem- consensus sequence obtained from a multiple brane protein from another protein when an un- sequence alignment. Here, each amino acid of a known protein sequence is obtained. In partic- single sequence is translated into some physico- ular, if a query sequence is predicted to have chemical parameters and this information is used around seven TM helices, that sequence may for function prediction. This method is often be considered as a GPCR candidate. When an useful to annotate a function to an unknown amino acid sequence is translated into a numeri- 10 Bioinformatics Tools for Predicting GPCR Gene Functions 215 cal sequence of hydrophobicity indices (Kyte and the inside-and-outside prediction of membrane Doolittle 1982, etc.) and the indices are plotted protein (Elofson and von Heijne 2007) based against the sequential number, the TM regions on the positive inside rule. SCANPI (Bernsel can be easily obtained because those regions et al. 2008) yields approximately 90 % predictive become highly hydrophobic domains with 20Ð30 accuracy by combining the positive inside rule residues (corresponding to membrane thickness) and the free energy calculation of protein folding and are connected by loop regions with low from an amino acid sequence. hydrophobicity. TMFinder (Charles et al. 2001) There are methods that model the feature of predicts TM regions using this physicochemical a TM helix with inside and outside loop in- character. formation, and predict the topology of a query The idea that “the TM helix domain has high sequence by calculating the fitness score using average hydrophobicity” is the most fundamental the predefined profile or HMM. The loops at the one. However, using only this characteristic, a extracellular/cytoplasmic side, the helix termini TM helix may be undistinguishable from several at the extracellular/cytoplasmic side, the central helices inside the hydrophobic core of globu- region of a TM helix, and sometimes the large lar proteins. By taking protein sequence length globular domains exposed to water are incor- and amino acid amphiphilicity into considera- porated into the TM helix HMM. TMHMM tion, SOSUI (Hirokawa et al. 1998) can predict (Krogh et al. 2001) and its improved version TM helices by rejecting false-positive helices in (Viklund and Elofsson 2004), HMMTOP (Tus- globular proteins and therefore, highly accurate nády et al. 2001), and MEMSAT (Jones 2007) (97 %) discrimination between globular protein are typical programs that yield the highest accu- and membrane proteins can be accomplished. racies. Kim et al. developed a physicochemical algo- rithm for identifying TM proteins from genomic 7TMHMM (http://tp12.pzr.uni-rostock.de/~ databases using a quasi-periodic feature classi- moeller/7tmhmm/submission.php) (Möller et al. fier (QFC) (Kim et al. 2000). From a sequen- 2001a) is a GPCR-specific predictor. This model tial plot of physicochemical parameters, several is derived from TMHMM (Krogh et al. 2001) parameters of amino acids (amino acid usage and contains sub-models for the seven TM index, log average of hydrophobicity periodicity, helices and the cytoplasmic and extracellular log average of polarity periodicity, and variance loops, but is not trained on GPCR sequences. It of first-order polarity derivative) are evaluated in uses a data-mining approach, combining pattern a window of protein sequences and those values discovery with membrane topology prediction to are then used in a linear discriminant function to find patterns of amino acid residues in the TM separate GPCRs from non-GPCRs. The perfor- domains of GPCR. mance on their test dataset shows 96 % positive identification of known GPCRs. GPCRHMM (http://gpcrhmm.cgb.ki.se) In the analysis of TM helix proteins, the term (Markus et al. 2006) is an HMM that is ‘topology prediction’ involves the simultaneous based on a large dataset representing the entire prediction of TM helical regions and the exposed GPCR superfamily. GPCRHMM’s sensitivity is side (extracellular or cytoplasmic side) of the approximately 15 % higher than that of the best loops that connect neighboring helices. The fun- TM helix predictors at comparable false-positive damental view of the inside-and-outside predic- rates. When this method was applied to five tion of membrane protein is the positive inside proteomes, 120 sequences with no annotations rule (Andersson and von Heijne 1994) that the were obtained as novel GPCRs. GPCRHMM loops exposed to the cytoplasmic side have an strongly rejected a family of arthropod-specific abundance of basic residues, such as Lys and Arg, odorant receptors believed to be GPCRs. Detailed compared to the loops exposed to the extracellu- analysis showed that the sequences were indeed lar side. TopPred is the first program to perform very different from those of the other GPCRs. 216 M. Suwa

There are other methods that do not take into for the discrimination and classification of consideration the TM model. NN-based method proteins. It employs five “z values” derived (PHD-htm (Rost et al. 1995)) and SVM-based from 26 physicochemical properties using methods (SVMtm (Yuan et al. 2004), SPOCTO- principal component analysis. The five z values PUS (Viklund et al. 2008), and MEMSAT-SVM represent amino acid hydrophobicity, steric (Nugent and Jones 2009)) directly learned pa- bulk/polarizability, polarity, electronic effects, rameters from sequence information to TM helix and electrophilicity. Lapnish et al. translated prediction. Those methods show high predictive amino acid sequences into these chemical accuracy. parameters and applied this numerical sequence The motif-based search method is also a pop- to the GPCR classification (Lapnish et al. 2002). ular tool for discriminating GPCR from other proteins. Cobanoglu et al. discriminated Class A GPCRTree (Davies et al. 2007, 2008a, b) GPCRs from other types (Cobanoglu et al. 2011) (http://igrid-ext.cryst.bbk.ac.uk/gpcrtree/)isa by discovering key receptor-ligand interaction system that uses an alignment-free classification sites selected by an original technique (Distin- based on the physicochemical properties of guishing Power Evaluation (DPE)). Application amino acids. Using proteochemometrics (Lapinsh of this method to a GPCR sequence showed et al. 2001, 2005), the five physicochemical that it outperformed several other GPCR Class A values are calculated for each amino acid in subfamily prediction tools. the sequence and are used to generate 15 attribute In general, sequence motif based methods are values. This system yields accuracies of 97 % strongly dependent on the results of similarity at the class level, 84 % at the subfamily level, of primary sequence alignments. In this regard, and 75 % at the sub-subfamily level when the it is necessary to design a motif discovery and system was applied to BIAS-PROFS GPCR application method that is not strongly dependent dataset (Davies et al. 2007). on primary sequence similarity. The gpcrmotif A better strategy would be to combine dif- (Gangal and Kumar 2007) is a method that uses ferent classifiers to improve both specificity and a reduced alphabet representation (Eric et al. sensitivity for the identification of a broader spec- 2009) where similar functional residues (e.g., trum of GPCR candidates. 7TMRmine (Lu et al. positively charged residues, negatively charged 2009) is a Web server that integrates alignment- residues, etc.) have similar symbols. This reduced free and alignment-based classifiers specifically alphabet representation can accurately classify trained to identify seven TM helix type receptors known GPCRs and the results obtained are com- as well as TM helix prediction methods. This parable to those of PRINTS and PROSITE. For tool enables users to easily assess the distribu- well-known GPCR sequences in SWISSPROT tion of GPCR families in diverse genomes or database, there are no false negatives and only a individual newly discovered proteins. Users can few false positives. This method is applicable to submit protein sequences for analysis or explore almost all known classes of GPCRs. It also pre- pre-analyzed results of multiple genomes. This dicts more than one class for certain sequences. server currently includes prediction results and a This method has annotated 695 orphan receptors, summary of statistics for 68 genomes. 121 of which belong to Class A GPCRs (Gangal Several GPCR classification methods are and Kumar 2007). available. They are based on statistical models (HMM, etc.), machine learning methods (deci- sion tree, SVM, kNN, CA, etc.), and methods 10.3.5 Tools for Hierarchical GPCR that combine different types of algorithms. Family Classification Qian et al. generated a phylogenetic tree based profile HMM (T-HMM) (Qian et al. 2003) and Proteochemometrics (Lapinsh et al. 2001, showed its superiority in generating a profile 2002, 2005) is an alignment-free approach for a group of similar proteins. When T-HMM 10 Bioinformatics Tools for Predicting GPCR Gene Functions 217 was applied to the generation of the common The predictive power of this method is higher features of GPCRs, the generated profile yielded than that of the amino acid composition based high accuracy in GPCR function classification by method because dipeptide composition takes into ligand type and by coupled G-protein type. account both the amino acid composition and the Huang et al. used the decision tree method local consensus sequence of amino acids. This (bagging classification tree) (Huang et al. 2004) method can distinguish GPCRs from non-GPCR to predict GPCR subfamily and sub-subfamily sequences with nearly 100 % accuracy. It can based on the amino acid composition of a protein. also predict classes and subfamilies of GPCRs They adopted the C4.5 algorithm for classifi- with higher than 80 % accuracy. cation tree construction. To improve prediction accuracy, they used the bootstrap aggregating GPCR-CA (Xiao et al. 2009) uses cellular (bagging) procedure and the prediction accuracy automaton (CA) images to represent GPCRs actually became higher than that obtained by through their pseudo amino acid composition using a single classification tree. In a cross- (PseuAA composition (Chou 2001)), which validation test, they achieved a predictive accu- was originally introduced to improve protein racy of 91.1 % for GPCR subfamily classifica- subcellular localization prediction and membrane tion and 82.4 % for sub-subfamily classification protein type prediction. GPCR-CA has a two- (Huang et al. 2004). layered prediction engine in which the first layer is used to determine whether a protein is a GPCR PRED-GPCR (Papasaikas et al. 2004)(http:// or a non-GPCR, and the second layer is used to athina.biol.uoa.gr/bioinformatics/PRED-GPCR/) classify the protein into the six functional classes. uses family-specific profile HMMs to determine The overall success rates of the prediction for the which GPCR family a query sequence resembles. first and second layers are higher than 91 and The approach proposed by this method exploits 83 %, respectively. The same research group also the descriptive power of profile HMMs and uses developed a tool that uses both adaptive kNN an exhaustive discrimination assessment method algorithm and CA images (Xiao and Qiu 2010). to select only highly selective and sensitive Based on CA images, complexity measure factors profile HMMs for each family. The collection derived from each of the protein sequences are of these profile HMMs constitutes a signature adopted for its PseuAA composition. GPCRs library that is scanned for significant matches are categorized into nine subtypes. The overall with a given query sequence. success rate for nine class (rhodopsin-like receptor, peptide, hormones receptor, glutamate GPCRsclass (Bhasin and Raghava 2005) and calcium receptor, fungal mating pheromone (http://www.imtech.res.in/raghava/gpcrsclass/) receptor, cyclic AMP receptor, odorant receptors, is a tool for predicting amine binding receptors gustatory receptor, frizzled/smoothened family, from amino acid sequences. For this purpose, T2R family) identification is approximately SVM based tools were developed. The average 83.5 %. The high success rate indicates the high accuracy of the method for two cases based potential of the adaptive kNN algorithm and CA on amino acid composition and dipeptide images for classifying GPCRs. composition is 89.8 and 96.4 %, respectively, when evaluated using the fivefold cross- GPCR-SVMFS (Li et al. 2010) uses a novel validation test. three-layer predictor based on SVM and feature selection is developed for the prediction and clas- GPCRpred (Bhasin and Raghava 2004)(http:// sification of GPCRs directly from amino acid www.imtech.res.in/raghava/gpcrpred/)usessequences. In this method, mRMR (maximum an SVM-based method and the dipeptide relevance minimum redundancy) algorithm is ap- composition of proteins to predict GPCR family plied to pre-evaluate features with discrimina- and subfamily levels for a query sequence. tive information, and genetic algorithm (GA) is 218 M. Suwa utilized to find the optimized feature subsets. reported >90 % specificity with 30Ð40 % sen- The overall accuracy of the three-layer predictor sitivity (Möller et al. 2001b). Sreekumar et al. at the superfamily, family, and subfamily levels succeeded in reducing the prediction error rate was determined by a cross-validation test on two to <1 % (Sreekumar et al. 2004)usingHMM non-redundant datasets and was approximately classification. Subsequent to those works, the 0.5Ð16 % higher than those of GPCR-CA and following useful Web tools were published. GPCRPred. GRIFFIN (Yabuki et al. 2005)(http://griffin. GPCR-MPredictor (Muhammad and Asif cbrc.jp; G-protein and Receptor Interaction Fea- 2012)(http://111.68.99.218/gpcr-mpredictor/) ture Finding INstrument) is a Web tool that pre- combines individual classifiers for the prediction dicts GPCR and G-protein coupling selectivity by of GPCRs and can efficiently predict GPCRs using SVM and HMM. Based on the assump- at five levels: GPCR or non-GPCR, family, tion that whole structural segments of ligands, subfamily, sub-subfamily, and subtype levels. GPCRs, and G-proteins are essential to determine This program analyzes the discriminative power GPCR and G-protein coupling selectivity, vari- of different feature extraction and classification ous quantitative features are selected for ligands, strategies in the case of GPCR prediction and GPCRs, and G-protein complex structures, and then uses an evolutionary ensemble approach to “the most effective parameter set” for predict- enhance prediction performance. Features are ing G-protein type are selected by evaluating extracted using the amino acid composition, predictive accuracy in SVM. The main part of PseuAA composition, and dipeptide composition GRIFFIN includes a hierarchical SVM classifier of receptor sequences. Different classification that utilizes feature vectors which is composed approaches, such as kNN, SVM, probabilistic of “the most effective parameter set”, and this neural network (PNN), decision tree method, classifier is useful for Class A GPCRs, the major naive Bayes method etc., are used to classify GPCR family. For and olfactory subfami- GPCRs at the five levels. lies of Class A and other minor families (Classes B, C, and frizzled/smoothened), the binding G- protein is predicted with high accuracy using 10.3.6 Tools for Predicting GPCR HMM. Most importantly, this method is the first Function in View to predict G-protein type by inputting both ligand of Protein–Protein Interaction molecular weight and GPCR sequence informa- Network tion. It can predict G-proteins with sensitivity and specificity exceeding 85 % for the dataset with To predict GPCR function in the cell, it is nec- known coupling property. essary to know the type of coupling G-protein as the signal transduction pathway is related to the PRED-COUPLE (Sgurakis et al. 2005) G-protein type. (http://athina.biol.uoa.gr/bioinformatics/PRED- Enumerated (Qian et al. 2003; Cao et al. 2003; COUPLE2/) is a Web method that predicts Möller et al. 2001b; Sreekumar et al. 2004)are the coupling specificity of GPCRs to the four the previously studied methods and tools for families of G-proteins (including G12/13). This predicting coupling G-proteins. T-HMM of Qian method can predict coupling to more than et al. yielded high accuracy in the GPCR classi- one family of G-proteins, as is experimentally fication by coupled G-protein, as well as in the determined. Similar to the first version, the classification by ligand type (Qian et al. 2003). PRED-COUPLE2 system implements a library The method of Cao et al., which is based on the of refined profile HMMs. The profiles are trained naive Bayes model, was able to predict G-protein by the intracellular domain sequences of 188 with 72 % sensitivity from 55 GPCRs (Cao et al. GPCRs with known coupling properties. All 2003). Using pattern extraction, Möller et al. HMMs are constructed and calibrated by the 10 Bioinformatics Tools for Predicting GPCR Gene Functions 219

HMMER (Eddy 1998) software package. In tuting the interface. Unfortunately, because of the order to produce the final prediction, scores from shortage of the learning dataset, it is difficult to individual profiles are combined by using an evaluate the accuracy of this system. Neverthe- Artificial Neural Network. Multiple predictions less, it works well at least for known complexes. should be made for receptors in the form of more than one output above the threshold. Conventional research of GPCR function has 10.4 Conclusion taken into consideration the relationship between a ligand and an isolated GPCR. However, re- In this review, many bioinformatics tools that cently, new topics related to GPCR-GPCR com- predict the functions of GPCRs from amino acid plexes have emerged (Szidonya et al. 2008). In sequences were introduced. The tools have sev- some cases, one GPCR activates another GPCR eral classification layers that are divided accord- through the contact surface of their complex. In ing to GPCR family hierarchy, such as class, other cases, some GPCRs binding to specific G- family, subfamily, and sub-subfamily, by using proteins change the type of G-protein when the motif-based methods, alignmentÐfree methods, GPCR-GPCR complex is formed. The prediction statistical models, and machine learning methods. of interactions between GPCRs and other pro- These methods have achieved prediction accura- teins has attracted significant attention for next- cies of around 90 % (see Sects. 10.3.4, 10.3.5, generation drug design (Szidonya et al. 2008; and 10.3.6). Susan et al. 2002). Around 10 years ago, bioinformatics tools for Filizola and Wenstein reviewed several bioin- predicting GPCR function saw a surge in number, formatics approaches for GPCR oligomerization but the number saturated thereafter. However, prediction (Flizola and Wenstein 2005). The cor- in the past 3Ð5 years, new bioinformatics tools related mutation algorithm (Govel et al. 1994; for GPCR function prediction have emerged in Oliveira et al. 1993) is a powerful tool for the succession. Why is this so? During the first surge accurate prediction of physical contact. The evo- that occurred approximately 10 years ago, protein lutional trace method (Lihitage et al. 1996)is sequences were determined rapidly from genome also useful to determine family-specific func- sequences, and it became possible to analyze tional regions. Using those methods, Filizola and a large number of GPCR sequences. However, Wenstein analyzed the occurrence probability of although bioinformatics tools were required tem- residues, which exposed to lipid surface, in pre- porarily, it seemed that the demand for them dicting the oligomerization interface of GPCRs reached saturation as experimental results and and reported the importance of the TMH4-TMH6 protein tertiary structures were restricted in those regions as the oligomerization interfaces (Flizola days. and Wenstein 2005). On the other hand, in the past several years, the amount of biological information showed GRIP (Nemoto et al. 2009)(http://grip.cbrc.jp/ a marked increase due to remarkable advances GRIP/) is a unique WEB tool that predicts the in experimental techniques. For example, the interface of oligomerized GPCRs by means of genome sequences of many species have been the spatial cluster detection (SCD) method (Cook determined by next-generation sequencers. et al. 2007). Here, the three procedures for query Moreover, the tertiary structures of GPCRs sequence analysis involve: (1) gathering of ho- are rapidly increasing in number (structures mologous sequence families, (2) pre-searching of approximately 13 families in 2013) with of template structure to which sequence families improved crystal analysis technology, and are mapped, and (3) detection of clusters of con- regions linked with a function can now be served residues around structural surfaces. The determined directly. Under such circumstances, conserved residues within the detected sector are the demand for bioinformatics tools has considered to correspond to the residues consti- resurfaced. 220 M. Suwa

In recent years, although the analysis of tar- get proteins has seen much advancement, the References number of de-orphanized receptors is still small. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ Therefore, GPCRs still maintain the position of (1990) Basic local alignment search tool. J Mol Biol the most important drug design target. In the 215:403Ð410 environment where the competition for drug tar- Altschul SF et al (1997) Gapped BLAST and PSI-BLAST: get search has intensified among pharmaceutical a new generation of protein database search programs. Nucleic Acids Res 25:3389Ð3402 companies, the requirement of bioinformatics for Andersson H, von Heijne G (1994) Membrane protein function classification in GPCR proteome has topology: effects of delta mu HC on the translocation naturally emerged. In this regard, what should of charged residues explain the ‘positive inside’ rule. be the purpose of developing bioinformatics tools EMBO J 13:2267Ð2272 Attwood TK, Coletta A, Muirhead G, Pavlopoulou A, from now on? If the purpose is to solve the prob- Philippou PB, Popov I, Roma-Mateo C, Theodosiou lem of ‘classification to fit a known category’ as A, Mitchell A (2012) The PRINTS database: a fine- described in Sect. 10.2, finding ways to increase grained protein sequence annotation and analysis re- source Ð its status in 2012. Database 2012: Article ID prediction accuracy would be a good goal of bas019 doi:10.1093/database/bas019 development. Bernsel A, Viklund H, Falk J, Lindahl E, von Heijne In order to classify sequences that have no G (2008) SCAMPI: prediction of membrane-protein similarity, bioinformatics researchers should set topology from first principles. Proc Natl Acad Sci USA 105:7177Ð7181 their sights on the development of optimal al- Bhasin M, Raghava GP (2004) GPCRpred: an SVM-based gorithms and parameters. Therefore, GPCR is method for prediction of families and subfamilies positioned as a suitable target for bioinformatics of G-protein coupled receptors. Nucleic Acids Res research. In reality, however, it remains a difficult 32(Web Server issue):W383ÐW389 Bhasin M, Raghava GP (2005) GPCRsclass: a web tool problem whether such bioinformatics research for the classification of amine type of G-protein- can meet the demand of researchers who actually coupled receptors. Nucleic Acids Res 33(Web Server design experiments. Of course, it is very impor- issue):W143ÐW147 tant to predict to which family a novel receptor Bishop CM (1994) Neural networks and their applica- tions. Rev Sci Instrum 65:1803Ð1832 belongs as this information is expected to support Brauner-Osborne H, Wellendorph P, Jensen AA (2007) the design of new experiments. It is also very Structure, pharmacology and therapeutic prospects of important to predict, for a novel GPCR, the type family C G-protein coupled receptors. Curr Drug Tar- of signal it transmits to a cell and the biological gets 8:169Ð184 Cao J, Panetta R, Yue S, Steyaert A, Young-Bellido M, changes that take place as a result of signal Ahmad S (2003) A naive Bayes model to predict cou- transduction. Current GPCR function prediction pling between seven transmembrane domain receptors tools cannot achieve this yet. and G-proteins. Bioinformatics 19:234Ð240 In bioinformatics, a vast quantity of data in a Chapter MC et al (2010) Chemical modification of class II G protein-coupled receptor ligands: frontiers in the de- complicated system are needed as learning data. velopment of peptide analogs as neuroendocrine phar- Fortunately, such a vast amount of experimental macological therapies. Pharmacol Ther 125:39Ð54 data would be obtained by scaling up experimen- Charles M et al (2001) TM Finder: a prediction program tal techniques. We also expect the development of for transmembrane protein segments using a combi- nation of hydrophobicity and nonpolar phase helicity new prediction tools using those data. Obviously, scales. Protein Sci 10:212Ð219 the close cooperation of researchers in the fields Chou KC (2001) Prediction of protein cellular at- of biochemistry, structural biology, and bioinfor- tributes using pseudo-amino acid composition. Pro- matics is indispensable to this end. teins 43:246Ð255 Civelli O, Saito Y, Wang Z, Hans-Peter Nothacker H- P, Reinscheid RK (2006) Orphan GPCR and their Acknowledgement I appreciate deeply the insightful dis- ligands. Pharmacol Ther 110:525Ð532 cussions with Dr. Minoru Sugihara (CBRC), Dr. Masami Cobanoglu MC, Saygin Y, Sezerman U (2011) Classifi- Ikeda (Aoyama Gakuin University), Dr. Wataru Nemoto cation of GPCRs using family specific motifs. Trans (Tokyo Denki University), and Dr. Hiroyuki Toh (CBRC). Comput Biol Bioinformatics 8:1495Ð1508 10 Bioinformatics Tools for Predicting GPCR Gene Functions 221

Cook AJ, Gold DR, Li Y (2007) Spatial cluster de- Fridmanis D et al (2007) Formation of new genes ex- tection for censored outcome data. Bio Metrics 63: plains lower intron density in mammalian rhodopsin 540Ð549 G protein-coupled receptors. Mol Phylogenet Evol Crasto C, Marenco L, Miller P, Shepherd G (2002) 43:864Ð880 Olfactory receptor database: a metadata-driven auto- Gangal R, Kumar KK (2007) Reduced alphabet motif mated population from sources of gene and protein methodology for GPCR annotation. J Biomol Struct sequences. Nucleic Acids Res 30:354Ð360 Dyn 25:299Ð310 Cristianini N, Shawe-Taylor J (2000) An introduction to Gaulton A, Attwood TK (2003) Bioinformatics ap- support vector machines: and other kernel-based learn- proaches for the classification of G-protein-coupled ing methods. Cambridge University Press, Cambridge, receptors. Curr Opin Pharmacol 3:114Ð120 xi, 189 Gollery M (2008) Handbook of hidden Markov models Davies MN et al (2007) On the hierarchical classifica- in bioinformatics. Chapman & Hall/CRC Press, Boca tion of G protein-coupled receptors. Bioinformatics Raton/London, xix, 156 23:3113Ð3118 Govel V, Sander C, Schneider R, Valencia A (1994) Davies MN et al (2008a) Optimizing amino acid Correlated mutations and residue contacts in proteins. groupings for GPCR classification. Bioinformatics Proteins 18:309Ð317 24:1980Ð1986 HaitinaT, Fredriksson R, Foord SM, Schioth H, Glo- Davies MN et al (2008b) GPCRTree: online hierarchical riam DE (2009) The G protein-coupled receptor classification of GPCR function. BMC Res Notes. subset of the dog genome is more similar to that doi:10.1186/1756-0500-1-67 in humans than rodents. BMC Genomes 10:24. Dong M et al (2008) Insights into the structural basis of doi:10.1186/1471-2164-10-24 endogenous agonist activation of family B G protein- Harmar AJ et al (2009) IUPHAR-DB: the IU-PHAR coupled receptors. Mol Endocrinol 22:1489Ð1499 database of G protein coupled receptors and ion chan- Dong M, Cox RF, Miller LJ (2009) Juxtamembranous nels. Nucleic Acids Res 37:D680ÐD685 region of the amino terminus of the family BG Hill CA, Fox AN, Pitts RJ, Kent LB, Tan PL, Chrystal protein-coupled plays a critical MA, Cravchik A, Collins FH, Robertson HM (2002) role in small-molecule agonist action. J Biol Chem G protein-coupled receptors in Anopheles gambiae. 284:21839Ð21847 Science 298:176Ð178 Dorota L et al (2012) G-protein coupled receptors-recent Hirokawa T, Boon-Chieng S, Mitaku S (1998) SO- advances. Acta Biochem Pol 59:515Ð529 SUI: classification and secondary structure predic- Eddy SR (1998) Profile hidden Markov models. Bioinfor- tion system for membrane proteins. Bioinformatics matics 14:755Ð763 14:378Ð379 Eichinger L, Noegel AA (2005) Comparative genomics of Hodges PE et al (2002) Annotating the human proteome: Dictyostelium discoideum and Entamoeba histolytica. the human proteome survey database (HumanPSDTM) Curr Opin Microbiol 8:606Ð611 and an in-depth target database for G protein-coupled Eilers M et al (2005) Comparison of class A and D G receptors (GPCR-PDTM) from incyte genomics. Nu- protein-coupled receptors: common features in struc- cleic Acids Res 30:137Ð141 ture and activation. Biochemistry 44:8959Ð8975 Horn F et al (1998) GPCRDB: an information system Elofson A, von Heijne G (2007) Membrane protein struc- for G protein Ð coupled receptors. Nucleic Acids Res ture: prediction versus reality. Annu Rev Biophys Rev 26:275Ð279 Biochem 76:125Ð140 Horn F, Vriend G, Cohen FE (2000) Collecting and har- Eric LP, Jane K, Julie AT, Rob P (2009) Reduced amino vesting biological data: the GPCRDB and NucleaRDB acid alphabets exhibit an improved sensitivity and se- information systems. Nucleic Acids Res 29:346Ð349 lectivity in fold assignment. Bioinformatics 25:1356Ð Horn F et al (2003) GPCRDB information system for 1362 G protein Ð coupled receptors. Nucleic Acids Res Finn RD et al (2010) The Pfam protein families database. 31:294Ð297 Nucleic Acids Res 38(Database issue):D211ÐD222 https://www.ebi.ac.uk/chembl/sarfari/gpcrsafari Flizola M, Wenstein H (2005) The study of G- Huang Y, Cai J, Ji L, Li Y (2004) Classifying G-protein protein coupled receptor oligomerization with com- coupled receptors with bagging classification tree. putational method and bioinformatics. FEBS J 272: Comput Biol Chem 28:275Ð280 2926Ð2938 Hunter S et al (2009) InterPro: the integrative protein Fredriksson R, Schiöth HB (2005) The repertoire of G- signature database. Nucleic Acids Res 37(Database protein-coupled receptors in fully sequenced genomes. issue):D211ÐD215 Mol Pharmacol 67:1414Ð1425 Jones DT (2007) Improving the accuracy of transmem- Fredriksson R, Lagerström MC, Lundin LG, Schiöth HB brane protein topology prediction using evolutionary (2003) The G-protein-coupled receptors in the human information. Bioinformatics 23:538Ð544 genome form five main families. Phylogenetic analy- Josefsson LG (1999) Evidence for kinship between sis, paralogon groups, and fingerprints. Mol Pharmacol diverse G-protein coupled receptors. Gene 239: 63:1256Ð1272 333Ð340 222 M. Suwa

Karchin R, Karplus K, Haussler D (2002) Classifying that determine global and class-specific functions. J G-protein coupled receptors with support vector ma- Biol Chem 279:8126Ð8132 chines. Bioinformatics 18:147Ð159 Malbon CC (2004) Frizzleds: new members of the super- Karmik SS, Gogonea C, Patil S, Saad Y, Takezako T family of G-protein-coupled receptors. Front Biosci (2003) Activation of G-protein-coupled receptors: a 9:1048Ð1058 common molecular mechanism. Trends Endocrinol Markus W, Lukas K, Erik LL (2006) Sonnhammer, a gen- Metab 14:431Ð437 eral model of G protein- coupled receptor sequences Kerashvili G et al (2010) GPCR-OKB: the G protein and its application to detect remote homologs. Protein coupled receptor oligomer knowledge base. Bioinfor- Sci 15:509Ð521 matics. doi:10.1093/bioinformatics/btq 264 Menzies T, Hu Y (2003) Data mining for very busy Kim J, Moriyama EN, Warr CG, Clyne PJ, Carlson JR people. IEEE Comput 36:22Ð29 (2000) Identification of novel multi-transmembrane Mizadegan T, Benkö G, Filipek S, Palczewski K proteins from genomic databases using quasi-periodic (2003) Sequence analysis of G-protein-coupled- structural properties. Bioinformatics 9:767Ð775 receptors: similarities to rhodopsin. Biochemistry 42: Kobilka BK (2007) G protein coupled receptor structure 2759Ð2767 and activation. Biochim Biophys Acta 1768:794Ð807 Möller S, Vilo J, Croning MD (2001a) Supplementary Kobilka B, Schertler FX (2008) New G-protein-coupled material for the G-protein coupling prediction paper receptor crystal structures; insights and limitations. of Croning Prediction of the coupling specificity of Trends Pharm Sci 29:79Ð83 GPCRs to their G proteins Bioinformatics (Suppl Kohonen T (1990) The self-organization map. Proc IEEE 1):174Ð181 9:1464Ð1480 Möller S, Vilo J, Croning MD (2001b) Prediction of the Kojima M, Hosoda H, Date Y, Nakazato M, Matsuo coupling specificity of G protein coupled receptors to H, Kangawa K (1999) Ghrelin is a growth-hormone- their G proteins. Bioinformatics 17:174Ð181 releasing acylated peptide from stomach. Nature Muhammad N, Asif UK (2012) GPCR-MPredictor: multi- 402:656Ð660 level prediction of G protein-coupled receptors using Krogh A, Larsson B, von Heijne G, Sonnhammer ELL genetic ensemble. Amino Acids 42(5):1809Ð1823 (2001) Predicting transmembrane protein topology Muramatsu T, Suwa M (2006) Statistical analysis and pre- with a hidden Markov model: application to complete diction of functional residues effective for GPCR-G- genomes. J Mol Biol 305:567Ð580 protein coupling selectivity. Protein Eng 19:277Ð283 Kyte J, Doolittle RF (1982) A simple method for display- Nemoto W, Fukui K, Toh H (2009) GRIP: a server for pre- ing the hydropathic character of a protein. J Mol Biol dicting interfaces for GPCR oligomerization. J Recept 157:105Ð132 Signal Transduct Res 29:312Ð317 Lapinsh M et al (2001) Development of proteo- Nugent T, Jones DT (2009) Transmembrane protein topol- chemometrics: a novel technology for the analysis of ogy prediction using support vector machines. BMC drug-receptor interactions. Biochimica Et Biophysica Bioinformatics 10:159 Acta- Gen Subj 1525:180Ð190 Nygaard R, Frimurer TM, Holst B, Rosenkilde MM, Lapinsh M et al (2005) Improved approach for pro- Schwartz TW (2009) Ligand binding and micro- teochemometrics modeling: application to organic switched in 7TM receptor structures. Trends Pharma- compoundÐamine G protein-coupled receptor interac- col Sci 30:249Ð259 tions. Bioinformatics 21:4289Ð4296 Okuno Y, Tamon A, Yabuuchi H, Niijima S, Minowa Y, Lapnish M, Gutcaits A, Prusis P, Post C, Lundstedt T, Tonomura K, Kunimoto R, Feng C (2008) GLIDA: Wikberg JES (2002) Classification of G-protein cou- GPCR-ligand database for chemical genomics drug pled receptors by alignment-independent extraction of discovery-database and tools update. Nucleic Acids principle chemical properties of primary amino acid Res 36:D907ÐD912 sequences. Protein Sci 11:795Ð805 Oliveira L, Paiva ACM, Vriend G (1993) A common Li Z, Zhou X, Dai Z, Zou X (2010) Classification of motif in G-protein coupled seven transmembrane helix G-protein coupled receptors based on support vector receptors. J Comp Aid Mol Des 7:649Ð658 machine with maximum relevance minimum redun- Ono Y, Fujibuchi W, Suwa M (2005) Automatic gene dancy and genetic algorithm. BMC Bioinformatics collection system for genome-scale overview of 11:325 G-protein coupled receptors in eukaryotes. Gene Lihitage O, Bowrne HR, Chhen FE (1996) An evolution- 364:63Ð73 ary trace method defines binding surfaces common to Palczewski K et al (2000) Crystal structure of rhodopsin: protein families. J Mol Biol 257:342Ð358 a G protein- coupled receptor. Science 289:739Ð745 Lu G, Wang Z, Jones AM, Moriyama EN (2009) Papasaikas PK, Bargos PG, Litou ZI, Promponas VJ, 7TMRmine: a Web server for hierarchical min- Hamodrakas SJ (2004) PRED-GPCR: GPCR recog- ing of 7TMR proteins. BMC Genomics 10:275. nition and family classification server. Nucleic Acids doi:10.1186/1471-2164-10-275 Res 32:W380ÐW382 Madabushi S, Gross AK, Philippi A, Meng EC, Wensel Parthier C et al (2009) Passing the baton in class B TG, Lichtarge O (2004) Evolutionary trace of G GPCRs: peptide hormone activation via helix induc- protein-coupled receptors reveals clusters of residues tion? Trends Biochem Sci 34:303Ð310 10 Bioinformatics Tools for Predicting GPCR Gene Functions 223

Pearson WR, Lipman DJ (1988) Improved tools for bio- Susan R, George Brian FOD, Samuel PL (2002) G-protein logical sequence comparison. Proc Natl Acad Sci USA coupled receptor oligomerization and it potential for 85:2444Ð2448 drug discovery. Nature review 1:808Ð820 Pin JP, Galvez T, Prezeau L (2003) Evolution, structure, Suwa M, Ono Y (2009) Computational overview of GPCR and activation mechanism of family 3/C G- protein- gene universe to support reverse chemical genomics coupled receptors. Pharmacol Ther 98:325Ð354 study. In: Koga H (ed) Reverse chemical genetics, Prabhu Y, Eichinger L (2006) The Dictyostelium reper- Methods in Mol Biol 577, 1st edn. Springer, Tokyo toire of seven transmembrane domain receptors. Eur J Suwa M, Ono Y (2010) A bioinformatics strategy to Cell Biol 85:937Ð946 produce a cyclically developing project structure- Qian B, Soyer OS, Neubig RR, Goldstein RA (2003) comprehensive functional analysis of the drug design Depicting a protein’s two faces: GPCR classification target genes. Synthesiology 3:1Ð12 by phylogenetic tree-based HMMs. FEBS Lett 554: Suwa M, Sugihara M, Ono Y (2011) Functional and struc- 95Ð99 tural overview of G-protein-coupled receptors compre- Rasmussen SG et al (2011) Crystal structure of the hensively obtained from genome sequence. Pharma- Beta 2 adrenergic receptor-Gs protein complex. Nature ceuticals 4:652Ð664 477:549Ð555 Szidonya L, Cserzo M, Hunyady L (2008) Dimerization Rosenbaum DM, Rasmaussen SGF, Kobilka BK (2009) and oligomerization of G-protein-coupled receptors: The structure and function of G-protein-coupled re- debated structures with established and emerging func- ceptors. Nat Insight 459:356Ð363 tions. J Endocrinol 196:435Ð453 Rost B, Casadio R, Fariselli P, Sander C (1995) Trans- Tatemoto K, Hosoya M, Habata Y, Fujii R, Kakegawa membrane helices predicted at 95% accuracy. Protein T, Zou MX et al (1998) Isolation and characteriza- Sci 4:521Ð533 tion of a novel endogenous peptide ligand for the Ruiz-Gómez A, Molnar C, Holguín H, Mayor F Jr, de human APJ receptor. Biochem Biophys Res Commun Celis JF (2007) The cell biology of Smo signalling and 251:471Ð476 its relationships with GPCRs. Biochim Biophys Acta Theodoropoulou MC, Bagos PG, Spyropoulos IC, 1768:901Ð912 Hamodrakas SJ (2008) gpDB: a database of GPCRs, Saito Y, Nothacker H-P, Wang Z, Steven HSL, Leslie G-proteins, effectors and their interactions. Bioinfor- F, Civelli O (1999) Molecular characterization of matics 24:1471Ð1472 the melanin-concentrating-hormone receptor. Nature Thora KB, David EG, Sofia HH, Helena K, Fredriksson 400:265Ð269 R, Helgi BS (2006) Comprehensive and phylogenetic Sakurai T, Amemiya A, Ishii M, Matsuzaki I, Chemell analysis of the G protein-coupled receptors in human RM, Tanaka H et al (1998) Orexins and orexin recep- and mouse. Genomics 88:263Ð273 tors: a family of hypothalamic neuropeptides and G Troemel ER et al (1995) Divergent seven transmembrane protein-coupled receptors that regulate feeding behav- receptors are candidate chemosensory receptors in C. ior. Cell 92:573Ð585 elegans. Cell 83:207Ð218 Satagopam VP, Theodoropoulou MC, Stampolakis CK, Tusnády GE, Simon I, Jayasinghe S, Hristova K, White Pavlopoulos GA, Papandreou NC, Bagos PG, Schnei- SH (2001) Bioinformatics 17:49Ð50 der R, Hamodrakas SJ (2010) GPCRs, G-proteins, ef- Vassilatis DK et al (2003) The G protein-coupled receptor fectors and their interactions: human-gpDB, a database repertoires of human and mouse. Proc Nat Acad Sci employing visualization tools and data integration USA 100:4903Ð4908 techniques. Database doi:2010:baq019 Vauquelin G, Mentzer B (2007) G. Protein-coupled recep- Sgurakis NG, Bagos PG, Papasaikas PK, Hamodorakas tors. Wiley, West Sussex SJ (2005) A method for the prediction of GPCRs Viklund H, Elofsson A (2004) PRO/PRODIV-TMHMM: coupling specificity to G-proteins using refined profile best alpha-helical transmembrane protein topol- Hidden Markov Models. BMC Bioinformatics 6:104. ogy predictions are achieved using hidden Markov doi:10.1186/1471-2105-6-104 models and evolutionary information. Protein Sci Shakhnarovish-rovich G, Darrell T, Indyk P (ed) (2005) 13:1908Ð1917 Nearest-neighbor methods in learning and vision, Viklund H, Bernsel A, Skwark M, Elofson A (2008) IEEE. The MIT Press, Cambridge, MA SPOCTOPUS: a combined predictor of signal pep- Sigrist CJ et al (2010) PROSITE, a protein domain tides and membrane protein topology. Bioinformatics database for functional characterization and 24:2928Ð2929 annotation. Nucleic Acids Res 38(Database Wess J (1998) Molecular basis of receptor/G- issue):D161ÐD166 protein Ð coupling selectivity. Pharmacol Ther 80: Smith TF, Waterman MS (1981) Identification of common 231Ð264 molecular subsequences. J Mol Biol 147:195Ð197 Wise A et al (2002) Target validation of G-protein coupled Sreekumar KR, Huang Y, Pausch MH, Gulukota K (2004) receptors. Drug Discov Today 7:235Ð246 Predicting GPCRÐG-protein coupling using hidden Wolfram S (1984) Cellular automata as models of com- Markov models. Bioinformatics 20:3490Ð3499 plexity. Nature 311:419Ð424 224 M. Suwa

Xiao X, Qiu WR (2010) Using adaptive K-nearest neigh- Yabuki Y, Mutamatsu T, Hirokawa T, Mukai H, Suwa bor algorithm and cellular automata images to predict- M (2005) GRIFFIN: a system for predicting GPCR- ing G-protein-coupled receptor classes. Interdiscip Sci G-protein coupling selectivity using support vector Comput Life Sci 2:180Ð184 machines and a hidden Markov model. Nucleic Acids Xiao X, Wang P, Chou KC (2009) GPCR-CA: A cellular Res 32:W148ÐW153 automaton image approach for predicting G-proteinÐ Yuan Z, Mattick JS, Teasdale RD (2004) SVMtm: support coupled receptor functional classes. J Comput Chem vector machines to predict transmembrane segments. 30:1414Ð1423 J Comput Chem 25:632Ð636 Index

A Conformational states, 38, 39, 46, 47, 64, 110 Activation mechanism, 45, 46, 50 Congreve, M., 135 Ahmed, J., 191 Continuum elastic theory, 65 Alignment free method, 209, 211, 214, 219 Continuum-Molecular Dynamics (CTMD), 61Ð63, All-atom simulations, 80, 81, 89, 91 65Ð69, 193, 194 Allen, T.W., 116 Cordomí, A., 15 Andrews, S.P., 135 Costanzi, S., 3, 5, 7, 136, 142 “-Arrestin, 38, 170 Crasto, C.J., 191 Audet, M., 8 Croft, D., 195 Crystal structures, 6, 8Ð10, 15Ð29, 39, 40, 42Ð48, 50, 63, 78, 80, 81, 89, 96Ð99, 101, 103, 104, B 107Ð110, 113, 117, 120, 130, 131, 134, 135, Baldwin, J., 6, 8 140Ð151 Ballesteros, J.A., 16, 45 CTMD. See Continuum-Molecular Dynamics (CTMD) Bellis, L.J., 189 Curation, 190 Benjamin, T., 135 Curve fitting, 163 Berman, H.M., 198 Beuming, T., 195 Bhattacharya, S., 37 D Biased agonism, 107, 170 Database cross referencing, 186 Biased simulations, 115, 116 Databases, 10, 106, 131Ð136, 141Ð146, 185Ð201, Biogenic amines, 17, 19Ð22, 40Ð41, 50 209Ð211, 214Ð216 Bioinformatics, 205Ð220 de Esch, I.J.P., 129 Bitter taste receptors, 191, 192 de Graaf, C., 129, 135, 136, 146, 149 Blättermann, S., 135, 143 DHA. See ¨-3 Docosahexaenoic acid (DHA) Bolton, E., 189 Dimerization, 41, 56, 89, 107, 113Ð119, Bouvier, M., 8, 45 168, 194 Dimers, 25, 27, 76, 113, 114, 116, 119, 168Ð173 3D models, 190, 197, 199 C Docking, 7, 9, 10, 97, 98, 103, 106, 132Ð134, Carlsson, J., 135, 146 136, 140Ð143, 145, 146, 148, 149, 192, Cavasotto, C.N., 198 198, 199 Cheng, F., 189 ¨-3 Docosahexaenoic acid (DHA), 57, 58, 60, 69, 77, 78, Chen, X., 189 83Ð86, 90 Cherezov, V., 17 Dopamine receptor, 40, 42, 144 Chien, E.Y., 17 Dror, R.O., 46, 97, 102 Choe, H.W., 17 Drug design, 43Ð47, 111, 148, 197, 206, 219, 220 Cholesterol-GPCR binding, 69 Duan, Y., 99 Class, 4, 5, 19, 22, 24, 25, 27, 38, 40, 41, 59, 76, Dunkel, M., 191 96, 106, 131, 134, 141, 146, 147, 150, 172, 178, 186, 190, 199, 206Ð209, 213, 214, 216, 217, 219 E Coarse-graining, 46, 47, 62, 63, 75Ð91, 96, 112Ð113 Elefsinioti, A.L., 195 Conformation, 6, 7, 9, 16Ð25, 27, 29, 38, 39, 41Ð48, 51, Empirical modeling, 161Ð167, 173Ð176 57, 59, 76, 86, 96, 99, 103Ð111, 133, 143, 145, Enhanced methods, 46, 95Ð120 165, 170, 172, 194 Evolutionary algorithm (EA), 178

M. Filizola (ed.), G Protein-Coupled Receptors - Modeling and Simulation, Advances 225 in Experimental Medicine and Biology 796, DOI 10.1007/978-94-007-7423-0, © Springer ScienceCBusiness Media Dordrecht 2014 226 Index

F Intrinsic efficacy, 166, 168, 173, 175 Family, 4, 6, 7, 9, 15, 16, 20, 22, 24, 28, 40, 41, 56, 76, Inverse agonism, 167 130, 131, 177, 188Ð192, 195, 196, 198, 199, 206, Isberg, V., 105, 106 207, 209Ð211, 213Ð220 Istyastono, E.P., 135 Filizola, M., 95, 219 Frauenfelder, H., 96 Fredriksson, R., 209 J Free-energy calculations, 215 Jaakola, V.P., 17 Friesner, R.A., 9 Jacobson, K.A., 5, 7 Function, 3, 10, 24, 50, 57, 59, 61, 67, 69, 70, 75Ð78, Johner, N., 55 81Ð82, 95, 101, 105, 109, 115Ð117, 129, 136, 140, Johnston, J.M., 95 159Ð178, 192, 196, 206, 209, 214, 215, 219, 220 Functional microdomains in GPCRs, 57 Functional selectivity, 107, 148Ð151, 170Ð173, 176 K Kanehisa, M., 195 Kao, T.-C., 75 G Karchin, R., 209 Galizia, C.G., 191 Katritch, V., 10, 135 Gatica, E.A., 198 Keshava Prasad, T.S., 193 Gil, D., 159 Khelashvili, G., 55, 193 Giraldo, J., 159 Kim, J., 135, 143, 147, 215 Gonzalez, A., 15, 102 Kimura, S.R., 97, 98 Goto, S., 195 Kleinau, G., 191 G-protein, 4, 8, 15, 20, 25, 27, 28, 38, 39, 45, 47, 50, 149, Kobilka,B.K.,3,8,47 151, 186, 194Ð196, 206, 207, 209, 214, 217Ð219 Kolb, P., 135, 141, 145 G-protein coupled receptors (GPCR) Kooistra, A.J., 129 activation, 21, 44, 46, 173, 219 Kouyama, T., 17 discrimination, 210, 211, 214Ð216 Kowalsman, N., 185 dynamics, 37Ð51 Kruse, A.C., 17 function, 57, 58, 70, 165, 206, 209Ð220 structure, 4, 5, 7, 9Ð11, 39Ð47, 50, 58, 96, 129Ð151, 197, 200, 201, 208 L Granier, S., 17 Langmead, C.J., 135, 146 Grossfield, A., 75 Lapnish, M., 211, 216 Larsen, A.B., 37 Launay, G., 198 H Lefkowitz,R.J.,3,4,6,8 Haga, K., 17 Leurs, R., 129 Hanson, M.A., 17 Ligand binding, 7, 23, 25, 28, 38, 39, 41, 42, 45, 47, 48, Hargrave, P., 5 50, 51, 57, 63Ð65, 96Ð104, 107, 122, 142, 148, Harmar, A.J., 189 151, 153, 178, 187, 189, 199, 207 Hecker, N., 189 Ligand recognition, 9, 56, 67Ð104 Hierarchical GPCR classification, 211, 216Ð218 Ligand sets, 106, 189, 198 Hill coefficient, 161, 163Ð165, 168 Li, J., 195 Homology, 3Ð11, 16, 28, 63, 97Ð98, 103, 105, 107, 111, Lin, X., 135, 145, 147 113, 117, 130, 133, 136, 137, 140Ð147, 149, 151, LipidÐprotein interactions, 58, 69, 75Ð76, 79, 80 189Ð193, 197Ð199 Liu, T., 189 Homology modeling, 3Ð11, 16, 28, 97, 98, 107, 113, 128, Liu, W., 17 136, 151 Liu, X., 191 Hopf, T.A., 198 Lomize, M.A., 198 Horn, F., 189 Horn, J.N., 75 Huang, J., 17 M Hydrophobic mismatch, 62Ð65, 69, 76, 77, 79, 112, Machine learning method, 209, 211Ð214, 216, 219 113, 194 Manglik, A., 17 Mathematical modeling, 160, 177 I Matsoukas, M., 15 In silico methods, 143 Mechanistic modeling, 159Ð178 Interface, 25, 27, 41, 48, 60, 62Ð64, 66Ð69, 75, 76, 78, Membrane deformation energy, 65, 69 87, 89, 90, 113Ð120, 131, 193, 219 Membrane mismatch, 60Ð62 Index 227

Metadynamics, 47, 103Ð104, 107Ð111, 113, 117, 118 Rognan, D., 136, 149 Models, Rosenbaum, D.M., 17, 111 Molecular dynamics (MD) simulations, 45, 46, Roth, B.L., 189 49, 50, 58Ð60, 62, 63, 65Ð69, 78, 80, 81, Roux, B., 115, 116 96Ð99, 101Ð103, 105, 106, 108, 111, 113, 116, 151 Molecular mechanisms, 8, 39, 95Ð120, 151 S Mondal, S., 55, 64, 193 Safran, M., 191 Motif based search, 216 Satagopam, V.P., 195 Murakami, M., 17 Saunders, B., 195 Mysinger, M.M., 135 Scheerer, P., 17 Selvam, B., 102 Sequence similarity search, 4, 8, 10, 16, 207, 209, N 210, 216 Nemoto, W., 193 Serotonin receptor, 57, 59, 104, 106, 107, 109, Niv, M.Y., 185 186Ð188, 197 Serrano-Vega, M.J., 49 Servers, 185Ð201, 216, 220 O Sgourakis, N.G., 195 Okuno, Y., 189 Shimamura, T., 17 Olender, T., 191 Shoichet, B.K., 10 Oligomerization, 25Ð27, 62Ð64, 70, 79, 94, 96, 111Ð120, Signal transduction, 15, 28, 38, 50, 56, 77, 111, 173, 175, 168, 187, 189, 190, 192Ð194, 209, 219 186, 194, 195, 206, 218, 220 Operational model, 173Ð176 Simpson, L.M., 104Ð107 Ovchinnikov, Y., 5 Sirci, F., 135, 143 Standfuss, J., 17 Stark, C., 193 P Statistical model method, 209, 211 Palczewski, K., 6, 17 Steyaert, J., 47 Parameter optimization, 177 Strader, D., 4, 8 Pardo, L., 15 Structure and remodeling of lipid membranes, 60Ð62 Park, J.H., 17 Structure-function relations in GPCRs, 58Ð65 Parrinello, M., 81 Subfamily, 18, 22, 23, 41, 207, 209, 214, 216Ð219 Partners, 4, 21, 86, 111, 185Ð201, 209 Sum, C.S., 136 Periole, X., 113, 117Ð119 Surgand, J.S., 136 PPI. See ProteinÐprotein interactions (PPI) Suwa, M., 205 PPI network classification, 210, 211 Predictions, 10, 41, 47, 65, 79, 113, 135, 145, 148, 193, 194, 196, 199Ð201, 209Ð220 T Protein function, 75, 76, 209 Taddese, B., 107 ProteinÐprotein interactions (PPI), 58, 69, 70, 75Ð76, 79, Thermostable mutants, 39, 47Ð49, 51 210, 211, 218Ð219 Tikhonova, I.G., 136 TMH. See Transmembrane helix (TMH) Topiol, S., 141 R Torrie, G.M., 114 Rahman, A., 81 Tosh, D.K., 148 Rasmussen, S.G., 17 Transmembrane helix (TMH), 16Ð21, 44, 50, Receptor constitutive activity, 165, 173Ð176 81, 117 Receptor dimer, 114, 116, 168, 192 Transmembrane proteins, 50, 62, 65, 66, 77, 199 Receptor oligomerization, 25Ð27, 62 Two-state model, 165Ð168 Renault, N., 135, 145 Repository, 194, 198, 201 Residual exposure, 62Ð70 Residual mismatch, 56 U Rhodopsin, 4Ð10, 17, 21, 22, 24, 25, 27, 44, 45, 48, Umbrella sampling, 113, 115Ð119 57, 59Ð64, 67Ð69, 75Ð90, 97Ð101, 104Ð107, 109Ð113, 117Ð120, 130, 134, 193, 194, 206, 208, 209, 217 V Roche, D., 159 Vaidehi, N., 37 Rodriguez, D., 198 Valleau, J.P., 114 228 Index

Van Durme, J., 191 Worth, C.L., 198 Virtual screening, 10, 97, 104, 106, 107, 131Ð150, 197, Wu, B., 17 198 Wu, H., 17

W X Wacker, D., 17 X-ray crystallography, 6, 7, 9, 10, 106, Wang, C., 17 120, 197 Wang, K., 3 Xu, F., 17 Wang, T., 99, 101, 102 Wang, Y., 189 Warne, T., 17 Z Weinstein, H., 16, 55 Zachmann, J., 15 Weinstein, J., 45 Zhang, C., 17 Wenstein, H., 219 Zhang, J., 198 White, J.F., 17 Zhang, Y., 198 Wiener, A., 191 Zhu, F., 189