University of Calgary PRISM: University of Calgary's Digital Repository

Graduate Studies The Vault: Electronic Theses and Dissertations

2020-07-30 Elucidating the Interplay Between Lipids and Membrane Proteins Using Multiscale Computer Simulations

Sejdiu, Besian I.

Sejdiu, B. I. (2020). Elucidating the Interplay Between Lipids and Membrane Proteins Using Multiscale Computer Simulations (Unpublished doctoral thesis). University of Calgary, Calgary, AB. http://hdl.handle.net/1880/112372 doctoral thesis

University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. Downloaded from PRISM: https://prism.ucalgary.ca

UNIVERSITY OF CALGARY

Elucidating the Interplay Between Lipids and Membrane Proteins Using Multiscale Computer Simulations

by

Besian I. Sejdiu

A THESIS

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

GRADUATE PROGRAM IN BIOLOGICAL SCIENCES

CALGARY, ALBERTA

JULY, 2020

© Besian I. Sejdiu 2020

Abstract

Biological membranes are complex cellular structures formed by a large number of different lipid types that also contain a variety of bound proteins, carbohydrates, and other molecules. The detailed orchestration of all these elements has been a major focus of scientific research during the last five decades. Computer-based methods, such as molecular dynamics (MD) simulations, have proven to be a valuable approach in addressing many of the details of lipid organization and membrane protein activity. I used MD simulations at both atomistic and coarse-grained levels of detail to study the many ways in which lipids and proteins interact and the possible functional ramifications of these interactions. In part of my work, I studied the interaction of G Protein-Coupled Receptors (GPCRs) with lipids at a family-wide level. Many other computational studies had shown specific lipid-protein interactions for a handful of GPCRs, but with quite different outcomes regarding their number, location, and lipid identity. In my work, I simulated 28 different GPCR structures and showed that they are distinguished by a unique interaction profile with membrane lipids. I provided a comprehensive analysis of simulation results alongside available crystallographic data. I also studied the lipid-protein interaction profile of AMPA receptors and cyclooxygenases (mainly COX-1), showing that they both form specific interactions with lipids, but do so in quite different fashions. AMPA receptors interact specifically with diacylglycerol lipids, whereas COX-1 enzymes do so indiscriminately with glycerophospholipids, cholesterol, and fatty acids, but at different levels of interaction strength. Using atomistic simulations, we show the binding pathway of arachidonic acid to COX-1 and identify a series of arginine residues that guide it toward the hydrophobic cavity of the enzyme.
As part of my work, I also developed a webserver that automates the analysis and visualization of lipid-protein interactions from MD simulations allowing for the creation of automated pipelines to study lipid-protein interactions in the future. Lastly, I provide a short review of some of the main challenges facing the field along with possible solutions going forward. My work expands our understanding of lipid-protein interactions.


Preface

The work presented in this thesis contains four chapters of original research and two reviews, also original work, on the field of lipid-protein interactions. Below I outline the content of each chapter and its status in terms of availability to the general scientific community.

Chapter 1 provides a brief introduction to the biology of the problem and the methods I use. It is meant to serve as a general overview of lipid-protein interactions and as an intuitive guide to MD simulations. Its primary purpose is to allow for an easier reading of later chapters.

Chapter 2 includes a detailed review of GPCR-lipid interactions as derived primarily from MD simulations. It contains my contribution to a large and comprehensive review article that was published in Chemical Reviews in 2019. I also reviewed the corresponding literature on bacterial mechanosensitive channels, but that is not included here.

Chapter 3 presents a study of the lipid-protein interaction profile of AMPA receptors. The work presented in this chapter highlights part of my contribution to a wider collaborative effort that was published in ACS Central Science in 2018.

Chapter 4 details my work on the lipid-protein interaction profile of GPCRs. In it, I provide a comparative study of the interaction of 28 different GPCR structures with their lipid environment. Appendix A presents the supplementary information for this chapter. This work has been published in the Biophysical Journal.

Chapter 5 includes a detailed study of the lipid interaction profile of COX-1 enzymes. In it, I show how these enzymes bind different membrane lipids in their hydrophobic cavity. I also detail the binding pathway of arachidonic acid to the same binding site. Appendix B contains the supplementary information for this chapter. This work has been made available online on bioRxiv and will be submitted for peer review soon.

Chapter 6 presents a webserver/software that I developed which aims at automating the analysis and visualization of lipid-protein interactions. The chapter provides an overview of some of the analysis applications that have been implemented and discusses the advantages they provide. The manuscript for this work has been completed and will be submitted soon.

Chapter 7 provides a short review of the field of lipid-protein interactions. It focuses mainly on the challenges and obstacles facing the field, as well as providing novel analysis and a discussion on how the field should move forward. This work will be submitted for consideration to The Journal of Chemical Physics soon.


Chapter 8 contains a brief summary of conclusions reached for all chapters.

The following is a list of papers and reviews published during my PhD(1-4):

1. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. ACS Central Sci. 2018;4(6):709-717.

2. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

3. Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions Are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020;118(8):1887-1900.

4. Sejdiu B. I., D. P. Tieleman. COX-1 - lipid interactions: arachidonic acid, cholesterol, and phospholipid binding to the membrane binding domain of COX-1. bioRxiv. 2020.

My contributions to each of the chapters presented in this thesis will be highlighted at the beginning of each chapter.


Acknowledgement

First and foremost, I would like to thank my supervisor Dr. Peter Tieleman for his mentoring throughout my PhD. There are many things to appreciate about working in the Tieleman lab, but perhaps what I am most grateful for is the freedom that I was given to develop as a scientist and the (frankly, quite often misplaced) trust placed in me to know what I was doing. I was always free to pursue any project of interest to me, even if it was only tangentially research related. I am also thankful to Peter for not only allowing but actively encouraging me to express myself freely in my scientific work, to design projects and structure articles the way I saw fit, and for giving me detailed feedback on my work, to the point where I sometimes thought I had mistakenly written manuscripts using a red font.

A special thanks goes to my former colleagues in Peter’s group: Dr. Gurpreet Singh and Dr. Valentina Corradi for their help and mentorship. Valentina has been an inspiration for me throughout my PhD, and her dedication, organized planning and friendliness are things I will try to emulate in my future career. Similarly, I thank Dr. Anastassiia Moussatova for her support, advice and encouragement, and all other members of the Tieleman lab for their help, collaboration, and friendship.

Thanks to my supervisory committee members, Dr. Kenneth Ng and Dr. Justin MacCallum for their advice and mentorship and for always being accommodating to my requests for meetings. I also appreciate the help of the supporting staff and program coordinators at the University of Calgary for making all the dreadful administrative paperwork so easy to deal with.

The most important and most heartfelt appreciation goes to my parents for all the sacrifices they made and continue to make for me, my family, and my education. Both are much smarter than me yet neither got to pursue their passion for academic advancement to their desired extent. Their support towards my goals and dreams has been immense, unwavering, consistent, and so far, almost entirely one-sided.

Last but not least, I thank my brothers for their love and support and my wife for guiding and always being there for me. Liridona, you left everything behind to follow me and I can never thank you enough for that. You created a home for us and provided far more than your fair share of work to maintain it. Thanks as well to my son, Lirian, who is the greatest motivation and inspiration for me. I hope I can provide half the support to him that my parents gave to me.


To my mother and father

Dedicated to my parents. I love you very much.


Table of Contents

Abstract
Preface
Acknowledgement
Table of Contents
List of Figures and illustrations
List of Tables

Chapter One: General Introduction
1.1 Cell membranes
1.2 Membranes and lipid-protein interactions
1.3 Molecular Dynamics Simulations
1.3.1 Theoretical framework
1.3.2 Statement of the objective
1.3.3 Motivation and functional form of the potential energy
1.3.4 Periodic boundary conditions
1.3.5 Handling electrostatic calculations
1.3.6 Temperature and pressure coupling
1.3.7 Numerical integration
1.3.8 The MARTINI model
1.3.9 Using MARTINI to study lipid-protein interactions
1.4 References

Chapter Two: Emerging Diversity in Lipid-Protein Interactions
2.1 Abstract
2.2 G protein-coupled receptors (GPCRs)
2.2.1 GPCR – Lipid interactions
2.2.2 Rhodopsin
2.2.3 Adrenergic receptors
2.2.4 Adenosine receptors
2.2.5 Serotonin receptors
2.2.6 Other GPCRs
2.3 GPCR scramblase activity and lipid entry events
2.4 GPCR oligomerization
2.5 References

Chapter Three: Lipid-protein interactions are unique fingerprints for AMPA receptors
3.1 Introduction
3.2 Methods
3.3 Results
3.3.1 AMPA receptors interact with different lipid types at the membrane spanning domain
3.3.2 Interaction heatmaps reveal the interaction diversity of AMPA receptors with lipids
3.3.3 Cholesterol and DAG lipids form specific interactions with AMPA receptors
3.4 Discussion
3.5 References

Chapter Four: Lipid-Protein Interactions are a Unique Property and Defining Feature of G Protein-Coupled Receptors
4.1 Abstract
4.2 Significance
4.3 Introduction
4.4 Methods
4.5 Results
4.5.1 The lipid environment near GPCRs is distinctly different from the bulk membrane composition
4.5.2 2D density profiles reveal a highly localised cholesterol distribution around GPCRs
4.5.3 GPCRs induce a unique local membrane environment
4.5.4 GPCRs interact specifically with PIP lipids on the intracellular surface
4.5.5 Cholesterol interactions are a unique identifier for GPCRs
4.5.6 GPCR-lipid interactions are dependent on the conformational state of the
4.6 Discussion
4.7 Conclusions
4.8 Supporting Material
4.8.1 Supporting Citations
4.9 References

Chapter Five: COX-1 – lipid interactions: arachidonic acid, cholesterol, and phospholipid binding to the membrane binding domain of COX-1
5.1 Abstract
5.2 Introduction
5.3 Methods
5.4 Results
5.4.1 Characterization of arachidonic acid binding to COX-1
5.4.2 Coarse-grained MD simulations reveal a complex interplay of COX-1 enzymes with lipids
5.4.3 Atomistic simulations of cholesterol interactions with COX-1
5.4.4 COX-1 induces a positive curvature on the surrounding lipid environment
5.5 Discussion
5.6 Acknowledgements
5.7 Data availability
5.8 References

Chapter Six: ProLint: a web-based framework for the automated data analysis and visualization of lipid-protein interactions
6.1 Introduction
6.2 Testing and Validation
6.3 Results
6.3.1 Point representation of lipid-protein contact information
6.3.2 Network graph representation of lipid-protein interactions
6.3.3 2D density map are commonly used but can also be misleading
6.3.4 Visualizing 3D densities with complete spatial information
6.3.5 Membrane proteins induce structural changes to their local membrane environment
6.3.6 Interactive sequence heatmaps highlight the conservation of interaction sites
6.4 Conclusion
6.5 References

Chapter Seven: Lipid – Protein Interactions: Current Status and Future Outlook
7.1 Abstract
7.2 Introduction
7.3 Ensuring convergence of lipid distributions
7.4 When does a lipid become bound to a protein?
7.5 The full picture
7.6 Conclusion
7.7 References

Chapter Eight: Conclusions
8.1 Conclusions
8.2 Outlook

Appendix A: Supplementary Data for Chapter 4
A.1 References
Appendix B: Supplementary Data for Chapter 5
Appendix C: Copyright Permissions
C.1 Permissions from Biophysical Journal (CellPress)
C.2 Permissions from ACS

List of Figures and illustrations

Figure 1-1. The structure of three different lipid types.
Figure 1-2. Using the MARTINI model to study lipid-protein interactions.
Figure 1-3. Example of lipid-protein interaction analysis.
Figure 2-1. Representative structures of GPCRs.
Figure 2-2. The effect of cholesterol on β2AR activation.
Figure 2-3. Experimental structures of GPCR dimers.
Figure 3-1. System setup and the distribution of lipids.
Figure 3-2. AMPA receptors interaction heatmaps.
Figure 3-3. Density and distance measurements.
Figure 4-1. GPCR Depletion-Enrichment (DE) index data as derived from our simulations.
Figure 4-2. 2D density profiles.
Figure 4-3. GPCR curvature analysis.
Figure 4-4. Sequence heatmaps of GPCR – PIP lipid interactions.
Figure 4-5. CXCR1 – PIP lipid interactions.
Figure 4-6. GPCR-cholesterol interaction for a sample of eight GPCRs shown as a surface presentation of cholesterol contact durations.
Figure 4-7. Cholesterol binding to angiotensin receptor (AT2R).
Figure 4-8. Cholesterol interactions with RhodRi and RhodRa.
Figure 4-9. Overview of cholesterol binding sites.
Figure 5-1. Structure of the ovine Prostaglandin endoperoxide H synthase-1.
Figure 5-2. Arachidonic Acid (ARAN) binding to the MBD of COX-1 in AA simulations.
Figure 5-3. COX-1 interaction with lipids in ER-like membranes.
Figure 5-4. Cholesterol binding to the MBD of COX-1 in CG simulations.
Figure 5-5. Cholesterol binding to the MBD of COX-1 in AA simulations.
Figure 5-6. Curvature inducing property of COX-1.
Figure 5-7. Arachidonic acid binding pathway.
Figure 6-1. Schematic overview of ProLint.
Figure 6-2. Snapshot showing the interaction of Smoothened with cholesterol.
Figure 6-3. Network graph representation of lipid-protein interactions.
Figure 6-4. 2D density profile analyses of lipid-protein interactions for 5HT1B.
Figure 6-5. Visualizing lipid-protein interactions in 3D space for SMO.
Figure 6-6. Thickness and curvature visualization of lipid-protein interactions.
Figure 6-7. Aligned sequence heatmaps for GPCR-cholesterol interactions.
Figure 7-1. Measuring convergence of lipid distributions.
Figure 7-2. Overview of cholesterol – COX-1 interactions.
Figure A-1. Convergence of the number of lipids (Lipid Count) during the course of the simulation.
Figure A-2. β2ARi-cholesterol interactions.
Figure A-3. P-P plots of DE indices.
Figure A-4. Cholesterol 2D density profile.
Figure A-5. FS lipids density profile.
Figure A-6. PU density profile.
Figure A-7. Gaussian curvature (KG) maps.
Figure A-8. Mean curvature (KM) maps.
Figure A-9. GPCR membrane thickness.
Figure A-10. PIP lipid - TM helix interactions.
Figure A-11. CXCR1-PIP interactions.
Figure A-12. AT2R-cholesterol interactions.
Figure A-13. Cholesterol interactions with A2AAR and β2AR.
Figure A-14. Cholesterol interactions with Serotonin and μOR.
Figure A-15. Cholesterol interactions with CB1R, chemokine receptor, ETBR, and US28.
Figure A-16. Cholesterol interactions with other GPCR.
Figure A-17. GPCR-lipid interactions.
Figure B-1. Cholesterol binding to the MBD of COX-1 in CG simulations.
Figure B-2. 2D density profiles for ER-like membrane simulations.
Figure B-3. POPC lipid binding to the MBD of COX-1 in AA simulations.
Figure B-4. Coarse-Grained order parameters for some multicomponent systems.
Figure B-5. Lipid count in the close proximity of embedded COX-1 proteins in the complex system.
Figure B-6. Lipid count within 7Å of COX-1 in the complex membrane system.


List of Tables

Table A-1. Overview of all GPCR structures simulated.

Table B-1. Detailed overview of simulated systems. For each system we show the lipid composition as well as the total simulation time.


Chapter One: General Introduction

1.1 Cell membranes

Life on Earth originated around 4 billion years ago. We know that early conditions on Earth allowed for the formation of amino acids and other organic molecules from completely inorganic ingredients. Yet that in itself, while necessary, would very likely be insufficient to allow for the creation of any life form. Instead, what may have proved to be the necessary catalyst for the creation and propagation of ‘life’ could have been the formation of membranes: entities that compartmentalize space. This process would allow for the confinement of organic molecules and their escape from outside conditions. By balancing the pressure between compartments, membranes also enable the creation of different chemical gradients. Thermodynamically, this could accelerate the creation of more complex molecules and eventually lead to the formation of early self-replicating complexes.

From the formation of early life forms on Earth to their subsequent evolution by natural selection to the diverse organisms we observe and see going extinct today, 4 billion years have passed. From the time of hominids in the African savanna to the one writing this text, our understanding of the biology of life on Earth has increased tremendously.

Current cell membranes are unlike anything their early ancestors may have been, and far more complex than they were modeled to be five decades ago(1). During the last 50 years, our view of cell membranes has shifted from that of a passive solvent that acts as a support for embedded proteins and a filter for permeant molecules, to that of a complex medium that plays an active role in many cellular processes, including the activity of membrane proteins. The following work is an investigation of membrane protein function and activity as a function of their membrane environment. We employ state-of-the-art multiscale computer simulations to study the interplay between lipids and membrane proteins, highlight the diverse array of interactions between them, and note their functional relevance.

1.2 Membranes and lipid-protein interactions

Cellular membranes cover the entire cell and the individual organelles inside it. They define their circumference and allow for the creation of an inside environment different from the surrounding space. Membranes serve a dual role in this regard: compartmentalizing space and maintaining the integrity of the inside environment. They are soft and flexible structures that act as diffusion filters to in- and outbound molecular traffic. Their semi-permeability is both size and hydrophobicity dependent. The ability of membranes to discriminate between molecules, either intrinsically through passive diffusion or indirectly through embedded proteins, enables cells and organelles to create electrochemical gradients between the individual compartments that are essential for allowing the diversification and decentralization of cell activities. For instance, lysosomes are small organelles with a pH significantly lower than that of the cytoplasm that keep cells clear of unnecessary or waste products. Vacuoles serve as water storage in plant cells. The mitochondrion utilizes an electrochemical gradient to generate the energy needed by cells. Many other examples exist.

It is the importance of membranes in keeping the integrity of internal compartments that underlies their structure and lateral organization. Membranes are composed of amphiphilic molecules called lipids, arranged in a tail-to-tail manner in a planar configuration with the height vector of the membrane aligned parallel to the plane normal. The variety of lipids that partake in the structure of cellular membranes is in the thousands(2, 3). All lipids, however, are defined by a hydrophilic and a lipophilic side, which form their headgroup and tail end, respectively. In this respect, mammalian membranes are composed of three main lipid types: glycerophospholipids, sphingolipids, and sterols. Cells, however, include a huge variety of carbohydrate modifications of these lipids and other variants, many of which are species- and even cell-specific. Many erythrocyte surface antigens, for instance, including part of the blood type system, are carbohydrates attached to lipids(4).

Owing to this compositional complexity, the lateral structure of membranes is complex and not yet completely understood. Despite decades of research, we still lack a clear understanding of how membranes ‘work’. We understand their structure and can model it in sub-angstrom detail, yet membranes are highly dynamic systems placed in an environment that constantly pushes them out of equilibrium(5). The traditional picture of cell membranes leaves us with a static image of lipids that fails to appreciate the variations in and complexity of lipid distributions as well as crowding due to membrane proteins. A dynamic view, in contrast, includes membrane undulations, the degree of which depends on lipid type (e.g. sterols) and protein density; variable membrane thickness around proteins depending on lipid tail type; varying diffusion rates and lateral distribution of lipid types as a function of distance to embedded proteins; protein-mediated or protein-independent lipid flip-flop; creation of transient functional domains; the underlying cytoskeleton; etc.

Figure 1-1 shows three different lipid types representing glycerophospholipids (1-palmitoyl-2-oleoylphosphatidylcholine, POPC), sphingolipids (N-Palmitoylsphingomyelin, PSM), and sterols (cholesterol) to highlight the different lipid components of cell membranes.


Figure 1-1. The structure of three different lipid types. N-Palmitoylsphingomyelin (PSM) represents an example of sphingolipids, 1-palmitoyl-2-oleoylphosphatidylcholine (POPC) represents an example of glycerophospholipids and cholesterol (CHOL) represents an example of sterols. For clarity, hydrogen atoms are not displayed.

All lipids have a hydrophilic headgroup and a hydrophobic lipid tail. Both PSM and POPC have a zwitterionic phosphocholine headgroup, whereas cholesterol has a small hydroxyl end for its headgroup. The backbone of glycerophospholipids and sphingolipids is formed by glycerol and sphingosine, respectively. In contrast, cholesterol contains a rigid ring structure and its concentration in plasma membranes regulates their fluidity. The large number of lipids that we observe is due to four types of variations: different headgroups, different modifications of the same headgroup, different tail lengths, and different degrees of desaturation of lipid tails of the same length. The type and relative ratios of lipid components determine membrane properties. Glycerophospholipids are the most abundant lipid types. They can have headgroups that are neutral, polar, zwitterionic, or charged and can give the overall membrane, for example, a net charge, with important implications for protein binding and function. Cholesterol content can vary from a few percent in endoplasmic reticulum membranes(6) up to one third of membrane lipids. Thanks to its rigid ring structure, cholesterol packs against the acyl chain tails of other lipids, thereby influencing the fluidity of the overall membrane. Sphingolipids, along with cholesterol, form organized lateral domains with functional importance(7).


Based on their membrane insertion depth, membrane proteins can be peripheral, in which case they are anchored on one side of the membrane and do not traverse it fully, or integral, in which case they span the entire membrane height at least once (commonly multiple times). This distinction, however, is also functional: there are proteins that are anchored on only one membrane leaflet but never dissociate from it, and these are called monotopic membrane proteins. Prominent examples of the latter are the cyclooxygenases (COX-1 and COX-2). The totality of membrane protein activity is a major determinant of cell activity. Based on genome studies, it is estimated that a quarter of encoded proteins are membrane proteins(8). They regulate the incoming and outgoing ion and molecular traffic, mediate the exchange of energy and metabolites between compartments, and enable the highly specialized functions of different cells: synaptic release due to an action potential in neurons, phagocytosis of cellular waste and foreign material by macrophages, gamete fusion during fertilization for ova, anti-thrombotic and fibrinolytic activity of blood vessel endothelial cells, and many others.

An increasing body of evidence assigns an active role to the membrane and its lipid components in regulating, modulating, or otherwise affecting the activity of membrane proteins. Lipids exert their influence on membrane proteins by binding specifically to them, by accessing deep crevices and grooves on their surface, sometimes by entirely entering the interior of the proteins in a - like fashion, and in yet other cases by changing membrane mechanical properties, leading to a shift in the conformational equilibrium of the proteins, changes to their gating mechanism, and many other effects(5, 9, 10). This reciprocal interplay between membrane proteins and lipids is referred to as lipid-protein interactions.

Due to the time and length scales at which lipids interact with proteins, coupled with general challenges in expressing, purifying, and studying membrane proteins, experimental studies have faced a barrier in accessing relevant details of their interplay. Nevertheless, results converging from many experiments have provided invaluable insight into lipid-protein interactions. Efforts from X-ray crystallography and cryo-electron microscopy (cryo-EM) have revealed the structures of many proteins with bound lipids(10). These co-crystallized lipids hint towards a specific interaction with proteins. Within the family of G Protein-Coupled Receptors (GPCRs), co-crystallization with cholesterol has been observed for many receptors(11-15), and for the adenosine A2A receptor, 27 out of 45 solved structures contain bound cholesterol(16). Many other proteins have also been resolved with bound lipids. Importantly, other experimental techniques have also been used to characterize lipid-protein interactions: nuclear magnetic resonance, atomic force microscopy, mass spectrometry, different spectroscopy methods, biochemical assays, etc.(10)

In contrast, molecular dynamics (MD) simulations do not face any of these experimental challenges and allow for the detailed investigation of lipid-protein interactions. By creating an accurate structural representation of membrane proteins and lipid bilayers, MD simulations allow us to track the motion of these components in space and time. They have their own challenges and limitations, which will be discussed in the next section. Despite that, the use of MD simulations along with other computational tools to study lipid-protein interactions has proven to be incredibly fruitful. Thanks to decades of research, the lipid-interaction profile of many proteins has been characterized, leading to the current picture of cellular membranes and their lipid components as tightly involved in membrane protein structure and function. The current thesis presents work directed towards this objective: characterizing the ways in which lipids and proteins interact with each other and understanding the driving forces that lead to their preferential interaction.

1.3 Molecular Dynamics Simulations

Molecular Dynamics (MD) simulations have become one of the most widely used tools to address biologically relevant questions. Current state-of-the-art simulations reach length scales of hundreds of nanometers and timescales of hundreds of microseconds. The following is a brief overview of the main concepts in the field.

1.3.1 Theoretical framework

The work presented here, with the exception of software development efforts in chapter 6, makes exclusive use of MD simulations to address various aspects of lipid-protein interactions. Depending on the setup, force field, and problem addressed, a variety of different simulation parameters have been employed. This has also been done to utilize the massively parallel processing power of GPUs. A detailed description of these parameters would necessitate a lengthy excursion into the theory of computer simulation and numerical analysis. Considering the number of dedicated works already published on these subjects, however, we believe it is more appropriate to provide only a brief overview of the field, with an intuitive explanation of key concepts. The material presented here, as well as the relevant equations, is based on and adapted from the GROMACS user manual(17, 18) and the Computer Simulation of Liquids textbook(19), with the help of other sources(20, 21) as well.

1.3.2 Statement of the objective

Our goal is to answer biologically relevant questions by simulating the actions of atoms and molecules using computers. We do this by integrating known principles on the motion and interaction of particles with our understanding of how to effectively translate those principles into computer-understandable instructions. As such, MD simulation operates at the intersection of classical mechanics, numerical analysis, and high-performance computing.

1.3.3 Motivation and functional form of the potential energy

The total internal energy is the sum of the kinetic ($K$) and potential ($V$) energy of the system. If we use $\mathbf{q}_i$ and $\mathbf{p}_i$ to denote the coordinates and momenta of the particles in a system, respectively, then we can write the Hamiltonian (which is usually equal to the total internal energy) as:

$$H(\mathbf{q}, \mathbf{p}) = K(\mathbf{p}) + V(\mathbf{q}) \tag{1}$$

where $\mathbf{q}$ and $\mathbf{p}$ are defined for each particle $i$ over $N$ total particles:

$$\mathbf{p} = (\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \ldots, \mathbf{p}_N) \tag{2}$$

$$\mathbf{q} = (\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3, \ldots, \mathbf{q}_N) \tag{3}$$

The total kinetic energy of the N-particle system would be:

$$K = \sum_{i=1}^{N} \frac{1}{2} m_i v_i^2 = \sum_{i=1}^{N} \frac{p_i^2}{2 m_i} \tag{4}$$

In contrast to $K$, the calculation of the potential energy is much more difficult as it is a function of particle coordinates and their interactions. It is also the most critical function in computer simulations. A general form of the potential energy could be written as follows:

$$V = \sum_{i=1}^{N} V_1(\mathbf{r}_i) + \sum_{i=1}^{N} \sum_{j=i+1}^{N} V_2(\mathbf{r}_i, \mathbf{r}_j) + \sum_{i=1}^{N} \sum_{j=i+1}^{N} \sum_{k=j+1}^{N} V_3(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k) + \cdots \tag{5}$$

where the first term models the application of external forces, and the remaining terms describe particle interactions(19). If we assume no external potential, we can set the first term to zero and eliminate it. The potential is then a function of the interactions between pairs, triplets, quartets, quintets, and so on of particles, each with a progressively smaller contribution to the total potential energy. The most important of these terms, and the one on which MD simulations are based for efficiency reasons, is the pair potential. While it would be possible to model the potential energy entirely on pairwise interactions, the usual approach is to distinguish chemically bonded atoms from other atoms and model their contribution separately. This is commonly done by defining bonds between atoms $i$ and $j$, angles between atoms $i$, $j$, and $k$, and dihedrals between atoms $i$, $j$, $k$, and $l$. As such, the resultant potential function is formed by the combined contributions of these bonded and pairwise nonbonded terms(21):


$$V = \sum_{bonds} V_{stretch} + \sum_{angles} V_{bend} + \sum_{dihedrals} V_{torsion} + \sum_{pairs} V_{nonbonded} \tag{6}$$

The total potential energy is formed by summing individual contributions over all the bonds, angles, and dihedrals of molecules as well as interactions between atoms that are not directly bonded. The functional form of each term can vary depending on its implementation in different force-fields. Usually, however, atoms are considered as balls connected by springs, allowing the bonded terms, which are invariably intramolecular, to be modeled using harmonic potentials. Nonbonded terms model the intra- and intermolecular interactions that result from dispersion-repulsion (using a Lennard-Jones potential) and electrostatics (using a Coulomb potential). The prototypical form of $V$ is as follows(20):

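$$
\begin{aligned}
V = {} & \sum_{bonds} \frac{1}{2} k_i (l_i - l_{i,0})^2 + \sum_{angles} \frac{1}{2} k_i (\theta_i - \theta_{i,0})^2 \\
& + \sum_{dihedrals} \frac{1}{2} V_n \left(1 + \cos(n\omega - \gamma)\right) \\
& + \sum_{i=1}^{N} \sum_{j=i+1}^{N} \left( 4\varepsilon_{ij} \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right] + \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \right)
\end{aligned}
\tag{7}
$$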

The first term models the interaction between directly bonded atoms, with $l_{i,0}$ being the reference bond distance. The function allows bonds to deviate slightly around $l_{i,0}$ to compensate for changes to the other terms in eq. 7, but doing so carries a progressively higher energy penalty determined by the force constant ($k_i$). For atomistic simulations, I used the LINCS constraint algorithm(22) to constrain bonds involving hydrogen atoms. The second term, the angle potential, is modelled similarly using a harmonic potential, but since an angle is defined by three consecutive atoms $ijk$, it is the angle $\theta$ that is kept close to its reference value ($\theta_{i,0}$) instead. The third term models torsional or dihedral angles between four atoms. The potential form in eq. 7 depicts a proper dihedral, which is the most common and represents the angle between the planes defined by the atom triplets $ijk$ and $jkl$. Sometimes it is necessary to prevent out-of-plane bending of atoms (e.g. to maintain the planar conformation of aromatic rings) or to enforce specific stereoisomeric conformations, which is done by defining improper dihedrals(18).

The final term in eq. 7 describes nonbonded interactions. It is composed of a Lennard-Jones potential and a Coulomb potential. The former is used to model repulsion-dispersion interactions, whereas the latter models electrostatic interactions. The double summation at the front of the expression denotes that calculations are done pairwise for each pair of atoms $i$ and $j$ (excluding cases where $j = i$). The calculation of nonbonded interactions, however, is inherently a many-body problem with significant contributions from three-body interactions. Nevertheless, their direct inclusion into eq. 7 leads to a prohibitively high computational cost. In practice, careful parameterization of force-fields allows for the inclusion of many-body effects into the pairwise potential. As such, the nonbonded potentials commonly used are in fact effective pairwise potentials. Doing so, however, comes at a price: two-body potentials depend only on the respective coordinates of atoms $i$ and $j$ (specifically, their distance $r_{ij}$), whereas the effective pair potential may depend on various macroscopic properties (e.g. temperature), depending on the target experimental values used during parameterization(19).

The process of assigning values to all the parameters in eq. 7 is called parameterization, and the result is called a force-field. There are many different force-fields that are used in MD simulations with different parameterization processes and different objectives that can be used to model real-world problems.
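As an illustration of the individual terms in eq. 7, the following is a minimal Python sketch and not the implementation used by any MD package. The function names are hypothetical, and the Coulomb prefactor assumes GROMACS-style units (kJ/mol, nm, elementary charges).

```python
import math

# Assumption: 1/(4*pi*eps0) expressed in kJ mol^-1 nm e^-2 (GROMACS-style units).
COULOMB_CONST = 138.935458


def harmonic(k, x, x0):
    """Harmonic bond or angle term: 0.5 * k * (x - x0)^2."""
    return 0.5 * k * (x - x0) ** 2


def proper_dihedral(v_n, n, omega, gamma):
    """Periodic proper dihedral term: 0.5 * V_n * (1 + cos(n*omega - gamma))."""
    return 0.5 * v_n * (1.0 + math.cos(n * omega - gamma))


def lennard_jones(eps, sigma, r):
    """Lennard-Jones 12-6 term for a single pair at distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def coulomb(qi, qj, r):
    """Coulomb term for a single pair of point charges at distance r."""
    return COULOMB_CONST * qi * qj / r
```

The Lennard-Jones term is zero at r = σ and reaches its minimum of -ε at r = 2^(1/6) σ, which is a convenient sanity check on any parameter set.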

1.3.4 Periodic boundary conditions

Before the simulation can start, proteins, lipids, ions, ligands, and everything else that forms our system of interest need to be placed in a simulation box, which defines the system boundaries. One question is how to physically treat those boundaries. Leaving them as simple walls, for instance, would have disastrous effects on liquid simulations. One approach that is commonly used in MD, and that I used for all my simulations, is to use periodic boundaries. The system box is surrounded in all directions by translated copies of itself. Hence, it does not have any walls (i.e. potentials mimicking physical walls). Instead, when a molecule leaves the simulation box on one side, the same molecule enters the system on the opposite side.

An important requirement to achieve periodic boundaries is that the simulation box needs to be space-filling. In my simulations, I have used a rectangular box. The use of periodic boundary conditions also influences the way nonbonded interactions are calculated because we now also have to consider all the periodic images. For instance, to calculate short-range interactions for a molecule located at one of the box edges, we also look at the molecules surrounding it in the neighboring images, with each particle interacting only with the nearest image of every other particle. This way of counting is referred to as the minimum image convention. In practice, a cut-off is also used, beyond which interactions are set to zero. This, of course, changes the calculated thermodynamic properties, but they can be recovered at a later point. The application of periodic boundary conditions also enforces a minimum size for the simulation box because a molecule in one box should not ‘see’ its own image or the same molecule twice. For this reason, cut-offs should be smaller than half the smallest edge of the box.
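The minimum image convention for a rectangular box can be sketched in a few lines of Python. This is an illustrative example only; the function names are hypothetical and production codes use optimized neighbor-search schemes instead.

```python
import math


def minimum_image_vector(ri, rj, box):
    """Shortest vector from ri to rj in a rectangular periodic box."""
    vec = []
    for a, b, length in zip(ri, rj, box):
        d = b - a
        # Shift by a whole number of box lengths so that d refers to the nearest image.
        d -= length * round(d / length)
        vec.append(d)
    return vec


def minimum_image_distance(ri, rj, box):
    """Distance between ri and the nearest periodic image of rj."""
    return math.sqrt(sum(d * d for d in minimum_image_vector(ri, rj, box)))


def cutoff_fits_box(r_cut, box):
    """True if the cut-off is smaller than half the smallest box edge."""
    return r_cut < 0.5 * min(box)
```

For example, two particles at x = 0.5 nm and x = 9.5 nm in a 10 nm box are 1 nm apart under this convention, not 9 nm.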


1.3.5 Handling electrostatic calculations

When solving the expressions for bonded interactions, the list of atoms for which bond, angle, and dihedral potentials must be calculated is defined in the system topology and remains fixed. In contrast, nonbonded interactions are defined spatially, with the dependence being some inverse power of the interatomic distance. Short-range interactions (van der Waals or repulsion-dispersion interactions) are treated easily using a cut-off as described above. The main challenge in MD simulations lies in calculating long-range (electrostatic) interactions, as they fall off very slowly with increasing distance (1/r). Satisfying the half-of-box criterion, as for short-range interactions, is only possible for very large simulation boxes, and applying a straight cut-off would lead to serious artifacts.

The solution most commonly employed in MD simulations, and the one I used for all atomistic simulations, is based on the Ewald summation method. This method allows for an accurate and efficient treatment of electrostatic interactions in periodic systems. It sums the interactions between a particle and all other particles in the box as well as all the particles in the periodic images, by dividing the summation into a real-space and a reciprocal-space part. Carrying out the full Ewald summation directly is not recommended because it scales poorly with system size and the summation in reciprocal space is computationally demanding. This is overcome using the particle-mesh Ewald (PME) method(23), whereby point charges are distributed onto a grid and fast Fourier transforms (FFTs) are used to calculate the reciprocal-space part of the Ewald summation. The following are example parameters to show how nonbonded interactions were treated in my atomistic simulations:

; neighbor searching
cutoff-scheme = Verlet
nstlist = 20
rlist = 1.2
; electrostatic interactions
coulombtype = pme
rcoulomb = 1.2
; van der Waals interactions
vdwtype = Cut-off
vdw-modifier = Force-switch
rvdw_switch = 1.0
rvdw = 1.2
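The division into a real-space and a reciprocal-space part rests on the identity 1/r = erfc(βr)/r + erf(βr)/r. The short sketch below, which is only an illustration, evaluates both parts for a single pair; the function name and the splitting parameter β are assumptions made for the example and are not taken from the simulations described here.

```python
import math


def ewald_split(r, beta):
    """Split 1/r into a short-range and a long-range part.

    The short-range part, erfc(beta*r)/r, decays quickly and can be summed in
    real space within a cut-off. The long-range part, erf(beta*r)/r, is smooth
    and is the piece handled in reciprocal space (on a grid in PME).
    """
    short_range = math.erfc(beta * r) / r
    long_range = math.erf(beta * r) / r
    return short_range, long_range
```

With an illustrative β of 2.6 nm^-1, the short-range part is already negligible at the 1.2 nm cut-off used in the parameters above, which is why truncating it there is safe while truncating the bare 1/r term is not.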

Another method to treat electrostatic interactions is called reaction field(24). It uses a cut-off distance within which interactions are calculated explicitly, whereas molecules that fall outside of the cut-off are modelled as a dielectric continuum with constant permittivity. Movement of particles in and out of the cut-off can introduce significant discontinuities in the calculated forces, which can be mitigated by using modifier functions. I used reaction field for some of the simulations with the MARTINI model.
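A plain reaction-field pair energy can be sketched as follows. This is a simplified illustration that uses the basic reaction-field expression with a potential shifted to zero at the cut-off, not the generalized form of reference 24; the function name, the unit convention, and any numerical values passed to it are assumptions.

```python
# Assumption: 1/(4*pi*eps0) in kJ mol^-1 nm e^-2 (GROMACS-style units).
COULOMB_CONST = 138.935458


def reaction_field_energy(qi, qj, r, r_cut, eps_r, eps_rf):
    """Coulomb energy of one pair with a plain reaction-field correction.

    eps_r is the relative permittivity inside the cut-off sphere and eps_rf
    that of the dielectric continuum beyond it. The constant c_rf shifts the
    potential so that it is zero at r_cut.
    """
    if r >= r_cut:
        return 0.0
    k_rf = (eps_rf - eps_r) / ((2.0 * eps_rf + eps_r) * r_cut ** 3)
    c_rf = 1.0 / r_cut + k_rf * r_cut ** 2
    return COULOMB_CONST * qi * qj / eps_r * (1.0 / r + k_rf * r * r - c_rf)
```

Because the energy goes continuously to zero at the cut-off, particles crossing it do not cause a jump in the potential, which is the motivation for the shift term.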

1.3.6 Temperature and pressure coupling

All of my simulations employ the isothermal-isobaric (NPT) ensemble, which keeps the temperature and pressure (in addition to the number of atoms) constant. These conditions most closely resemble usual experimental conditions. There are several algorithms that can be used to keep both the temperature and pressure constant during MD simulations. One of the most common methods to couple temperature is the ‘weak coupling’ algorithm developed by Berendsen(25), which uses an external heat bath to maintain a constant temperature at $T_0$. Deviations are corrected using:

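$$\frac{dT}{dt} = \frac{T_0 - T}{\tau} \tag{8}$$

The temperature is corrected by scaling velocities by a factor $\lambda$:

$$\lambda^2 = \left[ 1 + \frac{n_{TC} \Delta t}{\tau_T} \left\{ \frac{T_0}{T(t - \frac{1}{2}\Delta t)} - 1 \right\} \right] \tag{9}$$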

$\tau_T$ is the time constant for coupling, and higher values imply a weaker coupling(17). The kinetic energy distribution obtained using the Berendsen thermostat is incorrect. This is fixed, however, by the closely related velocity-rescale thermostat(26). I used the velocity-rescale thermostat for MARTINI simulations. For atomistic simulations we mainly used the Nosé-Hoover thermostat(27, 28), in which the thermal reservoir has its own degree of freedom and is incorporated into the system Hamiltonian.
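The weak-coupling scaling factor of eq. 9 can be sketched in Python as below. The function names are hypothetical, and the relaxation loop is a toy model that tracks only the temperature rather than an actual set of velocities.

```python
import math


def berendsen_lambda(T_current, T_ref, dt, tau_t, n_tc=1):
    """Velocity scaling factor lambda from the weak-coupling expression (eq. 9)."""
    return math.sqrt(1.0 + (n_tc * dt / tau_t) * (T_ref / T_current - 1.0))


def relax_temperature(T_start, T_ref, dt, tau_t, n_steps):
    """Toy model: apply the scaling repeatedly to a temperature value.

    Scaling all velocities by lambda scales the kinetic energy, and hence the
    temperature, by lambda squared.
    """
    T = T_start
    for _ in range(n_steps):
        T *= berendsen_lambda(T, T_ref, dt, tau_t) ** 2
    return T
```

A system that is too cold gets λ > 1 and one that is too hot gets λ < 1, so the temperature relaxes exponentially towards the reference value with time constant τ_T.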

Constant pressure can also be maintained by coupling to a “pressure bath” using the Berendsen algorithm, in analogy to temperature coupling and with very similar equations. Instead of velocities, however, it is the coordinates and box vectors that are scaled to maintain constant pressure.

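$$\frac{dP}{dt} = \frac{P_0 - P}{\tau_p} \tag{10}$$

Scaling is done using the following scale factor:

$$\mu_{ij} = \delta_{ij} - \frac{n_{PC} \Delta t}{3 \tau_p} \beta_{ij} \left\{ P_{0ij} - P_{ij}(t) \right\} \tag{11}$$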

$\beta_{ij}$ and $\tau_p$ are the compressibility and the time constant for pressure coupling, respectively. In my simulations, I also used the Parrinello-Rahman pressure coupling method(29), which is similar to the Nosé-Hoover temperature coupling. The following is an example for the treatment of pressure and temperature in atomistic simulations:


; temperature coupling
tcoupl = Nose-Hoover
tc_grps = PROT MEMB SOL_ION
tau_t = 1.0 1.0 1.0
ref_t = 310.15 310.15 310.15
; pressure coupling
pcoupl = Parrinello-Rahman
pcoupltype = semiisotropic
tau_p = 5.0
compressibility = 4.5e-5 4.5e-5
ref_p = 1.0 1.0
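For comparison with the parameters above, the weak-coupling scale factor of eq. 11 can be sketched for the simplest isotropic case, where the tensors reduce to scalars. This is an illustrative sketch with a hypothetical function name; it describes the Berendsen scheme, not the Parrinello-Rahman method selected in the example parameters.

```python
def berendsen_mu(P_current, P_ref, dt, tau_p, compressibility, n_pc=1):
    """Isotropic coordinate and box scaling factor from eq. 11."""
    return 1.0 - (n_pc * dt / (3.0 * tau_p)) * compressibility * (P_ref - P_current)
```

When the instantaneous pressure is above the reference, the factor is larger than one and the box expands slightly; when it is below, the box shrinks.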

1.3.7 Numerical integration

Carrying out MD simulations means solving Newton’s equations of motion. This is done numerically using computers. While the discussion so far has focused on potential energy, the main quantity that we need to calculate in order to update positions is the force acting on each atom. The force on an atom is the negative derivative of the potential with respect to its Cartesian coordinates ($\mathbf{r}_i$).

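$$\mathbf{F}_i = -\frac{\partial V}{\partial \mathbf{r}_i} \tag{12}$$

For instance, the force derived by differentiating the bond term in eq. 7 is:

$$\mathbf{F}_i(\mathbf{r}_{ij}) = -\frac{\partial}{\partial \mathbf{r}_i} \frac{1}{2} k_i (l_i - l_{i,0})^2 = k_i (l_i - l_{i,0}) \frac{\mathbf{r}_{ij}}{r_{ij}} \tag{13}$$

where $\mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i$ and $l_i = r_{ij} = |\mathbf{r}_{ij}|$. Forces are derived for all nonbonded and bonded terms as well as for any additional restraints and external forces that may have been defined. Once we know the force, we can update the positions of atoms using Newton’s equations of motion:

$$\frac{d^2 \mathbf{r}_i}{dt^2} = \frac{\mathbf{F}_i}{m_i} \tag{14}$$

$$\frac{d\mathbf{r}_i}{dt} = \mathbf{v}_i; \quad \frac{d\mathbf{v}_i}{dt} = \frac{\mathbf{F}_i}{m_i} \tag{15}$$

These equations are solved numerically. In my simulations I have used the leap-frog integrator, which updates the velocities and positions using the following relations:

$$\mathbf{v}\left(t + \tfrac{1}{2}\Delta t\right) = \mathbf{v}\left(t - \tfrac{1}{2}\Delta t\right) + \frac{\Delta t}{m} \mathbf{F}(t) \tag{16}$$

$$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \Delta t\, \mathbf{v}\left(t + \tfrac{1}{2}\Delta t\right) \tag{17}$$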
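The leap-frog update can be illustrated with a one-dimensional toy system. The sketch below is a minimal example with hypothetical names and reduced units, applied to a single particle on a harmonic spring rather than to a molecular system.

```python
import math


def leap_frog(x0, v_half, force, mass, dt, n_steps):
    """Integrate a 1D trajectory with the leap-frog scheme.

    x0 is the position at time t and v_half the velocity at t - dt/2.
    Returns the list of positions and the last half-step velocity.
    """
    x, v = x0, v_half
    positions = [x]
    for _ in range(n_steps):
        v = v + dt / mass * force(x)  # velocity update, eq. 16
        x = x + dt * v                # position update, eq. 17
        positions.append(x)
    return positions, v


def harmonic_force(x, k=1.0, x_ref=0.0):
    """Force from a harmonic potential 0.5 * k * (x - x_ref)^2."""
    return -k * (x - x_ref)


# Usage: unit mass and force constant, starting at x = 1 with zero velocity at t = 0.
# The exact solution is x(t) = cos(t), so the half-step velocity at -dt/2 is sin(dt/2).
dt = 0.01
trajectory, _ = leap_frog(1.0, math.sin(dt / 2.0), harmonic_force, 1.0, dt, 628)
```

After 628 steps of 0.01 time units the particle has completed almost exactly one period and is back near its starting point, with an amplitude that stays bounded, reflecting the good long-term stability of the scheme.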


To summarize, given an input topology that contains a description of the force-field and the parameters to use, initial coordinates, and velocities generated from a Maxwell-Boltzmann distribution, MD simulation programs calculate the forces on each atom in the system, calculate the scaling factors (λ and μ), update velocities according to λ, calculate new positions, and scale coordinates and box size according to μ. This is the process that is implemented in GROMACS(17).

1.3.8 The MARTINI model

I have used the MARTINI(30, 31) coarse-grained model extensively in my simulations. Simulations in chapters 3 and 4 use MARTINI, as well as parts of chapter 5. It is by far the most popular coarse-grained model in answering biologically relevant questions(9, 32), and in particular lipid-protein interactions(9). It uses a top-down approach in its parameterization whereby nonbonded interactions are calibrated with respect to experimental data on oil/water partitioning coefficients. Bonded interactions are modelled based on atomistic models and the calculation of the potential energy uses the same functional form as in eq. 7.

MARTINI fuses four heavy atoms into one interaction site (or bead), with the exception of ring systems, where a 3:1 mapping is used. It defines four main types of beads: charged (Q), polar (P), nonpolar (N), and apolar (C). Q and N beads are further divided into four subtypes to model different levels of propensity for hydrogen-bond formation (d = donor, a = acceptor, da = both, 0 = none). Similarly, P and C beads are divided into five subtypes to model different levels of polarity (1 being low in polarity and 5 being high). The total number of bead types is thus (2 × 4) + (2 × 5) = 18.
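The bead-type bookkeeping described above can be written out explicitly. This short Python sketch only enumerates the type names implied by the scheme; the variable names are illustrative.

```python
# Hydrogen-bonding subtypes for Q and N beads, polarity levels for P and C beads.
hbond_subtypes = ["d", "a", "da", "0"]
polarity_levels = [1, 2, 3, 4, 5]

bead_types = (
    [f"{main}{sub}" for main in ("Q", "N") for sub in hbond_subtypes]
    + [f"{main}{level}" for main in ("P", "C") for level in polarity_levels]
)
```

The resulting list runs from Qd, Qa, Qda, Q0 through to C1 to C5 and contains 18 entries in total.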

Pairwise interactions between beads are modelled using a Lennard-Jones 12-6 potential. There are 10 different levels of interaction based on the well-depth parameter ($\varepsilon_{ij}$), ranging from 2.0 to 5.6 kJ/mol. Q beads also interact using a shifted Coulomb potential with a uniform dielectric constant of 15. In general, however, the treatment of electrostatic interactions in the MARTINI model is comparatively crude. The speed-up achieved by the MARTINI model is due to the combined effect of a reduced number of particles in the system, a simplified interaction potential (leading to a smoother energy landscape), increased lateral diffusion of lipids, and the ability to increase the integration time step by a factor of 10-20(32).

1.3.9 Using MARTINI to study lipid-protein interactions

The exact simulation details for the MARTINI simulations are described in each chapter. Here, we highlight one of the most common system setups used. Figure 1-2 shows snapshots of an example system with four proteins embedded equidistantly from each other. This setup accommodates the complex lipid composition of the membrane, which contains 63 different lipid types (some at <1% concentration), and was optimized for the available computer hardware.


Figure 1-2. Using the MARTINI model to study lipid-protein interactions. Top and side views of an example system showing four embedded proteins at the corners of the system, surrounded by 63 different lipid types. Lipids here are grouped according to their headgroup type and colored separately (e.g. ganglioside lipids are shown in orange).

Figure 1-3 shows an example of how these systems are analyzed for lipid-protein interactions. Simulations are carried out for 30+μs, after which lipids are either grouped according to their headgroup or tail type or treated individually, and contacts with embedded proteins are extracted. These are then further processed or visualized to differentiate transient interactions from specifically bound lipids. In chapter 4, where the lipid-protein interaction profile of 28 different GPCR structures is studied, we can align these profiles and see which interaction sites are conserved. Specifically, for GPCRs we find a cholesterol interaction site at the extracellular TM6/7 interface for all class A GPCRs.

Figure 1-3. Example of lipid-protein interaction analysis. Lipid-protein contacts are measured from the generated trajectory and visualized on the surface of embedded proteins. In the case of GPCRs, where this type of calculation is done for many different proteins, profiles are aligned with each other, allowing us to see which binding sites are conserved throughout the family.
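The contact-based analysis outlined above can be illustrated with a deliberately simplified Python sketch. It is not the analysis code used in this work: it assumes one coordinate per protein site and per lipid, ignores periodic boundaries, and uses a hypothetical function name and a user-supplied distance cut-off.

```python
import math


def contact_occupancy(protein_frames, lipid_frames, cutoff):
    """Fraction of frames in which each protein site has a lipid within cutoff.

    protein_frames[f][s] and lipid_frames[f][l] are (x, y, z) coordinates of
    site s and lipid l in frame f. Periodic boundaries are ignored here.
    """
    n_frames = len(protein_frames)
    counts = [0] * len(protein_frames[0])
    for sites, lipids in zip(protein_frames, lipid_frames):
        for idx, site in enumerate(sites):
            if any(math.dist(site, lipid) < cutoff for lipid in lipids):
                counts[idx] += 1
    return [count / n_frames for count in counts]
```

Sites with an occupancy close to one over a long trajectory are candidates for specifically bound lipids, whereas low values point to transient contacts. Separating the two properly also requires looking at how long individual lipids stay, which this sketch does not do.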


1.4 References

1. Singer S. J., G. L. Nicolson. The fluid mosaic model of the structure of cell membranes. Science. 1972;175(4023):720-731.

2. Kuo T.-C., Y. J. Tseng. LipidPedia: a comprehensive lipid knowledgebase. Bioinformatics. 2018;34(17):2982-2987.

3. Van Meer G., D. R. Voelker, G. W. Feigenson. Membrane lipids: where they are and how they behave. Nat Rev Mol Cell Biol. 2008;9(2):112-124.

4. Dean L. Blood groups and red cell antigens: NCBI Bethesda, Md, USA; 2005.

5. Enkavi G., M. Javanainen, W. Kulig, T. Róg, I. Vattulainen. Multiscale Simulations of Biological Membranes: The Challenge To Understand Biological Phenomena in a Living Substance. Chem Rev. 2019.

6. Casares D., P. V. Escribá, C. A. Rosselló. Membrane lipid composition: effect on membrane and organelle structure, function and compartmentalization and therapeutic avenues. International journal of molecular sciences. 2019;20(9):2167.

7. Simons K., R. Ehehalt. Cholesterol, lipid rafts, and disease. The Journal of clinical investigation. 2002;110(5):597-603.

8. Wallin E., G. V. Heijne. Genome‐wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci. 1998;7(4):1029-1038.

9. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

10. Muller M. P., T. Jiang, C. Sun, M. Lihan, S. Pant, P. Mahinthichaichan, A. Trifan, E. Tajkhorshid. Characterization of Lipid–Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation. Chem Rev. 2019.

11. Zhang K. H., J. Zhang, Z. G. Gao, D. D. Zhang, L. Zhu, G. W. Han, S. M. Moss, S. Paoletta, E. Kiselev, W. Z. Lu, G. Fenalti, W. R. Zhang, C. E. Muller, H. Y. Yang, H. L. Jiang, V. Cherezov, V. Katritch, K. A. Jacobson, R. C. Stevens, B. L. Wu, et al. Structure of the human P2Y(12) receptor in complex with an antithrombotic drug. Nature. 2014;509(7498):115-118.


12. Wu H. X., C. Wang, K. J. Gregory, G. W. Han, H. P. Cho, Y. Xia, C. M. Niswender, V. Katritch, J. Meiler, V. Cherezov, P. J. Conn, R. C. Stevens. Structure of a Class C GPCR Metabotropic Glutamate Receptor 1 Bound to an Allosteric Modulator. Science. 2014;344(6179):58-64.

13. Burg J. S., J. R. Ingram, A. J. Venkatakrishnan, K. M. Jude, A. Dukkipati, E. N. Feinberg, A. Angelini, D. Waghray, R. O. Dror, H. L. Ploegh, K. C. Garcia. Structural basis for chemokine recognition and activation of a viral G protein-coupled receptor. Science. 2015;347(6226):1113-1117.

14. Fan H. X., S. H. Chen, X. J. Yuan, S. Han, H. Zhang, W. L. Xia, Y. C. Xu, Q. Zhao, B. L. Wu. Structural basis for ligand recognition of the human thromboxane A(2) receptor. Nat Chem Biol. 2019;15(1):27-+.

15. Segala E., D. Guo, R. K. Y. Cheng, A. Bortolato, F. Deflorian, A. S. Dore, J. C. Errey, L. H. Heitman, A. P. Ijzerman, F. H. Marshall, R. M. Cooke. Controlling the Dissociation of Ligands from the Adenosine A(2A) Receptor through Modulation of Salt Bridge Strength. J Med Chem. 2016;59(13):6470-6479.

16. Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions Are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020;118(8):1887-1900.

17. Abraham M. J., T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E. Lindahl. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19-25.

18. Abraham M., D. Van Der Spoel, E. Lindahl, B. Hess, The GROMACS development team. GROMACS user manual version 5.0.4. J Mol Model. 2014.

19. Allen M. P., D. J. Tildesley. Computer simulation of liquids: Oxford university press; 2017.

20. Leach A. R. Molecular modelling: principles and applications: Pearson education; 2001.

21. Lewars E. Computational chemistry. Introduction to the theory and applications of molecular and quantum mechanics. 2003:318.

22. Hess B., H. Bekker, H. J. Berendsen, J. G. Fraaije. LINCS: a linear constraint solver for molecular simulations. Journal of computational chemistry. 1997;18(12):1463-1472.


23. Darden T., D. York, L. Pedersen. Particle mesh Ewald: An N⋅ log (N) method for Ewald sums in large systems. The Journal of chemical physics. 1993;98(12):10089-10092.

24. Tironi I. G., R. Sperb, P. E. Smith, W. F. van Gunsteren. A generalized reaction field method for molecular dynamics simulations. The Journal of chemical physics. 1995;102(13):5451-5459.

25. Berendsen H. J., J. v. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak. Molecular dynamics with coupling to an external bath. The Journal of chemical physics. 1984;81(8):3684-3690.

26. Bussi G., D. Donadio, M. Parrinello. Canonical sampling through velocity rescaling. The Journal of chemical physics. 2007;126(1):014101.

27. Nosé S. A molecular dynamics method for simulations in the canonical ensemble. Molecular physics. 1984;52(2):255-268.

28. Hoover W. G. Canonical dynamics: Equilibrium phase-space distributions. Physical review A. 1985;31(3):1695.

29. Parrinello M., A. Rahman. Polymorphic transitions in single crystals: A new molecular dynamics method. Journal of Applied physics. 1981;52(12):7182-7190.

30. Marrink S. J., H. J. Risselada, S. Yefimov, D. P. Tieleman, A. H. de Vries. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. The Journal of Physical Chemistry B. 2007;111(27):7812-7824.

31. de Jong D. H., G. Singh, W. D. Bennett, C. Arnarez, T. A. Wassenaar, L. V. Schäfer, X. Periole, D. P. Tieleman, S. J. Marrink. Improved parameters for the martini coarse-grained protein force field. Journal of Chemical Theory and Computation. 2012;9(1):687-697.

32. Marrink S. J., D. P. Tieleman. Perspective on the Martini model. Chem Soc Rev. 2013;42(16):6801-6822.


Chapter Two: Emerging Diversity in Lipid-Protein Interactions

Copyright

This chapter is taken from the following article:

Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019;119(9):5775-848. with the following direct link: https://pubs.acs.org/doi/full/10.1021/acs.chemrev.8b00451 - further permissions related to the material excerpted should be directed to the ACS.

Copyright permissions are provided in Appendix C.

Contributions

The work cited above is a collaborative effort to summarize the MD simulation literature on lipid-protein interactions. The material presented in this chapter is entirely my own contribution. The only exceptions are Figures 2-1 and 2-3, which were prepared by my colleague Dr. Valentina Corradi (I only altered the color gradient slightly here), and Figure 2-2, which was taken from reference 47. The input of my supervisor in reviewing, providing feedback, assistance, etc. at all stages of the work should be understood.

Abbreviations

All abbreviations are introduced at first mention.


2.1 Abstract

Membrane lipids interact with proteins in a variety of ways, ranging from providing a stable membrane environment for proteins to be embedded in, to detailed roles in complicated and well-regulated protein functions. Experimental and computational advances are converging in a rapidly expanding research area of lipid-protein interactions. Experimentally, the database of high-resolution membrane protein structures is growing, as are capabilities to identify the complex lipid composition of different membranes, to probe the challenging time and length scales of lipid-protein interactions, and to link lipid-protein interactions to protein function in a variety of proteins. Computationally, more accurate membrane models and more powerful computers now enable a detailed look at lipid-protein interactions and increasing overlap with experimental observations for validation and joint interpretation of simulation and experiment. Here we review papers that use computational approaches to study detailed lipid-protein interactions, together with brief experimental and physiological contexts, aiming at comprehensive coverage of simulation papers in the last five years. Overall, a complex picture of lipid-protein interactions emerges, through a range of mechanisms including modulation of the physical properties of the lipid environment, detailed chemical interactions between lipids and proteins, and key functional roles of very specific lipids binding to well-defined binding sites on proteins. Computationally, despite important limitations, molecular dynamics simulations with current computer power and theoretical models are now in an excellent position to answer detailed questions about lipid-protein interactions.

2.2 G protein-coupled receptors (GPCRs)

G protein-coupled receptors (GPCRs) are the largest superfamily of membrane proteins and the number one drug target.(1) They are characterized by a highly conserved seven transmembrane helix (TM) topology with a ligand-binding extracellular site and a G protein coupling intracellular site (Figure 2-1). Ligand binding on the extracellular site triggers a sequence of conformational changes in the TM domain that results in the activation of a membrane-anchored G protein. Due to their localization on the plasma membrane, GPCR function and activity are exposed to all the biophysical processes that characterize cell membranes. In the last few years, significant advances have been made in understanding this relationship between GPCRs and their lipid environment.


Figure 2-1. Representative structures of GPCRs.

From left to right, the structures of rhodopsin(2), the β2 adrenergic receptor (β2AR)(3), and the smoothened receptor(4) are shown as cartoons, with the 7 transmembrane (TM) helices highlighted in different colors.

For β2AR, the three subunits of the G protein (Gα, Gβ and Gγ) are shown in light gray and blue cartoons. For the smoothened receptor, the extracellular cysteine-rich domain (CRD) and the long extracellular loop 3 are shown in light blue and gray cartoons, respectively. The top panels show a side view of the receptors, while the bottom panels provide a view from the extracellular side. Substrates or are labeled and shown in white spheres, and the hydrophobic region of the membrane is highlighted in gray. In, intracellular side; Out, extracellular side; ICL, intracellular loop; ECL, extracellular loop.

Early evidence for the importance of lipid-protein interactions in GPCR function and activity dates back to the early to mid-1990s, when a series of discoveries was made regarding cholesterol modulation of rhodopsin and oxytocin function. The equilibrium between meta-rhodopsin I and meta-rhodopsin II depends on cholesterol concentration(5) and was assumed to be a result of cholesterol’s ability to alter the physical properties of the bilayer. A few years later, however, Albert et al. showed that another mechanism is possible, namely a direct, structurally specific interaction between rhodopsin and cholesterol.(6) Gimpl et al. studied this cholesterol-modulatory effect on the ligand-binding activity of two different GPCRs with peptide-hormone ligands: the oxytocin and the cholecystokinin receptor.(7, 8) They found that this effect is caused by altering the membrane properties in the case of cholecystokinin, but by a ‘putatively specific cholesterol-receptor interaction’ for the oxytocin receptor. More recently, the cholecystokinin type 1 receptor was shown to be sensitive to membrane cholesterol levels, as opposed to its relative, the cholecystokinin type 2 receptor, which does not share this cholesterol sensitivity.(9-11)

In the last two decades a plethora of experimental and theoretical techniques have been used to investigate the existence and significance of lipid-GPCR interactions. Two illustrative, although not exclusive, examples of the progress made are the several solved crystallographic structures of GPCRs with bound lipids,(12-21) and a series of papers published recently clearly and unequivocally demonstrating the importance of cholesterol in activating the class F GPCR Smoothened in hedgehog signaling.(4, 22-25)

2.2.1 GPCR-Lipid interactions

The most commonly used classification of GPCRs uses their sequence homology to categorize them into classes A-F or, alternatively, into the GRAFS system (with each letter of the acronym standing for the most representative member of the family, e.g. R=Rhodopsin).(1, 26) Class A (or rhodopsin-like) GPCRs are the largest, most studied, and hence best-understood GPCR family by practically any metric. MD literature follows published GPCR structures. Therefore, GPCR-lipid interactions have mainly been studied in the context of class A GPCRs. We begin with a discussion of the prototypical rhodopsin and continue with aminergic receptors, in no particular order, concluding with a brief section on other miscellaneous GPCR-lipid interaction papers. Considering the vast GPCR literature, no single review, no matter how comprehensive, can do justice to every aspect of their biology. We limit the discussion to the aspects outlined in the introduction of this review and refer the reader to several other excellent reviews.(27-33) When necessary, the Ballesteros-Weinstein numbering scheme will be used to identify residues.(34)

2.2.2 Rhodopsin

Rhodopsin is found in rod cells embedded in a membrane that is rich in ω-3 polyunsaturated lipids and cholesterol (although the content of the latter varies with cell age). Polyunsaturated lipids, e.g. docosahexaenoic acid (DHA), have a stabilizing effect on rhodopsin structure and increase its activity; cholesterol has the opposite effect. Multiple short-timescale MD simulations of rhodopsin(35) embedded in a bilayer with PE and PC lipids with mixed saturation lipid tails and cholesterol showed that rhodopsin forms a few, potentially specific, interactions with DHA, the unsaturated tail, but no specific interaction with stearic acid, the saturated tail, or with cholesterol. A later study, employing a 1.6 μs simulation of the same system, however, provided more insight into the cholesterol-rhodopsin interactions.(36) Rhodopsin contains three structural regions that exhibit increased affinity for cholesterol molecules: the extracellular sides of the TM2-TM3 bundle and TM7 helix, and the intracellular side of the TM1-TM2-TM4 helices. Extending these simulations even further in time scale using the Martini model confirmed the preferential interaction of cholesterol and DHA with rhodopsin and suggested a preference of rhodopsin for PE headgroups over PC headgroups.(37) The effect of polyunsaturated tails and PE headgroups on rhodopsin activity has also been observed in experiments on reconstituted rhodopsin receptors in model membranes.(38) These experiments show that PE lipids, by creating a negative curvature, affect the MI/MII equilibrium of rhodopsin. Increased levels of PE lipids shift the equilibrium towards the MII state. The implication here is the possibility of a lipid-regulated transition of the receptor between inactive and active states. More recent MD simulation studies on lipid-rhodopsin interactions, however, highlight a more dualistic nature of lipid-protein interactions, whereby membrane lipids exert their modulatory effects on rhodopsin by acting, simultaneously, as allosteric modulators and by altering membrane physical properties (e.g. membrane fluidity).(39)

2.2.3 Adrenergic receptors

Adrenergic receptors are among the best-studied GPCRs, and their lipid-interaction profile has consequently been characterized extensively. In 2008, Hanson et al. solved the structure of the human β2AR bound to two cholesterol molecules, strongly indicating structurally specific cholesterol-binding sites on the receptor surface.(13) Helices I-IV of the β2AR form a shallow groove that is sufficient to accommodate two cholesterol molecules, which bind with different affinities. These observations enabled the definition of a cholesterol-binding motif, which is found in 21% of human class A GPCRs.(13) This motif, termed the cholesterol consensus motif (CCM), is formed by residues in helices II and IV of the receptor and differs from the CRAC/CARC motifs in that it is defined by a spatial arrangement of residues rather than a linear one. Cholesterol interactions with adrenergic receptors have been observed in several other crystal structures.(14, 16) MD simulations at both the atomistic and coarse-grained level of the receptor embedded in a POPC bilayer in the presence or absence of cholesterol have shown that β2AR interacts with cholesterol at several potential interaction sites or hot-spots (some of which match those observed in crystal structures), quantified by cholesterol occupancy time from the simulation trajectories.(40, 41) A possible functional role of these additional putative interaction sites remains to be established, but it is reassuring that the simulations agree with each other and reproduce, with good agreement, the cholesterol interaction sites observed in solved crystal structures. Interestingly, the interaction of cholesterol with these binding sites is dynamic, ranging from nanosecond to microsecond timescales, and might serve as a basis for dividing cholesterol-β2AR binding events into short- and long-lived. This is an interesting prospect considering recent experimental evidence supporting it.(42) Experiments with unfolding temperature assays and saturation transfer difference NMR showed that cholesterol binds to the β2AR with high affinity and a slow exchange rate, and with low affinity and a fast exchange rate, respectively. Control experiments reveal both these types of binding events to be specific to cholesterol.(42)
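The occupancy-based analysis described above can be sketched in a few lines. The example below is illustrative only and is not the procedure of the cited studies: it assumes that a per-frame contact flag for one binding site has already been computed (for instance with a distance cutoff), and the frame spacing, the threshold separating short- from long-lived events, and all function names are hypothetical.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def occupancy(contacts: Sequence[int]) -> float:
    """Fraction of frames in which the site is occupied by cholesterol."""
    if not contacts:
        raise ValueError("empty contact series")
    return sum(1 for c in contacts if c) / len(contacts)


def residence_times(contacts: Sequence[int], dt_ns: float) -> List[float]:
    """Duration (ns) of every uninterrupted run of occupied frames."""
    return [
        sum(1 for _ in run) * dt_ns
        for occupied, run in groupby(bool(c) for c in contacts)
        if occupied
    ]


def split_events(
    durations: Sequence[float], threshold_ns: float
) -> Tuple[List[float], List[float]]:
    """Partition binding events into short- and long-lived ones."""
    short = [d for d in durations if d < threshold_ns]
    long_lived = [d for d in durations if d >= threshold_ns]
    return short, long_lived


if __name__ == "__main__":
    # Made-up contact series: 1 = cholesterol within the cutoff of the site.
    series = [0, 1, 1, 0, 1, 1, 1, 1, 0, 0]
    events = residence_times(series, dt_ns=10.0)
    print(occupancy(series), events, split_events(events, threshold_ns=30.0))
```

Occupancy alone cannot distinguish one long binding event from many short ones, which is why the run lengths are reported separately.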

When the β2AR and β1AR are each simulated separately in cholesterol-containing POPC bilayers in microsecond-long atomistic MD simulations, the resulting cholesterol-interaction profiles of the receptors differ significantly from each other.(43) Most notably, the H1-H8 interface of β2AR seems to interact preferentially with two cholesterol molecules, while the same interaction is missing from the β1AR, which can be attributed to the slightly different residues lining the interface. More generally, this means that lipid interaction data from one GPCR may not be easily extended to other members of even the same GPCR family, despite their high structural similarity.

Important advances in the understanding of β2AR-lipid interactions were made by Kobilka and colleagues, who showed that cholesterol and phospholipids affect the kinetics and stability of the receptor.(44, 45) Using force spectroscopy methods and cholesteryl hemisuccinate as a cholesterol analog, they showed that cholesterol increases the stability of almost all structural elements of the β2AR, presumably by making it difficult for the protein to sample its conformational landscape.(44) This finding is also supported by recent experimental(46) and computational(47) studies. In the latter study, Manna et al. carried out atomistic MD simulations of the human β2AR in, among other setups, DOPC bilayers of varying cholesterol concentration and showed that the conformational landscape sampled by β2AR is reduced significantly at or above 10 mol% cholesterol (Figure 2-2).(47) Further simulations suggest that this reduced conformational flexibility of β2AR is a direct result of cholesterol-receptor interactions, not of indirect modulation through altered bilayer bulk properties. In addition, the binding of cholesterol agreed well with previous studies,(40-42) as several known interaction sites were retrieved. Time-correlation data also reveal some binding sites to be dependent on cholesterol concentration and others independent of it, further supporting the division of cholesterol binding sites according to their binding affinity or exchange rate.(47)

Figure 2-2. The effect of cholesterol on β2AR activation. A. Definition of two distance parameters used to measure the conformational changes as a function of time.(47) The D3.32-S5.46 Cα atom distance (denoted LL) measures fluctuations in the ligand-binding site of the receptor, and the R3.50-E6.30 Cα atom distance (denoted LG) captures fluctuations in the G protein binding interface. B-C. The conformational space probed by the simulations in a pure DOPC bilayer and in a DOPC bilayer containing 10% cholesterol, respectively, plotted as a function of these two distance parameters.(47) Cholesterol significantly decreases the conformational space sampled by the receptor. Adapted with permission from reference (47). Copyright 2016 Manna et al. Licensed under Creative Commons Attribution 4.0.
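The two distance parameters defined in the caption are straightforward to compute once the Cα coordinates of the four residues are available. The sketch below uses made-up coordinates and hypothetical function names. Counting occupied bins of the (LL, LG) plane is used here only as a crude proxy for the "conformational space sampled" and is an assumption of this example, not the analysis of reference (47).

```python
import math
from typing import Dict, Iterable, Sequence, Tuple

Coord = Sequence[float]  # (x, y, z) in angstroms


def ll_lg(ca_coords: Dict[str, Coord]) -> Tuple[float, float]:
    """Return (LL, LG) for one frame.

    LL: D3.32-S5.46 C-alpha distance (ligand-binding site).
    LG: R3.50-E6.30 C-alpha distance (G protein binding interface).
    """
    ll = math.dist(ca_coords["D3.32"], ca_coords["S5.46"])
    lg = math.dist(ca_coords["R3.50"], ca_coords["E6.30"])
    return ll, lg


def sampled_bins(points: Iterable[Tuple[float, float]], bin_width: float) -> int:
    """Number of distinct (LL, LG) bins visited; a rough measure of sampling."""
    return len(
        {(math.floor(ll / bin_width), math.floor(lg / bin_width)) for ll, lg in points}
    )


if __name__ == "__main__":
    # Made-up coordinates for a single frame, for illustration only.
    frame = {
        "D3.32": (0.0, 0.0, 0.0),
        "S5.46": (3.0, 4.0, 0.0),
        "R3.50": (0.0, 0.0, 0.0),
        "E6.30": (0.0, 6.0, 8.0),
    }
    print(ll_lg(frame))
    print(sampled_bins([(5.0, 10.0), (5.2, 10.3), (7.1, 10.0)], bin_width=1.0))
```

Under this proxy, a trajectory confined to fewer bins, as reported for the cholesterol-containing bilayer, would correspond to a smaller sampled region of the plane.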

While cholesterol is clearly the major lipid type to consider when looking at lipid-protein interactions in GPCRs, Dawaliby et al.(45) recently showed experimentally that phospholipids also regulate the activity of the β2AR by acting as allosteric modulators. By testing phospholipids with different headgroup types, they found that phospholipids, depending on the nature of the headgroup, can shift the equilibrium towards either the active or inactive state of the receptor, with DOPG, DOPS and DOPI favoring the former and DOPE favoring the latter. Remarkably, in the case of negatively charged phospholipids this effect is dose-dependent and present even in the absence of a bilayer, probably because the headgroup interacts with the cytoplasmic side of the receptor. These studies show the effect of cholesterol and phospholipids to be independent of ligand binding. The importance of phospholipids has been underscored in several other experimental studies for other GPCRs.(48, 49)

MD simulations have been quite successful in reaffirming and even predicting these phospholipid-β2AR interactions. Neale et al. showed that POPG stabilizes the active state of the embedded β2AR through specific interactions with Arg3.50 on the intracellular side of the receptor.(50) This interaction of POPG is stronger and more frequent than the equivalent interactions of the zwitterionic POPC. While it is not possible to attribute this stabilizing activity of POPG solely to its interaction with Arg3.50, it is clear that it opposes the closure of H6, a helix whose movement is a critical step in β2AR activation. More recently, microsecond-length atomistic simulations of β2AR embedded in DOPG, DOPE, or DOPC phospholipids reaffirmed the differential effect of negatively charged versus zwitterionic lipids on the activation of β2AR.(51) DOPC, DOPE, and DOPG partially inactivate, fully deactivate, and stabilize the active state of the β2AR, respectively. While the exact details of these interactions remain to be deciphered, MD simulations in combination with many experimental findings have already provided a wealth of information with regard to cholesterol- and phospholipid-β2AR interactions.

MD simulations have also hinted at a possible role of PIP lipids in mediating lipid-protein interactions by demonstrating preferential localization of these lipids in microsecond-long simulations.(52, 53) Native mass spectrometry studies complemented with CG MD simulations have highlighted the importance of PIP2 lipids. In particular, experiments by Yen et al. demonstrate that PIP2 lipids not only affect the stability of the active state of these receptors, but also influence their coupling to G proteins.(54) The GPCRs studied were β1AR, adenosine A2A receptor (A2AR), and neurotensin receptor 1 (NTSR1), although the CG MD simulations point to this effect likely being conserved in other class A GPCRs as well.(54)

2.2.4 Adenosine receptors

Evidence for the importance of cholesterol in adenosine receptor function and activity dates to at least 2008.(55) The same year, a 2.6 Å crystal structure of the A2A adenosine receptor (A2AAR) was published.(56) Lyman et al.(57) took this opportunity to investigate, using MD simulations, the behaviour of the receptor in the presence and absence of cholesterol. In bilayers where cholesterol is absent, helix II of the A2AAR is remarkably unstable if the ligand is removed from the simulations.(57) If, however, cholesterol is introduced in these simulations, the stability of helix II is restored, thanks to cholesterol-protein interactions. Later MD simulations provided a clearer picture of the cholesterol-A2AAR interaction profile,(58) identifying three cholesterol binding sites. Two of these cholesterol hotspots are on the extracellular side, and one on the intracellular side of the receptor. One of the extracellular sites, located at the interface between helices II and III, is confirmed by a previously solved X-ray crystallographic structure of the receptor;(17) however, the other cholesterol hotspots lack cross-validation, possibly due to limited sampling, especially considering the more recent results discussed below. MD simulations of the A2AAR embedded in POPC and POPE bilayers suggest that the receptor samples a larger part of the conformational landscape if it is embedded in the former, although this may be due to generally slower dynamics in the more ordered POPE bilayer.(59) It is tempting to look at similar findings obtained for the human β2AR and extend those, but it remains to be established if such extrapolation of data is sensible. More recent MD simulations of the A2AAR, combining data at both the atomistic and coarse-grained level, have identified two new cholesterol-interaction sites.(60) One of these is located on the intracellular side of the interface formed by helices V and VI (not validated experimentally), and the other on the extracellular side of helix VI, matching experimental evidence.(17)

2.2.5 Serotonin receptors

Experimental findings revealed that cholesterol depletion alters ligand binding and G-protein coupling to the serotonin 1A receptor.(61) Cholesterol also increases the stability of the human serotonin 1A receptor,(62) and in giant unilamellar protein-vesicles has been observed to increase oligonucleotide exchange.(63) Due to the lack of a crystal structure of serotonin receptors, initial MD simulations used homology models. One such study, using Martini coarse-grained MD simulations, showed that in cholesterol-containing POPC bilayers, the embedded homology model of the serotonin 1A receptor displays several preferential cholesterol interaction sites.(64) One of these interactions is with helix V, at one of the three CRAC motifs found on the receptor. Atomistic MD simulations of the activity of serotonin 1A and serotonin 2A receptors as a function of bilayer cholesterol content currently portray a conflicting picture as to whether cholesterol decreases the conformational flexibility of the receptor,(65) or increases it,(66) respectively. Ganglioside GM1 and sphingolipids also interact with the serotonin 1A receptor.(67, 68) Shan et al. simulated the conformational changes of the serotonin 2A receptor induced by its binding to three different ligands (full , , and ).(69) They found a noticeably different response of the receptor depending on the ligand bound and that these conformational changes are relayed into the surrounding membrane environment of the embedded receptor, as shown by specific interactions of the receptor with cholesterol and distinct membrane perturbations around the TM core of the receptor.

2.2.6 Other GPCRs

MD simulations, either alone or in tandem with experimental studies, have been used to explore the lipid-binding properties of other GPCRs as well. Marino et al. carried out extensive Martini coarse-grained MD simulations of the mu-opioid receptor (μOR) in highly realistic membrane compositions,(53) showing specific interactions of μOR with cholesterol and with the negatively charged PIP lipids, which, surprisingly, differ to some extent depending on the conformational state of the receptor. The implication of these differences is, however, unclear. Similar simulations of the delta-opioid receptor (δOR) confirm the relative enrichment of cholesterol adjacent to the receptor compared to bulk concentrations.(52) Simulations of this type in complex bilayers reveal a unique interaction profile of opioid receptors with individual membrane lipids (cholesterol, PIP lipids) and groups of lipids that share a chemical feature (polyunsaturated, fully saturated, headgroup type, etc.), hinting towards a functionally relevant involvement in GPCR activity and oligomerization (see below).(52, 53)
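The notion of enrichment relative to bulk can be made concrete with a simple ratio. The sketch below is one common way of expressing it and is not necessarily the metric used in the cited studies: it assumes that lipid counts within some cutoff distance of the receptor, and in the whole membrane, have already been obtained. The counts and the function name are hypothetical.

```python
def enrichment_index(
    n_lipid_near: int, n_total_near: int, n_lipid_bulk: int, n_total_bulk: int
) -> float:
    """Ratio of the local to the bulk fraction of one lipid type.

    Values above 1 indicate enrichment near the receptor and values
    below 1 indicate depletion.
    """
    if n_total_near <= 0 or n_total_bulk <= 0 or n_lipid_bulk <= 0:
        raise ValueError("counts must be positive to define the ratio")
    local_fraction = n_lipid_near / n_total_near
    bulk_fraction = n_lipid_bulk / n_total_bulk
    return local_fraction / bulk_fraction


if __name__ == "__main__":
    # Made-up counts: 6 of 20 lipids near the receptor are cholesterol,
    # versus 30 of 200 lipids in the membrane as a whole.
    print(enrichment_index(6, 20, 30, 200))
```

In practice such a ratio would be averaged over many frames and evaluated per leaflet, since both the local shell and the bulk composition fluctuate during a simulation.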

Cholesterol is an essential component in Smoothened receptor activation.(4, 22-25) Cholesterol binds to the cysteine-rich domain (CRD) of Smoothened. MD simulations showed that cholesterol confers stability to the CRD but does not affect the stability of the TM domain.(4) In a series of experiments, Huang et al.(22) and Luchetti et al.(23) concurrently showed that cholesterol is the endogenous ligand that activates Smoothened. Indeed, cholesterol is not only necessary, but also sufficient for Smoothened activation.(23) Additionally, MD simulations coupled with PMF calculations point to the existence of an interaction site for cholesterol formed by the TM2 and TM3 helices of Smoothened on the extracellular side of the receptor.(70)

Cholesterol also affects the activity and stability of the neurotensin receptor 1 (NTS1).(71) NTS1 also displays a potential preference for PS lipids,(72) and its G protein coupling affinity is significantly increased in the presence of PE lipids.(73) This dependency of G protein coupling by GPCRs on the lipid environment is also evident for the cannabinoid type 2 receptor (CB2R), which increases G protein activation in the presence of anionic lipids.(49) Interestingly, CB2R and the structurally similar CB1R seem to differ in their cholesterol-interaction profiles.(74) Sphingosine-1-phosphate receptor 1, in extensive Martini coarse-grained simulations, consistently interacts with cholesterol and PIP2 lipids.(75) Membrane localization of the dopamine D1 receptor is dependent on membrane cholesterol and sphingolipid levels.(76) GPCR-cholesterol interactions, mainly by way of CRAC motifs, have also been demonstrated for class A chemokine receptors,(77) for class C metabotropic glutamate receptors,(78) and for the T2R4 bitter taste receptor,(79) perhaps highlighting the presence and importance of this motif across very different GPCRs.

2.3 GPCR scramblase activity and lipid entry events

An unexpected finding from MD simulations of GPCRs is the occasionally observed complete entrance of lipid molecules from the bilayer into the receptor. Considering the lack of structural data pointing towards such a possibility, however, not much attention has been paid to this phenomenon. So far, MD simulations have demonstrated lipid entry for several GPCRs, including opsin,(80) cannabinoid CB2 receptor,(81) sphingosine-1-phosphate receptor,(82) β2AR,(50) and A2AAR.(83) 11-cis-retinal uptake by the opsin receptor is likely achieved via the TMH5/6 interface as an entrance port, with either TMH1/7(84) or TMH5/6(80) serving as an all-trans-retinal exit site. In MD simulations of the cannabinoid type 2 receptor, 2-arachidonoylglycerol (2-AG) partitions out of the POPC bilayer and interacts specifically with the TMH6/7 interface, where it also enters the receptor.(81) In contrast, simulations of the sphingosine-1-phosphate receptor showed a POPC lipid to interact specifically with and enter the receptor through the TMH1/7 interface.(82) The TMH1-TMH7 distance increases significantly during the course of the simulation to accommodate this lipid entry. Endogenous ligand binding to GPCRs is expected. More puzzling is the observed complete entry of bilayer lipids inside receptors that do not have lipids as natural ligands. In simulations of the β2AR, a POPC lipid sometimes (estimated at 6% of the time) accesses the inside of the receptor via the TMH6/7 interface.(50) This complete lipid entry is, however, only observed with one of the two force fields the authors used. More recently, MD simulations revealed that cholesterol completely enters the A2A adenosine receptor through TM5/6, in a process that appears to be dependent on the bulk properties of the surrounding bilayer.(83) While in the interior of the receptor, cholesterol preferentially samples an area of the receptor that binds the ZM241385 ligand in the A2A adenosine receptor crystal structure,(56) hinting that cholesterol could affect ligand binding properties. The latter was confirmed using biotinylation assay experiments.(83) It is unclear what the implications of these findings are, but they may add a further layer of complexity to GPCR-lipid interactions.

Experiments by Menon et al. showed that opsin acts as a phospholipid scramblase.(85) A later study found that rhodopsin is also a phospholipid scramblase, with an activity of >10 000 phospholipids per protein per second.(86) The authors demonstrated that this activity of rhodopsin is independent of the conformational state of the receptor, which means that phototransduction and scramblase activity are not coupled to each other. Furthermore, β2AR and A2AAR also scramble phospholipids, hinting that this activity could be shared by all class A GPCRs.(86) MD simulations and Markov State Model analysis reveal that the mechanism for phospholipid translocation involves a hydrophilic pathway that is created between TMH6 and TMH7 of opsin, through which the phospholipid headgroup crosses from the intracellular to the extracellular leaflet, while the lipid tail remains in the bilayer.(87) The simulations also characterize the conformational changes necessary for this translocation event to occur. Considering that this observed scramblase activity is dependent on a thin, low-cholesterol membrane, it remains to be seen how it is affected by cholesterol- and sphingolipid-containing membranes.(86) While sampling these events is challenging and may require specialized methods, MD simulations have already demonstrated that they can be a useful tool to shed light on this novel aspect of GPCR activity.

2.4 GPCR oligomerization

The physiological role of oligomerization of several GPCRs is a topic of ongoing discussion. It has been observed and characterized as functionally important in many experiments,(88-91) but at the same time doubts have been raised about several aspects of its occurrence and importance.(92, 93) Computer simulations have been used to study membrane-mediated aspects of oligomerization, which we discuss below (Figure 2-3). A more detailed discussion of oligomerization and other aspects of its relevance and function is available in several other reviews.(90, 93-96)

Figure 2-3. Experimental structures of GPCR dimers. Side view (upper panels) and the view from the extracellular side (bottom panels) of the (left) μ opioid receptor (μOR) dimer(18) and (right) the chemokine receptor type 4 (CXCR4) dimer.(97) The 7 transmembrane (TM) helices of the receptors are highlighted in different colors and shown as cartoons, while one of the monomers is also represented with a transparent gray surface. Substrates bound to these dimers are not shown for clarity.

Initial MD simulations of GPCR dimerization were focused on rhodopsin. Periole et al. carried out coarse-grained MD simulations of several rhodopsin receptors embedded in PC bilayers of different lipid tail lengths and found that the increase in hydrophobic mismatch with shorter lipid tail length resulted in local thickening of the bilayer, particularly noticeable near TM2, TM4 and TM7 helices, and thinning near TM1, TM5, TM6 and TM8.(98) These results suggest that hydrophobic mismatch acts as a driving force for receptor oligomerization. Additionally, the most prominent interaction surface involved a symmetric arrangement of TM1, TM2, and H8, although other dimer interfaces were also observed. Later studies, however, affirmed that the primary mode of interaction in rhodopsin dimers involves TM1/TM2 on the extracellular side, and H8 on the intracellular side.(99, 100) NMR experiments show that hydrophobic mismatch modulates the MI/MII rhodopsin intermediate equilibrium: oligomerization shifts the equilibrium towards MI, whereas local thickening of the membrane to compensate for the hydrophobic mismatch favors the MII intermediate.(101)

To characterize the role of hydrophobic mismatch in GPCR oligomerization, Mondal et al. used Martini coarse-grained simulations of the prototypical β1AR and β2AR, and studied the energy penalty associated with the residual hydrophobic mismatch, RHM (the energy penalty resulting from the inability of the membrane to completely counter hydrophobic mismatch).(102) They found that the highest energy cost of RHM for the β2AR monomer was at TM1, TM4, and TM5. Oligomerization of β2AR occurs via the TM1 and TM4/TM5 interface and significantly decreases this energy penalty, thus highlighting hydrophobic mismatch as a possible driving force behind oligomerization. The highly homologous β1AR in contrast showed a high RHM only on its TMH1, indicating a preference of the receptor to form dimers, not oligomers, through its TMH1/TMH1 interface. In their simulations, Mondal et al. did not see a significant effect of cholesterol on receptor oligomerization.(102) A different study, however, proposes a modulatory role of cholesterol in β2AR dimer interfaces.(41) Without cholesterol present, the preferred β2AR dimer interface involves TMH4 and TMH5. Increasing cholesterol concentrations, however, changed the relative involvement of TM helices at the dimer interface, with 50% cholesterol favoring a predominantly TMH1 and TMH2 interface. This modulatory role of cholesterol was attributed to its preferred localization at TMH4.(41)

Opioid receptors are another GPCR family with extensively studied oligomerization properties, mainly by the Filizola lab and collaborators. Umbrella sampling free energy calculations and metadynamics simulations revealed a short-lived interaction of delta-opioid receptors (δOR), mainly through TMH4, but with some involvement of TMH5 as well.(103, 104) In a later, more extensive study, Provasi et al. used the Martini model to simulate the preferred di/oligomerization pattern of the main opioid receptor (OR) subtypes: μOR, δOR, and κOR, simulated in their homomeric and main heteromeric forms.(105) ORs utilize a limited number of interfaces, which are consistent among the subtypes simulated but differ in their relative fraction of occurrence. Also consistent is the shared lack of either TM3 or TM7 involvement in dimer interfaces. Analysis of different kinetic parameters pointed towards different propensities and association rates of dimers to form, depending on the interface involved, and an active role of membrane lipids, including cholesterol, in guiding these associations. Local lipid exchange and persistence time might modulate the kinetic favourability of different interfaces to form.(105) Experimental and computational tools have shown that μOR homodimerization is facilitated by cholesterol through a Cys3.55-palmitoyl-cholesterol interaction.(106) Simulations of the active- and inactive-state μOR in a more realistic plasma membrane model provided additional insights into how the membrane environment guides receptor dimerization.(53) Notably, TMH1, TMH5, and TMH6 induced an ordering of lipids, in contrast to TMH4, which induced a more disordered region in the membrane. These local membrane adaptations facilitate receptor dimerization and affect the preferred dimer interfaces formed. Interestingly, the latter is also dependent on the conformational state of the receptor.

A recent computational study of A2AAR and dopamine D2 receptor oligomerization suggests that it is highly dependent on DHA concentration.(107) DHA displays a preferential interaction with each receptor, and high levels of it increase receptor heteromerization. Bioluminescence resonance energy transfer (BRET) experiments showed the number of oligomers formed, however, to be independent of DHA levels, underscoring a kinetic modulation by DHA. Prasanna et al., using coarse-grained MD simulations, demonstrated the lipid-dependent oligomerization of the serotonin 1A receptor.(108) Specifically, they observed two main dimer interfaces involving TMH1/TMH2 and TMH4/TMH5/TMH6, and they highlight the importance of cholesterol in modulating the stability and flexibility of these dimers. Pluhackova et al. simulated the chemokine receptor type 4 (CXCR4) in pure phospholipid bilayers and noticed that the most prominent dimerization interface involved TM1/TM5-7.(109) In bilayers containing cholesterol, however, this interface was unavailable due to a cholesterol molecule binding between TM1 and TM7. Instead, a new symmetric TM3/TM4 interface was observed that seems to be supported by experiment, highlighting not only the regulatory role of cholesterol, but also the importance of considering its effect in studies like this. A more recent study further characterized chemokine receptor homo- and heterodimerization patterns and their dependence on membrane cholesterol.(110) Coarse-grained MD simulations were also used to study the oligomerization of S1P1 receptors.(75)

A common theme that emerges from both experimental and theoretical results is the dual modulatory effect of lipids on GPCRs, by way of either altering membrane bulk properties or specific interactions with the receptor. While quantifying this effect and the contributions from each component is challenging for GPCRs, current evidence suggests the modulatory role of membrane lipids on receptor function and activity is achieved primarily by specific lipid-protein interactions. However, to make matters more complicated, that does not seem to be the case for rhodopsin, where the modulatory role is currently mainly attributed to local membrane curvature instead.(111)

It is challenging to compare the lipid – protein interaction results across GPCR members given the differences in experimental and simulation setups employed. The current literature suggests that extending results from one GPCR to others, simply based on structural similarity and sequence identity might not be enough. MD simulation data show that many GPCRs interact preferentially with cholesterol through several interaction sites. Some of these are likely of functional importance, others seem not to be. Recent simulations show that microseconds are required to obtain statistically significant results on cholesterol – GPCR interactions, a bar that most of the older papers do not reach.

GPCR oligomerization is a topic of debate, and caution is required when interpreting results from MD simulations. Regardless, it seems that insofar as GPCR association is a true phenomenon, it is regulated by and dependent on specific membrane lipids (cholesterol, DHA) and membrane physical properties (hydrophobic mismatch, lipid order/disorder) that act as molecular driving forces.

2.5 References

1. Isberg V., S. Mordalski, C. Munk, K. Rataj, K. Harpsoe, A. S. Hauser, B. Vroling, A. J. Bojarski, G. Vriend, D. E. Gloriam. GPCRdb: an information system for G protein-coupled receptors. Nucleic Acids Res. 2016;44(D1):D356-D364.

2. Okada T., M. Sugihara, A. N. Bondar, M. Elstner, P. Entel, V. Buss. The retinal conformation and its environment in rhodopsin in light of a new 2.2 A crystal structure. J Mol Biol. 2004;342(2):571-583.

3. Rasmussen S. G., B. T. DeVree, Y. Zou, A. C. Kruse, K. Y. Chung, T. S. Kobilka, F. S. Thian, P. S. Chae, E. Pardon, D. Calinski, J. M. Mathiesen, S. T. Shah, J. A. Lyons, M. Caffrey, S. H. Gellman, J. Steyaert, G. Skiniotis, W. I. Weis, R. K. Sunahara, B. K. Kobilka. Crystal structure of the beta2 adrenergic receptor-Gs protein complex. Nature. 2011;477(7366):549-555.

4. Byrne E. F. X., R. Sircar, P. S. Miller, G. Hedger, G. Luchetti, S. Nachtergaele, M. D. Tully, L. Mydock-McGrane, D. F. Covey, R. P. Rambo, M. S. P. Sansom, S. Newstead, R. Rohatgi, C. Siebold. Structural basis of Smoothened regulation by its extracellular domains. Nature. 2016;535(7613):517-522.

5. Mitchell D. C., M. Straume, J. L. Miller, B. J. Litman. Modulation of metarhodopsin formation by cholesterol-induced ordering of bilayer lipids. Biochemistry. 1990;29(39):9143-9149.

6. Albert A. D., J. E. Young, P. L. Yeagle. Rhodopsin-cholesterol interactions in bovine rod outer segment disk membranes. Biochim Biophys Acta - Biomembr. 1996;1285(1):47-55.

7. Gimpl G., U. Klein, H. Reilander, F. Fahrenholz. Expression of the human oxytocin receptor in baculovirus-infected insect cells - high-affinity binding is induced by a cholesterol cyclodextrin complex. Biochemistry. 1995;34(42):13794-13801.

8. Gimpl G., K. Burger, F. Fahrenholz. Cholesterol as modulator of receptor function. Biochemistry. 1997;36(36):10959-10974.

9. Harikumar K. G., V. Puri, R. D. Singh, K. Hanada, R. E. Pagano, L. J. Miller. Differential effects of modification of membrane cholesterol and sphingolipids on the conformation, function, and trafficking of the G protein-coupled cholecystokinin receptor. J Biol Chem. 2005;280(3):2176-2185.

10. Potter R. M., K. G. Harikumar, S. V. Wu, L. J. Miller. Differential sensitivity of types 1 and 2 cholecystokinin receptors to membrane cholesterol. J Lipid Res. 2012;53(1):137-148.

11. Harikumar K. G., R. M. Potter, A. Patil, V. Echeveste, L. J. Miller. Membrane Cholesterol Affects Stimulus-Activity Coupling in Type 1, but not Type 2, CCK Receptors: Use of Cell Lines with Elevated Cholesterol. Lipids. 2013;48(3):231-244.

12. Cherezov V., D. M. Rosenbaum, M. A. Hanson, S. G. F. Rasmussen, F. S. Thian, T. S. Kobilka, H. J. Choi, P. Kuhn, W. I. Weis, B. K. Kobilka, R. C. Stevens. High-resolution crystal structure of an engineered human beta(2)-adrenergic G protein-coupled receptor. Science. 2007;318(5854):1258-1265.

13. Hanson M. A., V. Cherezov, M. T. Griffith, C. B. Roth, V. P. Jaakola, E. Y. T. Chien, J. Velasquez, P. Kuhn, R. C. Stevens. A specific cholesterol binding site is established by the 2.8 angstrom structure of the human beta(2)-adrenergic receptor. Structure. 2008;16(6):897-905.

14. Wacker D., G. Fenalti, M. A. Brown, V. Katritch, R. Abagyan, V. Cherezov, R. C. Stevens. Conserved Binding Mode of Human beta(2) Adrenergic Receptor Inverse Agonists and Antagonist Revealed by X-ray Crystallography. J Am Chem Soc. 2010;132(33):11443-11445.

15. Warne T., R. Moukhametzianov, J. G. Baker, R. Nehme, P. C. Edwards, A. G. W. Leslie, G. F. X. Schertler, C. G. Tate. The structural basis for agonist and partial agonist action on a beta(1)-adrenergic receptor. Nature. 2011;469(7329):241-244.

16. Rosenbaum D. M., C. Zhang, J. A. Lyons, R. Holl, D. Aragao, D. H. Arlow, S. G. Rasmussen, H. J. Choi, B. T. Devree, R. K. Sunahara, P. S. Chae, S. H. Gellman, R. O. Dror, D. E. Shaw, W. I. Weis, M. Caffrey, P. Gmeiner, B. K. Kobilka. Structure and function of an irreversible agonist-beta(2) adrenoceptor complex. Nature. 2011;469(7329):236-240.

17. Liu W., E. Chun, A. A. Thompson, P. Chubukov, F. Xu, V. Katritch, G. W. Han, C. B. Roth, L. H. Heitman, A. P. Ijzerman, V. Cherezov, R. C. Stevens. Structural Basis for Allosteric Regulation of GPCRs by Sodium Ions. Science. 2012;337(6091):232-236.

18. Manglik A., A. C. Kruse, T. S. Kobilka, F. S. Thian, J. M. Mathiesen, R. K. Sunahara, L. Pardo, W. I. Weis, B. K. Kobilka, S. Granier. Crystal structure of the mu-opioid receptor bound to a morphinan antagonist. Nature. 2012;485(7398):321-326.

19. Wacker D., C. Wang, V. Katritch, G. W. Han, X. P. Huang, E. Vardy, J. D. McCorvy, Y. Jiang, M. H. Chu, F. Y. Siu, W. Liu, H. E. Xu, V. Cherezov, B. L. Roth, R. C. Stevens. Structural Features for Functional Selectivity at Serotonin Receptors. Science. 2013;340(6132):615-619.

20. Wu Q. Y., Q. Liang. Interplay between Curvature and Lateral Organization of Lipids and Peptides/Proteins in Model Membranes. Langmuir. 2014;30(4):1116-1122.

21. Zhang K., J. Zhang, Z. G. Gao, D. Zhang, L. Zhu, G. W. Han, S. M. Moss, S. Paoletta, E. Kiselev, W. Lu, G. Fenalti, W. Zhang, C. E. Muller, H. Yang, H. Jiang, V. Cherezov, V. Katritch, K. A. Jacobson, R. C. Stevens, B. Wu, et al. Structure of the human P2Y12 receptor in complex with an antithrombotic drug. Nature. 2014;509(7498):115-118.

22. Huang P., D. Nedelcu, M. Watanabe, C. Jao, Y. Kim, J. Liu, A. Salic. Cellular Cholesterol Directly Activates Smoothened in Hedgehog Signaling. Cell. 2016;166(5):1176-1187 e1114.

23. Luchetti G., R. Sircar, J. H. Kong, S. Nachtergaele, A. Sagner, E. F. Byrne, D. F. Covey, C. Siebold, R. Rohatgi. Cholesterol activates the G-protein coupled receptor Smoothened to promote Hedgehog signaling. Elife. 2016;5:e20304.

24. Xiao X., J. J. Tang, C. Peng, Y. Wang, L. Fu, Z. P. Qiu, Y. Xiong, L. F. Yang, H. W. Cui, X. L. He, L. Yin, W. Qi, C. C. Wong, Y. Zhao, B. L. Li, W. W. Qiu, B. L. Song. Cholesterol Modification of Smoothened Is Required for Hedgehog Signaling. Mol Cell. 2017;66(1):154-162.e110.

25. Myers B. R., L. Neahring, Y. X. Zhang, K. J. Roberts, P. A. Beachy. Rapid, direct activity assays for Smoothened reveal Hedgehog pathway regulation by membrane cholesterol and extracellular sodium. Proc Natl Acad Sci U S A. 2017;114(52):E11141-E11150.

26. Fredriksson R., M. C. Lagerstrom, L. G. Lundin, H. B. Schioth. The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol Pharmacol. 2003;63(6):1256-1272.

27. Periole X. Interplay of G Protein-Coupled Receptors with the Membrane: Insights from Supra-Atomic Coarse Grain Molecular Dynamics Simulations. Chem Rev. 2017;117(1):156-185.

28. Gimpl G. Interaction of G protein coupled receptors and cholesterol. Chem Phys Lipids. 2016;199:61-73.

29. Genheden S., J. W. Essex, A. G. Lee. G protein coupled receptor interactions with cholesterol deep in the membrane. Biochim Biophys Acta - Biomembr. 2017;1859(2):268-281.

30. Sengupta D., A. Chattopadhyay. Molecular dynamics simulations of GPCR-cholesterol interaction: An emerging paradigm. Biochim Biophys Acta - Biomembr. 2015;1848(9):1775-1782.

31. Parrill A. L., G. Tigyi. Integrating the puzzle pieces: the current atomistic picture of phospholipid-G protein coupled receptor interactions. Biochim Biophys Acta - Mol Cell Biol Lipids. 2013;1831(1):2-12.

32. Mondal S., G. Khelashvili, H. Weinstein. Not just an oil slick: how the energetics of protein-membrane interactions impacts the function and organization of transmembrane proteins. Biophys J. 2014;106(11):2305-2316.

33. Vickery O. N., J. P. Machtens, U. Zachariae. Membrane potentials regulating GPCRs: insights from experiments and molecular dynamics simulations. Curr Opin Pharmacol. 2016;30:44-50.

34. Ballesteros J. A., H. Weinstein. [19] Integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in G protein-coupled receptors. In: Sealfon SC, editor. Methods in Neurosciences. 25: Academic Press; 1995. p. 366-428.

35. Grossfield A., S. E. Feller, M. C. Pitman. A role for direct interactions in the modulation of rhodopsin by omega-3 polyunsaturated lipids. Proc Natl Acad Sci U S A. 2006;103(13):4888-4893.

36. Khelashvili G., A. Grossfield, S. E. Feller, M. C. Pitman, H. Weinstein. Structural and dynamic effects of cholesterol at preferred sites of interaction with rhodopsin identified from microsecond length molecular dynamics simulations. Proteins: Struct Funct Bioinform. 2009;76(2):403-417.

37. Horn J. N., T. C. Kao, A. Grossfield. Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids and Cholesterol with Rhodopsin. In: Filizola M, editor. G Protein-Coupled Receptors - Modeling and Simulation. Advances in Experimental Medicine and Biology. 796. Berlin: Springer-Verlag Berlin; 2014. p. 75-94.

38. Teague W. E., O. Soubias, H. Petrache, N. Fuller, K. G. Hines, R. P. Rand, K. Gawrisch. Elastic properties of polyunsaturated phosphatidylethanolamines influence rhodopsin function. Faraday Discuss. 2013;161:383-395.

39. Salas-Estrada L. A., N. Leioatts, T. D. Romo, A. Grossfield. Lipids Alter Rhodopsin Function via Ligand-like and Solvent-like Interactions. Biophys J. 2018;114(2):355-367.

40. Cang X. H., Y. Du, Y. Y. Mao, Y. Y. Wang, H. Y. Yang, H. L. Jiang. Mapping the Functional Binding Sites of Cholesterol in beta(2)-Adrenergic Receptor by Long-Time Molecular Dynamics Simulations. J Phys Chem B. 2013;117(4):1085-1094.

41. Prasanna X., A. Chattopadhyay, D. Sengupta. Cholesterol modulates the dimer interface of the beta(2)-adrenergic receptor via cholesterol occupancy sites. Biophys J. 2014;106(6):1290-1300.

42. Gater D. L., O. Saurel, I. Iordanov, W. Liu, V. Cherezov, A. Milon. Two Classes of Cholesterol Binding Sites for the beta(2)AR Revealed by Thermostability and NMR. Biophys J. 2014;107(10):2305-2312.

43. Cang X. H., L. L. Yang, J. Yang, C. Luo, M. Y. Zheng, K. Q. Yu, H. Y. Yang, H. L. Jiang. Cholesterol-beta(1)AR interaction versus cholesterol-beta(2)AR interaction. Proteins: Struct Funct Bioinform. 2014;82(5):760-770.

44. Zocher M., C. Zhang, S. G. F. Rasmussen, B. K. Kobilka, D. J. Muller. Cholesterol increases kinetic, energetic, and mechanical stability of the human beta(2)-adrenergic receptor. Proc Natl Acad Sci U S A. 2012;109(50):E3463-E3472.

45. Dawaliby R., C. Trubbia, C. Delporte, M. Masureel, P. Van Antwerpen, B. K. Kobilka, C. Govaerts. Allosteric regulation of G protein-coupled receptor activity by phospholipids. Nat Chem Biol. 2016;12(1):35-39.

46. Casiraghi M., M. Damian, E. Lescop, E. Point, K. Moncoq, N. Morellet, D. Levy, J. Marie, E. Guittet, J. L. Baneres, L. J. Catoire. Functional Modulation of a G Protein-Coupled Receptor Conformational Landscape in a Lipid Bilayer. J Am Chem Soc. 2016;138(35):11170-11175.

47. Manna M., M. Niemela, J. Tynkkynen, M. Javanainen, W. Kulig, D. J. Muller, T. Rog, I. Vattulainen. Mechanism of allosteric regulation of beta2-adrenergic receptor by cholesterol. Elife. 2016;5:e18432.

48. Inagaki S., R. Ghirlando, J. F. White, J. Gvozdenovic-Jeremic, J. K. Northup, R. Grisshammer. Modulation of the Interaction between Neurotensin Receptor NTS1 and Gq Protein by Lipid. J Mol Biol. 2012;417(1-2):95-111.

49. Kimura T., A. A. Yeliseev, K. Vukoti, S. D. Rhodes, K. Cheng, K. C. Rice, K. Gawrisch. Recombinant Cannabinoid Type 2 Receptor in Liposome Model Activates G Protein in Response to Anionic Lipid Constituents. J Biol Chem. 2012;287(6):4076-4087.

50. Neale C., H. D. Herce, R. Pomes, A. E. Garcia. Can Specific Protein-Lipid Interactions Stabilize an Active State of the Beta 2 Adrenergic Receptor? Biophys J. 2015;109(8):1652-1662.

51. Bruzzese A., C. Gil, J. A. R. Dalton, J. Giraldo. Structural insights into positive and negative allosteric regulation of a G protein-coupled receptor through protein-lipid interactions. Sci Rep. 2018;8(1):4456.

52. Corradi V., E. Mendez-Villuendas, H. I. Ingólfsson, R.-X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagné, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. Delgado Magnero, S. J. Marrink, D. P. Tieleman. Lipid–Protein Interactions Are Unique Fingerprints for Membrane Proteins. ACS Cent Sci. 2018;4(6):709-717.

53. Marino K. A., D. Prada-Gracia, D. Provasi, M. Filizola. Impact of Lipid Composition and Receptor Conformation on the Spatio-temporal Organization of mu-Opioid Receptors in a Multi-component Plasma Membrane Model. PLoS Comput Biol. 2016;12(12):e1005240.

54. Yen H. Y., K. K. Hoi, I. Liko, G. Hedger, M. R. Horrell, W. Song, D. Wu, P. Heine, T. Warne, Y. Lee, B. Carpenter, A. Pluckthun, C. G. Tate, M. S. P. Sansom, C. V. Robinson. PtdIns(4,5)P2 stabilizes active states of GPCRs and enhances selectivity of G-protein coupling. Nature. 2018;559(7714):423-427.

55. Zezula J., M. Freissmuth. The A(2A)-adenosine receptor: a GPCR with unique features? British Journal of Pharmacology. 2008;153:S184-S190.

56. Jaakola V. P., M. T. Griffith, M. A. Hanson, V. Cherezov, E. Y. T. Chien, J. R. Lane, A. P. Ijzerman, R. C. Stevens. The 2.6 Angstrom Crystal Structure of a Human A(2A) Adenosine Receptor Bound to an Antagonist. Science. 2008;322(5905):1211-1217.

57. Lyman E., C. Higgs, B. Kim, D. Lupyan, J. C. Shelley, R. Farid, G. A. Voth. A Role for a Specific Cholesterol Interaction in Stabilizing the Apo Configuration of the Human A(2A) Adenosine Receptor. Structure. 2009;17(12):1660-1668.

58. Lee J. Y., E. Lyman. Predictions for Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. J Am Chem Soc. 2012;134(40):16512-16515.

59. Ng H. W., C. A. Laughton, S. W. Doughty. Molecular Dynamics Simulations of the Adenosine A2a Receptor in POPC and POPE Lipid Bilayers: Effects of Membrane on Protein Behavior. J Chem Inf Model. 2014;54(2):573-581.

60. Rouviere E., C. Arnarez, L. W. Yang, E. Lyman. Identification of Two New Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. Biophys J. 2017;113(11):2415-2424.

61. Shrivastava S., T. J. Pucadyil, Y. D. Paila, S. Ganguly, A. Chattopadhyay. Chronic Cholesterol Depletion Using Statin Impairs the Function and Dynamics of Human Serotonin(1A) Receptors. Biochemistry. 2010;49(26):5426-5435.

62. Saxena R., A. Chattopadhyay. Membrane cholesterol stabilizes the human serotonin(1A) receptor. Biochim Biophys Acta - Biomembr. 2012;1818(12):2936-2942.

63. Gutierrez M. G., K. S. Mansfield, N. Malmstadt. The Functional Activity of the Human Serotonin 5-HT1A Receptor Is Controlled by Lipid Bilayer Composition. Biophys J. 2016;110(11):2486-2495.

64. Sengupta D., A. Chattopadhyay. Identification of cholesterol binding sites in the serotonin1A receptor. J Phys Chem B. 2012;116(43):12991-12996.

65. Patra S. M., S. Chakraborty, G. Shahane, X. Prasanna, D. Sengupta, P. K. Maiti, A. Chattopadhyay. Differential dynamics of the serotonin1A receptor in membrane bilayers of varying cholesterol content revealed by all atom molecular dynamics simulation. Mol Membr Biol. 2015;32(4):127-137.

66. Ramirez-Anguita J. M., I. Rodriguez-Espigares, R. Guixa-Gonzalez, A. Bruno, M. Torrens-Fontanals, A. Varela-Rial, J. Selent. Membrane cholesterol effect on the 5-HT2A receptor: Insights into the lipid-induced modulation of an antipsychotic drug target. Biotechnol Appl Biochem. 2018;65(1):29-37.

67. Prasanna X., M. Jafurulla, D. Sengupta, A. Chattopadhyay. The ganglioside GM1 interacts with the serotonin1A receptor via the sphingolipid binding domain. Biochim Biophys Acta - Biomembr. 2016;1858(11):2818-2826.

68. Jafurulla M., S. Bandari, T. J. Pucadyil, A. Chattopadhyay. Sphingolipids modulate the function of human serotonin(1A) receptors: Insights from sphingolipid-deficient cells. Biochim Biophys Acta - Biomembr. 2017;1859(4):598-604.

69. Shan J., G. Khelashvili, S. Mondal, E. L. Mehler, H. Weinstein. Ligand-dependent conformations and dynamics of the serotonin 5-HT(2A) receptor determine its activation and membrane-driven oligomerization properties. PLoS Comput Biol. 2012;8(4):e1002473.

70. Hedger G., H. Koldso, M. Chavent, C. Siebold, R. Rohatgi, M. S. P. Sansom. Cholesterol Interaction Sites on the Transmembrane Domain of the Hedgehog Signal Transducer and Class F G Protein-Coupled Receptor Smoothened. Structure. 2018.

71. Oates J., B. Faust, H. Attrill, P. Harding, M. Orwick, A. Watts. The role of cholesterol on the activity and stability of neurotensin receptor 1. Biochim Biophys Acta - Biomembr. 2012;1818(9):2228-2233.

72. Bolivar J. H., J. C. Munoz-Garcia, T. Castro-Dopico, P. M. Dijkman, P. J. Stansfeld, A. Watts. Interaction of lipids with the neurotensin receptor 1. Biochim Biophys Acta - Biomembr. 2016;1858(6):1278-1287.

73. Dijkman P. M., A. Watts. Lipid modulation of early G protein-coupled receptor signalling events. Biochim Biophys Acta - Biomembr. 2015;1848(11 Pt A):2889-2897.

74. Oddi S., E. Dainese, F. Fezza, M. Lanuti, D. Barcaroli, V. De Laurenzi, D. Centonze, M. Maccarrone. Functional characterization of putative cholesterol binding sequence (CRAC) in human type-1 cannabinoid receptor. J Neurochem. 2011;116(5):858-865.

75. Koldso H., M. S. P. Sansom. Organization and Dynamics of Receptor Proteins in a Plasma Membrane. J Am Chem Soc. 2015;137(46):14694-14704.

76. Mystek P., P. Dutka, M. Tworzydlo, M. Dziedzicka-Wasylewska, A. Polit. The role of cholesterol and sphingolipids in the dopamine D-1 receptor and G protein distribution in the plasma membrane. Biochim Biophys Acta - Mol Cell Biol Lipids. 2016;1861(11):1775-1786.

77. Zhukovsky M. A., P. H. Lee, A. Ott, V. Helms. Putative cholesterol-binding sites in human immunodeficiency virus (HIV) coreceptors CXCR4 and CCR5. Proteins. 2013;81(4):555-567.

78. Kumari R., C. Castillo, A. Francesconi. Agonist-dependent Signaling by Group I Metabotropic Glutamate Receptors Is Regulated by Association with Lipid Domains. J Biol Chem. 2013;288(44):32004-32019.

79. Pydi S. P., M. Jafurulla, L. S. Wai, R. P. Bhullar, P. Chelikani, A. Chattopadhyay. Cholesterol modulates bitter taste receptor function. Biochim Biophys Acta - Biomembr. 2016;1858(9):2081-2087.

80. Hildebrand P. W., P. Scheerer, J. H. Park, H. W. Choe, R. Piechnick, O. P. Ernst, K. P. Hofmann, M. Heck. A Ligand Channel through the G Protein Coupled Receptor Opsin. PLoS One. 2009;4(2):e4382.

81. Hurst D. P., A. Grossfield, D. L. Lynch, S. Feller, T. D. Romo, K. Gawrisch, M. C. Pitman, P. H. Reggio. A Lipid Pathway for Ligand Binding Is Necessary for a Cannabinoid G Protein-coupled Receptor. J Biol Chem. 2010;285(23):17954-17964.

82. Caliman A. D., Y. L. Miao, J. A. McCammon. Activation Mechanisms of the First Sphingosine-1-Phosphate Receptor. Protein Sci. 2017;26(6):1150-1160.

83. Guixa-Gonzalez R., J. L. Albasanz, I. Rodriguez-Espigares, M. Pastor, F. Sanz, M. Marti-Solano, M. Manna, H. Martinez-Seara, P. W. Hildebrand, M. Martin, J. Selent. Membrane cholesterol access into a G-protein-coupled receptor. Nat Commun. 2017;8:14505.

84. Park J. H., P. Scheerer, K. P. Hofmann, H. W. Choe, O. P. Ernst. Crystal structure of the ligand-free G-protein-coupled receptor opsin. Nature. 2008;454(7201):183-U133.

85. Menon I., T. Huber, S. Sanyal, S. Banerjee, P. Barre, S. Canis, J. D. Warren, J. Hwa, T. P. Sakmar, A. K. Menon. Opsin Is a Phospholipid Flippase. Curr Biol. 2011;21(2):149-153.

86. Goren M. A., T. Morizumi, I. Menon, J. S. Joseph, J. S. Dittman, V. Cherezov, R. C. Stevens, O. P. Ernst, A. K. Menon. Constitutive phospholipid scramblase activity of a G protein-coupled receptor. Nat Commun. 2014;5:5115.

87. Morra G., A. M. Razavi, K. Pandey, H. Weinstein, A. K. Menon, G. Khelashvili. Mechanisms of Lipid Scrambling by the G Protein-Coupled Receptor Opsin. Structure. 2018;26(2):356-+.

88. Maurice P., M. Kamal, R. Jockers. Asymmetry of GPCR oligomers supports their functional relevance. Trends Pharmacol Sci. 2011;32(9):514-520.

89. Gaitonde S. A., J. Gonzalez-Maeso. Contribution of heteromerization to G protein-coupled receptor function. Curr Opin Pharmacol. 2017;32:23-31.

90. Kasai R. S., A. Kusumi. Single-molecule imaging revealed dynamic GPCR dimerization. Curr Opin Cell Biol. 2014;27:78-86.

91. Kniazeff J., L. Prezeau, P. Rondard, J. P. Pin, C. Goudet. Dimers and beyond: The functional puzzles of class C GPCRs. Pharmacol Ther. 2011;130(1):9-25.

92. Chabre M., P. Deterre, B. Antonny. The apparent cooperativity of some GPCRs does not necessarily imply dimerization. Trends Pharmacol Sci. 2009;30(4):182-187.

93. Gurevich V. V., E. V. Gurevich. GPCRs and Signal Transducers: Interaction Stoichiometry. Trends Pharmacol Sci. 2018;39(7):672-684.

94. Farran B. An update on the physiological and therapeutic relevance of GPCR oligomers. Pharmacol Res. 2017;117:303-327.

95. Gahbauer S., R. A. Bockmann. Membrane-Mediated Oligomerization of G Protein Coupled Receptors and Its Implications for GPCR Function. Front Physiol. 2016;7:494.

96. Franco R., E. Martinez-Pinilla, J. L. Lanciego, G. Navarro. Basic Pharmacological and Structural Evidence for Class A G-Protein-Coupled Receptor Heteromerization. Front Pharmacol. 2016;7:76.

97. Wu B., E. Y. Chien, C. D. Mol, G. Fenalti, W. Liu, V. Katritch, R. Abagyan, A. Brooun, P. Wells, F. C. Bi, D. J. Hamel, P. Kuhn, T. M. Handel, V. Cherezov, R. C. Stevens. Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists. Science. 2010;330(6007):1066-1071.

98. Periole X., T. Huber, S. J. Marrink, T. P. Sakmar. G protein-coupled receptors self-assemble in dynamics simulations of model bilayers. J Am Chem Soc. 2007;129(33):10126-10132.

99. Knepp A. M., X. Periole, S. J. Marrink, T. P. Sakmar, T. Huber. Rhodopsin Forms a Dimer with Cytoplasmic Helix 8 Contacts in Native Membranes. Biochemistry. 2012;51(9):1819-1821.

100. Periole X., A. M. Knepp, T. P. Sakmar, S. J. Marrink, T. Huber. Structural Determinants of the Supramolecular Organization of G Protein-Coupled Receptors in Bilayers. J Am Chem Soc. 2012;134(26):10959-10965.

101. Soubias O., W. E. Teague, K. G. Hines, K. Gawrisch. Rhodopsin/Lipid Hydrophobic Matching-Rhodopsin Oligomerization and Function. Biophys J. 2015;108(5):1125-1132.

102. Mondal S., J. M. Johnston, H. Wang, G. Khelashvili, M. Filizola, H. Weinstein. Membrane driven spatial organization of GPCRs. Sci Rep. 2013;3:2909.

103. Provasi D., J. M. Johnston, M. Filizola. Lessons from Free Energy Simulations of delta-Opioid Receptor Homodimers Involving the Fourth Transmembrane Helix. Biochemistry. 2010;49(31):6771-6776.

104. Johnston J. M., M. Aburi, D. Provasi, A. Bortolato, E. Urizar, N. A. Lambert, J. A. Javitch, M. Filizola. Making Structural Sense of Dimerization Interfaces of Delta Opioid Receptor Homodimers. Biochemistry. 2011;50(10):1682-1690.

105. Provasi D., M. B. Boz, J. M. Johnston, M. Filizola. Preferred supramolecular organization and dimer interfaces of opioid receptors from simulated self-association. PLoS Comput Biol. 2015;11(3):e1004148.

106. Zheng H., E. A. Pearsall, D. P. Hurst, Y. Zhang, J. Chu, Y. Zhou, P. H. Reggio, H. H. Loh, P. Y. Law. Palmitoylation and membrane cholesterol stabilize mu-opioid receptor homodimerization and G protein coupling. BMC Cell Biol. 2012;13:6.

107. Guixa-Gonzalez R., M. Javanainen, M. Gomez-Soler, B. Cordobilla, J. C. Domingo, F. Sanz, M. Pastor, F. Ciruela, H. Martinez-Seara, J. Selent. Membrane omega-3 fatty acids modulate the oligomerisation kinetics of adenosine A2A and dopamine D2 receptors. Sci Rep. 2016;6:19839.

108. Prasanna X., D. Sengupta, A. Chattopadhyay. Cholesterol-dependent Conformational Plasticity in GPCR Dimers. Sci Rep. 2016;6:31858.

109. Pluhackova K., S. Gahbauer, F. Kranz, T. A. Wassenaar, R. A. Bockmann. Dynamic Cholesterol-Conditioned Dimerization of the G Protein Coupled Chemokine Receptor Type 4. PLoS Comput Biol. 2016;12(11):e1005169.

110. Gahbauer S., K. Pluhackova, R. A. Bockmann. Closely related, yet unique: Distinct homo- and heterodimerization patterns of G protein coupled chemokine receptors and their fine-tuning by cholesterol. PLoS Comput Biol. 2018;14(3).

111. Soubias O., K. Gawrisch. The role of the lipid matrix for structure and function of the GPCR rhodopsin. Biochim Biophys Acta. 2012;1818(2):234-240.

Chapter Three: Lipid-protein interactions are unique fingerprints for AMPA receptors

Copyright

The data used in this chapter was published in the following article:

Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. ACS Cent Sci. 2018;4(6):709-717.

The analysis presented here and the figures generated in the process, however, are entirely new and differ from the published work, as do the conclusions drawn from them.

Contributions

The following chapter details part of my contributions to the above paper. The simulations presented here were initiated by a former colleague and continued by me. The analysis of the data and its presentation herein are also my own contributions. All work was carried out with the input and guidance of my supervisor.

Abbreviations

All abbreviations are defined at first use.

3.1 Introduction

Complex human activities such as thought and introspection are the combined result of the activity of receptors lining the surfaces of pre- and post-synaptic neurons. In response to the charge gradient created by an action potential, neurotransmitters are released into the synaptic cleft and diffuse towards the dendrites of the post-synaptic neuron, where they bind to their target receptors(1, 2). The most common of these neurotransmitters is glutamate. Receptors that bind glutamate and elicit a downstream signaling response are called metabotropic glutamate receptors (mGluRs), whereas those that mediate ion exchange are called ionotropic glutamate receptors (iGluRs). The former belong to the G protein-coupled receptors (class C), whereas iGluRs are ligand-gated ion channels. Based on their evolutionary relationship and pharmacology, iGluRs are categorized into four types: α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, N-methyl-D-aspartate (NMDA) receptors, kainate receptors, and delta receptors(3, 4). iGluRs mediate the majority of excitatory neurotransmission in the central nervous system and are involved in almost all aspects of its function and development. Consequently, their importance to neurobiology and medicine is paramount, especially owing to their implication in neurodegenerative and psychiatric disorders(5). Biochemically, AMPA receptors are heterotetramers, with each subunit composed of a transmembrane domain (TMD) and a large extracellular domain, which is further divided into two domains: an amino-terminal domain (ATD) that is responsible for receptor trafficking and assembly, and a ligand-binding domain (LBD) that binds glutamate (as well as agonists and competitive antagonists)(5, 6) and induces structural changes in the TMD. The TMD is embedded in the membrane and forms the ion channel part of the receptor.
The overall architecture of AMPA receptors, depending on how axes of symmetry are drawn, can feature a two- and four-fold symmetry between its domain units, something that is unique in structural biology(5).

Upon glutamate binding, the LBD of AMPA receptors undergoes a conformational change that triggers the opening of the ion channel and allows the influx of sodium ions and the efflux of potassium ions, leading to depolarization of the post-synaptic neuron. The details of this process remain a major focus of research, however, as AMPA receptors can also allow calcium influx, which has important functional ramifications(7). Considering current evidence highlighting the importance of lipid-protein interactions(8-10), one aspect of AMPA receptor activity that has not yet been explored is how the lipid environment surrounding the membrane-spanning ion channel may influence the structure and activity of the receptor. Ion channels in general, and ligand-gated ion channels in particular, have repeatedly been shown to form specific interactions with lipids such as cholesterol and PIP lipids. Kv1.2 channels, for example, display a non-localized enrichment of polyunsaturated lipids and a localized, clearly defined enrichment of cholesterol(11). The functional implications of these interactions are still largely undefined, despite experimental studies underlining the impact of membrane lipids on ion channel activity(12). In the context of lipid-protein interactions, AMPA receptors are an outlier insofar as very little information is available regarding their interactions with membrane lipids.

To reduce this deficit of information on ion channel-lipid interactions, we simulated the AMPA receptor using a coarse-grained approach in a complex membrane environment. We show that AMPA receptors, not unlike other ligand-gated ion channels, interact specifically with cholesterol. However, we find that the primary lipids involved in specific interactions are diacylglycerol (DAG) lipids. Considering the positioning of these interaction sites, we postulate that these interactions are functionally involved in the activity of AMPA receptors.

3.2 Methods

We retrieved the protein structure from the Protein Data Bank (PDB ID: 3KG2)(5). The simulations were carried out using the MARTINI model(13). All ligands were removed and missing atoms were added. The structure was converted into a coarse-grained representation using the martinize tool as described on the MARTINI website (cgmartini.nl).

Four copies of the coarse-grained protein were placed in a 40 × 40 nm² area, which was then filled with lipids using the tool insane(14). The complex membrane setup is composed of 63 different lipid types, based on the model developed by Ingólfsson et al.(15) and applied to 10 different proteins(16). The exact ratios for each lipid are provided there(16); the following is a brief summary of the lipid composition. The membrane model contains an asymmetric distribution of the following major lipid groups: cholesterol (CHOL), phosphatidylcholine (PC), phosphatidylethanolamine (PE), and sphingomyelin (SM) placed in both leaflets; gangliosides placed exclusively in the upper leaflet; and the charged lipids phosphatidylserine (PS), phosphatidic acid (PA), and phosphatidylinositol (PI), along with PI-phosphate, -bisphosphate, and -trisphosphate lipids (PIPs), placed exclusively in the lower leaflet. Complete details on the lipids used can be found on the MARTINI lipidome webpage (cgmartini.nl/index.php/force-field-parameters/lipids).

System equilibration was done using a gradual stepwise procedure, reducing the number and strength of position restraints on the proteins. Simulations were performed using a 20 fs time step. A target temperature of 310 K was maintained with a velocity-rescaling thermostat(17) and a coupling time constant of 1 ps. Pressure was maintained with the Berendsen barostat(18), applied semi-isotropically at 1 bar with a compressibility of 3 × 10⁻⁴ bar⁻¹ and a relaxation time constant of 5 ps. A small force constant of 1 kJ mol⁻¹ nm⁻² on backbone beads was maintained during the simulation to prevent protein-protein interactions. The system was simulated for 48 μs, with the last 5 μs used for the calculations (unless stated otherwise). All simulations were carried out using the GROMACS 4.6.x package(19). For the first 38 μs of the simulation, no elastic network was applied. This was corrected for the last 10 μs of the trajectory.
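The run settings quoted above can be collected into a short parameter sketch. The example below is illustrative only: it is not the input file used for this work. The keys are standard GROMACS .mdp option names, but the mapping from the text to those options, the use of the same pressure and compressibility values for the lateral and normal directions, and the helper names `steps_for` and `build_mdp_fragment` are assumptions made for the example. Settings not stated in the text (coupling groups, cutoffs, output control) are omitted.

```python
# Values quoted in the Methods text.
TIME_STEP_FS = 20          # integration time step
TOTAL_TIME_US = 48         # total simulated time
ANALYSIS_WINDOW_US = 5     # trailing window used for analysis
FS_PER_US = 1_000_000_000  # 1 microsecond = 1e9 femtoseconds


def steps_for(duration_us, dt_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_us microseconds."""
    return duration_us * FS_PER_US // dt_fs


def build_mdp_fragment():
    """Return the coupling settings as .mdp-style 'key = value' lines.

    This is a fragment, not a complete input file (hypothetical helper).
    """
    params = {
        "dt": TIME_STEP_FS / 1000,       # ps
        "nsteps": steps_for(TOTAL_TIME_US),
        "tcoupl": "v-rescale",
        "tau_t": 1.0,                    # ps
        "ref_t": 310,                    # K
        "pcoupl": "berendsen",
        "pcoupltype": "semiisotropic",
        "tau_p": 5.0,                    # ps
        "ref_p": "1.0 1.0",              # bar (xy, z); equal values assumed
        "compressibility": "3e-4 3e-4",  # bar^-1 (xy, z); equal values assumed
    }
    return "\n".join(f"{key:<16}= {value}" for key, value in params.items())


if __name__ == "__main__":
    print(build_mdp_fragment())
    print("steps in analysis window:", steps_for(ANALYSIS_WINDOW_US))
```

With a 20 fs time step, the 48 μs trajectory corresponds to 2.4 × 10⁹ integration steps, and the 5 μs analysis window to 2.5 × 10⁸ steps.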

Visualizations were produced using either VMD(20) or the NGL Viewer(21, 22).

3.3 Results

3.3.1 AMPA receptors interact with different lipid types at the membrane spanning domain

The AMPA receptor interaction profile with lipids features the enrichment or depletion of different lipid types at the transmembrane domain. We show this in Figure 3-1 using radial distribution functions (rdf) of different lipid groups. The results are calculated for the upper and lower leaflets separately. Because DAG lipids flip-flop between leaflets, their rdf is calculated for both leaflets combined. The most noticeable observation is the strong clustering of DAG lipids around the receptor, surpassing even that of GM lipids in the upper leaflet. The aggregation of GM lipids in itself fits the common theme of protein-GM lipid interactions, whereby GM lipids have a pronounced tendency to cluster around the circumference of embedded proteins and interact with surface-exposed positively charged or polar residues.

In contrast to the upper leaflet, the lower leaflet does not show such a high preferential localization of any lipid type, although negatively charged lipids (PA, PI and PIPs) do show a very modest clustering, higher than that of their neutral counterparts (especially when compared to their low concentration in the membrane model). The interaction of membrane proteins with PIP lipids, in particular, has been noted in the lipid-protein interaction literature, and PIP lipids have been shown to affect several aspects of protein function and activity. Recently, the interaction of inward rectifier potassium (Kir2) channels(23) with several lipid types, among them PIP and PS lipids, has been characterized in detail, noting a strong clustering of these lipids around Kir2 channels. In comparison, we observe lower levels of aggregation for PIP and PS lipids and a higher clustering of PI lipids instead.
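To make the quantity discussed above concrete, a radial distribution function in the membrane plane can be sketched as follows. This is a simplified illustration and not the analysis code used for this chapter. It assumes lipid positions from a single leaflet and frame, measures distance from a single reference point rather than from the protein surface, and ignores periodic boundaries. The function name `radial_distribution_2d` and its arguments are hypothetical.

```python
import math


def radial_distribution_2d(points, center, box_area, r_max=7.0, n_bins=70):
    """2D radial distribution g(r) of in-plane points around a reference point.

    points   : iterable of (x, y) lipid positions in nm (one leaflet, one frame)
    center   : (x, y) reference position, e.g. the protein centre (assumption)
    box_area : membrane area in nm^2, used for the bulk number density
    Returns (bin_centres, g) as two lists of length n_bins.
    """
    points = list(points)
    dr = r_max / n_bins
    counts = [0] * n_bins
    for x, y in points:
        r = math.hypot(x - center[0], y - center[1])
        if r < r_max:
            # Clamp guards against floating-point round-up at the outer edge.
            counts[min(int(r / dr), n_bins - 1)] += 1

    bulk_density = len(points) / box_area  # lipids per nm^2
    centres, g = [], []
    for i, count in enumerate(counts):
        r_in, r_out = i * dr, (i + 1) * dr
        annulus_area = math.pi * (r_out ** 2 - r_in ** 2)
        centres.append(0.5 * (r_in + r_out))
        # Observed count divided by the count expected for a uniform membrane.
        g.append(count / (annulus_area * bulk_density) if bulk_density else 0.0)
    return centres, g
```

Values of g(r) above 1 indicate enrichment of a lipid type at that distance relative to a uniform distribution, and values below 1 indicate depletion. In practice the histogram would be averaged over many frames and over the four protein copies.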

Figure 3-1. System setup and the distribution of lipids. Radial distribution of main lipid groups in our simulation as a function of distance from the protein (measured up to 7 nm). The data are calculated for the upper (A) and lower (B) membrane leaflet separately, with the exception of DAG lipids, which are calculated for both leaflets combined (the same plot is shown in A and B).

With respect to the degree of lipid tail saturation, we see that AMPA receptors prefer an environment that is high in polyunsaturated lipids, in agreement with previous evidence for other ion channels(11). The localization of fully saturated lipids is low in the lower (or cytoplasmic) leaflet and shows a more localized clustering in the upper leaflet (reference 16 contains a more thorough discussion of these aspects of lipid-protein interactions).

3.3.2 Interaction heatmaps reveal the interaction diversity of AMPA receptors with lipids

Studying lipid-protein interactions, even in systems incorporating complex membrane models, is usually limited to the detailed characterization of a few lipid types, most notably cholesterol and PIP lipids. This is the case for GPCRs, for example, but also for many other proteins. Our analysis of the interaction profile of AMPA receptors, however, reveals a more complex pattern of interactions with lipids. Specifically, we observe most lipid types in our system interacting with AMPA receptors, but not to the same degree. This includes GM, CHOL and PIP lipids, but we also see PS, PI, and notably DAG lipids forming a clearly defined interaction “fingerprint” with AMPA receptors (Figure 3-2). On the extracellular side, the ion channel is defined predominantly by interactions with GM lipids, whereas on the lower leaflet side, the ion channel features the clustering of different lipid types. It is particularly the negatively charged PS, PI and PIP lipids that interact most with the cytoplasmic side of the receptor. The latter region features positively charged lysine and arginine residues at the most distal part of the ion channel, which form strong interactions with the headgroups of these lipid types.

In contrast to other lipid groups that interact mostly on either the upper or lower membrane side of the receptor, CHOL and DAG lipids interact predominantly with the core of the ion channel. It is these interactions with CHOL and DAG that are maintained for prolonged durations. In the literature on specific binding of lipids to proteins, cholesterol is a key player, as the specific binding of cholesterol has been highlighted for a plethora of different proteins, including other ion channels(8). So, while the observation of cholesterol interaction sites for AMPA receptors is not in itself surprising, its confirmation and the subsequent characterization that follows in the next section are of great importance. Finally, our observation of specific interactions with DAG lipids, which to our knowledge lacks a precedent in the field of receptors and ion channels, is particularly noteworthy. In the following section, we describe AMPA receptor interactions with CHOL and DAG lipids in greater detail.

Figure 3-2. AMPA receptor interaction heatmaps. The total number of contacts (N) as well as the average duration of the longest contacts (D) (i.e. the lipid-residue contact that is maintained for the longest time) are projected on the surface of the receptor. The data are displayed for the major lipid species in our simulation setup. For clarity, only the transmembrane domain of the receptor is displayed.
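The two quantities projected in Figure 3-2 can be illustrated with a minimal Python sketch. It assumes that a contact between one lipid and one residue is already available as a per-frame boolean trace; the function name, the example trace, and the frame spacing are hypothetical and are not taken from the analysis code used in this work.

```python
from itertools import groupby


def contact_stats(in_contact, frame_time_us):
    """Return (number of contact frames, longest continuous contact in µs).

    in_contact    -- sequence of booleans, one per trajectory frame
    frame_time_us -- time between saved frames in microseconds (assumed value)
    """
    # N: total number of frames in which the pair is in contact.
    n_contacts = sum(in_contact)
    # D: length of the longest uninterrupted run of contact frames.
    longest_run = max(
        (sum(1 for _ in run) for is_on, run in groupby(in_contact) if is_on),
        default=0,
    )
    return n_contacts, longest_run * frame_time_us


# Hypothetical trace: True where a lipid lies within the cutoff of a residue.
trace = [False, True, True, False, True, True, True, False]
n, d = contact_stats(trace, frame_time_us=0.5)
print(n, d)
```

In practice such per-pair values would be collected for every residue and averaged over the protein copies before being mapped onto the structure; how that averaging is done here is not specified in the caption.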

3.3.3 Cholesterol and DAG lipids form specific interactions with AMPA receptors

The membrane-spanning domain of AMPA receptors that is responsible for their ion channel functionality interacts in a lipid-specific manner with two lipid species in our simulations: CHOL and DAG lipids. Each subunit of the AMPA receptor is formed from four helices (M1-M4)(5), three of which span the width of the membrane (M1, M3 and M4) and one of which (M2) does not (Figure 3-3A). The tetrameric arrangement of the subunits forms the ion conduction pathway in the center of the receptor. The positioning of the M2 helix creates a cavity that allows membrane lipids to interact with residues lining its surface, and it is here that we observe the most interactions with CHOL and DAG lipids.

Figure 3-3. Density and distance measurements. We show the atomistic structure of the receptor (A) and highlight one of its subunits along with Phe-531 and Leu-581, which we use as references to measure the direct binding of lipids. We also show the 3D density of cholesterol (red) and DAG lipids (lime). It is calculated by centering and averaging the Cartesian coordinates of the lipids and (B) mapped over the interaction heatmaps displayed in Figure 3-2. The subplot in C shows distance calculations between cholesterol and DAG lipids and the reference residues.

While cholesterol interactions involve predominantly residues forming this cavity, they are less well localized and include other residues as well (e.g. residues that form the interface between M1 and M3 helices). Interactions with DAG lipids, in contrast, are almost exclusively located at this cavity region. We see this, for instance, when we calculate the three-dimensional density of these lipids (Figure 3-3B).

We also calculated distances between reference residues Leu-581 and Phe-531 and cholesterol and DAG lipids, respectively, to show the specific binding of these lipids to AMPA receptors. The close contacts that we observe show that the increased localization of these lipids is a direct consequence of their specific interactions.

3.4 Discussion

Ion channels are integral proteins that, in response to a stimulus, change conformation from an inactivated to a conducting state. This process is accompanied by the opening of a channel that allows the permeation of water molecules and ions. Substantial support for the importance of membrane lipids in the function and activity of ion channels is provided by numerous studies, theoretical and experimental, showing the interaction between these components in great detail(8, 9). For example, lipid-protein interactions have been highlighted and characterized for different voltage-gated channels, e.g. potassium channels (K2P(24), Kir2.1(12), Kir2.2(23), KcsA(25, 26)) and sodium channels(27). The same is true for ligand-gated ion channels, with studies of ATP-gated channels, TRP channels and Cys-Loop receptors converging on the importance of lipids in the structure and function of ion channels(8). Cholesterol and PIP lipids, in particular, are the most common lipids found to form specific interactions with these proteins, although the involvement of other lipids has also been noted.

Despite the breadth of studies focusing on the lipid interaction profile of ion channels, there is virtually no information available on AMPA receptors. In the current project, our aim was to study and characterize the interaction profile of AMPA receptors with lipids. In doing so, we discovered the specific interaction of AMPA receptors with cholesterol and DAG lipids. This is noteworthy for two reasons. First, similar to the general observation of the interaction of cholesterol with ligand-gated ion channels(8), we confirm that AMPA receptors, as well, bind specifically to cholesterol. Our observation of specific DAG lipid binding, however, to our knowledge, lacks a precedent in the literature of integral membrane receptors and ion channels. Given the resolution of the MARTINI model, it is unclear what structural features of the AMPA receptor lead to the preference for DAG lipids. A likely explanation could be the lack of a headgroup, which allows specific interactions with DAG lipids but precludes interactions with other lipid species. Regardless of the exact mechanism involved, however, this aspect of AMPA receptor – lipid interactions merits further consideration. Second, these specific interactions are localized mainly at the interface formed by helices M1, M2 and M4 (Figure 3-3), which allows these lipids to interact with the terminal residues of M2 that form the ion conduction pathway, thereby potentially affecting the activity of the ion channel.

More generally, our discovery of specific AMPA receptor – DAG lipid interactions highlights the need for and importance of studying lipid-protein interactions using complex membrane setups. A survey of the currently published MD literature on lipid-protein interactions reveals that DAG lipids are rarely included in membrane models, and as such it is unclear how prevalent protein – DAG lipid interactions really are. The current trend in the field, however, is toward more realistic membrane compositions, so we are confident that other lipid species, including DAG lipids, will continue to gain increasing attention. This is true for the AMPA receptors discussed here, but also, for instance, for the Kir2.2 channel(23), where it was recently shown that these channels interact with various phospholipids in addition to cholesterol. Our results in many ways echo the findings of Duncan et al. for Kir2.2(23).

Lastly, simulations of AMPA receptors reveal the importance of a coarse-grained approach for studying lipid-protein interactions. AMPA receptors are heterotetrameric, with each subunit extending vertically over a distance of 180 Å. The width of the receptor is 55 Å on its narrowest side and 150 Å on its widest side(5). Simulating such a system in atomistic detail, while also ensuring proper treatment of electrostatic interactions, leads to a system size of 1.5 million atoms. While atomistic simulations of this size and larger are possible, studying lipid diffusion around proteins and measuring specific vs. non-specific interactions at this scale is not yet feasible.

3.5 References

1. Hassel B., R. Dingledine. Glutamate and glutamate receptors. Basic Neurochemistry: Elsevier; 2012. p. 342-366.

2. Chater T. E., Y. Goda. The role of AMPA receptors in postsynaptic mechanisms of synaptic plasticity. Frontiers in cellular neuroscience. 2014;8:401.

3. Traynelis S. F., L. P. Wollmuth, C. J. McBain, F. S. Menniti, K. M. Vance, K. K. Ogden, K. B. Hansen, H. Yuan, S. J. Myers, R. Dingledine. Glutamate receptor ion channels: structure, regulation, and function. Pharmacol Rev. 2010;62(3):405-496.

4. Dingledine R., K. Borges, D. Bowie, S. F. Traynelis. The glutamate receptor ion channels. Pharmacol Rev. 1999;51(1):7-62.

5. Sobolevsky A. I., M. P. Rosconi, E. Gouaux. X-ray structure, symmetry and mechanism of an AMPA-subtype glutamate receptor. Nature. 2009;462(7274):745-756.

6. Dravid S. M., H. Yuan, S. Traynelis. AMPA Receptors: Molecular Biology and Pharmacology. Encyclopedia of Neuroscience: Elsevier Ltd; 2010. p. 311-318.

7. Wright A. L., B. Vissel. The essential role of AMPA receptor GluR2 subunit RNA editing in the normal and diseased brain. Frontiers in molecular neuroscience. 2012;5:34.

8. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

9. Enkavi G., M. Javanainen, W. Kulig, T. Róg, I. Vattulainen. Multiscale Simulations of Biological Membranes: The Challenge To Understand Biological Phenomena in a Living Substance. Chem Rev. 2019.

10. Muller M. P., T. Jiang, C. Sun, M. Lihan, S. Pant, P. Mahinthichaichan, A. Trifan, E. Tajkhorshid. Characterization of Lipid–Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation. Chem Rev. 2019.

11. Yazdi S., M. Stein, F. Elinder, M. Andersson, E. Lindahl. The molecular basis of polyunsaturated fatty acid interactions with the shaker voltage-gated potassium channel. PLoS Comput Biol. 2016;12(1).

12. Elinder F., S. I. Liin. Actions and mechanisms of polyunsaturated fatty acids on voltage-gated ion channels. Front Physiol. 2017;8:43.

13. Marrink S. J., H. J. Risselada, S. Yefimov, D. P. Tieleman, A. H. de Vries. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. The Journal of Physical Chemistry B. 2007;111(27):7812-7824.

14. Wassenaar T. A., H. I. Ingólfsson, R. A. Böckmann, D. P. Tieleman, S. J. Marrink. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. Journal of chemical theory and computation. 2015;11(5):2144-2155.

15. Ingólfsson H. I., M. N. Melo, F. J. Van Eerden, C. Arnarez, C. A. Lopez, T. A. Wassenaar, X. Periole, A. H. De Vries, D. P. Tieleman, S. J. Marrink. Lipid organization of the plasma membrane. J Am Chem Soc. 2014;136(41):14554-14559.

16. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. Acs Central Sci. 2018;4(6):709-717.

17. Bussi G., D. Donadio, M. Parrinello. Canonical sampling through velocity rescaling. The Journal of chemical physics. 2007;126(1):014101.

18. Berendsen H. J., J. v. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak. Molecular dynamics with coupling to an external bath. The Journal of chemical physics. 1984;81(8):3684-3690.

19. Hess B., C. Kutzner, D. Van Der Spoel, E. Lindahl. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. Journal of chemical theory and computation. 2008;4(3):435-447.

20. Humphrey W., A. Dalke, K. Schulten. VMD: visual molecular dynamics. Journal of molecular graphics. 1996;14(1):33-38.

21. Rose A. S., P. W. Hildebrand. NGL Viewer: a web application for molecular visualization. Nucleic Acids Res. 2015;43(W1):W576-W579.

22. Rose A. S., A. R. Bradley, Y. Valasatava, J. M. Duarte, A. Prlić, P. W. Rose. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics. 2018;34(21):3755-3758.

23. Duncan A. L., R. A. Corey, M. S. Sansom. Defining how multiple lipid species interact with inward rectifier potassium (Kir2) channels. Proceedings of the National Academy of Sciences. 2020.

24. Nilius B., E. Honoré. Sensing pressure with ion channels. Trends in neurosciences. 2012;35(8):477-486.

25. Deol S. S., C. Domene, P. J. Bond, M. S. Sansom. Anionic phospholipid interactions with the potassium channel KcsA: simulation studies. Biophys J. 2006;90(3):822-830.

26. Weingarth M., A. Prokofyev, E. A. van der Cruijsen, D. Nand, A. M. Bonvin, O. Pongs, M. Baldus. Structural determinants of specific lipid binding to potassium channels. J Am Chem Soc. 2013;135(10):3983-3988.

27. Ulmschneider M. B., C. Bagnéris, E. C. McCusker, P. G. DeCaen, M. Delling, D. E. Clapham, J. P. Ulmschneider, B. A. Wallace. Molecular dynamics of ion transport through the open conformation of a bacterial voltage-gated sodium channel. Proceedings of the National Academy of Sciences. 2013;110(16):6364-6369.

Chapter Four: Lipid-Protein Interactions are a Unique Property and Defining Feature of G Protein-Coupled Receptors

Copyright

The following chapter is published here:

Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions Are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020;118(8):1887-1900.

Copyright permissions are provided in Appendix C.

Contribution

The system setups were motivated by our previous work (partly presented in chapter 3). The design of the study, all the simulations, and all the analysis were carried out entirely by me. The manuscript was written together with my supervisor, who had substantial input on the overall shape and form in which the data are presented. Appendix A contains the supplementary information for this chapter, and any reference to supplementary material in the following text should be understood as a reference to Appendix A. Figures A-18 – A-22, along with Tables S2-S5, were too large to fit in the thesis. They are made available through the accompanying files.

Abbreviations

All abbreviations are introduced at their first mention.

4.1 Abstract

G Protein-Coupled Receptors (GPCRs) are membrane-bound proteins that depend on their lipid environment to carry out their physiological function. Combined efforts from many theoretical and experimental studies on the lipid-protein interaction profile of several GPCRs hint at an intricate relationship of these receptors with their surrounding membrane environment, with several lipids emerging as particularly important. Using coarse-grained molecular dynamics simulations, we explore the lipid-protein interaction profiles of 28 different GPCRs, spanning different levels of classification and conformational states, and totaling 1 millisecond of simulation time. We find a close relationship with lipids for all GPCRs simulated, in particular with cholesterol and PIP lipids, but the number, location, and estimated strength of these interactions depend on the specific GPCR as well as its conformational state. While both cholesterol and PIP lipids bind specifically to GPCRs, they utilize distinct mechanisms. Interactions with PIP lipids are mediated by charge-charge interactions with intracellular loop residues and stabilized by one or both of the transmembrane helices linked by the loop. Interactions with cholesterol, on the other hand, are mediated by a hydrophobic environment, usually made up of residues from more than one helix, capable of accommodating its ring structure and stabilized by interactions with aromatic and charged/polar residues. Cholesterol binding to GPCRs occurs in a small number of sites, some of which (like the binding site on the extracellular side of TM6/7) are shared among many class A GPCRs. Combined with a thorough investigation of the local membrane structure, our results provide a detailed picture of GPCR-lipid interactions. Additionally, we provide an accompanying website to interactively explore the lipid-protein interaction profile of all GPCRs simulated to facilitate analysis and comparison of our data.

4.2 Significance

Membrane proteins carry out their function and activity in an environment composed of many different lipid types. Despite this heterogeneity, membrane proteins have been shown to associate preferably with some lipids, most notably cholesterol, over others. We selected 28 different GPCRs from different families and simulated them using the same simulation setup and protocol. Analysis of the simulation data reveals specific interactions with lipids for all GPCRs, as well as overall lipid interaction profiles that are unique to each GPCR structure and conformational state. Cholesterol and PIP lipids are the most prominent lipid types forming specific interactions with GPCRs and we find that they do so by employing different mechanisms.

4.3 Introduction

G Protein-Coupled Receptors (GPCRs) are the largest family of membrane protein receptors, both in size and in the diversity of their physiological functions.(1) Their structure consists of a structurally conserved seven transmembrane (TM) helical core whose sequence spans the full length of the membrane bilayer and ends with an eighth helix that stretches along the intracellular side of the membrane. The individual helices of the helical core are connected via 3 intracellular and 3 extracellular loops. GPCRs are essential drug targets, accounting for an estimated 1/3 of prescribed drugs.(2, 3)

One aspect that has attracted substantial attention, especially in the last decade, has been the ability of the membrane environment to guide or otherwise exert influence on GPCR activity and function. Two modes of modulation have been proposed in the literature: (i) direct and specific lipid-protein interactions and (ii) altering of membrane physical properties (e.g. thickness, curvature, etc.).(4-10) Molecular Dynamics (MD) simulations, either alone or in conjunction with experiments, have repeatedly shown that lipids can and often do affect the activity of membrane-embedded proteins. Recent in-depth reviews on the lipid-protein interaction landscape as it emerges from hundreds of MD simulation studies highlight the prevalence of such interactions with various membrane proteins, with a particular emphasis on GPCRs, their possible biological relevancy and the importance of computational methods as essential tools in deciphering these interactions.(11-15)

Among the most studied GPCRs in the literature are the β2 adrenergic receptor (β2AR), the adenosine A2A receptor (A2AR) and rhodopsin (RhodR; the complete list of abbreviations used in this study is given in Table S1).(11, 16) Microsecond-long MD simulations of β2AR,(17) for example, reveal several interaction sites for cholesterol distributed unequally between the extracellular (denoted here as ec) and intracellular (denoted ic) sides of the receptor and differing in their relative strength of binding to the receptor. These interaction sites were later supported by a longer and more diverse set of simulations.(18) Similarly, for A2AR, combined all-atom and coarse-grained MD simulations from independent studies show A2AR to interact with cholesterol at several interaction sites.(19) Two sites, in particular, are noted for their cholesterol binding affinity: TM5/6 (ic) and TM6 (ec).(20) Rhodopsin, as well, has been shown to interact with cholesterol and DHA lipids in a ligand-like manner.(21) One such potential interaction with cholesterol, for instance, is located between helices TM7/1 of the receptor.(22) Bulk lipid properties, in addition to specific lipid-protein interactions, have also been proposed as determining factors in rhodopsin activity.(23) Specific lipid-GPCR interactions are believed to serve biologically important roles, with some interactions capable of acting as allosteric modulators of receptor activity, conferring stability to the receptor, and possibly increasing the coupling selectivity to effector proteins.(24, 25) The effect of membrane bulk lipid properties, rather than individual membrane components, is less well documented, but its importance has been highlighted for serotonin (5HT1AR)(26) and rhodopsin.(21) MD simulations of other GPCRs, including serotonin(27) and opioid receptors(28) and class F Smoothened (SMO)(29), consistently show specific interactions of GPCRs with cholesterol. Moreover, simulations of β2AR and A2AR show cholesterol to not only have a different interaction profile depending on the state of the receptor(19) but also affect the conformational transition from the active to the inactive state of the receptor by restricting its conformational landscape.(18) Experimental data, while scarcer, generally agree with the conclusions reached by MD simulations.(24, 25, 30)

Despite the clear progress made, several aspects of GPCR-lipid interactions remain hidden. The majority of studies focus only on a few GPCRs, and the MD simulation protocols used to study lipid-protein interactions differ in design and execution, making both the comparison of results and the extrapolation of data to other GPCRs difficult (a noteworthy exception is the work of Yen et al., who analysed the PIP interaction profile of nine class A GPCRs)(25). Additionally, in the last few years we have learned that microsecond-long simulations are required to obtain statistically accurate lipid distributions around proteins.(17, 20, 31) This makes interpreting older simulation papers and resolving conflicting findings particularly challenging.

Recently, we developed a protocol to study lipid-protein interactions using the MARTINI model(32) by employing a highly realistic membrane model and long time-scale simulations.(31) We successfully applied the protocol to ten different proteins, and we showed that the lipid interaction profile of proteins is a defining feature of their structure, expressed as a combination of specific and long-lasting interactions formed with individual lipids and the resultant membrane thickness and curvature caused by the disordering of the local membrane structure.

In this study, we have used our simulation protocol to study GPCR-lipid interactions. We have simulated 23 different GPCR structures, 5 of which we simulated in both the active and inactive state, totaling 28 different structures (Table A-1 in the Supporting Material). Each GPCR has been simulated in a system containing 4 copies of itself for 30 μs. Altogether, after accounting for control simulations, we present data on GPCR-lipid interactions from a total of 1 ms of simulation time. Our use of a consistent protocol and identical simulation parameters, coupled with the large number of GPCRs simulated, allows us to study the lipid interaction profile of each GPCR separately as well as present our findings in a GPCR family-wide context. We show that GPCRs, through lipid-protein interactions, create a local membrane environment that is unique for each receptor.

4.4 Methods

System setup. The initial protein coordinates for all GPCRs were retrieved from the Protein Data Bank. The corresponding PDB IDs and references are shown in Table S1. For each GPCR, non-protein molecules and atoms were removed. Attached protein fragments were also excluded from the simulations, leaving only the helical core (including helix 8) and extra- and intracellular loops. The resulting structures were converted to a CG model using the Martinize protocol as outlined on the Martini website (http://www.cgmartini.nl/) and inserted into a bilayer using the insane tool.(33) The secondary and tertiary structure of GPCRs were maintained using an elastic network. We have simulated 23 different GPCR structures belonging to class A, and 5 non-class A GPCRs, totaling 28 different GPCR structures simulated. Five GPCRs (the adenosine A2A, β2 adrenergic, rhodopsin, M2 muscarinic acetylcholine, and μ-opioid receptors) are simulated in both the active and inactive conformational state. Each system was simulated for 30 μs, and unless otherwise specified analysis was performed on the last 5 μs of each trajectory. The total simulation time for this project, including the control simulations, is around 1 ms.

To test if our simulation protocol has any effect on our results, we performed three control simulations for β2AR (Figure A-2):

1. Including ICL2 to test if its exclusion from our simulations has any effect,

2. Including the palmitoylated Cys-341,

3. Pre-equilibrating the system for 100 ns using all-atom simulations. Manna et al.(34) highlighted the importance of a proper equilibration protocol for β2AR when carrying out all-atom MD simulations, so we wanted to ensure that our direct conversion of protein coordinates to a coarse-grained representation did not affect our results.

The overall system setup follows our recently published protocol, where four copies of each protein are embedded equidistantly in a simulation box of ca. 40 × 40 nm in the x and y plane. We find this surface area to be adequate both for inserting multiple (here four) protein copies and for maintaining accurate ratios between lipids in the system, in particular those found at low concentrations. In comparison to simulating multiple copies of smaller setups containing 1 protein each, our setup gives a much better representation of all lipid types in the system and allows lipids to sample a larger area and thus completely escape the influence of embedded proteins. The lipid composition of our systems corresponds to the plasma membrane model developed by Ingólfsson et al.(35) and later applied to several protein systems.(31) The exact lipid composition for each system simulated is described in the supplementary information, but in general, the outer leaflet contains ganglioside (GM) lipids and the inner leaflet contains phosphatidylserine (PS), phosphatidic acid (PA), phosphatidylinositol (PI), and PI-phosphate, -bisphosphate, and -trisphosphate (PIP lipids). Both membrane leaflets contain cholesterol (CHOL), phosphatidylcholine (PC), phosphatidylethanolamine (PE), and sphingomyelin (SM) lipids. Small amounts of lysophosphatidylcholine (LPC), diacylglycerol (DAG) and ceramide (CER) were also included. For GM lipids, we used the new parameters.(36) The membrane system with embedded protein was generated using the insane protocol, resulting in a composition with ca. 5200 residues (including proteins) and 165 000 beads per system. The final systems also contained water molecules, counterions and 150 mM of NaCl.

Simulation protocol. All simulations were carried out using the GROMACS simulation package version 2016.4,(37) with the standard Martini v2.2 simulation settings and parameters(38) as published on the Martini website (available in the lipidome section: http://www.cgmartini.nl/index.php/force-field-parameters/lipids with a detailed overview given in http://www.cgmartini.nl/index.php/force-field-parameters/lipids2/350-lipid-details). A short energy minimization procedure was performed using the steepest descent algorithm, with position restraints of 1000 kJ mol⁻¹ nm⁻² applied to protein beads, followed by a gradual equilibration procedure whereby position restraints were lowered and limited to only the protein backbone beads. Production runs were carried out using a 20 fs timestep and a weak position restraint applied to backbone beads (1 kJ mol⁻¹ nm⁻²). The temperature was kept at 310 K using the velocity rescale thermostat,(39) with a time constant for coupling of 1 ps. A pressure of 1 bar was applied semi-isotropically to the system and maintained using the Parrinello-Rahman barostat, with a compressibility of 3 × 10⁻⁴ bar⁻¹ and a relaxation time constant of 12 ps.
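For orientation, the production-run settings quoted above can be collected into a GROMACS .mdp fragment. The Python sketch below does this and also works out the number of integration steps implied by a 30 μs run at a 20 fs timestep. It is an illustration only, not the input file used in this work: the single coupling group `System` is an assumption, and the Martini-specific nonbonded settings and the position-restraint definitions are omitted.

```python
def production_mdp(total_time_us=30.0, dt_ps=0.02):
    """Build an .mdp fragment from the settings quoted in the text.

    Returns (number of MD steps, fragment as a string).
    """
    # 1 µs = 1e6 ps, so 30 µs at 0.02 ps per step gives 1.5e9 steps.
    nsteps = round(total_time_us * 1e6 / dt_ps)
    settings = {
        "integrator": "md",
        "dt": dt_ps,                      # 20 fs timestep
        "nsteps": nsteps,
        "tcoupl": "v-rescale",            # velocity rescale thermostat
        "tc-grps": "System",              # assumption: one coupling group
        "tau_t": 1.0,                     # ps
        "ref_t": 310,                     # K
        "pcoupl": "parrinello-rahman",
        "pcoupltype": "semiisotropic",
        "tau_p": 12.0,                    # ps
        "compressibility": "3e-4 3e-4",   # bar^-1 (xy, z)
        "ref_p": "1.0 1.0",               # bar (xy, z)
    }
    text = "\n".join(f"{key:<16}= {value}" for key, value in settings.items())
    return nsteps, text


nsteps, mdp_text = production_mdp()
print(nsteps)
print(mdp_text)
```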

To maintain the spacing between proteins and to decouple lipid – protein interactions from protein – protein contacts, we applied small position restraints on the backbone beads of GPCRs and simulated each system for 30 µs. The analyses presented here, unless stated otherwise, are carried out on the last 5 µs of the simulation trajectories and are mainly focused on lipids within a 7 Å distance cutoff from the protein (unless otherwise stated, this is the default radius we use).

Analysis. A detailed explanation of the analysis protocol and tools used is available in the supporting material.

4.5 Results

All simulations presented here are carried out using a complex membrane model in which we inserted four copies of each GPCR structure (Figure 4-1D). The membrane model has been described previously and has been applied to study lipid-protein interactions in ten different proteins, including δOR, which is an opioid GPCR. We used the MARTINI CG force field to model our systems and the underlying interactions.(32) Briefly, an equidistantly placed quartet of each GPCR structure is simulated in a membrane environment composed of 63 different lipid types and spanning a 40 × 40 nm² surface area in the lateral dimension. We use the same number density and relative ratios to represent each lipid type as described in our previous work.(31) The resulting membrane model has an asymmetric lipid distribution whereby ganglioside and PIP lipids, for instance, are found exclusively in the upper and lower leaflet, respectively.

When displaying lipid-protein interactions, for clarity and brevity, we usually limit ourselves to a small sample of GPCRs. There are no hard rules regarding which GPCRs we show in a particular example, and in most cases they can be substituted with any other structure without altering the message. We do, however, go to great lengths to provide a complete analysis of the lipid-protein interaction profile of all GPCRs, either in the supporting information or through the accompanying website.

4.5.1 The lipid environment near GPCRs is distinctly different from the bulk membrane composition

We use the average number of lipids around proteins as a function of time and their cumulative average to show convergence of lipid distributions around proteins (Figure A-1 in the Supporting Material). We note that shorter simulation times (<5 µs), at least using the MARTINI model, may not be adequate to fully capture the interactions of proteins with lipids. This is seen, for example, from the cumulative average number of lipids within 7 Å of proteins, which takes a significant amount of simulation time to reach an apparent equilibrium state. This is most noticeable for lipids that are present in smaller amounts, like GM and PIP lipids, which are key players in GPCR-lipid interactions. In a few cases, GM lipids take considerably more time to converge (Figure A-1 and A-22). We think that, using our protocol, a simulation time of around 20-25 µs should be sufficient for this type of system. Our setup with 4 approximately independent proteins per simulation provides a statistical control on most results, in addition to time-dependent analyses.
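The cumulative average used as a convergence check can be sketched in a few lines of Python. The per-frame counts below are made-up numbers, and the flatness criterion in `is_converged` (window length and tolerance) is a hypothetical choice for illustration, not the criterion applied in this chapter.

```python
from itertools import accumulate


def cumulative_average(counts):
    """Running mean of a per-frame series of lipid counts near the protein."""
    return [total / (i + 1) for i, total in enumerate(accumulate(counts))]


def is_converged(counts, window, tolerance):
    """True if the running mean varies by less than `tolerance`
    over the last `window` frames (illustrative criterion only)."""
    tail = cumulative_average(counts)[-window:]
    return max(tail) - min(tail) < tolerance


# Hypothetical number of lipids of one type within the cutoff, per frame.
counts = [2, 4, 6, 4, 4, 4]
running = cumulative_average(counts)
print(running)
print(is_converged(counts, window=3, tolerance=0.1))
```

Because each point of the running mean includes all earlier frames, it flattens more slowly than the instantaneous count, which is why rare lipid types can appear to need many microseconds before the curve levels off.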

Figure 4-1. GPCR Depletion-Enrichment (DE) index data as derived from our simulations. A. Boxplot showing the DE index values for all GPCRs combined. We have grouped lipids according to their headgroup type into GM, PC, PE, PI, PA, PS, SM, PIP, and CHOL lipids, and for each lipid group we show the three quartile values of the distribution. Data points that fall within a 1.5 interquartile range of the upper and lower quartiles are included in the “whiskers” of the boxplot. Data points that fall outside of this range are plotted individually. B. DE index values for a selection of GPCRs (error bars are ± SD). The selection is based on increasing DE index values for PIP lipids. For clarity, if 2 or more GPCRs show the same range of values only one is chosen. The full data set is available in the supporting information. C. DE index data as a function of cutoff radii from A2ARi (error bars are ± SD). The black line in panels A-C corresponds to a DE index of 1. D. Snapshot of our system setup at 25 μs with proteins, GM lipids and cholesterol shown in yellow, red and green, respectively.

To understand how the lipid environment of GPCRs changes during the simulation, we define the DE (Depletion-Enrichment) index (Figure 4-1 and SI).(40) It is a numerical measure of the tendency of lipids to depart from a homogeneous distribution and cluster close to (>1), or away from (<1), embedded proteins. Our data clearly show that GPCRs consistently favor a lipid environment that is different from a random or homogeneous distribution. Calculated DE index data show that lipid distribution around GPCRs differs markedly from their distribution in the bulk bilayer, either by their enrichment (GM, PI, and PIP lipids) or depletion (PC and SM lipids) in the proximity of embedded proteins (Figure 4-1A). Interestingly, while these results are consistent among all GPCRs studied here, the magnitude of DE indices shows a noticeable variability between different GPCRs (Figure 4-1B). Although we use a 7Å distance cutoff in our analysis, the results presented here hold for other cutoff values as well (Figure 4-1C).
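As a minimal sketch of how such an index can be computed, the snippet below assumes that the DE index of a lipid type is the fraction of that lipid among all lipids within the cutoff, divided by its fraction in the whole membrane. The exact definition is given in the SI and may differ in detail; the function name and the counts are hypothetical.

```python
def de_index(local_counts, bulk_counts, lipid):
    """Ratio of the local lipid fraction (within the cutoff) to the
    bulk lipid fraction. Values > 1 mean enrichment, < 1 depletion.
    Assumed definition, see the lead-in."""
    local_fraction = local_counts[lipid] / sum(local_counts.values())
    bulk_fraction = bulk_counts[lipid] / sum(bulk_counts.values())
    return local_fraction / bulk_fraction

# Hypothetical counts: lipids within 7 A of the protein vs. the whole bilayer
local = {"PIP": 2, "PC": 8}
bulk = {"PIP": 10, "PC": 90}
print(de_index(local, bulk, "PIP"))  # enriched (> 1)
print(de_index(local, bulk, "PC"))   # depleted (< 1)
```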


Figure 4-1C additionally highlights the impact of embedded GPCRs on the local membrane environment: their effect is “felt” even 5 nm (50Å) away. Note, however, that the DE index at each radius includes lipids from all smaller radii, so its convergence to 1 is slowed and the actual range of the GPCR effect on the local membrane may be shorter than 5 nm.
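The effect of accumulating lipids over radii can be made concrete with a small worked example. The sketch below compares a cumulative index (all lipids inside a radius) with a per-shell index, again assuming the DE index is a ratio of local to bulk lipid fractions; all names and numbers are hypothetical.

```python
def cumulative_de(shell_lipid, shell_total, bulk_fraction):
    """DE index at each radius, using all lipids inside that radius."""
    values, n_lipid, n_total = [], 0, 0
    for lipid, total in zip(shell_lipid, shell_total):
        n_lipid += lipid
        n_total += total
        values.append((n_lipid / n_total) / bulk_fraction)
    return values

def shell_de(shell_lipid, shell_total, bulk_fraction):
    """DE index of each concentric shell taken on its own."""
    return [(l / t) / bulk_fraction for l, t in zip(shell_lipid, shell_total)]

# Hypothetical shells: only the innermost shell is enriched,
# the two outer shells already have the bulk composition (10%).
lipid_per_shell = [4, 2, 3]
total_per_shell = [10, 20, 30]
print(shell_de(lipid_per_shell, total_per_shell, 0.1))
print(cumulative_de(lipid_per_shell, total_per_shell, 0.1))
```

In this example the per-shell index reaches 1 in the second shell, while the cumulative index is still above 1 at the outermost radius because it carries the enrichment of the first shell with it.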

Next, we tested whether lipid distributions around class A and non-class A GPCRs differ. We performed two-sided two-sample t-tests between the average DE indices of class A and non-class A GPCRs. As a control, we performed the same calculations on aminergic and non-aminergic class A GPCRs (Table S5 in data.xls). The results reveal that the distribution of PIP and CHOL, but not GM, lipids differs significantly between class A and non-class A GPCRs but does not differ between aminergic and non-aminergic class A GPCRs (independent of distance cutoff). PE and PS lipids also show different distributions between class A and non-class A GPCRs.
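For illustration, the test statistic of a two-sample t-test can be computed with the standard library alone. The sketch below uses the unequal-variance (Welch) form, which is an assumption: the text does not state whether equal variances were assumed. The sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch two-sample t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical per-receptor values for two groups
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
print(welch_t(group_a, group_b))
```

The two-sided p-value then follows from the Student t distribution with `df` degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the statistic and p-value directly.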

The average PIP DE index for class A and non-class A GPCRs (with the corresponding 95% confidence interval for the mean) at 7Å is 9.1 ± 0.6 and 4.9 ± 1.2, respectively, underscoring the higher enrichment of PIP lipids around class A GPCRs (Table S4). Non-class A GPCRs like glucagon-like peptide-1 receptor (GLP1) and SMO, in particular, have a much lower enrichment of PIP lipids than other GPCRs. The same calculations for cholesterol yield 1.07 ± 0.02 and 0.89 ± 0.04, respectively, showing a depletion of cholesterol for non-class A GPCRs at 7Å. Other lipids, like PI and PA, display a higher enrichment close to proteins, but with the exception of SMO, their levels are lower compared to PIP lipids. On the other hand, PC lipids are significantly depleted, whereas PE levels, in general, remain close to 1 for class A GPCRs, but show a higher enrichment for non-class A GPCRs.

The depletion of a particular lipid around a protein, however, is not necessarily a result of their “incompatibility” with each other, as it could be a by-product of the positive enrichment of a different lipid (in the case of PC lipids this may be the clustering of GM lipids, whereas the depletion of PE lipids may be a result of the enrichment of negatively charged lipids).

Overall, the local environment of GPCRs is characterized by an enrichment of GM lipids in the upper leaflet and negatively charged lipids, particularly PIP lipids, in the lower leaflet relative to the surrounding membrane. These results are consistent in our entire set of GPCR simulations.

4.5.2 2D density profiles reveal a highly localised cholesterol distribution around GPCRs

To visualize GPCR-lipid interactions we calculated 2D density maps of cholesterol, fully saturated (FS) and polyunsaturated (PU) lipids around the protein (Figure 4-2; Figures A-4 – A-6) during the last 5 µs of the simulation, separated into upper and lower membrane leaflet densities. Cholesterol density profiles are distinguished by a well-defined localization of cholesterol molecules around proteins, especially when compared to the density profile of PU and FS lipids, hinting at specific interactions of cholesterol with embedded GPCRs. Perhaps one of our most important findings is that we observe specific interactions with cholesterol for all GPCRs simulated. The number and location of these interaction sites differ significantly, however. For instance, we only observe one interaction site for SMO, located between helices TM2/3 on the extracellular (ec) side in the upper leaflet, and no interaction sites on the lower leaflet side of the receptor. We observe cholesterol interaction sites on both the ec and intracellular (ic) sides of the receptor for the majority of GPCRs simulated. For example, within aminergic GPCRs, we see 3, 4 and 7 interaction sites for dopamine receptor (D3R), histamine receptor (H1R) and 5HT1B, respectively. The total number of putative cholesterol interaction sites can vary from 1 for SMO to 8 for cannabinoid receptor (CB1R) and β2ARa. Differences in the number and location of cholesterol binding sites are also observed between different conformational states of the same receptor (Figure A-4). Although our dataset for non-class A GPCRs is small and we do see a few putative cholesterol binding sites, overall, we find fewer interaction sites for non-class A GPCRs compared to their class A counterparts, which may explain their differences in the calculated DE indices. In Figures A-13 – A-16 we give a detailed overview and comparison of GPCR-cholesterol interactions from our simulation results, available crystallographic information, and other MD studies.
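A 2D density map of this kind is, at its core, a two-dimensional histogram of lipid positions accumulated over frames. The sketch below shows that idea only; it assumes a square box, coordinates already expressed in a protein-centred and aligned frame, and one call per leaflet. The function name and inputs are hypothetical, not the tool used for Figure 4-2.

```python
def density_map(positions, box, n_bins):
    """Count (x, y) bead positions on an n_bins x n_bins grid
    covering a square periodic box of edge length `box`."""
    grid = [[0] * n_bins for _ in range(n_bins)]
    width = box / n_bins
    for x, y in positions:
        # wrap into the box, then clamp to guard against rounding at the edge
        i = min(int((x % box) / width), n_bins - 1)
        j = min(int((y % box) / width), n_bins - 1)
        grid[i][j] += 1
    return grid

# Hypothetical cholesterol bead positions from one leaflet
positions = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5)]
print(density_map(positions, box=2.0, n_bins=2))
```

Dividing each cell count by the number of frames and the cell area would turn the counts into a number density.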

Figure 4-2. 2D density profiles. A. Orientation of GPCRs within 2D maps: H8 is facing downwards with helices TM1-7 going counter-clockwise. B. The density profile of three class A GPCRs: apelin receptor (ApelinR), angiotensin receptor (AT2R) and β2AR, as well as class F Smoothened: SMO. The GPCRs shown here were chosen to showcase the different number and location of interaction sites, as well as being from different subclasses of the GPCR family (peptide, aminergic, and frizzled receptors, respectively). The complete analysis is given in Figure A-4. Density profiles are divided into an upper and lower profile corresponding to the extracellular and intracellular side of the receptor, respectively.


In order to study the changes induced by GPCRs in the surrounding lipid environment we grouped lipids into two categories based on the degree of saturation of their lipid tails: lipids that lack any unsaturated bond (FS) and lipids that possess multiple unsaturated bonds (PU). Since we only consider lipids at either end of the saturation scale, this grouping leaves many lipids unaccounted for, but, in exchange, it gives us a clear look into the local membrane environment of GPCRs based on the tail saturation degree. The distribution of FS and PU lipids (Figures A-5 and A-6), especially compared to the density profile of cholesterol, is distinctly non-local and nonuniform. FS and PU density profiles underscore the GPCR preference for a disorganized lipid environment with PU lipids localized predominantly around the embedded protein. FS lipids, however, seem to be excluded from the protein. This is especially noticeable in the lower leaflet of the membrane. This asymmetrical distribution of lipids according to their tail saturation degree, induced by the presence of a GPCR, is manifested locally around the protein and is observable through the resulting changes in membrane curvature and thickness (Figure 4-3).
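The grouping rule can be written down as a small classifier. In the sketch below the threshold for "multiple unsaturated bonds" (at least one tail with two or more double bonds) is an assumption made for illustration, as are the function name and the example tails.

```python
def saturation_class(double_bonds_per_tail):
    """Classify a lipid from the number of double bonds in each tail.

    Returns "FS" (fully saturated), "PU" (polyunsaturated), or None for
    lipids in between, which this grouping leaves unaccounted for.
    """
    if sum(double_bonds_per_tail) == 0:
        return "FS"
    if max(double_bonds_per_tail) >= 2:  # assumed PU threshold
        return "PU"
    return None

# Hypothetical lipids described by (tail 1, tail 2) double-bond counts
print(saturation_class((0, 0)))  # saturated tails
print(saturation_class((0, 4)))  # one polyunsaturated tail
print(saturation_class((0, 1)))  # mono-unsaturated, not grouped
```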

4.5.3 GPCRs induce a unique local membrane environment

Given the changes in the density of FS and PU lipids close to GPCRs in comparison to the surrounding membrane environment, we analysed how these changes are reflected in the local membrane thickness and curvature. Bulk lipid properties, sometimes also referred to as solvent-like effects,(21) describe the changes in local membrane physical properties that result from rearrangements of membrane structural components. The activity of mechanosensitive channels, for instance, is modulated by these effects.(41) The importance of bilayer-mediated effects has been recognized for GPCRs as well, most notably for rhodopsin(10) and 5HT1A.(26) To gain a deeper understanding of this, we analysed the membrane thickness, as well as the mean and Gaussian curvature of our systems (Figure 4-3, Figures A-7 – A-9).
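For a membrane surface described as a height field z = h(x, y) on a regular grid, the mean and Gaussian curvature follow from the first and second derivatives of h. The sketch below uses the standard height-field (Monge) formulas with central finite differences; it is an illustration under those assumptions, not the tool used for Figure 4-3, and the sign of the mean curvature depends on the chosen surface normal.

```python
def curvatures(h, i, j, dx):
    """Mean and Gaussian curvature of z = h[i][j] at an interior grid
    point, using central finite differences with grid spacing dx."""
    hx = (h[i + 1][j] - h[i - 1][j]) / (2 * dx)
    hy = (h[i][j + 1] - h[i][j - 1]) / (2 * dx)
    hxx = (h[i + 1][j] - 2 * h[i][j] + h[i - 1][j]) / dx ** 2
    hyy = (h[i][j + 1] - 2 * h[i][j] + h[i][j - 1]) / dx ** 2
    hxy = (h[i + 1][j + 1] - h[i + 1][j - 1]
           - h[i - 1][j + 1] + h[i - 1][j - 1]) / (4 * dx ** 2)
    g = 1 + hx ** 2 + hy ** 2
    mean = ((1 + hy ** 2) * hxx - 2 * hx * hy * hxy
            + (1 + hx ** 2) * hyy) / (2 * g ** 1.5)
    gauss = (hxx * hyy - hxy ** 2) / g ** 2
    return mean, gauss

# Test surfaces on a 3x3 grid centred on the origin (dx = 1)
bowl = [[(i - 1) ** 2 + (j - 1) ** 2 for j in range(3)] for i in range(3)]
saddle = [[(i - 1) * (j - 1) for j in range(3)] for i in range(3)]
print(curvatures(bowl, 1, 1, 1.0))    # both curvatures positive
print(curvatures(saddle, 1, 1, 1.0))  # zero mean, negative Gaussian
```

The two test surfaces show why both quantities are reported: a bowl and a saddle can have similar local bending but opposite signs of Gaussian curvature.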


Figure 4-3. GPCR curvature analysis. Mean (KM) and Gaussian (KG) curvature for A. ApelinR and B. AT2R, calculated separately for the upper, middle, and lower bilayer plane. For brevity, only the profiles of ApelinR and AT2R from Figure 4-2 are shown here.

To our surprise, even though GPCRs share a conserved helical core, the subtle structural differences between them seem to be sufficient to induce different thickness and curvature profiles. A feature that we observe among some GPCR-induced membrane curvatures, especially in the upper leaflet, is a positive mean curvature in and around the protein insertion point with a steep change to negative mean curvature in the surrounding membrane environment. GPCR thickness and curvature profiles highlight the GPCR-induced perturbations in the local membrane environment. In the following sections we analyse specific GPCR-lipid interactions, by focusing on cholesterol and PIP lipids.

4.5.4 GPCRs interact specifically with PIP lipids on the intracellular surface

Phosphatidylinositol phosphate (PIP) lipids are negatively charged lipids located exclusively in the lower leaflet. Unlike cholesterol, they only account for a small fraction of membrane lipids (around 0.7%). In spite of this, or perhaps because of it, they have repeatedly been shown to form specific interactions with membrane proteins, e.g. receptor tyrosine kinases.(31, 42) Considering their low abundance, PIP lipids may act as modulators of key functional aspects of protein activity. While experimental data is lacking, GPCR-PIP lipid interactions have recently been highlighted with respect to three GPCRs: β1AR, NTSR1 and A2AR. Yen et al.(25) show compelling evidence from combined experimental and computational methods that PIP lipids stabilize the active state of these GPCRs as well as increase their coupling selectivity to G proteins. Active state stabilization by PIP lipids is also noted by Song et al.(19) MD simulations point to the possibility of such interactions being present in other GPCRs as well. To test this hypothesis and gain further insight into this aspect of GPCR-lipid interactions we analysed our set of GPCRs for potential interactions with PIP lipids (Figure 4-4).

Figure 4-4. Sequence heatmaps of GPCR – PIP lipid interactions. A. Scaled-down image of all GPCR heatmaps and a zoomed-in view of the TM1 – ICL1 – TM2 interface for a selection of GPCRs showing the increase in the number of contacts of PIP lipids with charged residues in a White-Red scale. B. Bar graph of the number of contacts of PIP lipids with CXCR1 TM1-ICL1-TM2 residues. C. Normalized bar graph presentation of the involvement of each TM helix for three selected GPCRs (D3R, A2ARi and CXCR1). Error bars are ± SD. We did not apply selection criteria when choosing the GPCRs to show here, and for each part of the figure a complete analysis is provided in the SI covering all GPCRs (e.g. Figure A-19 contains the full sequence alignment highlighted here in subplot A).

Indeed, we observe a close relationship between GPCRs and PIP lipids in our simulations. For each structure we find several well-defined and stable interactions with PIP lipids. Starting from the upper leaflet, GPCRs traverse the bilayer seven times. Each turn, depending on the side, is formed by extracellular or intracellular loops. PIP lipids, being confined to the lower membrane leaflet, interact with GPCRs mainly, although not exclusively, at these turning points: TM1-ICL1-TM2, TM3-ICL2-TM4, TM5-ICL3-TM6 and TM7/8. Interaction of PIP lipids with the first of these interfaces, TM1-ICL1-TM2, for a selection of GPCRs is shown in Figure 4-4, and the complete set of results is provided in the supplementary material (Figures A-10, and A-18 – A-19). Because GPCR ICLs extend outside of the membrane and into an aqueous environment, their sequences are lined with charged residues (arginine and lysine), which we identify as the key residues involved in establishing and maintaining interactions with PIP lipids through charge-charge interactions. For chemokine receptor (CXCR1), for instance, interactions with PIP lipids are mediated by two arginine residues (Figure 4-4B), with similar interactions being observed for most other GPCRs as well (Figure 4-4A), although not for A2AR since it lacks charged residues at this interface. Additionally, and somewhat surprisingly, this interaction with PIP lipids at the TM1-ICL1-TM2 interface is lacking for β2AR (in both of its conformational states) and D3R. Yen et al., in their study of β1AR, noted PIP lipid interactions with the receptor’s TM5, TM6, and TM7, but not TM1 and TM2, helices. These results imply that while GPCR-PIP lipid interactions are mediated by charge-charge interactions, the presence of charged residues alone is not sufficient to guarantee interaction. The intracellular loop plus the incoming and outgoing TM helices may constitute the interacting interface that ensures a specific binding of PIP lipids.

When we look at the involvement of different TMs in establishing and maintaining contacts with PIP lipids, we see that GPCRs differ noticeably in the localization of PIP contacts and the extent of the involvement of TM helices in maintaining these interactions (Figure 4-4C). CXCR1, for instance, displays a higher number of contacts with TM helices 1, 2, 4 and 6. D3R, on the other hand, displays a higher localization of PIP lipids at the distal part of TM5. Similarly, A2ARi is characterized by a very different involvement of TM helices, namely TMs 6, 7, and 8. These three examples show the different ways PIP lipids interact with GPCRs (for the complete results see Figure A-10).

Figure 4-5. CXCR1 – PIP lipid interactions. A. Protein surface presentation of the number of contacts with PIP lipids (increasing number of contacts corresponds to Blue-White-Red color change) viewed in the plane parallel to the bilayer (showing helices 1-4 and 5-8 on the left and right figure, respectively) and normal to the bilayer (showing example snapshots of PIP lipids, in magenta, bound to the protein). B. Centre-of-mass distances between key residues in each interaction site identified and PIP lipids. For this particular case, interactions, once formed, are maintained throughout the duration of the trajectory.


We selected CXCR1 as an example to showcase its interaction with PIP lipids in further detail. We identify two PIP interaction sites, or hotspots (Figures 4-4 & 4-5). The first one is formed by three helices: TM1, TM2 and TM4 and binds PIP lipids with greater affinity than the second hotspot, which is located on the interface between ICL3 and TM6. Arginine and lysine residues at the intracellular part of TM1 and TM4 helices interact with the charged headgroup of PIP lipids, thus forming strong interactions that are maintained throughout the simulation trajectory – we show this by calculating the centre-of-mass distances between contacting PIP lipids and Arg68, Arg71 and Arg150 residues of CXCR1 (Figure 4-5B & A-11).
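A centre-of-mass distance between a residue and a lipid can be computed per frame as sketched below. The sketch ignores periodic boundary conditions, which a real trajectory analysis would have to handle, and the coordinates, masses, and function names are hypothetical.

```python
from math import dist

def centre_of_mass(coords, masses):
    """Mass-weighted mean position of a group of beads."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total
        for k in range(3)
    )

def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centres of mass of two bead groups
    (no periodic-image correction)."""
    return dist(centre_of_mass(coords_a, masses_a),
                centre_of_mass(coords_b, masses_b))

# Hypothetical single frame: a two-bead residue and a one-bead lipid group
residue = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
lipid = [(1.0, 3.0, 4.0)]
print(com_distance(residue, [1.0, 1.0], lipid, [72.0]))
```

Evaluating this for every frame gives a distance time series of the kind plotted in Figure 4-5B; a trace that stays low after first contact indicates a persistent interaction.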

Three key findings from our simulations with respect to GPCR-PIP lipid interactions are that (i) GPCR interactions with PIP lipids are present in all GPCRs simulated, including non-class A GPCRs, (ii) they are specific to PIP lipids in the sense that the binding sites are generally inert to other lipids (interactions with other charged lipids are observed but they are easily displaced by PIPs) and (iii) the prevalence of interaction sites (measured as number of contacts), their relative strength (measured as duration of contacts) as well as their localization vary considerably among different GPCRs. It was already known that the location of identified cholesterol hotspots depends on the receptor; here we show that the same is true for GPCR – PIP lipid interactions.
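The two metrics named in (iii) can be illustrated on a per-frame contact series for one residue-lipid pair. In the sketch below, "number of contacts" is taken as the count of frames in contact and "duration" as the longest uninterrupted run of contact frames; both readings are assumptions made for illustration, and the input series is hypothetical.

```python
def contact_stats(in_contact):
    """Return (number of frames in contact, longest continuous contact)
    for a per-frame boolean contact series."""
    n_contacts = sum(in_contact)
    longest = current = 0
    for frame in in_contact:
        current = current + 1 if frame else 0
        longest = max(longest, current)
    return n_contacts, longest

# Hypothetical series: True when a lipid is within the cutoff of the residue
series = [True, True, False, True, True, True, False]
print(contact_stats(series))
```

Two sites with the same number of contacts can differ strongly in duration: many brief encounters and one long-lived binding event give the same count but very different longest runs.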

4.5.5 Cholesterol interactions are a unique identifier for GPCRs

MD simulations, either alone or in tandem with experiments, have elucidated cholesterol interaction sites for several GPCRs: β2AR, A2AR, SMO, δOPR, µOR, rhodopsin and 5HT1A (see Introduction). Understanding GPCR-cholesterol interactions in a GPCR-wide context has, however, been challenging. Why and how do cholesterol molecules interact with GPCRs? One mechanism that has been proposed to underlie GPCR-cholesterol interactions is the presence of cholesterol binding motifs (i.e. the CRAC and CARC motifs).(43) These are 5-13 amino acid-long sequences that exhibit a high binding affinity for cholesterol. These motifs indeed show a higher propensity for cholesterol interactions(44) and have been found to form the interface of putative cholesterol interaction sites in GPCRs.(43) Due to their somewhat indiscriminate definition, however, it is difficult to properly assess their importance and relevance. To address these issues as well as gain a deeper insight into the “how” part of the GPCR-cholesterol question we present a detailed analysis of our data with respect to cholesterol interactions.
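The permissiveness of these motif definitions is easy to see when they are written as regular expressions. The sketch below assumes the definitions commonly quoted in the literature, CRAC as (L/V)-X(1-5)-Y-X(1-5)-(K/R) and CARC as (K/R)-X(1-5)-(Y/F)-X(1-5)-(L/V), which give the 5-13 residue length range mentioned above; the patterns, function name, and test sequences are illustrative and are not taken from our analysis scripts.

```python
import re

# Assumed literature definitions (X = any residue). The lookahead lets
# overlapping motifs starting at different positions all be reported.
CRAC = re.compile(r"(?=([LV].{1,5}Y.{1,5}[KR]))")
CARC = re.compile(r"(?=([KR].{1,5}[YF].{1,5}[LV]))")

def find_motifs(sequence, pattern):
    """Return (start index, matched segment) for every motif start."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

# Hypothetical one-letter sequences
print(find_motifs("AALAAYAAKAA", CRAC))
print(find_motifs("KAAFAAL", CARC))
```

Because each motif only fixes three residues separated by flexible spacers, matches occur frequently in any helix rich in L, V, K, R, Y or F, which is one reason their presence alone is a weak predictor of binding.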

For each simulated GPCR structure we identify several sites with an increased cholesterol localization (measured as number of contacts and 2D density profiles). These interaction sites, however, do not have the same binding strength for cholesterol and when they are studied further using other computational tools usually only a few of them display a consistent interaction with cholesterol. This is why in our calculations we simultaneously consider the number of contacts with cholesterol as well as their duration, resulting in a much clearer picture of GPCR-cholesterol interactions (Figures A-13 – A-15).

We show interactions with cholesterol for a small subset of GPCRs in Figure 4-6 (a detailed comparison and complete analysis is given in Figures A-4, A-13 – A-16 and A-20 – A-21). Each GPCR structure displays at least one site of pronounced affinity for cholesterol binding, which is stable and usually maintained throughout the simulation. On average, we find 5-7 interaction sites per GPCR, 1-3 of which are noted for the stability of the interaction.

Figure 4-6. GPCR-cholesterol interaction for a sample of eight GPCRs shown as a surface presentation of cholesterol contact durations. Color scale (red-white-blue) represents an increase in the duration of cholesterol contacts (similar to Figure 4-5). A larger set of GPCRs, including a detailed comparison between contact duration and contact number as the visualization metric is given in the SI.

The binding of cholesterol to AT2R is an interesting example displaying the interaction between GPCRs and cholesterol (Figure 4-7A). Helices TM4 and TM5 on the extracellular side of the receptor are spaced just sufficiently far apart to accommodate one cholesterol molecule in between them. The entrance to this crevice is made of hydrophobic residues (Pro-177 and Ile-211) and we consistently find a cholesterol molecule inserted inside this crevice, interacting with a phenylalanine (Phe-129) residue in TM3. This binding is mediated by interactions between the hydroxyl-containing (ROH) bead of cholesterol and Lys-215 (K5.42) in TM5 (Figure 4-7B). This interaction is present in all four copies of AT2R, in one case persisting for the last 29 µs of the 30 µs simulation. While we cannot observe complete insertion of cholesterol because of limited conformational flexibility of the proteins in coarse-grained simulations, lipid entry in GPCRs has been noted before in atomistic simulations for CB2R,(45) opsin,(46) and A2AR.(30) Analysis for all protein copies is shown in Figure A-12.


Figure 4-7. Cholesterol binding to angiotensin receptor (AT2R). A. Simulation snapshots showing a cholesterol (red) molecule buried between helices TM4 and TM5 of AT2R (white). Residues lining the entrance to this crevice are shown as blue spheres; cholesterol’s hydroxyl-containing ROH bead interacts with Lys-215 (K5.42). TM4 is left transparent for clarity; CRAC and CARC motifs present in TM4 and TM5, respectively, are colored in green. B. Centre-of-mass distance calculations between cholesterol and Phe-129 (3.37) and Lys-215 (K5.42), respectively.

An important feature of the AT2R-cholesterol interaction is that TM4 contains a CRAC motif and TM5 contains a CARC motif, both lining the cholesterol entry gateway. Analysing our data in the context of cholesterol binding motifs reveals a more complicated story, however. Indeed, we find CRAC and CARC motifs to take part in forming cholesterol interaction sites for several GPCRs: A2AR, ApelinR, AT2R, CB1R, CXCR1, D3R, endothelin (ETBR), GLP1, GlucagonR, protease-activated receptor (PAR2), and lysophospholipid sphingosine 1-phosphate receptor (S1PR1), showing that these motifs are capable of binding, and do bind, cholesterol. When they are present, CRAC and CARC motifs constitute one way cholesterol may bind to GPCRs. In the context of all simulations performed, however, we find that the mere presence of these motifs is insufficient to determine the existence of cholesterol hotspots, regardless of their location within TM helices. This is because we find many stable GPCR-cholesterol interactions that are not mediated by these motifs, and even plenty of such motifs that do not bind cholesterol at all. We further note that the number of CRAC/CARC motifs in GPCRs varies strongly, from three in A2AR (two of which overlap with one another) up to 13 in mGlu5 (nine CRAC and four CARC motifs). Notably, the two TM helices of mGlu5 that feature the highest localization of cholesterol do not bind it through any of those motifs. On the other hand, other GPCRs, like CXCR1 and S1PR1, do bind cholesterol through these motifs at their most pronounced interaction site. Similar comments about these motifs have also been made in the literature.(11, 47, 48)


4.5.6 GPCR-lipid interactions are dependent on the conformational state of the receptor

MD simulations of β2AR(18) show that cholesterol reduces the conformational landscape sampled by the receptor, through specific lipid-protein interactions rather than cholesterol’s order-inducing effect on membrane lipids. Simulations of µOR(28) and A2AR(19) using CGMD reveal cholesterol interactions that vary based on the conformational state of the receptor. We reaffirm these findings in the case of µOR and A2AR, and show that β2AR, M2R and RhodR also interact with cholesterol in a conformation-dependent manner. Specific interactions with cholesterol for rhodopsin have been noted before, with one of the interaction sites identified being located at the TM7/1 helix,(22) and rhodopsin interactions with DHA have been shown to depend on the conformational state of the receptor.(21)

Figure 4-8. Cholesterol interactions with RhodRi and RhodRa. A. 2D density profiles for rhodopsin. B. Sequence heatmap of the duration of contacts for TM1 and TM7/8 helices for RhodRi and RhodRa. C. Comparison of the position of Phe-293 in RhodRi (violet) and RhodRa (magenta) and the resulting change in their cholesterol interaction profile. Color gradients are as described previously.

Rhodopsin-cholesterol interactions differ significantly depending on the conformational state of the receptor (Figure 4-8A). We observe three interaction sites on the extracellular side of RhodRi that are largely missing in RhodRa. In terms of the duration of contacts, TM7/1 is the most prominent interaction site with cholesterol for RhodRi. Sequence heatmaps for the duration of contacts show that this site is completely absent in RhodRa (Figure 4-8B). What difference between the structures of RhodRi and RhodRa accounts for this change? Phe-293 is the key residue modifying rhodopsin-cholesterol interactions. In RhodRi, Phe-293 faces the inside of the receptor, allowing cholesterol molecules to easily access the TM7/1 interface, and assists in binding cholesterol. In RhodRa, however, it faces away from the receptor and is thus unavailable to interact with cholesterol. For RhodRa, on the other hand, we observe a cholesterol interaction site on the distal part of TM7, which is not present in RhodRi.

4.6 Discussion

Currently, 323 structures are known, distributed over 61 different GPCRs.(49) The vast majority of these structures belong to class A GPCRs, with class B, C and F only accounting for 37 of the structures. Out of all class A GPCR structures, almost half (132 or 45%) belong to only four GPCRs: β1AR (24), β2AR (22), A2AR (45) and rhodopsin (41). MD simulations, which rely on the availability of initial protein coordinates, reflect this, with the majority of papers being published on these GPCRs as well. Our set of simulated structures spans 23 (or ~40% of the 61) different GPCRs and allows us to characterize the lipid interaction profile of each structure individually, but also to understand lipid-protein interactions in a GPCR-wide context.

We analysed GPCR-lipid interactions with single lipids as well as by grouping lipids based on their headgroup type and lipid tail saturation level. The results show that GPCRs are characterized by a unique lipid-protein interaction profile that is not only GPCR dependent but also conformation dependent. We consistently see GPCRs interacting specifically with cholesterol, GM and PIP lipids, in an environment that is strikingly different from the surrounding membrane. We observe that, for the most part, GPCRs share the types of lipid that are enriched or depleted, but not the degree of enrichment or depletion, and that the local membrane environments of class A and non-class A GPCRs differ.

We analysed all GPCR crystal structures for interactions with cholesterol and found that 20% of GPCR crystal structures solved (64 out of 323) have been co-crystallized with cholesterol. Of them, 38 (59%) are A2AR (27) and β2AR (11) structures. Interestingly, none of the β1AR and rhodopsin structures solved contain bound cholesterol molecules, despite them being solved at roughly equal numbers to β2AR and A2AR, respectively. Serotonin receptors (5HT2A and 5HT2B) account for an additional 9 structures. Of all structures that have been crystallized with cholesterol, the only ones we did not simulate are thromboxane A2 and P2Y receptors. We give a detailed comparison of our simulation results with available experimental data in Figures A-13 – A-16. A summary of cholesterol interaction sites from crystallographic data shows that the interaction sites appearing more than once are: TM1-4 (ic), TM1-2 (ec), TM4-5 (ic), TM6-7 (ec) and TM8/1. These are interaction sites that we also consistently observe in our simulations, and based on the number and duration of contacts with cholesterol we observe the TM2/3 (ec), TM6-7 (ec), TM8/1 (ic) interaction sites more frequently than TM1/4 (ic), TM5-6 (ec), TM6-7 (ic), TM7/1 (ec). Other sites like TM1-2 (ec), TM4-5 (ec), TM4-5 (ic) occur even less frequently.

Overall, GPCRs express two larger (TM1-4 and TM5 - 8/1) and two smaller (TM4-5 and TM7-8/1) surfaces for cholesterol binding, and each of these surfaces can bind cholesterol on either the extracellular (ec) or intracellular (ic) side (Figure A-17). For instance, binding on the TM1-4 surface can occur on the extracellular side between TM1-2, TM2-3 (albeit to a much lesser extent), and on the intracellular side between TM1-2/4. The other large surface (TM5-8/1) displays several binding sites for cholesterol with the most prominent, in our simulations, being the TM6/7 (ec) interface. The TM6-7 (ec) interface in particular seems to occur on many class A GPCRs. We consistently find a high cholesterol occupancy at this site, sufficient to accommodate two cholesterol molecules (similar to what is observed in crystal structures). The hydrophobic residue (usually either valine, leucine, or isoleucine) in the 6.46 position is seen to interact preferentially with cholesterol for many class A GPCRs. Non-class A GPCRs (SMO, GLP1, calcitoninR), however, do not seem to display this interaction site. In terms of GPCR ligand binding and functional activity, cholesterol can act as either a positive or a negative modulator.(30) Guixà-González et al.(30) summarized the relevant literature on this issue, highlighting the receptor-dependent activity of cholesterol. Considering that cholesterol enhances the functional capabilities of β2AR and μOR, and diminishes those of rhodopsin, we speculate that the TM6-7 (ec) interaction site, which we observe for the former but not for the latter, may be, at least partly, responsible for this difference. While the nature of our simulation protocol hinders us from making any definitive statements, we do think this issue merits further investigation.

In their 2008 study, Hanson et al.(50) defined the Cholesterol Consensus Motif (CCM), which differs from CRAC/CARC motifs in that it is a spatial arrangement of residues rather than a linear sequence. They defined it in the context of β2AR and estimated that the motif exists in 21% of human class A GPCRs. Our results strongly point towards a cholesterol binding profile of GPCRs that involves residues from multiple helices. The majority of interactions with cholesterol that we observe, especially those that are present for a majority of the simulation time, are found at the interfaces between two or three helices and often also supported by residues from extracellular and intracellular loops. Even cholesterol interactions that predominantly involve TM1 are supported by interactions of cholesterol (ROH bead) with H8 residues.

Based on our simulations, it appears that cholesterol binding is conditioned on two key components: a hydrophobic residue environment that stabilizes cholesterol, and a geometric compatibility between cholesterol and the protein interface which accommodates its ring structure. Shielding of the cholesterol hydroxyl group (“ROH” bead in our system) from the hydrophobic environment is also important but we find that nearby lipids help with it, and as such, charged residues are not an essential requirement. Cholesterol binding may occur in the absence of aromatic residues; the presence of aromatic residues, however, is observed very frequently and their orientation may serve as a determining factor for cholesterol binding as we observe for rhodopsin (Figure 4-8).

CRAC/CARC motifs may be unique in that they are linear sequences that fulfill all these requirements themselves, but usually we see two or three TM helices working in concert to bind cholesterol.

Figure 4-9. Overview of cholesterol binding sites. We show cholesterol binding at the TM2/3 (ec) interface for three GPCRs: SMO, μORi, and CB1R. For comparison, the same interface is shown for β2ARi where this interaction site is missing. Proteins are shown in white, cholesterol in red and residues comprising the TM2/3 (ec) interface are shown in yellow.

Figure 4-9 highlights the TM2/3 (ec) interface for four different GPCRs: three (SMO, μORi and CB1R) bind cholesterol specifically at that site and the fourth (β2ARi) does not. The reason for this is that residues comprising the TM2/3 (ec) interface for β2AR do not provide the “geometric compatibility” necessary for binding. SMO, μORi and CB1R residues at the TM2/3 (ec) interface form an “interacting bed” that enables cholesterol binding. The spacing of residues is also an important factor determining the interaction strength. For these proteins the binding strength (measured as duration of contacts) is SMO ≥ CB1R > μORi.

We also find PIP lipids are closely involved in lipid-protein interactions with GPCRs. PIP lipids and in particular interactions with PtdIns(4,5)P2, confer stability to GPCRs and increase their GTPase activity.(25)

Mass spectrometry experiments in tandem with computer simulations showed that PIP2 lipids stabilize the active state of GPCRs and act as allosteric modulators.(25) This effect seems to be higher for PIP2 than for PA, PI, PS and other PIP lipids, and the cytoplasmic side of GPCRs may contain PIP binding hotspots. MD simulations of A2AR reaffirm the importance of PIP lipid interactions.(19) Here we confirm these findings and furthermore extend them to the whole GPCR family. We also find that PIP lipid interactions, while being mediated by charge interactions, differ among GPCRs. Taking all simulations into account, we see that PIP lipids bind at four different sites on GPCR surfaces: TM1-ICL1-TM2, TM3-ICL2-TM4, TM5-ICL3-TM6, and TM7/8, but GPCRs differ in which of these sites are used and how. For example, A2AR and β2AR receptors do not make any noticeable contacts through their TM1-ICL1-TM2 interface, whereas many other GPCRs do. The number of PIP lipids bound to each of these sites varies among the GPCRs simulated as well, as does the longevity of these interactions. For instance, CXCR1 binds PIP lipids for the whole simulation time (Figure 4-5), whereas SMO does so only for a fraction of it.

In our simulations we have used the MARTINI model to study GPCR-lipid interactions, and as such all underlying assumptions on which the model is built, as well as its advantages and shortcomings, carry over to our results. Due to the nature of the model, we lack the resolution to describe in detail the lipid-binding sites identified or to provide a quantitative measure of their strength. When we refer here to the strength of a binding site, we base it on the number and duration of lipid contacts. This is, however, not a replacement for all-atom simulations and free energy calculations. Furthermore, to analyse lipid-protein interactions in a GPCR-wide context, we used the same system setup for all structures. The disadvantage of this approach is that our complex membrane model does not represent the “natural” environment of any GPCR in particular. This is, however, a problem that all MD simulation studies of membrane proteins face.(12)
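The contact-based notion of binding strength mentioned above can be illustrated with a short sketch. It assumes a per-frame boolean series recording whether a lipid is within some cutoff of a given site; the function name, the toy series, and the 2 ns frame spacing are hypothetical and are not the analysis protocol used in this work (which is described in Appendix A).

```python
from itertools import groupby


def contact_stats(contacts, dt_ns):
    """Summarize a per-frame contact series for one lipid-binding site.

    contacts: iterable of truthy/falsy values, one per trajectory frame
    dt_ns:    time between frames in nanoseconds (assumed constant)
    """
    frames = [bool(c) for c in contacts]
    # Length, in frames, of every uninterrupted run of contact.
    runs = [sum(1 for _ in grp) for in_contact, grp in groupby(frames) if in_contact]
    durations = [n * dt_ns for n in runs]
    return {
        # Fraction of frames in which the site is occupied.
        "occupancy": sum(frames) / len(frames) if frames else 0.0,
        # Number of separate binding events.
        "n_events": len(durations),
        # Longest and mean continuous contact, in ns.
        "longest_ns": max(durations, default=0.0),
        "mean_ns": sum(durations) / len(durations) if durations else 0.0,
    }


# Hypothetical toy series: three binding events lasting 3, 2 and 1 frames.
series = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
stats = contact_stats(series, dt_ns=2.0)
print(stats)
```

Occupancy and longest continuous contact capture different aspects of a site: a site can be occupied most of the time by rapidly exchanging lipids, or rarely occupied but by a single long-lived lipid. Neither quantity is a binding free energy.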

Our membrane model contains three different PIP lipids in equal amounts: PIP1, PIP2 and PIP3, and we observe each of them interacting with GPCRs. Current experimental evidence,(25) however, shows PIP2 to be preferred and to act in a structure-specific manner compared to other lipid types. While we do observe PIP2 interactions to occur more frequently than PIP1 interactions, we lack either the resolution or the sampling to address this question properly. In our model, we also lack any GPCR effector proteins and as such cannot comment on how GPCR-PIP lipid interactions affect coupling to G proteins. We do note, however, that PIP lipid interactions with the active and inactive states of the GPCRs simulated here differ from each other (Figure A-18). More detailed simulations at a higher resolution, coupled with free energy calculation methods, are necessary to fully characterize these details.

Our results represent a large-scale attempt to understand lipid-GPCR interactions at the family level of classification. To this end, we have uncovered that GPCRs, despite sharing common structural features and a conserved helical core, nevertheless create a unique local membrane environment and interact with lipids in a GPCR conformation-specific manner. Lipids have been implicated as either affecting or directly controlling the activity of many proteins, even acting as allosteric modulators.(11) Humans express over 800 different GPCRs, and the ligand binding landscape of GPCRs numbers in the thousands. Yet signaling in response to ligand binding is mediated by only four Gα families.(2) Flock et al.(2) proposed the existence of a “selectivity barcode” at the GPCR-G protein interface that could enable this large array of GPCRs to maintain their specific response while coupling to only a few effector proteins. GPCR-lipid interactions may present another such barcode that characterizes GPCRs and helps in retaining their specific response.

4.7 Conclusions

We observe specific interactions with lipids for all GPCRs simulated. The lipid types we observe to consistently and most prominently form specific and long-lasting interactions with GPCRs are cholesterol and PIP lipids. Analysis of these interactions, however, reveals that while cholesterol interactions depend on the existence of a hydrophobic environment and aromatic residues to stabilize its ring structure, interactions with PIP lipids rely on the existence of charged residues lining the intracellular loops.

When we compare the cholesterol interaction profiles between different GPCRs, or even different conformational states of the same GPCR, we find some interaction sites that are quite common (e.g. TM6-7 (ec), TM2/3 (ec)) and others that occur rarely (e.g. TM4-5 (ic)). In the cases where the same interaction site is observed to bind cholesterol for multiple GPCRs, we still see differences in the binding strength of cholesterol and its binding conformation (Figure 4-9).

Accounting for the lipid type, the different interaction sites and their binding strength, as well as the resulting changes in bilayer thickness and curvature, we conclude that the GPCR-lipid interaction profile constitutes a defining feature for each GPCR structure.

Along with the main text we provide a webpage (https://bisejdiu.github.io/GPCR-lipid-interactions) where users can interactively view interactions of GPCRs with cholesterol and PIP lipids represented as 3D densities as well as the calculated thickness and curvature profiles.

4.8 Supporting Material

Supporting information describing the analysis tools and protocol, the full list of GPCRs simulated, along with 21 additional figures and 5 additional tables is available online. The supporting material for this thesis has been put into Appendix A.

4.8.1 Supporting Citations

References (51-89) appear in the Supporting Material.

4.9 References

1. Isberg V., S. Mordalski, C. Munk, K. Rataj, K. Harpsoe, A. S. Hauser, B. Vroling, A. J. Bojarski, G. Vriend, D. E. Gloriam. GPCRdb: an information system for G protein-coupled receptors. Nucleic Acids Res. 2016;44(D1):D356-D364.

2. Flock T., A. S. Hauser, N. Lund, D. E. Gloriam, S. Balaji, M. M. Babu. Selectivity determinants of GPCR-G-protein binding. Nature. 2017;545(7654):317-+.

3. Manglik A., A. C. Kruse. Structural Basis for G Protein-Coupled Receptor Activation. Biochemistry. 2017;56(42):5628-5634.

4. Gimpl G. Interaction of G protein coupled receptors and cholesterol. Chem Phys Lipids. 2016;199:61-73.

5. Song Y., A. K. Kenworthy, C. R. Sanders. Cholesterol as a co-solvent and a ligand for membrane proteins. Protein Sci. 2014;23(1):1-22.

6. Sengupta D., A. Chattopadhyay. Molecular dynamics simulations of GPCR-cholesterol interaction: An emerging paradigm. Biochim Biophys Acta. 2015;1848(9):1775-1782.

7. Parrill A. L., G. Tigyi. Integrating the puzzle pieces: the current atomistic picture of phospholipid-G protein coupled receptor interactions. Biochim Biophys Acta. 2013;1831(1):2-12.

8. Mondal S., G. Khelashvili, H. Weinstein. Not just an oil slick: how the energetics of protein-membrane interactions impacts the function and organization of transmembrane proteins. Biophys J. 2014;106(11):2305-2316.

9. Soubias O., K. Gawrisch. The role of the lipid matrix for structure and function of the GPCR rhodopsin. Biochim Biophys Acta-Biomembr. 2012;1818(2):234-240.

10. Soubias O., W. E. Teague, K. G. Hines, K. Gawrisch. Rhodopsin/Lipid Hydrophobic Matching-Rhodopsin Oligomerization and Function. Biophys J. 2015;108(5):1125-1132.

11. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

12. Marrink S. J., V. Corradi, P. C. Souza, H. I. Ingólfsson, D. P. Tieleman, M. S. Sansom. Computational modeling of realistic cell membranes. Chem Rev. 2019;119(9):6184-6226.

13. Enkavi G., M. Javanainen, W. Kulig, T. Róg, I. Vattulainen. Multiscale Simulations of Biological Membranes: The Challenge To Understand Biological Phenomena in a Living Substance. Chem Rev. 2019.

14. Muller M. P., T. Jiang, C. Sun, M. Lihan, S. Pant, P. Mahinthichaichan, A. Trifan, E. Tajkhorshid. Characterization of Lipid–Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation. Chem Rev. 2019.

15. Cheng X., J. C. Smith. Biological Membrane Organization and Cellular Signaling. Chem Rev. 2019;119(9):5849-5880.

16. Genheden S., J. W. Essex, A. G. Lee. G protein coupled receptor interactions with cholesterol deep in the membrane. Biochim Biophys Acta-Biomembr. 2017;1859(2):268-281.

17. Cang X. H., Y. Du, Y. Y. Mao, Y. Y. Wang, H. Y. Yang, H. L. Jiang. Mapping the Functional Binding Sites of Cholesterol in beta(2)-Adrenergic Receptor by Long-Time Molecular Dynamics Simulations. J Phys Chem B. 2013;117(4):1085-1094.

18. Manna M., M. Niemela, J. Tynkkynen, M. Javanainen, W. Kulig, D. J. Muller, T. Rog, I. Vattulainen. Mechanism of allosteric regulation of beta2-adrenergic receptor by cholesterol. eLife. 2016;5.

19. Song W., H. Y. Yen, C. V. Robinson, M. S. P. Sansom. State-dependent Lipid Interactions with the A2a Receptor Revealed by MD Simulations Using In Vivo-Mimetic Membranes. Structure. 2019;27(2):392-403 e393.

20. Rouviere E., C. Arnarez, L. W. Yang, E. Lyman. Identification of Two New Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. Biophys J. 2017;113(11):2415-2424.

21. Salas-Estrada L. A., N. Leioatts, T. D. Romo, A. Grossfield. Lipids Alter Rhodopsin Function via Ligand-like and Solvent-like Interactions. Biophys J. 2018;114(2):355-367.

22. Horn J. N., T. C. Kao, A. Grossfield. Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids and Cholesterol with Rhodopsin. In: Filizola M, editor. G Protein-Coupled Receptors - Modeling and Simulation. Advances in Experimental Medicine and Biology. 796. Berlin: Springer-Verlag Berlin; 2014. p. 75-94.

23. Teague W. E., O. Soubias, H. Petrache, N. Fuller, K. G. Hines, R. P. Rand, K. Gawrisch. Elastic properties of polyunsaturated phosphatidylethanolamines influence rhodopsin function. Faraday Discuss. 2013;161:383-395.

24. Dawaliby R., C. Trubbia, C. Delporte, M. Masureel, P. Van Antwerpen, B. K. Kobilka, C. Govaerts. Allosteric regulation of G protein-coupled receptor activity by phospholipids. Nat Chem Biol. 2016;12(1):35-39.

25. Yen H. Y., K. K. Hoi, I. Liko, G. Hedger, M. R. Horrell, W. L. Song, D. Wu, P. Heine, T. Warne, Y. Lee, B. Carpenter, A. Pluckthun, C. G. Tate, M. S. P. Sansom, C. V. Robinson. PtdIns(4,5)P-2 stabilizes active states of GPCRs and enhances selectivity of G-protein coupling. Nature. 2018;559(7714):424-+.

26. Gutierrez M. G., K. S. Mansfield, N. Malmstadt. The Functional Activity of the Human Serotonin 5-HT1A Receptor Is Controlled by Lipid Bilayer Composition. Biophys J. 2016;110(11):2486-2495.

27. Sengupta D., A. Chattopadhyay. Identification of cholesterol binding sites in the serotonin1A receptor. J Phys Chem B. 2012;116(43):12991-12996.

28. Marino K. A., D. Prada-Gracia, D. Provasi, M. Filizola. Impact of Lipid Composition and Receptor Conformation on the Spatio-temporal Organization of mu-Opioid Receptors in a Multi-component Plasma Membrane Model. PLoS Comput Biol. 2016;12(12):e1005240.

29. Hedger G., H. Koldso, M. Chavent, C. Siebold, R. Rohatgi, M. S. P. Sansom. Cholesterol Interaction Sites on the Transmembrane Domain of the Hedgehog Signal Transducer and Class F G Protein-Coupled Receptor Smoothened. Structure. 2018.

30. Guixa-Gonzalez R., J. L. Albasanz, I. Rodriguez-Espigares, M. Pastor, F. Sanz, M. Marti-Solano, M. Manna, H. Martinez-Seara, P. W. Hildebrand, M. Martin, J. Selent. Membrane cholesterol access into a G-protein-coupled receptor. Nat Commun. 2017;8:14505.

31. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. Acs Central Sci. 2018;4(6):709-717.

32. Marrink S. J., H. J. Risselada, S. Yefimov, D. P. Tieleman, A. H. de Vries. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. The Journal of Physical Chemistry B. 2007;111(27):7812-7824.

33. Wassenaar T. A., H. I. Ingólfsson, R. A. Böckmann, D. P. Tieleman, S. J. Marrink. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. Journal of chemical theory and computation. 2015;11(5):2144-2155.

34. Manna M., W. Kulig, M. Javanainen, J. Tynkkynen, U. Hensen, D. J. Müller, T. Rog, I. Vattulainen. How to minimize artifacts in atomistic simulations of membrane proteins, whose crystal structure is heavily engineered: β2-adrenergic receptor in the spotlight. Journal of chemical theory and computation. 2015;11(7):3432-3445.

35. Ingólfsson H. I., M. N. Melo, F. J. Van Eerden, C. Arnarez, C. A. Lopez, T. A. Wassenaar, X. Periole, A. H. De Vries, D. P. Tieleman, S. J. Marrink. Lipid organization of the plasma membrane. J Am Chem Soc. 2014;136(41):14554-14559.

36. Gu R.-X., H. I. Ingólfsson, A. H. de Vries, S. J. Marrink, D. P. Tieleman. Ganglioside-lipid and ganglioside-protein interactions revealed by coarse-grained and atomistic molecular dynamics simulations. The Journal of Physical Chemistry B. 2016;121(15):3262-3275.

37. Abraham M. J., T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E. Lindahl. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19-25.

38. de Jong D. H., G. Singh, W. D. Bennett, C. Arnarez, T. A. Wassenaar, L. V. Schäfer, X. Periole, D. P. Tieleman, S. J. Marrink. Improved parameters for the martini coarse-grained protein force field. Journal of Chemical Theory and Computation. 2012;9(1):687-697.

39. Bussi G., D. Donadio, M. Parrinello. Canonical sampling through velocity rescaling. The Journal of chemical physics. 2007;126(1):014101.

40. Gu R.-X., S. Baoukina, D. P. Tieleman. Cholesterol Flip-Flop in Heterogeneous Membranes. Journal of chemical theory and computation. 2019;15(3):2064-2070.

41. Pliotas C., A. C. E. Dahl, T. Rasmussen, K. R. Mahendran, T. K. Smith, P. Marius, J. Gault, T. Banda, A. Rasmussen, S. Miller, C. V. Robinson, H. Bayley, M. S. P. Sansom, I. R. Booth, J. H. Naismith. The role of lipids in mechanosensation. Nat Struct Mol Biol. 2015;22(12):991-998.

42. Hedger G., M. S. Sansom, H. Koldsø. The juxtamembrane regions of human receptor tyrosine kinases exhibit conserved interaction sites with anionic lipids. Sci Rep. 2015;5:9198.

43. Jafurulla M., S. Tiwari, A. Chattopadhyay. Identification of cholesterol recognition amino acid consensus (CRAC) motif in G-protein coupled receptors. Biochem Biophys Res Commun. 2011;404(1):569-573.

44. Fantini J., C. Di Scala, L. S. Evans, P. T. F. Williamson, F. J. Barrantes. A mirror code for protein-cholesterol interactions in the two leaflets of biological membranes. Sci Rep. 2016;6:14.

45. Hurst D. P., A. Grossfield, D. L. Lynch, S. Feller, T. D. Romo, K. Gawrisch, M. C. Pitman, P. H. Reggio. A Lipid Pathway for Ligand Binding Is Necessary for a Cannabinoid G Protein-coupled Receptor. J Biol Chem. 2010;285(23):17954-17964.

46. Park J. H., P. Scheerer, K. P. Hofmann, H. W. Choe, O. P. Ernst. Crystal structure of the ligand-free G-protein-coupled receptor opsin. Nature. 2008;454(7201):183-U133.

47. Lee A. G. A database of predicted binding sites for cholesterol on membrane proteins, deep in the membrane. Biophys J. 2018;115(3):522-532.

48. Lee A. G. Interfacial Binding Sites for Cholesterol on G Protein-Coupled Receptors. Biophys J. 2019;116(9):1586-1597.

49. Jianyi Y., Z. Yang. GPCR-EXP: a database for experimentally solved GPCR structures.

50. Hanson M. A., V. Cherezov, M. T. Griffith, C. B. Roth, V. P. Jaakola, E. Y. T. Chien, J. Velasquez, P. Kuhn, R. C. Stevens. A specific cholesterol binding site is established by the 2.8 angstrom structure of the human beta(2)-adrenergic receptor. Structure. 2008;16(6):897-905.

51. Fantini J., F. J. Barrantes. How cholesterol interacts with membrane proteins: an exploration of cholesterol-binding sites including CRAC, CARC, and tilted domains. Front Physiol. 2013;4:9.

52. De Castro E., C. J. Sigrist, A. Gattiker, V. Bulliard, P. S. Langendijk-Genevaux, E. Gasteiger, A. Bairoch, N. Hulo. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 2006;34(suppl_2):W362-W365.

53. McGibbon R. T., K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane, V. S. Pande. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J. 2015;109(8):1528-1532.

54. Jones E., T. Oliphant, P. Peterson. SciPy: Open source scientific tools for Python. 2014.

55. Seabold S., J. Perktold, editors. Statsmodels: Econometric and statistical modeling with python. Proceedings of the 9th Python in Science Conference; 2010: Scipy.

56. Hunter J. D. Matplotlib: A 2D graphics environment. Computing in science & engineering. 2007;9(3):90.

57. Waskom M., O. Botvinnik, P. Hobson, J. Warmenhoven, J. Cole, Y. Halchenko, J. Vanderplas, S. Hoyer, S. Villalba, E. Quintero. Seaborn: statistical data visualization. 2014.

58. Rose A. S., P. W. Hildebrand. NGL Viewer: a web application for molecular visualization. Nucleic Acids Res. 2015;43(W1):W576-W579.

59. Cignoni P., M. Callieri, M. Corsini, M. Dellepiane, F. Ganovelli, G. Ranzuglia, editors. Meshlab: an open-source mesh processing tool. Eurographics Italian chapter conference; 2008.

60. Humphrey W., A. Dalke, K. Schulten. VMD: visual molecular dynamics. Journal of molecular graphics. 1996;14(1):33-38.

61. Wang C., Y. Jiang, J. Ma, H. Wu, D. Wacker, V. Katritch, G. W. Han, W. Liu, X.-P. Huang, E. Vardy. Structural basis for molecular recognition at serotonin receptors. Science. 2013;340(6132):610-614.

62. Lebon G., T. Warne, P. C. Edwards, K. Bennett, C. J. Langmead, A. G. Leslie, C. G. Tate. Agonist-bound adenosine A2A receptor structures reveal common features of GPCR activation. Nature. 2011;474(7352):521.

63. Jaakola V. P., M. T. Griffith, M. A. Hanson, V. Cherezov, E. Y. T. Chien, J. R. Lane, A. P. Ijzerman, R. C. Stevens. The 2.6 Angstrom Crystal Structure of a Human A(2A) Adenosine Receptor Bound to an Antagonist. Science. 2008;322(5905):1211-1217.

64. Ma Y., Y. Yue, Y. Ma, Q. Zhang, Q. Zhou, Y. Song, Y. Shen, X. Li, X. Ma, C. Li. Structural basis for apelin control of the human apelin receptor. Structure. 2017;25(6):858-866. e854.

65. Zhang H., G. W. Han, A. Batyuk, A. Ishchenko, K. L. White, N. Patel, A. Sadybekov, B. Zamlynny, M. T. Rudd, K. Hollenstein. Structural basis for selectivity and diversity in angiotensin II receptors. Nature. 2017;544(7650):327.

66. Rasmussen S. G., B. T. DeVree, Y. Zou, A. C. Kruse, K. Y. Chung, T. S. Kobilka, F. S. Thian, P. S. Chae, E. Pardon, D. Calinski, J. M. Mathiesen, S. T. Shah, J. A. Lyons, M. Caffrey, S. H. Gellman, J. Steyaert, G. Skiniotis, W. I. Weis, R. K. Sunahara, B. K. Kobilka. Crystal structure of the beta2 adrenergic receptor-Gs protein complex. Nature. 2011;477(7366):549-555.

67. Cherezov V., D. M. Rosenbaum, M. A. Hanson, S. G. F. Rasmussen, F. S. Thian, T. S. Kobilka, H. J. Choi, P. Kuhn, W. I. Weis, B. K. Kobilka, R. C. Stevens. High-resolution crystal structure of an engineered human beta(2)-adrenergic G protein-coupled receptor. Science. 2007;318(5854):1258-1265.

68. Hua T., K. Vemuri, M. Pu, L. Qu, G. W. Han, Y. Wu, S. Zhao, W. Shui, S. Li, A. Korde. Crystal structure of the human cannabinoid receptor CB1. Cell. 2016;167(3):750-762. e714.

69. Park S. H., B. B. Das, F. Casagrande, Y. Tian, H. J. Nothnagel, M. Chu, H. Kiefer, K. Maier, A. A. De Angelis, F. M. Marassi. Structure of the chemokine receptor CXCR1 in phospholipid bilayers. Nature. 2012;491(7426):779.

70. Chien E. Y., W. Liu, Q. Zhao, V. Katritch, G. W. Han, M. A. Hanson, L. Shi, A. H. Newman, J. A. Javitch, V. Cherezov. Structure of the human dopamine D3 receptor in complex with a D2/D3 selective antagonist. Science. 2010;330(6007):1091-1095.

71. Shihoya W., T. Nishizawa, K. Yamashita, A. Inoue, K. Hirata, F. M. N. Kadji, A. Okuta, K. Tani, J. Aoki, Y. Fujiyoshi, T. Doi, O. Nureki. X-ray structures of endothelin ETB receptor bound to clinical antagonist bosentan and its analog. Nat Struct Mol Biol. 2017;24(9):758-+.

72. Shimamura T., M. Shiroishi, S. Weyand, H. Tsujimoto, G. Winter, V. Katritch, R. Abagyan, V. Cherezov, W. Liu, G. W. Han. Structure of the human histamine H1 receptor complex with doxepin. Nature. 2011;475(7354):65.

73. Chrencik J. E., C. B. Roth, M. Terakado, H. Kurata, R. Omi, Y. Kihara, D. Warshaviak, S. Nakade, G. Asmar-Rovira, M. Mileni. Crystal structure of antagonist bound human lysophosphatidic acid receptor 1. Cell. 2015;161(7):1633-1643.

74. Kruse A. C., A. M. Ring, A. Manglik, J. Hu, K. Hu, K. Eitel, H. Hübner, E. Pardon, C. Valant, P. M. Sexton. Activation and allosteric modulation of a muscarinic acetylcholine receptor. Nature. 2013;504(7478):101.

75. Haga K., A. C. Kruse, H. Asada, T. Yurugi-Kobayashi, M. Shiroishi, C. Zhang, W. I. Weis, T. Okada, B. K. Kobilka, T. Haga. Structure of the human M2 muscarinic acetylcholine receptor bound to an antagonist. Nature. 2012;482(7386):547.

76. Huang W. J., A. Manglik, A. J. Venkatakrishnan, T. Laeremans, E. N. Feinberg, A. L. Sanborn, H. E. Kato, K. E. Livingston, T. S. Thorsen, R. C. Kling, S. Granier, P. Gmeiner, S. M. Husbands, J. R. Traynor, W. I. Weis, J. Steyaert, R. O. Dror, B. K. Kobilka. Structural insights into mu-opioid receptor activation. Nature. 2015;524(7565):315-+.

77. Manglik A., A. C. Kruse, T. S. Kobilka, F. S. Thian, J. M. Mathiesen, R. K. Sunahara, L. Pardo, W. I. Weis, B. K. Kobilka, S. Granier. Crystal structure of the mu-opioid receptor bound to a morphinan antagonist. Nature. 2012;485(7398):321-U170.

78. Suno R., K. T. Kimura, T. Nakane, K. Yamashita, J. Wang, T. Fujiwara, Y. Yamanaka, D. Im, S. Horita, H. Tsujimoto. Crystal structures of human orexin 2 receptor bound to the subtype-selective antagonist EMPA. Structure. 2018;26(1):7-19. e15.

79. Cheng R. K., C. Fiez-Vandal, O. Schlenker, K. Edman, B. Aggeler, D. G. Brown, G. A. Brown, R. M. Cooke, C. E. Dumelin, A. S. Doré. Structural insight into allosteric modulation of protease-activated receptor 2. Nature. 2017;545(7652):112.

80. Choe H.-W., Y. J. Kim, J. H. Park, T. Morizumi, E. F. Pai, N. Krauss, K. P. Hofmann, P. Scheerer, O. P. Ernst. Crystal structure of metarhodopsin II. Nature. 2011;471(7340):651.

81. Li J., P. C. Edwards, M. Burghammer, C. Villa, G. F. Schertler. Structure of bovine rhodopsin in a trigonal crystal form. J Mol Biol. 2004;343(5):1409-1438.

82. Hanson M. A., C. B. Roth, E. Jo, M. T. Griffith, F. L. Scott, G. Reinhart, H. Desale, B. Clemons, S. M. Cahalan, S. C. Schuerer. Crystal structure of a lipid G protein–coupled receptor. Science. 2012;335(6070):851-855.

83. Burg J. S., J. R. Ingram, A. J. Venkatakrishnan, K. M. Jude, A. Dukkipati, E. N. Feinberg, A. Angelini, D. Waghray, R. O. Dror, H. L. Ploegh, K. C. Garcia. Structural basis for chemokine recognition and activation of a viral G protein-coupled receptor. Science. 2015;347(6226):1113-1117.

84. Liang Y. L., M. Khoshouei, M. Radjainia, Y. Zhang, A. Glukhova, J. Tarrasch, D. M. Thal, S. G. B. Furness, G. Christopoulos, T. Coudrat, R. Danev, W. Baumeister, L. J. Miller, A. Christopoulos, B. K. Kobilka, D. Wootten, G. Skiniotis, P. M. Sexton. Phase-plate cryo-EM structure of a class B GPCR-G-protein complex. Nature. 2017;546(7656):118-+.

85. Song G., D. Yang, Y. Wang, C. de Graaf, Q. Zhou, S. Jiang, K. Liu, X. Cai, A. Dai, G. Lin. Human GLP-1 receptor transmembrane domain structure in complex with allosteric modulators. Nature. 2017;546(7657):312.

86. Siu F. Y., M. He, C. De Graaf, G. W. Han, D. Yang, Z. Zhang, C. Zhou, Q. Xu, D. Wacker, J. S. Joseph. Structure of the human glucagon class B G-protein-coupled receptor. Nature. 2013;499(7459):444.

87. Doré A. S., K. Okrasa, J. C. Patel, M. Serrano-Vega, K. Bennett, R. M. Cooke, J. C. Errey, A. Jazayeri, S. Khan, B. Tehan. Structure of class C GPCR metabotropic glutamate receptor 5 transmembrane domain. Nature. 2014;511(7511):557.

88. Wang C., H. Wu, T. Evron, E. Vardy, G. W. Han, X.-P. Huang, S. J. Hufeisen, T. J. Mangano, D. J. Urban, V. Katritch. Structural basis for Smoothened receptor modulation and chemoresistance to anticancer drugs. Nat Commun. 2014;5:4355.

89. Shan J., G. Khelashvili, S. Mondal, E. L. Mehler, H. Weinstein. Ligand-dependent conformations and dynamics of the serotonin 5-HT(2A) receptor determine its activation and membrane-driven oligomerization properties. PLoS Comput Biol. 2012;8(4):e1002473.

Chapter Five: COX-1 – lipid interactions: arachidonic acid, cholesterol, and phospholipid binding to the membrane binding domain of COX-1

Copyright

The following chapter has been made available on bioRxiv:

Sejdiu B. I., D. P. Tieleman. COX-1 - lipid interactions: arachidonic acid, cholesterol, and phospholipid binding to the membrane binding domain of COX-1. bioRxiv. 2020

The article is made available under a CC-BY-NC 4.0 International license.

Contribution

All the simulations and the entire analysis of the data are my own contributions. The text is written by me with help and feedback from my supervisor. In particular, the early draft of the following work was quite inferior to the current version, and the change reflects the suggestions and advice of my supervisor.

Tables S2 and S3 are available on bioRxiv and in the accompanying files described in Chapter 4.

Abbreviations

All abbreviations are introduced at first mention.

5.1 Abstract

Cyclooxygenases carry out the committed step in prostaglandin synthesis and are the target of NSAIDs, the most widely used class of drugs for alleviating pain, fever, and inflammation. Although cyclooxygenases have been extensively studied, one aspect of their biology that has been neglected is their interaction with membrane lipids. Such lipid-protein interactions have been shown to be a driving force behind membrane protein function and activity. Cyclooxygenases (COX-1 and COX-2) are bound on the luminal side of the endoplasmic reticulum membrane. The entrance to their active site is formed by a long hydrophobic channel which is used by the cyclooxygenase natural substrate, arachidonic acid, to access the enzyme. Using atomistic and coarse-grained simulations, we show that several membrane lipids are capable of accessing the same hydrophobic channel. We observe the preferential binding of arachidonic acid, cholesterol and glycerophospholipids to residues lining the cavity of the channel. We find that the membrane binding domain (MBD) of COX-1 is usually in a lipid-bound state and not empty. This orthosteric binding by other lipids suggests a potential regulatory role of membrane lipids with the possibility of affecting the COX-1 turnover rate. We also observed the unbiased binding of arachidonic acid to the MBD of COX-1, allowing us to clearly delineate its binding pathway. We identified a series of arginine residues as being responsible for guiding arachidonic acid towards the binding site. Finally, we were also able to identify the mechanism by which COX-1 induces a positive curvature on the membrane environment.

5.2 Introduction

Prostaglandin endoperoxide H synthase-1 and 2 isoenzymes, commonly known as cyclooxygenase-1 and 2 (COX-1 and COX-2), respectively, are membrane-bound enzymes that carry out the committed step in prostaglandin synthesis, and are thus either directly or indirectly involved in many malfunctions and pathologies, with large therapeutic implications(1-5). The anti-inflammatory, analgesic, antipyretic and antithrombotic effects of nonsteroidal anti-inflammatory drugs (NSAIDs) are associated with the inhibition of the activity of cyclooxygenases (COX-2 for the first three effects, and COX-1 for the last)(6-8).

Prostanoids, the end products of cyclooxygenase (and other enzymes) catalysis, are biologically active compounds formed from arachidonic acid as the precursor. Cyclooxygenases, in a two-step process, convert arachidonate to prostaglandin H2 (PGH2): first, through the addition of two O2 molecules arachidonate is converted to prostaglandin G2 (this is the cyclooxygenase – or COX – reaction) which is then followed by a 2e⁻ reduction reaction to produce PGH2 (the peroxidase – or POX – reaction). PGH2 then through the action of different enzymes is converted into other prostaglandins, prostacyclin or thromboxane A2 (9). Structurally, COX-1 and 2 are homodimers (Figure 5-1), with each monomer composed of a large globular (or catalytic) domain, an epidermal growth factor-like (EGF) domain, and a membrane binding domain (MBD) (10).
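As a compact summary, the two steps described above can be written as a reaction scheme. The protons and the water molecule in the second line are not stated in the text; they are added here as an assumption, to balance the standard two-electron reduction of a hydroperoxide to an alcohol.

```latex
\begin{align*}
\text{arachidonate} + 2\,\mathrm{O_2} &\longrightarrow \mathrm{PGG_2}
  && \text{(cyclooxygenase, COX, reaction)} \\
\mathrm{PGG_2} + 2e^{-} + 2\,\mathrm{H^{+}} &\longrightarrow \mathrm{PGH_2} + \mathrm{H_2O}
  && \text{(peroxidase, POX, reaction)}
\end{align*}
```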

COX-1 and 2 are integral monotopic membrane proteins, permanently bound to one side of the membrane without spanning the full bilayer. The MBD of COX-1 and 2 forms the entrance to a 25 Å long hydrophobic channel(2) that starts at the surface of the membrane and extends to the core of the catalytic domain where it ends at the cyclooxygenase reaction site (Figure 5-1C). Arachidonic acid enters the enzyme via the membrane through this channel, as do NSAID drugs(11, 12). The MBD is composed of four short helical segments (helices A-D) that anchor COX-1 into the upper (outer or luminal) leaflet(13) of the endoplasmic reticulum (ER) and inner membrane of the nuclear envelope(14). While cyclooxygenases are sequence homodimers, they function as heterodimers, with one subunit serving an allosteric (Eallo) and the other a catalytic (Ecat) role(15, 16). This is referred to as half-of-sites COX activity, whereby only one subunit is functionally active at a given time and is regulated allosterically by the other subunit. The cross-talk between the monomers is heavily influenced by ligand binding and often results in different outcomes for COX-1 and COX-2(9).

Many aspects of cyclooxygenases have been studied, including substrate binding and enzymatic activity, kinetic profile, as well as ligand binding and inhibition(9, 11, 17). Here we focus on the lipid-protein interactions that may modulate the activity of COX-1. The study of membrane protein interactions with their lipid environment has gained a lot of popularity in recent years and is a major focus of current research literature. Molecular Dynamics (MD) simulations, in particular, have provided many insights into the nature of these interactions, and the current understanding of lipid-protein interactions highlights the importance of direct and specific interactions with lipids, as well as general changes to the membrane physical properties in affecting the function and activity of embedded proteins(18-20). Several previous simulation studies have focused on cyclooxygenase catalytic activity(6, 7, 21, 22) and more general membrane binding of monotopic proteins(23-27). For example, the ability of the dimyristoylphosphatidylcholine (DMPC) headgroup to interact with COX-1 was noted as early as 2000(23).

Figure 5-1. Structure of the ovine Prostaglandin endoperoxide H synthase-1. A. The enzyme is shown with its domains colored in blue, green and gold denoting the catalytic, epidermal growth factor-like and membrane binding domains, respectively. Heme is shown in red, flurbiprofen in yellow, and POPC lipids in contact with it during the simulation are shown in stick representation. This figure is based on a similar figure from reference (2). B. Close-up view of the MBD of COX-1, showing the location of helices A-D. C. Surface rendering highlighting the hydrophobic channel formed by these helices. For reference, Arg-120 is drawn using a stick representation. D. Schematic view of one of the simulated systems for COX-1, showing different lipid species in different colors. One of the proteins is drawn transparent to reveal the backbone beads as they are modeled in the MARTINI model.

To study the lipid-protein interactions of COX-1 and obtain a comprehensive understanding of their interaction profile, we performed long-timescale molecular dynamics (MD) simulations of COX-1 using both atomistic (AA) and coarse-grained (CG) resolutions in both simple and complex membrane environments.

The combined results from these simulations highlight the ability of lipids, such as arachidonic acid, cholesterol and glycerophospholipids, to enter the hydrophobic channel of COX-1, where they interact with hydrophobic residues lining the interior of the channel and form charge-charge interactions with Arg-120. We find two mechanisms for arachidonic acid binding, thus revealing the pathway the lipid takes to enter the active site of the enzyme. We identify several residues that are involved in guiding arachidonic acid and in maintaining its binding to the hydrophobic channel. Lastly, we also identify the mechanism by which COX-1 induces previously reported perturbations in the surrounding lipid environment that result in the creation of a strong positive curvature.

5.3 Methods

Protein Structure. We retrieved the protein structure from the Protein Data Bank with PDB entry 1Q4G(28). For CG simulations, all ligands – including heme – were removed before the structure was converted into a coarse-grained representation using the martinize tool as described on the MARTINI(29) website (cgmartini.nl). For atomistic simulations, heme was included in the simulations, but all glycans alongside other ligands were excluded. Table S1 provides a complete list of setups and simulation details.

Coarse-Grained Molecular Dynamics Simulations. In terms of membrane composition, three types of systems were employed in our CG simulations: POPC only, ER-like membranes, and a membrane with complex lipid composition.

POPC membranes were used to highlight the curvature-inducing ability of COX-1, whereas CG simulations with lipid concentrations closely mimicking the lipid composition of ER membranes(30-32) were used to study lipid – COX-1 interactions in multicomponent membranes. These systems contained one copy of the enzyme inserted into the membrane. For the complex membrane setup, four copies of the coarse-grained protein were placed in a 40 × 40 nm² area, which was then filled with lipids using the tool insane(33). This membrane is composed of 63 different lipid types based on the model developed by Ingólfsson et al.(34) and applied to 10 different proteins(35), including COX-1. The data presented here for the complex membrane setup were previously published in that study. The exact compositions are given in Table S2. In brief, the membrane model contains an asymmetric distribution of the following major lipid groups: cholesterol (CHOL), phosphatidylcholine (PC), phosphatidylethanolamine (PE), and sphingomyelin (SM) placed in both leaflets; gangliosides exclusively in the upper leaflet; and the charged lipids phosphatidylserine (PS), phosphatidic acid (PA), phosphatidylinositol (PI), along with PI-phosphate, -bisphosphate, and -trisphosphate lipids (PIPs), placed exclusively in the lower leaflet. Complete details on the lipids used can be found on the MARTINI lipidome webpage (cgmartini.nl/index.php/force-field-parameters/lipids).

System equilibration was done using a gradual stepwise procedure, reducing the number and strength of position restraints on the proteins. Simulations were performed using a 20 fs time step. A target temperature of 310 K was maintained with a velocity-rescaling thermostat(36) and a coupling time constant of 1 ps. The Berendsen barostat(37) was applied semi-isotropically at 1 bar, with a compressibility of 3 · 10⁻⁴ bar⁻¹ and a relaxation constant of 5 ps. A small force constant of 1 kJ mol⁻¹ nm⁻² on protein backbone beads was maintained during the simulation to prevent diffusion of the four protein copies and the resulting protein-protein interactions, which cannot be accurately sampled on the time scale of the simulation. The system was simulated for 30 μs. Analysis of all CG systems, unless otherwise stated, was performed on the whole trajectory. All CG simulations were carried out using the GROMACS 4.6.x package(38).

Atomistic Molecular Dynamics Simulations. For atomistic simulations, bilayer assembly, protein preprocessing and embedding into the bilayer were done using the CHARMM-GUI webserver(39, 40). In addition to protein and lipids, the final system also contained TIP3P water(41) and 150 mM NaCl. The exact lipid composition of the atomistic systems is provided in Table S1. We used the CHARMM36m force-field(42) to describe the system and a 2 fs time step for integration. Particle Mesh Ewald(43) was used for long-range electrostatics with a real-space cutoff of 1.2 nm and 0.12 nm Fourier grid spacing. The same 1.2 nm cutoff was also used for van der Waals interactions. The Nosé-Hoover thermostat(44, 45) was used to maintain a target temperature of 310 K with a 1 ps coupling constant, applied separately to protein, membrane and solution components. We used the Parrinello-Rahman barostat(46) to keep the pressure at 1 bar, with a coupling constant of 5 ps and a compressibility of 4.5 · 10⁻⁵ bar⁻¹. The LINCS algorithm(47) was used to constrain bonds with hydrogen atoms. Bilayer compositions include simple lipid mixtures, composed of POPC lipids and either cholesterol (CHOL) or arachidonic acid (ARAN). Simulations that employ a large membrane contain only POPC lipids.

We performed several control simulations which differed in terms of their simulation parameters. Two of these systems were simulated using a surface tension of 5 and 30 mN/m, respectively, applied in the x-y plane, and a compressibility of 4.5 · 10⁻⁵ bar⁻¹. To rule out any force-field dependency of our results, one system was simulated using the GROMOS 54A7 force-field(48) and contained SPC water(49) instead. All atomistic simulations were carried out using the GROMACS 2016.x package(50).

Analysis. The GROMACS g_select tool was used to calculate the lipid count around proteins within a distance cutoff for the upper leaflet. Calculations for the lower leaflet were done using in-house scripts which used MDTraj(51) to process the trajectories, in addition to standard Python libraries for scientific computing (numpy and scipy) and visualization (matplotlib). 2D density, thickness and curvature profiles were calculated using an in-house program that has been applied previously(35, 52) and is explained in detail elsewhere(35).

Calculations of lipid contacts were done by considering both the total number of lipid contacts as well as their duration. When measuring the duration of contacts, we only consider the contact with the longest duration. These results were either plotted as a time series, as in Figure 5-2, or projected onto the surface of individual residues and colored via a color gradient, as in Figure 5-3.
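The two contact measures described above can be illustrated with a minimal sketch. It assumes that a per-frame boolean contact series has already been computed for one residue-lipid pair; the function name, the frame spacing and the example data are hypothetical and are not taken from the in-house scripts.

```python
from itertools import groupby

def contact_summary(in_contact, dt_ns=1.0):
    """Return (total frames in contact, longest continuous contact in ns).

    in_contact: per-frame booleans for one residue-lipid pair (assumed input).
    dt_ns: time between saved frames in ns (hypothetical value).
    """
    total = sum(1 for flag in in_contact if flag)
    # Length of every uninterrupted run of True values; keep only the longest.
    longest = max(
        (sum(1 for _ in run) for flag, run in groupby(in_contact) if flag),
        default=0,
    )
    return total, longest * dt_ns

# Hypothetical contact series: two short contacts and one longer one.
flags = [False, True, True, False, True, True, True, False, True]
total, longest_ns = contact_summary(flags, dt_ns=2.0)
print(total, longest_ns)
```

Keeping only the longest run, rather than the summed contact time, separates a lipid that stays bound from one that revisits a site many times briefly.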

To show bilayer deformations as a result of POPC lipid interactions with basic residues on the surface of cyclooxygenases, we calculated distances between lipid P atoms and the COX-1 center of mass. These calculations were done for three mutually exclusive categories, meaning that a POPC lipid can belong to at most one of these categories. They are: POPC lipids in direct contact with arginine or lysine residues, POPC lipids in close contact with the whole protein (excluding those interacting with arginine and lysine residues), and other POPC lipids. To define what constitutes a ‘direct contact’ we used three different distance cutoffs: 0.4, 0.5 and 0.6 nm. The results presented in Figure 5-6C are derived from the system with a small surface tension applied. This was done to decouple bilayer undulations, which are a property of every biological membrane, from perturbations to the membrane, such as positive curvature, caused by the presence of COX-1. The application of surface tension suppresses bilayer undulations and allows us to measure the effect of COX-1 on the curvature of the membrane.
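The grouping into mutually exclusive categories can be sketched as follows. This is an illustration under stated assumptions, not the in-house implementation: each lipid is assumed to come with its P-atom position and two precomputed minimum distances (to Arg/Lys atoms and to all remaining protein atoms), and all names and numbers are hypothetical.

```python
import math
from statistics import mean

def classify_lipid(d_basic, d_other, cutoff):
    """Assign a lipid to exactly one group from its minimum distances (nm)."""
    if d_basic <= cutoff:
        return "basic"    # contact with arginine or lysine takes precedence
    if d_other <= cutoff:
        return "protein"  # contact with the protein, but not with Arg/Lys
    return "bulk"         # all other lipids

def mean_com_distance(lipids, com, cutoff):
    """Mean P-atom distance to the protein center of mass, per group."""
    groups = {"basic": [], "protein": [], "bulk": []}
    for p_xyz, d_basic, d_other in lipids:
        group = classify_lipid(d_basic, d_other, cutoff)
        groups[group].append(math.dist(p_xyz, com))
    return {name: (mean(vals) if vals else None) for name, vals in groups.items()}

# Hypothetical data: (P position in nm, min distance to Arg/Lys, min distance to other residues)
lipids = [((0.0, 0.0, 3.0), 0.35, 0.30),
          ((0.0, 0.0, 4.0), 0.55, 0.45),
          ((0.0, 0.0, 5.0), 2.00, 1.50)]
for cutoff in (0.4, 0.5, 0.6):
    print(cutoff, mean_com_distance(lipids, com=(0.0, 0.0, 0.0), cutoff=cutoff))
```

Testing the Arg/Lys condition first is what makes the groups mutually exclusive: a lipid touching both a basic residue and another residue is counted only once.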

Visualizations are done using VMD(53) and NGL Viewer(54, 55).

5.4 Results

5.4.1 Characterization of arachidonic acid binding to COX-1

The endogenous substrate of COX-1 enzymes is arachidonic acid, which is released by phospholipase A2-mediated hydrolysis of glycerophospholipids(56). Computational studies of COX-1 interactions with arachidonate focus on interactions of the lipid already bound at the cyclooxygenase site(6, 7, 21, 22) of the enzyme with residues surrounding it. The way the lipid enters the active site, however, has received much less attention. In atomistic simulations of COX-1 embedded in a membrane model containing arachidonate, we observe the entrance and specific binding of the latter into the hydrophobic channel of the MBD (Figure 5-2). Interactions with residues lining helices A-D of the MBD lead to arachidonate being pulled out of the membrane plane and inserted into the hydrophobic cavity. This configuration, with the lipid bound inside the MBD hydrophobic core, is highly stable and maintained throughout the simulation. We identify two types of interactions that are responsible for this binding: the charge-charge interaction between the carboxylate end of arachidonate and the guanidinium moiety of Arg-120, as well as the hydrophobic interaction between the arachidonate acyl chain and hydrophobic residues lining helices B and D of the MBD.

To measure the binding of arachidonate to the hydrophobic channel of COX-1 we performed distance calculations between the carboxylate headgroup of the lipid and the guanidinium moiety of Arg-120. We chose Arg-120 as a reference residue for these calculations throughout this study for the following reasons: (i) Arg-120 forms the entrance to the COX-1 cyclooxygenase site and as such delineates its entry point, (ii) it interacts with and stabilizes the bound lipid headgroup, (iii) it is of great functional importance to the activity of COX-1(57, 58) and (iv) the relative alignment of the distance vector connecting the lipid to Arg-120 is close to the bilayer normal vector, and as such it simultaneously allows us to measure the degree of lipid insertion into the channel. These results are highlighted in Figure 5-2B. These distance calculations reveal that once arachidonate binds to the hydrophobic channel of COX-1, it is maintained there for the whole duration of the trajectory with minimal fluctuation, which is highly indicative of specific binding.
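A distance time series of this kind can be sketched in a few lines. The example below is a simplified illustration rather than the analysis code used here: it assumes an orthorhombic box, takes one representative point per group (for instance the carboxylate carbon and the guanidinium carbon), and uses a hypothetical 0.5 nm threshold to mark the first frame of close contact.

```python
import math

def min_image_distance(a, b, box):
    """Distance (nm) between points a and b in an orthorhombic periodic box."""
    total = 0.0
    for ai, bi, length in zip(a, b, box):
        d = ai - bi
        d -= length * round(d / length)  # wrap the component into [-L/2, L/2]
        total += d * d
    return math.sqrt(total)

def first_contact_frame(distances, cutoff=0.5):
    """Index of the first frame below the cutoff (hypothetical), or None."""
    for frame, d in enumerate(distances):
        if d < cutoff:
            return frame
    return None

# Hypothetical coordinates (nm): the two groups sit across a periodic boundary.
box = (10.0, 10.0, 10.0)
print(min_image_distance((0.2, 1.0, 1.0), (9.9, 1.0, 1.0), box))

# Hypothetical distance series for one lipid approaching Arg-120.
series = [2.10, 1.40, 0.45, 0.40, 0.42]
print(first_contact_frame(series))
```

The minimum-image step matters for membrane systems, where a lipid and the protein can be close in space while lying on opposite sides of the periodic boundary.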

Figure 5-2. Arachidonic Acid (ARAN) binding to the MBD of COX-1 in AA simulations. Distance calculations between arachidonic acid carboxylate end and Arg-120 guanidino moiety for one monomer of COX-1(top line plot; denoted with A). For the other monomer (denoted with B) in addition to the same distance, we also show the distance between the phosphate headgroup of a bound POPC lipid to Arg-120. While the binding site is occupied, arachidonic acid interacts with Arg-83 instead (bottom line plot). The time when arachidonic acid starts interacting with Arg-120 is marked with a black arrow.

COX-1 is a homodimer, each monomer containing a catalytic site and hence a hydrophobic channel for lipid entrance. We observe the immediate binding of arachidonate on only one of the monomers. In the other monomer, however, we observe the binding of a POPC lipid instead. This binding – which has been reported before in the MD literature for DMPC(23) – is more variable and therefore less stable, hinting at the interaction being nonspecific and, in the long run, more likely to dissociate. Nevertheless, in our simulations, while bound, POPC clearly blocks arachidonate from entering the site and prevents its binding. Instead, arachidonate is kept at the ‘front door’, where it interacts with Arg-83. During the course of arachidonate interactions with Arg-83, we see the lipid coming in close contact with Arg-120 several times (Figure 5-2, black arrow). After close to 900 ns, we observe the displacement of the POPC lipid by arachidonate but do not observe it completely removing POPC from the MBD, which we expect to happen given more sampling. The MBD expands to accommodate both lipids at this site: POPC interacting mainly with helix B and arachidonate interacting with helix B and D residues.

The binding poses we observe for both lipids that bind to COX-1 include the lipid “extracted” from the membrane and reaching heights above the average membrane plane. Arachidonate, for instance, when bound, reaches a height of 1.98 ± 0.17 nm compared to 1.65 ± 0.05 nm for all other arachidonate lipids combined (calculated over the last few frames of the trajectory; distance is relative to the membrane center). Such a lipid configuration is stable and thermodynamically viable as a result of the aforementioned charge-based interactions of the lipid headgroup with Arg-120 and hydrophobic interactions with mainly Val-116 and Val-129, but also other hydrophobic residues from helices B and D of the MBD. In a similar simulation of equal length but using a protonated (uncharged) arachidonic acid, we still see it binding to the MBD and interacting with the same residues, thus highlighting the importance of hydrophobic interactions in maintaining the binding.
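A height relative to the membrane center, reported as mean ± standard deviation, can be computed as in the sketch below. The inputs are assumed to be per-frame z coordinates of the lipid headgroup and of the bilayer center; the numbers are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def height_stats(z_headgroup, z_center):
    """Mean and population standard deviation of headgroup height (nm)
    above the membrane center, from per-frame z coordinates."""
    heights = [zh - zc for zh, zc in zip(z_headgroup, z_center)]
    return mean(heights), pstdev(heights)

# Hypothetical z coordinates (nm) over three frames.
z_lipid = [5.9, 6.1, 6.0]
z_mid = [4.0, 4.1, 4.0]
avg, sd = height_stats(z_lipid, z_mid)
print(f"{avg:.2f} ± {sd:.2f} nm")
```

Subtracting the membrane center frame by frame removes drift of the whole bilayer along z, so that only the displacement of the lipid relative to the membrane is measured.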

5.4.2 Coarse-grained MD simulations reveal a complex interplay of COX-1 enzymes with lipids

The occupancy of the cavity within the MBD of COX-1 by POPC lipids closes the pathway used by arachidonate to access the enzyme’s active site. From these results, however, it is unclear if this binding is specific to POPC lipids or perhaps shared by other lipids as well. To test this, we conducted CG MD simulations of the enzyme embedded in several model bilayers. The construction of these membrane models aimed at reproducing the unique lipid composition of ER membranes, and they include several lipid species. Specifically, we employed low concentrations of cholesterol, which in ER membranes is reported to be in the range 5-8%(30-32), and a large ratio of PC lipids (Table S1). These systems also contain a very small amount of sphingolipids to match the low concentration of sphingo- and other complex lipids in ER membranes(59, 60). Figure 5-3A shows 2D density maps calculated separately for the upper and lower leaflet and highlighting the preferential localization of lipids in one of these systems (Figure B-2 shows the same calculations for all systems combined). Owing to their monotopic binding to the upper leaflet of the bilayer, COX-1 interactions with lipids are defined predominantly by lipid interactions with their membrane anchoring domain.

Figure 5-3. COX-1 interaction with lipids in ER-like membranes. A. Top and front view of the simulated system with each lipid type colored using a different color, along with density maps calculated separately for the upper and lower leaflet for PC, PE, CHOL and ARAN. B. Contact heatmaps between COX-1 and the same lipid groups calculated as the total number of contacts or their respective duration, using a white-red color gradient.

The MBD inserts well into the width of the upper bilayer leaflet and interacts with surrounding lipids. Most of these interactions are short-lived and transient; we do, however, also observe several interactions that persist for much of the simulation time. Such interactions hint at the possibility of specific lipid-protein interactions, which are interactions that may serve a functional or modulatory role in the activity of proteins. Figure 5-3A shows density maps for the following lipid groups found on the upper leaflet of the membrane: CHOL, PC, PE, and arachidonic acid (ARAN). The upper leaflet also contains SM lipids, but since they do not show any notable interaction with the enzyme, they are omitted from the figure. We see that each of these lipid types forms highly localized interactions with COX-1. Specifically, we note that the majority of these interactions are located within the interface formed by the MBD of each monomer, and in particular we observe that each of these lipids is capable of interacting with the hydrophobic channel inside the MBDs. To further highlight the latter, Figure 5-3B shows contact heatmaps between COX-1 and each of the above lipids measured as either their total number of contacts or their respective duration of interaction. Calculations are done on a per-residue basis and the resulting heatmaps are projected onto the surface of the enzyme. The heatmap formed by the total number of contacts confirms that the majority of interactions include the MBDs of COX-1. When we look at the longevity of these interactions instead, however, we see that only those interactions that involve the hydrophobic channel of the enzyme are left, showing clearly that only lipid interactions with this site of the enzyme are maintained for prolonged durations of the trajectory.
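A per-leaflet 2D density map of the kind shown in Figure 5-3A can be approximated by binning lipid positions on a grid and averaging over frames. The sketch below is only an illustration of that idea, not the in-house program: it assumes a fixed box size, frames already centered on the protein, and one x-y position per lipid per frame; all names are hypothetical.

```python
def density_map(xy_frames, box_xy, nbins):
    """Average number density (lipids per nm^2) on an nbins x nbins grid.

    xy_frames: list of frames, each a list of (x, y) positions in nm for the
               lipids of one type in one leaflet (assumed input).
    box_xy:    (Lx, Ly) box edge lengths in nm, assumed constant.
    """
    lx, ly = box_xy
    counts = [[0] * nbins for _ in range(nbins)]
    for frame in xy_frames:
        for x, y in frame:
            # Wrap into the box, then convert to a grid index.
            i = int((x % lx) / lx * nbins) % nbins
            j = int((y % ly) / ly * nbins) % nbins
            counts[i][j] += 1
    cell_area = (lx / nbins) * (ly / nbins)
    n_frames = len(xy_frames)
    return [[c / (n_frames * cell_area) for c in row] for row in counts]

# Hypothetical example: two frames, two lipids each, in a 2 x 2 nm box.
frames = [[(0.5, 0.5), (1.5, 0.5)],
          [(0.5, 0.5), (0.5, 1.5)]]
print(density_map(frames, box_xy=(2.0, 2.0), nbins=2))
```

Running the same function separately on upper-leaflet and lower-leaflet selections gives one map per leaflet, which is how the two sets of panels in the figure are distinguished.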

When comparing the binding of different non-arachidonate lipids against each other in terms of their consistency and specificity of binding, we find that in the majority of cases cholesterol, despite its low concentration in our model membranes, is the dominant lipid species interacting with COX-1 and occupying the MBD cavity. We observe cholesterol binding even in setups where its content is very low (3% of total lipids) or where arachidonate is one of the lipid components (Figure B-2). Therefore, to highlight cholesterol binding, we performed distance calculations for cholesterol bound at this site in a CG setup involving a complex membrane model composed of 63 different lipid types (but lacking arachidonate), in which four copies of COX-1 were embedded and which was simulated for 30 μs. Figure 5-4 shows data for one of these copies and Figure B-1 for the rest. These results indicate that despite the presence of many other lipid species in the system, we mainly see cholesterol occupying the hydrophobic channel of COX-1 and forming strong interactions that are maintained for anywhere from a few μs up to 13 μs of simulation time.

Figure 5-4. Cholesterol binding to the MBD of COX-1 in CG simulations. A. Overview of the binding site inside the MBD. On the left, the backbone of the coarse-grained protein is shown along with a transparent surface representation. The two panels on the right highlight the interaction of cholesterol (gray surface) with helices A-D of the MBD. Residues that interact most with cholesterol are shown in sphere representation (coloring and viewing angles are similar to Figure 5-1). B. Arg-120 – cholesterol distance as measured by the MARTINI representation of the arginine guanidinium and cholesterol hydroxyl moieties (ROH-SC2 beads), measured separately for each monomer. The same distance for the second cholesterol molecule that is sometimes bound at roughly the same site is drawn with a greyed-out dotted line.

Figure 5-4A provides a close view of the binding site within the hydrophobic channel of the protein along with residues involved in the binding process. The binding site is formed by the hydrophobic sidechains of leucines 92, 93, 112, and 115, as well as Val-116 and Val-119, which are positioned at opposing sides of the binding site (helices B and D) and form a tight interaction network with the hydrophobic ring structure of cholesterol. The hydrophilic headgroup of cholesterol, on the other hand, interacts with Arg-120, an essential residue for the activity of COX-1(57, 58). Figure 5-4B shows distance calculations for both COX-1 monomers between the cholesterol headgroup (ROH bead in the MARTINI model) and the guanidino moiety of Arg-120 (SC2 bead). This measurement was done to match the distance calculation shown in Figure 5-2 for arachidonate.

In our simulation setup we have four enzyme copies, each composed of two monomers. Hence, the data that we show in Figure 5-4 (and Figure B-1) correspond to 8 total binding sites. Collectively, we see several cholesterol binding and unbinding events, associated with varying levels of close contact durations, but generally no less than 2 μs. We also observe that the COX-1 hydrophobic channel is sometimes occupied by a second cholesterol molecule (Figure 5-4B, dotted line), with a much sparser frequency and lower longevity of binding. While the MBD certainly seems capable of accommodating two cholesterol molecules, this is not a stable configuration. This could also be a consequence of the presence of an elastic network on CG proteins, which would prohibit the MBD from expanding to accommodate both lipids (similar to what we observed for arachidonate).

The combined results from CG simulations confirm the binding of arachidonate similar to what we observe in atomistic simulations, and further show that other lipids – mainly cholesterol, but also PC and PE lipids – can also bind at the same interaction site as the endogenous ligand of COX-1. The biological implications of these findings are twofold: (i) the hydrophobic channel of COX-1 is usually in an occupied state and not empty, and therefore (ii) the dissociation of the bound lipid at this site has to precede arachidonate binding. The former relates to the most populated state adopted by COX-1 in ER membranes, whereas the latter hints at a possible importance of membrane lipids in affecting the kinetic profile of the enzyme. This is because a lipid that occupies a space to which the native ligand requires access for the protein's biological activity may play a regulatory role in that activity.

5.4.3 Atomistic simulations of cholesterol interactions with COX-1

To further investigate cholesterol binding to COX-1, we carried out simulations at the all-atom level of detail using simpler membrane models (POPC:CHOL). In simulations where the level of cholesterol is low (~15% of bilayer lipids), we do observe some prolonged interactions of COX-1 with cholesterol, but we do not observe any binding to the hydrophobic channel. While it may be the case that the atomistic simulations would result in different interactions compared to the coarse-grained simulations, we believe that the more limited sampling in the former is a far bigger issue. To account for this, we carried out the same simulations in setups with a larger cholesterol content, and indeed we do observe cholesterol binding to COX-1, at the same site as in the CG simulations (Figure 5-5). The binding site itself, as well as the residues that are responsible for maintaining the binding, are the same as what we saw in the CG simulations (see the COX-1 surface representation comparison between all-atom and coarse-grained results in Figure 5-5C).

Figure 5-5. Cholesterol binding to the MBD of COX-1 in AA simulations. A. Overview of the simulated system anchored to the membrane viewed in full (left) and the monomer bound to cholesterol (middle and right). B. Close-up view of the bound cholesterol. Hydrophobic residues lining the interface are shown in yellow. Arginine residues are drawn in blue. The coloring of the MBD uses the same color gradient as Figure 5-3 based on the number of interactions with cholesterol that are formed. C. Comparison of the binding site in the atomistic (left) and CG simulation (right). D. Distance calculations between the hydroxyl oxygen of cholesterol and the guanidinium carbon of three arginine residues that stabilize the polar headgroup of cholesterol.

The rough surface (β-face) of cholesterol faces the inside of the protein, with the smoother (α-face) directed towards the membrane. Similar to the CG simulations, a network of hydrophobic residues (Leu-92, Leu-93, Leu-112, Leu-115, Val-116, and Val-119) lining the MBD stabilizes the ring structure of cholesterol. Somewhat differently from the CG simulations, however, three adjacent arginine residues interact with its hydroxyl headgroup. Figure 5-5B shows distance calculations from these three arginine residues to the cholesterol hydroxyl headgroup, calculated to make comparison with the CG data easy. Binding occurs at the start of the simulation and is maintained throughout. Furthermore, all three arginine residues form contacts with cholesterol, with its hydroxyl headgroup placed equidistantly between them, apart from shorter intervals where cholesterol interacts predominantly with either Arg-73 or Arg-120. For Arg-73 binding to occur, helix A bends towards the other helices. In another similar simulation, however, we observe cholesterol binding without the bending of helix A and the involvement of Arg-73; as such, interactions with Arg-73 are not necessary for cholesterol binding.

In the setup with ~15% CHOL content, where we did not observe cholesterol binding, we instead see a POPC lipid insert and interact with residues at roughly the same site as arachidonate and cholesterol (Figure B-3). In contrast to them, however, POPC binding appears to involve largely electrostatic interactions. Even though the 2-oleoyl tail interacts with Trp-100 of helix C, the bulk of the interaction stability comes from interactions between the lipid phosphate and Arg-83 guanidino moieties (as is evident in Figure B-3: when this interaction breaks and the choline headgroup starts interacting with the very vertically placed Glu-493, the overall positioning of POPC at this site becomes less well-defined). This interaction with POPC inside the MBD hydrophobic channel in the setup with low cholesterol content may explain why we did not observe cholesterol binding there, since for it to occur we would have to sample the dissociation of the bound POPC lipid first. That is, the prior binding of POPC increases the sampling time required to observe the binding of cholesterol (note that this does not convey any information about their relative strength of binding).

Overall, while binding of cholesterol to COX-1 is specific, the binding site itself does not seem to be exclusive to cholesterol (as also noted above from CG simulations). The overall conclusion of the atomistic simulations is strong support for cholesterol binding, with the caveat that POPC (and likely other lipids) can access the same interaction site as well.

5.4.4 COX-1 induces a positive curvature on the surrounding lipid environment

Measurements of membrane physical properties such as membrane thickness, curvature, shear stress, strain and elasticity profiles show that proteins in addition to forming specific interactions with individual lipids, can also cause perturbations to the local membrane environment(18). The latter is the result of nonspecific lipid-protein interactions. These “bulk-lipid” effects have been studied extensively but generally are less well understood compared to the specific interaction with lipids. In our simulations, regardless of the setup, lipid composition or resolution of the models used, we consistently observe COX-1 exhibiting a marked effect on the surrounding membrane environment.

Despite their localization on the outer leaflet of (mainly) ER membranes, in all our setups we see COX-1 induce structural changes to the membrane and affect both leaflets. The immediately observable visual manifestation of these perturbations is the creation of a positive curvature around the enzyme (Figure 5-6). Calculations of the mean curvature profiles for the CG systems highlight the ability of COX-1 to affect the curvature of the upper and lower leaflet (Figure 5-6D). This effect is also easily observable when calculating the average thickness of the membrane model during the simulation, where for the same system, we see a higher average thickness at the enzyme embedding location compared to the surrounding environment. It is clear that this effect is not a result of specific lipid-protein interactions. Rather, it is a consequence of COX-1 nonspecific interactions with membrane lipids. In Figure 5-3A and, more prominently, Figure 5-6B, we notice that in addition to lipids interacting with residues inside the hydrophobic channel of the enzyme, they also preferentially localize at the interface between its MBDs, forming many smaller and less well-defined interaction sites. This is in particular the case for phospholipids.
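For reference, a common way to quantify the curvature of a leaflet is to describe its surface by a height function h(x, y), for example fitted to the lipid phosphate positions, and to evaluate the mean curvature in the Monge representation. This is the standard textbook definition, given here as a sketch; whether the in-house program uses exactly this form, and which sign convention it assigns to "positive" curvature, are assumptions.

```latex
H(x,y) = \frac{(1+h_y^2)\,h_{xx} - 2\,h_x h_y\,h_{xy} + (1+h_x^2)\,h_{yy}}
              {2\,\left(1+h_x^2+h_y^2\right)^{3/2}}
\;\approx\; \tfrac{1}{2}\left(h_{xx}+h_{yy}\right)
\quad \text{for } |\nabla h| \ll 1
```

Here subscripts denote partial derivatives of h, and the small-gradient approximation on the right applies to weakly deformed bilayers.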

To test if these non-specific interactions with membrane phospholipids are the mechanism by which COX-1 enzymes perturb the local environment, we carried out all-atom simulations of COX-1 embedded in a pure POPC membrane. These simulations employ a large membrane setup, which is necessary to distinguish between local lipids and bulk bilayer lipids. Figure 5-6A shows snapshots of one of these systems visually highlighting the curvature the upper-leaflet embedded COX-1 induces on the overall membrane (arrow II). The resulting density profiles of POPC lipids during the simulation (Figure 5-6B) confirm their increased localization at the interface between each monomer’s MBD (arrow I). COX-1 contains many surface-exposed positively charged residues, with several facing the membrane where they interact with local lipids. The interaction of phospholipids with these residues ultimately leads to the creation of the positive curvature we observe in our simulations. To show this, we separated POPC lipids into three mutually exclusive groups: POPC lipids interacting with lysine or arginine residues (blue), POPC lipids interacting with all other COX-1 residues (orange), and all other POPC lipids (green), and calculated their average distance from the center of mass of COX-1 (Figure 5-6C).

Figure 5-6. Curvature inducing property of COX-1. A. Schematic view of bilayer perturbations as a result of COX-1 interactions with POPC lipids. The middle panel shows the changes to the lower leaflet by only visualizing the POPC phosphorus atoms (arrow II). Positively charged residues are colored with dark and light blue for arginine and lysine, respectively. B. 2D density profiles for the lower and upper leaflet highlighting the increased localization of POPC lipids underneath the enzyme (arrow I). Please note the colorbar follows a ‘cyclic’ color gradient (see reference (61) for more information) for easy identification of preferential localization sites. C. Distance calculations between COX-1 center of mass (COM) and P atom of POPC lipids, from three mutually exclusive groups: POPC lipids interacting with arginine or lysine residues (blue), POPC lipids interacting with residues other than arginine and lysine, and all other POPC lipids. The definition of contact is done using three distance cutoffs: 0.4, 0.5, and 0.6 nm. D and E. Visualization of the mean curvature and membrane thickness, respectively.

A contact/interaction was defined using three different distance cutoffs: 0.4, 0.5 and 0.6 nm. The result shows that POPC lipids interacting with positively charged residues are on average closer to the center of mass of the protein than other POPC lipids, thus revealing a likely mechanism for the capability of COX-1 to induce a positive curvature. First, the MBDs act as physical barriers that separate lipids into protein-interacting, low diffusion lipids and freely moving, high diffusion lipids. Because proteins in general affect the diffusion rates of lipids, two additional factors play a key role for cyclooxygenases. Second, the outer surface of COX-1 is characterized by a plethora of positively charged residues which form charge-charge interactions with lipids containing negatively charged chemical groups (predominantly phospholipids), resulting in a height difference between lipids in-between the MBDs and surrounding lipids. Lastly, the partial insertion of COX-1 into the membrane leaves the lower leaflet without any supporting protein interactions. The combined result is the induction of a positive curvature on the local membrane environment, which persists even in simulations employing surface tension or other control parameters (see Methods). In contrast to the upper leaflet, calculations of lipid order parameters and number densities reveal no changes to their values for the lower leaflet (Figure B-4 – B-6), further supporting our claim that the curvature creation is driven purely by COX-1 interactions with lipids in the upper leaflet and that the enzyme does not affect the lipid composition of the lower leaflet.

5.5 Discussion

The understanding of the interplay between membrane embedded proteins and their surrounding lipid environment has become an important part of protein function and activity studies. MD simulations have revealed the detailed lipid interaction profile of many proteins (including GPCRs(52), ion channels(62), and many others(18)).

The endoplasmic reticulum is the main organelle for lipid synthesis, accounting for the majority of phospholipids (e.g. PC, PE and PS lipids)(63) and cholesterol(31). ER membranes are, however, quite low in cholesterol content themselves, resulting in a loosely packed configuration of lipids that fits their role in transporting synthesized lipids to other organelles. Cyclooxygenases are bound monotopically to the outer leaflet of ER membranes. In addition to their evolutionary adaptation to function in such a highly fluid membrane environment, we find that COX-1 is itself capable of inducing a positive curvature. The effect persists regardless of the lipid composition of the membrane model and the simulation setup used. To our knowledge, this property was observed independently in MD simulations by Wan and Coveney(24) using atomistic simulations and by Balali-Mood et al.(27) using a coarse-grained resolution. In the former, the authors observed the curvature for both cyclooxygenase isoforms during 25 ns runs, whereas in the latter study, the authors show that monotopic membrane proteins with a deep insertion into the membrane cause bilayer perturbations similar to those presented here (although the degree of curvature was much more pronounced in their work, which may be due to their use of an early version of the MARTINI model(64)).

Through a combination of membrane curvature and thickness profile calculations, coupled with measurements of order parameters and density maps at both atomistic and coarse-grained resolution, we are able to provide a mechanistic explanation for these perturbations. Nonspecific yet prolonged interactions of surface-exposed positively charged residues with membrane phospholipids located at the interface between the MBDs of the enzyme provide the driving force for the observed curvature around COX-1. The nonspecificity of these interactions stems from the fact that, in our simulations, they are driven by charge-charge interactions of the lipid phosphate group with arginine and lysine residues. In our atomistic simulations we chose POPC to model the presence of phospholipids. Since ER membranes are composed of around 85% phospholipids(30) and the COX-1 induced curvature is charge-driven, we are confident that our results are invariant to the details of the membrane models and simulation parameters used.

In contrast to these local and nonspecific interactions of COX-1 with phospholipids, which cover the entire interface between the MBDs, we observe specific interactions with membrane lipids at only one interaction site per monomer. This interaction site is located inside the hydrophobic channel created by the four helices of the MBD. In our CG simulations, we see PC and PE lipids, but not SM lipids, interact with residues lining the interface of the channel and occupy it for prolonged periods of time. In addition to these phospholipids, we also observe the binding of cholesterol at the same site. Altogether, the MBD is able to accommodate a variety of phospholipids and cholesterol (in addition to its endogenous ligand), revealing a picture of the enzyme in which a lipid is bound at its MBDs for the majority of the time. In terms of both frequency and strength of binding (the latter measured here as the maximum duration a lipid is in contact with the same residue or set of residues), we find cholesterol and arachidonate to be the main lipids occupying this interaction site.
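The two measures mentioned above, frequency of binding and maximum contact duration, can be computed from a per-frame contact record. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input is a list with one true/false entry per trajectory frame for a single lipid-residue pair, and `dt` is the time between saved frames in whatever unit the trajectory uses.

```python
def longest_contact(in_contact, dt=1.0):
    """Longest uninterrupted contact, in time units (frames multiplied by dt)."""
    longest = current = 0
    for frame in in_contact:
        current = current + 1 if frame else 0   # reset when contact is lost
        longest = max(longest, current)
    return longest * dt

def occupancy(in_contact):
    """Fraction of frames in which the contact is present."""
    return sum(bool(f) for f in in_contact) / len(in_contact) if in_contact else 0.0

# Synthetic contact record (illustrative only).
series = [True, True, False, True, True, True, False]
print(longest_contact(series, dt=2.0))  # 3 consecutive frames * 2.0 = 6.0
print(occupancy(series))                # 5 of 7 frames
```

A lipid that rebinds often but briefly scores high on occupancy and low on longest contact, so the two numbers are complementary rather than interchangeable.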


Figure 5-7. Arachidonic acid binding pathway. A. The sum of all contacts formed between arachidonic acid and COX-1 residues during the whole simulation. The six residues with the highest number of contacts are shown. B. Side and top view of the MBD with arginine residues forming the pathway shown with a stick representation along with the density map of arachidonic acid during the simulation showing its pathway. On top of the density we overlaid the position of MBD residues (colored in black) and the six residues with the most contacts. Positions are averages during the simulation, and residue colors are consistent between subplots. t is the simulation time.

The binding of arachidonate, in particular, is marked by high stability and small fluctuations. In one monomer, arachidonate binds directly to Arg-120 and is maintained there for the whole simulation. In the other monomer, however, a POPC lipid binds first and occupies the site, preventing arachidonate from doing the same. POPC binding, however, is less stable, and as the simulation progresses it is eventually displaced by arachidonate (Figure 5-2). In this case, when the MBD is occupied by another lipid, arachidonate binding follows a different pathway. We calculated the number of contacts that this arachidonate forms during the simulation (Figure 5-7A) and see that 4 of the 5 most prominent interactions are with arginine residues. These arginine residues are located in the EGF domain of the enzyme (Arg-61) and along the length of the MBD, and they form the pathway for arachidonate binding (Figure 5-7B). Charge-charge interactions between the arachidonate carboxylate and the arginine guanidino group allow the former to bind temporarily to the latter. Due to the lack of any stabilizing hydrophobic residues, these interactions are not stable and only serve to allow arachidonate to sample interactions with other residues in the vicinity. In this way, these four arginine residues act as “connecting bridges” whereby arachidonate “jumps” from one to the next and ultimately binds to Arg-120, from where it enters the active site of the enzyme (we do not observe this last step in our simulation). The main arginine residue in this pathway is Arg-83, which forms the most interactions with arachidonate and is responsible for keeping it close to the binding site when the site is occupied by another lipid, thereby enabling its interaction with Arg-120. Upon binding to Arg-120, arachidonate is stabilized via hydrophobic interactions with Val-116 and Val-119, replaces the bound POPC lipid, and is maintained there for the rest of the simulation.
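Ranking residues by their total number of contacts with a ligand amounts to a simple tally over frames. The sketch below is illustrative only: the function name and input layout are hypothetical, the residue labels are placeholders rather than COX-1 residues, and counting a residue at most once per frame is an assumption about how contacts are tallied.

```python
from collections import Counter

def rank_contacts(frames, top=3):
    """Return the `top` residues with the most frames in contact with the ligand."""
    counts = Counter()
    for residues in frames:
        counts.update(set(residues))   # assumed: one count per residue per frame
    return counts.most_common(top)

# Synthetic per-frame contact lists (placeholder labels, not simulation data).
frames = [
    ["RES-A", "RES-B"],
    ["RES-A"],
    [],
    ["RES-A", "RES-C", "RES-C"],
    ["RES-B"],
]
print(rank_contacts(frames, top=2))
```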

The other lipid that we observe to interact specifically with COX-1 is cholesterol. Its binding, similar to that of arachidonate and phospholipids, involves the hydrophobic channel inside the MBD. We observe it consistently and independently in both CG and AA simulations, largely involving interactions with the same residues. Binding of cholesterol to COX-1 is achieved through interactions of the cholesterol headgroup (the ROH bead in MARTINI, the hydroxyl group in all-atom simulations) with arginine residues of helices B and D (Arg-83 and Arg-120). The core structure of cholesterol interacts with several hydrophobic residues lining the interior of the MBD. In our simulations, cholesterol binding is observed more frequently and with longer duration than the binding of phospholipids; however, future studies of the free-energy landscape of these interactions are required to provide a more definitive characterization of their relative binding strengths. In addition, longer simulations are necessary to probe arachidonate insertion into the active site of COX-1.

Only one of the COX-1 monomers is catalytically active at a given time (Ecat), and it is modulated allosterically by its partner monomer (Eallo). The cooperative cross-talk between the subunits is an important determinant of their function and is heavily dependent on ligand binding. Palmitic acid, for instance, has opposite allosteric effects on the two cyclooxygenase isoforms, activating COX-2 by a factor of 2 and inhibiting COX-1 by ½(65). Other nonsubstrate fatty acids (e.g. stearic and oleic acid) have the same effect on COX-1. Measurements of binding kinetics for these fatty acids indicate that their inhibition of COX-1 may not be due to competition with arachidonic acid for Ecat(15). These studies reveal a complex and isoform-dependent interplay between fatty acids and cyclooxygenases. In a recent review, Smith and Malkowski(15) conclude that cyclooxygenase activity is both pharmacologically and physiologically dependent on the fatty acid content and composition of the membrane environment. The MD simulation results presented here broaden our perspective on COX-1 binding to ER membranes. First and foremost, we identify a previously unexplored and undefined pathway for arachidonic acid entrance into the COX-1 active site. Further, we show that the cavity within the MBD is occupied by a membrane lipid for the majority of the time and that it binds fatty acids, cholesterol and phospholipids indiscriminately. Regarding the latter, we do not observe a large difference in binding between different phospholipids. Binding of phospholipids without an obvious preference between them has been reported before in the MD literature for ABC transporters(66). Future calculations of the free-energy landscape of these binding events would give a clearer picture of COX-1 preference for phospholipids. Our results, however, agree with experiments on COX-2 reconstituted in nanodiscs containing different phospholipids, which showed no effect of changing phospholipids on the activity and inhibition of the enzyme(67). Considering the often-different ligand-binding outcomes for cyclooxygenase isoforms, however, it is unclear whether these results hold for COX-1 as well.

MD simulations, in addition to providing insight into the interplay between molecular components, should also guide experimental work. The discovery that the COX-1 hydrophobic channel binds nonsubstrate fatty acids(68, 69) and, in our work, non-fatty acid lipids opens the possibility of membrane lipids playing a functional role in the activity of these enzymes. This could be the case for cholesterol in particular, which is found in small concentrations in ER membranes and thus may provide cells with a regulatory mechanism. The results presented here indicate that the kinetic profile of COX-1 should depend on the relative ratios of membrane lipid components, something that has already been shown in the case of fatty acids(16, 69). Additionally, we identify Arg-83 and the hydrophobic residues Val-116 and Val-119 as important to ensure proper binding of arachidonic acid, especially if the MBD is already bound to a different lipid. Experimental techniques such as site-directed mutagenesis could be used to test the validity of these findings. Phe-205 mutants, for example, have shown a several-fold decrease in enzyme (COX-2) efficiency compared to the wild type(58). Single (R120A) and double (R120A/G533A) mutants of Arg-120 also affect the efficiency of COX-2, with the former increasing the Km value 3.4-fold without affecting the oxygenation of arachidonic acid, and the latter completely losing cyclooxygenase activity(58). Mutants of Arg-83, as well as of Val-116 and Val-119, could display a similar increase in Km (albeit perhaps to a lesser extent). Comparing the sequences of ovine, murine and human cyclooxygenases, similar to Smith et al.(9), we see that Arg-61, Arg-120 and Val-116 are conserved throughout, whereas Arg-79 and Arg-83 are conserved in COX-1 and replaced by a lysine in COX-2, which would likely serve the same function in guiding arachidonic acid via charge-charge interactions. In contrast, Val-119 is conserved among COX-1 enzymes but replaced with a serine in COX-2 enzymes.

5.6 Acknowledgements

This work was supported by the Natural Sciences and Engineering Research Council (Canada) with further support from the Canada Research Chairs program. Calculations were performed on Compute Canada facilities, funded by the Canada Foundation for Innovation and partners.

5.7 Data availability

Simulation files as well as structure, binary and trajectory files are available on request from the authors.

5.8 References

1. Vane J., Y. Bakhle, R. Botting. Cyclooxygenases 1 and 2. Annual review of pharmacology and toxicology. 1998;38(1):97-120.

2. Smith W. L., D. L. DeWitt, R. M. Garavito. Cyclooxygenases: structural, cellular, and molecular biology. Annual review of biochemistry. 2000;69(1):145-182.

3. Schneider C., A. Pozzi. Cyclooxygenases and lipoxygenases in cancer. Cancer and Metastasis Reviews. 2011;30(3-4):277-294.

4. Rouzer C. A., L. J. Marnett. Cyclooxygenases: structural and functional insights. J Lipid Res. 2009;50(Supplement):S29-S34.

5. Wang M.-T., K. V. Honn, D. Nie. Cyclooxygenases, prostanoids, and tumor progression. Cancer and Metastasis Reviews. 2007;26(3-4):525.

6. Furse K. E., D. A. Pratt, N. A. Porter, T. P. Lybrand. Molecular dynamics simulations of arachidonic acid complexes with COX-1 and COX-2: insights into equilibrium behavior. Biochemistry. 2006;45(10):3189-3205.


7. Furse K. E., D. A. Pratt, C. Schneider, A. R. Brash, N. A. Porter, T. P. Lybrand. Molecular dynamics simulations of arachidonic acid-derived pentadienyl radical intermediate complexes with COX-1 and COX-2: insights into oxygenation regio-and stereoselectivity. Biochemistry. 2006;45(10):3206-3218.

8. Blobaum A. L., L. J. Marnett. Structural and functional basis of cyclooxygenase inhibition. J Med Chem. 2007;50(7):1425-1441.

9. Smith W. L., Y. Urade, P.-J. Jakobsson. Enzymes of the cyclooxygenase pathways of prostanoid biosynthesis. Chem Rev. 2011;111(10):5821-5865.

10. Garavito R. M., M. G. Malkowski, D. L. DeWitt. The structures of prostaglandin endoperoxide H synthases-1 and-2. Prostaglandins & other lipid mediators. 2002;68:129-152.

11. Luong C., A. Miller, J. Barnett, J. Chow, C. Ramesha, M. F. Browner. Flexibility of the NSAID binding site in the structure of human cyclooxygenase-2. Nature structural biology. 1996;3(11):927-933.

12. Rao P., E. E. Knaus. Evolution of nonsteroidal anti-inflammatory drugs (NSAIDs): cyclooxygenase (COX) inhibition and beyond. Journal of Pharmacy & Pharmaceutical Sciences. 2008;11(2):81-110s.

13. Otto J. C., W. L. Smith. The orientation of prostaglandin endoperoxide synthases-1 and-2 in the endoplasmic reticulum. J Biol Chem. 1994;269(31):19868-19875.

14. Spencer A. G., J. W. Woods, T. Arakawa, I. I. Singer, W. L. Smith. Subcellular localization of prostaglandin endoperoxide H synthases-1 and-2 by immunoelectron microscopy. J Biol Chem. 1998;273(16):9886-9893.

15. Smith W. L., M. G. Malkowski. Interactions of fatty acids, nonsteroidal anti-inflammatory drugs, and coxibs with the catalytic and allosteric subunits of cyclooxygenases-1 and-2. J Biol Chem. 2019;294(5):1697-1705.

16. Dong L., A. J. Vecchio, N. P. Sharma, B. J. Jurban, M. G. Malkowski, W. L. Smith. Human cyclooxygenase-2 is a sequence homodimer that functions as a conformational heterodimer. J Biol Chem. 2011;286(21):19035-19046.

17. Rouzer C. A., L. J. Marnett. Endocannabinoid oxygenation by cyclooxygenases, lipoxygenases, and cytochromes P450: cross-talk between the eicosanoid and endocannabinoid signaling pathways. Chem Rev. 2011;111(10):5899-5921.


18. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

19. Marrink S. J., V. Corradi, P. C. Souza, H. I. Ingólfsson, D. P. Tieleman, M. S. Sansom. Computational modeling of realistic cell membranes. Chem Rev. 2019;119(9):6184-6226.

20. Enkavi G., M. Javanainen, W. Kulig, T. Róg, I. Vattulainen. Multiscale Simulations of Biological Membranes: The Challenge To Understand Biological Phenomena in a Living Substance. Chem Rev. 2019.

21. Blomberg L. M., M. R. Blomberg, P. E. Siegbahn, W. A. van der Donk, A.-L. Tsai. A quantum chemical study of the synthesis of prostaglandin g2 by the cyclooxygenase active site in prostaglandin endoperoxide h synthase 1. The Journal of Physical Chemistry B. 2003;107(14):3297-3308.

22. Christov C. Z., A. Lodola, T. G. Karabencheva-Christova, S. Wan, P. V. Coveney, A. J. Mulholland. Conformational Effects on the pro-S Hydrogen Abstraction Reaction in Cyclooxygenase-1: An Integrated QM/MM and MD Study. Biophys J. 2013;104(5):L5-L7.

23. Nina M., S. Bernèche, B. Roux. Anchoring of a monotopic membrane protein: the binding of prostaglandin H 2 synthase-1 to the surface of a phospholipid bilayer. European Biophysics Journal. 2000;29(6):439-454.

24. Wan S., P. V. Coveney. A comparative study of the COX‐1 and COX‐2 isozymes bound to lipid membranes. Journal of computational chemistry. 2009;30(7):1038-1050.

25. Fowler P. W., K. Balali-Mood, S. Deol, P. V. Coveney, M. S. Sansom. Monotopic enzymes and lipid bilayers: a comparative study. Biochemistry. 2007;46(11):3108-3115.

26. Lomize A. L., I. D. Pogozheva, M. A. Lomize, H. I. Mosberg. The role of hydrophobic interactions in positioning of peripheral proteins in membranes. BMC Struct Biol. 2007;7(1):44.

27. Balali-Mood K., P. J. Bond, M. S. Sansom. Interaction of monotopic membrane enzymes with a lipid bilayer: a coarse-grained MD simulation study. Biochemistry. 2009;48(10):2135-2145.

28. Gupta K., B. S. Selinsky, C. J. Kaub, A. K. Katz, P. J. Loll. The 2.0 Å resolution crystal structure of prostaglandin H2 synthase-1: structural insights into an unusual peroxidase. J Mol Biol. 2004;335(2):503-518.


29. Marrink S. J., H. J. Risselada, S. Yefimov, D. P. Tieleman, A. H. de Vries. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. The Journal of Physical Chemistry B. 2007;111(27):7812-7824.

30. Casares D., P. V. Escribá, C. A. Rosselló. Membrane lipid composition: effect on membrane and organelle structure, function and compartmentalization and therapeutic avenues. International journal of molecular sciences. 2019;20(9):2167.

31. Van Meer G., D. R. Voelker, G. W. Feigenson. Membrane lipids: where they are and how they behave. Nat Rev Mol Cell Biol. 2008;9(2):112-124.

32. van Meer G., A. I. de Kroon. Lipid map of the mammalian cell. Journal of cell science. 2011;124(1):5-8.

33. Wassenaar T. A., H. I. Ingólfsson, R. A. Böckmann, D. P. Tieleman, S. J. Marrink. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. Journal of chemical theory and computation. 2015;11(5):2144-2155.

34. Ingólfsson H. I., M. N. Melo, F. J. Van Eerden, C. Arnarez, C. A. Lopez, T. A. Wassenaar, X. Periole, A. H. De Vries, D. P. Tieleman, S. J. Marrink. Lipid organization of the plasma membrane. J Am Chem Soc. 2014;136(41):14554-14559.

35. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. Acs Central Sci. 2018;4(6):709-717.

36. Bussi G., D. Donadio, M. Parrinello. Canonical sampling through velocity rescaling. The Journal of chemical physics. 2007;126(1):014101.

37. Berendsen H. J., J. v. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak. Molecular dynamics with coupling to an external bath. The Journal of chemical physics. 1984;81(8):3684-3690.

38. Hess B., C. Kutzner, D. Van Der Spoel, E. Lindahl. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. Journal of chemical theory and computation. 2008;4(3):435-447.


39. Jo S., T. Kim, V. G. Iyer, W. Im. CHARMM‐GUI: a web‐based graphical user interface for CHARMM. Journal of computational chemistry. 2008;29(11):1859-1865.

40. Wu E. L., X. Cheng, S. Jo, H. Rui, K. C. Song, E. M. Dávila‐Contreras, Y. Qi, J. Lee, V. Monje‐Galvan, R. M. Venable. CHARMM‐GUI membrane builder toward realistic biological membrane simulations. Journal of computational chemistry. 2014;35(27):1997-2004.

41. Jorgensen W. L., J. Chandrasekhar, J. D. Madura, R. W. Impey, M. L. Klein. Comparison of simple potential functions for simulating liquid water. The Journal of chemical physics. 1983;79(2):926-935.

42. Huang J., S. Rauscher, G. Nawrocki, T. Ran, M. Feig, B. L. de Groot, H. Grubmüller, A. D. MacKerell. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nature methods. 2017;14(1):71-73.

43. Darden T., D. York, L. Pedersen. Particle mesh Ewald: An N⋅ log (N) method for Ewald sums in large systems. The Journal of chemical physics. 1993;98(12):10089-10092.

44. Nosé S. A molecular dynamics method for simulations in the canonical ensemble. Molecular physics. 1984;52(2):255-268.

45. Hoover W. G. Canonical dynamics: Equilibrium phase-space distributions. Physical review A. 1985;31(3):1695.

46. Parrinello M., A. Rahman. Polymorphic transitions in single crystals: A new molecular dynamics method. Journal of Applied physics. 1981;52(12):7182-7190.

47. Hess B., H. Bekker, H. J. Berendsen, J. G. Fraaije. LINCS: a linear constraint solver for molecular simulations. Journal of computational chemistry. 1997;18(12):1463-1472.

48. Schmid N., A. P. Eichenberger, A. Choutko, S. Riniker, M. Winger, A. E. Mark, W. F. van Gunsteren. Definition and testing of the GROMOS force-field versions 54A7 and 54B7. European biophysics journal. 2011;40(7):843.

49. Berendsen H. J., J. P. Postma, W. F. van Gunsteren, J. Hermans. Interaction models for water in relation to protein hydration. Intermolecular forces: Springer; 1981. p. 331-342.


50. Abraham M. J., T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E. Lindahl. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19-25.

51. McGibbon R. T., K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane, V. S. Pande. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J. 2015;109(8):1528-1532.

52. Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020.

53. Humphrey W., A. Dalke, K. Schulten. VMD: visual molecular dynamics. Journal of molecular graphics. 1996;14(1):33-38.

54. Rose A. S., P. W. Hildebrand. NGL Viewer: a web application for molecular visualization. Nucleic Acids Res. 2015;43(W1):W576-W579.

55. Rose A. S., A. R. Bradley, Y. Valasatava, J. M. Duarte, A. Prlić, P. W. Rose. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics. 2018;34(21):3755-3758.

56. Shinohara H., M. a. A. Balboa, C. A. Johnson, J. Balsinde, E. A. Dennis. Regulation of delayed prostaglandin production in activated P388D1 macrophages by group IV cytosolic and group V secretory phospholipase A2s. J Biol Chem. 1999;274(18):12263-12268.

57. Rieke C. J., A. M. Mulichak, R. M. Garavito, W. L. Smith. The role of arginine 120 of human prostaglandin endoperoxide H synthase-2 in the interaction with fatty acid substrates and inhibitors. J Biol Chem. 1999;274(24):17109-17114.

58. Vecchio A. J., B. J. Orlando, R. Nandagiri, M. G. Malkowski. Investigating substrate promiscuity in cyclooxygenase-2: the role of Arg-120 and residues lining the hydrophobic groove. J Biol Chem. 2012;287(29):24619-24630.

59. Tidhar R., A. H. Futerman. The complexity of sphingolipid biosynthesis in the endoplasmic reticulum. Biochimica Et Biophysica Acta (BBA)-Molecular Cell Research. 2013;1833(11):2511-2518.

60. Breslow D. K. Sphingolipid homeostasis in the endoplasmic reticulum and beyond. Cold Spring Harbor perspectives in biology. 2013;5(4):a013326.


61. Kovesi P. Good colour maps: How to design them. arXiv preprint arXiv:1509.03700. 2015.

62. Duncan A. L., R. A. Corey, M. S. Sansom. Defining how multiple lipid species interact with inward rectifier potassium (Kir2) channels. Proceedings of the National Academy of Sciences. 2020.

63. Fagone P., S. Jackowski. Membrane phospholipid synthesis and endoplasmic reticulum function. J Lipid Res. 2009;50(Supplement):S311-S316.

64. Marrink S. J., A. H. De Vries, A. E. Mark. Coarse grained model for semiquantitative lipid simulations. The Journal of Physical Chemistry B. 2004;108(2):750-760.

65. Dong L., H. Zou, C. Yuan, Y. H. Hong, D. V. Kuklev, W. L. Smith. Different fatty acids compete with arachidonic acid for binding to the allosteric or catalytic subunits of cyclooxygenases to regulate prostanoid synthesis. J Biol Chem. 2016;291(8):4069-4078.

66. Barreto-Ojeda E., V. Corradi, R.-X. Gu, D. P. Tieleman. Coarse-grained molecular dynamics simulations reveal lipid access pathways in P-glycoprotein. Journal of General Physiology. 2018;150(3):417-429.

67. Orlando B. J., D. R. McDougle, M. J. Lucido, E. T. Eng, L. A. Graham, C. Schneider, D. L. Stokes, A. Das, M. G. Malkowski. Cyclooxygenase-2 catalysis and inhibition in lipid bilayer nanodiscs. Archives of biochemistry and biophysics. 2014;546:33-40.

68. Yuan C., R. S. Sidhu, D. V. Kuklev, Y. Kado, M. Wada, I. Song, W. L. Smith. Cyclooxygenase allosterism, fatty acid-mediated cross-talk between monomers of cyclooxygenase homodimers. J Biol Chem. 2009;284(15):10046-10055.

69. Dong L., C. Yuan, B. J. Orlando, M. G. Malkowski, W. L. Smith. Fatty acid binding to the allosteric subunit of cyclooxygenase-2 relieves a tonic inhibition of the catalytic subunit. J Biol Chem. 2016;291(49):25641-25655.


Chapter Six: ProLint: a web-based framework for the automated data analysis and visualization of lipid-protein interactions

Copyright

N/A

Contributions

The code presented here, which automates the analysis and visualization of lipid-protein interactions, was written by me. I also wrote all the applications presented here. The deployment of the webserver on AWS services was also my own contribution. The webserver is temporarily accessible at http://ec2-18-216-165-89.us-east-2.compute.amazonaws.com:8000. On the “Examples” page of the webserver, many of the results presented in Chapter 4, for instance, are available to explore interactively. Figures in this chapter are also taken from the “Examples” tab of the webserver.

Abbreviations

All abbreviations are introduced at their first mention.


6.1 Introduction

Data analysis and visualization are integral parts of every scientific study. Effective communication of scientific results is contingent on the careful analysis of generated data and the display of findings and trends in an information-rich and aesthetically pleasing format. Even though the visualization of data is becoming a demanding task for scientists, taking up considerable time and effort, there is no clear blueprint to follow beyond general best practices and guides. The process can become quite arduous, especially when the datasets are large or the story they tell is complex. Present-day Molecular Dynamics (MD) simulations, even with conservative output-control parameters, leave the researcher with terabytes of data that need to be analyzed and presented to the reader(1, 2). Scientific journals usually have rigid restrictions on the number and type of graphical elements allowed, and researchers must limit themselves to what they believe to be the most relevant details of their work, leaving out everything else. This leads to an inefficient and incomplete presentation of the experiments and/or simulations performed and to an unintentional bias toward presenting only positive and confirming results(3, 4).

Concurrently with the increase in the timescales achievable by MD simulations and in the size of the resulting output data, or perhaps because of it, there has been an ever-growing number of software packages that facilitate data processing and analysis. MDTraj(5) and MDAnalysis(6) are two frequently used packages that provide specialized tools for trajectory I/O and analysis. These complement the analysis tools that ship with the different simulation packages: GROMACS(7), AMBER(8), CHARMM(9) and others each come pre-packaged with a set of tools dedicated to manipulating and analyzing trajectory and coordinate files, as well as managing many package-specific tasks.

Recently, we published our work on GPCR-lipid interactions, where we presented results from around 1 ms of simulation data spread across 28 different GPCR structures, grouped into several categories and simulated in membrane models composed of 63 different lipid species(10). The main conclusion reached from our comprehensive analysis was that GPCRs are distinguished by a unique interaction profile with their surrounding lipid environment.

This conclusion, however, presented several challenges in how best to display the data so that the resulting graphics are informative, self-contained, and easy to read. Some of these challenges are which protein to highlight in any given context, which metric to use to measure the interaction with lipids, and how to integrate the results across different lipid species. For each of these considerations, the best choice among the available options is not clear, and the resulting tradeoffs have to be carefully evaluated. As a result, visualizing large datasets on lipid-protein interactions remains a challenge and largely unexplored territory(11). MemProtMD, for instance, is a popular database of membrane protein structures embedded into membrane models that also includes a display of the analysis performed(12).

Published studies currently use several available avenues to overcome these limitations, among them making simulation setups and parameters, as well as custom analysis code, available through repositories like Zenodo or GitHub(13). These are essential for data reproducibility but do not address the shortcomings highlighted above. Considering the challenges in visualizing large datasets, we believe that the best-case scenario has users interacting with the underlying data directly and displaying the information they are curious about from among a large set of pre-calculated options. This form of data presentation is rarely available for scientific studies, yet it is integral to data sharing and science dissemination. By leveraging modern visualization libraries like bokeh(14) and D3.js(15), we showcase the visualization of lipid-protein interaction data in an interactive graphical format that gives the user access to the most important details of a simulation study. Here we present ProLint, a webserver that aims to provide a comprehensive analysis of Protein-Lipid interactions and to serve as a visualization framework for analysis results using modern graphical representations. Each graphical application provided by the webserver highlights a different facet of the interaction between lipids and proteins and offers users a panel to interactively explore the underlying interaction landscape.

6.2 Testing and Validation

The ProLint webserver is a freely available service that offers the MD simulation community the ability to quickly and efficiently analyze lipid-protein interactions from MD simulations using the MARTINI model and, optionally, to make those results publicly available for everyone to view, hosted free of charge. It also offers the wider scientific community a platform through which to explore the lipid interaction profile of proteins without requiring any knowledge of the underlying models, force field, or other simulation details. The backend of the server is written in Python and Node.js, whereas the frontend uses standard HTML, CSS and JavaScript files. The visualization framework incorporated into the frontend uses several modern libraries that allow for an interactive and responsive display of the data: bokeh(14), Three.js(16), D3.js(15) and NGL viewer(17, 18).

The architecture of the webserver and the way all the visualization modules are incorporated is schematically depicted in Figure 6-1. The frontend features an Upload form through which users can upload simulation data to the server; each submission is placed in a task queue to await further processing. Once the required resources become available, the user-uploaded files undergo the steps outlined in the figure. To preserve space, user-uploaded files are deleted immediately after calculations are completed, whereas result files are kept temporarily after generation to allow time to either download the data or make them publicly available (in which case the data are preserved indefinitely or until deleted by the user).

To test and validate the analysis and visualization protocols we compared ProLint results to similar results obtained using different software packages and visualization libraries. For instance, in our analysis protocol, we utilize the MDTraj(5) compute_neighbors functionality to get lipids within a cutoff radius. The results then undergo a series of postprocessing steps whereby different contact parameters are calculated and stored. To ensure we obtain accurate results, we carried out the same set of calculations but using the gmx select utility provided by GROMACS(7) and postprocessed the results using shell scripts. Both methods give matching results.
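The cutoff-based selection described above can be illustrated with a minimal, library-free sketch. It is not the MDTraj or GROMACS implementation: the function name `lipids_within_cutoff`, the toy coordinates, and the omission of periodic boundary conditions are all assumptions made purely for illustration.

```python
import math

def lipids_within_cutoff(protein_xyz, lipid_xyz, cutoff):
    """Return sorted indices of lipid beads within `cutoff` (nm) of any protein bead.

    Hypothetical helper: brute-force search, no periodic boundary handling.
    """
    hits = set()
    for j, lipid in enumerate(lipid_xyz):
        for protein in protein_xyz:
            if math.dist(lipid, protein) <= cutoff:
                hits.add(j)
                break  # one protein bead in range is enough
    return sorted(hits)

# Toy single-frame coordinates in nm (invented values).
protein = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
lipids = [(0.0, 0.6, 0.0), (2.0, 2.0, 0.0), (1.1, 0.0, 0.0)]
neighbors = lipids_within_cutoff(protein, lipids, cutoff=0.7)
```

Running the same selection through two independent code paths and comparing the resulting index lists frame by frame is the kind of cross-check the paragraph above describes.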

Contact information, including the 2D density maps, is visualized in the browser using the bokeh Python library, which allows users to explore the data interactively. The network application, which also depends on pre-calculated contact information, uses the JavaScript D3.js library to display the results.

Thickness and curvature representations are obtained using the g_surf utility, which was written by our group and successfully applied when studying the lipid-protein interaction profile of 10 different proteins(19) and 28 different GPCRs(10). The output of g_surf is postprocessed to allow for in-browser visualization. The application provided to view and interact with the curvature and thickness objects uses the Three.js JavaScript library.

We also provide information on the 2D and 3D density profiles of lipids. Such profiles are very commonly used in the lipid-protein interactions literature, as they provide a clear overview of the underlying interactions. The calculation of 3D densities uses the GridDataFormats library (https://github.com/MDAnalysis/GridDataFormats) in the backend.

The figures generated in this chapter are taken from the “Examples” section of the webserver, which contains information for the lipid-protein interaction profile of GPCRs.

Figure 6-1. Schematic overview of ProLint. The user interface features an Upload page to submit data to the server for calculation and a Results page to visualize the results. The server backend uses Amazon Web Services (AWS) for data storage and database management.

Analysis of lipid-protein interactions is almost always focused on two aspects of their interactions: (i) specific and direct interactions between lipids and proteins that are maintained for prolonged durations of time and (ii) cumulative effects resulting from transient and nonspecific interactions between embedded proteins and membrane lipids. To analyze the former, direct lipid-protein interactions, we define contacts between lipids and protein residues. A contact is defined by a user-specified distance criterion (or cut-off) that a lipid and a protein residue must satisfy in order to be considered as interacting or in contact with each other. It is the magnitude of these contacts that points towards specific and potentially important interactions. Common cut-offs used in the literature are 0.5, 0.6, 0.62 and 0.7 nm(19, 20). As for how such contacts are quantified, there is no agreement and different methods are commonly used. Often it is the total number of contacts that is evaluated, other times it is the longevity of such contacts, and in yet other cases it is a variant of these or some other custom parameter. The current version of ProLint supports several cut-offs and calculates six contact parameters by default. These parameters relate to the total number of contacts between lipids and proteins as well as the duration for which contacts are maintained.
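To make the distinction between contact number and contact duration concrete, the following sketch derives two simple quantities from a per-frame contact series. The function name `contact_metrics`, the two metrics chosen, and the toy series are assumptions for illustration only; they are not claimed to be the six parameters that ProLint computes.

```python
def contact_metrics(in_contact):
    """Summarize a per-frame contact series (True = residue within cut-off of a lipid).

    Hypothetical helper returning two illustrative, normalized quantities.
    """
    n_frames = len(in_contact)
    if n_frames == 0:
        raise ValueError("contact series is empty")
    longest = current = 0
    for flag in in_contact:
        current = current + 1 if flag else 0  # length of the running contact
        longest = max(longest, current)
    return {
        # fraction of frames in which any contact is present
        "occupancy": sum(in_contact) / n_frames,
        # longest uninterrupted contact, as a fraction of the trajectory
        "longest_duration": longest / n_frames,
    }

series = [True, True, False, True, True, True, False, False]  # invented data
metrics = contact_metrics(series)
```

With this normalization, a longest duration of 1 corresponds to a contact that is maintained throughout the trajectory.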

6.3 Results

The ideal solution would be to give the user complete access to the MD data and control over every aspect of the generated visuals. Trajectory processing, however, is inherently time-consuming, and as such responsive real-time analysis of MD outputs is currently unattainable given server restrictions. A more realistic objective is to maximize user control using preprocessed datasets. ProLint follows this approach, whereby user-uploaded files are preprocessed for lipid-protein contact information and for the generation of 3D models, custom structure files, and other data formats. On the frontend, it provides several applications to visualize and interact with the generated data. The goal is to provide users with an uncomplicated but powerful interface to explore lipid-protein interactions, and to serve as a template for researchers to easily make their results publicly available.

Functionality shared between these visualization applications includes the ability to select and focus on different lipids in the system, switch between precalculated cutoff radii, and change graphical representations (e.g. for 3D models). In the following, we discuss some of these applications, focusing on their interactive aspects and the unique perspective each provides on the lipid-protein interaction landscape.

6.3.1 Point representation of lipid-protein contact information

A common visualization, especially in the early stages of lipid-protein interaction analysis, is a plot of the total number of lipid contacts (or any similar parameter) for each residue of the protein. This provides a quick way to distinguish between residues based on their level of interaction with lipids. Figure 6-2 shows an example of a scatter plot displaying the contact number for each residue of the Smoothened GPCR structure with every lipid group in the simulation setup. Several layers of interactivity are supported: changing the value of the radius used to define a contact, easily switching between visualizing the contact number or any other contact parameter (e.g. contact duration), querying each residue by its 3-letter digit code, and other tools. While such a graphical representation excludes several important details (e.g. geometric information about the interactions), it is nonetheless essential for quickly dissecting and assessing a lipid-protein dataset in terms of its interactions. The application shown in Figure 6-2 displays 348 data points out of a total of ~200,000 precalculated points. The latter value depends on the protein size, the number of different lipids and the number of cutoff parameters specified.

Figure 6-2. Snapshot showing the interaction of Smoothened with cholesterol. The contact duration is normalized, with a value of 1 (one) representing a contact that is maintained throughout the trajectory. The inset figure, added here for clarity, visualizes the interaction site on the structure of Smoothened (with a blue-white-red color gradient). The points colored in red in the application correspond to the residues colored similarly in the inset figure.

Figure 6-2 shows the contact duration of the Smoothened receptor with cholesterol molecules calculated at a 7 Å distance cut-off. The application makes it easy to differentiate between residues based on their level of interaction with lipids. In the case of Smoothened, the prominent interaction site between transmembrane helices II and III and the residues that contribute the most to the binding are easily distinguished. While such differentiation of residues is not always this straightforward, owing mainly to other proteins having a more complex interaction profile, the application is useful as a first look into the dataset and for gaining a general overview of the lipid interactions taking place with a protein.

6.3.2 Network graph representation of lipid-protein interactions

Network diagrams are graphical methods commonly used to visualize relationships between elements. They are insightful, intuitive and informative. They also scale easily to large datasets. If we think of lipid-protein interactions as a relationship between lipids and proteins, then it is easy to see how network graphs can be used to represent features of their interactions. ProLint implements such a visualization application. Lipids or groups of lipids and residues are represented as nodes, while their interactions form the links between them; the width of each link corresponds to the weight of the interaction. In contrast to other visualization methods, network graphs allow all lipid groups and all residues to be displayed at the same time, making direct comparisons of interactions straightforward.
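The node and link description above maps naturally onto the nodes/links layout commonly fed to D3.js network examples. The sketch below builds such a structure from contact tuples; the function name `build_network`, the field names (`id`, `source`, `target`, `value`) and the contact counts are assumptions for illustration, not ProLint's actual data format.

```python
import json
from collections import defaultdict

def build_network(contacts):
    """Aggregate (lipid, residue_type, n_contacts) tuples into nodes and weighted links.

    Hypothetical helper; repeated lipid/residue pairs are summed into one link.
    """
    weights = defaultdict(float)
    for lipid, residue, n_contacts in contacts:
        weights[(lipid, residue)] += n_contacts
    names = sorted({name for pair in weights for name in pair})
    nodes = [{"id": name} for name in names]
    links = [
        {"source": lipid, "target": residue, "value": weight}
        for (lipid, residue), weight in sorted(weights.items())
    ]
    return {"nodes": nodes, "links": links}

# Invented contact counts, for illustration only.
toy = [
    ("CHOL", "LEU", 12), ("CHOL", "PHE", 5),
    ("PIP", "ARG", 20), ("PIP", "LYS", 15), ("CHOL", "LEU", 3),
]
graph = build_network(toy)
payload = json.dumps(graph)  # JSON string a browser-side script could load
```

In a layout of this kind, the `value` field would drive the link width, so that stronger interactions appear as thicker edges.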

Figure 6-3. Network graph representation of lipid-protein interactions.

A. and B. The interaction network of cholesterol and PIP lipids with the serotonin (5HT1B) receptor, respectively. C. An expanded view of the serotonin-PIP lipid interaction profile showing all arginine (Arg) and lysine (Lys) residues. The size of each lipid node represents its fraction of all lipids in the system, whereas the size of each residue node is based on the fraction of the serotonin receptor that residue type makes up. Edge width visualizes the average number of contacts with each lipid type.

An example snapshot of the serotonin receptor (5HT1B) showing how network graphs display lipid-protein interactions is given in Figure 6-3. A quick glance over the interaction network reveals that cholesterol and PIP lipids, while both making direct and long-lasting interactions with proteins, do so through different association patterns with residues: cholesterol interacts mainly with hydrophobic residues (e.g. leucine, isoleucine, valine), with the additional involvement of aromatic residues (e.g. phenylalanine), whereas PIP lipids interact mainly with positively charged residues (arginine and lysine). It is possible to expand each residue group on the network to reveal the interaction of each lipid type with every single residue making up the group, as shown in Figure 6-3C. When interacting with 5HT1B, PIP lipids bind predominantly to Arg-76 and Arg-78, and to a lesser extent to Arg-310 and Arg-385; among lysines they bind mainly to Lys-79, Lys-314 and Lys-382, and to a lesser extent to Lys-164 and Lys-311. This example shows how network graph representations allow for an information-rich visualization of lipid-protein interactions.

The usefulness of network graph representations lies in their ability to provide a more holistic overview of lipid-protein interactions in the system and to make it easy to distinguish between the residue preferences of different lipid types. That is, a network graph displays interactions from the reference point of individual lipids and shows their level of interaction with amino acid groups or with every residue making up an amino acid group.

6.3.3 2D density maps are commonly used but can also be misleading

A common visualization method used when analysing lipid-protein interactions is the 2D density map. Such maps are easy to calculate and offer a clear view of lipid distributions in the system, making them useful in distinguishing regions with high lipid localization from regions with lower lipid localization. 2D density maps, however, depend on several factors that can severely impact their appearance (e.g. the number of bins used in building the histogram, which interpolation algorithm is employed, if any, what type of colormap is used, etc.). For membrane systems, they also depend on the choice of normal (z)-axis range used in the calculation. A density map encompassing the full thickness of the bilayer will look quite different from one calculated for only part of it. The choice of these parameters affects the result, and since the distinction between high and low lipid localization is made using a color gradient, 2D density maps can also be misleading.
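The dependence on bin count and z-range can be demonstrated with a small sketch. The function `density_2d`, its arguments, and the bead coordinates below are hypothetical; positions are assumed to be already wrapped into a rectangular box of the given x and y dimensions, and no interpolation or normalization is applied.

```python
def density_2d(positions, box_xy, n_bins, z_range):
    """Count beads per (x, y) bin, keeping only beads with z in [z_min, z_max).

    Hypothetical helper; assumes coordinates lie inside the box (nm).
    """
    z_min, z_max = z_range
    box_x, box_y = box_xy
    grid = [[0] * n_bins for _ in range(n_bins)]
    for x, y, z in positions:
        if not (z_min <= z < z_max):
            continue  # bead lies outside the selected slab
        i = min(int(x / box_x * n_bins), n_bins - 1)
        j = min(int(y / box_y * n_bins), n_bins - 1)
        grid[i][j] += 1
    return grid

# Invented bead positions (x, y, z) in nm.
beads = [(1.0, 1.0, 4.6), (1.2, 1.1, 4.8), (8.5, 8.5, 2.0), (1.1, 1.3, 7.5)]
full = density_2d(beads, (10.0, 10.0), 5, (0.0, 10.0))  # whole bilayer
slab = density_2d(beads, (10.0, 10.0), 5, (4.5, 5.0))   # thin z-slab only
```

In this toy case the same (x, y) bin holds three beads in the full map but only two in the slab, which illustrates how the z-range alone changes the apparent localization.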

Figure 6-4. 2D density profile analyses of lipid-protein interactions for 5HT1B. A. Conventional 2D density profiles calculated for the full-length bilayer, the lower and upper leaflet as well as for a small subsection of the lower leaflet (4.5 – 5.0 nm). The density of the lower leaflet is almost completely accounted for by the small subsection. B. A ProLint application showing the 2D density profile with different options available to change the way the density is calculated. The right-most slider allows for a subset of the data to be hidden to easily distinguish localization sites according to their density, regardless of the colormap employed.

Figure 6-4A shows 2D density maps for cholesterol distributions around the serotonin (5HT1B) receptor. The density profile of the full bilayer gives a rough idea of the localization of cholesterol molecules in the system, but it cannot distinguish between the localization sites in terms of their position relative to the embedded protein. A common approach used in the literature is to split the analysis between the two leaflets. While this separation of the density profile into upper and lower leaflet densities does help, it is still insufficient to accurately pinpoint the position of these potential binding sites (as shown for the lower leaflet, where the interaction site is accounted for by only a subsection of the leaflet). The separation of the density profiles into lower and upper leaflet profiles also doubles the number of figures, which is a problem for large datasets. Figure 6-4B shows a ProLint application that addresses many of these shortcomings. The application allows for the visualization of all proteins in the system or, in some cases, their average, and offers options to change the number of bins and the coloring scheme used in the calculation of the density profile. There is also the option to visualize the protein, which makes it easier to estimate the location of the interaction sites, as well as a slider to change the normal (z)-axis range of the bilayer to the desired width. The latter allows the user to pinpoint the location of the interaction site in Figure 6-4 to between 4.5 and 5 nm of bilayer height.

6.3.4 Visualizing 3D densities with complete spatial information

In displaying aspects of lipid-protein interactions, analysis tools have to make compromises, and usually these come at the cost of spatial information about such interactions. Very often, however, it is the overall geometry of the lipid binding site that we are most interested in. In the last few years there has been significant development in molecular visualization software that can be run within a browser, with NGL Viewer(17, 18), LiteMol(21) and JSmol(22) being notable examples. Mol* is a newer tool currently in development (github.com/molstar/molstar). Such feature-rich visualization tools allow for a highly customizable display of lipid-protein interactions.

Figure 6-5. Visualizing lipid-protein interactions in 3D space for SMO. ProLint application that uses the NGL Viewer to show lipid-protein interactions. A. The 3D density of a lipid, e.g. cholesterol (colored in magenta here), highlighting its interaction with the Smoothened receptor. B. Surface representations colored according to the relative interaction of each residue with cholesterol, measured as contact duration (left) or contact number (right).

Figure 6-5 shows example images highlighting some of the functionality provided by the 3D viewer application implemented in ProLint to visualize lipid-protein interactions in 3D space. Visualizing the 3D lipid density is perhaps the most straightforward approach to reveal interaction sites of interest while also revealing the identity and complete spatial arrangement of the residues making up the interaction site. Figure 6-5B projects the contact duration and contact number, respectively, of each residue with cholesterol onto the surface of the protein, revealing several sites that display a pronounced interaction with cholesterol, but only one showing the characteristic profile of a specific binding site (between helices 2 and 3). The unique advantage offered by such an application lies in its ability to easily show the relative magnitude of interactions between residues and lipids together with their spatial arrangement, information that is either unavailable or difficult to assess with other methods.

6.3.5 Membrane proteins induce structural changes to their local membrane environment

Current understanding of lipid-protein interactions places special importance on the effect of bilayer physical properties (mainly bilayer thickness and curvature) on the activity of membrane proteins(23). Proteins like Rhodopsin induce changes to their local membrane environment, with the depletion of specific lipid types in favor of the enrichment of others(10, 24). The cumulative effect of membrane proteins on the surrounding membrane is commonly measured through the calculation of thickness profiles or the resulting changes to membrane curvature. The analysis and subsequent visualization of membrane thickness and curvature, however, pose a significant challenge, as the underlying calculations require the fine-tuning of several system-dependent parameters (bilayer midpoint definition, reference lipids, headgroup definition, grid sizes used for calculations, etc.). Additionally, current molecular visualization software offers limited support and functionality for non-density 3D file formats. Considering these shortcomings, ProLint automates the calculation of thickness and curvature profiles and provides an accompanying application for their visualization.

Figure 6-6. Thickness and curvature visualization of lipid-protein interactions. A. Side view of the system showing each of the layers (upper, middle and lower layer) as well as the lipid centers used in the calculation. B. Top view of the system showing the curvature around the embedded protein (here showing the Smoothened receptor as an example) colored according to the median curvature profile. Scales are similar to Figure A-8.

Figure 6-6 shows the thickness and curvature profiles for the Smoothened receptor. Figure 6-6A shows a side view of the simulated system, with the upper and lower layers presenting the curvature of the upper and lower leaflets. The middle layer shows the curvature at the bilayer midpoint plane. Figure 6-6B offers a top view of the curvature of the same system, showing regions with negative curvature (colored in blue) and positive curvature (colored in red). Smoothened creates negative curvature around helices IV and V and positive curvature around helices VI and VII. ProLint offers support for the calculation and visualization of thickness as well as mean and Gaussian curvature profiles. This allows users to easily view and analyze the deformations that embedded proteins cause in the local membrane environment.
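For reference, the quantities named above have standard textbook definitions. The block below states them for a leaflet surface described as a height field h(x, y), which is an assumption made here for illustration; the source does not specify the parameterization or sign convention used by g_surf, and the symbols h, z_upper, z_lower and d are introduced only for this sketch.

```latex
% Local bilayer thickness from the upper and lower leaflet surfaces
d(x,y) = z_{\mathrm{upper}}(x,y) - z_{\mathrm{lower}}(x,y)

% Mean (H) and Gaussian (K) curvature from the principal curvatures
H = \tfrac{1}{2}\,(\kappa_1 + \kappa_2), \qquad K = \kappa_1 \kappa_2

% The same quantities for a height field h(x,y); subscripts denote partial derivatives
H = \frac{(1+h_y^2)\,h_{xx} - 2\,h_x h_y h_{xy} + (1+h_x^2)\,h_{yy}}
         {2\,(1+h_x^2+h_y^2)^{3/2}},
\qquad
K = \frac{h_{xx}\,h_{yy} - h_{xy}^2}{(1+h_x^2+h_y^2)^{2}}
```

The sign of H depends on the chosen orientation of the surface normal, so which color corresponds to positive curvature is a matter of convention.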

6.3.6 Interactive sequence heatmaps highlight the conservation of interaction sites

Coloring each residue according to its interaction level with lipids provides an easy way to see potential lipid interaction sites. The full power of such sequence heatmaps, however, comes when multiple sequences are aligned according to their sequence similarity. This enables the comparison of the interaction profiles of related proteins and the identification of shared interaction sites. Figure 6-7 shows an example of such aligned sequence heatmaps for 28 GPCRs(10). The interaction shown is with cholesterol, but ProLint offers support for any other lipid as well as any other interaction parameter.
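The idea of comparing interaction values column by column across an alignment can be sketched as follows. The function `align_values`, the two short aligned sequences, the contact values and the 0.5 threshold are all invented for illustration and do not come from the GPCR dataset.

```python
def align_values(aligned_seq, values):
    """Spread per-residue values over an aligned sequence; gap columns ('-') get None.

    Hypothetical helper for building one heatmap row per aligned sequence.
    """
    n_residues = sum(1 for aa in aligned_seq if aa != "-")
    if n_residues != len(values):
        raise ValueError("values must match the number of non-gap residues")
    remaining = iter(values)
    return [None if aa == "-" else next(remaining) for aa in aligned_seq]

# Toy alignment of two hypothetical receptors with normalized contact values.
rows = {
    "receptor_A": align_values("LV-FA", [0.9, 0.2, 0.7, 0.1]),
    "receptor_B": align_values("L-IFA", [0.8, 0.3, 0.6, 0.0]),
}

# Alignment columns where every sequence has a residue with a value >= 0.5.
shared = [
    col for col in range(5)
    if all(row[col] is not None and row[col] >= 0.5 for row in rows.values())
]
```

Columns that pass such a filter in most sequences are candidates for conserved interaction sites, which is what the aligned heatmap makes visible at a glance.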

Figure 6-7. Aligned sequence heatmaps for GPCR-cholesterol interactions. Sequences of all GPCRs simulated in (10) are aligned and colored using the same colormap to show similarities and differences between the interaction sites found among GPCRs. The region shown in the figure corresponds to transmembrane helices VI and VII and highlights a conserved cholesterol interaction site found in transmembrane helix VI.

GPCR sequences are aligned according to their sequence similarity and only the region encompassing helices VI and VII is shown. We see that most of the GPCRs display an interaction site with cholesterol at the helix VI/VII interface, which involves mainly hydrophobic residues (e.g. Leu-244 in the case of the active state of A2AR). These data hint at the possibility of this interaction site being conserved among GPCRs.

6.4 Conclusion

The use of MD simulations to study lipid-protein interactions has resulted in a wealth of information on the interplay between lipids and proteins. This trend is projected to continue at an increasing pace. While in theory the "production phase" of an MD simulation remains the rate-limiting step (i.e. the time it takes to integrate the equations of motion), in practice data processing and analysis are the most time-consuming. Packages such as MDTraj and MDAnalysis have played an important role in decreasing the time it takes to read and manipulate trajectories, yet analysis is largely confined to custom in-house scripts that are sometimes shared(13, 25) but always carry a significant tax on researcher time and effort.

ProLint aims to bridge this gap between the generation of MD data and their analysis. The main goal of ProLint is to allow researchers to gain detailed insight into their simulated systems easily and quickly. By implementing a standard workflow for processing and extracting lipid-protein contact information, coupled with many modern visualization applications, it allows researchers to interactively view and analyze the lipid interaction profile of any membrane-embedded protein. It can be used to quickly analyze simulation data for lipid-protein interactions, as well as to make results publicly available for everyone to view. By reporting contact information on all residues with all lipids, ProLint ensures that all data, including negative results, are available and viewable. Another important benefit provided by ProLint is that no technical skill or expertise in MD simulation is required to view and understand the lipid interaction profiles of proteins, which allows anyone interested in the field to interact with the data.

ProLint, in its current version (0.7), allows users to submit simulation data that use the MARTINI model and to quickly analyze and visualize the underlying lipid-protein interactions. We have also developed a clear roadmap for features that will be included in future versions of the webserver.

We hope that this will serve as a motivation for other groups to make their data publicly available. The advantages of open access science extend to the sharing of the relevant data(26), and its benefit to researchers is well documented(27). ProLint facilitates this process. We believe that ultimately ProLint benefits the larger MD simulation community, as it allows simulation data to be made accessible to a wider audience and allows experimentalists to access a free and publicly available database on lipid-protein interactions without requiring any technical knowledge.

6.5 References

1. Feig M., G. Nawrocki, I. Yu, P.-h. Wang, Y. Sugita, editors. Challenges and opportunities in connecting simulations with experiments via molecular dynamics of cellular environments. Journal of Physics: Conference Series; 2018: IOP Publishing.

2. Thibault J. C., D. R. Roe, J. C. Facelli, T. E. Cheatham. Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing. Journal of cheminformatics. 2014;6(1):4.

3. Mlinarić A., M. Horvat, V. Šupak Smolčić. Dealing with the positive publication bias: Why you should really publish your negative results. Biochemia medica. 2017;27(3):447-452.

4. Nimpf S., D. A. Keays. Why (and how) we should publish negative data. EMBO reports. 2020;21(1).

5. McGibbon R. T., K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane, V. S. Pande. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J. 2015;109(8):1528-1532.

6. Gowers R. J., M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, J. Domanski, D. L. Dotson, S. Buchoux, I. M. Kenney. MDAnalysis: a Python package for the rapid analysis of molecular dynamics simulations. Los Alamos National Lab.(LANL), Los Alamos, NM (United States); 2019. Report No.: 2575-9752.

7. Abraham M. J., T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E. Lindahl. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19-25.

8. Pearlman D. A., D. A. Case, J. W. Caldwell, W. S. Ross, T. E. Cheatham III, S. DeBolt, D. Ferguson, G. Seibel, P. Kollman. AMBER, a package of computer programs for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to simulate the structural and energetic properties of molecules. Computer Physics Communications. 1995;91(1-3):1-41.

9. Brooks B. R., C. L. Brooks III, A. D. Mackerell Jr, L. Nilsson, R. J. Petrella, B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch. CHARMM: the biomolecular simulation program. Journal of computational chemistry. 2009;30(10):1545-1614.

10. Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions Are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020;118(8):1887-1900.

11. Chavent M., T. Reddy, J. Goose, A. C. E. Dahl, J. E. Stone, B. Jobard, M. S. Sansom. Methodologies for the analysis of instantaneous lipid diffusion in MD simulations of large membrane systems. Faraday Discuss. 2014;169:455-475.

12. Newport T. D., M. S. P. Sansom, P. J. Stansfeld. The MemProtMD database: a resource for membrane-embedded protein structures and their lipid interactions. Nucleic Acids Res. 2019;47(D1):D390-D397.

13. Duncan A. L., R. A. Corey, M. S. Sansom. Defining how multiple lipid species interact with inward rectifier potassium (Kir2) channels. Proceedings of the National Academy of Sciences. 2020.

14. Team B. D. Bokeh: Python library for interactive visualization. Bokeh Development Team Wichita, KS; 2014.

15. Bostock M., V. Ogievetsky, J. Heer. D³ data-driven documents. IEEE transactions on visualization and computer graphics. 2011;17(12):2301-2309.

16. Cabello R. Three.js. URL: https://github.com/mrdoob/three.js. 2010.

17. Rose A. S., P. W. Hildebrand. NGL Viewer: a web application for molecular visualization. Nucleic Acids Res. 2015;43(W1):W576-W579.

18. Rose A. S., A. R. Bradley, Y. Valasatava, J. M. Duarte, A. Prlić, P. W. Rose. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics. 2018;34(21):3755-3758.

19. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. Acs Central Sci. 2018;4(6):709-717.

20. Rouviere E., C. Arnarez, L. W. Yang, E. Lyman. Identification of Two New Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. Biophys J. 2017;113(11):2415-2424.

21. Sehnal D., M. Deshpande, R. S. Vařeková, S. Mir, K. Berka, A. Midlik, L. Pravda, S. Velankar, J. Koča. LiteMol suite: interactive web-based visualization of large-scale macromolecular structure data. Nature methods. 2017;14(12):1121.

22. Hanson R. M., J. Prilusky, Z. Renjian, T. Nakane, J. L. Sussman. JSmol and the next‐generation web‐based representation of 3D molecular structure as applied to proteopedia. Israel Journal of Chemistry. 2013;53(3‐4):207-216.

23. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

24. Salas-Estrada L. A., N. Leioatts, T. D. Romo, A. Grossfield. Lipids Alter Rhodopsin Function via Ligand-like and Solvent-like Interactions. Biophys J. 2018;114(2):355-367.

25. Gu R.-X., B. L. de Groot. Lipid-protein interactions modulate the conformational equilibrium of a potassium channel. Nat Commun. 2020;11(1):1-10.

26. Abraham M., R. Apostolov, J. Barnoud, P. Bauer, C. Blau, A. M. Bonvin, M. Chavent, J. Chodera, K. Čondić-Jurkić, L. Delemotte. Sharing data from molecular simulations. J Chem Inf Model. 2019;59(10):4093-4099.

27. McKiernan E. C., P. E. Bourne, C. T. Brown, S. Buck, A. Kenall, J. Lin, D. McDougall, B. A. Nosek, K. Ram, C. K. Soderberg. Point of view: How open science helps researchers succeed. eLife. 2016;5:e16800.

Chapter Seven: Lipid – Protein Interactions: Current Status and Future Outlook

Copyright

N/A

Contribution

The motivations for the following article are my own and the entire review is written by me.

Abbreviation

All abbreviations are introduced at first mention.

7.1 Abstract

Molecular dynamics simulations have proven to be a powerful and insightful tool in the study of the interplay between lipids and proteins. This is reflected in our current understanding of the intricacies of lipid-protein interactions and the variety of ways membrane lipids exert influence on the activity of membrane proteins. Interactions with membrane lipids have been shown to be a driving force behind membrane protein function and activity, for instance by affecting the sampling of conformational changes of proteins or by acting as allosteric modulators of their activity. As a field, however, the study of lipid-protein interactions is still in its infancy. Here, we give a brief overview of some of the challenges facing the field and argue for the necessity of formalizing our language and our approach when assessing the interactions between lipids and proteins. Such efforts would enhance the clarity and transparency of simulation results and improve their communication to the general scientific community.

7.2 Introduction

The last three decades of scientific research have seen an explosion in the number of works published with the goal of elucidating the interplay between lipids and proteins. This is thanks to the incredible effort and dedication from many talented scientists. In a recent series of efforts aiming at comprehensively reviewing the lipid – protein interaction literature(1-4), particularly as it emerges from computational studies, the number of contributions from many research groups as well as diversity of approaches became apparent. Combined efforts from both computational and experimental work, have been tremendously successful in revealing the molecular environment of membrane proteins and providing mechanistic details into many aspects of their function and activity, ranging from their anchoring process to cell membranes(5-7), perturbations of the local membrane environment(8, 9), aiding the creation, stability and form of protein oligomeric complexes(10, 11), or the filtering of lipids out of the lipid milieu in lipid – specific processes such as forming direct interactions(12) that are kept for long timespans, translocating lipids across the membrane(13), or allowing the protrusion of membrane lipids into the protein core(14). Consequently, our current picture and working model of the life of membrane-embedded proteins is much richer and vastly more complex.

Despite the lack of clear demarcation lines with regards to what constitutes a field in scientific disciplines, we believe it to be a non-controversial statement that the study of lipid – protein interactions merits such a qualifier. Going into the new decade, therefore, with the expectation that the number of studies dedicated to illuminating the diverse array of interplays between lipids and proteins is going to grow rapidly, it becomes important to discuss the field itself: its current status and future outlook, with an emphasis on the challenges moving forward.

Here, we attempt to provide an overview of some of the obstacles and problems facing the field and briefly discuss possible remedies. To this end, we discuss various aspects of lipid – protein interactions that we believe are important to address and merit further consideration. Specifically, we emphasize the need for a more transparent and standardized methodology, along with the necessity of a comprehensive analysis of lipid – protein interactions that considers the presence of multiple lipid species when characterizing a protein’s interaction profile with the surrounding membrane environment.

Our discussion will be focused entirely on the computational approach to the study of lipid – protein interactions, and while some details might be extended to apply to experimental work, we neither imply nor infer it. We also take a more high-level approach to the discussion and as such do not consider challenges that relate to e.g. force-field development, coarse-grained and mixed resolution simulations, etc.

7.3 Ensuring convergence of lipid distributions

Bilayer models composed of only a few lipid species are a great tool to gain quick insight into the interplay between proteins and lipids, and due to their simplicity, they are ideal for linking computational and experimental data. The trend in the field, however, is towards more realistic membrane models whereby the lipid compositions employed in MD simulations aim at faithfully reproducing lipidomic data on cellular membranes(1). The drive toward increased compositional complexity, however, amplifies the issue of attaining convergence of the quantities of interest. That is, given lipid components with varying levels of concentration and protein surfaces that are partly occluded by lipid interactions, it is important that simulation lengths allow for sufficient sampling of the desired parameters (e.g. the lipid distribution around proteins). One complex system that is commonly employed to study lipid-protein interactions is composed of 63 distinct lipid species(9, 15, 16) that differ in their headgroup and tail type. Estimating the time needed to achieve converged lipid distributions in such systems is no trivial task. One approach that we used recently(16), motivated by the work of Chodera(17), consists of calculating the cumulative average, $c_{av}$, of the number of lipids around proteins within a distance cutoff. For $N$ successive values $l_1, l_2, \ldots, l_N$ of the number of lipids, this is simply:

$$c_{av} = \frac{l_1 + l_2 + \ldots + l_N}{N}$$


Since each successive value is a result of all preceding data points, the shape of the graph tells us how long we need to simulate so that the beginning portion of the trajectory, where the system is still converging, does not affect the calculated properties. This is important in, for example, determining the optimal initial portion of the trajectory to discard, and thus the starting point of the production region (which we denote here as $t_0$). To illustrate this, Figure 7-1A shows the number of ganglioside (GM) lipids around the serotonin receptor (5HT1B) during a 30 μs simulation, calculated either as the running average of the number of lipids or as their cumulative average using different points for discarding the initial portion of the trajectory (i.e. different values of $t_0$). Using only the running average, it is difficult to determine the point at which the distribution of GM lipids around 5HT1B has converged. In contrast, the cumulative average shows that if we analyze the whole trajectory without discarding any part of it to equilibration ($t_0 = t$), we need to simulate for a very long time (30+ μs) before the calculated properties of the GM lipid distribution are correct.

In fact, the results show that we can be confident only from $t_0 = 10$ μs, and preferably larger. We can also gain deeper insight if we calculate the running average of the number of lipids using different values of $t_0$. Figure 7-1B shows the convergence of the running average as a function of increasing $t_0$. We see that as we discard a larger fraction of the trajectory to equilibration, the average number of lipids around the receptor reaches a converged value.
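The cumulative average and the effect of discarding an initial portion of the trajectory can be sketched in a few lines. The example below is a minimal illustration on synthetic data, not the analysis code used for Figure 7-1: the function names, the exponential relaxation and the noise level are all assumptions, and $t_0$ is expressed in frames rather than microseconds. The second function reports the mean and standard deviation over the retained portion, which is one possible reading of the quantity plotted in Figure 7-1B.

```python
import numpy as np

def cumulative_average(counts, t0=0):
    """Cumulative average c_av of counts[t0:], one value per retained frame."""
    x = np.asarray(counts, dtype=float)[t0:]
    return np.cumsum(x) / np.arange(1, x.size + 1)

def production_stats(counts, t0=0):
    """Mean and sample standard deviation after discarding the first t0 frames."""
    x = np.asarray(counts, dtype=float)[t0:]
    return x.mean(), x.std(ddof=1)

# Synthetic lipid counts (assumption): slow relaxation toward 8 lipids plus noise.
rng = np.random.default_rng(0)
frames = np.arange(3000)
counts = 8.0 - 6.0 * np.exp(-frames / 300.0) + rng.normal(0.0, 1.0, frames.size)

for t0 in (0, 500, 1000, 1500):
    cav = cumulative_average(counts, t0)
    mean, sd = production_stats(counts, t0)
    print(f"t0={t0:5d}  final c_av={cav[-1]:.2f}  mean={mean:.2f}  sd={sd:.2f}")
```

With a slowly relaxing signal of this kind, the cumulative average computed from the first frame stays biased by the initial transient for a long time, whereas curves started at a larger $t_0$ settle close to the plateau value much sooner.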

Figure 7-1. Measuring convergence of lipid distributions. In A we show the running average (avg; dark red) and the cumulative average (cav; black) of the number of ganglioside (GM) lipids within 7 Å of the serotonin receptor (5HT1B). The cumulative average is displayed separately for the whole trajectory, and for three $t_0$ values: 5, 10 and 15 μs. B shows the convergence of the running average as a function of its standard deviation calculated for increasing values of $t_0$. The black plus indicates the average of the last 5 μs of the trajectory.

This process is accompanied by a progressive decrease in the standard deviation of the calculated values. The black plus symbol in Figure 7-1B marks the average value of the last 5 μs of the trajectory, which is a commonly used cutoff for analyzing such systems. It is still a challenge, however, to balance accurate lipid distributions against maximizing the production region (the region where statistics are collected). Automated tools for detecting the optimal value of $t_0$ are therefore of particular importance(17). More work, however, is needed to formalize this discussion. The lipid distribution around a reference point is only one of many parameters of interest. Others, such as the sampling of binding/unbinding events, may take considerably longer to converge, or may not be accessible at all using unbiased MD simulations.
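As an illustration of what automated detection of $t_0$ can look like, the sketch below follows the general idea behind the approach of reference 17: scan candidate values of $t_0$ and keep the one that maximizes the number of effectively uncorrelated samples in the retained portion. This is a simplified reimplementation written for this example, not the published code. The autocorrelation truncation rule, the candidate grid, the function names and the synthetic data are all assumptions.

```python
import numpy as np

def statistical_inefficiency(x):
    """Estimate g = 1 + 2 * sum of normalized autocorrelations.

    Simplification (assumption): the sum stops at the first lag whose
    autocorrelation estimate is not positive.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    if var == 0.0:
        return 1.0  # constant series: treat samples as uncorrelated
    g = 1.0
    for lag in range(1, n - 1):
        c = np.dot(dx[:-lag], dx[lag:]) / ((n - lag) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - lag / n)
    return max(g, 1.0)

def detect_t0(x, n_candidates=40, min_samples=50):
    """Return (t0, n_eff) maximizing n_eff = (N - t0) / g over a grid of t0."""
    x = np.asarray(x, dtype=float)
    last = max(x.size - min_samples, 0)
    candidates = np.unique(np.linspace(0, last, n_candidates).astype(int))
    best_t0, best_n_eff = 0, -np.inf
    for t0 in candidates:
        g = statistical_inefficiency(x[t0:])
        n_eff = (x.size - t0) / g
        if n_eff > best_n_eff:
            best_t0, best_n_eff = int(t0), n_eff
    return best_t0, best_n_eff

# Synthetic lipid counts (assumption): relaxation toward a plateau plus noise.
rng = np.random.default_rng(0)
frames = np.arange(3000)
series = 8.0 - 6.0 * np.exp(-frames / 300.0) + rng.normal(0.0, 1.0, frames.size)

t0_hat, n_eff_hat = detect_t0(series)
print(f"suggested t0 = {t0_hat} frames, effective samples = {n_eff_hat:.0f}")
```

The design choice here is to trade the length of the production region against the correlation of the data it contains: discarding too little leaves a strongly correlated transient, while discarding too much simply leaves fewer samples.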

7.4 When does a lipid become bound to a protein?

Definitions. Membrane proteins interact with membrane lipids. The majority of these interactions are transient, with the formation and breakage of lipid – protein contacts (i.e. the lipid exchange rate)(18) usually occurring on the low-nanosecond timescale. Such interactions are referred to as nonspecific lipid binding, to distinguish them from specific lipid binding, which is longer-lived. While nonspecific (specific) binding of lipids is marked by a high (low) exchange rate, accurately identifying a site as one or the other is a formidable challenge, since even nonspecific binding of lipids can last for hundreds of nanoseconds(1). Therefore, the challenge lies first and foremost in ensuring adequate sampling of binding/unbinding events and then in accurately classifying them as specific or nonspecific. Considering practical limitations, however, some degree of arbitrariness is inherently involved. Atomistic (AT) simulations are still limited in the time scales achievable, but owing to their resolution, identifying specific interactions is considerably easier. The opposite is the case for coarse-grained (CG) simulations, such as those utilizing the MARTINI model(19), where due to the smoother energy landscape and increased lateral diffusion of lipids(20), access to longer time scales comes at the expense of an increased rate of false positive interaction sites(16). Given the complementary nature of their advantages and shortcomings, simulation strategies employing both levels of resolution should be especially useful. Indeed, a combined approach to the study of lipid – protein interactions, whereby CG simulations are used to ensure adequate sampling and AT simulations to filter out false positives and “flesh out” the details of the interaction, has been used successfully, with the two sets of simulations either initiated independently of each other(12) or linked by mapping the equilibrated CG system back into atomistic detail(21). Of course, plenty of other strategies can also be used to overcome the sampling problem in the study of specific lipid interaction sites, for instance running multiple independent simulations started from different initial positions or using enhanced sampling techniques.

Measurements. Regardless of the approach undertaken to overcome the sampling barrier, the next challenge is quantifying the interactions of proteins with lipids. This step is essential for the accurate differentiation between binding and nonbinding lipids, but also to enable other researchers to analyze and compare results across studies. Unfortunately, the field currently lacks a standardized protocol to measure lipid – protein interactions. For instance, some of the ways that interactions between lipids and proteins have been quantitatively assessed include measurement of the total number of contacts formed between lipids and proteins(22), the average duration of lipid – protein contacts(12, 16), custom definitions such as the lipid residence time(21) or the depletion-enrichment index(9), and two- or three-dimensional densities of lipids around proteins. When considering the duration of contacts, as an example, Rouviere et al.(12) discard contacts shorter than 1 μs, whereas Sejdiu et al.(16) calculate the average of the contact with the longest duration. These differences are in addition to the inconsistent definition of what constitutes a contact. It would, of course, be wrong to expect a single metric to suffice, considering the different advantages and shortcomings these parameters have. The total number of contacts is very intuitive in showing the interaction frequency of residues, but it depends on the ratios of lipid components in the system. Measurements of the longevity of contacts, to be meaningful, have to discard many of the contacts based on a somewhat arbitrary cutoff value, but they are less prone to false positives.
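To make the difference between these metrics concrete, the following sketch computes several of them from the same per-frame contact trace for one lipid and one residue. It is a minimal illustration under stated assumptions: the trace, the frame spacing `dt`, the duration cutoff and the function names are hypothetical, and the definition of a "contact" itself (distance cutoff, atom selection) is assumed to have been applied upstream.

```python
import numpy as np

def contact_durations(in_contact):
    """Lengths, in frames, of each uninterrupted run of True values."""
    padded = np.concatenate(([0], np.asarray(in_contact, dtype=int), [0]))
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)   # frame where a contact begins
    ends = np.flatnonzero(change == -1)    # frame after a contact ends
    return ends - starts

def contact_metrics(in_contact, dt, min_duration=0.0):
    """Summarize one contact trace with several commonly reported metrics."""
    durations = contact_durations(in_contact) * dt
    kept = durations[durations >= min_duration]  # somewhat arbitrary cutoff
    return {
        "occupancy": float(np.mean(in_contact)),       # fraction of frames in contact
        "n_events": int(durations.size),               # number of binding events
        "mean_duration": float(kept.mean()) if kept.size else 0.0,
        "longest_duration": float(durations.max()) if durations.size else 0.0,
    }

# Hypothetical trace: True where the lipid is within the contact cutoff.
trace = np.array([1, 1, 0, 1, 1, 1, 1, 0, 0, 1], dtype=bool)
metrics = contact_metrics(trace, dt=0.5, min_duration=1.0)
print(metrics)
```

Reporting all of these values side by side, rather than only the preferred one, costs little and makes results easier to compare across studies that favor different metrics.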

In principle, the use of different parameters to measure the interaction between lipids and proteins is not a problem per se, as any carefully chosen and consistently applied metric should be successful in accurately identifying specifically bound lipids. One of the goals of the field, however, is to understand the driving forces and mechanistic details that lead to lipid binding, and their effect on the structure and function of membrane proteins. To this end, comparative analysis of lipid – protein interactions across different studies is essential, but is made difficult by the lack of a standardized protocol. A possible remedy could be to share data on several metrics simultaneously when publishing results, regardless of the preference of the authors. Although technically more challenging, using modern in-browser molecular visualization software to display the interaction of proteins with lipids(16) would also be an option.

Ultimately, it will be desirable to complement such results with free-energy calculation methods. Despite considerable advancements in computer hardware and algorithm optimization, free-energy calculation methods are still computationally expensive, in particular at atomistic resolution. Good approximations, however, can be achieved using CG models such as MARTINI, and important advancements have been made with promising results. Potential of mean force calculations, alchemical free-energy perturbation and well-tempered metadynamics can all be used to estimate the strength of lipid binding to proteins(23). These calculations depend on prior knowledge of the lipid binding site, and as such they do not replace the contact-based calculation methods mentioned above, but rather provide a reliable way to differentiate between specific and nonspecific binding of lipids and to compare the binding strength between interaction sites.

Interactions. Once a specific interaction between a protein and a lipid has been established, does it follow that the interaction site is exclusive to that lipid? In other words, does specificity imply exclusivity of binding? Current evidence does not support an affirmative answer. That is, the same interaction site that binds specifically to one lipid may also do so with other lipids. For instance, COX-1 binds specifically to cholesterol at the same site that is used by its natural substrate, arachidonic acid, to enter the cyclooxygenase active site of the enzyme. Practically, this highlights the need to use accurate lipid compositions in membrane models that mimic the biological environment of simulated proteins and to ensure adequate sampling of all the components.

It is important not to conflate the study of specific lipid – protein interactions with the characterization of the lipid interaction profile of proteins. Specific interactions of lipids with proteins range from one up to a few in number, whereas interactions with all other lipids are short-lived and occur at a higher exchange rate. This, however, does not mean that membrane proteins do not affect the localization and clustering properties of nonbinding lipids in their close proximity. In fact, as has been repeatedly shown, the lipid composition of the local membrane environment of membrane proteins is distinctly different from that of the surrounding membrane. CGMD simulations of G Protein – Coupled Receptors (GPCRs)(15, 16, 24), for instance, show that they induce a reorganization of membrane lipids towards the enrichment of some lipid types (gangliosides, phosphatidylinositol and phosphatidylinositol phosphates, cholesterol) and the depletion of others (phosphatidylcholine and sphingomyelin lipids). Furthermore, the extent of this reorganization varies among GPCRs. Taken together, the specific interactions that GPCRs form with cholesterol and PIP lipids, the degree of enrichment/depletion of surrounding lipids and the overall changes to membrane thickness and curvature result in a lipid – interaction profile that is unique to a particular GPCR structure.

There are several considerations with regard to these findings. Since they are the product of the interaction of proteins with nonbinding lipids, determining the mechanistic details of the resulting changes is challenging. For instance, it is not obvious whether the enrichment (depletion) of one lipid is simply the result of the depletion (enrichment) of another lipid type. In the case of GPCRs, the depletion of glycerophospholipids (PC lipids), as an example, may simply be the by-product of the preferential clustering of GM lipids, and not due to any incompatibility of interaction between PC lipids and the receptor. As such, inferring the direction of causality for changes to the clustering of nonbinding lipids around proteins is notably difficult.
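The coupling between the enrichment of one lipid and the apparent depletion of another can be illustrated with a toy calculation. The sketch below assumes one common formulation of a depletion-enrichment index, namely the fraction of a lipid type near the protein divided by its fraction in the whole membrane; this formula, the function name and all counts are assumptions chosen for illustration and are not taken from the cited studies.

```python
import numpy as np

def depletion_enrichment(local_counts, membrane_counts):
    """Local lipid fraction divided by overall fraction (assumed definition).

    Values above 1 indicate enrichment near the protein, below 1 depletion.
    """
    local = np.asarray(local_counts, dtype=float)
    overall = np.asarray(membrane_counts, dtype=float)
    return (local / local.sum()) / (overall / overall.sum())

lipids = ["PC", "CHOL", "GM"]
membrane = [600, 300, 100]            # hypothetical overall composition

# Case 1: the local shell mirrors the overall composition.
before = depletion_enrichment([30, 15, 5], membrane)

# Case 2: ten extra GM lipids cluster at the protein; PC and CHOL counts
# in the shell are unchanged.
after = depletion_enrichment([30, 15, 15], membrane)

for name, b, a in zip(lipids, before, after):
    print(f"{name:5s} before={b:.2f} after={a:.2f}")
```

In the second case PC and cholesterol appear depleted even though their absolute numbers near the protein did not change, because the index is built from fractions that must sum to one. A fraction-based index alone therefore cannot distinguish active exclusion of a lipid from passive displacement by another.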

Lastly, we should also add that the terminology used to describe the overall interaction profile of proteins with lipids lacks a clear and formal definition. The designations annular and bulk lipids are commonly used to differentiate lipids that are in close contact with membrane proteins from the surrounding lipids. Drawing the line between them, however, is anything but straightforward. Related to this, the effect of membrane lipids on the structure and function of membrane proteins is usually characterized as either the result of specific interactions between lipids and proteins, or the product of their nonspecific, yet distinct, interactions. Given our current understanding, we think that this characterization is accurate but does not encompass the whole range of lipid – protein interactions. For example, COX-1 interacts preferentially but nonspecifically with membrane phospholipids. Surface-exposed, bilayer-oriented, positively charged lysine and arginine residues interact with the headgroups of PC lipids. The result of these interactions is the induction of a positive curvature on the membrane that affects both leaflets, even though the anchoring of COX-1 is monotopic. In this case, changes to the physicochemical properties of the membrane are caused by nonspecific interactions with membrane lipids. In contrast, the gating mechanism of mechanosensitive channels involves a combination of membrane curvature and hydrophobic mismatch, coupled with a close relationship with individual lipid acyl chains. Here, the lipid-protein interaction landscape is not easily classifiable. Most recently, MD simulations of the pore domain of MthK in 11 different membrane models(25), which differ in membrane thickness, lipid tail saturation degree and cholesterol content, revealed that these physical properties affect the ion permeation rates of MthK. These three examples would all be classified as the result of nonspecific lipid-protein interactions, yet they manifest themselves quite differently on the protein conformational landscape as well as on the local membrane structure.

7.5 The full picture

The literature on lipid – protein interactions is dominated by the characterization of protein interactions with mainly cholesterol and, to a lesser extent, PIP lipids. This is, of course, not without good reason. The specific binding of cholesterol has been a major focus of the field since its inception, which, coupled with the involvement of cholesterol in driving membrane lipid organization(26), domain separation(27), and lipid rafts(28), has led to a wealth of information regarding its activity. Similarly, PIP lipids have gained increasing attention in the context of their specific interactions with proteins(29, 30), since, owing to their low concentrations in cell membranes, their binding may provide cells with a control mechanism over the activity of membrane proteins.

Despite its prevalence in the lipid – protein interaction literature(2, 3), research on cholesterol and PIP lipid binding should not substitute for, or obscure, the need to study the interaction of other lipid types with proteins. The description of the lipid – interaction profile of proteins inherently implies a comprehensive analysis of their interaction with all lipid types. The importance and usefulness of such a thorough analysis was, for example, shown recently by Duncan et al.(31) for Kir2.2, where they show the channel interacting specifically with PS lipids in addition to PIP2 lipids. Specifically, they highlight the need to consider multiple lipid species when analyzing lipid – protein interactions.

Related to the above, discussions of lipid – protein interactions are also focused mostly on the protein, i.e. how individual residues interact with lipids, at what strength, and so on. This, of course, makes sense considering that one of the primary reasons for choosing proteins to study is their physiological and medical importance. Nevertheless, to get a complete understanding of lipid – protein interactions, we also have to consider the subtly different aspect of their interplay, namely how lipids interact with proteins. That is, we need to shift our attention to how and why lipids bind to proteins. We illustrate this difference in Figure 7-2, where we show the interaction of COX-1 with cholesterol by first mapping the number of contacts onto the surface of the protein and then comparing it to the same mapping onto the structure of cholesterol. The projection of cholesterol contacts onto the surface of the protein shows which residues interact most with cholesterol and the general location of the binding site. In the case of COX-1 presented here, the binding site is located at the hydrophobic core of the enzyme, which is also used by its natural substrate, arachidonic acid, to access the active site of the protein. This, however, does not tell us anything about how cholesterol is positioned in the binding site or how the interaction is maintained. To get the complete picture of cholesterol – COX-1 interactions we also have to consider their interaction from the perspective of cholesterol. One way to do that, as illustrated in Figure 7-2, is to simply map the interactions with enzyme residues onto the structure of cholesterol. Considering the large number of residues, and to keep the discussion concise, Figure 7-2 shows the projection on cholesterol atoms only for positively charged (B) and hydrophobic (C) residues. We see that the hydroxyl headgroup of cholesterol, as well as the atoms forming its first phenanthrene ring, interact mainly with charged residues (which for COX-1 are arginine residues) and only on the α-face. In contrast, cholesterol interactions with hydrophobic residues span the entire length of the β-face of cholesterol and almost none of the α-face. Taken together, such an analysis allows for a detailed understanding of the binding of cholesterol to the COX-1 enzyme.
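The two perspectives described above are two marginals of the same contact matrix, which the following sketch makes explicit. It is a minimal NumPy illustration on hand-made coordinates and is not the MDTraj-based workflow used for Figure 7-2: the function name, the cutoff, the coordinates and the residue classes are hypothetical, one point stands in for each residue, and periodic boundary conditions are ignored.

```python
import numpy as np

def contact_counts(protein_xyz, lipid_xyz, cutoff):
    """Count frames in which each (protein site, lipid atom) pair is in contact.

    protein_xyz: array of shape (n_frames, n_sites, 3)
    lipid_xyz:   array of shape (n_frames, n_lipid_atoms, 3)
    Returns an integer array of shape (n_sites, n_lipid_atoms).
    """
    diff = protein_xyz[:, :, None, :] - lipid_xyz[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist < cutoff).sum(axis=0)

# Two frames, two protein sites, three lipid atoms (arbitrary units).
protein = np.array([[[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]] * 2)
lipid = np.array([
    [[0.0, 0.0, 1.0], [10.0, 0.0, 1.0], [10.0, 1.0, 0.0]],  # frame 0: bound
    [[50.0, 50.0, 50.0]] * 3,                                # frame 1: far away
])
residue_class = np.array(["charged", "hydrophobic"])         # hypothetical labels

counts = contact_counts(protein, lipid, cutoff=2.0)

per_site = counts.sum(axis=1)        # protein perspective: which sites bind
per_lipid_atom = counts.sum(axis=0)  # lipid perspective: which atoms are touched

# Lipid perspective split by residue class, in the spirit of panels B and C.
by_charged = counts[residue_class == "charged"].sum(axis=0)
by_hydrophobic = counts[residue_class == "hydrophobic"].sum(axis=0)
print(per_site, per_lipid_atom, by_charged, by_hydrophobic)
```

Keeping the full pairwise matrix, rather than only the per-residue totals, is what allows the same data to be projected onto either partner afterwards.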


Figure 7-2. Overview of cholesterol – COX-1 interactions. We show the total number of contacts between cholesterol and COX-1 residues mapped onto the surface of the enzyme (A) and onto cholesterol atoms (B and C). Contact projection on cholesterol atoms is done separately for positively charged (B) and hydrophobic residues (C). Calculation of contacts for A was done using MDTraj(32) and displayed using VMD(33). For B and C, we used PyVista(34) to generate the density maps and ParaView(35) for visualization.

Lipid – protein interactions, even if they are specific, do not necessarily imply any functional importance for the activity of the protein. The connection between lipid binding and functional importance has to be independently established; as a matter of fact, it has to be established for every potential binding site of every protein. One of the findings from lipid – protein interaction studies is the unique interaction profile of proteins with their local lipid environment. The question of the biological importance of such a unique profile, however, is left open. Indeed, even if such a biological importance were to be established, it is unclear how differences in the composition of annular shell lipids would relate to it. That is, it is possible that the lipid – interaction profiles of two similar proteins are each unique, but still not sufficiently different to elicit a distinct biological response. While linking annular shell lipid composition to protein activity is a significantly tougher challenge, the number of articles that demonstrate and detail the relationship between the binding of specific lipids and protein function and activity is substantial and growing rapidly. This is in large part thanks to advances in experimental techniques such as native mass spectrometry (nMS)(36), the use of lipid nanodiscs(37), and surface plasmon resonance(38). Despite this, experimental validation of putative binding sites and their functional involvement in the structure and activity of membrane proteins remains a challenge. MD simulations are not silent on this topic either and have been used to show that lipids do indeed affect the activity of proteins(39). Nevertheless, as computational approaches continue to be optimized and pipelines for the automated simulation and analysis of lipid – protein interactions are developed, it will become increasingly important to accelerate the development of experimental techniques that can verify and validate simulation findings.

7.6 Conclusion

Using MD simulations to study lipid – protein interactions has proven to be incredibly successful in elucidating the interplay between membrane proteins and their lipid environment. Advancements in computer hardware, software and algorithms, coupled with an increasing repertoire of solved membrane protein structures, will enable the field to grow at a fast rate. The parallel increase in the number of membrane protein structures that are solved with bound lipids is of particular importance, as they reveal structural details of lipid binding and serve as a reference point for simulation results(40, 41).


Current efforts are directed toward membrane models with increased compositional complexity that are simulated for longer times, with projections that full cell models will be within reach in the next decade(1). Parallel to this, we expect our understanding of the molecular forces that drive lipids to interact with proteins to increase substantially, paired with the creation of lipid – protein interaction databases, visualization frameworks, and even algorithms for the automated detection of lipid binding sites.

Despite the progress, we believe the field would benefit from efforts dedicated to formalizing the language and terminology we use, as well as standardizing analysis and measurement metrics. Avoiding a myopic understanding of lipid – protein interactions by extending our scope to consider other lipid species, in addition to cholesterol and PIP lipids, and increasing the accessibility of simulation and analysis data(42) would be of no lesser importance.

7.7 References

1. Marrink S. J., V. Corradi, P. C. Souza, H. I. Ingólfsson, D. P. Tieleman, M. S. Sansom. Computational modeling of realistic cell membranes. Chem Rev. 2019;119(9):6184-6226.

2. Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

3. Enkavi G., M. Javanainen, W. Kulig, T. Róg, I. Vattulainen. Multiscale Simulations of Biological Membranes: The Challenge To Understand Biological Phenomena in a Living Substance. Chem Rev. 2019.

4. Muller M. P., T. Jiang, C. Sun, M. Lihan, S. Pant, P. Mahinthichaichan, A. Trifan, E. Tajkhorshid. Characterization of Lipid–Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation. Chem Rev. 2019.

5. Fowler P. W., K. Balali-Mood, S. Deol, P. V. Coveney, M. S. Sansom. Monotopic enzymes and lipid bilayers: a comparative study. Biochemistry. 2007;46(11):3108-3115.

6. Charlier L., M. Louet, L. Chaloin, P. Fuchs, J. Martinez, D. Muriaux, C. Favard, N. Floquet. Coarse-grained simulations of the HIV-1 matrix protein anchoring: revisiting its assembly on membrane domains. Biophys J. 2014;106(3):577-585.

7. Hoogerheide D. P., S. Y. Noskov, D. Jacobs, L. Bergdoll, V. Silin, D. L. Worcester, J. Abramson, H. Nanda, T. K. Rostovtseva, S. M. Bezrukov. Structural features and lipid binding domain of tubulin on biomimetic mitochondrial membranes. Proceedings of the National Academy of Sciences. 2017;114(18):E3622-E3631.

8. Teague W. E., O. Soubias, H. Petrache, N. Fuller, K. G. Hines, R. P. Rand, K. Gawrisch. Elastic properties of polyunsaturated phosphatidylethanolamines influence rhodopsin function. Faraday Discuss. 2013;161:383-395.

9. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. Acs Central Sci. 2018;4(6):709-717.

10. Schmidt T. H., Y. Homsi, T. Lang. Oligomerization of the Tetraspanin CD81 via the Flexibility of Its δ-Loop. Biophys J. 2016;110(11):2463-2474.

11. Pluhackova K., S. Gahbauer, F. Kranz, T. A. Wassenaar, R. A. Bockmann. Dynamic Cholesterol-Conditioned Dimerization of the G Protein Coupled Chemokine Receptor Type 4. PLoS Comput Biol. 2016;12(11):e1005169.

12. Rouviere E., C. Arnarez, L. W. Yang, E. Lyman. Identification of Two New Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. Biophys J. 2017;113(11):2415-2424.

13. Morra G., A. M. Razavi, K. Pandey, H. Weinstein, A. K. Menon, G. Khelashvili. Mechanisms of Lipid Scrambling by the G Protein-Coupled Receptor Opsin. Structure. 2018;26(2):356-+.

14. Guixa-Gonzalez R., J. L. Albasanz, I. Rodriguez-Espigares, M. Pastor, F. Sanz, M. Marti-Solano, M. Manna, H. Martinez-Seara, P. W. Hildebrand, M. Martin, J. Selent. Membrane cholesterol access into a G-protein-coupled receptor. Nat Commun. 2017;8:14505.

15. Marino K. A., D. Prada-Gracia, D. Provasi, M. Filizola. Impact of Lipid Composition and Receptor Conformation on the Spatio-temporal Organization of mu-Opioid Receptors in a Multi-component Plasma Membrane Model. PLoS Comput Biol. 2016;12(12):e1005240.

16. Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions Are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020;118(8):1887-1900.

17. Chodera J. D. A simple method for automated equilibration detection in molecular simulations. Journal of chemical theory and computation. 2016;12(4):1799-1805.


18. Sengupta D., X. Prasanna, M. Mohole, A. Chattopadhyay. Exploring GPCR–Lipid Interactions by Molecular Dynamics Simulations: Excitements, Challenges, and the Way Forward. The Journal of Physical Chemistry B. 2018;122(22):5727-5737.

19. Marrink S. J., H. J. Risselada, S. Yefimov, D. P. Tieleman, A. H. de Vries. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. The Journal of Physical Chemistry B. 2007;111(27):7812-7824.

20. Marrink S. J., D. P. Tieleman. Perspective on the Martini model. Chem Soc Rev. 2013;42(16):6801-6822.

21. Arnarez C., J.-P. Mazat, J. Elezgaray, S.-J. Marrink, X. Periole. Evidence for cardiolipin binding sites on the membrane-exposed surface of the cytochrome bc 1. J Am Chem Soc. 2013;135(8):3112-3120.

22. Hedger G., H. Koldso, M. Chavent, C. Siebold, R. Rohatgi, M. S. P. Sansom. Cholesterol Interaction Sites on the Transmembrane Domain of the Hedgehog Signal Transducer and Class F G Protein-Coupled Receptor Smoothened. Structure. 2018.

23. Corey R. A., O. N. Vickery, M. S. Sansom, P. J. Stansfeld. Insights into Membrane Protein–Lipid Interactions from Free Energy Calculations. Journal of chemical theory and computation. 2019;15(10):5727-5736.

24. Horn J. N., T. C. Kao, A. Grossfield. Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids and Cholesterol with Rhodopsin. In: Filizola M, editor. G Protein-Coupled Receptors - Modeling and Simulation. Advances in Experimental Medicine and Biology. 796. Berlin: Springer-Verlag Berlin; 2014. p. 75-94.

25. Gu R.-X., B. L. de Groot. Lipid-protein interactions modulate the conformational equilibrium of a potassium channel. Nat Commun. 2020;11(1):1-10.

26. Harder T., K. Simons. Caveolae, DIGs, and the dynamics of sphingolipid—cholesterol microdomains. Curr Opin Cell Biol. 1997;9(4):534-542.

27. Gu R.-X., S. Baoukina, D. P. Tieleman. Phase Separation in Atomistic Simulations of Model Membranes. J Am Chem Soc. 2020;142(6):2844-2856.

28. Simons K., R. Ehehalt. Cholesterol, lipid rafts, and disease. The Journal of clinical investigation. 2002;110(5):597-603.


29. Yen H. Y., K. K. Hoi, I. Liko, G. Hedger, M. R. Horrell, W. L. Song, D. Wu, P. Heine, T. Warne, Y. Lee, B. Carpenter, A. Pluckthun, C. G. Tate, M. S. P. Sansom, C. V. Robinson. PtdIns(4,5)P-2 stabilizes active states of GPCRs and enhances selectivity of G-protein coupling. Nature. 2018;559(7714):424-+.

30. Hedger G., M. S. Sansom, H. Koldsø. The juxtamembrane regions of human receptor tyrosine kinases exhibit conserved interaction sites with anionic lipids. Sci Rep. 2015;5:9198.

31. Duncan A. L., R. A. Corey, M. S. Sansom. Defining how multiple lipid species interact with inward rectifier potassium (Kir2) channels. Proceedings of the National Academy of Sciences. 2020.

32. McGibbon R. T., K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane, V. S. Pande. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J. 2015;109(8):1528-1532.

33. Humphrey W., A. Dalke, K. Schulten. VMD: visual molecular dynamics. Journal of molecular graphics. 1996;14(1):33-38.

34. Sullivan C., A. Kaszynski. PyVista: 3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK). Journal of Open Source Software. 2019;4(37):1450.

35. Ayachit U. The paraview guide: a parallel visualization application: Kitware, Inc.; 2015.

36. Patrick J. W., C. D. Boone, W. Liu, G. M. Conover, Y. Liu, X. Cong, A. Laganowsky. Allostery revealed within lipid binding events to membrane proteins. Proceedings of the National Academy of Sciences. 2018;115(12):2976-2981.

37. Swainsbury D. J., S. Scheidelaar, R. Van Grondelle, J. A. Killian, M. R. Jones. Bacterial reaction centers purified with styrene maleic acid copolymer retain native membrane functional properties and display enhanced stability. Angewandte Chemie International Edition. 2014;53(44):11803-11807.

38. Inada M., M. Kinoshita, A. Sumino, S. Oiki, N. Matsumori. A concise method for quantitative analysis of interactions between lipids and membrane proteins. Analytica chimica acta. 2019;1059:103-112.

39. Manna M., M. Niemela, J. Tynkkynen, M. Javanainen, W. Kulig, D. J. Muller, T. Rog, I. Vattulainen. Mechanism of allosteric regulation of beta2-adrenergic receptor by cholesterol. eLife. 2016;5.

40. Papasergi-Scott M. M., M. J. Robertson, A. B. Seven, O. Panova, J. M. Mathiesen, G. Skiniotis. Structures of metabotropic GABAB receptor. bioRxiv. 2020.

41. Flores J. A., B. G. Haddad, K. A. Dolan, J. B. Myers, C. C. Yoshioka, J. Copperman, D. M. Zuckerman, S. L. Reichow. Connexin-46/50 in a dynamic lipid environment resolved by CryoEM at 1.9 Å. bioRxiv. 2020.

42. Abraham M., R. Apostolov, J. Barnoud, P. Bauer, C. Blau, A. M. Bonvin, M. Chavent, J. Chodera, K. Čondić-Jurkić, L. Delemotte. Sharing data from molecular simulations. J Chem Inf Model. 2019;59(10):4093-4099.

Chapter Eight: Conclusions

8.1 Conclusions

My thesis focuses on lipid-protein interactions. Specifically, I use molecular dynamics (MD) simulations to study the many ways in which lipids interact with membrane proteins. To this end, I provide six chapters of original contributions to the field: four present novel research and two are timely reviews of the field.

In chapter 2, I provide a detailed and comprehensive review of the MD literature on lipid-GPCR interactions. In it, I attempt to synthesize our knowledge of how GPCRs interact with lipids and to derive shared features of their interplay. After noting how differences in experimental and simulation setups complicate any comparison of the lipid interaction profiles of GPCRs, I concluded, among other things, that “extending results from one GPCR to others” is not enough. This finding motivated one of my main research efforts, presented in chapter 4. There, I provide the most comprehensive study of GPCR-lipid interactions to date by several metrics: the number of different structures considered, the total simulation time, and the complexity of the membrane model used. We find that GPCRs, even different conformations of the same protein, induce different changes in the local membrane environment. These differences also appear when class A GPCRs are compared with their non-class A counterparts. We highlight them by measuring specific interactions of GPCRs with cholesterol and phosphatidylinositol phosphate (PIP) lipids, the overall enrichment or depletion of all lipid groups around the protein, changes in the localization of lipids according to their degree of tail saturation, and perturbations induced in the surrounding membrane. Nevertheless, we also note the existence of common interaction sites (e.g., cholesterol interactions with the extracellular side of transmembrane helices 6/7 for class A GPCRs).

These findings also explain the challenges in synthesizing the GPCR-lipid interaction literature. The research presented in chapter 4 further motivated the work detailed in chapter 6. Presenting data on the interactions of 28 different structures with lipid membranes containing 60+ different components is not easy, and this is reflected in the amount of supplementary material for chapter 4. In chapter 6, I provide one simple solution to this problem: presenting all the data in an interactive format. The current version of the webserver completely automates the analysis and visualization of lipid-protein interactions.

To gain a wider understanding of lipid-protein interactions, we also studied the interaction of other proteins with their lipid environment. In chapter 3, we present the lipid interaction profile of AMPA receptors, where we note their interaction with cholesterol and diacylglycerol lipids. These lipids insert into a cavity formed by helices M1, M2 and M4 and interact with the terminal residues of helix M2 that form the ion conductance pathway of the receptor, highlighting a potential functional modulation of the activity of AMPA receptors by lipids. In chapter 5, we present a detailed investigation of COX-1 – lipid interactions. We observe several lipid types interacting with the enzyme, where they occupy the hydrophobic channel that the endogenous ligand of COX-1, arachidonic acid, also uses to access the active site. We also identify, for the first time, the pathway by which arachidonic acid binds to COX-1 and reveal the importance of a series of arginine residues in guiding the fatty acid. Lastly, we provide a mechanistic explanation for how COX-1 induces a positive curvature in the surrounding membrane. The combined results reveal a very complex interaction profile of COX-1 enzymes with lipids.

Finally, in chapter 7, I provide a review of the field of lipid-protein interactions that draws mostly on my own results from previous chapters and focuses on some of the main challenges moving forward. For instance, I note the importance of ensuring adequate sampling of lipid distributions, analyze the deficiencies in the terminology commonly used in the field, and note the need for more standardized protocols for measuring and identifying specific interactions with lipids.

My work during my PhD focuses on various aspects of lipid-protein interactions. All chapters presented here share a common theme in that sense, but the conclusions derived in each of the six chapters are different. Taken together, my work presented here improves our understanding of lipid-protein interactions.

8.2 Outlook

There are several open avenues for continuing or building upon the work presented here. One of the challenges that I discuss in chapter 7 is how to link simulation findings to their biological relevance. For instance, while the overall lipid-protein interaction profile of GPCRs is unique, it is unclear whether, and to what extent, that translates into functional importance for receptor activity. As such, the story of GPCR interactions with lipids is far from over, and future studies will likely focus on elucidating these aspects of their interactions.

The same is true for the results on COX-1 – lipid interactions presented in chapter 5. There, we discuss how experimental studies could be used to test several conclusions on arachidonic acid and glycerophospholipid binding to COX-1 enzymes. In general, one of the main paths for future research will be using experiments in combination with simulations. Concurrently, the increase in computer power and the introduction of MARTINI 3, along with efforts dedicated to building larger systems more easily and quickly, will allow for the simulation of entire cell organelles. In the near future, it will be possible to probe many mesoscopic properties of cell activity. Chapter 7 provides a more detailed overview of these and many other aspects of the outlook for lipid-protein interactions.

To summarize, based on our current understanding, we see the field of lipid-protein interactions developing along three different avenues. First, we are going to see an increase in focused effort on elucidating the finer details of lipid-protein interactions. This will include the study of conformational changes induced by different lipid environments and of how specific interactions with lipids modulate the activity of proteins. Second, we will start to see an increase in MD simulations that take a holistic approach to studying lipid-protein interactions. Large setups composed of hundreds of thousands, if not millions, of lipids will be accessible soon and will allow us to study how entire cellular structures work. Third, future research will be directed towards the creation of automated pipelines to study lipid-protein interactions. The number of proteins with solved structures is much smaller than the number of proteins with known sequences; similarly, the number of proteins with known lipid interaction profiles is much smaller than the number of proteins with solved structures. Automated protocols would significantly narrow this gap. Chapter 6 provides an initial step towards such a goal, and we are going to see many more efforts dedicated to this end in the near future.

Bibliography

Abraham M., D. Van Der Spoel, E. Lindahl, B. Hess. The GROMACS development team GROMACS user manual version 5.0.4. J Mol Model. 2014.

Abraham M., R. Apostolov, J. Barnoud, P. Bauer, C. Blau, A. M. Bonvin, M. Chavent, J. Chodera, K. Čondić-Jurkić, L. Delemotte. Sharing data from molecular simulations. J Chem Inf Model. 2019;59(10):4093-4099.

Abraham M. J., T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E. Lindahl. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1:19-25.

Allen M. P., D. J. Tildesley. Computer simulation of liquids: Oxford university press; 2017.

Arnarez C., J.-P. Mazat, J. Elezgaray, S.-J. Marrink, X. Periole. Evidence for cardiolipin binding sites on the membrane-exposed surface of the cytochrome bc1. J Am Chem Soc. 2013;135(8):3112-3120.

Ayachit U. The paraview guide: a parallel visualization application: Kitware, Inc.; 2015.

Balali-Mood K., P. J. Bond, M. S. Sansom. Interaction of monotopic membrane enzymes with a lipid bilayer: a coarse-grained MD simulation study. Biochemistry. 2009;48(10):2135-2145.

Barreto-Ojeda E., V. Corradi, R.-X. Gu, D. P. Tieleman. Coarse-grained molecular dynamics simulations reveal lipid access pathways in P-glycoprotein. Journal of General Physiology. 2018;150(3):417-429.

Berendsen H. J., J. P. Postma, W. F. van Gunsteren, J. Hermans. Interaction models for water in relation to protein hydration. Intermolecular forces: Springer; 1981. p. 331-342.

Berendsen H. J., J. v. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak. Molecular dynamics with coupling to an external bath. The Journal of chemical physics. 1984;81(8):3684-3690.

Blobaum A. L., L. J. Marnett. Structural and functional basis of cyclooxygenase inhibition. J Med Chem. 2007;50(7):1425-1441.

Blomberg L. M., M. R. Blomberg, P. E. Siegbahn, W. A. van der Donk, A.-L. Tsai. A quantum chemical study of the synthesis of prostaglandin g2 by the cyclooxygenase active site in prostaglandin endoperoxide h synthase 1. The Journal of Physical Chemistry B. 2003;107(14):3297-3308.

Bostock M., V. Ogievetsky, J. Heer. D³ data-driven documents. IEEE transactions on visualization and computer graphics. 2011;17(12):2301-2309.

Breslow D. K. Sphingolipid homeostasis in the endoplasmic reticulum and beyond. Cold Spring Harbor perspectives in biology. 2013;5(4):a013326.

Brooks B. R., C. L. Brooks III, A. D. Mackerell Jr, L. Nilsson, R. J. Petrella, B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch. CHARMM: the biomolecular simulation program. Journal of computational chemistry. 2009;30(10):1545-1614.

Burg J. S., J. R. Ingram, A. J. Venkatakrishnan, K. M. Jude, A. Dukkipati, E. N. Feinberg, A. Angelini, D. Waghray, R. O. Dror, H. L. Ploegh, K. C. Garcia. Structural basis for chemokine recognition and activation of a viral G protein-coupled receptor. Science. 2015;347(6226):1113-1117.

Bussi G., D. Donadio, M. Parrinello. Canonical sampling through velocity rescaling. The Journal of chemical physics. 2007;126(1):014101.

Byrne E. F. X., R. Sircar, P. S. Miller, G. Hedger, G. Luchetti, S. Nachtergaele, M. D. Tully, L. Mydock-McGrane, D. F. Covey, R. P. Rambo, M. S. P. Sansom, S. Newstead, R. Rohatgi, C. Siebold. Structural basis of Smoothened regulation by its extracellular domains. Nature. 2016;535(7613):517-522.

Cabello R. Three.js. URL: https://github.com/mrdoob/three.js. 2010.

Cang X. H., Y. Du, Y. Y. Mao, Y. Y. Wang, H. Y. Yang, H. L. Jiang. Mapping the Functional Binding Sites of Cholesterol in beta(2)-Adrenergic Receptor by Long-Time Molecular Dynamics Simulations. J Phys Chem B. 2013;117(4):1085-1094.

Casares D., P. V. Escribá, C. A. Rosselló. Membrane lipid composition: effect on membrane and organelle structure, function and compartmentalization and therapeutic avenues. International journal of molecular sciences. 2019;20(9):2167.

Charlier L., M. Louet, L. Chaloin, P. Fuchs, J. Martinez, D. Muriaux, C. Favard, N. Floquet. Coarse-grained simulations of the HIV-1 matrix protein anchoring: revisiting its assembly on membrane domains. Biophys J. 2014;106(3):577-585.

Chater T. E., Y. Goda. The role of AMPA receptors in postsynaptic mechanisms of synaptic plasticity. Frontiers in cellular neuroscience. 2014;8:401.

Chavent M., T. Reddy, J. Goose, A. C. E. Dahl, J. E. Stone, B. Jobard, M. S. Sansom. Methodologies for the analysis of instantaneous lipid diffusion in MD simulations of large membrane systems. Faraday Discuss. 2014;169:455-475.

Cheng R. K., C. Fiez-Vandal, O. Schlenker, K. Edman, B. Aggeler, D. G. Brown, G. A. Brown, R. M. Cooke, C. E. Dumelin, A. S. Doré. Structural insight into allosteric modulation of protease-activated receptor 2. Nature. 2017;545(7652):112.

Cheng X., J. C. Smith. Biological Membrane Organization and Cellular Signaling. Chem Rev. 2019;119(9):5849-5880.

Cherezov V., D. M. Rosenbaum, M. A. Hanson, S. G. F. Rasmussen, F. S. Thian, T. S. Kobilka, H. J. Choi, P. Kuhn, W. I. Weis, B. K. Kobilka, R. C. Stevens. High-resolution crystal structure of an engineered human beta(2)-adrenergic G protein-coupled receptor. Science. 2007;318(5854):1258-1265.

Chien E. Y., W. Liu, Q. Zhao, V. Katritch, G. W. Han, M. A. Hanson, L. Shi, A. H. Newman, J. A. Javitch, V. Cherezov. Structure of the human dopamine D3 receptor in complex with a D2/D3 selective antagonist. Science. 2010;330(6007):1091-1095.

Chodera J. D. A simple method for automated equilibration detection in molecular simulations. Journal of chemical theory and computation. 2016;12(4):1799-1805.

Choe H.-W., Y. J. Kim, J. H. Park, T. Morizumi, E. F. Pai, N. Krauss, K. P. Hofmann, P. Scheerer, O. P. Ernst. Crystal structure of metarhodopsin II. Nature. 2011;471(7340):651.

Chrencik J. E., C. B. Roth, M. Terakado, H. Kurata, R. Omi, Y. Kihara, D. Warshaviak, S. Nakade, G. Asmar-Rovira, M. Mileni. Crystal structure of antagonist bound human lysophosphatidic acid receptor 1. Cell. 2015;161(7):1633-1643.

Christov C. Z., A. Lodola, T. G. Karabencheva-Christova, S. Wan, P. V. Coveney, A. J. Mulholland. Conformational Effects on the pro-S Hydrogen Abstraction Reaction in Cyclooxygenase-1: An Integrated QM/MM and MD Study. Biophys J. 2013;104(5):L5-L7.

Cignoni P., M. Callieri, M. Corsini, M. Dellepiane, F. Ganovelli, G. Ranzuglia, editors. Meshlab: an open-source mesh processing tool. Eurographics Italian chapter conference; 2008.

Corey R. A., O. N. Vickery, M. S. Sansom, P. J. Stansfeld. Insights into Membrane Protein–Lipid Interactions from Free Energy Calculations. Journal of chemical theory and computation. 2019;15(10):5727-5736.

Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. ACS Cent Sci. 2018;4(6):709-717.

Corradi V., B. I. Sejdiu, H. Mesa-Galloso, H. Abdizadeh, S. Y. Noskov, S. J. Marrink, D. P. Tieleman. Emerging Diversity in Lipid–Protein Interactions. Chem Rev. 2019.

Darden T., D. York, L. Pedersen. Particle mesh Ewald: An N⋅ log (N) method for Ewald sums in large systems. The Journal of chemical physics. 1993;98(12):10089-10092.

Dawaliby R., C. Trubbia, C. Delporte, M. Masureel, P. Van Antwerpen, B. K. Kobilka, C. Govaerts. Allosteric regulation of G protein-coupled receptor activity by phospholipids. Nat Chem Biol. 2016;12(1):35-39.

De Castro E., C. J. Sigrist, A. Gattiker, V. Bulliard, P. S. Langendijk-Genevaux, E. Gasteiger, A. Bairoch, N. Hulo. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 2006;34(suppl_2):W362-W365.

de Jong D. H., G. Singh, W. D. Bennett, C. Arnarez, T. A. Wassenaar, L. V. Schäfer, X. Periole, D. P. Tieleman, S. J. Marrink. Improved parameters for the martini coarse-grained protein force field. Journal of Chemical Theory and Computation. 2012;9(1):687-697.

Dean L., L. Dean. Blood groups and red cell antigens: NCBI Bethesda, Md, USA; 2005.

Deol S. S., C. Domene, P. J. Bond, M. S. Sansom. Anionic phospholipid interactions with the potassium channel KcsA: simulation studies. Biophys J. 2006;90(3):822-830.

Dingledine R., K. Borges, D. Bowie, S. F. Traynelis. The glutamate receptor ion channels. Pharmacol Rev. 1999;51(1):7-62.

Dong L., A. J. Vecchio, N. P. Sharma, B. J. Jurban, M. G. Malkowski, W. L. Smith. Human cyclooxygenase-2 is a sequence homodimer that functions as a conformational heterodimer. J Biol Chem. 2011;286(21):19035-19046.

Dong L., C. Yuan, B. J. Orlando, M. G. Malkowski, W. L. Smith. Fatty acid binding to the allosteric subunit of cyclooxygenase-2 relieves a tonic inhibition of the catalytic subunit. J Biol Chem. 2016;291(49):25641-25655.

Dong L., H. Zou, C. Yuan, Y. H. Hong, D. V. Kuklev, W. L. Smith. Different fatty acids compete with arachidonic acid for binding to the allosteric or catalytic subunits of cyclooxygenases to regulate prostanoid synthesis. J Biol Chem. 2016;291(8):4069-4078.

Doré A. S., K. Okrasa, J. C. Patel, M. Serrano-Vega, K. Bennett, R. M. Cooke, J. C. Errey, A. Jazayeri, S. Khan, B. Tehan. Structure of class C GPCR metabotropic glutamate receptor 5 transmembrane domain. Nature. 2014;511(7511):557.

Dravid S. M., H. Yuan, S. Traynelis. AMPA Receptors: Molecular Biology and Pharmacology. Encyclopedia of Neuroscience: Elsevier Ltd; 2010. p. 311-318.

Duncan A. L., R. A. Corey, M. S. Sansom. Defining how multiple lipid species interact with inward rectifier potassium (Kir2) channels. Proceedings of the National Academy of Sciences. 2020.

Elinder F., S. I. Liin. Actions and mechanisms of polyunsaturated fatty acids on voltage-gated ion channels. Front Physiol. 2017;8:43.

Enkavi G., M. Javanainen, W. Kulig, T. Róg, I. Vattulainen. Multiscale Simulations of Biological Membranes: The Challenge To Understand Biological Phenomena in a Living Substance. Chem Rev. 2019.

Fagone P., S. Jackowski. Membrane phospholipid synthesis and endoplasmic reticulum function. J Lipid Res. 2009;50(Supplement):S311-S316.

Fan H. X., S. H. Chen, X. J. Yuan, S. Han, H. Zhang, W. L. Xia, Y. C. Xu, Q. Zhao, B. L. Wu. Structural basis for ligand recognition of the human thromboxane A(2) receptor. Nat Chem Biol. 2019;15(1):27-+.

Fantini J., F. J. Barrantes. How cholesterol interacts with membrane proteins: an exploration of cholesterol-binding sites including CRAC, CARC, and tilted domains. Front Physiol. 2013;4:9.

Fantini J., C. Di Scala, L. S. Evans, P. T. F. Williamson, F. J. Barrantes. A mirror code for protein-cholesterol interactions in the two leaflets of biological membranes. Sci Rep. 2016;6:14.

Feig M., G. Nawrocki, I. Yu, P.-h. Wang, Y. Sugita, editors. Challenges and opportunities in connecting simulations with experiments via molecular dynamics of cellular environments. Journal of Physics: Conference Series; 2018: IOP Publishing.

Flock T., A. S. Hauser, N. Lund, D. E. Gloriam, S. Balaji, M. M. Babu. Selectivity determinants of GPCR-G-protein binding. Nature. 2017;545(7654):317-+.

Flores J. A., B. G. Haddad, K. A. Dolan, J. B. Myers, C. C. Yoshioka, J. Copperman, D. M. Zuckerman, S. L. Reichow. Connexin-46/50 in a dynamic lipid environment resolved by CryoEM at 1.9 Å. bioRxiv. 2020.

Fowler P. W., K. Balali-Mood, S. Deol, P. V. Coveney, M. S. Sansom. Monotopic enzymes and lipid bilayers: a comparative study. Biochemistry. 2007;46(11):3108-3115.

Furse K. E., D. A. Pratt, N. A. Porter, T. P. Lybrand. Molecular dynamics simulations of arachidonic acid complexes with COX-1 and COX-2: insights into equilibrium behavior. Biochemistry. 2006;45(10):3189-3205.

Furse K. E., D. A. Pratt, C. Schneider, A. R. Brash, N. A. Porter, T. P. Lybrand. Molecular dynamics simulations of arachidonic acid-derived pentadienyl radical intermediate complexes with COX-1 and COX-2: insights into oxygenation regio- and stereoselectivity. Biochemistry. 2006;45(10):3206-3218.

Garavito R. M., M. G. Malkowski, D. L. DeWitt. The structures of prostaglandin endoperoxide H synthases-1 and-2. Prostaglandins & other lipid mediators. 2002;68:129-152.

Genheden S., J. W. Essex, A. G. Lee. G protein coupled receptor interactions with cholesterol deep in the membrane. Biochim Biophys Acta-Biomembr. 2017;1859(2):268-281.

Gimpl G. Interaction of G protein coupled receptors and cholesterol. Chem Phys Lipids. 2016;199:61-73.

Gowers R. J., M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, J. Domanski, D. L. Dotson, S. Buchoux, I. M. Kenney. MDAnalysis: a Python package for the rapid analysis of molecular dynamics simulations. Los Alamos National Lab.(LANL), Los Alamos, NM (United States); 2019. Report No.: 2575-9752.

Gu R.-X., H. I. Ingólfsson, A. H. de Vries, S. J. Marrink, D. P. Tieleman. Ganglioside-lipid and ganglioside-protein interactions revealed by coarse-grained and atomistic molecular dynamics simulations. The Journal of Physical Chemistry B. 2016;121(15):3262-3275.

Gu R.-X., S. Baoukina, D. P. Tieleman. Cholesterol Flip-Flop in Heterogeneous Membranes. Journal of chemical theory and computation. 2019;15(3):2064-2070.

Gu R.-X., S. Baoukina, D. P. Tieleman. Phase Separation in Atomistic Simulations of Model Membranes. J Am Chem Soc. 2020;142(6):2844-2856.

Gu R.-X., B. L. de Groot. Lipid-protein interactions modulate the conformational equilibrium of a potassium channel. Nat Commun. 2020;11(1):1-10.

Guixa-Gonzalez R., J. L. Albasanz, I. Rodriguez-Espigares, M. Pastor, F. Sanz, M. Marti-Solano, M. Manna, H. Martinez-Seara, P. W. Hildebrand, M. Martin, J. Selent. Membrane cholesterol access into a G-protein-coupled receptor. Nat Commun. 2017;8:14505.

Gupta K., B. S. Selinsky, C. J. Kaub, A. K. Katz, P. J. Loll. The 2.0 Å resolution crystal structure of prostaglandin H2 synthase-1: structural insights into an unusual peroxidase. J Mol Biol. 2004;335(2):503-518.

Gutierrez M. G., K. S. Mansfield, N. Malmstadt. The Functional Activity of the Human Serotonin 5-HT1A Receptor Is Controlled by Lipid Bilayer Composition. Biophys J. 2016;110(11):2486-2495.

Haga K., A. C. Kruse, H. Asada, T. Yurugi-Kobayashi, M. Shiroishi, C. Zhang, W. I. Weis, T. Okada, B. K. Kobilka, T. Haga. Structure of the human M2 muscarinic acetylcholine receptor bound to an antagonist. Nature. 2012;482(7386):547.

Hanson M. A., V. Cherezov, M. T. Griffith, C. B. Roth, V. P. Jaakola, E. Y. T. Chien, J. Velasquez, P. Kuhn, R. C. Stevens. A specific cholesterol binding site is established by the 2.8 angstrom structure of the human beta(2)-adrenergic receptor. Structure. 2008;16(6):897-905.

Hanson M. A., C. B. Roth, E. Jo, M. T. Griffith, F. L. Scott, G. Reinhart, H. Desale, B. Clemons, S. M. Cahalan, S. C. Schuerer. Crystal structure of a lipid G protein–coupled receptor. Science. 2012;335(6070):851-855.

Hanson R. M., J. Prilusky, Z. Renjian, T. Nakane, J. L. Sussman. JSmol and the next-generation web-based representation of 3D molecular structure as applied to proteopedia. Israel Journal of Chemistry. 2013;53(3-4):207-216.

Harder T., K. Simons. Caveolae, DIGs, and the dynamics of sphingolipid-cholesterol microdomains. Curr Opin Cell Biol. 1997;9(4):534-542.

Hassel B., R. Dingledine. Glutamate and glutamate receptors. Basic Neurochemistry: Elsevier; 2012. p. 342-366.

Hedger G., M. S. Sansom, H. Koldsø. The juxtamembrane regions of human receptor tyrosine kinases exhibit conserved interaction sites with anionic lipids. Sci Rep. 2015;5:9198.

Hedger G., H. Koldso, M. Chavent, C. Siebold, R. Rohatgi, M. S. P. Sansom. Cholesterol Interaction Sites on the Transmembrane Domain of the Hedgehog Signal Transducer and Class F G Protein-Coupled Receptor Smoothened. Structure. 2018.

Hess B., H. Bekker, H. J. Berendsen, J. G. Fraaije. LINCS: a linear constraint solver for molecular simulations. Journal of computational chemistry. 1997;18(12):1463-1472.

Hess B., C. Kutzner, D. Van Der Spoel, E. Lindahl. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. Journal of chemical theory and computation. 2008;4(3):435-447.

Hoogerheide D. P., S. Y. Noskov, D. Jacobs, L. Bergdoll, V. Silin, D. L. Worcester, J. Abramson, H. Nanda, T. K. Rostovtseva, S. M. Bezrukov. Structural features and lipid binding domain of tubulin on biomimetic mitochondrial membranes. Proceedings of the National Academy of Sciences. 2017;114(18):E3622-E3631.

Hoover W. G. Canonical dynamics: Equilibrium phase-space distributions. Physical review A. 1985;31(3):1695.

Horn J. N., T. C. Kao, A. Grossfield. Coarse-Grained Molecular Dynamics Provides Insight into the Interactions of Lipids and Cholesterol with Rhodopsin. In: Filizola M, editor. G Protein-Coupled Receptors - Modeling and Simulation. Advances in Experimental Medicine and Biology. 796. Berlin: Springer-Verlag Berlin; 2014. p. 75-94.

Hua T., K. Vemuri, M. Pu, L. Qu, G. W. Han, Y. Wu, S. Zhao, W. Shui, S. Li, A. Korde. Crystal structure of the human cannabinoid receptor CB1. Cell. 2016;167(3):750-762.e714.

Huang J., S. Rauscher, G. Nawrocki, T. Ran, M. Feig, B. L. de Groot, H. Grubmüller, A. D. MacKerell. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nature methods. 2017;14(1):71-73.

Huang W. J., A. Manglik, A. J. Venkatakrishnan, T. Laeremans, E. N. Feinberg, A. L. Sanborn, H. E. Kato, K. E. Livingston, T. S. Thorsen, R. C. Kling, S. Granier, P. Gmeiner, S. M. Husbands, J. R. Traynor, W. I. Weis, J. Steyaert, R. O. Dror, B. K. Kobilka. Structural insights into mu-opioid receptor activation. Nature. 2015;524(7565):315-+.

Humphrey W., A. Dalke, K. Schulten. VMD: visual molecular dynamics. Journal of molecular graphics. 1996;14(1):33-38.

Hunter J. D. Matplotlib: A 2D graphics environment. Computing in science & engineering. 2007;9(3):90.

Hurst D. P., A. Grossfield, D. L. Lynch, S. Feller, T. D. Romo, K. Gawrisch, M. C. Pitman, P. H. Reggio. A Lipid Pathway for Ligand Binding Is Necessary for a Cannabinoid G Protein-coupled Receptor. J Biol Chem. 2010;285(23):17954-17964.

Inada M., M. Kinoshita, A. Sumino, S. Oiki, N. Matsumori. A concise method for quantitative analysis of interactions between lipids and membrane proteins. Analytica chimica acta. 2019;1059:103-112.

Ingólfsson H. I., M. N. Melo, F. J. Van Eerden, C. Arnarez, C. A. Lopez, T. A. Wassenaar, X. Periole, A. H. De Vries, D. P. Tieleman, S. J. Marrink. Lipid organization of the plasma membrane. J Am Chem Soc. 2014;136(41):14554-14559.

Isberg V., S. Mordalski, C. Munk, K. Rataj, K. Harpsoe, A. S. Hauser, B. Vroling, A. J. Bojarski, G. Vriend, D. E. Gloriam. GPCRdb: an information system for G protein-coupled receptors. Nucleic Acids Res. 2016;44(D1):D356-D364.

Jaakola V. P., M. T. Griffith, M. A. Hanson, V. Cherezov, E. Y. T. Chien, J. R. Lane, A. P. Ijzerman, R. C. Stevens. The 2.6 Angstrom Crystal Structure of a Human A(2A) Adenosine Receptor Bound to an Antagonist. Science. 2008;322(5905):1211-1217.

Jafurulla M., S. Tiwari, A. Chattopadhyay. Identification of cholesterol recognition amino acid consensus (CRAC) motif in G-protein coupled receptors. Biochem Biophys Res Commun. 2011;404(1):569-573.

Jianyi Y., Z. Yang. GPCR-EXP: a database for experimentally solved GPCR structures.

Jo S., T. Kim, V. G. Iyer, W. Im. CHARMM‐GUI: a web‐based graphical user interface for CHARMM. Journal of computational chemistry. 2008;29(11):1859-1865.

Jones E., T. Oliphant, P. Peterson. SciPy: Open source scientific tools for Python. 2014.

Jorgensen W. L., J. Chandrasekhar, J. D. Madura, R. W. Impey, M. L. Klein. Comparison of simple potential functions for simulating liquid water. The Journal of chemical physics. 1983;79(2):926-935.

Kovesi P. Good colour maps: How to design them. arXiv preprint arXiv:1509.03700. 2015.

Kruse A. C., A. M. Ring, A. Manglik, J. Hu, K. Hu, K. Eitel, H. Hübner, E. Pardon, C. Valant, P. M. Sexton. Activation and allosteric modulation of a muscarinic acetylcholine receptor. Nature. 2013;504(7478):101.

Kuo T.-C., Y. J. Tseng. LipidPedia: a comprehensive lipid knowledgebase. Bioinformatics. 2018;34(17):2982-2987.

Leach A. R., A. R. Leach. Molecular modelling: principles and applications: Pearson education; 2001.

Lebon G., T. Warne, P. C. Edwards, K. Bennett, C. J. Langmead, A. G. Leslie, C. G. Tate. Agonist-bound adenosine A2A receptor structures reveal common features of GPCR activation. Nature. 2011;474(7352):521.

Lee A. G. A database of predicted binding sites for cholesterol on membrane proteins, deep in the membrane. Biophys J. 2018;115(3):522-532.

Lee A. G. Interfacial Binding Sites for Cholesterol on G Protein-Coupled Receptors. Biophys J. 2019;116(9):1586-1597.

Lewars E. Computational chemistry. Introduction to the theory and applications of molecular and quantum mechanics. 2003:318.

Li J., P. C. Edwards, M. Burghammer, C. Villa, G. F. Schertler. Structure of bovine rhodopsin in a trigonal crystal form. J Mol Biol. 2004;343(5):1409-1438.

Liang Y. L., M. Khoshouei, M. Radjainia, Y. Zhang, A. Glukhova, J. Tarrasch, D. M. Thal, S. G. B. Furness, G. Christopoulos, T. Coudrat, R. Danev, W. Baumeister, L. J. Miller, A. Christopoulos, B. K. Kobilka, D. Wootten, G. Skiniotis, P. M. Sexton. Phase-plate cryo-EM structure of a class B GPCR-G-protein complex. Nature. 2017;546(7656):118-+.

Lomize A. L., I. D. Pogozheva, M. A. Lomize, H. I. Mosberg. The role of hydrophobic interactions in positioning of peripheral proteins in membranes. BMC Struct Biol. 2007;7(1):44.

Luong C., A. Miller, J. Barnett, J. Chow, C. Ramesha, M. F. Browner. Flexibility of the NSAID binding site in the structure of human cyclooxygenase-2. Nature structural biology. 1996;3(11):927-933.

Ma Y., Y. Yue, Y. Ma, Q. Zhang, Q. Zhou, Y. Song, Y. Shen, X. Li, X. Ma, C. Li. Structural basis for apelin control of the human apelin receptor. Structure. 2017;25(6):858-866.e854.

Manglik A., A. C. Kruse, T. S. Kobilka, F. S. Thian, J. M. Mathiesen, R. K. Sunahara, L. Pardo, W. I. Weis, B. K. Kobilka, S. Granier. Crystal structure of the mu-opioid receptor bound to a morphinan antagonist. Nature. 2012;485(7398):321-U170.

Manglik A., A. C. Kruse. Structural Basis for G Protein-Coupled Receptor Activation. Biochemistry. 2017;56(42):5628-5634.

Manna M., W. Kulig, M. Javanainen, J. Tynkkynen, U. Hensen, D. J. Müller, T. Rog, I. Vattulainen. How to minimize artifacts in atomistic simulations of membrane proteins, whose crystal structure is heavily engineered: β2-adrenergic receptor in the spotlight. Journal of chemical theory and computation. 2015;11(7):3432-3445.

Manna M., M. Niemela, J. Tynkkynen, M. Javanainen, W. Kulig, D. J. Muller, T. Rog, I. Vattulainen. Mechanism of allosteric regulation of beta2-adrenergic receptor by cholesterol. eLife. 2016;5.

Marino K. A., D. Prada-Gracia, D. Provasi, M. Filizola. Impact of Lipid Composition and Receptor Conformation on the Spatio-temporal Organization of mu-Opioid Receptors in a Multi-component Plasma Membrane Model. PLoS Comput Biol. 2016;12(12):e1005240.

Marrink S. J., A. H. De Vries, A. E. Mark. Coarse grained model for semiquantitative lipid simulations. The Journal of Physical Chemistry B. 2004;108(2):750-760.

Marrink S. J., H. J. Risselada, S. Yefimov, D. P. Tieleman, A. H. de Vries. The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations. The Journal of Physical Chemistry B. 2007;111(27):7812-7824.

Marrink S. J., D. P. Tieleman. Perspective on the Martini model. Chem Soc Rev. 2013;42(16):6801-6822.

Marrink S. J., V. Corradi, P. C. Souza, H. I. Ingólfsson, D. P. Tieleman, M. S. Sansom. Computational modeling of realistic cell membranes. Chem Rev. 2019;119(9):6184-6226.

McGibbon R. T., K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane, V. S. Pande. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J. 2015;109(8):1528-1532.

McKiernan E. C., P. E. Bourne, C. T. Brown, S. Buck, A. Kenall, J. Lin, D. McDougall, B. A. Nosek, K. Ram, C. K. Soderberg. Point of view: How open science helps researchers succeed. eLife. 2016;5:e16800.

Mlinarić A., M. Horvat, V. Šupak Smolčić. Dealing with the positive publication bias: Why you should really publish your negative results. Biochemia medica. 2017;27(3):447-452.

Mondal S., G. Khelashvili, H. Weinstein. Not just an oil slick: how the energetics of protein-membrane interactions impacts the function and organization of transmembrane proteins. Biophys J. 2014;106(11):2305-2316.

Morra G., A. M. Razavi, K. Pandey, H. Weinstein, A. K. Menon, G. Khelashvili. Mechanisms of Lipid Scrambling by the G Protein-Coupled Receptor Opsin. Structure. 2018;26(2):356-+.

Muller M. P., T. Jiang, C. Sun, M. Lihan, S. Pant, P. Mahinthichaichan, A. Trifan, E. Tajkhorshid. Characterization of Lipid–Protein Interactions and Lipid-Mediated Modulation of Membrane Protein Function through Molecular Simulation. Chem Rev. 2019.

Newport T. D., M. S. P. Sansom, P. J. Stansfeld. The MemProtMD database: a resource for membrane-embedded protein structures and their lipid interactions. Nucleic Acids Res. 2019;47(D1):D390-D397.

Nilius B., E. Honoré. Sensing pressure with ion channels. Trends in neurosciences. 2012;35(8):477-486.

Nimpf S., D. A. Keays. Why (and how) we should publish negative data. EMBO reports. 2020;21(1).

Nina M., S. Bernèche, B. Roux. Anchoring of a monotopic membrane protein: the binding of prostaglandin H2 synthase-1 to the surface of a phospholipid bilayer. European Biophysics Journal. 2000;29(6):439-454.

Nosé S. A molecular dynamics method for simulations in the canonical ensemble. Molecular physics. 1984;52(2):255-268.

Okada T., M. Sugihara, A. N. Bondar, M. Elstner, P. Entel, V. Buss. The retinal conformation and its environment in rhodopsin in light of a new 2.2 A crystal structure. J Mol Biol. 2004;342(2):571-583.

Orlando B. J., D. R. McDougle, M. J. Lucido, E. T. Eng, L. A. Graham, C. Schneider, D. L. Stokes, A. Das, M. G. Malkowski. Cyclooxygenase-2 catalysis and inhibition in lipid bilayer nanodiscs. Archives of biochemistry and biophysics. 2014;546:33-40.

Otto J. C., W. L. Smith. The orientation of prostaglandin endoperoxide synthases-1 and-2 in the endoplasmic reticulum. J Biol Chem. 1994;269(31):19868-19875.

Papasergi-Scott M. M., M. J. Robertson, A. B. Seven, O. Panova, J. M. Mathiesen, G. Skiniotis. Structures of metabotropic GABAB receptor. bioRxiv. 2020.

Park J. H., P. Scheerer, K. P. Hofmann, H. W. Choe, O. P. Ernst. Crystal structure of the ligand-free G-protein-coupled receptor opsin. Nature. 2008;454(7201):183-U133.

Park S. H., B. B. Das, F. Casagrande, Y. Tian, H. J. Nothnagel, M. Chu, H. Kiefer, K. Maier, A. A. De Angelis, F. M. Marassi. Structure of the chemokine receptor CXCR1 in phospholipid bilayers. Nature. 2012;491(7426):779.

Parrill A. L., G. Tigyi. Integrating the puzzle pieces: the current atomistic picture of phospholipid-G protein coupled receptor interactions. Biochim Biophys Acta. 2013;1831(1):2-12.

Parrinello M., A. Rahman. Polymorphic transitions in single crystals: A new molecular dynamics method. Journal of Applied physics. 1981;52(12):7182-7190.

Patrick J. W., C. D. Boone, W. Liu, G. M. Conover, Y. Liu, X. Cong, A. Laganowsky. Allostery revealed within lipid binding events to membrane proteins. Proceedings of the National Academy of Sciences. 2018;115(12):2976-2981.

Pearlman D. A., D. A. Case, J. W. Caldwell, W. S. Ross, T. E. Cheatham III, S. DeBolt, D. Ferguson, G. Seibel, P. Kollman. AMBER, a package of computer programs for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to simulate the structural and energetic properties of molecules. Computer Physics Communications. 1995;91(1-3):1-41.

Pliotas C., A. C. E. Dahl, T. Rasmussen, K. R. Mahendran, T. K. Smith, P. Marius, J. Gault, T. Banda, A. Rasmussen, S. Miller, C. V. Robinson, H. Bayley, M. S. P. Sansom, I. R. Booth, J. H. Naismith. The role of lipids in mechanosensation. Nat Struct Mol Biol. 2015;22(12):991-998.

Pluhackova K., S. Gahbauer, F. Kranz, T. A. Wassenaar, R. A. Bockmann. Dynamic Cholesterol-Conditioned Dimerization of the G Protein Coupled Chemokine Receptor Type 4. PLoS Comput Biol. 2016;12(11):e1005169.

Rao P., E. E. Knaus. Evolution of nonsteroidal anti-inflammatory drugs (NSAIDs): cyclooxygenase (COX) inhibition and beyond. Journal of Pharmacy & Pharmaceutical Sciences. 2008;11(2):81-110s.

Rasmussen S. G., B. T. DeVree, Y. Zou, A. C. Kruse, K. Y. Chung, T. S. Kobilka, F. S. Thian, P. S. Chae, E. Pardon, D. Calinski, J. M. Mathiesen, S. T. Shah, J. A. Lyons, M. Caffrey, S. H. Gellman, J. Steyaert, G. Skiniotis, W. I. Weis, R. K. Sunahara, B. K. Kobilka. Crystal structure of the beta2 adrenergic receptor-Gs protein complex. Nature. 2011;477(7366):549-555.

Rieke C. J., A. M. Mulichak, R. M. Garavito, W. L. Smith. The role of arginine 120 of human prostaglandin endoperoxide H synthase-2 in the interaction with fatty acid substrates and inhibitors. J Biol Chem. 1999;274(24):17109-17114.

Rose A. S., P. W. Hildebrand. NGL Viewer: a web application for molecular visualization. Nucleic Acids Res. 2015;43(W1):W576-W579.

Rose A. S., A. R. Bradley, Y. Valasatava, J. M. Duarte, A. Prlić, P. W. Rose. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics. 2018;34(21):3755-3758.

Rouviere E., C. Arnarez, L. W. Yang, E. Lyman. Identification of Two New Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. Biophys J. 2017;113(11):2415-2424.

Rouzer C. A., L. J. Marnett. Cyclooxygenases: structural and functional insights. J Lipid Res. 2009;50(Supplement):S29-S34.

Rouzer C. A., L. J. Marnett. Endocannabinoid oxygenation by cyclooxygenases, lipoxygenases, and cytochromes P450: cross-talk between the eicosanoid and endocannabinoid signaling pathways. Chem Rev. 2011;111(10):5899-5921.

Salas-Estrada L. A., N. Leioatts, T. D. Romo, A. Grossfield. Lipids Alter Rhodopsin Function via Ligand-like and Solvent-like Interactions. Biophys J. 2018;114(2):355-367.

Schmid N., A. P. Eichenberger, A. Choutko, S. Riniker, M. Winger, A. E. Mark, W. F. van Gunsteren. Definition and testing of the GROMOS force-field versions 54A7 and 54B7. European biophysics journal. 2011;40(7):843.

Schmidt T. H., Y. Homsi, T. Lang. Oligomerization of the Tetraspanin CD81 via the Flexibility of Its δ-Loop. Biophys J. 2016;110(11):2463-2474.

Schneider C., A. Pozzi. Cyclooxygenases and lipoxygenases in cancer. Cancer and Metastasis Reviews. 2011;30(3-4):277-294.

Seabold S., J. Perktold, editors. Statsmodels: Econometric and statistical modeling with python. Proceedings of the 9th Python in Science Conference; 2010: Scipy.

Segala E., D. Guo, R. K. Y. Cheng, A. Bortolato, F. Deflorian, A. S. Dore, J. C. Errey, L. H. Heitman, A. P. Ijzerman, F. H. Marshall, R. M. Cooke. Controlling the Dissociation of Ligands from the Adenosine A(2A) Receptor through Modulation of Salt Bridge Strength. J Med Chem. 2016;59(13):6470-6479.

Sehnal D., M. Deshpande, R. S. Vařeková, S. Mir, K. Berka, A. Midlik, L. Pravda, S. Velankar, J. Koča. LiteMol suite: interactive web-based visualization of large-scale macromolecular structure data. Nature methods. 2017;14(12):1121.

Sejdiu B. I., D. P. Tieleman. Lipid-Protein Interactions Are a Unique Property and Defining Feature of G Protein-Coupled Receptors. Biophys J. 2020;118(8):1887-1900.

Sejdiu B. I., D. P. Tieleman. COX-1 - lipid interactions: arachidonic acid, cholesterol, and phospholipid binding to the membrane binding domain of COX-1. bioRxiv. 2020.

Sengupta D., A. Chattopadhyay. Identification of cholesterol binding sites in the serotonin1A receptor. J Phys Chem B. 2012;116(43):12991-12996.

Sengupta D., A. Chattopadhyay. Molecular dynamics simulations of GPCR-cholesterol interaction: An emerging paradigm. Biochim Biophys Acta. 2015;1848(9):1775-1782.

Sengupta D., X. Prasanna, M. Mohole, A. Chattopadhyay. Exploring GPCR–Lipid Interactions by Molecular Dynamics Simulations: Excitements, Challenges, and the Way Forward. The Journal of Physical Chemistry B. 2018;122(22):5727-5737.

Shan J., G. Khelashvili, S. Mondal, E. L. Mehler, H. Weinstein. Ligand-dependent conformations and dynamics of the serotonin 5-HT(2A) receptor determine its activation and membrane-driven oligomerization properties. PLoS Comput Biol. 2012;8(4):e1002473.

Shihoya W., T. Nishizawa, K. Yamashita, A. Inoue, K. Hirata, F. M. N. Kadji, A. Okuta, K. Tani, J. Aoki, Y. Fujiyoshi, T. Doi, O. Nureki. X-ray structures of endothelin ETB receptor bound to clinical antagonist bosentan and its analog. Nat Struct Mol Biol. 2017;24(9):758-+.

Shimamura T., M. Shiroishi, S. Weyand, H. Tsujimoto, G. Winter, V. Katritch, R. Abagyan, V. Cherezov, W. Liu, G. W. Han. Structure of the human histamine H1 receptor complex with doxepin. Nature. 2011;475(7354):65.

Shinohara H., M. a. A. Balboa, C. A. Johnson, J. Balsinde, E. A. Dennis. Regulation of delayed prostaglandin production in activated P388D1 macrophages by group IV cytosolic and group V secretory phospholipase A2s. J Biol Chem. 1999;274(18):12263-12268.

Simons K., R. Ehehalt. Cholesterol, lipid rafts, and disease. The Journal of clinical investigation. 2002;110(5):597-603.

Singer S. J., G. L. Nicolson. The fluid mosaic model of the structure of cell membranes. Science. 1972;175(4023):720-731.

Siu F. Y., M. He, C. De Graaf, G. W. Han, D. Yang, Z. Zhang, C. Zhou, Q. Xu, D. Wacker, J. S. Joseph. Structure of the human glucagon class B G-protein-coupled receptor. Nature. 2013;499(7459):444.

Smith W. L., D. L. DeWitt, R. M. Garavito. Cyclooxygenases: structural, cellular, and molecular biology. Annual review of biochemistry. 2000;69(1):145-182.

Smith W. L., Y. Urade, P.-J. Jakobsson. Enzymes of the cyclooxygenase pathways of prostanoid biosynthesis. Chem Rev. 2011;111(10):5821-5865.

Smith W. L., M. G. Malkowski. Interactions of fatty acids, nonsteroidal anti-inflammatory drugs, and coxibs with the catalytic and allosteric subunits of cyclooxygenases-1 and-2. J Biol Chem. 2019;294(5):1697-1705.

Sobolevsky A. I., M. P. Rosconi, E. Gouaux. X-ray structure, symmetry and mechanism of an AMPA-subtype glutamate receptor. Nature. 2009;462(7274):745-756.

Song G., D. Yang, Y. Wang, C. de Graaf, Q. Zhou, S. Jiang, K. Liu, X. Cai, A. Dai, G. Lin. Human GLP-1 receptor transmembrane domain structure in complex with allosteric modulators. Nature. 2017;546(7657):312.

Song W., H. Y. Yen, C. V. Robinson, M. S. P. Sansom. State-dependent Lipid Interactions with the A2a Receptor Revealed by MD Simulations Using In Vivo-Mimetic Membranes. Structure. 2019;27(2):392-403. e393.

Song Y., A. K. Kenworthy, C. R. Sanders. Cholesterol as a co-solvent and a ligand for membrane proteins. Protein Sci. 2014;23(1):1-22.

Soubias O., K. Gawrisch. The role of the lipid matrix for structure and function of the GPCR rhodopsin. Biochim Biophys Acta-Biomembr. 2012;1818(2):234-240.

Soubias O., W. E. Teague, K. G. Hines, K. Gawrisch. Rhodopsin/Lipid Hydrophobic Matching-Rhodopsin Oligomerization and Function. Biophys J. 2015;108(5):1125-1132.

Spencer A. G., J. W. Woods, T. Arakawa, I. I. Singer, W. L. Smith. Subcellular localization of prostaglandin endoperoxide H synthases-1 and-2 by immunoelectron microscopy. J Biol Chem. 1998;273(16):9886-9893.

Sullivan C., A. Kaszynski. PyVista: 3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK). Journal of Open Source Software. 2019;4(37):1450.

Suno R., K. T. Kimura, T. Nakane, K. Yamashita, J. Wang, T. Fujiwara, Y. Yamanaka, D. Im, S. Horita, H. Tsujimoto. Crystal structures of human orexin 2 receptor bound to the subtype-selective antagonist EMPA. Structure. 2018;26(1):7-19. e15.

Swainsbury D. J., S. Scheidelaar, R. Van Grondelle, J. A. Killian, M. R. Jones. Bacterial reaction centers purified with styrene maleic acid copolymer retain native membrane functional properties and display enhanced stability. Angewandte Chemie International Edition. 2014;53(44):11803-11807.

Teague W. E., O. Soubias, H. Petrache, N. Fuller, K. G. Hines, R. P. Rand, K. Gawrisch. Elastic properties of polyunsaturated phosphatidylethanolamines influence rhodopsin function. Faraday Discuss. 2013;161:383-395.

Team B. D. Bokeh: Python library for interactive visualization. Bokeh Development Team Wichita, KS; 2014.

Thibault J. C., D. R. Roe, J. C. Facelli, T. E. Cheatham. Data model, dictionaries, and desiderata for biomolecular simulation data indexing and sharing. Journal of cheminformatics. 2014;6(1):4.

Tidhar R., A. H. Futerman. The complexity of sphingolipid biosynthesis in the endoplasmic reticulum. Biochimica Et Biophysica Acta (BBA)-Molecular Cell Research. 2013;1833(11):2511-2518.

Tironi I. G., R. Sperb, P. E. Smith, W. F. van Gunsteren. A generalized reaction field method for molecular dynamics simulations. The Journal of chemical physics. 1995;102(13):5451-5459.

Traynelis S. F., L. P. Wollmuth, C. J. McBain, F. S. Menniti, K. M. Vance, K. K. Ogden, K. B. Hansen, H. Yuan, S. J. Myers, R. Dingledine. Glutamate receptor ion channels: structure, regulation, and function. Pharmacol Rev. 2010;62(3):405-496.

Ulmschneider M. B., C. Bagnéris, E. C. McCusker, P. G. DeCaen, M. Delling, D. E. Clapham, J. P. Ulmschneider, B. A. Wallace. Molecular dynamics of ion transport through the open conformation of a bacterial voltage-gated sodium channel. Proceedings of the National Academy of Sciences. 2013;110(16):6364-6369.

Van Meer G., D. R. Voelker, G. W. Feigenson. Membrane lipids: where they are and how they behave. Nat Rev Mol Cell Biol. 2008;9(2):112-124.

van Meer G., A. I. de Kroon. Lipid map of the mammalian cell. Journal of cell science. 2011;124(1):5-8.

Vane J., Y. Bakhle, R. Botting. CYCLOOXYGENASES 1 AND 2. Annual review of pharmacology and toxicology. 1998;38(1):97-120.

Vecchio A. J., B. J. Orlando, R. Nandagiri, M. G. Malkowski. Investigating Substrate Promiscuity in Cyclooxygenase-2 THE ROLE OF ARG-120 AND RESIDUES LINING THE HYDROPHOBIC GROOVE. J Biol Chem. 2012;287(29):24619-24630.

Wallin E., G. V. Heijne. Genome‐wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci. 1998;7(4):1029-1038.

Wan S., P. V. Coveney. A comparative study of the COX‐1 and COX‐2 isozymes bound to lipid membranes. Journal of computational chemistry. 2009;30(7):1038-1050.

Wang C., Y. Jiang, J. Ma, H. Wu, D. Wacker, V. Katritch, G. W. Han, W. Liu, X.-P. Huang, E. Vardy. Structural basis for molecular recognition at serotonin receptors. Science. 2013;340(6132):610-614.

Wang C., H. Wu, T. Evron, E. Vardy, G. W. Han, X.-P. Huang, S. J. Hufeisen, T. J. Mangano, D. J. Urban, V. Katritch. Structural basis for Smoothened receptor modulation and chemoresistance to anticancer drugs. Nat Commun. 2014;5:4355.

Wang M.-T., K. V. Honn, D. Nie. Cyclooxygenases, prostanoids, and tumor progression. Cancer and Metastasis Reviews. 2007;26(3-4):525.

Waskom M., O. Botvinnik, P. Hobson, J. Warmenhoven, J. Cole, Y. Halchenko, J. Vanderplas, S. Hoyer, S. Villalba, E. Quintero. Seaborn: statistical data visualization. Seaborn: Statistical Data Visualization Seaborn 0. 2014;5.

Wassenaar T. A., H. I. Ingólfsson, R. A. Böckmann, D. P. Tieleman, S. J. Marrink. Computational lipidomics with insane: a versatile tool for generating custom membranes for molecular simulations. Journal of chemical theory and computation. 2015;11(5):2144-2155.

Weingarth M., A. Prokofyev, E. A. van der Cruijsen, D. Nand, A. M. Bonvin, O. Pongs, M. Baldus. Structural determinants of specific lipid binding to potassium channels. J Am Chem Soc. 2013;135(10):3983-3988.

Wright A. L., B. Vissel. The essential role of AMPA receptor GluR2 subunit RNA editing in the normal and diseased brain. Frontiers in molecular neuroscience. 2012;5:34.

Wu E. L., X. Cheng, S. Jo, H. Rui, K. C. Song, E. M. Dávila‐Contreras, Y. Qi, J. Lee, V. Monje‐Galvan, R. M. Venable. CHARMM‐GUI membrane builder toward realistic biological membrane simulations. Journal of computational chemistry. 2014;35(27):1997-2004.

Wu H. X., C. Wang, K. J. Gregory, G. W. Han, H. P. Cho, Y. Xia, C. M. Niswender, V. Katritch, J. Meiler, V. Cherezov, P. J. Conn, R. C. Stevens. Structure of a Class C GPCR Metabotropic Glutamate Receptor 1 Bound to an Allosteric Modulator. Science. 2014;344(6179):58-64.

Yazdi S., M. Stein, F. Elinder, M. Andersson, E. Lindahl. The molecular basis of polyunsaturated fatty acid interactions with the shaker voltage-gated potassium channel. PLoS Comput Biol. 2016;12(1).

Yen H. Y., K. K. Hoi, I. Liko, G. Hedger, M. R. Horrell, W. L. Song, D. Wu, P. Heine, T. Warne, Y. Lee, B. Carpenter, A. Pluckthun, C. G. Tate, M. S. P. Sansom, C. V. Robinson. PtdIns(4,5)P-2 stabilizes active states of GPCRs and enhances selectivity of G-protein coupling. Nature. 2018;559(7714):424-+.

Yuan C., R. S. Sidhu, D. V. Kuklev, Y. Kado, M. Wada, I. Song, W. L. Smith. Cyclooxygenase allosterism, fatty acid-mediated cross-talk between monomers of cyclooxygenase homodimers. J Biol Chem. 2009;284(15):10046-10055.

Zhang H., G. W. Han, A. Batyuk, A. Ishchenko, K. L. White, N. Patel, A. Sadybekov, B. Zamlynny, M. T. Rudd, K. Hollenstein. Structural basis for selectivity and diversity in angiotensin II receptors. Nature. 2017;544(7650):327.

Zhang K. H., J. Zhang, Z. G. Gao, D. D. Zhang, L. Zhu, G. W. Han, S. M. Moss, S. Paoletta, E. Kiselev, W. Z. Lu, G. Fenalti, W. R. Zhang, C. E. Muller, H. Y. Yang, H. L. Jiang, V. Cherezov, V. Katritch, K. A. Jacobson, R. C. Stevens, B. L. Wu, et al. Structure of the human P2Y(12) receptor in complex with an antithrombotic drug. Nature. 2014;509(7498):115-118.

Appendix A: Supplementary Data for Chapter 4

Analysis. Given the complexity of the setup, we evaluated lipid-protein interactions in two complementary ways: by grouping lipids according to a shared property, and by analysing lipid types individually. For the former, we use a combination of 2D map profiles; for the latter, we calculate average contact heatmaps and distances between lipids and proteins.

We group lipids based on their headgroup type and tail saturation level. With the former grouping we define PC, PE, PS, PA, DAG, LPC, SM, CER, PI, PIPs, and GM lipids. The latter allows us to group lipids into fully-saturated (FS), poly-unsaturated (PU), cholesterol, and Other (containing lipids that are excluded from the previous three categories). PU lipids are lipids that contain more than two type “D” beads and consist of DAPC, DUPE, DAPE, DAPS, DUPS, APC, and UPC lipids. FS lipids include SM lipids, glycolipids, ceramides, and LPC lipids.

Thickness, Curvature, and Lipid Composition. We use the same in-house tools as in our previous work(1) to calculate the thickness, curvature, and lipid density profiles. The only difference is that, since we are dealing only with GPCRs, we orient all structures in the same way to allow direct comparison of results: H8 faces downwards, and helices TM1-TM7 run counter-clockwise from right to left.

Depletion-enrichment index and Equilibration Tests. For a multicomponent bilayer system with an assumed homogeneous distribution, the following relation is true:

$$[A]_{\mathrm{local}} = [A]_{\mathrm{bulk}}$$

where $[A]$ is the molar concentration of lipid A. Its local value is calculated within a cutoff distance from embedded proteins (GPCRs). There is a limit to how small this cutoff value can be, and we limit ourselves to 5 Å from the protein. Rearranging the above relation, we obtain the following equation:

$$DE(A|GPCR) = \frac{[A]_{\mathrm{local}}}{[A]_{\mathrm{bulk}}} = 1$$

That is, for a membrane system with a homogeneous lipid distribution, the expected depletion-enrichment index (or simply the enrichment value),(2) $DE(A|GPCR)$, is unity.

With the DE index we aim to reduce what is a three-dimensional problem to a single number for the whole protein. As such, while it does not retain any information about the spatial distribution of the interactions, it does give a measure of the tendency of lipids at a specific point (in our case, around embedded proteins) to deviate from a reference distribution (a homogeneous bulk membrane distribution). From experience we have seen that it is more accurate for lipids that are either present in small numbers in the system (e.g. PIP lipids) or whose distribution changes significantly during the simulation (e.g. GM lipids).
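The definition of the DE index can be illustrated with a minimal sketch. It assumes that the mole fraction of a lipid is used as the concentration measure, and that the local and bulk lipid counts have already been extracted from a trajectory; the function name and the counts below are hypothetical and not taken from the simulations.

```python
def de_index(n_local_a, n_local_total, n_bulk_a, n_bulk_total):
    """Depletion-enrichment index: local fraction of lipid A over its bulk fraction."""
    local_fraction = n_local_a / n_local_total
    bulk_fraction = n_bulk_a / n_bulk_total
    return local_fraction / bulk_fraction


# Hypothetical counts, for illustration only.
# Same fraction near the protein as in the bulk: no enrichment or depletion.
homogeneous = de_index(n_local_a=10, n_local_total=100, n_bulk_a=200, n_bulk_total=2000)
# Three times the bulk fraction near the protein: enrichment.
enriched = de_index(n_local_a=30, n_local_total=100, n_bulk_a=200, n_bulk_total=2000)
# Half the bulk fraction near the protein: depletion.
depleted = de_index(n_local_a=5, n_local_total=100, n_bulk_a=200, n_bulk_total=2000)
print(homogeneous, enriched, depleted)
```

Values above 1 indicate enrichment around the protein and values below 1 indicate depletion, consistent with an expected value of unity for a homogeneous distribution.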

We use the average number of lipids around GPCRs to measure the direction of change in lipid distribution and estimate when the distributions have converged. We focus mainly on the 7Å radius, but the results hold for other similar values. In Figure A-1 we highlight these results for 5HT1B, even though, again, the data tell a similar story for other GPCRs.
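One way to examine convergence of the lipid count is a cumulative average computed after discarding an initial part of the trajectory. The sketch below is only an illustration of that idea under stated assumptions: the function name, the toy time series, and the time unit are hypothetical, and the in-house scripts may differ.

```python
import numpy as np


def cumulative_average(counts, time, discard_before=0.0):
    """Cumulative average of a lipid-count time series after discarding the start.

    counts: number of lipids within the cutoff at each frame.
    time: time stamp of each frame (same length as counts).
    discard_before: frames with time < discard_before are dropped.
    """
    counts = np.asarray(counts, dtype=float)
    time = np.asarray(time, dtype=float)
    keep = time >= discard_before
    kept = counts[keep]
    return time[keep], np.cumsum(kept) / np.arange(1, kept.size + 1)


# Toy series, for illustration only.
toy_time = np.array([0.0, 1.0, 2.0, 3.0])
toy_counts = np.array([2.0, 4.0, 6.0, 8.0])
kept_time, running = cumulative_average(toy_counts, toy_time, discard_before=2.0)
print(kept_time, running)
```

Comparing the curves obtained with different values of `discard_before` shows how strongly the average depends on the discarded portion of the trajectory.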

Number and Duration of Contacts. To analyse specific interactions of lipids with proteins, we calculate both the number of contacts between the corresponding lipids and each protein residue, as well as the duration of each contact. In our analysis, we use the total number of contacts (referred to simply as number of contacts) and the average duration of the longest contact for each residue. Our results, however, are not dependent on any particular analysis method or averaging statistic used. They are also independent from our cutoff choice (7Å). We use the gmx select utility from the GROMACS simulation package and the MDTraj package to process the trajectories for contacts and post-process the results using in-house scripts.
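The two contact statistics can be sketched for a single lipid-residue pair as follows. This is a hedged illustration, not the in-house script: it assumes that a per-frame boolean contact series (True when the lipid is within the cutoff of the residue) has already been obtained, and that the number of contacts is counted as the number of frames in contact. The function name and the toy series are hypothetical.

```python
import numpy as np


def contact_stats(in_contact, dt=1.0):
    """Return (number of contact frames, longest continuous contact duration).

    in_contact: 1D boolean sequence, True where the lipid is within the cutoff.
    dt: time between consecutive frames (any consistent unit).
    """
    in_contact = np.asarray(in_contact, dtype=bool)
    n_contacts = int(in_contact.sum())
    # Pad with False so that every run of True has a rising and a falling edge.
    padded = np.concatenate(([False], in_contact, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    run_lengths = ends - starts
    longest = float(run_lengths.max() * dt) if run_lengths.size else 0.0
    return n_contacts, longest


# Toy contact series with runs of length 2, 3, and 1 frames.
toy_series = [0, 1, 1, 0, 1, 1, 1, 0, 1]
n_contacts, longest = contact_stats(toy_series, dt=2.0)
print(n_contacts, longest)
```

Averaging the longest duration over lipids or simulation copies would then give a per-residue value of the kind described above.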

When calculating the contribution of TM helices to GPCR-PIP lipid interactions (as shown in Figure 4-4C), we only consider the part of each helix that faces the intracellular leaflet of the membrane (since that is where PIP lipids are found, exclusively). Since it is not clear how to separate TM helices into extracellular- and intracellular-facing residues, and to ensure a consistent definition across different GPCRs, we only consider the four residues that interact with PIP lipids the most. Considering all residues that form the helix, or even all residues that interact at least once with PIP lipids, leads to unnaturally large error bars.

For brevity and clarity, we describe only a selection of GPCRs in detail in the main text. However, we append the full analysis results for all GPCRs in the supplementary material. We provide sequence heatmaps for both the number of contacts and their duration, for GPCR-cholesterol and PIP lipid interactions, in two formats: aligned and full unaligned sequence. We used GPCRdb to obtain the alignment.(3) We use the following definition for CRAC motifs: (L/V)-X(1,5)-(Y)-X(1,5)-(R/K), and its reverse for CARC motifs,(4) and use it to query our GPCR sequences using ScanProsite.(5) In these motifs Y represents an aromatic residue (tyrosine or phenylalanine). Statistics on solved GPCR crystal structures were obtained from the GPCR-EXP database.(6)
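The motif definitions can also be written as regular expressions. The sketch below is a translation of the stated patterns into Python's `re` syntax, not the ScanProsite query itself; the helper name and the toy sequences are hypothetical.

```python
import re

# CRAC: (L/V)-X(1,5)-(Y/F)-X(1,5)-(R/K); CARC is the reverse.
# The lookahead lets overlapping occurrences be reported as well.
CRAC = re.compile(r"(?=([LV].{1,5}[YF].{1,5}[RK]))")
CARC = re.compile(r"(?=([RK].{1,5}[YF].{1,5}[LV]))")


def find_motifs(sequence, pattern):
    """Return (0-based start index, matched substring) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Toy sequences, for illustration only.
crac_example = "AALGGYAAKAA"
carc_example = "KAAYGGL"
print(find_motifs(crac_example, CRAC))
print(find_motifs(carc_example, CARC))
```

Because the spacers are of variable length, a given start position can admit more than one valid match; this sketch reports a single match per start position.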

Statistical Analysis. The expected value of the DE index assuming no depletion or enrichment is 1, which corresponds to our null hypothesis. We use a two-sided one-sample t-test to calculate the two-tailed p-values for all GPCR DE index values (reported in Table S3); the p-value of each DE index is calculated relative to the expected value under the null hypothesis.
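As an illustration, a two-sided one-sample t-test against the null value of 1 can be run with `scipy.stats.ttest_1samp`. The DE values below are hypothetical and serve only to show the call; they are not results from this work.

```python
import numpy as np
import scipy.stats as st

# Hypothetical DE index values, for illustration only.
de_values = np.array([1.8, 2.1, 1.6, 2.4, 1.9, 2.2])

# Two-sided one-sample t-test of the sample mean against the null value DE = 1.
result = st.ttest_1samp(de_values, popmean=1.0)
print(result.statistic, result.pvalue)

# The same t statistic computed by hand, for comparison.
t_manual = (de_values.mean() - 1.0) / (de_values.std(ddof=1) / np.sqrt(de_values.size))
print(t_manual)
```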

The confidence intervals reported in Table S4 are calculated using one-sample t confidence intervals. If x̄ and s are the sample mean and sample standard deviation of a random sample of size n drawn from a normal population with mean μ, then the confidence interval for μ is:

$$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the t critical value with $n - 1$ degrees of freedom. For instance, if pip_data is a Python list containing the DE indices of PIP lipids for each class A GPCR, then the 95% confidence interval can be calculated using the following code block:

```python
import numpy as np
import scipy.stats as st

n = len(pip_data)
x_bar = np.mean(pip_data)                       # sample mean
s = np.std(pip_data, ddof=1)                    # sample standard deviation
t_critical_value = st.t.ppf(1 - 0.025, n - 1)   # two-sided 95% critical value
t_conf_int = t_critical_value * s / np.sqrt(n)  # half-width of the interval
print(x_bar - t_conf_int, x_bar + t_conf_int)
```

SciPy also provides a simpler way of doing this:

```python
st.t.interval(0.95, len(pip_data) - 1, loc=np.mean(pip_data), scale=st.sem(pip_data))
```

Lastly, we use a two-sided two-sample t-test when comparing the DE index means of different categories of GPCRs (class A vs non class A, and aminergic vs non aminergic class A GPCRs), and derive the corresponding p-values (as reported in Table S5).
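A comparison of two group means can be sketched with `scipy.stats.ttest_ind`. The values below are hypothetical, and the use of the equal-variance (Student) form is an assumption; the text does not state whether equal variances were assumed.

```python
import numpy as np
import scipy.stats as st

# Hypothetical DE index values for two categories, for illustration only.
group_one = np.array([1.9, 2.2, 1.7, 2.4, 2.0])
group_two = np.array([1.1, 1.3, 0.9, 1.2])

# Two-sided two-sample t-test; equal_var=True is an assumption of this sketch.
comparison = st.ttest_ind(group_one, group_two, equal_var=True)
print(comparison.statistic, comparison.pvalue)
```

Setting `equal_var=False` would give Welch's t-test instead, which does not assume equal variances in the two groups.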

An important assumption when using t distributions is that the sample data are normally distributed. We use percent-percent (P-P) probability plots of the theoretical vs sample percentiles for each lipid type to show that DE indices do conform to a normal distribution (Figure A-3). Cholesterol is the only lipid that shows a small deviation from the assumed normal distribution.
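The points of a normal P-P plot can be computed directly with NumPy and SciPy. The sketch below is an illustration under stated assumptions: the plotting positions (i - 0.5)/n are one common convention and may differ from those of the plotting library actually used, and the input values are randomly generated placeholders rather than DE indices from this work.

```python
import numpy as np
import scipy.stats as st


def pp_points(values):
    """Return (theoretical, sample) percentiles for a normal P-P plot."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    # Sample percentiles: plotting positions (i - 0.5) / n for i = 1..n.
    sample = (np.arange(1, n + 1) - 0.5) / n
    # Theoretical percentiles: normal CDF fitted with the sample mean and SD.
    theoretical = st.norm.cdf(x, loc=x.mean(), scale=x.std(ddof=1))
    return theoretical, sample


# Placeholder data drawn from a normal distribution, for illustration only.
rng = np.random.default_rng(0)
placeholder_values = rng.normal(loc=1.5, scale=0.3, size=28)
theoretical, sample = pp_points(placeholder_values)
print(np.column_stack((theoretical, sample))[:5])
```

For normally distributed data, the points fall close to the 45° line on which theoretical and sample percentiles are equal.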

Notation. We use the TM label to denote transmembrane helices, except for helix 8, which is referred to simply as H8 unless it is named as part of a binding interface (TM1/8 or TM7/8). Transmembrane proteins are in contact with both leaflets of the plasma membrane, and lipid interactions may occur on either side. To differentiate between binding on the extracellular and intracellular side of GPCRs, we label interaction sites with the ec and ic notation, respectively.

Software packages. For the analysis of lipid-protein contacts we used two different software tools in parallel: gmx select (part of the GROMACS simulation package) and the compute_neighbors method (part of the MDTraj package).(7) We use the stats module of SciPy(8) and the statsmodels(9) Python package for the statistical analysis of our data. Lastly, plotting of the data is done using the matplotlib(10) and seaborn(11) Python libraries.

GPCR-lipid interactions website. Along with the paper, we also release a dedicated webpage, hosted by GitHub, to interactively visualize our dataset as it pertains to cholesterol and PIP lipid interactions. It uses the NGL Viewer(12) to display the density of cholesterol and PIP lipids within a 6 nm (60 Å) radius around proteins. A user-friendly interface allows direct interaction with our dataset: users can visualize the number and duration of contacts maps highlighted here, and view 3D density profiles and 2D slices of them.

Additionally, we provide a separate application to view and interact with 3D objects representing the thickness and curvature profiles of each GPCR. The 2D maps of these calculations are shown in Figures A-7 to A-9; the online application, however, allows the user to interactively view these profiles, customize their appearance, and easily switch between mean and Gaussian curvature. These objects are generated by g_surf and visualized using the Three.js JavaScript library; the coloring of the curvature is done using MeshLab.(13)

We hope that this will allow for a much cleaner presentation of GPCR-lipid interactions and enable users to explore any detail of our dataset that, for reasons of brevity and clarity, we could not present here. The webpage can be accessed through https://bisejdiu.github.io/GPCR-lipid-interactions and all the code is available in the following GitHub repository: https://github.com/bisejdiu/GPCR-lipid-interactions

Visualization. All molecular visualizations presented in the paper are done using VMD.(14)

| Protein name | Abbrev. | PDB ID | Time (μs) | Reference |
|---|---|---|---|---|
| 5-hydroxytryptamine receptor 1B | 5HT1B | 4IAQ | 30 | (15) |
| Adenosine A2A receptor | A2ARa | 2YDV | 30 | (16) |
| Adenosine A2A receptor | A2ARi | 3EML | 30 | (17) |
| Apelin receptor | ApelinR | 5VBL | 30 | (18) |
| Angiotensin II type 2 receptor | AT2R | 5UNG | 30 | (19) |
| beta2 adrenergic receptor | b2ARa (β2ARa) | 3SN6 | 30 | (20) |
| beta2 adrenergic receptor | b2ARi (β2ARi) | 2RH1 | 30 | (21) |
| Cannabinoid Receptor CB1 | CB1R | 5TGZ | 30 | (22) |
| Chemokine receptor CXCR1 | CXCR1 | 2LNL | 30 | (23) |
| Dopamine D3 receptor | D3R | 3PBL | 30 | (24) |
| Endothelin ETB receptor | ETbR (ETBR) | 5X93 | 30 | (25) |
| Human histamine H1 | H1R | 3RZE | 30 | (26) |
| Lysophosphatidic Acid Receptor 1 | LPAR1 | 4Z36 | 30 | (27) |
| M2 muscarinic acetylcholine receptor | M2Ra | 4MQS | 30 | (28) |
| M2 muscarinic acetylcholine receptor | M2Ri | 3UON | 30 | (29) |
| mu-opioid receptor (active) | mORa (μORa) | 5C1M | 30 | (30) |
| mu-opioid receptor | mORi (μORi) | 4DKL | 30 | (31) |
| Human Orexin 2 Receptor | OX2R | 5WQC | 30 | (32) |
| Protease-activated receptor 2 | PAR2 | 5NDD | 30 | (33) |
| Metarhodopsin II | RhodRa | 3PQR | 30 | (34) |
| Rhodopsin | RhodRi | 1GZM | 30 | (35) |
| Lysophospholipid sphingosine 1-phosphate | S1PR1 | 3V2Y | 30 | (36) |
| US28 | US28 | 4XT1 | 30 | (37) |
| Calcitonin receptor | CalcitoninR | 5UZ7 | 30 | (38) |
| Glucagon-like peptide-1 receptor | GLP1 | 5VEW | 30 | (39) |
| Glucagon receptor | GlucagonR | 4L6R | 30 | (40) |
| Metabotropic glutamate receptor 5 | mGlu5 | 4OO9 | 30 | (41) |
| Smoothened receptor | SMO | 4N4W | 30 | (42) |

Table A-1. Overview of all GPCR structures simulated. The full name of each GPCR is given along with the abbreviation used in this work, the respective PDB ID, the simulation time, and the reference to the relevant paper. Most of our abbreviations should make clear which receptor they refer to. GPCRs that have been simulated in both active and inactive states are differentiated by the last letter of the abbreviation (e.g. β2ARi vs β2ARa denoting the inactive and active state β2AR, respectively).

Figure A-1. Convergence of the number of lipids (Lipid Count) during the course of the simulation. A. Running average of the average number of lipids within 7 Å of S1PR1 as a function of trajectory time. We highlight S1PR1 because its GM lipid count is the worst behaved among all GPCRs simulated, which showcases how our focus on the last 5 μs ensures that we analyze converged lipid distributions around proteins. Lipid groups that appear in both leaflets are labelled by leaflet (e.g. PC_u and PC_l denote PC lipids in the upper and lower leaflet of the bilayer, respectively). B. The cumulative average of several lipid groups for different cutoffs (that is, different points in time before which the trajectory is discarded, with only the remaining part used for analysis): no cutoff, 5 μs, 10 μs, and 24 μs. These results give an idea of the “error” introduced into the calculations depending on what part, if any, of the trajectory is discarded before analysis. They also reveal the amount of simulation required to ensure converged lipid distributions. As is clear from the figure, discarding the first 5 μs of the simulation is enough for PIP, PE, and PS lipids to converge, but for GM lipids even 10 μs is not enough: simulation lengths of 20 μs or more are required to achieve converged GM lipid distributions. Note also that, since our results are averages over 4 proteins (i.e. n is small), the initial values of the cumulative average are noisy and take some time to stabilize. This is why we show the values for a cutoff of 24 μs even though our analysis discards the first 25 μs: the extra 1 μs gives the cumulative average time to stabilize. This figure is for S1PR1; the complete dataset for all GPCRs is available in Figure A-22. Highlighted areas represent ± SD (n = 4).


Figure A-2. β2ARi-cholesterol interactions.

Sequence heatmaps for the number of contacts with cholesterol for each β2ARi setup. For clarity and easy comparison, we aligned the structures and are only showing residues that make up the helical core of the receptor. The systems are as follows: the setup used in our simulations (#1), including ICL2 (#2), including palmitoylated Cys-341 (#3), and pre-equilibrated system (#4). We see that our results are not affected by our simulation protocol. We also analysed the effect different strengths of the elastic network have and saw no difference. These results also show that the cholesterol distribution around proteins has converged.


Figure A-3. P-P plots of DE indices. Each graph shows the probability plot for the DE indices calculated for all GPCRs per lipid group (PC, PE, GM, CHOL, PS, SM, PIP, PI, and PA). Each data point represents the DE index for a particular GPCR. Data points colored in red are DE indices for non-class A GPCRs. In each graph, the data points are approximated with a line of best fit. The black line is drawn at 45°, which corresponds to theoretical percentiles = sample percentiles.


Figure A-4. Cholesterol 2D density profile. Density maps are in the x-y plane of each system and calculated for each leaflet separately. We show the density maps for cholesterol (this figure), fully-saturated lipids (Figure A-5) and poly-unsaturated lipids (Figure A-6). Densities are calculated for the last 5 μs and averaged over the four copies. To assist in analyzing and comparing the data, we have overlaid the atomistic structure at the approximate insertion place and orientation of proteins. The average density for each lipid group is shown in white, and the relative enrichment or depletion is shown in red and blue, respectively.
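As a rough illustration of how a relative enrichment map of this kind can be built, the sketch below bins lipid x-y positions on a square grid and divides each bin by the average bin count, so that 1.0 corresponds to the average density (white), values above 1.0 to enrichment and values below 1.0 to depletion. The function name `relative_density`, the square box and the toy coordinates are assumptions for illustration, not the procedure used to generate the figures.

```python
def relative_density(xy, box, nbins):
    """2D histogram of x-y positions, normalized by the mean bin count.

    Assumes a square box with all coordinates in [0, box] and at least
    one position. Rows index y-bins, columns index x-bins.
    """
    counts = [[0] * nbins for _ in range(nbins)]
    for x, y in xy:
        i = min(int(y / box * nbins), nbins - 1)  # y-bin (row)
        j = min(int(x / box * nbins), nbins - 1)  # x-bin (column)
        counts[i][j] += 1
    mean = len(xy) / nbins**2  # average number of positions per bin
    return [[c / mean for c in row] for row in counts]


# Toy example (hypothetical coordinates, box edge of 2 length units).
positions = [(0.5, 0.5), (0.5, 0.5), (1.5, 0.5), (1.5, 1.5)]
rel = relative_density(positions, box=2.0, nbins=2)
print(rel)
```

In practice the positions would be accumulated over all frames of the analyzed part of the trajectory and over the four protein copies before normalization.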


Figure A-5. FS lipids density profile. Details are similar to the cholesterol density maps (Figure A-4).


Figure A-6. PU density profile. Details are similar to the cholesterol density maps (Figure A-4).


Figure A-7. Gaussian curvature (KG) maps.

2D KG curvature maps are calculated for each system simulated over the last 5 μs. We define three surfaces for which we calculate curvature and thickness maps: upper, middle and lower. We use the PO4 and GM1 beads to define the upper surface and the PO4 and CP beads to define the lower surface. For the middle surface we use the last lipid tail beads. Saddles (negative KG) are colored magenta and convex/concave regions (positive KG) are colored green. Atomistic structures are overlaid at the approximate insertion place and orientation of proteins.


Figure A-8. Mean curvature (KM) maps.

2D KM curvature maps are calculated for each system simulated over the last 5 μs. We color negative curvature and positive curvature with blue and red, respectively, using the RWB coloring scheme. White represents zero curvature. Atomistic structures are overlaid at the approximate insertion place and orientation of proteins.
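For readers unfamiliar with the two curvature measures, the sketch below shows one common way to estimate them from a surface given as a height field z(x, y) on a regular grid (the Monge representation), using central finite differences. This is an illustrative assumption, not necessarily the procedure used to produce the maps; the function name `curvatures` and the toy surfaces are hypothetical, and the sign of the mean curvature depends on the chosen orientation of the surface normal.

```python
def curvatures(z, dx, dy):
    """Gaussian (K_G) and mean (K_M) curvature of a height field z[i][j].

    i indexes y and j indexes x. Central differences are used, so only
    interior grid points are evaluated; border entries are left as None.
    """
    ny, nx = len(z), len(z[0])
    kg = [[None] * nx for _ in range(ny)]
    km = [[None] * nx for _ in range(ny)]
    for i in range(1, ny - 1):
        for j in range(1, nx - 1):
            zx = (z[i][j + 1] - z[i][j - 1]) / (2 * dx)
            zy = (z[i + 1][j] - z[i - 1][j]) / (2 * dy)
            zxx = (z[i][j + 1] - 2 * z[i][j] + z[i][j - 1]) / dx**2
            zyy = (z[i + 1][j] - 2 * z[i][j] + z[i - 1][j]) / dy**2
            zxy = (z[i + 1][j + 1] - z[i + 1][j - 1]
                   - z[i - 1][j + 1] + z[i - 1][j - 1]) / (4 * dx * dy)
            g = 1 + zx**2 + zy**2
            kg[i][j] = (zxx * zyy - zxy**2) / g**2
            km[i][j] = ((1 + zy**2) * zxx - 2 * zx * zy * zxy
                        + (1 + zx**2) * zyy) / (2 * g**1.5)
    return kg, km


# Toy surfaces (hypothetical) on a 5x5 grid centered on the origin, R = 10.
R = 10.0
saddle = [[((j - 2)**2 - (i - 2)**2) / (2 * R) for j in range(5)] for i in range(5)]
bowl = [[((j - 2)**2 + (i - 2)**2) / (2 * R) for j in range(5)] for i in range(5)]

kg_saddle, km_saddle = curvatures(saddle, 1.0, 1.0)
kg_bowl, km_bowl = curvatures(bowl, 1.0, 1.0)
print(kg_saddle[2][2], km_saddle[2][2], kg_bowl[2][2], km_bowl[2][2])
```

At the grid center the saddle has negative Gaussian curvature and zero mean curvature, while the bowl has positive Gaussian curvature and nonzero mean curvature, matching the color conventions described in the captions for saddles and convex/concave regions.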


Figure A-9. GPCR membrane thickness. 2D thickness maps are calculated for each system simulated over the last 5 μs. The same three surfaces defined for curvature analysis are used here as well. Overall thickness is calculated as the distance from the upper to the lower surface. Upper and lower thickness are the distances from the middle surface to the upper and lower surfaces, respectively. Atomistic structures are overlaid at the approximate insertion place and orientation of proteins.


Figure A-10. PIP lipid - TM helix interactions. A bar plot showing the interactions between each GPCR TM helix and PIP lipids. For each helix, only the four highest residues in terms of interactions with PIP lipids are considered. The data show how different GPCRs are in their interactions with PIP lipids; in particular, we note that non-class A GPCRs (GLP1, mGlu5, SMO, CalcitoninR and GlucagonR) have strikingly fewer interactions with PIP lipids compared to their class A counterparts. A color gradient from white to red is used to color the increase in the number of contacts with PIP lipids.

Figure A-11. CXCR1-PIP interactions. Interactions between bound PIP lipids and arginine residues lining the CXCR1 binding site. We show the data for proteins #2, #3, and #4 in our system. In each case, there is a PIP lipid that is tightly bound at the TM1/2/4 interface.


Figure A-12. AT2R-cholesterol interactions.

Centre-of-mass distances between F129 and K215 of AT2R and bound cholesterol molecules. We show the data for protein #2, #3, and #4 in our system. We see that in each case there is a cholesterol molecule tightly bound at the TM4/5 interface. We also observe multiple binding/unbinding events.
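A centre-of-mass distance of the kind plotted here can be sketched as follows. The function names, coordinates and masses are hypothetical and chosen only for illustration; in particular, this minimal version ignores periodic boundary conditions, which a real trajectory analysis would need to handle.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted average position of a group of particles."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total for k in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centres of mass of two particle groups."""
    return math.dist(
        center_of_mass(coords_a, masses_a), center_of_mass(coords_b, masses_b)
    )


# Toy example (hypothetical): a two-particle residue and a one-particle ligand.
residue_xyz = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
residue_mass = [1.0, 1.0]
ligand_xyz = [(1.0, 3.0, 4.0)]
ligand_mass = [1.0]

d = com_distance(residue_xyz, residue_mass, ligand_xyz, ligand_mass)
print(d)
```

Evaluating such a distance for every frame gives a time series in which a tightly bound molecule appears as a persistently short distance, and binding/unbinding events appear as transitions between short and long distances.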


Figure A-13. Cholesterol interactions with A2AAR and β2AR. Crystal structures are shown as a surface representation to allow for easy comparison with simulation results. The latter are mapped on the surface of proteins in a Red-White-Blue scale denoting an increase in either the number of cholesterol contacts (denoted with the letter C) or the duration of contacts (letter D). We only show one crystal structure per published study. A. A2AAR. 27 of the 45 A2AAR crystal structures solved contain bound cholesterol, found at the TM2/3 and TM5-7 interfaces. Song et al.(43) show cholesterol interacting with A2AAR at 7 different interaction sites: TM1/2 (ec), TM1/2/4 (ic), TM2/3 (ec), TM3/4 (ec), TM3/4 (ic), TM5/6 (ec), and TM6/7. The number of these interaction sites depends on the conformational state of A2AAR. Rouviere et al.(44) also performed coarse-grained MD simulations and showed several cholesterol interaction sites. Using long-scale all-atom simulations, they noted two sites with a particularly high cholesterol-binding affinity: TM5/6 (ic) and TM6 (ec). While we do observe cholesterol interactions at the TM2-4 interface, the majority of these interactions are seen at the TM5-7/1 interface. Accounting for the longevity of cholesterol contacts, the A2AAR TM6-7 interface has a distinctly higher affinity for cholesterol. We also find the following interaction sites: TM5-6, TM7-8/1, TM1-2, and TM2-4. The TM5-6 interface noted here and by Rouviere et al.(44) has also been found to be the entry site for cholesterol insertion inside the receptor by Guixa-Gonzalez et al.(45) Such lipid entry events are unlikely to occur in CGMD simulations employing an elastic network. B. β2AR. β2AR structures account for 11 of the 64 GPCR structures co-crystallized with cholesterol. There are two cholesterol binding sites in the β2AR crystal structures: TM1-4 and TM1/8. In microsecond-long all-atom simulations of β2AR, Cang et al.(46) observe 8 cholesterol interaction sites: 5 in the lower leaflet and 3 in the upper leaflet side of the receptor. In particular, they note interaction sites between helices TM1/7, TM6/7 and TM1/8. These simulations also show that the TM5-7 interface of β2AR is quite involved in cholesterol interactions, despite this not being reflected in solved crystal structures. Manna et al.(47) also find several high-density cholesterol interaction sites in their simulations. In the lower half of the receptor they find interaction sites at TM1-TM4 and TM5/6, and in the upper half of the receptor they find one cholesterol binding site at the TM5/6 interface and another at the TM6/7 interface. In agreement with previous simulations, we find several cholesterol interaction sites. The TM5-7 interface of β2AR, in particular, displays a cholesterol-interaction profile that is quite similar to that of A2AAR. The β2AR TM1-4 interface, however, is distinctly different. It interacts with cholesterol molecules in the lower half of the receptor where helices TM1/2/4 converge and corresponds to the cholesterol binding site from crystal structures. We also note the existence of an interaction site between helices TM4/3/5, which is not observed for A2AAR. Lastly, we note the TM1/8 interaction site, which matches the second cholesterol binding site found in crystal structures.


Figure A-14. Cholesterol interactions with Serotonin and μOR. A. Serotonin. There are 9 structures of serotonin receptors co-crystallized with cholesterol. The majority belong to the 5HT2B receptor, and while we did not simulate this particular subtype, we simulated 5HT1B, which shares a high sequence similarity with 5HT2B. 5HT2B crystal structures reveal a cholesterol molecule bound in the lower half of the receptor at TM8/1. One of the structures solved has a cholesterol molecule in a crevice between helices TM2/3/4, not unlike the binding site observed for β2AR; surprisingly, however, here cholesterol is found in the opposite orientation, with its OH group facing upward and positioned at the midplane of the bilayer. Shan et al.(48) carried out simulations of 5-HT2AR and found several cholesterol interaction sites: TM1/2/4 (ic), TM2/3 (ec) and TM6/7 (ec and ic side). In our simulations, one of the most pronounced interaction sites observed is between TM1/8, which matches crystallographic experimental evidence. We also note the existence of other interaction sites located at TM1/2, TM2/4, TM6/7 and TM7/TM1. The interfaces between helices TM6/7 and TM8/1 are observed for β2AR, A2AAR and 5HT1B. Cholesterol interaction sites are also observed at the TM1-4 interface, on the ic side for β2AR and 5HT1B, and on the ec side for A2AAR. B. μOR. 3 opioid receptor structures have been crystallized with bound cholesterol: one of the kappa-opioid receptor and two of the mu-opioid receptor. Marino et al.(49), using CGMD and a setup similar to ours to simulate μOR, show the existence of an interaction site in a hydrophobic region close to TM6 and TM7 of the receptor, which corresponds to the position of cholesterol resolved in the crystal structure of μOR (4DKL). In our simulations, we reproduce this cholesterol-binding site at TM6/7 quite well. Here, again, we find the TM1-4 interface to be involved in cholesterol interactions. The interaction site we identify is formed from helices TM2/3 with partial involvement of TM4 on the ec side of the receptor. We also note an interaction site on the ic side of TM5/6.

Figure A-15. Cholesterol interactions with CB1R, chemokine receptor, ETBR, and US28. A. CB1R. Currently, there are three structures of the cannabinoid receptor (CB1) that have been solved with bound cholesterol. Two of them (5XR8 and 5XRA) have cholesterol bound at the TM6/7 interface and the third (6N4B) has two cholesterol molecules bound at the TM4/5 interface. CB1R differs from the other GPCRs discussed so far in that the TM1-4 interface is very heavily involved in interacting with cholesterol. We observe cholesterol interaction sites at: TM7/1, TM1/2 (ec and ic side), TM2/3, TM4/5 and TM6/7 (ic side). We also observe the interaction site at TM4/5 seen in the crystal structure, although in terms of the duration of cholesterol contacts it is of lesser “strength” than the other sites. B. CCR9, ETBR and US28. We summarize the data for three additional GPCRs with co-crystallized cholesterol molecules (from top row to bottom row): CCR9 (a chemokine receptor), ETBR and US28. They feature cholesterol bound at different sites, namely: TM6/TM (ic), TM1/2 (ec) and TM6/7 (ec). We observe the cholesterol interaction sites from the crystal structures in all three cases, and in particular for ETBR and US28. We observe a stronger interaction between cholesterol and helices TM2/4, perhaps because we simulated CXCR1 and not CCR9.

Figure A-16. Cholesterol interactions with other GPCRs. We show the rest of the GPCR crystal structures that have been solved with cholesterol molecules bound to them. A. P2Y12 and P2Y1. Cholesterol is bound at three P2Y12 sites: the TM3/4 interface, which has not been observed in other crystal structures; TM7/1, which is observed in our simulations (notably, RhodRi); and TM1-4 (ic). P2Y1, on the other hand, features a cholesterol molecule bound at the TM4-5 interface, at the exact location where we observe cholesterol binding inside AT2R. B. mGlu1. mGlu1 has been solved as a dimer and the interface between monomers is lined with 6 cholesterol molecules. These are found at the TM1/2/3 interface. C. CX3CL1. Two cholesterol molecules are found at the TM6/7 interface. D. Thromboxane A2. One cholesterol molecule is found at the TM2/3 interface, which is not observed elsewhere, and is also rare in our simulations.

Figure A-17. GPCR-lipid interactions. We approximate GPCRs with a rectangular parallelepiped (or cuboid) and define two large interfaces (TM1-4 and TM5-7/8) and two smaller interfaces (TM7/8-1 and TM3/4-5). TM helices that are part of one cuboid face (shown as a rectangle in the picture) do not form any interface with TM helices that are part of other faces. We also use red and blue to differentiate between binding sites located on the ec and ic side, respectively. For the TM7/8-1 interface, red and blue in fact represent the TM7/1 and TM1/8 interfaces, respectively. The figure shows the major interaction sites for a selection of GPCRs in a schematic way.

A.1 References

1. Corradi V., E. Mendez-Villuendas, H. I. Ingolfsson, R. X. Gu, I. Siuda, M. N. Melo, A. Moussatova, L. J. DeGagne, B. I. Sejdiu, G. Singh, T. A. Wassenaar, K. D. Magnero, S. J. Marrink, D. P. Tieleman. Lipid-Protein Interactions Are Unique Fingerprints for Membrane Proteins. ACS Central Sci. 2018;4(6):709-717.


2. Gu R.-X., S. Baoukina, D. P. Tieleman. Cholesterol Flip-Flop in Heterogeneous Membranes. Journal of chemical theory and computation. 2019;15(3):2064-2070.

3. Isberg V., S. Mordalski, C. Munk, K. Rataj, K. Harpsoe, A. S. Hauser, B. Vroling, A. J. Bojarski, G. Vriend, D. E. Gloriam. GPCRdb: an information system for G protein-coupled receptors. Nucleic Acids Res. 2016;44(D1):D356-D364.

4. Fantini J., F. J. Barrantes. How cholesterol interacts with membrane proteins: an exploration of cholesterol-binding sites including CRAC, CARC, and tilted domains. Front Physiol. 2013;4:9.

5. De Castro E., C. J. Sigrist, A. Gattiker, V. Bulliard, P. S. Langendijk-Genevaux, E. Gasteiger, A. Bairoch, N. Hulo. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 2006;34(suppl_2):W362-W365.

6. Jianyi Y., Z. Yang. GPCR-EXP: a database for experimentally solved GPCR structures.

7. McGibbon R. T., K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane, V. S. Pande. MDTraj: a modern open library for the analysis of molecular dynamics trajectories. Biophys J. 2015;109(8):1528-1532.

8. Jones E., T. Oliphant, P. Peterson. SciPy: Open source scientific tools for Python. 2014.

9. Seabold S., J. Perktold, editors. Statsmodels: Econometric and statistical modeling with python. Proceedings of the 9th Python in Science Conference; 2010: Scipy.

10. Hunter J. D. Matplotlib: A 2D graphics environment. Computing in science & engineering. 2007;9(3):90.

11. Waskom M., O. Botvinnik, P. Hobson, J. Warmenhoven, J. Cole, Y. Halchenko, J. Vanderplas, S. Hoyer, S. Villalba, E. Quintero. Seaborn: statistical data visualization. 2014.

12. Rose A. S., P. W. Hildebrand. NGL Viewer: a web application for molecular visualization. Nucleic Acids Res. 2015;43(W1):W576-W579.

13. Cignoni P., M. Callieri, M. Corsini, M. Dellepiane, F. Ganovelli, G. Ranzuglia, editors. Meshlab: an open-source mesh processing tool. Eurographics Italian chapter conference; 2008.


14. Humphrey W., A. Dalke, K. Schulten. VMD: visual molecular dynamics. Journal of molecular graphics. 1996;14(1):33-38.

15. Wang C., Y. Jiang, J. Ma, H. Wu, D. Wacker, V. Katritch, G. W. Han, W. Liu, X.-P. Huang, E. Vardy. Structural basis for molecular recognition at serotonin receptors. Science. 2013;340(6132):610-614.

16. Lebon G., T. Warne, P. C. Edwards, K. Bennett, C. J. Langmead, A. G. Leslie, C. G. Tate. Agonist-bound adenosine A2A receptor structures reveal common features of GPCR activation. Nature. 2011;474(7352):521.

17. Jaakola V. P., M. T. Griffith, M. A. Hanson, V. Cherezov, E. Y. T. Chien, J. R. Lane, A. P. Ijzerman, R. C. Stevens. The 2.6 Angstrom Crystal Structure of a Human A(2A) Adenosine Receptor Bound to an Antagonist. Science. 2008;322(5905):1211-1217.

18. Ma Y., Y. Yue, Y. Ma, Q. Zhang, Q. Zhou, Y. Song, Y. Shen, X. Li, X. Ma, C. Li. Structural basis for apelin control of the human apelin receptor. Structure. 2017;25(6):858-866. e854.

19. Zhang H., G. W. Han, A. Batyuk, A. Ishchenko, K. L. White, N. Patel, A. Sadybekov, B. Zamlynny, M. T. Rudd, K. Hollenstein. Structural basis for selectivity and diversity in angiotensin II receptors. Nature. 2017;544(7650):327.

20. Rasmussen S. G., B. T. DeVree, Y. Zou, A. C. Kruse, K. Y. Chung, T. S. Kobilka, F. S. Thian, P. S. Chae, E. Pardon, D. Calinski, J. M. Mathiesen, S. T. Shah, J. A. Lyons, M. Caffrey, S. H. Gellman, J. Steyaert, G. Skiniotis, W. I. Weis, R. K. Sunahara, B. K. Kobilka. Crystal structure of the beta2 adrenergic receptor-Gs protein complex. Nature. 2011;477(7366):549-555.

21. Cherezov V., D. M. Rosenbaum, M. A. Hanson, S. G. F. Rasmussen, F. S. Thian, T. S. Kobilka, H. J. Choi, P. Kuhn, W. I. Weis, B. K. Kobilka, R. C. Stevens. High-resolution crystal structure of an engineered human beta(2)-adrenergic G protein-coupled receptor. Science. 2007;318(5854):1258-1265.

22. Hua T., K. Vemuri, M. Pu, L. Qu, G. W. Han, Y. Wu, S. Zhao, W. Shui, S. Li, A. Korde. Crystal structure of the human cannabinoid receptor CB1. Cell. 2016;167(3):750-762. e714.

23. Park S. H., B. B. Das, F. Casagrande, Y. Tian, H. J. Nothnagel, M. Chu, H. Kiefer, K. Maier, A. A. De Angelis, F. M. Marassi. Structure of the chemokine receptor CXCR1 in phospholipid bilayers. Nature. 2012;491(7426):779.


24. Chien E. Y., W. Liu, Q. Zhao, V. Katritch, G. W. Han, M. A. Hanson, L. Shi, A. H. Newman, J. A. Javitch, V. Cherezov. Structure of the human dopamine D3 receptor in complex with a D2/D3 selective antagonist. Science. 2010;330(6007):1091-1095.

25. Shihoya W., T. Nishizawa, K. Yamashita, A. Inoue, K. Hirata, F. M. N. Kadji, A. Okuta, K. Tani, J. Aoki, Y. Fujiyoshi, T. Doi, O. Nureki. X-ray structures of endothelin ETB receptor bound to clinical antagonist bosentan and its analog. Nat Struct Mol Biol. 2017;24(9):758-+.

26. Shimamura T., M. Shiroishi, S. Weyand, H. Tsujimoto, G. Winter, V. Katritch, R. Abagyan, V. Cherezov, W. Liu, G. W. Han. Structure of the human histamine H1 receptor complex with doxepin. Nature. 2011;475(7354):65.

27. Chrencik J. E., C. B. Roth, M. Terakado, H. Kurata, R. Omi, Y. Kihara, D. Warshaviak, S. Nakade, G. Asmar-Rovira, M. Mileni. Crystal structure of antagonist bound human lysophosphatidic acid receptor 1. Cell. 2015;161(7):1633-1643.

28. Kruse A. C., A. M. Ring, A. Manglik, J. Hu, K. Hu, K. Eitel, H. Hübner, E. Pardon, C. Valant, P. M. Sexton. Activation and allosteric modulation of a muscarinic acetylcholine receptor. Nature. 2013;504(7478):101.

29. Haga K., A. C. Kruse, H. Asada, T. Yurugi-Kobayashi, M. Shiroishi, C. Zhang, W. I. Weis, T. Okada, B. K. Kobilka, T. Haga. Structure of the human M2 muscarinic acetylcholine receptor bound to an antagonist. Nature. 2012;482(7386):547.

30. Huang W. J., A. Manglik, A. J. Venkatakrishnan, T. Laeremans, E. N. Feinberg, A. L. Sanborn, H. E. Kato, K. E. Livingston, T. S. Thorsen, R. C. Kling, S. Granier, P. Gmeiner, S. M. Husbands, J. R. Traynor, W. I. Weis, J. Steyaert, R. O. Dror, B. K. Kobilka. Structural insights into mu-opioid receptor activation. Nature. 2015;524(7565):315-+.

31. Manglik A., A. C. Kruse, T. S. Kobilka, F. S. Thian, J. M. Mathiesen, R. K. Sunahara, L. Pardo, W. I. Weis, B. K. Kobilka, S. Granier. Crystal structure of the mu-opioid receptor bound to a morphinan antagonist. Nature. 2012;485(7398):321-U170.

32. Suno R., K. T. Kimura, T. Nakane, K. Yamashita, J. Wang, T. Fujiwara, Y. Yamanaka, D. Im, S. Horita, H. Tsujimoto. Crystal structures of human orexin 2 receptor bound to the subtype-selective antagonist EMPA. Structure. 2018;26(1):7-19. e15.


33. Cheng R. K., C. Fiez-Vandal, O. Schlenker, K. Edman, B. Aggeler, D. G. Brown, G. A. Brown, R. M. Cooke, C. E. Dumelin, A. S. Doré. Structural insight into allosteric modulation of protease-activated receptor 2. Nature. 2017;545(7652):112.

34. Choe H.-W., Y. J. Kim, J. H. Park, T. Morizumi, E. F. Pai, N. Krauss, K. P. Hofmann, P. Scheerer, O. P. Ernst. Crystal structure of metarhodopsin II. Nature. 2011;471(7340):651.

35. Li J., P. C. Edwards, M. Burghammer, C. Villa, G. F. Schertler. Structure of bovine rhodopsin in a trigonal crystal form. J Mol Biol. 2004;343(5):1409-1438.

36. Hanson M. A., C. B. Roth, E. Jo, M. T. Griffith, F. L. Scott, G. Reinhart, H. Desale, B. Clemons, S. M. Cahalan, S. C. Schuerer. Crystal structure of a lipid G protein–coupled receptor. Science. 2012;335(6070):851-855.

37. Burg J. S., J. R. Ingram, A. J. Venkatakrishnan, K. M. Jude, A. Dukkipati, E. N. Feinberg, A. Angelini, D. Waghray, R. O. Dror, H. L. Ploegh, K. C. Garcia. Structural basis for chemokine recognition and activation of a viral G protein-coupled receptor. Science. 2015;347(6226):1113-1117.

38. Liang Y. L., M. Khoshouei, M. Radjainia, Y. Zhang, A. Glukhova, J. Tarrasch, D. M. Thal, S. G. B. Furness, G. Christopoulos, T. Coudrat, R. Danev, W. Baumeister, L. J. Miller, A. Christopoulos, B. K. Kobilka, D. Wootten, G. Skiniotis, P. M. Sexton. Phase-plate cryo-EM structure of a class B GPCR-G-protein complex. Nature. 2017;546(7656):118-+.

39. Song G., D. Yang, Y. Wang, C. de Graaf, Q. Zhou, S. Jiang, K. Liu, X. Cai, A. Dai, G. Lin. Human GLP-1 receptor transmembrane domain structure in complex with allosteric modulators. Nature. 2017;546(7657):312.

40. Siu F. Y., M. He, C. De Graaf, G. W. Han, D. Yang, Z. Zhang, C. Zhou, Q. Xu, D. Wacker, J. S. Joseph. Structure of the human glucagon class B G-protein-coupled receptor. Nature. 2013;499(7459):444.

41. Doré A. S., K. Okrasa, J. C. Patel, M. Serrano-Vega, K. Bennett, R. M. Cooke, J. C. Errey, A. Jazayeri, S. Khan, B. Tehan. Structure of class C GPCR metabotropic glutamate receptor 5 transmembrane domain. Nature. 2014;511(7511):557.

42. Wang C., H. Wu, T. Evron, E. Vardy, G. W. Han, X.-P. Huang, S. J. Hufeisen, T. J. Mangano, D. J. Urban, V. Katritch. Structural basis for Smoothened receptor modulation and chemoresistance to anticancer drugs. Nat Commun. 2014;5:4355.


43. Song W., H. Y. Yen, C. V. Robinson, M. S. P. Sansom. State-dependent Lipid Interactions with the A2a Receptor Revealed by MD Simulations Using In Vivo-Mimetic Membranes. Structure. 2019;27(2):392-403 e393.

44. Rouviere E., C. Arnarez, L. W. Yang, E. Lyman. Identification of Two New Cholesterol Interaction Sites on the A(2A) Adenosine Receptor. Biophys J. 2017;113(11):2415-2424.

45. Guixa-Gonzalez R., J. L. Albasanz, I. Rodriguez-Espigares, M. Pastor, F. Sanz, M. Marti-Solano, M. Manna, H. Martinez-Seara, P. W. Hildebrand, M. Martin, J. Selent. Membrane cholesterol access into a G-protein-coupled receptor. Nat Commun. 2017;8:14505.

46. Cang X. H., Y. Du, Y. Y. Mao, Y. Y. Wang, H. Y. Yang, H. L. Jiang. Mapping the Functional Binding Sites of Cholesterol in beta(2)-Adrenergic Receptor by Long-Time Molecular Dynamics Simulations. J Phys Chem B. 2013;117(4):1085-1094.

47. Manna M., M. Niemela, J. Tynkkynen, M. Javanainen, W. Kulig, D. J. Muller, T. Rog, I. Vattulainen. Mechanism of allosteric regulation of beta2-adrenergic receptor by cholesterol. eLife. 2016;5.

48. Shan J., G. Khelashvili, S. Mondal, E. L. Mehler, H. Weinstein. Ligand-dependent conformations and dynamics of the serotonin 5-HT(2A) receptor determine its activation and membrane-driven oligomerization properties. PLoS Comput Biol. 2012;8(4):e1002473.

49. Marino K. A., D. Prada-Gracia, D. Provasi, M. Filizola. Impact of Lipid Composition and Receptor Conformation on the Spatio-temporal Organization of mu-Opioid Receptors in a Multi-component Plasma Membrane Model. PLoS Comput Biol. 2016;12(12):e1005240.


Appendix B: Supplementary Data for Chapter 5

Figure B-1. Cholesterol binding to the MBD of COX-1 in CG simulations. Data are for the MBDs of the rest of the proteins in the system (proteins #2, #3, and #4), to complement Figure 5-4 in the main text (which shows data for protein #1).


Figure B-2. 2D density profiles for ER-like membrane simulations. For each lipid component the density profile for the upper leaflet is shown. In most cases, we see that the highest localization of lipids corresponds to the cavity of the MBDs. Normalization is done on a per lipid basis (and not across all lipids). Notice the creation of the ‘coronal layer’ around COX-1 for PC and PE lipids, closely resembling the FS layers from Figure 5-2.


Figure B-3. POPC lipid binding to the MBD of COX-1 in AA simulations. Binding of POPC to the MBD of COX-1 utilizes the same cavity within the MBD; however, the binding itself differs. POPC binding is more erratic owing to its higher flexibility within the site. The distances shown here are between the P atom of the bound POPC lipid and either the sidechain carboxyl C atom (for the Glu-493 distance) or the guanidino carbon atom (for the Arg-83 distance).


Figure B-4. Coarse-Grained order parameters for some multicomponent systems. Data are for the complex membrane setup (A), 6% cholesterol (B) and 3% cholesterol (C) content systems. We see noticeable disorder caused by the protein in the upper leaflet in all systems, yet we do not see any change in the lower leaflet of the membrane.


Figure B-5. Lipid count in the close proximity of embedded COX-1 proteins in the complex system. Lipids are grouped according to their headgroup type into the following categories: PC, PE, PS, GM, SM, PA, PI, PIP, LPC. A. Lipid count within a 7Å radius around proteins. B. Lipid count within the perimeter formed by the projection of the outermost residues of the protein. C. Description of how lipids are counted for the lower leaflet. The largest circumference drawn along the outermost points of the protein (colored in black) is projected onto the lower leaflet to define the area for counting lipids. Lipids that fall within that surface are counted and displayed in B.

Figure B-6. Lipid count within 7Å of COX-1 in the complex membrane system. Data are shown for the lower leaflet only. Calculations are done similarly to Figure B-5, but here the MBDs of the proteins are projected onto the lower leaflet to define the surface for counting lipids. Only lipid species that fall within that surface are counted.
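The projection-based counting described for Figures B-5 and B-6 can be sketched as a point-in-polygon problem in the x-y plane. The sketch below approximates the projected outline of the protein by the convex hull of its x-y coordinates and counts the lipid positions that fall inside or on it. Using a convex hull, and the function names and toy coordinates, are assumptions made for illustration; the outline used in the actual analysis may be defined differently.

```python
def cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull, p):
    """True if p is inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def count_lipids_in_footprint(protein_xy, lipid_xy):
    """Number of lipid positions within the projected protein outline."""
    hull = convex_hull(protein_xy)
    return sum(inside(hull, p) for p in lipid_xy)


# Toy example (hypothetical x-y coordinates of protein and lipid particles).
protein_xy = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
lipid_xy = [(1, 1), (3, 3), (5, 5), (-1, 2), (4, 2)]
n_inside = count_lipids_in_footprint(protein_xy, lipid_xy)
print(n_inside)
```

Because only x-y coordinates enter the test, the same outline can be applied to lipids in the lower leaflet even though the protein itself does not span that leaflet.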


Table S1. Detailed overview of simulated systems. For each system we show the lipid composition as well as the total simulation time.

| Resolution | Description | Composition (%) | Simulation time (μs) |
|---|---|---|---|
| CG | Complex Membrane Setup / Plasma membrane | See Table S2 (.xlsx) | 30 |
| CG | ER-like membrane setup w/ 3% cholesterol | See Table S3 (.xlsx) | 10 |
| CG | ER-like membrane setup w/ 5% cholesterol | See Table S3 (.xlsx) | 4 |
| CG | ER-like membrane setup w/ 6% cholesterol | See Table S3 (.xlsx) | 10 |
| CG | Large POPC only | POPC (1141 lipids) | 10 |
| AA | Low CHOL | CHOL:13; POPC:87 | 1 |
| AA | High CHOL | CHOL:40; POPC:60 | 0.45 |
| AA | 40% Arachidonic Acid (ARAN) | ARAN:40; POPC:60 | 0.95 |
| AA | 40% Arachidonic Acid (ARANP) | ARANP:40; POPC:60 | 0.95 |
| AA | Large POPC membrane | POPC:100 (1140 lipids) | 0.2 |
| AA | Large POPC membrane / surface tension: 50 | POPC:100 (1140 lipids) | 0.275 |
| AA | Large POPC membrane / surface tension: 300 | POPC:100 (1140 lipids) | 0.25 |
| AA | Large POPC membrane / GROMOS54A7 ff | POPC:100 (1140 lipids) | 0.2 |
| AA | Large POPC membrane / histidine protonation state | POPC:100 (1140 lipids) | 0.075 |
| AA | Reference POPC bilayer / no protein | POPC:100 (1140 lipids) | 0.2 |


Appendix C: Copyright Permissions

C.1 Permissions from Biophysical Journal (CellPress)

As an author, I retain the right to “Include the article in full or in part in a thesis or dissertation (provided that this is not to be published commercially)”. See screenshots below:


C.2 Permissions from ACS.

The article is marked as “ACS Author Choice”, leading to different terms of use. Therefore, RightsLink®, the automated permission service, cannot be used, and the journal support services have to be contacted directly. See below:

My correspondence with the ACS Publications Support, granting me permission, is provided below. The consent of all other authors in the article granting me permission to use parts of it in my thesis is also attached (personal emails and affiliations have been redacted for privacy reasons).


Dear Besian Sejdiu,

Your permission requested is granted and there is no fee for this reuse. In your planned reuse, you must cite the ACS article as the source, add this direct link https://pubs.acs.org/doi/full/10.1021/acs.chemrev.8b00451, and include a notice to readers that further permissions related to the material excerpted should be directed to the ACS.

If you need further assistance, please let me know.

Sincerely,

Raquel Picar-Simpson ACS Publications Support Customer Services & Information Website: https://help.acs.org/

Incident Information: Incident #: 3534626 Date Created: 2020-05-20T13:28:27 Priority: 3 Customer: Besian Sejdiu Title: Permission to use article -- DOI: 10.1021/acs.chemrev.8b00451 Description: Dear Support Staff,

I would like to use the following article as a chapter in my thesis: https://pubs.acs.org/doi/abs/10.1021/acs.chemrev.8b00451 Specifically, I would like to use the section on G Protein-Coupled Receptors (section 2) as well as the associated figures (Figures 1-3).

I am also one of the authors in the review article.

Thank you, Besian
