Predicting Coupling Probabilities of G-Protein Coupled Receptors Gurdeep Singh1,2,†, Asuka Inoue3,*,†, J

Published online 30 May 2019 Nucleic Acids Research, 2019, Vol. 47, Web Server issue W395–W401 doi: 10.1093/nar/gkz392 PRECOG: PREdicting COupling probabilities of G-protein coupled receptors Gurdeep Singh1,2,†, Asuka Inoue3,*,†, J. Silvio Gutkind4, Robert B. Russell1,2,* and Francesco Raimondi1,2,* 1CellNetworks, Bioquant, Heidelberg University, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany, 2Biochemie Zentrum Heidelberg (BZH), Heidelberg University, Im Neuenheimer Feld 328, 69120 Heidelberg, Germany, 3Graduate School of Pharmaceutical Sciences, Tohoku University, Sendai, Miyagi 980-8578, Japan and 4Department of Pharmacology and Moores Cancer Center, University of California, San Diego, La Jolla, CA 92093, USA Received February 10, 2019; Revised April 13, 2019; Editorial Decision April 24, 2019; Accepted May 01, 2019 ABSTRACT great use in tinkering with signalling pathways in living sys- tems (5). G-protein coupled receptors (GPCRs) control multi- Ligand binding to GPCRs induces conformational ple physiological states by transducing a multitude changes that lead to binding and activation of G-proteins of extracellular stimuli into the cell via coupling to situated on the inner cell membrane. Most of mammalian intra-cellular heterotrimeric G-proteins. Deciphering GPCRs couple with more than one G-protein giving each which G-proteins couple to each of the hundreds receptor a distinct coupling profile (6) and thus specific of GPCRs present in a typical eukaryotic organism downstream cellular responses. Determining these coupling is therefore critical to understand signalling. Here, profiles is critical to understand GPCR biology and phar- we present PRECOG (precog.russelllab.org): a web- macology. Despite decades of research and hundreds of ob- server for predicting GPCR coupling, which allows served interactions, coupling information is still missing for users to: (i) predict coupling probabilities for GPCRs many receptors and sequence determinants of coupling- specificity are still largely unknown. However, it is clear to individual G-proteins instead of subfamilies; (ii) that, in contrast to e.g. enzyme specificities (7), simple visually inspect the protein sequence and structural amino acid differences explaining coupling differences are features that are responsible for a particular cou- rare. pling; (iii) suggest mutations to rationally design arti- Here, we present a machine learning-based predictor ﬁcial GPCRs with new coupling properties based on (PRECOG) of Class A GPCR/G-protein couplings, which predetermined coupling features. was developed as a part of the most systematic quantifica- tion of GPCR coupling selectivity to date (8). PRECOG was built by exploiting experimental binding affinities of INTRODUCTION 144 human Class A GPCRs for 11 chimeric G-proteins ob- G-protein coupled receptors (GPCRs) are the largest class tained through the TGF␣ shedding assay (9). We derived a of cell-surface receptors and the target for 30% of marketed set of sequence- and structure-based features that were sta- drugs (1,2). They are responsible for transducing a myriad tistically associated with each of 11 G-proteins, which we of stimuli from the extracellular environment to activate used to devise predictive models. multiple intracellular signalling pathways. They do so by Given one or more input sequences or Uniprot pro- coupling to one or more heterotrimeric G-proteins, whose tein accessions (or gene symbols), PRECOG provides both ␣-subunits are grouped into four major G-protein families: overview predictions for each G-protein and putative mech- Gs,Gi/o,Gq/11 and G12/13 (3). Aberrant coupling of GPCRs anistic insights into how each prediction was made. De- to G-proteins has been linked to several pathological pro- terminants of coupling-specificity are displayed on the se- cesses and diseases such as cardiovascular and mental dis- quence and on available (known or homologous) 3D struc- orders, retinal degeneration, AIDS and cancer (4). Untan- tures. Users can also assess the impact of mutations on gling GPCR/G-protein coupling can also aid the design of GPCR/G-protein coupling with respect to the wild type. chemogenetic tools, such as Designer Receptors Exclusively We provide views that can aid users in selecting muta- Activated by Designer Drugs (DREADDs), that can be of tions that can help alter coupling specificity and ultimately *To whom correspondence should be addressed. Tel: +49 6221 54 51 362; Fax: +49 6221 54 51 486; Email: [email protected] Correspondence may also be addressed to Asuka Inoue. Email: [email protected] Correspondence may also be addressed to Robert B. Russell. Email: [email protected] †The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors. C The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. W396 Nucleic Acids Research, 2019, Vol. 47, Web Server issue to design receptors having specific couplings. We have al- Hidden Markov Model (HMM) from Pfam (2016 release) ready used PRECOG to predict coupling preferences of all (12). We then subdivided sequences into positives (coupled; human GPCRs, as well as to design a chemogenetic tool LogRAi ≥ –1) and negatives (not-coupled; LogRAi <-1) (DREADD) specific for GNA12 (8). for each G-protein. We then extracted sub-alignments and constructed their corresponding HMM profiles (coupled vs. MATERIALS AND METHODS not-coupled for 11 G-proteins) using HMMbuild (11). For a given G-protein, we then extracted positions show- Coupling data for 144 class A GPCRs and 11 chimeric G- ing statistically significant differences in terms of the amino proteins from the TGF␣ shedding assay acid bit-scores (Wilcoxon’s signed-rank test; P-value ≤ 0.05) To train a predictor for G-protein coupling specificity, we among the coupled and uncoupled HMMs. Alignment po- exploited data from the TGF␣ shedding assay, which is sitions with consensus columns (i.e. having a fraction of a robust, high-throughput means to measure accumulated non-gaps equal or greater than the symfrac parameter, con- GPCR signals (8,9). This approach exploits a ADAM17- sidering a default value of 0.5) present in either HMMs, induced ectodomain shedding of alkaline phosphatase- were considered as either insertion or deletion if they were fused TGF␣ (AP-TGF␣) and chimeric G-proteins where present only in the coupled or not-coupled group. We also the 11 unique C-termini (which have previously been shown included length and amino acid composition of the third in- to account for most of the coupling specificity) from hu- tracellular loop (ICL3) and C-terminus (C-term) consider- man G␣ subunits replace the last 6 amino acids of GNAQ. ing features showing statistically significant differences (P- Chimeric G-proteins are expressed in cells lacking endoge- value < 0.05; Wilcoxon’s rank-sum test) in coupled vs not- nous G␣ subunits (GNAQ, GNA11, GNA12 and GNA13) coupled. that mediate the AP-TGF␣ shedding response. This means We employed the Ballesteros/Weinstein (B/W) scheme that induction of specific GPCRs with titrated concentra- (13) to number alignment positions (using GPCRDB (14) tion of their ligands leads to binding to the co-transfected to define the most conserved position). For positions lying G-protein partner. AP-TGF␣ release signals over titrated outside of the transmembrane helices (e.g. ICL3), we note concentrations were fitted with a sigmoidal concentration- the corresponding Pfam 7tm 1 position in parenthesis. We integrated the above sequence-based feature set with response curve, from which we obtained EC50 and Emax additional structure-based features derived from available values. For each chimeric G␣ condition, an Emax/EC50 3D complex structures of Class A GPCRs/G-proteins value was normalized by the maximum Emax/EC50 value among the 11 G␣ chimeras (relative intrinsic activity, RAi through the InterPreTS approach (15,16), which uses (10)). The base-10 log-transformed values (LogRAi), rang- learned parameters of amino-acid pair contacts across pro- ing from –2 to 0 (100-fold in linear range), represent cou- tein interfaces (i.e. statistical potentials) to predict how well pling indices. We have shown that the chimeric G-proteins, aligned homologues fit on to a particular interface of known with their C-termini, are capable of reporting a reliable cou- structure. We selected six GPCR-G protein complex struc- pling across the four G-protein families (8). Functional as- tures covering the most diverse interaction interface reper- says were performed systematically for 144 representative toire (considering both receptors and G-proteins): ADRB2- Class A GPCRs. In order to define a LogRAi threshold for GNAS (PDB ID: 3SN6), ADORA2A-miniGNAS (6GDG), true couplings, we compared our dataset with reported cou- RHO-GNAI1 (6CMO), Oprm1-GNAI1 (6DDE), ADORA1- plings from the IUPHAR/BPS Guide to PHARMACOL- GNAI2 (6D9H), HTR1B-GNAO1 (6G79). For each com- OGY (GtoPdb) (6) through a Receiver Operating Charac- plex, we aligned GPCR and chimeric G␣ subunit sequences teristic (ROC) analysis, which suggested a cutoff of LogRAi from the TGF␣ shedding assay to sequences homologous ≥ –1.0 (optimizing

Predicting Coupling Probabilities of G-Protein Coupled Receptors Gurdeep Singh1,2,†, Asuka Inoue3,*,†, J

A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus

Glioma Cell Secretion: a Driver of Tumor Progression and a Potential Therapeutic Target Damian A

G-Protein ␤␥-Complex Is Crucial for Efficient Signal Amplification in Vision

Supplementary Table 1. Pain and PTSS Associated Genes (N = 604

Recurrent GNAQ Mutation Encoding T96S in Natural Killer/T Cell Lymphoma

Novel Driver Strength Index Highlights Important Cancer Genes in TCGA Pancanatlas Patients

G Protein Alpha Inhibitor 1 (GNAI1) Rabbit Polyclonal Antibody Product Data

GNAT1 (NM 144499) Human Tagged ORF Clone Product Data

G Protein-Coupled Receptors

ANALYSIS Doi:10.1038/Nature14663

Identification of Potential Key Genes and Pathway Linked with Sporadic Creutzfeldt-Jakob Disease Based on Integrated Bioinformatics Analyses

Screening of Potential Genes and Transcription Factors Of