
Prediction of the odorant binding site of olfactory receptor proteins by human–mouse comparisons

ORNA MAN, YOAV GILAD, AND DORON LANCET

Department of Molecular Genetics and the Crown Human Genome Center, The Weizmann Institute of Science, Rehovot 76100, Israel

(RECEIVED July 6, 2003; FINAL REVISION September 29, 2003; ACCEPTED October 1, 2003)

Protein Science (2004), 13:240–254. Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society

Reprint requests to: Doron Lancet, Department of Molecular Genetics and the Crown Human Genome Center, The Weizmann Institute of Science, Rehovot 76100, Israel; e-mail: [email protected]; fax: 972-8-9344487.

Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.03296404.

Abstract

Olfactory receptors (ORs) are a large family of proteins involved in the recognition and discrimination of numerous odorants. These receptors belong to the G-protein coupled receptor (GPCR) hyperfamily, for which little structural data are available. In this study we predict the binding site residues of OR proteins by analyzing a set of 1441 OR protein sequences from mouse and human. The central insight utilized is that functional contact residues would be conserved among pairs of orthologous receptors, but considerably less conserved among paralogous pairs. Using judiciously selected subsets of 218 ortholog pairs and 518 paralog pairs, we have identified 22 sequence positions that are both highly conserved among the putative orthologs and variable among paralogs. These residues are disposed on transmembrane helices 2 to 7, and on the second extracellular loop of the receptor. Strikingly, although the prediction makes no assumption about the location of the binding site, these amino acid positions are clustered around a pocket in a structural homology model of ORs, mostly facing the inner lumen. We propose that the identified positions constitute the odorant binding site. This conclusion is supported by the observation that all but one of the predicted binding site residues correspond to ligand-contact positions in other rhodopsin-like GPCRs.

Keywords: orthologs; paralogs; G-protein coupled receptors; homology modeling

Supplemental material: see www.proteinscience.org

Olfaction, the sense of smell, is a versatile mechanism for detecting odorous molecules. The initial step of the olfactory biochemical cascade is the interaction of an odorant with an olfactory receptor (OR) protein, embedded in the ciliary membrane of olfactory sensory neurons. ORs constitute the largest mammalian gene superfamily, including more than 1000 genes and pseudogenes (Fuchs et al. 2001; Glusman et al. 2001; Young et al. 2002; Zhang and Firestein 2002). ORs are members of the hyperfamily of G-protein coupled receptors (GPCRs; http://www.gpcr.org/7tm/seq/001_005/001_005.html), and more specifically are rhodopsin-like GPCRs, integral membrane proteins with seven helical transmembrane (TM) domains and an extracellular N terminus.

A large majority of ORs are semiorphan receptors, meaning that although they are known to bind odorants, the specificity of each receptor for target ligands is not available in most cases. This is largely due to the relative difficulty in functional expression of these proteins in heterologous expression systems (Gimelbrant et al. 1999). Also, to date, no experimentally determined structure of an OR protein exists in the literature. Consequently, relatively little is known about protein structural attributes of ligand recognition in ORs. The sequencing of the first OR proteins revealed that TM helices 3 to 6 were more variable between paralogs, relative to the rest of the protein (Buck et al. 1991). Based on the notion that in a large protein repertoire, geared to recognize thousands of ligands, contact positions would show pronounced variability between paralogs (Wu and Kabat 1970), these segments were hypothesized to participate in odorant binding (Buck et al. 1991). Later studies have attempted to predict odorant binding residues in olfactory receptors based upon sequence analysis, docking simulations using structural models, and predictions combining sequence analysis with structure information. Some of the earlier attempts included correlated mutation analysis used to identify eight contact positions (Singer et al. 1995a) and positive selection moments, which predicted three specificity-determining residues within TM6 (Singer et al. 1996).

Additional studies predicted ligand-contact residues by computer-based docking of odorants to structural models of the receptors (Afshar et al. 1998; Floriano et al. 2000; Singer 2000; Vaidehi et al. 2002). Together, these studies predicted 22 putative contact residues, located on TMs 3 to 7 in their models. In an elaboration of the original variability detection concept, analysis of the TM regions of ∼200 OR paralog sequences combined with a low-resolution structural homology model allowed the prediction of 17 olfactory complementarity determining residues (CDRs; Pilpel and Lancet 1999). The predicted 17 positions were suggested to constitute a hypervariable odorant binding site, similar to that of immunoglobulins. This analysis was subsequently enhanced by introducing comparisons of ortholog pairs. The hypothesis in this case was that functional residues would tend to be conserved in orthologs, assuming that such pairs may recognize the same or similar odorant ligands. In a limited analysis (Lapidot et al. 2001), which included six human–mouse OR orthologous pairs, 16 of the 17 originally predicted CDRs (Pilpel and Lancet 1999) displayed low interortholog variability and high interparalog variability. A more recent study by Kondo et al. (2002) similarly predicted binding site residues by identifying positions variable between two different OR paralogs but fully conserved among five fish orthologs of each. They identified 14 potential contact residues dispersed on TMs 3, 5, 6, and 7.

The resolution of both the human and mouse complete OR subgenomes (Fuchs et al. 2001; Glusman et al. 2001; Young et al. 2002; Zhang and Firestein 2002) provided large sets of paralog and putative ortholog OR pairs. In this study we predict the binding site of ORs in an analysis that is unbiased by a priori assumptions as to the location of the binding site, using a large number of sequences from both humans and the mouse. This is done by identifying sequence positions with high conservation within ortholog pairs but with significantly lower sequence preservation in paralog pairs. A similar approach has recently been successful in the prediction of the binding sites of bacterial transcription factors and eukaryotic and prokaryotic protein kinases (Mirny and Gelfand 2002; Li et al. 2003). However, the exact [...] paucity of functional data. We therefore developed an alternative methodology, which uses sequence pairs.

Results

Identifying putative odorant binding site residues

To identify potential odorant binding site residues, we searched for positions that are both highly conserved within ortholog pairs and significantly less conserved within paralog pairs. Underlying our analysis were three assumptions. First, that signal transduction in OR proteins occurs through the propagation of structural changes from the functional contact residues to the highly conserved putative G-protein interface (Pilpel and Lancet 1999). Therefore, the structural locations, and as a result the alignment positions of the binding site residues, would be largely shared by all ORs. Second, that orthologs have similar odorant specificities, and are therefore likely to show conservation at odorant recognition positions. Finally, that paralogs would be inclined to differ in their odorant specificities, and hence in their contact amino acids (Buck et al. 1991; Pilpel and Lancet 1999).

As a first step towards the prediction of the odorant binding site we wanted to identify positions that are highly conserved within OR ortholog pairs. To this end we selected a set of 218 predicted OR ortholog pairs, using conservative cutoff criteria of bearing mutual best-hit relationship and having higher than 77% sequence identity. Figure 1 illustrates the phylogenetic relationships captured by the ortholog selection criteria. We then calculated the positional conservation, C, in the predicted OR ortholog set (Fig. 1A), and compared it to the conservation expected solely due to the overall sequence identity among the ortholog pairs (0.838 ± 0.003). We found 146 positions to be significantly conserved within orthologous OR pairs with a false discovery rate (FDR) of 0.05, as assessed by a modified chi-square test (Fig. 1B).

The large number of positions found to be conserved within orthologous pairs suggested that this group of positions also contains, in addition to the odorant binding site positions, positions that are important for maintaining the OR structure and for interaction with partners common to all ORs. Therefore, a control group of OR pairs that share all structural and functional features except odorant specificity was needed to filter out positions that are conserved within ortholog pairs but do not participate in odorant binding. Based on the assumption that contact residues would tend to differ between paralogs, we selected paralog pairs as our control. Positions conserved among the pairs of paralogs to the same extent or more than among the pairs of the [...]
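The ortholog-versus-paralog filter described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: this excerpt does not define the positional conservation C or the modified chi-square test, so the sketch assumes that conservation at a position is the fraction of pairs with identical residues, substitutes an exact one-sided binomial test against the expected identity, and assumes Benjamini-Hochberg control of the FDR. All function and parameter names are hypothetical, and both pair sets are assumed to share one common alignment.

```python
from math import comb


def positional_conservation(pairs):
    """Per-position identity counts over aligned sequence pairs.

    Assumption: a position counts as conserved in a pair when both
    residues are identical; pairs with a gap ('-') there are skipped.
    Returns (identical_counts, counted_pairs), one entry per position.
    """
    length = len(pairs[0][0])
    identical = [0] * length
    counted = [0] * length
    for seq_a, seq_b in pairs:
        for i, (x, y) in enumerate(zip(seq_a, seq_b)):
            if x == "-" or y == "-":
                continue
            counted[i] += 1
            if x == y:
                identical[i] += 1
    return identical, counted


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly.

    Stand-in for the paper's modified chi-square test (an assumption).
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def benjamini_hochberg(p_values, fdr=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff = rank  # largest rank that still passes its threshold
    return sorted(order[:cutoff])


def candidate_positions(orthologs, paralogs, expected_identity, fdr=0.05):
    """Positions conserved in orthologs beyond expectation and strictly
    less conserved among paralogs than among orthologs."""
    o_same, o_n = positional_conservation(orthologs)
    p_same, p_n = positional_conservation(paralogs)
    p_values = [
        binomial_upper_tail(k, n, expected_identity)
        for k, n in zip(o_same, o_n)
    ]
    conserved = benjamini_hochberg(p_values, fdr)

    def fraction(k, n):
        return k / n if n else 0.0

    return [
        i for i in conserved
        if fraction(p_same[i], p_n[i]) < fraction(o_same[i], o_n[i])
    ]
```

Under these assumptions, the overall ortholog identity reported in the text (0.838) would be passed as `expected_identity`, and positions conserved among paralogs to the same extent as among orthologs are dropped by the final comparison.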