bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 A transcriptomic atlas of mammalian olfactory mucosae

2 reveals an evolutionary influence on food odor detection in

3 humans

4

5 Luis R. Saraiva1,2,3,4,*; Fernando Riveros-McKay2; Massimo Mezzavilla1; Eman H. Abou-

6 Moussa1; Charles J. Arayata4; Casey Trimmer4, Ximena Ibarra-Soria2; Mona Khan5; Laura Van

7 Gerven6; Mark Jorissen6; Matthew Gibbs7; Ciaran O’Flynn7; Scott McGrane7; Peter Mombaerts5;

8 John C. Marioni2,3; Joel D. Mainland4,8; Darren W. Logan2,4,7,*

9

10 1 Sidra Medicine, PO Box 26999, Doha, Qatar.

11 2 Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton-Cambridge, CB10

12 1SD, United Kingdom.

13 3 European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory,

14 Wellcome Genome Campus, Hinxton-Cambridge, CB10 1SD, United Kingdom.

15 4 Monell Chemical Senses Center, Philadelphia, PA 19104, USA.

16 5 Max Planck Research Unit for Neurogenetics, Max von-Laue-Strasse 4, 60438 Frankfurt,

17 Germany.

18 6 Department of ENT-HNS, UZ Leuven, Herestraat 49, 3000 Leuven, Belgium.

19 7 Waltham Centre for Pet Nutrition, Leicestershire, LE14 4RT, UK.

20 8 Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, 19104, USA.

21

22

23 * Correspondence and requests for materials should be addressed to L.R.S. (email:

24 [email protected], twitter: @saraivalab) or D.W.L. (email: [email protected])

1

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

25 ABSTRACT

26 The mammalian olfactory system displays species-specific adaptations to different ecological

27 niches. To investigate the evolutionary dynamics of olfactory sensory neuron (OSN) sub-types

28 across 95 million years of mammalian evolution, we applied RNA-sequencing of whole olfactory

29 mucosa samples from mouse, , dog, marmoset, macaque and human. We find that OSN

30 subtypes representative of all known mouse chemosensory receptor families are present

31 in all analyzed species. Further, we show that OSN subtypes expressing canonical olfactory

32 receptors (ORs) are distributed across a large dynamic range and that homologous subtypes

33 can be either highly abundant across all species or species/order-specific. Interestingly, highly

34 abundant mouse and human OSN subtypes detect odorants with similar sensory profiles, and

35 sense ecologically relevant odorants, such as mouse semiochemicals or human key food

36 odorants. Taken together, our results allow for a better understanding of the evolution of

37 mammalian olfaction in and provide insights into the possible functions of highly

38 abundant OSN subtypes in mouse and human.

39

2

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

40 MAIN TEXT

41 Odor detection in mammals is initiated by the activation of olfactory receptors (ORs) expressed

42 in olfactory sensory neurons (OSNs), which populate the whole olfactory mucosa (WOM) (Buck

43 and Axel, 1991). Most mature OSNs predominantly express one allele of a single OR gene

44 (Chess et al., 1994; Hanchate et al., 2015; Malnic et al., 1999; Saraiva et al., 2015b). Smaller

45 subsets of mature OSNs express other families of chemosensors, such as trace-amine

46 associated receptors (TAARs), guanylate cyclases (GCs), or members of the membrane-

47 spanning 4-pass A (MS4As) gene family (Fulle et al., 1995; Greer et al., 2016; Hussain et al.,

48 2009; Liberles and Buck, 2006; Omura and Mombaerts, 2015; Saraiva et al., 2015b). These

49 receptors define the molecular identity and odorant response profile of OSNs, and OSNs apply

50 a combinatorial strategy to discriminate a number of odorants vastly greater than the number of

51 receptors present in the genome (Malnic et al., 1999; Nara et al., 2011; Zhang et al., 2013).

52 From a phylogenetic perspective, OR are divided in two classes: class I, which

53 preferentially bind hydrophilic odorants, and class II, which tend to recognize hydrophobic

54 odorants (Saito et al., 2009; Zhang and Firestein, 2002). The complex evolutionary dynamics of

55 OR genes have resulted in strikingly different species-specific repertoires, which are

56 presumably shaped by the chemosensory information that is required for survival in each

57 species’ niche (Bear et al., 2016; Khan et al., 2015; Nei et al., 2008; Niimura, 2012; Niimura et

58 al., 2014; Niimura and Nei, 2005). While cataloging the presence of orthologous OR genes

59 among species has provided some insight into the drivers of selection (Niimura et al., 2014),

60 knowing the relative abundance of each OSN subtype both within and among species may

61 provide a better understanding of these evolutionary dynamics.

62

63 Using an RNA-sequencing (RNA-seq) based approach, we have profiled previously the

64 complete mouse and OSN repertoires and found that they are stratified into hundreds

65 to thousands of functionally distinct subtypes, represented across a large dynamic range of

3

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

66 abundance in both species (Ibarra-Soria et al., 2014; Saraiva et al., 2015a; Saraiva et al.,

67 2015b). While this OSN distribution is stereotyped among genetically identical mice, it varies

68 greatly among different strains (Ibarra-Soria et al., 2017). These distributions are largely

69 genetically controlled in cis and have thus likely diverged under evolutionary pressures (Ibarra-

70 Soria et al., 2017). The abundance of OSNs expressing a given OR correlates with the total

71 volume of corresponding glomeruli in the olfactory bulb (Bressel et al., 2016). Increasing the

72 number of OR-expressing OSNs in mouse lowers detection thresholds (D'Hulst et al., 2016),

73 raising the possibility that more abundant expression increases sensitivity to the receptor's

74 ligands, providing a mechanism for adaptation to enhance the detection of important ecological

75 olfactory cues. Therefore, olfactory transcriptome analysis is a critical first step towards

76 identifying the most functionally significant among the hundreds of OSN subtypes without

77 identified odorants. Here, we investigated the transcriptional dynamics and putative functions of

78 the olfactory systems of six mammalian species, spanning ~95 million years of evolution.

79

80 We performed RNA-seq on the WOM of male dog (Canis familiaris), mouse (Mus musculus), rat

81 ( norvegicus), marmoset (Callithrix jacchus), macaque (Macaca mulatta), and human

82 (Homo sapiens) (Fig. 1A). We processed three biological replicates for each species, except for

83 macaque, where we profiled four (File S1 for quality metrics, File S2 for all gene expression

84 data). On average, 83.85 ± 1.66 % of the total reads mapped uniquely to each corresponding

85 genome. The intraspecific variability level among WOM replicates is extremely low (spearman’s

86 rho, rs=0.95-0.98, P<0.0001), consistent with previous studies in laboratory (Ibarra-

87 Soria et al., 2014; Ibarra-Soria et al., 2017; Saraiva et al., 2015a).

88

89 Gene markers for mature OSNs (mOSNs), immature OSNs (iOSNs), horizontal basal cells

90 (HBCs), and sustentacular cells (SUSs) are highly expressed across all species analyzed (Fig.

91 1B), consistent with previous studies (Saraiva et al., 2015a; Saraiva et al., 2015b). Genes

4

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

92 indicative of globose basal cells (GBCs) are enriched only in and marmoset. Taken

93 together, we have sampled, processed and sequenced RNA of sufficiently high quality to

94 reproducibly capture the neuronal component of WOM from inbred laboratory species (mice and

95 ) and non-laboratory species (dogs, marmoset, macaques, and humans).

96

97 To investigate broader gene expression dynamics of the WOM across mammalian evolution, we

98 focused on the 9725 genes that: i) have 1:1 orthology across the six species, ii) share ≥40%

99 amino-acid identity with the human ortholog, and iii) are expressed in at least three replicates. A

100 principal component analysis (PCA) of the expression data revealed inter-species differentiation

101 among sub-sets of orthologous genes: PC1 primarily separates rodents from primates, PC2

102 and PC3 separate new-world monkeys (marmosets) from old-world monkeys (macaques) and

103 hominins (humans), and dogs from all the other species, respectively (Fig. 1C and Fig. S1A, B).

104 Hierarchical clustering (HC) analysis further supports these results (Fig. S1C).

105

106 Several chemosensory receptors and/or unique molecular barcodes define non-canonical (i.e.

107 not expressing OR genes) OSN subsystems in the WOM (Greer et al., 2016; Hussain et al.,

108 2009; Leinders-Zufall et al., 2007; Liberles and Buck, 2006; Omura and Mombaerts, 2015;

109 Saraiva et al., 2015a; Saraiva et al., 2015b). While orthologs for these marker genes exist in

110 most mammals, it remains unclear whether expression is also conserved in their olfactory

111 systems. We retrieved the annotated orthologs for these genes and analyzed their RNA-seq

112 expression estimates. We found that the WOM of all species expresses trace-amine associated

113 receptors (TAARs) and membrane spanning 4-pass A (MS4As) chemosensory receptors (Fig.

114 2A, B). Within each family, the most abundant receptor and relative receptor abundance level

115 varies greatly between species. Although TAAR5 is the most abundantly expressed gene in

116 human and dog, TAAR2 is the most expressed in macaque and marmoset, and Taar6 and

117 Taar7b in mouse and rat, respectively (Fig. 2A). Among the Ms4a gene family, MS4A8B is the

5

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

118 most abundant in human, MS4A14 in macaque, MS4A7/Ms4a7 in marmoset, dog and rat, and

119 Ms4a6b in mouse (Fig. 2B). Moreover, our analysis revealed that the expression pattern of the

120 molecular barcode of OSNs expressing guanylate cyclase genes (GUCY2D/GC-D or

121 GUCY1B2) is also largely conserved (Fig. 2C). In other words, several specialized, non-

122 canonical olfactory sub-systems are likely present across mammals, including human.

123

124 Next, we focused on OR genes, the expression of which are accurate quantitative markers for

125 the abundance of over a thousand OSN subtypes in mouse (Buck and Axel, 1991; Ibarra-Soria

126 et al., 2017). We used uniquely mapped reads to quantify and analyze the distribution of OR-

127 expressing OSN subtypes in the WOM of the six species. Because ORs can be pseudogenes in

128 some individuals but not in others (Griff and Reed, 1995; Mainland et al., 2014), in these

129 analyses we included all ORs annotated as intact and pseudogenes (hereafter referred to as

130 ORs). In these species, the fraction of intact and Class II ORs are higher than the fractions of

131 pseudogenes and Class I OR genes, respectively (Fig S2A). The only exception is human, with

132 a higher fraction (51.56%) of pseudogenes. The OR pseudogene fraction is consistently higher

133 than its relative contribution to the total OR gene expression pool (Fig. S2A), meaning that

134 pseudogenes are, on average, expressed at lower levels than intact genes. A similar pattern

135 emerged when analyzing Class I and Class II ORs (Fig. S2B). Furthermore, we found that

136 expression estimates for ORs are more consistent between replicates from inbred strains (like

137 mouse and rat; rs=0.97-0.98, P<0.0001), than replicates from outbred animals (like macaque

138 and human; rs=0.68-0.77, P<0.0001, Fig. S2C). Within each species, the percentage of all intact

139 ORs expressed (≥1.0 normalized counts) in at least one replicate varies between 83.46% in

140 human and 98.31% in rat (Fig. 2D-I, File S3). In contrast, between 43.9-58.3% of ORs

141 annotated as truncated or pseudogenes lack expression (<1.0 normalized counts) in any of the

142 replicates (Fig. 2D-I, File S3). Importantly, with this approach we were able to quantify a total of

6

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

143 323 intact human ORs (plus additional 145 truncated or pseudogenes), thus increasing the

144 number of detected intact human ORs detected by ~18, when compared to a previous study

145 using an array-based approach (Verbeurgt et al., 2014). Together, these results are indicative of

146 the adequacy of our sampling strategy and of the quality of our WOM samples and suggests

147 that most intact ORs are expressed, and putatively functional, in most species. We found that

148 OR genes are expressed across a large dynamic range in all six species, with only a few being

149 highly expressed (Fig. 2D-I). These results are consistent with previous studies in mouse and

150 zebrafish (Ibarra-Soria et al., 2014; Ibarra-Soria et al., 2017; Saraiva et al., 2015a; Saraiva et

151 al., 2015b). As OR expression correlates positively with the number of OSN subtypes in the

152 mouse and zebrafish WOM (Ibarra-Soria et al., 2017; Saraiva et al., 2015a), our results indicate

153 that a few highly abundant OSN subtypes are present in each mammalian species, with the

154 majority present in relatively low numbers.

155

156 While the overall distribution pattern of the canonical, or OR-expressing, OSN subtypes is

157 similar among species, the relationship among them remains undetermined. Next, we plotted

158 phylogenetic trees overlaid with the relative frequency of OSN subtypes, as defined by the

159 abundance of the OR gene they express. To compare mean OR expression levels between

160 individuals and species, we transformed the plotted values into the percentage of the total OR

161 expression for each individual (File S3). Species-specific phylogenetic trees revealed that there

162 is no apparent clustering of the ORs that define the most abundant canonical OSN subtypes

163 within each species (Fig. S3A). However, within the order Rodentia, mouse and rat display a

164 high level of conservation in OSN subtype frequency (Fig. 3A). In contrast, when comparing

165 abundance levels of OR-expressing OSN subtypes within the order Primates (marmoset,

166 macaque, and human) or between any other species combinations, we observe no apparent

167 conservation in OSN subtype representation (Fig. 3B,C). Nonetheless, because of the high

168 sequence similarity and large gene repertoires, it can be difficult to assign 1:1 orthology among

7

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

169 all OR genes. To circumvent this problem, we used a classification method that sorts ORs into

170 ortholog gene groups (OGGs) (Niimura et al., 2014). We identified 73 OGGs showing complete

171 1:1 orthology (Fig. 3D). In other words, each OGG comprises of a set of a single intact

172 orthologous OR in all six species. Of these 73 OGGs, 18 are populated by Class I ORs and 55

173 by Class II ORs, hereafter referred to as OGG1- and OGG2-, respectively. Hierarchical

174 clustering analysis of OSN subtype abundance divided the 73 OGGs into two major clusters,

175 both containing Class I and Class II ORs. Of the 12 OGGs composing Cluster 1, half (5.67

176 ±0.80) contain species-specific highly expressed OSN subtypes (i.e., above the 90th percentile)

177 (Fig. 3D), which is significantly above chance level for all species (binomial test, two-tail,

178 P=0.000-0.021, File S4), but marmoset (binomial test, two-tail, P=0.085, File S4). In contrast,

179 the 61 OGGs composing Cluster 2 are populated by OSN subtypes that vary greatly in

180 abundance between species, and of which only ~9% are highly expressed (Fig. 3D). This is as

181 expected by chance in all species (binomial test, two-tail, P=0.077-0.182, File S4), except

182 human (binomial test, two-tail, P=0.040, File S4). Collectively, these data show that only in a

183 few cases is phylogenetic conservation positively associated with high OR expression,

184 suggesting that it cannot be used as a reliable predictor of highly abundant OSN subtypes in the

185 WOM.

186

187 RNA-seq expression estimates of chemosensory receptor genes in the WOM are positively

188 correlated with the number of OSNs expressing that given receptor (Ibarra-Soria et al., 2017;

189 Saraiva et al., 2015a). Moreover, increasing the numbers of an OSN subtype may result in

190 increased sensitivity to the odorants that activate these cells (D'Hulst et al., 2016). The

191 interesting hypothesis is raised that for some OSN subtypes, the greater their abundance, the

192 greater their contribution to the perception of its ligand. Under this assumption, unusually highly

193 abundant OSN subtypes could play a role in detecting subsets of odorants that serve a critical

8

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

194 ecological function, such as olfactory threshold or hedonics, semiochemical detection, or food

195 choice.

196 To investigate this hypothesis, we started by identifying ORs above the 90th-percentile of

197 expression in each species, hereafter also referred to as “the most” abundant OSN subtype

198 (File S5). By focusing our analysis in ORs with known ligands, we find that ~36% (26/73) of

199 deorphaned human ORs are above the 90th percentile of expression, which is more than

200 expected by chance (binomial test, two-tail, P<0.0001). In contrast, only ~15% (13/89) of mouse

201 ORs with known ligands are above the 90th percentile, as expected by chance (binomial test,

202 two-tail, P=0.1552).

203

204 We then investigated whether the OSN subtypes expressing ORs in the 73 highly conserved

205 OGGs are more likely to play a role in detecting subsets of related odorants and/or biologically

206 relevant odors. Because most OR-ligand combinations known come from studies in mouse or

207 human, we limited all our downstream analysis to these two species. We found a total of 73

208 unique odorants that activate mouse and/or human ORs in 23 OGGs, of which 12 contain

209 deorphaned mouse ORs, 8 contain deorphaned human ORs, and 3 contain both deorphaned

210 human and mouse ORs (Fig. 3D,E). These odorants have diverse perceived odors in humans

211 and based on their chemical structures we assigned them to 11 different chemical classes. In

212 both human and mouse, carboxylic acids (cheesy/sweaty odor) are the most represented class

213 (accounting for ~25% of all detected odorants) (Fig. 3F). Thiols (sulfurous) and vanillin-like

214 (sweet) odorants account for additional 23% in mouse and human, respectively. Interestingly,

215 these two categories together represent ~44% of all detected odorants in both species. Other

216 categories visibly different between species are aldehydes (fruity, aldehydic) in mouse, and

217 alcohols (floral, fruity) and ketones (fruity, floral, buttery) in human. Interestingly, terpenes

218 (green, minty) and azines (animalic, pungent) were detected only by human, and camphors

219 exclusively by mouse alone. Furthermore, we found that 10/11 OGGs (e.g., OR5A1, OR11H6

9

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

220 and OR6X1) containing deorphaned human ORs that detect molecules (e.g., β-ionone,

221 isovaleric acid and (-)-carvone) defined as human key food odorants (KFOs, Fig. 3D,E and Fig.

222 S3B), which are relevant ecological odorants (Dunkel et al., 2014). In one additional OGG, a

223 mouse OR (Olfr1509) is activated by the odorant (methylthio)methanethiol (Fig. 3D,E and S3B),

224 which is known to be a mouse semiochemical (SMC)(Lin et al., 2005). Taken together, these

225 results suggest that ORs displaying high levels of 1:1 orthology across mammals play a critical

226 olfactory role, as they are tuned to recognize specific classes of odorants, which include

227 ecologically relevant odors.

228

229 The experiments above revealed a striking feature of the highly conserved, and most abundant

230 mammalian OSN subtypes, and raise two important additional questions: Are highly abundant

231 OSN subtypes within a species biased towards specific odorants? Do the most abundant OSN

232 subtypes in different mammalian species share similar biases? To answer these questions, we

233 first investigated possible biases in the sensory profiles (defined by their human odor qualities)

234 and the physicochemical properties of odorants recognized by the most abundant human and

235 mouse OSN subtypes. Unexpectedly, we found that the sensory profiles of odorants activating

236 exclusively the mouse or human OSN subtypes above the 90th percentile are highly overlapping

237 and highly correlated (rs=0.6019, P=0.0227, Fig. 4A). We observed only minor differences

238 between the two species, with humans preferring floral, spicy, dairy and sweaty/pungent smells,

239 and mouse showing a bias towards odorants perceived by humans as nutty,

240 minty/camphor/menthol and sulfurous/meaty odors. PCA analysis of the physicochemical

241 properties of these odorants showed similar results, with all odorants intermingling in the odor

242 space (Fig. 4B).

243

244 Next, we analyzed the expression levels of all human and mouse OSN subtypes expressing

245 ORs that detect KFOs, SMCs or other odorants. The analysis revealed an enrichment above the

10

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

246 90th-percentile of expression for OSN subtypes in both human (binomial test, two-tail, P<0.0001)

247 and mouse (binomial test, two-tail, P= 0.0086) detecting KFOs and SMCs, respectively (Fig.

248 4C,D). Interestingly, we found that the five human ORs known to contribute to odor perception

249 and hedonics detect KFOs, and the two (OR7D4 and OR5A1) playing a role in food preferences

250 (Jaeger et al., 2013; Keller et al., 2007; Lunde et al., 2012; Mainland et al., 2014; McRae et al.,

251 2012; Menashe et al., 2007) are among the most abundant OSN subtypes (Fig. 4C). Similarly,

252 three mouse ORs (Olfr1509, Olfr1395 and Olfr1440) detecting at least one SMC (Jiang et al.,

253 2015; Lin et al., 2005; Saito et al., 2017; Sato-Akuhara et al., 2016; Yoshikawa et al., 2013) are

254 also above the 90th-percentile (Fig. 4D). Intriguingly, OSN subtypes detecting other odorants are

255 also enriched in human (binomial test, two-tail, P= 0.0086) but not in mouse (binomial test, two-

256 tail, P= 0.3730) (Fig. S4A,B), raising the hypothesis that these enrichments could be derived

257 from the fact that highly abundant ORs are more likely to be deorphaned (see results above).

258

259 In order to obtain a second line of evidence supporting the putative enrichment (above the 90th-

260 percentile of expression) of OSN subtypes sensing human KFOs and mouse SMCs, we

261 compared the mean expression values of these subtypes against the ones binding only other

262 odorants. We found that human ORs detecting at least one KFO have on average ~2.4-fold

263 higher expression levels than ORs detecting other odorants (unpaired t-test with Welch’s

264 correction, two-tail, P=0.0030; Fig. 4E). Moreover, a similar analysis with mouse ORs revealed

265 no significant differences in expression between OSN subtypes detecting human KFOs or other

266 odorants (Fig. 4F). In line with these results, the average expression levels of mouse ORs

267 detecting at least one SMC is ~2.7-fold higher than ORs not detecting SMCs (unpaired t-test

268 with Welch’s correction, two-tail, P=0.0004; Fig. 4G). Furthermore, a similar analysis for human

269 revealed no significant differences in expression between ORs detecting mouse SMCs or other

270 odorants (Fig. 4H).

271

11

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

272 DISCUSSION

273 Genes involved in sensing an ’s immediate environment are under strong evolutionary

274 pressure (Behrens et al., 2014; Hayden et al., 2010; Niimura, 2009; Niimura and Nei, 2005,

275 2006; Risso et al., 2017; Saraiva and Korsching, 2007; Shichida and Matsuyama, 2009; Young

276 et al., 2010). From identifying kin, food sources and sexually receptive mates to avoiding

277 predation and disease, appropriate perception of environmental sensory cues is critical for

278 survival and reproduction. The importance of sensing the chemical environment is reflected in

279 the genetic investment in encoding ORs, which comprise the largest gene families in mammals.

280 Since each mature OSN expresses just one allele of one receptor gene (Chess et al., 1994;

281 Malnic et al., 2004; Saraiva et al., 2015b), calculating total mRNA abundance of each OR in the

282 WOM samples permitted us to assess which receptors have been favorably selected for

283 expression in OSNs (Ibarra-Soria et al., 2017). Here, we profiled the transcriptomes of the WOM

284 samples from six mammalian species from three different orders (Carnivora, Rodentia, and

285 Primates), covering ~95 million years of mammalian evolution.

286

287 We first investigated the molecular markers of different cell types composing the WOM. While

288 markers for most major olfactory cell types were identified in the WOM of each of the mammals,

289 they occur at different abundances, possibly reflecting relative differences in cell type

290 composition. These differences are consistent with the variations in total numbers of OSNs

291 across mammalian species, which can range from ~6-225 million (Horowitz et al., 2014;

292 Kawagishi et al., 2014; Moran et al., 1982; Muller, 1955; Youngentob et al., 1997). As these

293 molecular signatures are already present in the WOM samples from zebrafish (Saraiva et al.,

294 2015a), including the most recently discovered subtype expressing MS4A (data not shown),

295 these results are consistent with an ancient origin of all mammalian OSN subtypes, arising at

296 least 450 million years ago.

297

12

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

298 Conservation was also apparent at the whole transcriptome level. Expression level divergence

299 of highly conserved singleton orthologs between human and other species does not scale with

300 evolutionary time, suggesting a tight regulation of expression of such genes in the mammalian

301 olfactory system. This finding is not unprecedented, as tissue-specific expression profiles can

302 display a wide-spectrum of evolutionary paths (Brawand et al., 2011; Chan et al., 2009; Merkin

303 et al., 2012).

304

305 Canonical OR genes are shaped by multiple gene birth and death events, which result in

306 species-specific repertoires with strikingly different numbers of intact genes, pseudogenes,

307 Class I and Class II ORs (Nei et al., 2008). Most intact OR genes show evidence for expression

308 in at least one replicate and most pseudogenes are not expressed, consistent with previous

309 studies (Ibarra-Soria et al., 2014; Ibarra-Soria et al., 2017; Saraiva et al., 2015a; Saraiva et al.,

310 2015b). We observe that the fraction of total OR gene expression originating from pseudogenes

311 is greater for species that are outbred, such as macaque and human. This observation is

312 consistent with greater inter-individual OR genomic variation in outbred species, resulting in a

313 higher fraction of ORs annotated as pseudogenes in the reference genome for which functional

314 alleles are segregating in the population (Griff and Reed, 1995; Mainland et al., 2014; Menashe

315 et al., 2007).

316

317 To date, the impact of evolutionary pressures on comparative global OR gene expression

318 across different mammalian species, and hence the corresponding OSN subtype distribution

319 (Ibarra-Soria et al., 2017; Saraiva et al., 2015a), were unknown. We found that the global

320 distribution profile of OSN subtypes expressing ORs is surprisingly similar in the WOM samples

321 from different mammalian species, with the subtypes consistently distributed across a large

322 dynamic range. It is unclear why, in all vertebrate species we have analyzed to date, a few OSN

323 subtypes are present at high frequency with the majority having a low abundance. This

13

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

324 distribution pattern may be an emergent property of the complex, multi-step processes that

325 regulate OR singularity (Monahan and Lomvardas, 2015; Tian et al., 2016). The highly

326 abundant OSN subtypes can be consistent between closely related species, like in mouse and

327 rat, but not over greater evolutionary distances. Indeed, we noted above that OR orthology, or

328 the phylogenetic proximity of ORs, is generally a poor predictor of the abundance of the OSN

329 subtypes between species. Together these results suggest the species-specificity of the

330 olfactory sub-genome extends to its regulatory elements. Indeed, ORs expressed at different

331 levels in inbred strains of mice have greater numbers of SNPs upstream of the transcriptional

332 start site compared to those ORs expressed at the same levels. Moreover, these variants

333 impact transcription factor binding sites(Ibarra-Soria et al., 2017).

334

335 How does cataloguing the distribution of OSNs in each species advance our understanding of

336 the ? In humans, loss-of-function variants in the OR5A1 gene have been shown

337 to increase the odor threshold by about 3 orders of magnitude to its ligand ß-ionone, suggesting

338 that this OSN subtype is the most sensitive to this ligand (Jaeger et al., 2013; McRae et al.,

339 2013). Similarly, mutations in OR7D4 significantly contribute to specific anosmias of

340 androstenone and androstadienone (Keller et al., 2007). Consistent with this, we observed that

341 OSNs expressing OR5A1 or OR7D4 are particularly abundant in human WOM (on average

342 ranked 6th and 11th, respectively). Consequently, we propose the more abundant an OSN

343 subtype is, the greater its contribution to the establishment of the detection threshold and

344 perception of its ligands. Genetic variation in other highly expressed ORs may therefore make

345 them strong candidates for causing specific anosmias or hyposmias.

346

347 We hypothesize that OSN subtypes that are more abundant in each species may be the

348 consequence of selection for the sensitive detection of ecologically meaningful odorants for that

349 species. Supporting this notion, we found a significant enrichment in abundance for human

14

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

350 OSNs that detect KFOs, not observed in mouse OSNs. In contrast we found a similar

351 enrichment for mouse SMC in mouse OSNs, but not human OSNs, albeit with a much smaller

352 sample size.

353

354 We propose that the over-representation of specific OSN subtypes is an evolutionary

355 phenomenon: driven by direct selective pressure on genetic elements that promote monogenic

356 OR selection because OSN abundances are, in mouse at least, largely genetically encoded

357 (Ibarra-Soria et al., 2017). However, it remains possible that the distributions of OSN subtypes

358 result from biases in cell survival during neurogenesis, migration, projection or circuit

359 integration. OSN life-span can be altered by odor exposure (Santoro and Dulac, 2012), but

360 when directly compared to the genetic influence the impact of odor environment on OSN

361 abundance is subtle (Ibarra-Soria et al., 2017). Data supporting a model whereby an increase in

362 an OSN subtype abundance driven by odor exposure was trans generationally inherited in mice

363 has been reported (Dias and Ressler, 2014). If this process scaled additively across subsequent

364 generations, our observations could be environmentally driven. However these data have since

365 been challenged on statistical grounds (Francis, 2014). Unless humans differ dramatically from

366 mice in their OSN dynamics, we consider it improbable that human KFO detection in the most

367 abundant OSN subtypes is a physiological response to KFO exposure alone.

368

369 We cannot exclude that ascertainment biases contribute to the association between

370 ethologically relevant odorants and high OSN abundance. For example, it is possible that KFOs

371 and SMCs were differentially enriched in odorant libraries screened against orphan ORs in

372 humans and mice respectively. Moreover, in accordance with a previous study (Verbeurgt et al.,

373 2014), our results suggest that ORs expressed in the most abundant OSN subtypes are more

374 likely to be deorphaned than those expressed in the least abundant OSNs, even though mouse

375 and human ORs have been the subject of systematic deorphanisation efforts (Mainland et al.,

15

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

376 2015; Saito et al., 2009). Future high-throughput studies focused on unraveling the complete

377 ligand activation patterns of all mammalian ORs will be critical to confirming, or refuting, the

378 conclusions from this study.

379

380 ACKNOWLEDGMENTS

381 We thank Hiroaki Matsunami and Yoshihito Niimura for insightful discussions.

382

383

384

385 AUTHOR CONTRIBUTIONS

386 L.R.S. initiated the project, designed experiments, performed experiments, interpreted results,

387 supervised the project and wrote the paper; F.R.M.A. performed data analyses; M.M. performed

388 data analyses; E.H.A-M. performed data analyses; C.J.A. performed data analyses; C.T.

389 performed experiments; X.I.S. performed data analyses; M.K. carried out experiments; L.V.G.

390 performed experiments; M. J. performed experiments; M. G. performed experiments; C. O’F.

391 performed experiments; S. M. performed experiments; P.V.J. performed data analyses; P.M.

392 designed experiments; J.C.M. initiated the project, designed experiments; J.D.M. designed

393 experiments and analyzed data; D.W.L. initiated the project, designed experiments, carried out

394 experiments, interpreted results, supervised the project and wrote the paper.

395

396 ADDITIONAL INFORMATION

397 MG, CO’F and SM were paid employees of Mars Petcare during the period the work was

398 conducted, based at the WALTHAM Centre for Pet Nutrition, and contributed to the analysis of

399 dog ORs only.

400

401 FIGURE LEGENDS

16

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

402 Figure 1 - Conservation of WOM expression signatures across mammals

403 (A) RNA-seq was performed on six species, representing three mammalian lineages: Carnivora,

404 Rodentia, and Primates. Species color scheme used is kept consistent across all figures and is

405 as follows: light green – dog, orange – mouse, brown – rat, dark grey – marmoset, light blue –

406 macaque, fuchsia– human.

407 (B) Heatmap of the expression levels of the canonical markers of the main cell populations

408 present in the WOM samples. RNA expression levels are represented on a log10 scale of

409 normalized counts (NC) plus one (0 - not expressed, 5 - highly expressed). There is

410 conservation of expression among all species. mOSNs: mature OSNs, iOSNs: immature OSNs,

411 GBCs: globose basal cells, HBCs: horizontal basal cells, SUSs: sustentacular cells, RES:

412 respiratory epithelium. No CHL1 ortholog was annotated in the rat genome version analyzed

413 (black squares).

414 (C) Principal component analysis (PCA) of the expression levels for the 9725 ‘one-to-one’

415 orthologs. Percentages of the variance explained by the principal components (PCs) are

416 indicated in parentheses. PC1 separates rodents from primates, PC2 old-world from new-world

417 primates and hominins, and PC3 separates dog from the other five species.

418

419 Figure 2 – Gene expression profiles of nasal chemosensory receptors in mammals

420 (A-B) Unrooted phylogenetic tree and mean expression levels for all TAAR (A) and MS4A (B)

421 receptor orthologs in the six species. Bars indicate the mean contribution (%) of each receptor

422 to the total gene expression within each family, and per species. Red branches indicate pseudo

423 and truncated OR genes. Black branches indicate intact OR genes.

424 (C) Heatmap of the expression pattern for markers of GC-D+ and Gucy1b2+ OSNs in mammals.

425 RNA expression levels are represented on a log10 scale of normalized counts (NC) plus one (0-

426 not expressed, 4- highly expressed). TRPC2 is a pseudogene in human and macaque, and

427 GUCY1B2 is a pseudogene in human.

17

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

428 (D-I) Distribution of mean expression values for each of the OR genes in the WOM of dog (D),

429 mouse (E), rat (F), marmoset (G), macaque (H) and human (I). Genes are displayed in

430 descending order of their mean expression values (normalized counts). Error bars represent the

431 standard error of the mean (SEM) from 3-4 individuals. To the right of each panel, a circular plot

432 shows the percentages of intact and pseudo plus truncated ORs expressed (≥ 1 normalized

433 count) or not expressed (< 1 normalized count) in at least 1 individual. Under the x-axis, the

434 total number of ORs within each species, and the position of the least abundant OR (arrow) are

435 noted.

436

437 Figure 3 – Abundance and ligand biases for highly conserved OR-expressing OSN

438 subtypes across mammalian evolution

439 (A-C) Unrooted phylogenetic trees containing the mean expression levels for all OR genes from

440 the orders Rodentia (A) and Primates (B) separately, and for all six species (C). Bars indicate

441 the mean contribution (%) of each receptor to the total gene expression within each receptor

442 family, and per species. Red branches indicate pseudo and truncated OR genes. Black

443 branches indicate intact OR genes.

444 (D) Heatmap of the expression pattern for the ORs populating the highly conserved 73 OGGs

445 across all species. Two clusters were identified (arrows): Cluster 1 containing highly abundant

446 ORs across all species, and Cluster 2 containing highly abundant ORs only in one or a subset

447 of species. Normalized percentage values of expression are represented on a relative

448 abundance scale (-2- lowly abundant, +2- highly abundant). OGGs containing human and/or

449 mouse deorphaned ORs are indicated by orange and fuchsia circles, respectively. Mouse ORs

450 activated by semiochemicals (SMCs) and human ORs activated by key food odorants (KFOs)

451 are indicated by cyan or dark-green circles, respectively.

452 (E) Odorants recognized by the deorphaned mouse (left column and within orange rectangles)

453 and human (right column and within fuchsia rectangles) ORs. In some, but not all cases, ligands

18

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

454 for the same OR have similar structures. In 1/3 OGGs (OGG2-288) containing both deorphaned

455 mouse and human ORs, the ligands are different in structure and perceived odors.

456 (F) Percentage of odorants, according to their chemical class, activating mouse and/or human

457 ORs.

458

459 Figure 4 – Sensory profile and detection of ecologically relevant odorants by highly

460 abundant ORs/OSN subtypes in mouse and humans

461 (A) The sensory profiles (spider plot) of the odorants detected exclusively by either mouse or

462 human ORs above the 90th-percentile of expression (Venn diagram) are vastly overlapping and

463 significantly positively correlated (rs=0.6019, P=0.0227).

464 (B) PCA of the 696 physicochemical descriptors of odorants detected exclusively by either

465 mouse or human ORs above the 90th-percentile of expression (see lower left in panel A).

466 Percentages of the variance explained by the PCs are indicated in parentheses. The mouse-

467 and human-specific odorants greatly overlap in the odor space.

468 (C-D) Distribution of mean normalized counts (NC) represented on a log10(x+1) scale

469 expression values for each of the OR genes in the human (C) and mouse (D) WOM. OR genes

470 detecting at least one human key food odorant (KFO) or a mouse semiochemical (SMC) are

471 indicated according to their expression percentile (fuschia/orange, above the 90th percentile;

472 black, below the 90th percentile). Known OR-ligand pairs playing a role in olfactory perception,

473 hedonics, or behavior are indicated below the x-axis. Error bars represent the standard error of

474 the mean (SEM) from 3 sample replicates. Binomial test, using Wilson/Brown method to

475 calculate the confidence interval (CI), two-tail.

476 (E-F) The mean expression levels of ORs detecting human KFOs are 2.4 times higher in

477 humans (E), but not in mouse (F). Unpaired t-test with Welch’s correction, two-tail, ***P≤0.001;

478 nsnon-significant (P≥0.05).

19

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

479 (G-H) The mean expression levels of ORs detecting mouse SMCs are 2.7 times higher in

480 mouse (G), but not in humans (H). Unpaired t-test with Welch’s correction, two-tail, ***P≤0.01,

481 nsnon-significant (P≥0.05).

482

483 REFERENCES

484 Bear, D.M., Lassance, J.M., Hoekstra, H.E., and Datta, S.R. (2016). The Evolving Neural and

485 Genetic Architecture of Vertebrate Olfaction. Curr Biol 26, R1039-R1049.

486 Behrens, M., Korsching, S.I., and Meyerhof, W. (2014). Tuning properties of avian and frog

487 bitter taste receptors dynamically fit gene repertoire sizes. Mol Biol Evol 31, 3216-3227.

488 Brawand, D., Soumillon, M., Necsulea, A., Julien, P., Csardi, G., Harrigan, P., Weier, M., Liechti,

489 A., Aximu-Petri, A., Kircher, M., et al. (2011). The evolution of gene expression levels in

490 mammalian organs. Nature 478, 343-348.

491 Bressel, O.C., Khan, M., and Mombaerts, P. (2016). Linear correlation between the number of

492 olfactory sensory neurons expressing a given mouse odorant receptor gene and the total

493 volume of the corresponding glomeruli in the olfactory bulb. J Comp Neurol 524, 199-209.

494 Buck, L., and Axel, R. (1991). A novel multigene family may encode odorant receptors: a

495 molecular basis for odor recognition. Cell 65, 175-187.

496 Chan, E.T., Quon, G.T., Chua, G., Babak, T., Trochesset, M., Zirngibl, R.A., Aubin, J., Ratcliffe,

497 M.J., Wilde, A., Brudno, M., et al. (2009). Conservation of core gene expression in vertebrate

498 tissues. J Biol 8, 33.

499 Chess, A., Simon, I., Cedar, H., and Axel, R. (1994). Allelic inactivation regulates olfactory

500 receptor gene expression. Cell 78, 823-834.

501 D'Hulst, C., Mina, R.B., Gershon, Z., Jamet, S., Cerullo, A., Tomoiaga, D., Bai, L., Belluscio, L.,

502 Rogers, M.E., Sirotin, Y., et al. (2016). MouSensor: A Versatile Genetic Platform to Create

503 Super Sniffer Mice for Studying Human Odor Coding. Cell reports 16, 1115-1125.

20

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

504 Dias, B.G., and Ressler, K.J. (2014). Parental olfactory experience influences behavior and

505 neural structure in subsequent generations. Nat Neurosci 17, 89-96.

506 Dunkel, A., Steinhaus, M., Kotthoff, M., Nowak, B., Krautwurst, D., Schieberle, P., and

507 Hofmann, T. (2014). Nature's chemical signatures in human olfaction: a foodborne perspective

508 for future biotechnology. Angew Chem Int Ed Engl 53, 7124-7143.

509 Francis, G. (2014). Too much success for recent groundbreaking epigenetic experiments.

510 198, 449-451.

511 Fulle, H.J., Vassar, R., Foster, D.C., Yang, R.B., Axel, R., and Garbers, D.L. (1995). A receptor

512 guanylyl cyclase expressed specifically in olfactory sensory neurons. Proceedings of the

513 National Academy of Sciences of the United States of America 92, 3571-3575.

514 Greer, P.L., Bear, D.M., Lassance, J.M., Bloom, M.L., Tsukahara, T., Pashkovski, S.L., Masuda,

515 F.K., Nowlan, A.C., Kirchner, R., Hoekstra, H.E., et al. (2016). A Family of non-GPCR

516 Chemosensors Defines an Alternative Logic for Mammalian Olfaction. Cell 165, 1734-1748.

517 Griff, I.C., and Reed, R.R. (1995). The genetic basis for specific anosmia to isovaleric acid in the

518 mouse. Cell 83, 407-414.

519 Hanchate, N.K., Kondoh, K., Lu, Z., Kuang, D., Ye, X., Qiu, X., Pachter, L., Trapnell, C., and

520 Buck, L.B. (2015). Single-cell transcriptomics reveals receptor transformations during olfactory

521 neurogenesis. Science 350, 1251-1255.

522 Hayden, S., Bekaert, M., Crider, T.A., Mariani, S., Murphy, W.J., and Teeling, E.C. (2010).

523 Ecological adaptation determines functional mammalian olfactory subgenomes. Genome Res

524 20, 1-9.

525 Horowitz, L.F., Saraiva, L.R., Kuang, D., Yoon, K.H., and Buck, L.B. (2014).

526 patterning in a higher primate. The Journal of neuroscience : the official journal of the Society

527 for Neuroscience 34, 12241-12252.

21

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

528 Hussain, A., Saraiva, L.R., and Korsching, S.I. (2009). Positive Darwinian selection and the birth

529 of an olfactory receptor clade in teleosts. Proceedings of the National Academy of Sciences of

530 the United States of America 106, 4313-4318.

531 Ibarra-Soria, X., Levitin, M.O., Saraiva, L.R., and Logan, D.W. (2014). The Olfactory

532 Transcriptomes of Mice. PLoS Genetics 10(9), e1004593.

533 Ibarra-Soria, X., Nakahara, T.S., Lilue, J., Jiang, Y., Trimmer, C., Souza, M.A., Netto, P.H.,

534 Ikegami, K., Murphy, N.R., Kusma, M., et al. (2017). Variation in olfactory neuron repertoires is

535 genetically controlled and environmentally modulated. eLife 6.

536 Jaeger, S.R., McRae, J.F., Bava, C.M., Beresford, M.K., Hunter, D., Jia, Y.L., Chheang, S.L.,

537 Jin, D., Peng, M., Gamble, J.C., et al. (2013). A Mendelian Trait for Olfactory Sensitivity Affects

538 Odor Experience and Food Selection. Current Biology 23, 1601-1605.

539 Jiang, Y., Gong, N.N., Hu, X.S., Ni, M.J., Pasi, R., and Matsunami, H. (2015). Molecular profiling

540 of activated olfactory neurons identifies odorant receptors for odors in vivo. Nature

541 Neuroscience 18, 1446-+.

542 Kawagishi, K., Ando, M., Yokouchi, K., Sumitomo, N., Karasawa, M., Fukushima, N., and

543 Moriizumi, T. (2014). Stereological quantification of olfactory receptor neurons in mice.

544 Neuroscience 272, 29-33.

545 Keller, A., Zhuang, H., Chi, Q., Vosshall, L.B., and Matsunami, H. (2007). Genetic variation in a

546 human odorant receptor alters odour perception. Nature 449, 468-472.

547 Khan, I., Yang, Z., Maldonado, E., Li, C., Zhang, G., Gilbert, M.T., Jarvis, E.D., O'Brien, S.J.,

548 Johnson, W.E., and Antunes, A. (2015). Olfactory Receptor Subgenomes Linked with Broad

549 Ecological Adaptations in Sauropsida. Mol Biol Evol 32, 2832-2843.

550 Leinders-Zufall, T., Cockerham, R.E., Michalakis, S., Biel, M., Garbers, D.L., Reed, R.R., Zufall,

551 F., and Munger, S.D. (2007). Contribution of the receptor guanylyl cyclase GC-D to

552 chemosensory function in the olfactory epithelium. Proceedings of the National Academy of

553 Sciences of the United States of America 104, 14507-14512.

22

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

554 Liberles, S.D., and Buck, L.B. (2006). A second class of chemosensory receptors in the

555 olfactory epithelium. Nature 442, 645-650.

556 Lin, D., Zhang, S., Block, E., and Katz, L. (2005). Encoding social signals in the mouse main

557 olfactory bulb. Nature 434, 470-477.

558 Lunde, K., Egelandsdal, B., Skuterud, E., Mainland, J.D., Lea, T., Hersleth, M., and Matsunami,

559 H. (2012). Genetic variation of an odorant receptor OR7D4 and sensory perception of cooked

560 meat containing androstenone. PLoS One 7, e35259.

561 Mainland, J.D., Keller, A., Li, Y.R., Zhou, T., Trimmer, C., Snyder, L.L., Moberly, A.H., Adipietro,

562 K.A., Liu, W.L., Zhuang, H., et al. (2014). The missense of smell: functional variability in the

563 human odorant receptor repertoire. Nat Neurosci 17, 114-120.

564 Mainland, J.D., Li, Y.R., Zhou, T., Liu, W.L., and Matsunami, H. (2015). Human olfactory

565 receptor responses to odorants. Sci Data 2, 150002.

566 Malnic, B., Godfrey, P.A., and Buck, L.B. (2004). The human olfactory receptor gene family.

567 Proceedings of the National Academy of Sciences of the United States of America.

568 Malnic, B., Hirono, J., Sato, T., and Buck, L.B. (1999). Combinatorial receptor codes for odors.

569 Cell 96, 713-723.

570 McRae, J.F., Jaeger, S.R., Bava, C.M., Beresford, M.K., Hunter, D., Jia, Y., Chheang, S.L., Jin,

571 D., Peng, M., Gamble, J.C., et al. (2013). Identification of regions associated with variation in

572 sensitivity to food-related odors in the . Curr Biol 23, 1596-1600.

573 McRae, J.F., Mainland, J.D., Jaeger, S.R., Adipietro, K.A., Matsunami, H., and Newcomb, R.D.

574 (2012). Genetic variation in the odorant receptor OR2J3 is associated with the ability to detect

575 the "grassy" smelling odor, cis-3-hexen-1-ol. Chemical senses 37, 585-593.

576 Menashe, I., Abaffy, T., Hasin, Y., Goshen, S., Yahalom, V., Luetje, C.W., and Lancet, D.

577 (2007). Genetic elucidation of human hyperosmia to isovaleric acid. PLoS Biol 5, e284.

578 Merkin, J., Russell, C., Chen, P., and Burge, C.B. (2012). Evolutionary dynamics of gene and

579 isoform regulation in Mammalian tissues. Science 338, 1593-1599.

23

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

580 Monahan, K., and Lomvardas, S. (2015). Monoallelic expression of olfactory receptors. Annu

581 Rev Cell Dev Biol 31, 721-740.

582 Moran, D.T., Rowley, J.C., 3rd, Jafek, B.W., and Lovell, M.A. (1982). The fine structure of the

583 olfactory mucosa in man. J Neurocytol 11, 721-746.

584 Muller, A. (1955). [Quantitative studies on olfactory epithelium of dogs]. Z Zellforsch Mikrosk

585 Anat 41, 335-350.

586 Nara, K., Saraiva, L.R., Ye, X., and Buck, L.B. (2011). A large-scale analysis of odor coding in

587 the olfactory epithelium. The Journal of neuroscience : the official journal of the Society for

588 Neuroscience 31, 9179-9191.

589 Nei, M., Niimura, Y., and Nozawa, M. (2008). The evolution of animal chemosensory receptor

590 gene repertoires: roles of chance and necessity. Nat Rev Genet 9, 951-963.

591 Niimura, Y. (2009). On the origin and evolution of vertebrate olfactory receptor genes:

592 comparative genome analysis among 23 species. Genome biology and evolution 1,

593 34-44.

594 Niimura, Y. (2012). Olfactory receptor multigene family in vertebrates: from the viewpoint of

595 evolutionary . Curr Genomics 13, 103-114.

596 Niimura, Y., Matsui, A., and Touhara, K. (2014). Extreme expansion of the olfactory receptor

597 gene repertoire in African elephants and evolutionary dynamics of orthologous gene groups in

598 13 placental mammals. Genome Res 24, 1485-1496.

599 Niimura, Y., and Nei, M. (2005). Evolutionary Dynamics of Olfactory Receptor Genes in Fishes

600 and Tetrapods. Proc Natl Acad Sci U S A 102, 6039-6044. Epub 2005 Apr 6011.

601 Niimura, Y., and Nei, M. (2006). Evolutionary dynamics of olfactory and other chemosensory

602 receptor genes in vertebrates. J Hum Genet 51, 505-517.

603 Omura, M., and Mombaerts, P. (2015). Trpc2-expressing sensory neurons in the mouse main

604 olfactory epithelium of type B express the soluble guanylate cyclase Gucy1b2. Mol Cell

605 Neurosci 65, 114-124.

24

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

606 Risso, D., Behrens, M., Sainz, E., Meyerhof, W., and Drayna, D. (2017). Probing the

607 Evolutionary History of Human Bitter Taste Receptor Pseudogenes by Restoring Their Function.

608 Mol Biol Evol 34, 1587-1595.

609 Saito, H., Chi, Q., Zhuang, H., Matsunami, H., and Mainland, J.D. (2009). Odor coding by a

610 Mammalian receptor repertoire. Sci Signal 2, ra9.

611 Saito, H., Nishizumi, H., Suzuki, S., Matsumoto, H., Ieki, N., Abe, T., Kiyonari, H., Morita, M.,

612 Yokota, H., Hirayama, N., et al. (2017). Immobility responses are induced by photoactivation of

613 single glomerular species responsive to fox odour TMT. Nature communications 8.

614 Santoro, S.W., and Dulac, C. (2012). The activity-dependent histone variant H2BE modulates

615 the life span of olfactory neurons. Elife (Cambridge) 1, e00070.

616 Saraiva, L.R., Ahuja, G., Ivandic, I., Syed, A.S., Marioni, J.C., Korsching, S.I., and Logan, D.W.

617 (2015a). Molecular and neuronal homology between the olfactory systems of zebrafish and

618 mouse. Scientific reports 5, 11487.

619 Saraiva, L.R., Ibarra-Soria, X., Khan, M., Omura, M., Scialdone, A., Mombaerts, P., Marioni,

620 J.C., and Logan, D.W. (2015b). Hierarchical deconstruction of mouse olfactory sensory

621 neurons: from whole mucosa to single-cell RNA-seq. Scientific reports 5, 18178.

622 Saraiva, L.R., and Korsching, S.I. (2007). A novel olfactory receptor gene family in teleost fish.

623 Genome Res 17, 1448-1457.

624 Sato-Akuhara, N., Horio, N., Kato-Namba, A., Yoshikawa, K., Niimura, Y., Ihara, S., Shirasu, M.,

625 and Touhara, K. (2016). Ligand Specificity and Evolution of Mammalian Musk Odor Receptors:

626 Effect of Single Receptor Deletion on Odor Detection. Journal of Neuroscience 36, 4482-4491.

627 Shichida, Y., and Matsuyama, T. (2009). Evolution of opsins and phototransduction.

628 Philosophical transactions of the Royal Society of Series B, Biological sciences 364,

629 2881-2895.

25

bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

630 Tian, X.J., Zhang, H., Sannerud, J., and Xing, J. (2016). Achieving diverse and monoallelic

631 olfactory receptor selection through dual-objective optimization design. Proceedings of the

632 National Academy of Sciences of the United States of America 113, E2889-2898.

633 Verbeurgt, C., Wilkin, F., Tarabichi, M., Gregoire, F., Dumont, J.E., and Chatelain, P. (2014).

634 Profiling of olfactory receptor gene expression in whole human olfactory mucosa. PLoS One 9,

635 e96333.

636 Yoshikawa, K., Nakagawa, H., Mori, N., Watanabe, H., and Touhara, K. (2013). An unsaturated

637 aliphatic alcohol as a natural ligand for a mouse odorant receptor. Nature Chemical Biology 9,

638 160-162.

639 Young, J.M., Massa, H.F., Hsu, L., and Trask, B.J. (2010). Extreme variability among

640 mammalian V1R gene families. Genome Res 20, 10-18.

641 Youngentob, S.L., Schwob, J.E., Sheehe, P.R., and Youngentob, L.M. (1997). Odorant

642 threshold following methyl bromide-induced lesions of the olfactory epithelium. Physiology &

643 behavior 62, 1241-1252.

644 Zhang, J., Pacifico, R., Cawley, D., Feinstein, P., and Bozza, T. (2013). Ultrasensitive detection

645 of amines by a trace amine-associated receptor. The Journal of neuroscience : the official

646 journal of the Society for Neuroscience 33, 3228-3239.

647 Zhang, X., and Firestein, S. (2002). The olfactory receptor gene superfamily of the mouse. Nat

648 Neurosci 5, 124-133.

26 bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A B 5 C Human (Hsap) Cfam Mmus Rnor Cjac Mmul Hsap NC 1) 28.1 GNB1 OMP 10 Primates 0 Macaque (Mmul) GNAO1 (x + Log CNGA4 42.6 ANO2 40 Marmoset (Cjac) GNAI2 mOSNs 88.0 GNAL RIC8B ADCY3 0 Rat (Rnor) CNGA2 Rodentia ALCAM 94.0 GAP43 iOSNs -40 16.0 CHL1 -80 Mouse (Mmus) ASCL1 PC1 (35.82%) PC3 (14.11%) NEUROG1 GBCs TP63 HBCs 0 -40 Carnivora EPAS1 0 SIX1 SUSs Dog (Cfam) SLC2A3 100 25 80 80 MYA PC2 (15.86%)

FIGURE 1 bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A B C Cfam Mmus Rnor Cjac Mmul Hsap 4 aar7b T MS4A14 GC-D (x+1) NC (x+1)

PDE2A 10

0 lo g Taar6 CA2 GUCY1B2

SLN Ms4a6b EMX1

SNCG

PDE1A Ms4a7 TRPC2 MS4A8B

AAR2 MS4A7

AAR5 T T

AAR5 MS4A7 T AAR2

T D E F

9.52 1500 4000 1.68 5000 1.38 13.97 1.22 14.14 7.69

11.12 13.79

(NC) 73.53 81.10 70.86

0 0 0 Normalized counts 1088 ORs 1365 ORs 1726 ORs Cfam Mmus Rnor G H I 2500 300 800 16.43 28.76 33.42 2.47

17.31 4.85 40.43 17.39 49.00 (NC) 63.78 8.01 18.15

0 0

Normalized counts 0 566 ORs 598 ORs 799 ORs Cjac Mmul Hsap

Expressed Intact ORs Expressed Pseudo ORs Not-Expressed Intact ORs Not-Expressed PseudoORs

FIGURE 2 bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

A D E OGG2-288: Olfr1509 OGG2-288: OR4E2 SH O SH SH SH SH SH SH O

SH S SS SS OGG2-177: OR6F1 SHS SHS S SS OGG2-180: OR4D6 OGG2-162 O OGG2-248 O OGG2-211: Olfr1019 OGG2-320 O OGG2-288 O OGG2-370 Cluster 1 O OGG2-316: OR2B11 OGG2-332

Cluster 1 OGG2-180 OGG2-264: Olfr429 OH OGG2-211 Cluster 2 O OGG2-96 O O O O N

OGG2-177 SH OGG2-316 O O O OGG2-256 OGG2-300: Olfr2 S S SH Rodentia OGG2-245 O O O OGG2-264 OGG2-243: OR5A1 OGG2-238: OR11H6 OGG2-301 O O O OGG2-243 O O O OGG2-246 OH OGG2-425 B OGG2-213 OGG1-64: Olfr691 OGG2-285 OGG2-289: OR6X1 OGG2-279 O O O O OGG2-300 OH OH OH O OGG2-523 O O O O OGG2-255 OH OH OH OH OGG1-64 OGG2-179 OGG2-117: Olfr49 OGG2-360: OR10J5 OGG1-27 O OGG2-284 O OH OH O OGG2-215 O OGG2-238 OGG2-360 : Olfr16 OGG2-232 OGG2-520: Olfr1324 OGG1-51 O OGG2-254: OR2AT4

OGG2-334 O O OGG2-321 O O OH OH HO

OGG2-172 OGG2-286: Olfr145 OGG1-45: Olfr78 OGG2-274 O O OGG2-367 OGG1-45: OR51E2 OH OGG2-117 O OGG2-350 OGG1-44: Olfr558 O O Primates OGG2-289 OH OH O OGG2-360 O OH O O OGG2-469 O OGG1-44: OR51E1 Cluster 2 OGG2-524 OGG2-522 OGG2-199: Olfr958 O O C OGG2-520 O O O O O OH OGG2-254 O OGG1-34 HO HO O O O O OGG2-287 O O OH OH OH OGG1-85 O O O O O OGG2-286 O OGG1-45 O OH OH OGG1-44 O N OGG1-65: Olfr631 OH S OGG2-351 N OGG2-419 O O OH O O OH OGG2-199 O O OGG1-70 N

OGG1-55: Olfr7 O O

O OGG2-338 O O S O OGG1-39 OH S N OH OGG1-90 O O OGG2-115 OGG1-105: Olfr543 OGG1-31 OGG1-90: OR52B6 OGG2-426 O O OH OH O OGG1-97 OGG1-15 O O HO OGG2-510 OH OH OGG1-65 OGG1-42 OGG1-55 F +2 OGG2-190 Alcohols Camphors OGG1-105 Aldehydes Carb. acids Terpenes OGG2-413 0 OGG2-218 Aromatics Esters Thiols elative OGG1-56 Azines Ketones Vanillin-like -2 R

OR expression 0 30 0 30 300 All species 0 150 Percentage of odorants

FIGURE 3 bioRxiv preprint doi: https://doi.org/10.1101/552232; this version posted February 16, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

Hsap Mmus Other A SWEET B Sensory profiles DAIRY FRUITY of >90th percentiles: FATTY/ FLORAL WAXY 20

MUSTY/ GREEN/ EARTHY HERBAL MINTY/ SULFUROUS/ 0 5 CAMPHOR/ MEATY MENTHOL SWEATY/ 15 PC2 (9.2%) PUNGENT NUTTY −20 ALCOHOLIC/ WOODY/ BALSAMIC 64 35 PHENOLIC SPICY −40 0 40 Odorants (N=120) rs = 0.6019; P = 0.0227 PC1 (39.9%)

C ORs detecting KFOs [N=55] D ORs detecting SMCs [N=5] 10000 1000 >90th percentile [N=20] >90th percentile [N=3] <90th percentile [N=35] <90th percentile [N=2] P<0.0001 P=0.0086 10 10 log (x+1) NC log (x+1) NC 0.1 0.1

OH OR2J3 OH Olfr288 O attraction to

O O urine N OR11H7P OH + Olfr1019 S Freezing O Perception OR10G4 HO and hedonics Olfr1440 Perception

N OR7D4 Olfr1395 S Aversion and fear O O O Perception, hedonics Olfr1509 SHS O attraction to OR5A1 + and food preference O urine E F G H ns 8.00 *** 1.00 ns 1.00 *** 8.00 % Total % Total % Total % Total OR expression OR expression 0.00 OR expression 0.00 OR expression 0.00 0.00 Other Other Other Other KFOs KFOs SMCs SMCs

Human Key Food Odorants (KFOs) Other odorants Mouse Semiochemicals (SMCs)

FIGURE 4