Proceedings of the Royal Society B: Biological Sciences DOI: 10.1098/rspb.2019.1828

Loss of olfaction in sea snakes provides new perspectives on the aquatic adaptation of amniotes

Takushi Kishida, Yasuhiro Go, Shoji Tatsumoto, Kaori Tatsumi, Shigehiro Kuraku, Mamoru Toda

Supplemental Figures and Tables

Table of contents:

Figure S1. A NJ tree of intact V1R genes.

Figure S2. A NJ tree of class II OR genes.

Figure S3. Pictures of an amphibious sea snake Laticauda laticaudata and a fully-aquatic sea snake Hydrophis melanocephalus.

Figure S4. K-mer frequency spectrums of the WGS reads of four specimens of sea snakes sequenced in this study (k=23).

Figure S5. A NJ tree of class I OR genes.

Figure S6. Changes in the number of class I and class II OR genes estimated using the reconciled-tree method.

Table S1. Statistics of assembled sequences.

Table S2. Rate of truncated genes and pseudogenes in the snake genome assemblies.

Table S3. Expression levels of V1Rs and TAARs in each tissue of sea snakes.

Table S4. Numbers of reads and bases sequenced.

Table S5. Difference of mean FPKM values compared with the control (liver), and p-values calculated with paired t-test (one-tailed).

Fig. S1. A NJ tree of intact V1R genes. Fish V1R genes identified by Zapilko and Korsching (2016) were added, and all intact Tas2R genes of a green anole identified by Li and Zhang (2014) were used as outgroups. 209 amino acid sites were compared. Bootstrap values obtained by 500 resamplings are shown. Computations were performed using MEGA7 (Kumar et al. 2016). All squamates investigated in this study possess two intact V1Rs (except for the common viper). One is the ancV1R, which is conserved across vertebrates and is co-expressed with other V1Rs or V2Rs in the VNO (Suzuki et al. 2018). The other, which we named ‘Squamata-V1R’, is not orthologous to mammalian V1Rs (all mammalian V1Rs except the ancV1R are phylogenetically related to the fish ORA1/ORA2-V1Rs; Saraiva and Korsching 2007). The ancV1R is highly expressed in the VNO in both hydrophiins and laticaudins. However, expression of the Squamata-V1R was not confirmed in any olfactory tissue investigated in this study, at least in sea snakes (Table S3), although the Squamata-V1R has evolved under strong purifying selection (data not shown). Squamates may not use this V1R for olfaction.

Fig. S2. A NJ tree of class II OR genes. Eight human class I ORs shown in the Materials and Methods section were used as outgroups, and all intact class II OR genes of the squamates investigated in this study, including those of a python and a green anole, were analyzed. 131 amino acid sites were compared. Bootstrap values obtained by 100 resamplings are shown. Computations were performed using MEGA7 (Kumar et al. 2016). Subtrees supported by nodes with bootstrap values >50 were compressed, except for a subtree including the tongue-expressed ORs. Each compressed subtree is colored blue if it contains hydrophiin ORs, purple if it contains laticaudin ORs but no hydrophiin ORs, and black if it contains only terrestrial squamate ORs.

Fig. S3. Pictures of an amphibious sea snake Laticauda laticaudata (A, specimen voucher: KUZR72402) and a fully-aquatic sea snake Hydrophis melanocephalus (B, specimen voucher: KUZR72403). Scale bars, 10 cm.

Fig. S4. K-mer frequency spectrums of the WGS reads of four specimens of sea snakes sequenced in this study (k=23). The KmerFreq_HA program in the SOAPec ver. 2.01 package (Luo et al. 2012) was employed, and trimmed paired-end reads were used for the calculation (mate-pair reads and PacBio reads were excluded). Paired-end reads of L. colubrina were trimmed using Trimmomatic ver. 0.36 (Bolger et al. 2014) with the following parameters: ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10, LEADING:20, TRAILING:20, SLIDINGWINDOW:5:25, HEADCROP:22, MINLEN:36. Based on these k-mer frequency spectrums, the genome size of each specimen was estimated as follows: H. melanocephalus, 2.03 Gbp; E. ijimae, 2.11 Gbp; L. laticaudata, 1.74 Gbp; L. colubrina, 2.37 Gbp.
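One common way to turn a k-mer frequency spectrum into a genome size estimate is to divide the total number of k-mers in the main peak region by the depth of the main peak. The Python sketch below illustrates that idea only; it is not necessarily the procedure used for the estimates above. The histogram is a hypothetical dictionary (not the KmerFreq_HA output format), the function name is ours, and low-depth error k-mers are excluded with a simple first-valley rule.

```python
def estimate_genome_size(histogram):
    """Estimate genome size from a k-mer spectrum.

    histogram: dict mapping k-mer depth -> number of distinct k-mers seen
    at that depth (hypothetical input format, for illustration only).
    Returns (peak_depth, estimated_genome_size).
    """
    depths = sorted(histogram)
    # Low-depth k-mers are mostly sequencing errors. Assume they end at the
    # first valley, i.e. the depth just before the counts start to rise.
    for prev, cur in zip(depths, depths[1:]):
        if histogram[cur] > histogram[prev]:
            valley = prev
            break
    else:
        raise ValueError("no valley found; spectrum has no distinct main peak")

    solid = [d for d in depths if d >= valley]
    peak_depth = max(solid, key=lambda d: histogram[d])
    # Total k-mer occurrences divided by the typical depth per genomic k-mer.
    total_kmers = sum(d * histogram[d] for d in solid)
    return peak_depth, total_kmers / peak_depth


# Toy spectrum: an error tail at depth 1-2 and a main peak at depth 5.
toy = {1: 1000, 2: 200, 3: 50, 4: 80, 5: 150, 6: 80, 7: 20}
peak, size = estimate_genome_size(toy)
print(peak, size)
```

Real spectra with strong heterozygosity or repeat peaks would need a more careful choice of the main peak than this first-valley rule.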

Fig. S5. A NJ tree of class I OR genes. Intact human class I ORs retrieved from the HORDE database (Glusman et al. 2001) build #44 were also added to the tree (shown with grey dots). Sixteen human class II ORs shown in the methods section were used as outgroups, and 267 amino acid sites were compared. Bootstrap values obtained by 500 resamplings are shown. Computations were performed using MEGA7 (Kumar et al. 2016). It has been reported that the OR51E1 and OR51E2 genes are highly conserved among mammals (Niimura et al. 2014), including anosmic odontocetes (Kishida et al. 2015), and that these genes play important roles in non-olfactory organs (Chang et al. 2015). However, OR genes orthologous to human OR51E1 and OR51E2 are not conserved among snakes, and no sea snake possesses these genes. Instead, we found a class I OR gene that is conserved among snakes, including hydrophiins, and is related to human OR52D1. This gene may have an important non-olfactory function in snakes in place of OR51E1 and OR51E2.

Fig. S6. Changes in the number of class I (A) and class II (B) OR genes estimated using the reconciled-tree method (Niimura and Nei 2007).

Table S1. Statistics of the sea snake genome sequences assembled in this study.

| | L. colubrina contig | L. colubrina scaffold | L. laticaudata contig | L. laticaudata scaffold | H. melanocephalus contig | H. melanocephalus scaffold | E. ijimae contig | E. ijimae scaffold |
|---|---|---|---|---|---|---|---|---|
| no. of sequences | 164,304 | 62,906 | 180,051 | 83,587 | 313,421 | 122,022 | 281,190 | 157,858 |
| total length (bp) | 1,760,215,774 | 2,024,687,924 | 1,558,442,497 | 1,558,706,106 | 1,254,859,633 | 1,402,639,853 | 1,624,260,488 | 1,625,199,642 |
| N25 (bp) | 47,948 | 6,160,553 | 26,800 | 71,249 | 13,548 | 123,404 | 28,180 | 33,954 |
| N50 (bp) | 26,721 | 3,139,541 | 16,151 | 39,330 | 7,234 | 59,810 | 15,512 | 18,937 |
| N75 (bp) | 12,493 | 1,507,837 | 8,894 | 20,365 | 3,587 | 23,554 | 7,618 | 9,785 |
| longest sequence (bp) | 312,292 | 18,159,851 | 127,742 | 890,807 | 92,046 | 1,095,035 | 198,973 | 224,855 |
| GC content (%) | 40.91 | 35.57 | 40.10 | 40.09 | 38.85 | 34.75 | 40.36 | 40.34 |
| rate of 'N' (%) | n.a. | 13.06 | n.a. | 0.02 | n.a. | 10.54 | n.a. | 0.06 |

| no. of core genes detected* | L. colubrina | L. laticaudata | H. melanocephalus | E. ijimae |
|---|---|---|---|---|
| complete | 207 (88.84%) | 162 (69.53%) | 174 (74.68%) | 139 (59.66%) |
| complete + partial | 231 (99.14%) | 227 (97.42%) | 223 (95.71%) | 216 (92.70%) |
| average no. of orthologs per core gene* | 1.04 | 1.48 | 1.26 | 1.78 |

* These values were calculated using the CEGMA pipeline with the CVG gene set (233 core genes were queried).
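For readers unfamiliar with the N25, N50 and N75 rows of Table S1, the following Python sketch shows the standard definition of these statistics: Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the total assembly length. The function name and the toy lengths are ours, for illustration only; the table values themselves were produced by the authors' own tools.

```python
def nx(lengths, x):
    """Return the Nx statistic (e.g. x=50 for N50) of a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running total past x% is the Nx.
        if running >= threshold:
            return length
    raise ValueError("lengths must be non-empty")


# Toy assembly of five sequences, 20 bp in total.
toy_lengths = [2, 3, 4, 5, 6]
print(nx(toy_lengths, 25), nx(toy_lengths, 50), nx(toy_lengths, 75))
```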

Table S2. Rate of truncated genes and pseudogenes in the snake genome assemblies.

| gene family | species | intact | truncated | pseudo | truncate-rate* | pseudo-rate** |
|---|---|---|---|---|---|---|
| OR | L. colubrina | 123 | 22 | 280 | 0.15 | 0.66 |
| OR | L. laticaudata | 105 | 13 | 286 | 0.11 | 0.71 |
| OR | H. melanocephalus | 53 | 18 | 155 | 0.25 | 0.69 |
| OR | E. ijimae | 74 | 2 | 147 | 0.026 | 0.66 |
| OR | king cobra | 349 | 78 | 191 | 0.18 | 0.31 |
| OR | garter snake | 289 | 58 | 154 | 0.17 | 0.31 |
| OR | common viper | 367 | 73 | 142 | 0.17 | 0.24 |
| V2R | L. colubrina | 263 | 61 | 356 | 0.19 | 0.52 |
| V2R | L. laticaudata | 92 | 17 | 198 | 0.16 | 0.64 |
| V2R | H. melanocephalus | 70 | 28 | 114 | 0.29 | 0.53 |
| V2R | E. ijimae | 204 | 70 | 39 | 0.26 | 0.12 |
| V2R | king cobra | 431 | 224 | 198 | 0.34 | 0.23 |
| V2R | garter snake | 233 | 52 | 114 | 0.18 | 0.29 |
| V2R | common viper | 389 | 72 | 39 | 0.16 | 0.078 |

* Calculated as t/(i+t). ** Calculated as p/(i+t+p). (p: no. of pseudogenes; t: no. of truncated genes; i: no. of intact genes.)
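The two footnote formulas of Table S2 can be written out as a short Python sketch. The function names are ours, for illustration only; the gene counts are copied from the OR rows of the table.

```python
def truncate_rate(intact, truncated):
    """Truncate-rate as defined in the Table S2 footnote: t / (i + t)."""
    return truncated / (intact + truncated)


def pseudo_rate(intact, truncated, pseudo):
    """Pseudo-rate as defined in the Table S2 footnote: p / (i + t + p)."""
    return pseudo / (intact + truncated + pseudo)


# (intact, truncated, pseudo) OR gene counts taken from Table S2.
or_counts = {
    "L. colubrina": (123, 22, 280),
    "E. ijimae": (74, 2, 147),
}
for species, (i, t, p) in or_counts.items():
    print(species, round(truncate_rate(i, t), 3), round(pseudo_rate(i, t, p), 2))
```

Note that the truncate-rate denominator excludes pseudogenes, whereas the pseudo-rate denominator counts all three classes.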

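Table S3. Expression levels of V1Rs and TAARs quantified with FPKM values in each tissue of sea snakes. See Fig. S1 for the definition of the ‘Squamata-V1R’. FPKM values of the first exon of the ACTB (beta-actin) gene were also calculated for comparison. Note that no replicates were generated for the FPKM estimation.

| species | gene | NC (FPKM) | VNO (FPKM) | tongue (FPKM) | liver (FPKM) |
|---|---|---|---|---|---|
| H. melanocephalus (fully-aquatic) | ancV1R | 0.126 | 94.9 | 0.00 | 0.00 |
| H. melanocephalus (fully-aquatic) | Squamata-V1R | 0.00 | 0.00 | 0.00 | 0.00 |
| H. melanocephalus (fully-aquatic) | TAAR1* | 1.05 | 1.90 | 1.48 | 12.3 |
| H. melanocephalus (fully-aquatic) | TAAR2 (pseudo) | 0.00 | 0.00 | 0.00 | 0.0721 |
| H. melanocephalus (fully-aquatic) | TAAR4 (pseudo) | 0.00 | 0.00 | 0.00 | 0.00 |
| H. melanocephalus (fully-aquatic) | TAAR5 | 0.118 | 0.00 | 0.121 | 0.0805 |
| H. melanocephalus (fully-aquatic) | ACTB | 328 | 690 | 390 | 150 |
| L. laticaudata (amphibious) | ancV1R | 0.216 | 91.8 | 0.00 | 0.00 |
| L. laticaudata (amphibious) | Squamata-V1R | 0.00 | 0.0549 | 0.00 | 0.00 |
| L. laticaudata (amphibious) | pseudoV1R | 0.780 | 0.171 | 0.00 | 0.0623 |
| L. laticaudata (amphibious) | TAAR1* | 1.91 | 0.00 | 0.00 | 0.111 |
| L. laticaudata (amphibious) | TAAR2 | 2.09 | 0.0522 | 0.00 | 0.00 |
| L. laticaudata (amphibious) | TAAR5 | 0.00 | 0.00 | 0.138 | 0.00 |
| L. laticaudata (amphibious) | ACTB | 1242 | 916 | 513 | 45.9 |

* TAAR1 is not involved in olfaction, at least in the case of rodents (Liberles 2009).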
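As a reminder of what the FPKM values in Table S3 measure, the following Python sketch applies the standard definition: fragments mapped to a region, per kilobase of region length, per million mapped fragments in the library. The function name and the example numbers are hypothetical and are not taken from the study's data.

```python
def fpkm(fragments_in_region, region_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of region per Million mapped fragments."""
    kilobases = region_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments_in_region / (kilobases * millions)


# Hypothetical example: 100 fragments on a 1,000 bp exon in a library
# of 1,000,000 mapped fragments.
print(fpkm(100, 1_000, 1_000_000))
```

Because the value is normalized by region length, an FPKM computed on a single exon (as done for ACTB here) can be compared with one computed on a whole gene.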

Table S4. Sequencing libraries prepared and sequenced for each specimen.

| specimen | platform and library | no. of reads | no. of bases | SRA accession no. |
|---|---|---|---|---|
| KUZ R72402 (L. laticaudata) | HiSeq 350bp paired-end (2×101bp) | 492M × 2 | 99.4G | DRR144984 |
| KUZ R72402 (L. laticaudata) | HiSeq 550bp paired-end (2×101bp) | 490M × 2 | 99.0G | DRR144985 |
| KUZ R72402 (L. laticaudata) | PacBio RS II* | 1.08M | 8.70G | DRR144986-144993 |
| KUZ R72403 (H. melanocephalus) | HiSeq 550bp paired-end (2×101bp) | 946M × 2 | 191G | DRR147394 |
| KUZ R72403 (H. melanocephalus) | HiSeq 7kbp mate-pair (2×127bp) | 93.1M × 2 | 23.7G | DRR147395 |
| KUZ R72403 (H. melanocephalus) | MiSeq 7kbp mate-pair (2×151bp) | 6.96M × 2 | 2.10G | DRR147396 |
| KUZ R72604 (E. ijimae) | HiSeq 350bp paired-end (2×151bp) | 442M × 2 | 134G | DRR144861 |
| KUZ R77260 (L. colubrina) | HiSeq 340bp paired-end linked-reads (2×151bp) | 426M × 2 | 129G | DRR147552 |

* Nos. of reads and bases of the filtered subreads.

Table S5. Difference of mean FPKM values in each organ/gene compared with the control (liver). P-values calculated with paired t-test (one-tailed) are also shown.

| genus | gene family | NC intact | NC truncate | NC pseudo | VNO intact | VNO truncate | VNO pseudo | tongue intact | tongue truncate | tongue pseudo |
|---|---|---|---|---|---|---|---|---|---|---|
| Hydrophis | OR | 0.003, p=0.27 | -0.020, p=0.37 | -0.010, p=0.065 | 0.099, p=0.059 | 0.26, p=0.068 | 0.011, p=0.15 | 0.23, p=0.16 | 0.39, p=0.088 | 0.010, p=0.13 |
| Hydrophis | V2R | 0.003, p=0.15 | 0.098, p=0.11 | 0.0, p=0.50 | 7.6, p=0.0031 | 11, p=0.0013 | 0.97, p=0.0001 | 0.006, p=0.10 | -0.008, p=0.46 | 0.005, p=0.16 |
| Laticauda | OR | 5.9, p=1.4e-11 | 2.9, p=0.028 | 0.32, p=7.8e-6 | 0.080, p=0.014 | 0.009, p=0.17 | 0.019, p=0.0033 | 0.29, p=0.16 | 0.051, p=0.17 | 0.003, p=0.085 |
| Laticauda | V2R | 0.026, p=0.0020 | 0.021, p=0.11 | 0.009, p=0.0073 | 9.5, p=4.6e-8 | 5.1, p=0.012 | 1.6, p=1.5e-5 | 0.005, p=0.20 | 0, p=n.a. | 0.003, p=0.0043 |
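The kind of comparison summarized in Table S5 can be sketched in Python as a one-tailed paired t-test on per-gene FPKM values. This is a minimal illustration under our own assumptions: values are paired gene by gene between a tissue and liver, the alternative hypothesis is that expression is higher in the tissue, and the function names and toy data are hypothetical. To stay dependency-free, the upper-tail probability of Student's t distribution is obtained by numerical integration; a statistics library would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def paired_one_tailed(tissue, liver):
    """Return (mean difference, t statistic, one-tailed p) for tissue > liver."""
    diffs = [a - b for a, b in zip(tissue, liver)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        # The t statistic is undefined when all differences are identical.
        return mean, None, None
    t = mean / math.sqrt(var / n)
    return mean, t, upper_tail(t, n - 1)


# Toy FPKM values for three genes (not from the study).
mean_diff, t_stat, p_value = paired_one_tailed([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(mean_diff, round(t_stat, 3), round(p_value, 4))
```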

Supplemental references

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114-2120.

Chang AJ, Ortega FE, Riegler J, Madison DV, Krasnow MA. 2015. Oxygen regulation of breathing through an olfactory receptor activated by lactate. Nature 527: 240-244.

Glusman G, Yanai I, Rubin I, Lancet D. 2001. The complete human olfactory subgenome. Genome Res 11: 685-702.

Kishida T, Thewissen JGM, Hayakawa T, Imai H, Agata K. 2015. Aquatic adaptation and the evolution of smell and taste in whales. Zool Lett 1: 9.

Kumar S, Stecher G, Tamura K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol 33: 1870-1874.

Li D, Zhang J. 2014. Diet shapes the evolution of the vertebrate bitter taste receptor gene repertoire. Mol Biol Evol 31: 303-309.

Liberles SD. 2009. Trace amine-associated receptors are olfactory receptors in vertebrates. Ann N Y Acad Sci 1170: 168-172.

Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, He G, Chen Y, Pan Q, Liu Y. 2012. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1: 1-6.

Niimura Y, Matsui A, Touhara K. 2014. Extreme expansion of the olfactory receptor gene repertoire in African elephants and evolutionary dynamics of orthologous gene groups in 13 placental mammals. Genome Res 24: 1485-1496.

Niimura Y, Nei M. 2007. Extensive gains and losses of olfactory receptor genes in mammalian evolution. PLoS ONE 2: e708.

Saraiva LR, Korsching SI. 2007. A novel olfactory receptor gene family in teleost fish. Genome Res 17: 1448-1457.

Suzuki H, Nishida H, Kondo H, Yoda R, Iwata T, Nakayama K, Enomoto T, Wu J, Moriya-Ito K, Miyazaki M et al. 2018. A single pheromone receptor gene conserved across 400 million years of vertebrate evolution. Mol Biol Evol 35: 2928-2939.

Zapilko V, Korsching SI. 2016. Tetrapod V1R-like ora genes in an early-diverging ray-finned fish: the canonical six ora gene repertoire of teleost fish resulted from gene loss in a larger ancestral repertoire. BMC Genomics 17: 83.