<<

bioRxiv preprint doi: https://doi.org/10.1101/695544; this version posted July 8, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

Visual pigment evolution in Characiformes: the dynamic interplay of whole-genome duplication, surviving opsins and spectral tuning.

Daniel Escobar-Camacho1, Karen L. Carleton1, Devika W. Narain2, Michele E.R. Pierotti3

1Department of Biology, University of Maryland, College Park, MD 20742, USA.
2Environmental Sciences, Anton de Kom University of Suriname, Paramaribo, Suriname.
3Naos Marine Laboratories, Smithsonian Tropical Research Institute, Panama, Republic of Panama.

Corresponding author
Daniel Escobar-Camacho
[email protected]


Abstract

Vision represents an excellent model for studying adaptation, given the genotype-to-phenotype map that has been characterized in a number of taxa. Fish possess a diverse range of visual sensitivities and adaptations to underwater light, making them an excellent group in which to study visual system evolution. In particular, some speciose but understudied lineages can provide a unique opportunity to better understand aspects of visual system evolution such as opsin gene duplication and neofunctionalization. In this study, we characterized the visual system of Neotropical Characiformes, which is the result of several spectral tuning mechanisms acting in concert, including gene duplications and losses, gene conversion, opsin amino acid sequence and expression variation, and A1/A2-chromophore shifts. The characiforms we studied utilize three cone opsin classes (SWS2, RH2, LWS) and a rod opsin (RH1). However, the entire characiform opsin gene repertoire is a product of dynamic evolution by opsin gene loss (SWS1, RH2) and duplication (LWS, RH1). The LWS- and RH1-duplicates originated from a teleost-specific whole-genome duplication as well as characiform-specific duplication events. Both LWS-opsins exhibit gene conversion and, through substitutions in key tuning sites, one of the LWS-paralogs has acquired spectral sensitivity to green light. These sequence changes suggest reversion and parallel evolution of key tuning sites. In addition, characiforms exhibited species-specific differences in opsin expression. Finally, we found interspecific and intraspecific variation in the use of A1/A2-chromophores correlating with the light environment. These multiple mechanisms may be a result of the highly diverse visual environments where Characiformes have evolved.

Introduction

To fully understand the evolutionary history of genes and their relevance for adaptation and speciation, it is important to explore the genotype-to-phenotype map and how this relates to the environment. Evolutionary studies of genes such as the ones involved in the first steps of vision can provide valuable insights into the acquisition of new functions and their adaptive significance. In vertebrates, vision starts when light reaches the retina and is detected by rod (night vision) or cone (diurnal vision) photoreceptors. Photoreceptors are packed with visual pigments that are composed of two components: an opsin protein with seven α-helices enclosing a ligand-binding pocket, and a light-sensitive chromophore, 11-cis retinal (Bowmaker 2008; Yokoyama 2008). There can be multiple cone types containing different visual pigments that absorb light maximally in different parts of the wavelength spectrum.

There are four classes of cone pigments encoded by opsin genes among vertebrates: a short-wave class (SWS1) sensitive to ultraviolet-violet light (350-400 nm), a second short-wave class (SWS2) sensitive to violet-blue (410-490 nm), a middle-wave class (RH2) sensitive to green (480-535 nm), and a middle- to long-wave class (LWS) sensitive to the green and red spectral region (490-570 nm) (Bowmaker and Hunt 2006; Bowmaker 2008). All four cone classes are the product of a series of gene duplications from an ancestral single opsin gene that appeared early in evolution (450 MYA) (Bowmaker 1998; Bowmaker and Hunt 2006; Bowmaker 2008). This results in a spectral tuning mechanism that is based on nucleotide variation: if a nucleotide substitution leads to the replacement of an amino acid that alters the interaction of the chromophore and the opsin, this will lead to a spectral shift in the maximal absorbance (λmax) of the visual pigment. Consequently, variation in λmax between visual pigments is the product of the interaction of different opsin classes and the identical 11-cis retinal. The shift in λmax caused by a single amino acid substitution depends on the amino acid identity and site, with most causing smaller shifts (2-10 nm) and a few sites causing very large shifts (e.g. 75 nm) (Yokoyama 2008).

Among vertebrates, fish are ideal for the study of visual pigment evolution. First, because of its physico-chemical properties, water has a profound effect on light transmission. Water absorbs and scatters much of the incoming light, and this inevitably causes great variation across aquatic habitats that differ in concentrations of particulates and dissolved compounds (Loew and McFarland 1990; Warrant and Johnsen 2013). This results in several adaptations in fish visual systems. Second, due to their phylogenetic history, species richness, diverse ecologies, and diverse spectral sensitivities, fish offer an excellent system for studying the evolution of visual pigments. Spectral sensitivities have been documented for quite a few fish species (Schwanzara 1967; Muntz 1973; Levine and MacNichol 1979; Bowmaker et al. 1994; Lythgoe et al. 1994; Carleton 2009) and the dynamic evolution of the different opsin classes has been actively studied (Bowmaker 2008; Yokoyama 2008; Hofmann and Carleton 2009; Davies et al. 2012; Rennison et al. 2012; Cortesi et al. 2015; Lin et al. 2017; Musilova et al. 2019).
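As a small illustration of the four cone opsin classes described earlier in this Introduction, the quoted sensitivity ranges can be held in a lookup table. The sketch below uses only the ranges given in the text; the function name `candidate_classes` is hypothetical. Because the quoted ranges overlap, a measured λmax alone does not always identify the opsin class, which is one reason sequence data are needed alongside spectral measurements.

```python
# Approximate sensitivity ranges (nm) of the four vertebrate cone opsin
# classes, as quoted in the text (Bowmaker and Hunt 2006; Bowmaker 2008).
CONE_CLASS_RANGES_NM = {
    "SWS1": (350, 400),
    "SWS2": (410, 490),
    "RH2": (480, 535),
    "LWS": (490, 570),
}


def candidate_classes(lambda_max_nm):
    """Return the cone opsin classes whose quoted range contains lambda_max_nm.

    Hypothetical helper for illustration only: the ranges overlap, so more
    than one class can be returned for a single wavelength.
    """
    return [
        name
        for name, (low, high) in CONE_CLASS_RANGES_NM.items()
        if low <= lambda_max_nm <= high
    ]


print(candidate_classes(440))  # ['SWS2']
print(candidate_classes(500))  # ['RH2', 'LWS']
```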


Characiformes, with more than 2000 described species, is an extremely diverse group of freshwater fishes inhabiting a wide range of ecosystems. This order includes at least 23 families, with dozens of species being described each year (Oliveira et al. 2011; Arcila et al. 2018; Froese and Pauly 2019). Their Gondwanan origin, wide distribution, species richness and colorful patterns make them an ideal group for studying the evolution of the visual system and its adaptation to the light environment. Data on the opsin repertoire of Characiformes have only been reported for Astyanax fasciatus (Yokoyama and Yokoyama 1990a; Yokoyama and Yokoyama 1993; Register et al. 1994; Yokoyama et al. 1995; Yokoyama et al. 2008). These studies characterized its visual pigments and showed that this species has a duplication of the LWS opsin in which one copy became sensitive to green light through amino acid substitutions, a remarkable example of convergent evolution with green sensitivity in humans (Yokoyama and Yokoyama 1990a). Recent studies have analyzed the origins of Astyanax opsin genes in more depth and concluded that these duplicates are surviving opsins from the teleost-specific genome duplication (TGD) (300-450 MYA: Taylor et al. 2001; Meyer and Van de Peer 2005; Liu et al. 2018).

In this study, we expand the molecular characterization of the visual system in Characiformes. We showcase the complex evolutionary dynamics of their opsin gene repertoire and we examine the diverse set of spectral tuning mechanisms present in this group.

Results

Opsin gene sequences

Opsin complements

Through phylogenetic analyses of 15 characiform species, we identified fully functional sequences belonging to three cone opsin classes (SWS2, RH2, LWS) as well as the rod opsin (RH1) (Fig. 1-2, Fig. S1-3). The opsin gene set within Characiformes seems highly variable because we found variation in the presence/absence of some opsins as well as several duplications (Fig. 3). We did not find sequences belonging to the UV-light sensitive opsin (SWS1), either in the transcriptomes or genomes (Fig. S1). We also did not detect the RH2 opsin in the transcriptomes of C. spilurus, H. microlepis, S. rhombeus, and R. guatemalensis; however, we found a non-functional RH2 opsin in the genome of P. nattereri.

Examination of the LWS opsin class revealed the presence of several duplications, and these varied between lineages, suggesting the following. First, there was an LWS-duplication event, which was the product of the teleost-specific genome duplication (TGD) (300-450 MYA) (Fig. S4). This is supported by the fact that these initial duplicates appear after the divergence of teleosts and the spotted gar, Lepisosteus oculatus (Fig. 1). These initial LWS paralogs formed two distinct clades in our analyses, which we will call LWS1 and LWS2. LWS2 opsins grouped with those of Osteoglossiformes, which are known to share this duplication with characiforms (Liu et al. 2018), whereas the LWS1 opsins grouped with the remaining teleost LWS opsins. Second, after TGD, LWS1 and LWS2 underwent subsequent rounds of duplication within Characiformes. This varied across families, which had from one to three LWS-duplicates (Fig. S2). The characiform-specific LWS2 opsin duplication is shared by most species. LWS2-1 and LWS2-2 differ by the presence of a 6 bp deletion in the first 20 bp of the coding sequence (exon I, extracellular region) in LWS2-2, and by a few amino acids, although not in spectral tuning sites.

Furthermore, we found TGD-surviving duplicates of the rod opsin (RH1) (Fig. 2). In similar fashion to the LWS duplicates, these RH1 paralogs (which we will refer to as RH1-1 and RH1-2) grouped in different RH1 clades, where the TGD-surviving opsins of Characiformes formed a well-supported clade with the known TGD-surviving copies (RH1-2) present in Cypriniformes (Morrow et al. 2011; Morrow et al. 2017) (Fig. 2). Within RH1-2 opsin sequences, several species of characiforms had numerous deletions in the last exon. We also found disparate amino acid variation at transmembrane sites, which suggests that these opsin genes are non-functional; hence, they were excluded from subsequent analyses. Lastly, the LWS1, RH2, SWS2, and RH1-1 opsins of Characiformes show a paraphyletic pattern in relation to Siluriformes and (Fig. S1-3).

Opsin sequence spectral tuning


Our analyses revealed several amino acid substitutions that shift λmax of visual pigments. The SWS2 opsin showed the greatest variation in transmembrane regions, changes in polarity, and variation in binding pocket sites (Fig. 4). Several of these substitutions occurred in spectral tuning sites (M44T, A109G, M122I, A269T, A292S) that are known to shift the SWS2 λmax (Yokoyama 2008). Other opsin classes also showed variable diversity in known spectral tuning sites, including the RH2 opsin with substitutions that shift λmax to shorter wavelengths (K36Q, L46F, I49C, F50L, L108T, A295S) (Chinen et al. 2005; Davies et al. 2007) and the RH1-1 opsin with substitutions that shift λmax to longer wavelengths (N83D, F261Y) (Yokoyama et al. 1995; Yokoyama et al. 2005). We also confirmed the presence of mutations in three “key sites” (S164A, Y261F, T269A) in the LWS2 paralogs that shift λmax to shorter wavelengths (~30 nm) (Yokoyama 2008; Yokoyama et al. 2008). Although previously reported only in Astyanax fasciatus (Yokoyama and Yokoyama 1990a) and Osteoglossiformes (Liu et al. 2018), this trait appears to be present in most characiforms (Fig. 5).

Gene conversion

Gene conversion analysis with GARD revealed evidence of interspecific gene conversion within LWS1 and LWS2 opsins, with two and three breakpoints respectively. In both LWS1 and LWS2, conversion seems to be present in the first exons (Fig. S5, Table S1). Phylogenetic trees based on fragments between recombination breakpoints exhibit different tree topologies. Trees based on exons three to six recovered the typical phylogenetic relationships between families and ancestral duplications based on their known genomic tree topologies (Oliveira et al. 2011) (Fig. S5).

Ancestral state reconstruction

By analyzing the evolution of LWS spectral tuning through ancestral state reconstruction, our results suggest that the ancestral LWS haplotype of teleosts before TGD was probably red wavelength sensitive (node #2, 73.56% of the scaled likelihood) (Fig. 5, Table S2). This suggests that green sensitivity evolved soon after TGD (node #43, 93.7% of the scaled likelihood) (Fig. 5, Table S2).


Furthermore, our analysis examining the molecular basis of spectral tuning at site S164A, which typically confers a -7 nm shift (Yokoyama et al. 2008; Yokoyama 2008), suggests that the LWS2 ancestral haplotype of Characiformes most probably used the codon GCC (node 43, 99.03% of the scaled likelihood) to encode alanine, whereas the LWS1 ancestral haplotype used TCT (node 6, 99% of the scaled likelihood, Table S3) to encode serine. However, we found a reversion in the LWS2 opsins of some earlier divergent lineages within Characiformes (C. strigata, P. nattereri, H. microlepis and P. panamensis), where the reverse mutation in LWS2 opsins changed codons for alanine (GCC) back to codons for serine (TCT or TCC). This occurred in parallel in the characid P. innesi (Fig. S6). Similar to LWS2, there is evidence of parallel evolution in the LWS1 opsins. H. panamensis and C. spilurus shifted in parallel from serine to alanine utilizing the same codons (TCT to GCT) (Fig. S6). Finally, even though this study focused on Characiformes, the variability of site 164 is quite extensive, as several teleosts exhibit different codons for either alanine or serine (Fig. S6).
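The codon changes described for site 164 can be checked with a few lines of code. The following is a minimal sketch, not the ancestral state reconstruction itself: it only uses the four codons named in the text, the standard genetic code, and a hypothetical helper `describe_change` that counts how many nucleotide positions differ between two codons.

```python
# Codons reported for site 164 in the text, with the amino acid each one
# encodes under the standard genetic code.
SITE_164_CODONS = {"GCC": "Ala", "GCT": "Ala", "TCT": "Ser", "TCC": "Ser"}


def nucleotide_differences(codon_a, codon_b):
    """Count the positions at which two codons differ (Hamming distance)."""
    if len(codon_a) != 3 or len(codon_b) != 3:
        raise ValueError("codons must be three nucleotides long")
    return sum(a != b for a, b in zip(codon_a, codon_b))


def describe_change(codon_from, codon_to):
    """Return (amino acid before, amino acid after, nucleotide differences).

    Hypothetical helper for illustration; it only knows the codons listed
    in SITE_164_CODONS.
    """
    return (
        SITE_164_CODONS[codon_from],
        SITE_164_CODONS[codon_to],
        nucleotide_differences(codon_from, codon_to),
    )


# LWS2 reversions (alanine back to serine)
print(describe_change("GCC", "TCC"))  # ('Ala', 'Ser', 1)
print(describe_change("GCC", "TCT"))  # ('Ala', 'Ser', 2)
# LWS1 parallel change (serine to alanine)
print(describe_change("TCT", "GCT"))  # ('Ser', 'Ala', 1)
```

In this sketch, the serine/alanine switch always involves the first codon position (T versus G); the GCC to TCT change additionally differs at the third, synonymous position.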

Opsin gene expression

Opsin expression profiles varied between characiform species, ranging from some expressing mainly two opsins, like the sail-fin tetra (C. spilurus) or the dogfish (H. microlepis), to others expressing up to six (P. panamensis). The SWS2 opsin was the only short-wavelength pigment expressed (3 to 15% of total opsin expression), and the LWS duplicates accounted for the bulk of characiform opsin expression (80-95%) (Fig. 6). We always observed the expression of at least one copy of the LWS1 paralog, followed by the expression of one or two copies of the LWS2 paralog (Fig. 6). There seem to be differences in the expression of the LWS2 paralogs because in some species the LWS2-1 opsin is more highly expressed than the LWS2-2 opsin (B. chagrensis, A. ruberrimus, G. atracaudatus), but this pattern is reversed in other species (H. panamensis, B. gonzalezi, C. strigata) (Fig. 6).

The RH2 opsin was lowly expressed (<5%) in most samples, except in B. emperador (10%), and it was not recovered in the transcriptomes of four species (C. spilurus, H. microlepis, S. rhombeus, and R. guatemalensis). Finally, rod opsin expression was mainly dominated by the paralog RH1-1, while RH1-2 was very lowly expressed (<1.1%) in all analyzed species (Fig. S7).

Photoreceptor spectral sensitivity

MSP of characiforms revealed a remarkable diversity in photoreceptor λmax. We identified up to six different cone classes based on spectral sensitivity: a blue-sensitive single cone (λmax=440-467 nm), a blue-green single cone (λmax=472-496 nm) and a second medium wavelength single cone, sensitive to the short-green (λmax=514-545 nm). Double cones contained a green member (λmax=529-568 nm) paired with either a green-yellow (λmax=545-588 nm) or a yellow-orange sensitive member (λmax=564-614 nm) (Fig. 7, Table 1). While all species showed at least three spectrally different photoreceptors, different species exhibited different sets. Rods exhibited similar variation within and across species, with a λmax range of 502-536 nm.

Nomogram fitting and sequence analysis from this study, together with previous work on Astyanax (Parry et al. 2003) and other micro-spectrophotometric work on characiforms (Levine and MacNichol 1979), suggest that variation in λmax in our dataset can be largely explained in terms of A1/A2 chromophore content (Fig. 8), with the exception of two groups that best fit a model including both coexpression and chromophore A1/A2 mixing. These two groups covered ranges of 514-545 nm and 545-588 nm and had shorter wavelength A1 sensitivities that were respectively best fit by a coexpressed RH2+LWS2 (in proportions 30%:70%), resulting in a 514 nm predicted λmax(A1), and a coexpressed LWS2+LWS1 (in proportions 50%:50%), resulting in a 544 nm predicted λmax(A1) (Table S4). The upper range for each of these groups was then fit by applying the same levels of vitamin A2 estimated for the other cone classes in the same individual to these coexpression λmax(A1) values. This provided models that consistently represented the best fit of the data, notably both across individuals and species. Records from these two classes could not be fit by any “pure” RH2 or LWS2, regardless of the amount of A2 imposed. It is important to note that, although these two classes were present in some but not all species, the coexpression λmax values assuming pure A1 (i.e. 514 nm for RH2+LWS2; 544 nm for LWS2+LWS1) allowed consistent fitting of similar, uniform percentage values of vitamin A2 across classes within the same individual, and this held across individuals and species varying in their vitamin A2 content. Furthermore, based on the amino acids at the three key sites in the LWS1 opsin sequence (S164, Y261, T269), we predicted a λmax(A1) of 560 nm, which is in agreement with our MSP results. In addition, the LWS2 opsin exhibits substitutions (A164, F261, A269) with a predicted λmax(A1) of 532 nm (Yokoyama et al. 2008), very close to the values obtained by MSP (529-530 nm) from records of pure A1 photoreceptors (Table S4).
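The coexpression reasoning above can be illustrated with a short calculation. The sketch below is not the fitting procedure used in this study, which is not specified in this section. It assumes the A1 alpha-band template of Govardovskii et al. (2000) as the pigment absorbance shape, sums two pigments in fixed proportions, and locates the peak of the summed curve on a wavelength grid. The function names `a1_template` and `mixture_peak` are hypothetical, and A1/A2 chromophore mixing is not modeled here.

```python
import math


def a1_template(wavelength_nm, lambda_max_nm):
    """Alpha-band absorbance of an A1 pigment, normalized to about 1 at peak.

    Assumption: template form and constants of Govardovskii et al. (2000);
    the study itself may have used a different nomogram.
    """
    x = lambda_max_nm / wavelength_nm
    a = 0.8795 + 0.0459 * math.exp(-((lambda_max_nm - 300.0) ** 2) / 11940.0)
    return 1.0 / (
        math.exp(69.7 * (a - x))
        + math.exp(28.0 * (0.922 - x))
        + math.exp(-14.9 * (1.104 - x))
        + 0.674
    )


def mixture_peak(pigments, start_nm=450.0, stop_nm=650.0, step_nm=0.1):
    """Return the wavelength (nm) at which the summed absorbance is maximal.

    pigments: list of (lambda_max_nm, proportion) pairs, e.g. two
    coexpressed opsins in one photoreceptor.
    """
    best_wavelength, best_value = start_nm, float("-inf")
    n_steps = int(round((stop_nm - start_nm) / step_nm))
    for i in range(n_steps + 1):
        wavelength = start_nm + i * step_nm
        value = sum(p * a1_template(wavelength, lmax) for lmax, p in pigments)
        if value > best_value:
            best_wavelength, best_value = wavelength, value
    return best_wavelength


# LWS2 (532 nm) and LWS1 (560 nm) coexpressed 50%:50%, values from the text.
peak = mixture_peak([(532.0, 0.5), (560.0, 0.5)])
print(round(peak, 1))
```

Under these assumptions the peak of the summed curve falls between the two pure-pigment values, which is the behavior the coexpression model relies on; the exact value depends on the template chosen and need not reproduce the 544 nm reported in Table S4.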

Genomic sequencing

Our phylogenetic tree based on genomic sequences of characiforms was consistent with results from previous studies (Oliveira et al. 2011), with African and Neotropical Characiformes sharing a monophyletic origin and being sister taxa to Gymnotiformes and Siluriformes (Fig. S9). All species used in this study were recovered in their expected taxonomic groups except for panamensis which grouped with instead of .

Discussion

Dynamic opsin evolution in Characiformes

Opsin gene duplication and gene loss

Through transcriptome and genome analysis, we characterized opsin evolution in Neotropical Characiformes. Our results show that the opsin complement varies significantly between species (Fig. 3), with species utilizing from four (R. guatemalensis) to six cone opsins (P. panamensis). This diverse repertoire includes evidence for at least two separate copies of LWS opsins (LWS1 and LWS2), each with unique opsin sequences that originated after TGD. This was confirmed in our trees by the characiform LWS2 opsins clustering with the osteoglossomorph LWS2 opsins, which are known surviving copies of TGD (Liu et al. 2018). Our results also show that the LWS2 opsin underwent subsequent gene duplications within Characiformes, highlighting the susceptibility of opsin genes to duplication. LWS opsin gene duplications are not uncommon and they have independently occurred in several teleosts (Chinen et al. 2003; Matsumoto et al. 2006; Ward et al. 2008; Owens et al. 2009; Phillips et al. 2015; Liu et al. 2018). In addition, we found another TGD-surviving opsin, the product of an RH1 duplication: RH1-2. We confirmed this as the surviving duplicates clustered with the known RH1-2 opsins of Cypriniformes (Morrow et al. 2011; Morrow et al. 2017) (Fig. 2); however, their functionality remains unknown. RH1-2 duplicates have also been found in the Japanese eel (Anguilla japonica) (Nakamura et al. 2017). Altogether, Characiformes have maintained TGD duplicates of two opsin classes (LWS and RH1) (Fig. S4), yet these duplicates have not been retained together in other teleosts.

The characiform visual pigment repertoire is also characterized by the absence of some opsin classes. The loss of SWS1 opsins seems to have occurred early in the evolution of Characiformes because its absence was shared between two phylogenetically distant species (P. nattereri and A. mexicanus). This is also corroborated by our gene expression and MSP data, as we did not find any SWS1 cones (but see Parry et al. 2003). Additionally, RH2 seems to be variable across characiforms as it was absent in some species, but fully functional in others; a result supported by both MSP and gene expression.

Opsin gene conversion

We found evidence of gene conversion in both LWS1 and LWS2 opsins. GARD analysis showed that the recombination locations are primarily in the first exons, which suggests there might be selective pressures preventing gene conversion from homogenizing the coding sequences in exons 3, 4 and 5, where the “key sites” are located. Gene conversion is further supported by the different tree topologies, a pattern more evident in LWS2, where the tree based on exons 3 to 6 (Fig. S5B) suggests LWS2 opsins duplicated in the early ancestor of , and . This is a more parsimonious pattern than LWS2 duplications occurring in each family independently and is also in agreement with our tree based on genomic sequencing (Fig. S9). Overall, our findings are supported by other studies reporting gene conversion homogenizing opsin sequences (Watson et al. 2010; Rennison et al. 2012; Nakamura et al. 2013; Cortesi et al. 2015; Escobar-Camacho et al. 2017). However, it has also been suggested that gene conversion might increase allelic diversity (Ohta 2010).


329 Opsin neofunctionalization: evolution of spectral tuning and opsin gene 330 expression 331 The great diversity in visual sensitivities among Characiformes is the product of different 332 spectral tuning mechanisms acting in concert. These include opsin sequence tuning, 333 opsin gene expression, opsin gene loss and duplication, opsin coexpression, and 334 chromophore tuning. 335 336 337 Opsin sequence tuning evolution 338 As discussed above, some characiforms have lost the RH2 opsin through opsin 339 downregulation and pseudogenization. However, characiforms have expanded their 340 green sensitivity by utilizing another opsin class gained from opsin gene duplication 341 followed by opsin sequence tuning. Through genetic and electrophysiology experiments 342 we confirm that LWS2 opsins are sensitive to green light and that this is maintained in all 343 analyzed species (Fig. 5,7). This is consistent with previous molecular studies that 344 showed that Astyanax had green sensitive opsins (Yokoyama and Yokoyama 1990b; 345 Yokoyama and Yokoyama 1990a) due to mutations in three of the known “five-sites” 346 (Yokoyama and Radlwimmer 1998). In our analysis, the diversity at spectral tuning sites 347 in the LWS opsins, particularly site 164 (Fig. S6), showed the ability of opsins to acquire

348 new functions through opsin sequence variation. This is important because shifts in λmax

349 can have profound impacts on fish color vision. As the λmax of a photoreceptor shifts 350 across the wavelength spectrum, chromatic contrast will also vary in the visual color 351 space and this could affect chromatic discrimination. 352 In addition, micro-spectrophotometry was not able to distinguish the two LWS2 (LWS2-1

353 and LWS2-2) duplicates and it remains unclear whether they differ at all in λmax. Protein 354 reconstitution might shed light on the effects of the observed substitutions in the two 355 LWS2 opsins and whether these have an influence on spectral absorbance. 356 357 The presence of LWS2 opsins in both Osteoglossiformes and Characiformes suggests 358 that LWS2-green sensitivity evolved in an early ancestor dated after TGD but before the 359 split of and Clupeocephala (around 240 MYA) (Hughes et al. 2018). 360 This implies that LWS2 opsins have been maintained in Osteoglossiformes and 361 Characiformes for at least over 300 million years (Liu et al. 2018), while they have been 362 lost in several other teleosts. Indeed, there is evidence that most duplicated genes were

11 bioRxiv preprint doi: https://doi.org/10.1101/695544; this version posted July 8, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

363 lost in the first 60 million years after TGD (Inoue et al. 2015). Green sensitivity might 364 have evolved during the (~300-250 MYA), a period characterized by several 365 fish events in its early and middle epochs and ending with the Permian mass 366 extinction around 251 MYA (Romano et al. 2016). Therefore, the LWS2 367 neofunctionalization through opsin sequence tuning might be a result of the strong 368 environmental changes characterizing the Permian where green sensitivity was favored 369 in collapsed freshwater environments. 370 371 Opsin expression 372 Novel opsins can also acquire new functions through gene expression mechanisms 373 (Cortesi et al. 2015). In Characiformes, it seems that the rise of the LWS2 opsins might 374 have changed the regulatory architecture of RH2 expression, leading to downregulation 375 and even to gene loss (Fig. 6,S4). Previous studies have found the same pattern

376 between two different opsin classes: whenever a strong shift in λmax occurs in one opsin, 377 there can be gene loss/downregulation in another one. In flounder and South American 378 cichlids, the SWS2 opsin has acquired green sensitivity, while the functionality of the 379 RH2 opsins has been reduced (Kasagi et al. 2018, Escobar-Camacho et al. 2019). A 380 similar pattern has occurred in Osteoglossiformes where the LWS2 opsin is green 381 sensitive and the RH2 opsin has been lost (Liu et al. 2018). 382 383 Micro-spectrophotometry suggests that, in double cones, the LWS2 opsin is sometimes 384 coexpressed in RH2/LWS2 mixes or in LWS2/LWS1. Immunohistochemistry and/or in- 385 situ approaches will be needed to characterize in more detail the extent and localization 386 in the retina of such coexpressed opsins. Coexpression of RH2 and LWS opsins has 387 been observed in mammals (Applebury et al. 2000; Parry and Bowmaker 2002; Lukáts 388 et al. 2005), amphibians (Isayama et al. 2014), in the guppy (Archer and Lythgoe 1990) 389 and appears to be widespread in another highly diverse freshwater group; the African 390 and Neotropical cichlids (Dalton et al. 2014; Dalton et al. 2015; Torres-Dowdall et al. 391 2017). 392 Regional variation in the expression of RH2, single or coexpressed, is likely at the origin 393 of the discrepancy in its abundance as measured by whole-retina transcriptomics and by 394 micro-spectrophotometry. MSP is a useful approach to identify the peak of spectral 395 sensitivity of a photoreceptor, particularly when coexpression and/or chromophore 396 variation are present, i.e. when opsin sequence does not provide sufficient information to

12 bioRxiv preprint doi: https://doi.org/10.1101/695544; this version posted July 8, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

infer photoreceptor sensitivity. However, MSP samples a limited number of photoreceptors in the regions of the retina that happen to be scanned, so it is not appropriate for quantitative estimates of photoreceptor or opsin class abundances.

Our results show that there is differential opsin expression in Characiformes. In Characidae, most species express LWS2-2 more than LWS2-1, whereas the opposite is true in species from other families (Fig. 6). Even though our data set is based on a few individuals and more sampling is needed to quantify differential opsin expression, this suggests there might be a pattern in which opsins are differentially regulated in different species. In addition, we found significant differential expression between RH1-1 and RH1-2 (Fig. S7), yet we do not know whether RH1-2 has a specific role in the visual system. More studies are needed to fully characterize its functionality. It has been shown that RH1-2 duplicates can acquire specific functions, such as regionalized expression in the zebrafish retina (Morrow et al. 2017) or ontogenetic expression in the life cycle of the Japanese eel (Nakamura et al. 2017).

Chromophore tuning

Fish visual pigments can be based on alternative chromophores, either 11-cis retinal, a derivative of vitamin A1, or 3,4-didehydroretinal, a derivative of vitamin A2, or on mixtures of the two, with the potential to generate large shifts in photoreceptor λmax toward longer wavelengths (Parry and Bowmaker 2000). In our dataset the spectral sensitivities of rods and cones are significantly different between species sampled in murky waters (B. chagrensis, C. magdalenae and H. microlepis) and those sampled in clear waters (A. ruberrimus, B. gonzalezi, G. atracaudatus and R. guatemalensis) (Fig. 8, Table S5-6), with the red shift attributable mainly to high levels of vitamin A2 in species sampled in turbid environments (Fig. 8). Since variation in spectral sensitivities is due, to a large extent, to differences in vitamin A2 content, it is not surprising that, with the notable exception of G. atracaudatus, there was little variation in blue cone sensitivity across species, as the effects of the chromophore are minimal at short wavelengths (Whitmore and Bowmaker 1989). Chromophore-based spectral tuning is characteristic of fish inhabiting long-wavelength-shifted habitats (Whitmore and Bowmaker 1989; Carleton et al. 2006; Toyama et al. 2008; Hofmann et al. 2009; Miyagi et al. 2012; Saarinen et al. 2012; Weadick et al. 2012; Liu et al. 2016; Terai et al. 2017; Torres-Dowdall et al. 2017; Escobar-Camacho et


al. 2019), and is well suited to the variable light environments of Neotropical freshwater ecosystems, famously among the most spectrally diverse on the planet, from clear fast-running mountain creeks, to muddy ‘white waters’ (from ‘agua blanca’, indicating turbid, highly scattering brown-tinted waters), and tannin-rich black waters (Wallace 1865; Costa et al. 2013; Escobar-Camacho et al. 2019). Finally, non-significant variation in rod chromophore proportions was observed between individuals, with some species exhibiting a similar proportion across individuals, while other species exhibit a larger range (Table S7). Variable A1/A2 ratios have been previously found in the characid A. fasciatus (Parry et al. 2003), and in rods of several other characiforms (Schwanzara 1967; Levine & MacNichol 1979).

Opsins and characiform phylogenetics

Even though our genomic multilocus phylogeny suggested a monophyletic origin of Characiformes, including African and Neotropical lineages (Citharinoidei and Characoidei, respectively) (Fig. S9), several of our opsin trees contradict this pattern because Characiformes appear paraphyletic in relation to Siluriformes and Gymnotiformes (Fig. S1-3). These contrasting results are not surprising, as the non-monophyly of Characiformes has been reported before (Nakatani et al. 2011; Chen et al. 2013; Chakrabarty et al. 2017), although other comprehensive studies have resolved Characiformes as monophyletic (Betancur-R et al. 2013; Arcila et al. 2017; Hughes et al. 2018). Studies that find Characiformes paraphyletic often find discrepancies between Citharinoidei and Characoidei, where the latter often clusters as sister group to Siluriformes (Nakatani et al. 2011; Chen et al. 2013; Chakrabarty et al. 2017). Interestingly, we obtained paraphyletic results in our opsin trees because opsin sequences from Gymnotiformes and Siluriformes clustered within the opsin clades of Characiformes, although we did not include opsin sequences of African species. The opsin tree topologies of this study could be the result of substitution saturation over evolutionary time or indeed a signal of paraphyletic origins of Characiformes. Future studies should include analysis of opsins of African characiforms in order to elucidate this pattern.


Conclusions

Through molecular analyses and micro-spectrophotometry we have characterized the visual system of Neotropical Characiformes. Their opsin repertoire is a product of complex evolutionary dynamics characterized by opsin gene loss (SWS1, RH2) and opsin gene duplication (LWS and RH1). These opsin duplicates are products of the teleost whole-genome duplication (TGD) and of characiform-specific duplication events, and have been maintained for hundreds of millions of years. The LWS duplicates have acquired new functions through amino acid substitutions at key sites that shift their maximal absorbance to green light. These duplicates exhibit gene conversion and use variable codons at key tuning sites, leading to reversion and parallel evolution. In addition, the SWS2 opsin exhibits great amino acid variation across species that might shift spectral sensitivities, and the RH1-2 opsin shows a distinct expression pattern, as it is always downregulated in our samples.

The diversity of visual pigments in Characiformes is the product of several spectral tuning mechanisms acting in concert. These are mainly opsin sequence variation, opsin gene loss and duplication, and A1/A2 chromophore tuning. Such mechanisms have probably allowed characiforms to thrive in the variable freshwater light environments of the Neotropics. Overall, the visual system of Characiformes showcases how opsins acquire new functions and the divergent evolutionary pathway followed by this group compared to other teleosts. This study shows how studying speciose, understudied groups provides a unique opportunity to better understand opsin gene evolution and spectral tuning mechanisms.

Materials and methods

Adult fish specimens were collected using fishing lines, manual seines and cast nets in several locations in Panama and Suriname from May to July of 2017. Fish were caught in murky, black, or clear waters (Table S8). Sampling was conducted under permits issued in accordance with Panamanian and Surinamese environmental protection laws (Ministerio de Ambiente de Panamá, MiAmbiente, permit No. SC/A-14-17; Ministry of Agriculture, Husbandry and Fisheries of Suriname, permit No. 1087). Fish were


handled following STRI IACUC protocol (#2017-0501-2020). After sampling, all fish were brought back to the Naos Research Laboratories at the Smithsonian Tropical Research Institute, Panama. Three specimens from each species were killed immediately for RNA sequencing, and a total of 29 specimens were used for micro-spectrophotometry (MSP). Fish sampled in Suriname were sacrificed at the laboratories of Anton de Kom University of Suriname. In total, we obtained 13 species belonging to eight different families within Characiformes (Table S8).

RNA seq

After collection, fish were euthanized with buffered MS-222, their eyes were enucleated and their retinas preserved in RNAlater. Two or three samples per species were used for RNA sequencing. Total RNA was extracted with an RNeasy kit (Qiagen) and RNA quality was verified on an Agilent Bioanalyzer. RNAseq libraries were prepared using the Illumina TruSeq RNA library preparation kit (Illumina Inc, San Diego) and sequenced to obtain 100-bp paired-end reads, with a total of 36 samples multiplexed in three lanes (12 samples per lane) on an Illumina HiSeq1500 sequencer at the University of Maryland Institute for Bioscience & Biotechnology Research. Data quality was checked using FastQC version 0.11.2. We then used Trimmomatic version 0.32 (Bolger et al. 2014) to remove overrepresented sequences and to retain sequences with a minimum quality score of 20 and a minimum length of 80 bp. Reads from the samples of each species were combined to obtain one de novo assembly per species (13 in total). Assembly was performed with Trinity version r20140413 (Haas et al. 2013), using only paired sequences with a minimum coverage of two to join contigs.
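The read-retention criterion stated above (minimum quality score of 20, minimum length of 80 bp) can be illustrated with a minimal Python sketch. This is not Trimmomatic's algorithm and does not reproduce the settings used in the study: it assumes Phred+33 quality strings, trims low-quality bases from the 3' end only, and the function names `phred_scores` and `trim_and_filter` are hypothetical, introduced for illustration.

```python
MIN_QUALITY = 20   # minimum Phred score, as stated in the text
MIN_LENGTH = 80    # minimum read length in bp, as stated in the text


def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_and_filter(sequence, quality_string):
    """Trim 3' bases below MIN_QUALITY; return None if the read is too short.

    Simplified stand-in for a trimming tool, not Trimmomatic's algorithm.
    """
    scores = phred_scores(quality_string)
    end = len(scores)
    # Walk back from the 3' end while bases fall below the quality threshold.
    while end > 0 and scores[end - 1] < MIN_QUALITY:
        end -= 1
    if end < MIN_LENGTH:
        return None
    return sequence[:end], quality_string[:end]


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    kept = trim_and_filter("A" * 100, "I" * 90 + "#" * 10)
    dropped = trim_and_filter("A" * 100, "I" * 50 + "#" * 50)
    print(len(kept[0]), dropped)
```

In this toy case the first read is trimmed to 90 bp and retained, while the second falls below 80 bp after trimming and is discarded.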

Opsin phylogenetics and molecular analysis

Candidate opsin sequences were identified from the assembled transcriptome FASTA files by tblastx, querying with the opsin genes of Astyanax fasciatus (Yokoyama and Yokoyama 1990a; Yokoyama and Yokoyama 1993; Register et al. 1994; Yokoyama et al. 1995). Because we found opsin duplicates in the transcriptomes, we used GENEIOUS 8.1 to map paired reads to each paralog and correctly assemble each opsin sequence. We assigned the gene sequences of each species to a particular opsin class based on their phylogenetic relationships with opsin sequences of lamprey (Geotria australis) and several teleosts obtained from GenBank (Benson et


al. 2005). Furthermore, we added to our analysis opsin sequences from the available genomes of the Mexican cavefish (Astyanax mexicanus) and the red-bellied piranha (Pygocentrus nattereri). We used MAFFT (Katoh et al. 2002) to align amino acid sequences and ProtTest 3.4.2 (Darriba et al. 2011) to obtain the evolutionary models for each opsin class (Table S9). We built maximum-likelihood trees in RAxML 8.0 on CIPRES (Miller et al. 2015), running 10 searches for the best tree and 1000 bootstrap replicates.

Once we identified the characiform opsin classes, we searched for amino acid substitutions that could shift the spectral sensitivity of visual pigments. To do this, we aligned characiform opsin sequences with bovine rhodopsin and with opsins from other teleosts. We looked for substitutions that fell in putative transmembrane regions and in the retinal binding pocket facing the chromophore, or in known spectral tuning sites (Hunt et al. 2001; Carleton et al. 2005; Yokoyama 2008; Yokoyama et al. 2008).

Finally, since there were opsin duplicates in most analyzed species, we tested for gene conversion, as this has recently emerged as a common phenomenon in teleosts (Owens et al. 2009; Watson et al. 2010; Nakamura et al. 2013; Escobar-Camacho et al. 2017; Sandkam et al. 2017). For this we used the program GARD (Genetic Algorithm for Recombination Detection) (Kosakovsky Pond et al. 2006) on separate alignments of the LWS duplicates to detect the presence or absence of recombination. To corroborate patterns of gene conversion we built phylogenetic trees based on the fragments between the recombination breakpoints.

Ancestral state reconstruction

Previous research has characterized the molecular basis of spectral tuning in the LWS pigments, where five amino acid changes (S164A, H181Y, Y261F, T269A, A292S) can shift λmax by up to 50 nm (Yokoyama and Yokoyama 1990a; Yokoyama and Radlwimmer 1998; Yokoyama and Radlwimmer 2001; Yokoyama et al. 2008). Since we observed variation in the occurrence of three of these amino acid substitutions (S164A, Y261F, T269A), which are known to shift the λmax of LWS opsins to shorter wavelengths by 7, 10 and 16 nm, respectively (Asenjo et al. 1994; Takahashi and Ebrey 2003; Yokoyama 2008; Yokoyama et al. 2008), we analyzed the evolutionary relationships between the spectral


tuning sites across the Characiformes. We observed seven combinations of the three sites in our LWS dataset (Table S10). We assigned one of these combinations to each LWS opsin gene and performed a discrete-trait ancestral state reconstruction analysis in R (R Core Team 2014).

In addition, we observed that, among the three tuning sites, site 164 (S164A) was the most variable. In order to understand the molecular mechanisms leading to this variation, we reconstructed the evolutionary changes leading to either serine or alanine and identified parallel changes and reversions by characterizing the extant codons in each gene and performing ancestral state reconstruction with the codon of each gene coded as a trait.

For ancestral state reconstruction analyses we used the ace function in the APE package (Paradis and Schliep 2018) in R. The ace function employs a maximum likelihood approach in which the reconstructed ancestral states are given as the proportion of the total likelihood accounted for by each state at each node.
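The coding of the three LWS tuning sites as a single discrete trait can be sketched in Python as follows. This is an illustration only, not the script used in the study: the function names are hypothetical, and summing the per-site shifts (7, 10 and 16 nm for S164A, Y261F and T269A) assumes that their effects are additive, which the text does not state.

```python
from itertools import product

# Short-wavelength shifts (nm) given in the text for each substitution.
SHIFT_NM = {"S164A": 7, "Y261F": 10, "T269A": 16}

# For each site: (residue before the substitution, residue after it).
SITES = {164: ("S", "A"), 261: ("Y", "F"), 269: ("T", "A")}


def haplotype(residues):
    """Return the three-letter state (e.g. 'SYT') for sites 164, 261, 269."""
    return "".join(residues[site] for site in sorted(SITES))


def predicted_blue_shift(residues):
    """Sum the shifts of the substitutions present (additivity is assumed)."""
    total = 0
    for site, (before, after) in SITES.items():
        if residues[site] == after:
            total += SHIFT_NM[f"{before}{site}{after}"]
    return total


def all_possible_haplotypes():
    """Enumerate every combination of the two residues at the three sites."""
    options = [SITES[site] for site in sorted(SITES)]
    return ["".join(combo) for combo in product(*options)]


if __name__ == "__main__":
    example = {164: "A", 261: "Y", 269: "A"}  # hypothetical LWS sequence
    print(haplotype(example), predicted_blue_shift(example))
    print(len(all_possible_haplotypes()))
```

With two residues at each of three sites there are eight possible combinations, of which seven were observed in the LWS dataset; each observed combination is the kind of discrete state that would be assigned to a gene before reconstruction.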

Opsin gene expression

To estimate the expression of each opsin gene, reads were mapped back to the assembled transcriptomes using RSEM as part of the Trinity package (Haas et al. 2013). Read counts for each opsin class were extracted from the RSEM output (quantified as fragments per kilobase of transcript per million reads, FPKM). In order to avoid non-independent bias of opsin expression owing to variation in the expression of each opsin class, cone opsin read counts were then normalized to those of the β-actin gene. We also normalized for total cone opsin expression, dividing the expression of each opsin by the sum of all cone opsin counts to obtain the proportion of each expressed opsin.
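The two normalizations described above can be sketched in a few lines of Python. The FPKM values below are invented for illustration and the function names are hypothetical; the sketch only shows the arithmetic, not the RSEM workflow itself.

```python
def normalize_to_reference(opsin_fpkm, reference_fpkm):
    """Express each opsin relative to a reference gene such as beta-actin."""
    if reference_fpkm <= 0:
        raise ValueError("reference expression must be positive")
    return {gene: value / reference_fpkm for gene, value in opsin_fpkm.items()}


def opsin_proportions(cone_opsin_fpkm):
    """Divide each cone opsin's expression by the summed cone opsin expression."""
    total = sum(cone_opsin_fpkm.values())
    if total <= 0:
        raise ValueError("total cone opsin expression must be positive")
    return {gene: value / total for gene, value in cone_opsin_fpkm.items()}


if __name__ == "__main__":
    # Hypothetical FPKM values for one retina (not data from the study).
    cone_opsins = {"SWS2": 50.0, "LWS2-1": 150.0, "LWS2-2": 300.0}
    beta_actin = 200.0
    print(normalize_to_reference(cone_opsins, beta_actin))
    print(opsin_proportions(cone_opsins))
```

Because the proportions sum to one within each sample, they allow the relative contribution of each cone opsin to be compared across individuals that differ in overall expression level.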

Photoreceptor spectral sensitivity

The wavelength of maximum spectral sensitivity (λmax) of individual photoreceptors was obtained by micro-spectrophotometry of fresh retinas from wild-caught fish from the same collecting sites as the individuals used for gene expression analysis. Fish were dark-adapted for at least 2 h, after which they were sacrificed with an overdose of buffered MS-222. Eyes were enucleated under a dissecting scope in dim deep-red light. The retina was removed


and transferred to a PBS solution containing 6.0% sucrose (Sigma). A small piece of retina was cut out and placed on a glass cover slip in a drop of solution, then delicately macerated with razor blades. The preparation was covered with a second glass cover slip and sealed with high-vacuum silicone grease (Dow Corning) (Escobar-Camacho et al. 2019).

Spectral absorbance was measured with a computer-controlled single-beam micro-spectrophotometer fitted with quartz optics and a 100 W quartz-halogen lamp (Loew 1982). Baseline records were taken by averaging a scan from 750 to 350 nm and a second in the opposite direction, through a clear area of the preparation in proximity to the photoreceptor of interest. A record of the visual cell was then obtained by scanning with the MSP beam through the photoreceptor outer segment. Finally, the cell’s absorption spectrum was obtained by subtracting the baseline record. A custom-designed spectral analysis program (Loew et al. unpublished data) was used to determine λmax from absorbance records using existing templates (Dartnall 1953; Munz and Schwanzara 1967). Individual spectra were smoothed with a nine-point adjacent averaging function and the resulting curves were differentiated to obtain a preliminary maximum value. This was used to normalize curves to zero at the baseline on the long-wavelength limb and to one at the maximum value (Escobar-Camacho et al. 2019). Whitmore and Bowmaker’s (1989) relationship (Eqn 1) was used to recursively fit the observed (normalized) absorption spectra to curves resulting from combinations of different proportions of pure vitamin A1 and corresponding pure vitamin A2 nomograms (Whitmore and Bowmaker 1989).

$$\lambda_{\max A1} = (\lambda_{\max A2} - 250)^{0.4} \times 52.5 \qquad \text{(Eqn 1)}$$

A non-parametric ANOVA on ranks (Kruskal-Wallis test) followed by pairwise Wilcoxon tests was used to compare the λmax of different photoreceptor classes.
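Two steps of the MSP analysis can be sketched in Python: the A1/A2 relationship of Eqn 1, taken here as λmaxA1 = 52.5 × (λmaxA2 − 250)^0.4, and nine-point adjacent averaging. This is a sketch under stated assumptions, not the custom analysis program used in the study: the function names are hypothetical, the inverse is simply the algebraic rearrangement of Eqn 1, and the treatment of the ends of the spectrum (a shrinking window) is an assumption.

```python
def a1_from_a2(lmax_a2):
    """Eqn 1: lambda_max (nm) of the A1 pigment from its A2 counterpart."""
    return (lmax_a2 - 250.0) ** 0.4 * 52.5


def a2_from_a1(lmax_a1):
    """Algebraic inverse of Eqn 1: lambda_max (nm) of the A2 pigment."""
    return (lmax_a1 / 52.5) ** 2.5 + 250.0


def smooth_nine_point(values):
    """Nine-point adjacent averaging; the window shrinks at the ends (assumed)."""
    half = 4  # four points on each side plus the centre point
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # Illustrative A1 lambda_max values (nm), not measurements from the study.
    for lmax_a1 in (450.0, 500.0, 560.0):
        shift = a2_from_a1(lmax_a1) - lmax_a1
        print(f"A1 {lmax_a1:.0f} nm -> A2 shift of {shift:.1f} nm")
```

Under Eqn 1 the A1-to-A2 shift grows with wavelength, which is consistent with the observation in the text that chromophore effects are minimal for blue cones and largest for long-wavelength pigments.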

DNA extraction, sequencing and phylogenetic analysis

To analyze the evolutionary relationships of the sampled characiforms and confirm species identification, we sequenced nuclear and mitochondrial genes (16S, Cytb, Myh6, RAG1 and RAG2) from all collected species except C. spilurus. DNA was extracted with


a DNeasy kit (Qiagen), and DNA quality was verified using a Nanodrop spectrophotometer. PCR reactions were performed in a total volume of 50 µl using a thermocycler (Eppendorf). Reactions contained 25 µl DreamTaq DNA polymerase, 20 µl of sterile distilled H2O, 2 µl of each primer (10 µM), and 1 µl template DNA. Conditions were as follows: 94°C (2 min); 35 cycles of 94°C (30 s), 54°C (30 s), and 72°C (2 min); followed by 72°C (4 min). Nested PCRs were used to amplify the genes RAG1 and RAG2. Amplified products were checked on a 1% agarose gel stained with GelRed.

Once individual genes were sequenced for each species, all genes were concatenated. We added our species alignments to a data set of 213 sequenced characiforms for the same markers from Oliveira et al. (2011). We used PartitionFinder2 (Lanfear et al. 2017) to obtain the most appropriate phylogenetic models and partitioning scheme. Finally, we used RAxML to build maximum-likelihood trees with 1000 bootstrap replicates (Miller et al. 2015).

Acknowledgements

Special thanks go to Suwei Zhao for training during library preparations. We thank the University of Maryland Institute for Bioscience & Biotechnology Research for sequencing. We thank Michaela Taylor for help during genomic sequencing and Danielle Adams for help during ancestral reconstruction analysis. We are grateful to Ellis R. Loew for generously providing his MSP machine and analysis software. We thank Alejandra Rodríguez-Abaunza and Aureliano Valencia for their assistance during sampling. We also thank all of the staff at Bocas del Toro Research Station and at Naos Laboratories, Smithsonian Tropical Research Institute (STRI), Panama, for their help during our field season. We also thank Owen McMillan for logistic support and Richard Cooke for his valuable insight during our field season. This work was supported by a STRI Short Term Fellowship (ID 102755 to D.E-C); the National Institutes of Health (R01EY024693 to K.L.C); and the Secretariat of Higher Education, Science, Technology and Innovation of Ecuador (SENESCYT) (2014-AR2Q4465 to D.E-C).

Data availability

DNA sequences and transcriptome libraries will be available upon manuscript acceptance.


References

Applebury ML, Antoch MP, Baxter LC, Chun LLY, Falk JD, Farhangfar F, Kage K, Krzystolik MG, Lyass LA, Robbins JT. 2000. The Murine Cone Photoreceptor: A Single Cone Type Expresses Both S and M Opsins with Retinal Spatial Patterning. Neuron 27:513–523.
Archer SN, Lythgoe JN. 1990. The visual pigment basis for cone polymorphism in the guppy, Poecilia reticulata. Vision Res. 30:225–233.
Arcila D, Ortí G, Vari R, Armbruster JW, Stiassny MLJ, Ko KD, Sabaj MH, Lundberg J, Revell LJ, Betancur-R R. 2017. Genome-wide interrogation advances resolution of recalcitrant groups in the tree of life. Nat. Ecol. Evol. 1:20.
Arcila D, Petry P, Ortí G. 2018. Phylogenetic relationships of the family Tarumaniidae (Characiformes) based on nuclear and mitochondrial data. Neotrop. Ichthyol. 16:e180016.
Asenjo AB, Rim J, Oprian DD. 1994. Molecular determinants of human red/green color discrimination. Neuron 12:1131–1138.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. 2005. GenBank. Nucleic Acids Res. 33:34–38.
Betancur-R R, Broughton R, Wiley E, Carpenter K. 2013. The tree of life and a new classification of bony fishes. PLoS Curr. Tree Life 0732988.
Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120.
Bowmaker JK. 1998. Evolution of colour vision in vertebrates. Eye 12:541–547.
Bowmaker JK. 2008. Evolution of vertebrate visual pigments. Vision Res. 48:2022–2041.
Bowmaker JK, Govardovskii VI, Sideleva VG, Shukolyukov SA, Zueva LV. 1994. Visual Pigments and the Photic Environment: the Cottoid Fish of Lake Baikal. Vision Res. 34:591–605.
Bowmaker JK, Hunt DM. 2006. Evolution of vertebrate visual pigments. Curr. Biol. 16:R484–R489.
Carleton K. 2009. Cichlid fish visual systems: mechanisms of spectral tuning. Integr. Zool. 4:75–86.
Carleton KL, Spady TC, Cote RH. 2005. Rod and cone opsin families differ in spectral tuning domains but not signal transducing domains as judged by saturated evolutionary trace analysis. J. Mol. Evol. 61:75–89.
Carleton KL, Spady TC, Kocher TD. 2006. Visual communication in East African Cichlid Fishes: Diversity in a phylogenetic context. In: Communication in Fishes. p. 485–515.


Chen WJ, Lavoué S, Mayden RL. 2013. Evolutionary Origin And Early Biogeography Of Otophysan Fishes (Ostariophysi: Teleostei). Evolution 67:2218–2239.
Chinen A, Hamaoka T, Yamada Y, Kawamura S. 2003. Gene duplication and spectral diversification of cone visual pigments of zebrafish. Genetics 163:663–675.
Chinen A, Matsumoto Y, Kawamura S. 2005. Reconstitution of Ancestral Green Visual Pigments of Zebrafish and Molecular Mechanism of Their Spectral Differentiation. Mol. Biol. Evol. 22:1001–1010.
Cortesi F, Musilová Z, Stieb SM, Hart NS, Siebeck UE, Malmstrøm M, Tørresen OK, Jentoft S, Cheney KL, Marshall NJ, et al. 2015. Ancestral duplications and highly dynamic opsin gene evolution in percomorph fishes. Proc. Natl. Acad. Sci. 112:1493–1498.
Costa M, Telmer K, Novo EMLM. 2013. Spatial and temporal variability of light attenuation in the Amazonian waters. Hydrobiologia 702:171–190.
Dalton BE, Loew ER, Cronin TW, Carleton KL. 2014. Spectral tuning by opsin coexpression in retinal regions that view different parts of the visual field. Proc. R. Soc. B 281:20141980.
Dalton BE, Lu J, Leips J, Cronin TW, Carleton KL. 2015. Variable light environments induce plastic spectral tuning by regional opsin coexpression in the African cichlid fish, Metriaclima zebra. Mol. Ecol. 24:4193–4204.
Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27:1164–1165.
Dartnall HJA. 1953. The interpretation of spectral sensitivity curves. Br. Med. Bull. 9:24–30.
Davies WIL, Collin SP, Hunt DM. 2012. Molecular ecology and adaptation of visual photopigments in craniates. Mol. Ecol. 21:3121–3158.
Davies WL, Cowing JA, Carvalho LS, Potter IC, Trezise AEO, Hunt DM, Collin SP. 2007. Functional characterization, tuning, and regulation of visual pigment gene expression in an anadromous lamprey. FASEB J. 21:2713–2724.
Escobar-Camacho D, Pierotti MER, Ferenc V, Sharpe DMT, Ramos E, Martins C, Carleton KL. 2019. Variable vision in variable environments: the visual system of an invasive cichlid (Cichla monoculus) in Lake Gatun, Panama. J. Exp. Biol. 222:jeb188300.
Escobar-Camacho D, Ramos E, Martins C, Carleton KL. 2017. The opsin genes of amazonian cichlids. Mol. Ecol. 26:1343–1356.


Froese R, Pauly D. 2019. FishBase. World Wide Web electronic publication. Available from: www.fishbase.org
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, et al. 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8:1494–1512.
Chakrabarty P, Faircloth BC, Alda F, Ludt WB, McMahan CD, et al. 2017. Phylogenomic Systematics of Ostariophysan Fishes: Ultraconserved Elements Support the Surprising Non-Monophyly of Characiformes. Syst. Biol. 0:1–15.
Hofmann CM, Carleton KL. 2009. Gene duplication and differential gene expression play an important role in the diversification of visual pigments in fish. Integr. Comp. Biol. 49:630–643.
Hofmann CM, O’Quin KE, Marshall NJ, Cronin TW, Seehausen O, Carleton KL. 2009. The eyes have it: Regulatory and structural changes both underlie cichlid visual pigment diversity. PLoS Biol. 7:e1000266.
Hughes LC, Ortí G, Huang Y, Sun Y, Baldwin CC, Thompson AW. 2018. Comprehensive phylogeny of ray-finned fishes (Actinopterygii) based on transcriptomic and genomic data. Proc. Natl. Acad. Sci. 115:6249–6254.
Hunt DM, Dulai KS, Partridge JC, Cottrill P, Bowmaker JK. 2001. The molecular basis for spectral tuning of rod visual pigments in deep-sea fish. J. Exp. Biol. 204:3333–3344.
Inoue J, Sato Y, Sinclair R, Tsukamoto K, Nishida M. 2015. Rapid genome reshaping by multiple-gene loss after whole-genome duplication in teleost fish suggested by mathematical modeling. Proc. Natl. Acad. Sci. 112:14918–14923.
Isayama T, Chen Y, Kono M, Fabre E, Slavsky M, Degrip WJ, Ma J, Crouch RK, Makino CL. 2014. Coexpression of Three Opsins in Cone Photoreceptors of the Salamander Ambystoma tigrinum. J. Comp. Neurol. 522:2249–2265.
Kasagi S, Mizusawa K, Takahashi A. 2018. Green-shifting of SWS2A opsin sensitivity and loss of function of RH2-A opsin in flounders, genus Verasper. Ecol. Evol. 8:1399–1410.
Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.
Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. 2006. GARD: a


genetic algorithm for recombination detection. Bioinformatics 22:3096–3098.
Lanfear R, Frandsen PB, Wright AM, Senfeld T, Calcott B. 2017. PartitionFinder 2: New methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses. Mol. Biol. Evol. 34:772–773.
Levine JS, MacNichol EF. 1979. Visual Pigments in Teleost Fishes: Effects of Habitat, Microhabitat, and Behavior on Visual System Evolution. Sens. Processes 3:95–131.
Lin JJ, Wang FY, Li WH, Wang TY. 2017. The rises and falls of opsin genes in 59 ray-finned fish genomes and their implications for environmental adaptation. Sci. Rep. 7:1–13.
Liu D-W, Lu Y, Yan HY, Zakon HH. 2016. South American Weakly Electric Fish (Gymnotiformes) Are Long-Wavelength-Sensitive. Brain. Behav. Evol. 88:204–212.
Liu D, Wang F, Lin J, Thompson A, Lu Y, Vo D, Yan HY, Zakon H. 2018. The Cone Opsin Repertoire of Osteoglossomorph Fishes: Gene Loss in Mormyrid Electric Fish and a Long Wavelength-Sensitive Cone Opsin That Survived 3R. Mol. Biol. Evol. 36:447–457.
Loew ER. 1982. A Field-Portable Microspectrophotometer. Methods Enzymol. 81:647–655.
Loew ER, McFarland WN. 1990. The underwater visual environment. In: Douglas RH, Djamgoz MBA, editors. The Visual System of Fish. New York, NY: Chapman and Hall. p. 526.
Lukáts Á, Szabó A, Röhlich P, Vígh B, Szél Á. 2005. Photopigment coexpression in mammals: comparative and developmental aspects. Histol. Histopathol. 20:551–574.
Lythgoe JN, Muntz WRA, Partridge JC. 1994. The ecology of the visual pigments of snappers (Lutjanidae) on the Great Barrier Reef. J. Comp. Physiol. A 174:461–467.
Matsumoto Y, Fukamachi S, Mitani H, Kawamura S. 2006. Functional characterization of visual opsin repertoire in Medaka (Oryzias latipes). Gene 371:268–278.
Meyer A, Van de Peer Y. 2005. From 2R to 3R: evidence for a fish-specific genome duplication (FSGD). BioEssays:937–945.
Miller MA, Schwartz T, Pickett BE, He S, Klem EB, Scheuermann RH, Passarotti M, Kaufman S, O’Leary MA. 2015. A RESTful API for Access to Phylogenetic Tools via the CIPRES Science Gateway. Evol. Bioinforma.:43–48.
Miyagi R, Terai Y, Aibara M, Sugawara T, Imai H, Tachida H, Mzighani SI, Okitsu T, Wada A, Okada N. 2012. Correlation between nuptial colors and visual sensitivities


tuned by opsins leads to species richness in sympatric Lake Victoria Cichlid Fishes. Mol. Biol. Evol. 29:3281–3296.
Morrow JM, Lazic S, Chang BSW. 2011. A novel rhodopsin-like gene expressed in zebrafish retina. Vis. Neurosci. 28:325–335.
Morrow JM, Lazic S, Dixon Fox M, Kuo C, Schott RK, de A. Gutierrez E, Santini F, Tropepe V, Chang BSW. 2017. A second visual rhodopsin gene, rh1-2, is expressed in zebrafish photoreceptors and found in other ray-finned fishes. J. Exp. Biol. 220:294–303.
Muntz WRA. 1973. Yellow filters and the absorption of light by the visual pigments of some amazonian fishes. Vision Res. 13:2235–2254.
Muntz WRA. 1982. Visual Adaptations to Different light environments in Amazonian Fishes. Rev. Can. Biol. Exp. 41:35–46.
Munz FW, Schwanzara SA. 1967. A nomogram for retinene2-based visual pigments. Vision Res. 7:111–120.
Musilova Z, Cortesi F, Matschiner M, Davies WIL, Patel JS, Stieb SM, de Busserolles F, Brown CJ, Mountford JK, Hanel R, et al. 2019. Vision using multiple distinct rod opsins in deep-sea fishes. Science 364:588–592.
Nakamura Y, Mori K, Saitoh K, Oshima K, Mekuchi M, Sugaya T, Shigenobu Y. 2013. Evolutionary changes of multiple visual pigment genes in the complete genome of Pacific bluefin tuna. Proc. Natl. Acad. Sci. 110:11061–11066.
Nakamura Y, Yasuike M, Mekuchi M, Iwasaki Y, Ojima N, Fujiwara A, Chow S, Saitoh K. 2017. Rhodopsin gene copies in Japanese eel originated in a teleost-specific genome duplication. Zool. Lett. 3:2–12.
Nakatani M, Miya M, Mabuchi K, Saitoh K, Nishida M. 2011. Evolutionary history of Otophysi (Teleostei), a major clade of the modern freshwater fishes: Pangaean origin and Mesozoic radiation. BMC Evol. Biol. 11:177.
Ohta T. 2010. Gene Conversion and Evolution of Gene Families: An Overview. Genes (Basel). 1:349–356.
Oliveira C, Avelino GS, Abe KT, Mariguela TC, Benine RC, Ortí G, Vari RP, Corrêa e Castro RM. 2011. Phylogenetic relationships within the speciose family Characidae (Teleostei: Ostariophysi: Characiformes) based on multilocus analysis and extensive ingroup sampling. BMC Evol. Biol. 11:275.
Owens GL, Windsor DJ, Mui J, Taylor JS. 2009. A fish eye out of water: Ten visual opsins in the four-eyed fish, Anableps anableps. PLoS One 4:1–7.


Paradis E, Schliep K. 2018. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics:1–2.
Parry JWL, Bowmaker JK. 2000. Visual pigment reconstitution in intact goldfish retina using synthetic retinaldehyde isomers. Vision Res. 40:9–15.
Parry JWL, Bowmaker JK. 2002. Visual Pigment Coexpression in Guinea Pig Cones: A Microspectrophotometric Study. Investig. Ophthalmol. Vis. Sci. 43:1662–1665.
Parry JWL, Peirson SN, Wilkens H, Bowmaker JK. 2003. Multiple photopigments from the Mexican blind cavefish, Astyanax fasciatus: a microspectrophotometric study. Vision Res. 43:31–41.
Phillips GAC, Carleton KL, Marshall NJ. 2015. Multiple Genetic Mechanisms Contribute to Visual Sensitivity Variation in the Labridae. Mol. Biol. Evol. 33:201–215.
R Core Team. 2014. R: A language and environment for statistical computing. Available from: http://www.r-project.org/
Register EA, Yokoyama R, Yokoyama S. 1994. Multiple origins of the green-sensitive opsin genes in fish. J. Mol. Evol. 39:268–273.
Rennison DJ, Owens GL, Taylor JS. 2012. Opsin gene duplication and divergence in ray-finned fish. Mol. Phylogenet. Evol. 62:986–1008.
Romano C, Koot MB, Kogan I, Brayard A, Minikh AV, Brinkmann W, Bucher H. 2016. Permian–Triassic Osteichthyes (bony fishes): diversity dynamics and body size evolution. Biol. Rev. Camb. Philos. Soc. 91:106–147.
Saarinen P, Pahlberg J, Herczeg G, Viljanen M, Karjalainen M, Shikano T, Merilä J, Donner K. 2012. Spectral tuning by selective chromophore uptake in rods and cones of eight populations of nine-spined stickleback (Pungitius pungitius). J. Exp. Biol. 215:2760–2773.
Sandkam BA, Joy JB, Watson CT, Breden F. 2017. Genomic Environment Impacts Color Vision Evolution in a Family with Visually Based Sexual Selection. Genome Biol. Evol. 9:3100–3107.
Schwanzara SA. 1967. The Visual Pigments of Freshwater Fishes. Vision Res. 7:121–148.
Takahashi Y, Ebrey TG. 2003. Molecular basis of spectral tuning in the newt short wavelength sensitive visual pigment. Biochemistry 42:6025–6034.
Taylor JS, Van de Peer Y, Braasch I, Meyer A. 2001. Comparative genomics provides evidence for an ancient genome duplication event in fish. Philos. Trans. R. Soc. B.:1661–1679.


Terai Y, Miyagi R, Aibara M, Mizoiri S, Imai H, Okitsu T, Wada A, Takahashi-Kariyazono S, Sato A, Tichy H, et al. 2017. Visual adaptation in Lake Victoria cichlid fishes: depth-related variation of color and scotopic opsins in species from sand/mud bottoms. BMC Evol. Biol. 17:200.
Torres-Dowdall J, Pierotti MER, Härer A, Karagic N, Woltering JM, Henning F, Elmer KR, Meyer A. 2017. Rapid and Parallel Adaptive Evolution of the Visual System of Neotropical Midas Cichlid Fishes. Mol. Biol. Evol. 34:2469–2485.
Toyama M, Hironaka M, Yamahama Y, Horiguchi H, Tsukada O, Uto N, Ueno Y, Tokunaga F, Seno K, Hariyama T. 2008. Presence of Rhodopsin and Porphyropsin in the Eyes of 164 Fishes, Representing Marine, Diadromous, Coastal and Freshwater Species: A Qualitative and Comparative Study. Photochem. Photobiol. 84:996–1002.
Wallace AR. 1853. A narrative of travels on the Amazon and Rio Negro: with an account of the native tribes, and observations on the climate, geology, and natural history of the Amazon valley. No. 8. Ward, Lock.
Ward MN, Churcher AM, Dick KJ, Laver CRJ, Owens GL, Polack MD, Ward PR, Breden F, Taylor JS. 2008. The molecular basis of color vision in colorful fish: Four Long Wave-Sensitive (LWS) opsins in guppies (Poecilia reticulata) are defined by amino acid substitutions at key functional sites. BMC Evol. Biol. 8:210.
Warrant EJ, Johnsen S. 2013. Vision and the light environment. Curr. Biol. 23:R990–R994.
Watson CT, Lubieniecki KP, Loew E, Davidson WS, Breden F. 2010. Genomic organization of duplicated short wave-sensitive and long wave-sensitive opsin genes in the green swordtail, Xiphophorus helleri. BMC Evol. Biol. 10:1–17.
Weadick CJ, Loew ER, Rodd FH, Chang BSW. 2012. Visual pigment molecular evolution in the Trinidadian pike cichlid (Crenicichla frenata): a less colorful world for neotropical cichlids? Mol. Biol. Evol. 29:3045–3060.
Whitmore AV, Bowmaker JK. 1989. Seasonal variation in cone sensitivity and short-wave absorbing visual pigments in the rudd Scardinius erythrophthalmus. J. Comp. Physiol. A 166:103–115.
Yokoyama R, Knox BE, Yokoyama S. 1995. Rhodopsin from the fish, Astyanax: Role of tyrosine 261 in the red shift. Investig. Ophthalmol. Vis. Sci. 36:939–945.
Yokoyama R, Yokoyama S. 1990a. Convergent evolution of the red- and green-like visual pigment genes in fish, Astyanax fasciatus, and human. Proc. Natl. Acad. Sci. 87:9315–9318.
Yokoyama R, Yokoyama S. 1990b. Isolation, DNA sequence and evolution of a color visual pigment gene of the blind cave fish, Astyanax fasciatus. Vision Res. 30:807–816.
Yokoyama R, Yokoyama S. 1993. Molecular characterization of a blue visual pigment gene in the fish Astyanax fasciatus. FEBS Lett. 334:27–31.
Yokoyama S. 2008. Evolution of dim-light and color vision pigments. Annu. Rev. Genomics Hum. Genet. 9:259–282.
Yokoyama S, Radlwimmer FB. 1998. The "Five-Sites" Rule and the Evolution of Red and Green Color Vision in Mammals. Mol. Biol. Evol. 15:560–567.
Yokoyama S, Radlwimmer FB. 2001. The Molecular Genetics and Evolution of Red and Green Color Vision in Vertebrates. Genetics 158:1697–1710.
Yokoyama S, Takenaka N, Agnew DW, Shoshani J. 2005. Elephants and Human Color-Blind Deuteranopes Have Identical Sets of Visual Pigments.
Yokoyama S, Yang H, Starmer WT. 2008. Molecular basis of spectral tuning in the red- and green-sensitive (M/LWS) pigments in vertebrates. Genetics 179:2037–2043.


Tables

Table 1. Cone and rod visual pigment peak sensitivities (λmax, nm)

                                         Photoreceptor type
Family        Species          n  RH1      SWS2     RH2      RH2/LWS2  LWS2     LWS2/LWS1  LWS1
Curimatidae   C. magdalenae    3  517-536  450-455  476-496  535-545   531-554  588        585
Erythrinidae  H. microlepis    4  510-528  —        489-491  526-543   535-561  564-581    576-614
Bryconidae    B. chagrensis    4  504-531  446-467  472-485  515-535   532-568  —          564-612
Characidae    G. atracaudatus  3  504-516  440-459  491      521       530      —          —
Characidae    B. gonzalezi     6  504-523  449-462  486-495  514-527   530-542  545        576
Characidae    R. guatemalensis 6  502-423  447-466  481-495  —         530-541  —          —
Characidae    A. ruberrimus    3  504-519  448-463  472-495  519-522   529-542  —          —


Figures

[Figure 1 tree: clades LWS-1 and LWS-2; teleost-specific genome duplication marked; scale bar 0.05]

Figure 1. LWS opsin tree of Characiformes. LWS opsin maximum-likelihood phylogenetic tree based on amino-acid sequences of Characiformes, Osteoglossiformes, Siluriformes, Gymnotiformes, Geotria australis (lamprey), Ornithorhynchus anatinus (platypus), Homo sapiens (human), Callorhinchus milii (elephant shark), Lepisosteus oculatus (spotted gar), Oryzias latipes (medaka), Gasterosteus aculeatus (stickleback), Clupea harengus (herring), Salmo salar (salmon), Oncorhynchus mykiss (trout), Carassius auratus (goldfish), and Danio rerio (zebrafish). Bootstrap support over 75% is shown. This tree confirms that LWS1 and LWS2 arose after the divergence of the spotted gar, probably as a product of the teleost whole-genome duplication (TGD). Note the clustering of characiform LWS2 opsins with the osteoglossomorph LWS2 opsins. Characiform species are represented as compressed color-filled clades (LWS2 in orange and LWS1 in red).


[Figure 2 tree: clades RH1-1, RH1-2, and RH2; teleost-specific genome duplication marked; scale bar 0.1]

Figure 2. RH1-RH2 opsin tree of Characiformes. RH1 and RH2 opsin maximum-likelihood phylogenetic tree based on amino-acid sequences of Characiformes, Osteoglossiformes, Siluriformes, Gymnotiformes, Cypriniformes, Geotria australis (lamprey), Latimeria chalumnae (coelacanth), Callorhinchus milii (elephant shark), Lepisosteus oculatus (spotted gar), Oryzias latipes (medaka), and Gasterosteus aculeatus (stickleback). Bootstrap support over 75% is shown. This tree confirms that RH1-2 arose after the divergence of the spotted gar, probably as a product of the teleost whole-genome duplication (TGD). Note the clustering of characiform RH1-2 opsins with the surviving RH1-2 opsins of cypriniforms. Characiform species are compressed in color-filled clades (RH1-2 in gray, RH1-1 in black, and RH2 in green).


[Figure 3 panel: columns Species, Family, and opsin genes SWS1, SWS2, RH2, LWS2, LWS1, RH1; scale bar 2.0. Species shown: C. spilurus, P. panamensis, C. magdalenae, S. rhombeus, P. nattereri, H. microlepis, B. chagrensis, C. strigata, H. panamensis, A. ruberrimus, A. mexicanus, B. gonzalezi, B. emperador, G. atracaudatus, R. guatemalensis]

Figure 3. Opsin gene complement in Characiformes. Left: schematic representation of the phylogenetic relationships of the characiforms in this study, based on Oliveira et al. (2011). Right: the presence or absence, and the number, of opsin genes in each class, based on transcriptomes and genomes. Species names and families are shown for the samples used in this study together with their opsin complement, in which each opsin gene is indicated by a filled circle within its opsin class. Empty circles denote potential gene losses. *Even though P. panamensis is considered a member of Lebiasinidae, our results grouped this species with the Parodontidae.


[Figure 4 bar chart: y-axis, sites with AA variation (0 to 120); x-axis, SWS2A, RH2, LWS2.1, LWS2.2, LWS1, RH1, RH1.2; legend, transmembrane sites and binding pocket sites]

Figure 4. Number of sites with amino-acid variation for each opsin class in 15 Characiformes species. Solid bars denote amino-acid variation in transmembrane regions, whereas striped bars denote variation in binding pocket sites.
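As an illustration of the kind of tally summarized in Figure 4, the following Python sketch counts alignment columns at which more than one residue occurs and then splits that count by region. It is not the authors' pipeline: the sequences are toy strings rather than opsin data, and the names `variable_sites`, `transmembrane_cols`, and `binding_pocket_cols`, together with the column annotations, are hypothetical.

```python
def variable_sites(alignment):
    """Return 0-based column indices at which more than one residue occurs.

    Gap characters ('-') and unknown residues ('X') are ignored.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for col in range(lengths.pop()):
        residues = {seq[col] for seq in alignment.values()} - {"-", "X"}
        if len(residues) > 1:
            sites.append(col)
    return sites


# Toy alignment (hypothetical sequences, not real opsin data).
toy_alignment = {
    "species_a": "MAVFTSLA",
    "species_b": "MAVYTSLA",
    "species_c": "MGVFT-LA",
}

# Hypothetical annotation: columns assumed to fall in transmembrane
# helices and in the retinal binding pocket.
transmembrane_cols = {1, 2, 3, 4, 5}
binding_pocket_cols = {3, 5}

variable = variable_sites(toy_alignment)
n_transmembrane = sum(col in transmembrane_cols for col in variable)
n_binding_pocket = sum(col in binding_pocket_cols for col in variable)
print(variable, n_transmembrane, n_binding_pocket)
```

Ignoring gaps is a design choice made here for simplicity; a column that differs only by a gap is not counted as variable.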


Figure 5. Ancestral state reconstruction of spectral tuning in LWS opsin genes. The three spectral tuning sites (S164A, Y261F, T269A) known to confer green sensitivity in Characiformes are shown for each species and each LWS gene. The seven combinations of the three tuning sites found in our data set are also shown, each represented by a colored circle. Pie charts on the nodes indicate the scaled likelihoods of each combination, calculated using the ace function in APE. Nodes #2 and #43 are marked by yellow and blue rectangles, respectively. Nodes are labeled as in Table S2.
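The state space behind Figure 5 can be sketched in Python. Assuming each of the three tuning sites carries only one of the two residues named in the caption, there are 2^3 = 8 possible combinations, of which the caption reports seven in the data set. The helper names below are hypothetical, and the sketch only enumerates and scores character states; it does not reproduce the likelihood calculation performed with ace.

```python
from itertools import product

# Tuning sites named in the caption: (position, residue before, residue after).
TUNING_SITES = [(164, "S", "A"), (261, "Y", "F"), (269, "T", "A")]


def all_combinations():
    """List every three-residue combination, assuming two residues per site."""
    options = [(before, after) for _, before, after in TUNING_SITES]
    return ["".join(combo) for combo in product(*options)]


def count_substitutions(combination):
    """Count how many of the three sites carry the substituted residue."""
    if len(combination) != len(TUNING_SITES):
        raise ValueError("expected one residue per tuning site")
    count = 0
    for residue, (position, before, after) in zip(combination, TUNING_SITES):
        if residue == after:
            count += 1
        elif residue != before:
            raise ValueError(f"unexpected residue {residue!r} at site {position}")
    return count


combinations = all_combinations()
print(len(combinations), combinations)
print({combo: count_substitutions(combo) for combo in combinations})
```

Each string can serve as a discrete character state for a tip of the gene tree, which is the form of input a discrete ancestral state reconstruction expects.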


[Figure 6 bar chart: x-axis, relative cone opsin expression (0.0 to 1.0); legend, SWS2, RH2, LWS_2_1, LWS_2_2, LWS_2_3, LWS_1_1, LWS_1_2, LWS_1_3. Families and species shown: Crenuchidae (C. spilurus), Lebiasinidae* (P. panamensis), Curimatidae (C. magdalenae), Serrasalmidae (S. rhombeus), Erythrinidae (H. microlepis), Bryconidae (B. chagrensis), Gasteropelecidae (C. strigata), Characidae (H. panamensis, A. ruberrimus, B. gonzalezi, B. emperador, G. atracaudatus, R. guatemalensis)]

Figure 6. Opsin expression in Characiformes. Relative cone opsin expression is shown for each characiform species and color-coded for each opsin.
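Relative expression of the kind plotted in Figure 6 is a proportion of the summed cone opsin signal, so the bars for one species add up to 1. The following is a minimal Python sketch, assuming per-gene expression values have already been quantified (this section does not describe how); the function name `relative_expression` and the numbers are hypothetical.

```python
def relative_expression(raw):
    """Convert per-gene expression values to proportions that sum to 1."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("total expression must be positive")
    return {gene: value / total for gene, value in raw.items()}


# Hypothetical expression values for one species (arbitrary units).
toy_values = {"SWS2": 50.0, "RH2": 150.0, "LWS_2_1": 300.0, "LWS_1_1": 500.0}

proportions = relative_expression(toy_values)
for gene, share in proportions.items():
    print(f"{gene}: {share:.2f}")
```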


Figure 7. Microspectrophotometry of Characiformes. Black circles represent mean maximal absorbance (λmax) from visual pigments in wild-caught Panamanian characiforms. Horizontal lines denote mean standard deviation. Colored columns represent the six cone classes and rods. Each circle represents an individual analyzed for a given species.


Figure 8. Boxplots showing variation in photoreceptor spectral sensitivity and proportion of vitamin A2 in Characiformes. (A) Rods. (B) LWS2 cones. (C) SWS2 cones.
