THE MOLECULAR EVOLUTION OF RHODOPSIN IN MARINE-DERIVED AND OTHER FRESHWATER FISHES

by

Alexander Van Nynatten

A thesis submitted in conformity with the requirements

for the degree of Doctor of Philosophy

Department of Cell and Systems Biology

University of Toronto

© Copyright by Alexander Van Nynatten (2019)

ABSTRACT

Visual system evolution can be influenced by the spectral properties of light available in the environment. Variation in the dim-light specialized visual pigment rhodopsin is thought to result in functional shifts that optimize its sensitivity in relation to ambient spectral environments. Marine and freshwater environments have been shown to differ in their spectral properties and might be expected to place the spectral sensitivity of rhodopsin under different selection pressures. In Chapter two, I show that the rate ratio of non-synonymous to synonymous substitutions is significantly elevated in the rhodopsin gene of a South American clade of freshwater anchovies with marine ancestry. This signature of positive selection is not observed in the rhodopsin gene of the marine sister clade or in non-visual genes. In Chapter three, I functionally characterize the effects of positively selected substitutions occurring during another, independent invasion of freshwater made by ancestrally marine croakers. In vitro spectroscopic assays on ancestrally resurrected rhodopsin pigments reveal a red-shift in peak spectral sensitivity along the transitional branch, consistent with the wavelengths of light illuminating freshwater environments. Kinetics assays reveal that freshwater croaker rhodopsin might also possess more efficient dark adaptation. In Chapter four, I use a comparative approach to show that substitutions with similar functional effects occur convergently during marine to freshwater transitions, but only in deeper-dwelling lineages. In Chapter five, I investigate the molecular evolution of rhodopsin in Gymnotiformes, a clade of freshwater fishes with an alternative sensory modality specialized for dim-light environments. Rhodopsin is highly conserved in this clade, but bouts of positive selection are observed in association with ecological transitions, indicating that dim-light vision remains an important sensory modality in these freshwater fishes. Altogether, these studies show that shifts in selection pressures and substitutions that alter the functional properties of rhodopsin are frequently observed during ecological transitions into and within freshwater environments, as long as the fishes inhabit depths where the attenuation of light is non-negligible. Furthermore, this thesis expands our understanding of the effects of ecology on visual evolution and its influence on the structural and functional properties of rhodopsin.

ACKNOWLEDGMENTS

This thesis would not have been possible without the help of many people. Most, if not all, of the good ideas expressed within have benefited from bouncing off the brains of my family, friends and colleagues. I would first like to thank my supervisors, Nate and Belinda. They have not only provided me with a tremendous amount of support and guidance over the past six years but have also given me enough freedom to explore many of my own interests and ideas, along with sending me to some truly remarkable and remote regions in the tropics. Their continued encouragement has been especially fundamental in my development as a writer and in fostering my passion for data visualization. I would also like to thank Jason Weir and Vince Tropepe for serving as my supervisory committee. They have helped shape this thesis into a coherent and feasible set of projects. I am grateful to John Calarco and David Liberles for donating their time to serve as my internal and external examiners, respectively.

My experience as a grad student was made so much more enjoyable thanks to my colleagues in the Chang and Lovejoy labs. I would especially like to thank Matt, Emma and JP for their help in the field. I may not have made it back from the tropics without them. However, the hospitality of Joe Waddell, Juan Bogota and JP's family made it a difficult decision to leave. I also owe a great deal of thanks to Gianni, Eduardo, James, Nihar and Ahmed for showing me the ropes with respect to many wet lab techniques, as well as for showing a great deal of patience in this process. I would not have been able to make any sense of the data generated in the wet lab without bioinformatics help from Frances, Ben, Sarah, Dominik, Lujan, Ryan and Amir. I would also like to thank Devin Bloom for collecting most of the fishes I analyzed in this project. Finally, I would like to acknowledge all of the support I have received from all other past and present graduate and undergraduate students in both the Chang and Lovejoy labs. On top of making me a better scientist, my lab mates have helped me develop into a better person and I consider myself very privileged to have had such a great bunch of people to learn from.

I would also like to thank my family and friends. My parents, Kathy and Walter, are the hardest working people I know, but somehow have still always been there when I needed them. As much as I complained about having to milk cows, pick stones and stack hay, it has helped make me a more patient person and also provided some quality time to think. In addition to the tireless support I have received from my parents during my extended education, I was very fortunate to have supportive siblings, Nick and Evonne, and friends from back home in Perth County, London and Toronto. Finally, I owe a great deal of thanks to my best friend and partner Rowshyra. She has not only kept me motivated through the final stretch of this PhD process but has imbued in me a deeper appreciation of what it takes to be a field biologist as well as a better understanding of the ecology of the fishes attached to the eyes that I study herein.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER ONE: GENERAL INTRODUCTION
  1.1. THE EYE AND VISION
    1.1.1. The evolution of the vertebrate eye
    1.1.2. Vertebrate eyes
    1.1.3. Vertebrate retinas
    1.1.4. The phototransduction cascade
    1.1.5. The retinoid cycle
  1.2. VISUAL ECOLOGY
    1.2.1. Principles of light relevant to vision
    1.2.2. The attenuation of light
    1.2.3. Light underwater
  1.3. THE EVOLUTION OF RHODOPSIN
    1.3.1. Opsin evolution
    1.3.2. Opsin structure and function
    1.3.3. Differences in rod and cone opsin functional properties
    1.3.4. Rhodopsin spectral tuning
    1.3.5. Mutations causing disease
  1.4. VISUAL EVOLUTION IN FISHES
    1.4.1. The evolution of fishes
    1.4.2. Rhodopsin evolution in teleosts
    1.4.3. Adaptation of rhodopsin to deep-sea environments
    1.4.4. Adaptations of rhodopsin to red-shifted freshwater environments
    1.4.5. Marine and freshwater fishes
  1.5. MOLECULAR EVOLUTION
    1.5.1. Molecular sequence evolution
    1.5.2. Molecular phylogenetics
    1.5.3. Codon models of molecular evolution
    1.5.4. Analyses of convergent evolution in protein sequences
    1.5.5. Experimental characterization of ancestral protein function
  1.6. THESIS OBJECTIVES
  1.7. THESIS OVERVIEW
  1.8. FIGURES
  1.9. REFERENCES

CHAPTER TWO: OUT OF THE BLUE: ADAPTIVE VISUAL PIGMENT EVOLUTION ACCOMPANIES AMAZON INVASION
  2.1. ABSTRACT
  2.2. INTRODUCTION
  2.3. METHODS
  2.4. RESULTS
  2.5. DISCUSSION
  2.6. TABLES
  2.7. FIGURES
  2.8. REFERENCES
  2.9. SUPPLEMENTAL INFORMATION

CHAPTER THREE: TURNING RED: FUNCTIONAL TUNING OF RHODOPSIN IN THE FACE OF STRONG SELECTION PRESSURES IN FRESHWATER CROAKERS
  3.1. ABSTRACT
  3.2. INTRODUCTION
  3.3. METHODS
    3.3.1. Sequencing and Sequence Alignment
    3.3.2. Phylogenetic Reconstructions
    3.3.3. Molecular Evolutionary Analyses
    3.3.4. Protein expression and functional characterization
  3.4. RESULTS
    3.4.1. Positive selection in croaker rhodopsin
    3.4.2. Increased positive selection in rhodopsin during invasion of freshwater rivers
    3.4.3. Different sets of positively selected sites in marine and freshwater lineages
    3.4.4. Experimental comparison of marine and freshwater ancestral croaker rhodopsins
  3.5. DISCUSSION
    3.5.1. Freshwater croakers have rhodopsin tuned to the underwater visual environment
    3.5.2. Rates of molecular evolution and distribution of positively selected sites differ in marine and freshwater lineages
    3.5.3. Faster dark adaptation in freshwater croakers and ecological implications of substitutions on the transitional lineage
  3.6. TABLES
  3.7. FIGURES
  3.8. REFERENCES
  3.9. SUPPLEMENTAL INFORMATION
    3.9.1. Phylogenetic Reconstructions
    3.9.2. Molecular Evolutionary Analyses
    3.9.3. Comparing CmC with CmD when two classes of positively selected sites present
    3.9.4. Supplementary tables
    3.9.5. Supplementary figures
    3.9.6. Supplementary references

CHAPTER FOUR: DEPTH-DEPENDENT MOLECULAR EVOLUTION OF RHODOPSIN IN FISHES THAT HAVE MADE EVOLUTIONARY TRANSITIONS FROM MARINE TO FRESHWATER ENVIRONMENTS
  4.1. ABSTRACT
  4.2. INTRODUCTION
  4.3. METHODS
    4.3.1. Sequence dataset assembly
    4.3.2. Ancestral habitat reconstructions
    4.3.3. Molecular evolutionary analyses
  4.4. RESULTS
    4.4.1. Divergent selection more frequently associated with freshwater transitions in than Beloniformes
    4.4.2. Non-conservative substitutions positively selected on transitional and freshwater lineages
    4.4.3. Beloniformes rhodopsin under pervasive positive selection irrespective of habitat
    4.4.4. Rhodopsin duplication in North American freshwater drum
    4.4.5. Convergent substitutions at functionally important sites in rhodopsin
  4.5. DISCUSSION
    4.5.1. Stronger selective pressure acting on rhodopsin in deeper dwelling fishes
    4.5.2. Pseudogenized copy of rhodopsin in Aplodinotus grunniens
    4.5.3. Rhodopsin evolves under positive selection in predatory fishes
    4.5.4. Convergent functional shifts in the rhodopsin protein of deeper-dwelling freshwater fishes
  4.6. TABLES
  4.7. FIGURES
  4.8. REFERENCES
  4.9. SUPPLEMENTAL INFORMATION
    4.9.1. Supplementary tables
    4.9.2. Supplementary figures

CHAPTER FIVE: RHODOPSIN SUBJECT TO SELECTIVE CONSTRAINT IN GYMNOTIFORM FISHES TO MAINTAIN VISUAL SENSITIVITY IN LIGHT-LIMITED UNDERWATER ENVIRONMENT
  5.1. ABSTRACT
  5.2. INTRODUCTION
  5.3. METHODS
    5.3.1. Gymnotiform rhodopsin dataset
    5.3.2. Vertebrate rhodopsin dataset
    5.3.3. Analyses of molecular evolution
  5.4. RESULTS
    5.4.1. Rhodopsin subject to strong purifying selection within the gymnotiform clade
    5.4.2. Positive selection on branch leading to the Gymnotiformes
    5.4.3. Positive selection in rhodopsin associated with deep-water adaptation
    5.4.4. An amino acid in gymnotiform rhodopsin causes visual disease in humans
  5.5. DISCUSSION
  5.6. TABLES
  5.7. FIGURES
  5.8. REFERENCES
  5.9. SUPPLEMENTAL INFORMATION
    5.9.1. Supplementary tables
    5.9.2. Supplementary figures

CHAPTER SIX: GENERAL CONCLUSIONS
  6.1. GENERAL SUMMARY
  6.2. ENVIRONMENTAL AND ECOLOGICAL EFFECTS ON RHODOPSIN EVOLUTION
  6.3. ALTERNATIVE AVENUES FOR ADAPTATION TO FRESHWATER VISUAL ENVIRONMENTS
  6.4. THE CHANGING EVOLUTIONARY LANDSCAPE DURING RHODOPSIN EVOLUTION
  6.5. TOWARDS A MORE HOLISTIC VIEW OF ADAPTATION IN RHODOPSIN STRUCTURE AND FUNCTION
  6.6. GENERAL CONCLUSIONS AND SIGNIFICANCE
  6.7. REFERENCES

LIST OF TABLES

Table 2.1. Clade model C (PAML) analyses of rhodopsin and non vision-related genes.
Table 2.2. Clade model C (PAML) analyses of rhodopsin and non vision-related genes, with marine reinvading anchovies included in marine partition.
Table 2.3. Random sites (PAML) analyses of rhodopsin.
Table 2.4. Results of random sites analyses (PAML) for non vision-related genes.
Table 2.5. Positively selected sites identified using PAML and HyPhy (FUBAR) analyses.
Table S2.1. Accession number for sequences generated in this study.
Table 3.1. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset using the Bayesian species tree.
Table 3.2. Positively selected sites in different tests of positive selection.
Table 3.3. Branch-sites and clade model analyses (PAML) of the world-wide croaker rhodopsin dataset using the Bayesian species tree.
Table 3.4. Branch-sites and clade model analyses (PAML) of the New World clade croaker rhodopsin dataset using the Bayesian species tree.
Table 3.5. Peak spectral sensitivity of croaker rhodopsin.
Table S3.1. Genbank accession numbers for sequences used in phylogenetic reconstructions and selection analyses.
Table S3.2. Results for Partition Finder analyses of concatenated dataset.
Table S3.3. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood species tree.
Table S3.4. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood rhodopsin gene tree.
Table S3.5. Branch-sites and clade model analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood species tree.
Table S3.6. Branch-sites and clade model analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood rhodopsin gene tree.
Table S3.7. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset with freshwater species removed using the Bayesian species tree.
Table S3.8. Branch-sites analyses (PAML) of the control gene dataset using the Bayesian species tree.
Table S3.9. Clade model analyses (PAML) of the control gene dataset using the Bayesian species tree.
Table S3.10. Clade model analyses (PAML) of the New World Clade rhodopsin dataset with and without highly positively selected sites removed using the Bayesian species tree.
Table S3.11. Substitutions on transitional branch and the frequency of their occurrence on other branches in the tree.
Table 4.1. Significant tests of positive and divergent selection on freshwater lineages.
Table 4.2. PAML results for Beloniformes for Random-sites analyses.
Table 4.3. PAML results for Clupeiformes for Random-sites analyses.
Table 4.4. PAML results for rhodopsin duplicates in the North American croaker invasion.
Table 4.5. Results from Pagel’s Discrete analysis.
Table 4.6. Convergent substitutions in rhodopsin in independent freshwater invasions.
Table S4.1. Habitat classification of freshwater Beloniformes, Clupeiformes and Sciaenidae species.
Table S4.2. New rhodopsin sequences for fishes included in the rhodopsin dataset.
Table S4.3. Branch-sites and clade model (PAML) results for the 55 species Clupeiformes rhodopsin dataset.
Table S4.4. Branch-sites and clade model (PAML) results for the 56 species Beloniformes rhodopsin dataset.
Table S4.5. Analysis of rhodopsin dataset with new Aplodinotus grunniens sequence using random-sites, branch-sites and clade models (PAML).
Table 5.1. Random-sites (PAML) analyses of the 147 species Gymnotiformes rhodopsin dataset using the maximum likelihood rhodopsin gene tree.
Table 5.2. Random-sites (PAML) analyses of the 50 species vertebrate rhodopsin dataset.
Table S5.1. Accession numbers, depth data and amino acid identities at sites on helix 5 and 6 in fishes.

LIST OF FIGURES

Figure 1.1. The eye and the visual cycle.
Figure 1.2. Underwater visual environments and spectral sensitivity of aquatic species.
Figure 1.3. Rhodopsin functional domains and sequence diversity.
Figure 1.4. Chronogram of aquatic vertebrate lineages.
Figure 1.5. Frequently used models of codon evolution.
Figure 2.1. Phylogeny and molecular evolution in New World anchovies.
Figure 2.2. Distribution of amino acids at variable sites in dataset.
Figure 2.3. Positively selected sites in freshwater clade.
Figure 3.1. Phylogenetic reconstruction and ecological distribution of croakers.
Figure 3.2. Tests for positive selection on rhodopsin associated with habitat transitions.
Figure 3.3. Habitat specificity of positively selected sites in croaker rhodopsin.
Figure 3.4. Functional characterization of substitutions along the transitional branch.
Figure S3.1. Bayesian species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci.
Figure S3.2. Maximum likelihood species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci.
Figure S3.3. Maximum likelihood rhodopsin gene tree.
Figure S3.4. Maximum likelihood species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci but with site 165 and 214 removed from rhodopsin dataset.
Figure S3.5. Maximum likelihood species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci but with the first and second codon position removed from rhodopsin dataset.
Figure S3.6. Bayesian phylogeny with amino acid branch lengths.
Figure 4.1. Distribution of wavelengths of light and families of fishes with depth.
Figure 4.2. Transition events in Beloniformes and Clupeiformes.
Figure 4.3. Positively selected sites on the rhodopsin dark-state structure.
Figure 4.4. dN/dS estimates for rhodopsin by site in each clade of fishes.
Figure 4.5. Amino acid substitutions on freshwater and transitional branches.
Figure S4.1. Ancestral habitat reconstructions.
Figure S4.2. Bootstrap consensus tree of Sciaenidae rhodopsin.
Figure S4.3. Substitutions on transitional branches and freshwater clades vs. null estimates.
Figure 5.1. Intensified purifying selection on gymnotiforms rhodopsin.
Figure 5.2. Variation at rhodopsin site 214 in gymnotiforms and other fishes.
Figure 5.3. Epistatic interactions on helix 5 and 6 near RP associated F220C mutation.
Figure S5.1. Rhodopsin gene tree for gymnotiforms.
Figure S5.2. Vertebrate phylogeny.
Figure S5.3. Support for ancestral amino acid reconstructions.

LIST OF ABBREVIATIONS

Abbreviation    Definition
A1              11-cis-retinal
A2              11-cis-3,4-dehydroretinal
BEB             Bayes Empirical Bayes
Br-Site         Branch Sites
cGMP            Cyclic guanosine monophosphate
CmC             Clade model C
CmD             Clade model D
COI             Cytochrome c oxidase subunit I
CNG             Cyclic nucleotide gated cation
CYP27c1         Cytochrome P450 family protein
CYTB            Cytochrome b
dN/dS           Rate of non-synonymous to synonymous substitutions
DNA             Deoxyribonucleic acid
EGR1            Early growth response protein 1
EGR2            Early growth response protein 2
EL2             Extracellular loop 2
GDP             Guanosine diphosphate
GPCR            G protein-coupled receptor
GRK1            Rhodopsin kinase
GTP             Guanosine triphosphate
GTPase          GTP hydrolase enzyme
IRBP            Interphotoreceptor retinoid binding protein
LRAT            Lecithin:retinol acyltransferase
LWS             Long wavelength-sensitive opsin
Ma              Million years
meta-II         Metarhodopsin II
MSP             Microspectrophotometry
NA
nm              Nanometer
PCR             Polymerase chain reaction
Rag1            Recombination activating gene 1
Rag2            Recombination activating gene 2
RH1             Rhodopsin
RH2             Middle wavelength-sensitive opsin
RPE             Retinal pigment epithelium
RPE65           Retinoid isomerohydrolase
SA              South America
SWS1            Short wavelength-sensitive opsin 1
SWS2            Short wavelength-sensitive opsin 2
t1/2            Half-life
λmax            Maximal wavelength of absorption
UV              Ultraviolet

CHAPTER ONE: GENERAL INTRODUCTION

1.1. THE EYE AND VISION

1.1.1. The evolution of the vertebrate eye

The complexity and diversity of eyes have interested evolutionary biologists since Darwin, who famously argued that even an organ as complex as the eye could come about through a series of advantageous intermediates (Darwin 1859). In its simplest form, an eye consists of a single light-sensitive photoreceptor cell and some shading pigment. This basic blueprint is found in flatworms and in the eyespots of many small invertebrates (Land and Nilsson 2012). Image-forming eyes, capable of spatial vision, are more complex and typically consist of a lens, or multiple lenses, focusing light onto a layer of photoreceptor cells (Arendt and Wittbrodt 2001). The evolution of more sophisticated eyes is credited with catalyzing the Cambrian explosion (Land and Nilsson 2012), and complex image-forming eyes have evolved independently in at least ten bilaterian lineages (Land and Fernald 1992).

Vertebrates have camera-like eyes, as do several other lineages, including cnidarians, arachnids, cephalopods and gastropods (Figure 1.1a) (Land and Fernald 1992; Arendt and Wittbrodt 2001). Remarkably, camera-like eyes evolved independently from patches of photoreceptor cells in each of these lineages (Lamb et al. 2007; Land and Nilsson 2012). Simulations suggest that as few as 2000 evolutionary steps are required to explain the progression from simple to complex eyes. In short-lived species, with a conservatively estimated evolutionary rate, this would amount to roughly 500,000 years (Nilsson and Pelger 1994). However, because soft structures like eyes are not well represented in the fossil record, the exact series of steps culminating in the evolution of the vertebrate eye is unknown. Inferences based on ancestral-state reconstructions are also complicated by the contentious phylogenetic placement of hagfish, a lineage with very rudimentary eye structures, as either the earliest diverging vertebrate lineage or the sister group to lamprey (Lamb et al. 2007). Most research supports the latter arrangement (Stock and Whitt 1992; Blair and Hedges 2005), suggesting that the common ancestor of vertebrates had complex camera-like eyes similar to those of extant lamprey and jawed vertebrates by at least 420 Ma (Lamb 2013).

1.1.2. Vertebrate eyes

The vertebrate eye comprises a complicated set of structures derived from neuronal and non-neuronal tissue. The neuronal component of the eye is responsible for sensing light and is derived from an evagination of the forebrain that forms the retina, retinal pigment epithelium (RPE) and the optic nerve (Graw 2010). The rest of the eye, derived mostly from non-neuronal tissue, is primarily responsible for restricting or directing incoming light onto the retina (Graw 2010). The three most critical structures defining the quantity and quality of light incident on the retina are the cornea, pupil and lens (Figure 1.1a). The cornea provides structural support and blocks harmful short-wavelength light. In terrestrial environments it also has significant refractive power, focusing light onto the retina. This refractive power is lost underwater, where the refractive index of the cornea is nearly the same as that of the external environment (Gregory 2015). The pupil is not truly a structure but a gap formed by the pigmented iris (Gregory 2015). The diameter of the pupil expands and constricts to optimize the size of the aperture for maximal sensitivity in dim light and resolution in bright light. Light that passes through the pupil is brought into focus on the retina by the lens. Most aquatic species have round lenses, compensating for the reduced refractive power of the cornea underwater. Round lenses are moved towards the anterior or posterior of the eye to focus light from near and far objects onto the retina, respectively. In terrestrial species, the lens is more ovoid and becomes flatter when focused on more distant objects (Land and Nilsson 2012).
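The loss of corneal refractive power underwater can be illustrated with the standard single-surface formula P = (n2 - n1) / R. The sketch below is a rough illustration only: the refractive indices (air 1.000, water 1.333, cornea 1.376) and the 7.8 mm radius of curvature are typical textbook values for a human eye, assumed here for illustration and not taken from this thesis, and the function name is hypothetical.

```python
# Illustrative sketch: refractive power of the anterior corneal surface
# in air versus water, using the single-surface formula P = (n2 - n1) / R.
# All numeric values are assumed textbook approximations for a human eye.

N_AIR = 1.000      # refractive index of air (assumed)
N_WATER = 1.333    # refractive index of water (assumed)
N_CORNEA = 1.376   # refractive index of the cornea (assumed)
RADIUS_M = 0.0078  # anterior corneal radius of curvature in metres (assumed)


def surface_power(n_outside: float, n_inside: float, radius_m: float) -> float:
    """Return the refractive power, in dioptres, of a spherical surface."""
    return (n_inside - n_outside) / radius_m


power_in_air = surface_power(N_AIR, N_CORNEA, RADIUS_M)
power_in_water = surface_power(N_WATER, N_CORNEA, RADIUS_M)

print(f"Corneal surface power in air:   {power_in_air:.1f} D")
print(f"Corneal surface power in water: {power_in_water:.1f} D")
```

Under these assumptions most of the corneal surface power disappears when the outside medium is water, which is in line with aquatic species relying on round, strongly refracting lenses.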

Nearly all vertebrates with image-forming eyes rely on these same structures, but the size, shape and optical properties of the cornea, pupil and lens are highly variable across species. Nocturnal birds, reptiles and mammals tend to have large corneas relative to eye size, increasing the number of photons allowed to pass through to the retina (Hall et al. 2012). Pupils also come in a variety of shapes and sizes, which are generally consistent with the photic niche an animal inhabits (Land 2006). In general, pupils are more dynamic in terrestrial species than aquatic species, but pupil shape varies more underwater. The functional importance of the U- and W-shaped pupils observed in many fishes and cephalopods is unclear, but they may help camouflage the eye or act as a sunshade blocking high-intensity downwelling light (Land 2006; Mathger et al. 2013; Banks et al. 2015). The lenses focusing incoming light onto the retina are also larger and often rounder in nocturnal species, increasing the number of photons directed at specific sections of the retina (Schmitz and Motani 2010). In contrast, species inhabiting brightly lit environments generally have eyes well adapted for excluding extraneous light, especially short wavelengths. Short-wavelength light damages the retina through the production of reactive oxygen species and deteriorates image quality because of its increased tendency to cause chromatic aberration and diffraction (Wu et al. 2006). To prevent this, many diurnal species have corneas that filter out short-wavelength light (Siebeck and Marshall 2001). Interestingly, relative eye size, pupil shape and lens transparency are most variable in fishes, reflecting the many different photic niches in underwater environments (Howland et al. 2004).

1.1.3. Vertebrate retinas

Image-forming light focused onto the retina is ultimately absorbed by pigments expressed in the outer segments of highly specialized neurons, the rod and cone photoreceptors, situated posteriorly in the retina. Somewhat surprisingly, light must pass through a series of highly ordered cell layers comprising the downstream signalling circuitry of the retina before it reaches the photoreceptors (Ramón y Cajal 1904; Gregory 2015). Following the path of light, there are eight histologically distinct layers: the nerve fibre layer, ganglion cell layer, inner plexiform layer, inner nuclear layer, outer plexiform layer, outer nuclear layer, photoreceptor layer and retinal pigment epithelium (RPE) (Figure 1.1b) (Erclik et al. 2009). The first two layers, the nerve fibre layer and ganglion cell layer, consist of the axons and cell bodies of the retinal ganglion cells, respectively. The axons of retinal ganglion cells come together to form the optic nerve, which relays signals from the retina to the visual processing centres of the brain (Figure 1.1b) (Masland 2001a). Retinal ganglion cells receive input from amacrine and bipolar cells. The bodies of these cells form the inner nuclear layer alongside horizontal cells and Müller glial cells (Gregg et al. 2012). The outermost layers of the neuronal retina are formed by the photoreceptors. The cell bodies of these highly specialized neurons form the outer nuclear layer, and their outer segments form the photoreceptor layer (Figure 1.1b). It is in the outer segments of these cells that light is absorbed by photosensitive visual pigments. Many species possess multiple classes of photoreceptors, distinguished from one another by morphological differences and by the spectral sensitivity of the visual pigment expressed in the outer segment (Gregg et al. 2012).

The most obvious distinction between photoreceptors is that of rods and cones (Gregg et al. 2012). Rod photoreceptors are derived from a cone-type ancestor, but have numerous adaptations improving their efficacy in dim-light (scotopic) conditions (Ingram et al. 2016). These include an extended outer segment formed of discontinuous lamellar disk membranes, from which the cell type receives its name (Figure 1.1c). In addition, a dim-light specialized visual pigment protein known as rhodopsin is expressed at very high levels in the outer segments of rods (Figure 1.1d). In contrast, cone photoreceptors have tapered outer segments, with each lamellar stack belonging to one continuous membrane (Erclik et al. 2009). Cones are responsible for vision in bright-light (photopic) environments, and unlike rods, many species possess multiple cone classes forming the basis for colour vision (Land and Nilsson 2012). Many vertebrates have both rods and cones, constituting a duplex retina, which optimizes the detection of light across a wider range of intensities (Land and Nilsson 2012). Light that is not absorbed by the photoreceptors terminates at the retinal pigment epithelium (RPE), the most posterior layer of the retina, made up of highly pigmented cells that reduce glare and provide support for the metabolically demanding photoreceptor cells (Strauss 2005).

As is the case for gross eye morphology, the organization of the retina tends to be associated with an organism’s life history. For example, species focusing their gaze on the horizon tend to have a preponderance of retinal ganglion cells forming a horizontal streak (Collin 2008). Other species requiring high visual acuity tend to have a circular region of high retinal ganglion cell density (Collin 2008). In humans, this region corresponds to the fovea, where cone photoreceptor densities are also highest, and ganglion cells receive input from as few as five photoreceptors to maximize resolution (Curcio and Allen 1990). Alternatively, species with dim-light adapted retinas benefit from pooling the information from multiple photoreceptors. For example, retinal ganglion cells of the domestic cat may receive input from as many as 7000 rod photoreceptors (Kolb and Nelson 1993). In general, the proportions of rod and cone photoreceptors also reflect the photic environment a species inhabits, with dim-light species possessing more rods than cones (Lythgoe 1984).
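The benefit of pooling can be made concrete with a simple photon-noise argument. The sketch below assumes that the photon catch of each photoreceptor follows independent Poisson statistics, so that the signal-to-noise ratio (SNR) of a summed signal grows with the square root of the number of pooled photoreceptors. This noise model, the function name and the dim-light photon catch are illustrative assumptions, not an analysis from this thesis; only the convergence ratios (5 and 7000) come from the text above.

```python
import math


def pooled_snr(mean_photons_per_receptor: float, n_receptors: int) -> float:
    """SNR of the summed photon count from n_receptors photoreceptors.

    Assumes independent Poisson photon catches, for which the summed count
    has mean m * n and standard deviation sqrt(m * n), so SNR = sqrt(m * n).
    """
    return math.sqrt(mean_photons_per_receptor * n_receptors)


# Hypothetical mean photon catch per receptor per integration time in dim light.
DIM_LIGHT_CATCH = 0.01

for label, n in [("5 photoreceptors per ganglion cell", 5),
                 ("7000 photoreceptors per ganglion cell", 7000)]:
    print(f"{label}: SNR = {pooled_snr(DIM_LIGHT_CATCH, n):.2f}")

# Relative improvement in SNR from pooling 7000 rather than 5 photoreceptors.
gain = pooled_snr(DIM_LIGHT_CATCH, 7000) / pooled_snr(DIM_LIGHT_CATCH, 5)
print(f"Relative SNR gain from pooling: {gain:.1f}x")
```

Under this model the gain depends only on the ratio of pooled photoreceptors, not on the assumed photon catch. The gain in sensitivity comes at the cost of spatial resolution, since the pooled photoreceptors act as a single, larger sampling unit.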

1.1.4. The phototransduction cascade

The phototransduction cascade proceeds in the opposite direction to the passage of light through the retina (Figure 1.1b). First, light is absorbed by visual pigments expressed in the outer segment of the rod and cone photoreceptors. This induces an 11-cis to all-trans isomerization of the chromophore component of the visual pigment. Electrostatic and steric conflict between the isomerized chromophore and the retinal binding pocket formed by the visual pigment protein, a highly specialized G protein-coupled receptor (GPCR), induces a series of conformational changes eventually resulting in the formation of the active state structure of the protein (Figure 1.1e) (Yau and Hardie 2009). The GPCRs expressed in photoreceptors are known as opsins, and different opsin proteins are specific to each photoreceptor class, dictating the wavelengths of light to which each photoreceptor is most sensitive (Lamb 2013). The phototransduction pathways of rods and cones are similar, but some steps are mediated by different, mostly paralogous, proteins (Ingram et al. 2016). Slight differences in the functional properties of these paralogous proteins are thought to contribute to the increased sensitivity of rods compared to cones (Ingram et al. 2016).

One major difference between rod and cone opsins is the length of time the protein remains in its biologically active state, known as Metarhodopsin-II (Meta-II) (Imai et al. 1997). In this state, the Meta-II rhodopsin catalyzes a GDP-GTP exchange on the alpha subunits of up to 20 G protein transducin molecules. When bound to GTP, the alpha subunit of the G protein transducin dissociates from the beta and gamma subunits and binds to the gamma subunit of a phosphodiesterase (Yau and Hardie 2009). This removes the constraint on the alpha and beta catalytic subunits of phosphodiesterase, which in turn begin rapidly hydrolyzing cytosolic cGMP (Yau and Hardie 2009). The reduction in cytosolic cGMP concentration closes cGMP-gated cation channels, hyperpolarizing the photoreceptor. A hyperpolarized photoreceptor reduces its release of glutamate, which causes the downstream rod bipolar cells to depolarize (Figure 1.1f). If the signal from the bipolar cells is sufficient, an action potential is evoked in a retinal ganglion cell and a signal is relayed to the visual processing centres of the brain (Yau and Hardie 2009).

Interactions with other components involved in shutting off visual transduction ensure that the light-activated visual pigment signals only briefly, and shortly after a light response the photoreceptor returns to its resting potential. In vertebrate photoreceptors this is -30 mV, 40 mV less negative than most other neurons (Yau and Hardie 2009). Before the active state of rhodopsin decays, it is shut off through phosphorylation by rhodopsin kinase and the binding of arrestin (Figure 1.1f) (Krupnick and Benovic 1998). Activated transducin is inactivated through its intrinsic GTPase activity, a process accelerated by a GTPase-activating protein (Yau and Hardie 2009). Without active transducin, the phosphodiesterase also returns to its inactive state. The resting state concentrations of cGMP are restored by a negative feedback loop beginning as soon as calcium levels in the cell drop, concomitant with the closing of cGMP-gated cation channels (Pugh et al. 1999). Lower cytosolic calcium levels disinhibit guanylate cyclase-activating proteins and rhodopsin kinase, resulting in more rapid production of cGMP by guanylate cyclase and increasing the rate of rhodopsin phosphorylation by rhodopsin kinase (Figure 1.1f). The affinity of cGMP-gated cation channels for cGMP is also controlled by calcium. When calcium levels decrease, more channels bind cGMP and remain open, rapidly restoring the resting potential (Pugh et al. 1999).

1.1.5. The retinoid cycle

To restore sensitivity following visual pigment activation, the isomerized chromophore component of the visual pigment must be recycled. Unlike the bi-stable visual pigments of invertebrates, vertebrate visual pigments must shed the all-trans retinal chromophore at the end of each light event and take up fresh 11-cis retinal (Ernst et al. 2014). Much of this process occurs in the RPE, where specific enzymes convert spent all-trans retinol to 11-cis retinal (Figure 1.1g) (Yau and Hardie 2009). This is just one of many housekeeping roles of the RPE in maintaining photoreceptor function and health (Strauss 2005). Once all-trans retinal exits the opsin protein moiety, it is converted to all-trans retinol, which is transported to the RPE by the inter-photoreceptor retinoid binding protein (IRBP). In the RPE, all-trans retinol is converted into an ester to reduce its toxicity during storage. This ester form is eventually converted into 11-cis retinol by an isomerohydrolase (RPE65). The 11-cis retinol is then oxidized to 11-cis retinal and transported back to the photoreceptor by IRBP (Figure 1.1g), where it is available for uptake by an empty opsin protein (Yau and Hardie 2009).

1.2. VISUAL ECOLOGY

1.2.1. Principles of light relevant to vision

The diversity in eye size, shape and photoreceptor composition stems from evolutionary optimization for the detection of photons in environments illuminated by different intensities and colours of light (Land and Fernald 1992). Most vertebrates detect photons of light between 400 and 700 nm, the range of the electromagnetic spectrum referred to as the visible spectrum, though some species have expanded this range to reach into the ultraviolet (< 400 nm) and infrared (> 700 nm) (Hauser et al. 2014; Palczewska et al. 2014). Visual pigment absorbance spectra are generally quite broad, with the peak referred to as the wavelength of maximum absorbance, or λmax. The spectral location of λmax is determined by the energy difference between resting and excited states of the vitamin A-derived chromophore within the opsin binding pocket. Photons whose energy more closely matches this energy difference are more likely to be absorbed (Rossotti 1985). For sensors like the eye, the probability of detecting light is increased when the quantity of photons increases and when the wavelength approaches the peak sensitivity (Lythgoe 1979). The quantity of photons of a specific wavelength is frequently reported as the irradiance, a measure of the number of photons striking a surface of set size per unit time (Lythgoe 1979). The duplex retina possessed by most vertebrates expands the range of irradiance values that can be effectively resolved (Lamb 2013). In humans, vision has a lower limit of 10^10 photons/m^2/s, with rods giving way to cones at 10^14 photons/m^2/s (Land and Nilsson 2012).

In most practical cases, detecting contrast between the foreground and background is of greatest importance. In brightly lit environments this ability is improved by considering the spectral composition of light defining an object relative to its background. The outputs of multiple detectors (cone photoreceptors), each most sensitive to a different wavelength of light, can be compared, forming the basis of colour vision (Rossotti 1985). Depending on the number of cone classes, the resolution of colour vision can be high or low, with many mammals falling into the latter category. This is thought to be the result of a nocturnal bottleneck where a common ancestor evolved a retina optimized for vision in dim light (Heesy and Hall 2010). In addition to being able to detect the spectral composition of light, some species can also detect differences in the polarization of light (Johnsen et al. 2014; Novales Flamarique 2017).

1.2.2. The attenuation of light

Some environments are illuminated by a narrower portion of the visible light spectrum because specific wavelengths of light are attenuated more rapidly than others (Johnsen et al. 2014). Nearly all natural systems are principally illuminated by the sun, which provides ample light across the entirety of the visible spectrum (Figure 1.2a). However, some exceptions do exist, such as deep-sea systems illuminated by bioluminescence and, more recently, urban environments illuminated by anthropogenic light sources (Johnsen et al. 2014). While the light radiating from the sun is nearly uniform, the attenuation of this light by the medium through which it passes is not. Small particles scatter light when their size is similar to the wavelength of a photon (Rossotti 1985). Ozone particles in the atmosphere scatter light through this mechanism and remove most of the UV band of light as well as some blue light in the visible spectrum before it reaches the earth’s surface (Johnsen et al. 2014). This results in a slightly red-shifted spectral profile that remains mostly consistent across terrestrial habitats, with the exception of those covered by a thick canopy (Endler 1993). Terrestrial light environments do show substantial temporal variation, with the extent of cloud cover or fog altering both the amount of light and the wavelengths of light reaching the surface (Johnsen et al. 2014) (Figure 1.2a). Even more significant is the dramatic diel difference in light. As the sun sets, the path length light travels through the atmosphere increases, filtering out more short-wavelength light (through scattering by small molecules) before it reaches the surface (Figure 1.2a). This creates a red-shifted light environment that eventually gives way to a blue-shifted twilight environment where light reaching the surface is entirely composed of back-scattered light from the atmosphere (Johnsen et al. 2014). From a spectral perspective, nocturnal and diurnal environments are very similar, but daylight is about 500,000 times brighter (Johnsen et al. 2014).

1.2.3. Light underwater

Underwater light environments also undergo temporal changes, but are primarily characterized by the attenuation of light due to the physicochemical properties of the water itself (Lythgoe 1979). In fact, the names of many large rivers reflect their optical qualities (Figure 1.2a) (Costa et al. 2012). Light is attenuated underwater in a depth-dependent manner. Exactly which wavelengths are filtered out most rapidly depends on the concentration of dissolved organic material and suspended particulate matter (Lythgoe 1979). Clear water, typified by non-coastal marine habitats and some lakes, is most transparent to blue-green light centred around 475 nm (Jerlov 1976). At depths exceeding 200 m only a narrow band of light in this range persists, and light sufficient for vision is fully attenuated by 1000 m (Figure 1.2a). As the concentrations of particulates and organic matter increase, the peak transparency of water shifts towards the red end of the visible light spectrum (Jerlov 1976). Coastal waters are greener, especially near the mouths of large rivers. The large rivers themselves are even more red-shifted (Jerlov 1976). The concentration of organic matter is so high in some rivers that they take on a tea-like appearance and are colloquially known as black water rivers (Costa et al. 2012) (Figure 1.2a). In these environments, tannins absorb short-wavelength light and result in a very red-shifted underwater scene. The same is true for highly turbid rivers, otherwise known as white water rivers, where fast-flowing water maintains high concentrations of suspended particulates that scatter short-wavelength light more rapidly (Figure 1.2a). In both of these water types, light of all wavelengths is fully attenuated at shallower depths, usually not exceeding 20 m in very turbid or tannin-stained rivers (Costa et al. 2012).
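The depth dependence described above can be illustrated with a minimal sketch that assumes simple exponential attenuation, in which the fraction of surface light remaining at depth z is exp(-K z) for a wavelength-specific attenuation coefficient K. The coefficient values, the two water-type tables and the function names below are hypothetical and chosen only to mimic the qualitative pattern described in the text (clear water most transparent near 475 nm, black water most transparent to long wavelengths); they are not measurements from the cited sources.

```python
import math

# Hypothetical attenuation coefficients (per metre), keyed by wavelength (nm).
# Illustrative assumptions only, not values taken from the cited literature.
CLEAR_WATER = {450: 0.03, 475: 0.02, 550: 0.07, 650: 0.35}
BLACK_WATER = {450: 6.0, 475: 5.0, 550: 2.5, 650: 1.2}


def fraction_remaining(k_per_m, depth_m):
    """Fraction of surface light left at depth, assuming exponential decay."""
    return math.exp(-k_per_m * depth_m)


def relative_spectrum(coefficients, depth_m):
    """Fraction of surface light remaining at each wavelength at the given depth."""
    return {wl: fraction_remaining(k, depth_m) for wl, k in coefficients.items()}


def peak_wavelength(coefficients, depth_m):
    """Wavelength with the most light remaining, assuming equal light at the surface."""
    spectrum = relative_spectrum(coefficients, depth_m)
    return max(spectrum, key=spectrum.get)


if __name__ == "__main__":
    for name, water, depth in (("clear", CLEAR_WATER, 100.0), ("black", BLACK_WATER, 2.0)):
        print(name, depth, "m: peak at", peak_wavelength(water, depth), "nm")
```

Under these assumed coefficients, the surviving light narrows around 475 nm in the clear water column, whereas in the black water column the long-wavelength band dominates within a few metres. The sketch ignores the spectral shape of the light at the surface and treats scattering and absorption as a single coefficient.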

The diversity in underwater visual environments is reflected in the wide range of spectral sensitivities of the visual opsins of fishes, which span the entirety of the visual spectrum (340-650 nm) (Partridge and Cummings 1999). In general, species inhabiting brightly lit environments, such as reefs, have a larger complement of visual pigments to take advantage of the broad-spectrum light available (Parry et al. 2005; Hunt et al. 2015). In contrast, deep-sea fishes and fishes inhabiting highly turbid waters tend to have fewer cone opsin classes (Hope et al. 1997; Pointer et al. 2007; Liu et al. 2016), but the visual pigments possessed by these fishes are often well tuned for detecting the wavelengths of light available (Figure 1.2b). Because the attenuation of light is depth dependent, differences in the spectral sensitivity and functional properties of specific visual pigment proteins become more apparent with depth (Figure 1.2c).

1.3. THE EVOLUTION OF RHODOPSIN

1.3.1. Opsin evolution

The opsin complement varies extensively across vertebrates and is especially diverse in fishes (Cortesi et al. 2015). Much of this diversity is a result of secondary duplication events or losses in specific lineages (Cortesi et al. 2015; Gutierrez et al. 2016; Lin et al. 2017). The morphological similarity of the ancestral vertebrate eye to that of modern-day species is also reflected in the molecular composition of the photoreceptor cells (Lamb 2013). Phylogenetic reconstructions reveal an ancestral set of five opsin classes. These five opsins make up the basis for all vertebrate visual opsin classes; however, other light-sensitive opsin proteins are expressed in the vertebrate retina and pineal gland, and are thought to mediate non-visual light responses and entrain circadian rhythms (Lamb 2013). In order of increasing wavelength sensitivity, the visual opsin classes are short-wavelength sensitive opsins 1 and 2 (SWS1 and SWS2), rhodopsin (Rh1), rhodopsin-like opsin (Rh2), and long-wavelength sensitive opsin (LWS). LWS is the most red-shifted opsin class and forms the sister clade to all other vertebrate visual opsins. Sometime after the split between LWS and the common ancestor of the four other visual opsins, a highly conserved glutamic acid (E181) was replaced with a histidine, shifting the spectral sensitivity of LWS towards long-wavelength light (Lamb 2013). The other four opsins arose through two rounds of gene duplication events that coincide with whole genome duplications (2R) in the ancestral vertebrate preceding the split of jawed and jawless fishes (Cañestro 2012). SWS1 and SWS2 have a very broad distribution of λmax values, thought to be due at least in part to the repeated losses of a protonated Schiff base in SWS1 that shift its spectral sensitivity into the UV (Hauser et al. 2014). Rh1 and Rh2 are both sensitive to middle-wavelength light (Gutierrez et al. 2016), but the divergence in functional properties of these two pigments is the largest of any two opsin classes (Imai et al. 2007). Rh1 is expressed in rods and has many specific adaptations to maximize its ability to detect photons in dim-light environments. The increased photosensitivity of rhodopsin is thought to result from an extended duration of the active state structure of rhodopsin (Meta-II) and an enhanced thermal stability, which reduces noise associated with isomerization events occurring in the dark (Imai et al. 2007; Kefalov 2012).

1.3.2. Opsin structure and function

Opsin proteins are members of the class A family of GPCRs, the largest mammalian protein superfamily, comprising nearly 800 members that mediate many critical cellular signalling pathways in vertebrates (Figure 1.3ab) (Katritch et al. 2013). GPCRs are characterized by seven transmembrane helical domains connected by intra- and extracellular loop domains (Katritch et al. 2013). During activation, the fifth transmembrane helix is extended on its cytoplasmic face and swings approximately 3 Å away from the protein’s centre of mass (Ernst et al. 2014). This is accompanied by a 6-7 Å translocation of helix six, which exposes the docking interface for the binding of the associated G protein (Figure 1.3d) (Knierim et al. 2007). Less pronounced movements during activation are also observed on other helices but may be equally important (Tsukamoto et al. 2010).

Opsins are unique among GPCRs in that the binding of the ligand, the 11-cis retinal chromophore, is not sufficient for activation (Figure 1.3ab) (Zhou et al. 2012). In fact, 11-cis retinal acts as an inverse agonist, stabilizing the dark state structure of rhodopsin. In vertebrate visual opsins, activation occurs when 11-cis retinal is isomerized to all-trans retinal (Zhou et al. 2012). As in other GPCRs, the ligand (chromophore) binding pocket is on the extracellular side of the protein (Figure 1.3ab) (Zhou et al. 2012), but a number of key evolutionary innovations unique to opsin proteins can be found in the extracellular domain to prevent the dissociation of the chromophore before its isomerization. These differences include an extended extracellular loop 2, forming a beta sheet atop the binding pocket which, when combined with the N-terminal domain, forms an extracellular cap (Ernst et al. 2014). The chromophore is further stabilized in the binding pocket through the formation of a Schiff base with a lysine at site 296 and through charge delocalization across the polyene backbone of the chromophore facilitated by the counterion at site 113 (Sakmar et al. 1989). Highly conserved hydrophobic residues in the binding pocket on helices five and six also help stabilize the dark-state structure (Zhou et al. 2012). Upon the absorption of light, the delocalization of charge along the chromophore backbone changes, promoting a transition to an all-trans conformation where the beta-ionone ring of the chromophore is repositioned between helices five and six (Zhou et al. 2012). The isomerization of the chromophore happens within femtoseconds and creates steric clashes between the chromophore and residues in the binding pocket, leading to many rapid structural changes in the opsin protein (Ernst et al. 2014). These changes culminate in the formation of the active state structure, which can persist for minutes.

1.3.3. Differences in rod and cone opsin functional properties

Rods and cones have different photosensitivities, in part because of the kinetic properties, thermal sensitivities and oligomeric potential of the different opsin proteins expressed in the two photoreceptor types. While the kinetic processes during photoactivation are mostly shared by rod and cone opsins, the duration of each step is different. Rhodopsin has a prolonged Meta-II state, primarily because of specific substitutions at sites 122 and 189 (Imai et al. 1997). In most rhodopsins, these sites are occupied by glutamic acid and isoleucine, respectively. When these residues are replaced with residues more often associated with cone opsins, the rate of Meta-II decay increases and the photosensitivity of the photoreceptor decreases (Imai et al. 1997). The advantage conferred by an extended Meta-II state is not entirely clear, as signalling ceases long before its decay because of rapid phosphorylation by rhodopsin kinase and the binding of arrestin. However, recent investigations utilizing natural variation have found a relationship between longer Meta-II durations and nocturnal activity, suggesting that an extended Meta-II active state might improve photosensitivity (Sugawara et al. 2010; Hauser et al. 2017). Rhodopsin is also much more thermally stable than cone opsins (Yanagawa et al. 2015). Dark noise is the signal generated by the spontaneous isomerization of the chromophore by thermal energy, and it sets the absolute threshold of vision in dim-light conditions (Aho et al. 1988). The increased thermal stability of rhodopsin is established by a hydrogen bond network formed by highly conserved residues near the chromophore binding pocket. Point mutations at these sites destabilize the dark-state structure of rhodopsin (Liu et al. 2011). Species with less stable rhodopsin proteins are more susceptible to noise generated by heat (Aho et al. 1988). Rhodopsin is also thought to improve its sequestration of downstream signalling molecules through the formation of oligomeric track-like platforms in the outer segments, constructed through dimerization of rhodopsin, an assembly absent in some cone opsins (Fotiadis et al. 2006; Jastrzebska et al. 2016). Each of these properties is mediated by specific substitutions that arose in the rhodopsin protein following its divergence from Rh2 (Jastrzebska et al. 2016; Felce et al. 2017).

1.3.4. Rhodopsin spectral tuning

Another mechanism used by rhodopsin to optimize its ability to detect photons is to shift its peak spectral sensitivity (λmax) towards the wavelengths of light most prevalent in an environment (Bowmaker 2008). The probability that a visual pigment absorbs light is based on the absorbance curve of the pigment. This, in effect, is represented by a unimodal probability distribution of photon capture centred on the wavelength at which the pigment absorbs maximally (Ernst et al. 2014). The peak spectral sensitivity of a pigment is dictated by the energy barrier between the dark and excited states of the chromophore, as influenced by electrostatic and other interactions with amino acid side chains within the binding pocket. An increased energy barrier would therefore blue-shift the spectral sensitivity, because higher-energy (shorter-wavelength) photons are required to isomerize the chromophore (Wang et al. 2013). Unbound 11-cis retinal chromophore (A1) is most sensitive to light of 380 nm. The shift into the visible spectrum (440 to 600 nm) when bound to an opsin protein is known as an “opsin shift.” The degree to which a visual pigment alters the peak spectral sensitivity depends upon the electrostatic nature of the retinal binding pocket, the planarity of the chromophore (Sekharan et al. 2012; Wang et al. 2013), and steric interactions inhibiting or favouring the formation of the active state (Sekharan et al. 2013).
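The inverse relationship between energy and wavelength that underlies spectral tuning can be made concrete with the standard relation E = hc/λ. The sketch below is illustrative only: the 380 nm value for unbound chromophore comes from the text above, while the 500 nm pigment is an assumed example chosen from within the 440 to 600 nm range, and the function names are hypothetical.

```python
# Physical constants (exact SI definitions).
PLANCK_J_S = 6.62607015e-34       # Planck constant, J s
LIGHT_SPEED_M_S = 299_792_458.0   # speed of light in vacuum, m/s
JOULES_PER_EV = 1.602176634e-19   # joules per electronvolt


def photon_energy_ev(wavelength_nm):
    """Energy (eV) of a photon of the given wavelength (nm), from E = hc / wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9) / JOULES_PER_EV


def energy_change_ev(from_nm, to_nm):
    """Change in photon energy (eV) when peak sensitivity moves from one wavelength to another."""
    return photon_energy_ev(to_nm) - photon_energy_ev(from_nm)


if __name__ == "__main__":
    # 380 nm: unbound chromophore (from the text); 500 nm: assumed example pigment.
    print(round(photon_energy_ev(380.0), 3), "eV at 380 nm")
    print(round(photon_energy_ev(500.0), 3), "eV at 500 nm")
    print(round(energy_change_ev(380.0, 500.0), 3), "eV change for the assumed opsin shift")
```

A negative energy change corresponds to a red shift (a smaller energy gap between dark and excited states), and a positive change corresponds to a blue shift. Because energy scales with 1/λ, a fixed shift in nanometres corresponds to a larger energy change at short wavelengths than at long ones.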

Variability in the peak spectral sensitivity of visual pigments in the retina was originally attributed to differences between the rhodopsin and porphyropsin visual systems, the latter representing a rhodopsin pigment reconstituted with the alternative chromophore 11-cis-3,4-dehydroretinal (A2), which red-shifts its spectral sensitivity (Wald 1939). However, subsequent analyses show that the pigment type does not fully describe the variation in sensitivity observed across species. These foundational investigations used partial bleaching, a technique that eliminates the absorbance of the A2 porphyropsin pigment by selectively bleaching it using long-wavelength light beyond the absorbance of an A1 pigment (Crescitelli and Dartnall 1954). The resulting λmax distribution of pure rhodopsin still varied across vertebrates (Crescitelli 1958), suggesting that differences also exist in the protein component (Wald et al. 1957). In rhodopsin, shifts in spectral sensitivity have been most frequently attributed to single amino acid substitutions in close proximity to the chromophore (Bowmaker 2008). The substitutions that shift the spectral sensitivity of rhodopsin are few in number and are mostly concentrated around the chromophore (Figure 1.3bc). The general rule is that substitutions that add positive charge near the beta-ionone ring tend to stabilize the excited state of the chromophore and red-shift the spectral sensitivity, while adding negative charges in the vicinity of the Schiff base differentially stabilizes the dark state and blue-shifts the rhodopsin absorption spectrum (Figure 1.3bc) (Sekharan et al. 2012; Zhou et al. 2014).

1.3.5. Mutations causing disease

While a small number of mutations in rhodopsin can improve the functional properties of vision in specific visual environments, the vast majority of substitutions are deleterious. The high proportion of rods in the human retina and the high expression of rhodopsin within the photoreceptors predispose mutations in rhodopsin to be highly damaging (Mendes et al. 2005). Up to 40% of mutations associated with retinitis pigmentosa (RP), a heterogeneous classification applied to genetic diseases resulting in retinal degeneration, can be attributed to rhodopsin, the highest proportion of any gene (Mendes et al. 2005). In many cases, patients with RP first experience night blindness, followed by tunnel vision and eventually the total loss of sight as rod photoreceptor cell death leads to cone degradation and, ultimately, retinal degeneration. However, the rate of this progression and the severity of RP vary extensively, in part because of the large number of different genes and mutations involved (Mendes et al. 2005; Iannaccone et al. 2006). Many of the best characterized and most severe forms of RP result from mutations that produce proteins that either do not traffic correctly to the outer segment or fold incorrectly and do not leave the endoplasmic reticulum, classified as type I and type II mutations, respectively. Other RP types are typically associated with less severe phenotypes or later onset. These forms of RP affect many aspects of rhodopsin structure and function, such as posttranslational modifications, endocytosis and vesicular trafficking, transducin activation, constitutive activity and dimerization (Mendes et al. 2005). Not surprisingly, many of the sites associated with disease are highly conserved across vertebrates (Hauser et al. 2016), but natural variants matching RP-associated substitutions do occur (Figure 1.3e). This same trend has been observed in investigations of other genetic diseases and has been attributed to epistatic interactions with other genes or with secondary sites in the same protein (Lunzer et al. 2010; Jordan et al. 2015; Starr and Thornton 2016).

1.4. VISUAL EVOLUTION IN FISHES

1.4.1. The evolution of fishes

Ray-finned fishes (Actinopterygii) account for roughly half of all vertebrate taxa, with over 30,000 recognized species (Braasch et al. 2016). Actinopterygians diverged from their sister clade, the Sarcopterygii, roughly 450 Ma (Betancur-R et al. 2017). Sarcopterygii comprises the coelacanths, lungfishes and all terrestrial vertebrates (Figure 1.4a). Together, Actinopterygii and Sarcopterygii form the Superclass Osteichthyes, which is the sister clade to Chondrichthyes (the cartilaginous fishes: sharks, rays and chimeras), the only other extant lineage of jawed vertebrates (Figure 1.4a) (Brazeau and Friedman 2015). The relationships of jawless vertebrate lineages are unresolved due to conflicting morphological and molecular evidence. Earlier morphological data suggested lampreys are more closely related to Gnathostomata (jawed vertebrates) than to hagfishes, but this has been challenged by phylogenetic reconstructions based on molecular data, which support the monophyly of jawless vertebrates as the clade Cyclostomata (Near 2009; Heimberg et al. 2010). More recently, next-generation sequencing techniques have resolved many relationships in two of the most species-rich clades of fishes, the Ostariophysi and Acanthopterygii (Figure 1.4b) (Arcila et al. 2017; Betancur-R et al. 2017). These two clades account for the majority of bony fish diversity but differ in habitat occupancy, the former being mostly freshwater and the latter mostly marine (Carrete Vega and Wiens 2012).

Ostariophysi is a globally distributed clade, with representatives found on all continents but Antarctica, and accounts for more than half of all freshwater fish species (Chen et al. 2013). Species belonging to the earliest diverging Ostariophysian lineage, Gonorynchiformes (milkfish and allies), inhabit both marine and freshwater environments, suggesting that the common ancestor of Ostariophysian fishes may have been marine. Gonorynchiformes represent only a small proportion of Ostariophysian species, with the majority of species richness contained within Otophysi, the sister clade to Gonorynchiformes. Otophysian fishes are almost entirely freshwater, but some have suggested that multiple marine to freshwater transitions are responsible for establishing their global distribution (Patterson 2010), as is the case for freshwater-inhabiting Clupeiformes, the sister clade to Ostariophysi (Bloom and Lovejoy 2014). More recent time trees suggest that the divergence events establishing the four major Otophysian lineages predated the split of Gondwana, so that a single marine to freshwater event is sufficient to explain the freshwater origin of this clade (Chen et al. 2013). The phylogenetic relationships among the four major Ostariophysian orders, Cypriniformes, Characiformes, Siluriformes and Gymnotiformes, remain contentious. Most recently, a study using a novel multi-gene approach recovered a phylogeny that is mostly consistent with early morphological studies suggesting Characiformes are monophyletic and sister to a clade formed by Siluriformes and Gymnotiformes, which is consistent with the shared electroreceptive capabilities of the latter two orders (Arcila et al. 2017).

Acanthopterygians comprise over half of all fishes, and difficulties associated with reconstructing this clade have led it to be referred to as the “bush at the top” (Nelson 2007). These fishes represent an enormous diversity in both form and size, and inhabit a vast range of marine habitats as well as many freshwater environments, a result of more recent marine to freshwater transition events (Figure 1.4ab) (Carrete Vega and Wiens 2012; Wainwright and Longo 2017). Understanding the evolutionary events that led to this diversity will require more robust phylogenetic analyses of this clade.

Numerous diverse clades of fishes form the branches between the divergence of Ostariophysi and the largest order in Acanthopterygians, the Perciformes (Figure 1.4a) (Betancur-R et al. 2017; Wainwright and Longo 2017). These clades contain many marine and freshwater species as well as a large clade of mostly diadromous fishes, the Salmoniformes (Betancur-R et al. 2017). Cichlids and Cottoid fishes also account for considerable freshwater species diversity, and include rapidly radiating clades in large rift lakes in East Africa and Siberia. The lineages diverging after Ostariophysi but before the Perciformes clade occupy a wide range of habitats and a substantial array of depths, and include the deepest dwelling fish species, the hadal snailfish (Yancey et al. 2014), as well as surface dwelling flyingfishes (Exocoetidae), a family of fishes that evolved the ability to leap out of the water for extended periods (Lewallen et al. 2017).

Our understanding of the branching events of early diverging lineages of fishes is also improving (Braasch et al. 2016; Betancur-R et al. 2017). Four orders diverge from the main stem preceding the whole genome duplication event that occurred at the base of Teleostei. These lineages, in order of their divergence from the main stem, are Polypteriformes, Acipenseriformes, Amiiformes and Lepisosteiformes (Figure 1.4a) (Giles et al. 2017). All of these lineages are freshwater or diadromous, which suggests that the common ancestor of all Actinopterygians was a freshwater fish (Carrete Vega and Wiens 2012). However, when fossil data are incorporated, the ancestor of all fishes is unambiguously reconstructed as marine (Betancur-R et al. 2015), and the early diverging freshwater lineages are instead likely the only remaining representatives of more ancient clades of fishes that were reduced by past extinction events (Betancur-R et al. 2015).

Freshwater environments represent only a small fraction of the total habitable aquatic environment, but contain roughly the same number of fish species as marine environments (Carrete Vega and Wiens 2012). Many different reasons have been proposed for this mismatch, including the higher primary productivity, habitat complexity, and fewer barriers to dispersal in freshwater environments (Carrete Vega and Wiens 2012). It also appears that, although the interface between marine and freshwater environments acts as a boundary for most lineages, more freshwater species have a recent marine ancestry than marine species have a recent freshwater ancestry (Bloom and Lovejoy 2014; Betancur-R et al. 2015).

1.4.2. Rhodopsin evolution in teleosts

Two rounds of whole genome duplication prior to the diversification of vertebrate lineages established the set of five opsin classes mostly conserved across vertebrates (Lamb et al. 2007), and likely also provided the molecular architecture for the diversity of form and function observed across vertebrate species (Ohno et al. 1968). Another whole genome duplication event occurred in teleosts (TGD) and has major implications for the evolution of the rhodopsin gene in fishes. Sometime prior to the TGD and before the divergence of Holostei (Amiiformes and Lepisosteiformes), a retrotransposition event occurred, resulting in the creation of an intronless rhodopsin sequence (Figure 1.4a) (Fitzgibbon et al. 1995). In teleost fishes it is this intronless copy of rhodopsin that is expressed in the retina. The intron-retaining copy, orthologous to the ancestral Actinopterygian rhodopsin (“exorh” in fishes), is expressed in the pineal gland instead (Morrow et al. 2017). The TGD resulted in two copies of each of these genes; however, with the exception of some Cypriniformes (Morrow et al. 2017) and eels (Nakamura et al. 2017), few species retain multiple copies of intronless rhodopsin (Nakamura et al. 2017). Synteny analysis of intronless rhodopsin gene duplicates suggests different copies were lost in divergent lineages of fishes. Osteoglossiformes, a nearly globally distributed order of freshwater fishes that diverged from the main teleost stem shortly after eels (Figure 1.4a), retain one copy, while euteleost fishes, comprising over two thirds of all fishes, retain the other duplicate copy (Nakamura et al. 2017). In contrast, diadromous eels exploit the possession of two rhodopsin copies by tuning each gene to match the wavelengths of light in deep-sea marine or freshwater environments and preferentially expressing the corresponding copy when inhabiting each environment (Nakamura et al. 2017).

1.4.3. Adaptation of rhodopsin to deep-sea environments

Investigations of deep-sea fishes repeatedly identify a significant correlation between the spectral sensitivity of rhodopsin and the blue-shifted spectral environment of the deep sea (Hunt et al. 1996). A similar result is found in fishes inhabiting large, clear lakes (Crescitelli 1991; Bowmaker and Hunt 2006; Carleton et al. 2016). The blue shifts in rhodopsin are associated with specific substitutions in the protein that are thought to increase the energy barrier between the dark and excited states of the chromophore (Lin et al. 1998). Substitutions at sites 83, 122, 292 and 299 have all been reported in deep-dwelling species (Hunt et al. 1996), with associated blue shifts of roughly 4 nm, 20 nm, 10 nm and 2 nm respectively (Imai et al. 2007; Dungan and Chang 2017). Site 292 is one turn away from the Schiff base on helix seven, and site 122 is positioned less than 4 Å from the beta-ionone ring (Palczewski et al. 2000). These substitutions are remarkably convergent and are also observed in deep-dwelling cetaceans, sharks and a coelacanth (Fasick and Robinson 1998; Yokoyama 2000; Bozzano 2001; Dungan et al. 2016). Many deep-dwelling species have substitutions at more than one of these sites, but often the resulting blue shifts are not additive (Dungan and Chang 2017). In addition to the small blue shifts associated with substitutions at sites 83 and 299, these substitutions might have even more important effects on the kinetic properties and thermal stability of rhodopsin (Sugawara et al. 2010; Dungan and Chang 2017). Yet when these substitutions are combined, as in the deepest-dwelling lineages of fishes, λmax estimates approach 470 nm, the apparent lower limit of rhodopsin spectral sensitivity, which is nonetheless well matched to the downwelling light (Hunt et al. 1996).
The close relationship between rhodopsin and the downwelling wavelengths of light suggests that its evolution fits the sensitivity hypothesis, which holds that the optimal sensitivity for a visual pigment is matched to the background light environment (Crescitelli et al. 1985). In contrast, an alternative trend has been observed in some cone opsins, where the peak spectral sensitivity is offset from the most prevalent wavelengths of light to optimize contrast detection (Partridge and Cummings 1999). Offset pigments are observed in one clade of deep-dwelling fishes, the Stomiidae, where a red-shifted rhodopsin more closely matches the wavelengths of light produced by the bioluminescent organs of these fishes, providing a secret channel to locate prey (Douglas et al. 1999).

1.4.4. Adaptations of rhodopsin to red-shifted freshwater environments

Red-shifted rhodopsin pigments are more prevalent in freshwater fishes inhabiting turbid and tannin-stained water (Schwanzara 1967). With the exception of the red-shifting F261Y substitution characterized in the characid Astyanax mexicanus (Yokoyama et al. 1995), most studies investigating shifts in rhodopsin spectral sensitivity focus not on molecular adaptation in the opsin pigment, but instead on the incorporation of the alternative 11-cis-3,4-dehydroretinal chromophore (A2) (Bridges 1964; Hasegawa and Miyaguchi 1997; Toyama et al. 2008; Wang et al. 2014; Enright et al. 2015). The additional double bond in A2 increases the delocalization of charge in the excited state of the chromophore, red shifting the spectral sensitivity (Luk et al. 2016). Recently the enzyme responsible for converting the A1 chromophore to A2 was identified as Cyp27C1 (Enright et al. 2015). This gene is highly conserved in vertebrates and likely also plays a key role in reducing toxic compounds outside the retina (Johnson et al. 2017). Its utility and expression in the retinal pigment epithelium (RPE) have also been shown to be an ancestral vertebrate trait, observed in the sea lamprey, a member of the earliest diverging clade of vertebrates (Morshedian et al. 2017). The advantage provided by the A2 chromophore in freshwater environments is clear: it red shifts the spectral sensitivity of rhodopsin using machinery already existing in vertebrate groups. However, the A2 chromophore also decreases thermal stability, which reduces sensitivity (Luk et al. 2016). Freshwater species using the A2 chromophore may compensate for this through specific substitutions that increase rhodopsin thermal stability (Fyhrquist et al. 1998).

1.4.5. Marine and freshwater fishes

The marine-freshwater interface acts as a hard boundary for most aquatic species, with many families of fishes inhabiting only one of the two environments. Marine-derived freshwater lineages, and vice versa, are rare because of the osmotic stress caused during transitions from one environment to the other (Carrete Vega and Wiens 2012). Adaptation to each environment requires major and often opposing behavioral and physiological changes. Fishes inhabiting marine environments must constantly take in water and prevent water loss in opposition to the osmotic gradient caused by hyperosmotic surroundings. In contrast, freshwater fishes constantly expel water as dilute urine to counteract the flow of water from the hypoosmotic surroundings into their bodies (Evans 2008).

Some fishes are more tolerant of variation in salinity and are able to move between marine and freshwater environments more freely. Diadromous fishes migrate between the two water types and undergo major physiological changes to survive shifts in salinity (Evans 2008). Some diadromous lineages have subsequently adapted to a fully freshwater existence (Velotta et al. 2015; Willoughby et al. 2018). The expansive phylogenetic diversity of many large tropical river systems such as the Amazon basin is in part attributed to endemic freshwater lineages of ancestrally marine fishes. The preponderance of freshwater fishes with marine ancestry in South America, known as marine-derived lineages, might be the result of large-scale geological events facilitating adaptation to freshwater, such as the Miocene marine-incursion events (Figure 1.4b). These events formed an environment of intermediate salinity known as the Pebas wetland that may have helped establish the now endemic Amazonian lineages of anchovies, herring, croakers, needlefishes, stingrays, and dolphins (Lovejoy et al. 1998). Alternatively, smaller scale invasions of freshwater, occurring globally, may be responsible for the repeated invasion by lineages more resilient to salinity differences (Bloom and Lovejoy 2017). Many of these lineages have since diversified and expanded their ranges throughout the optically variable waters of South America. Marine-derived lineages of freshwater fishes provide an ideal natural system to study how repeated transitions from marine to freshwater have influenced the rate of molecular evolution and adaptation of visual pigments like rhodopsin.

1.5. MOLECULAR EVOLUTION

1.5.1. Molecular sequence evolution

The genomic differences between species and populations underlie the morphological and physiological heterogeneity observed in nature. The amount of information contained in the genome is immense, with genome size ranging from 340 Mb to 4626 Mb in fishes (Smith and Gregory 2009). Genome comparisons of closely related species with obvious phenotypic differences revealed that much of the observed variation is derived from differential expression of, and not mutations in, protein-coding genes (King and Wilson 1975; The Chimpanzee Sequencing and Analysis Consortium 2005). Moreover, most of the differences between human and chimp sequences are synonymous substitutions, which do not alter the amino acid sequence of the protein they encode (The Chimpanzee Sequencing and Analysis Consortium 2005). Synonymous substitutions can still influence fitness by changing how a protein folds or by optimizing codon bias for more efficient translation in highly expressed proteins, but are generally assumed to accumulate at a neutral rate in species as they diverge (Anisimova and Liberles 2007). Debate remains as to what extent protein-altering non-synonymous substitutions also accumulate by neutral means (Ohta 1992; Orr 2005a). In general, genes evolve predominantly under purifying selection. Still, phenotypic variation at the molecular level has been studied for decades, typically in species possessing unique adaptations to challenging environments, including the visual pigments of deep-sea fishes (Pauling and Zuckerkandl 1963). Comparative sequence analyses have been aided by advances in sequencing technologies and the rapid accumulation of crystal structures (Sanger et al. 1977; Shendure et al. 2017). These analyses have shown that convergent substitutions occur independently in lineages at functionally critical sites concomitant with shifts in environment (Bowmaker and Hunt 2006; Storz 2018). This suggests that Darwinian selection also operates at the molecular level in the form of positive selection, where non-synonymous substitutions are fixed at rates faster than expected under neutrality (Yang and Bielawski 2000).

1.5.2. Molecular phylogenetics

Nucleotide and amino acid sequence data have many advantageous properties for phylogenetic reconstructions. The phylogenetic information encoded in these sequences and its utility in the reconstruction of evolutionary relationships has been understood for decades (Zuckerkandl and Pauling 1965), but the enormous amount of data encoded in vertebrate genomes makes it a powerful tool for reconciling differences, particularly in those sharing few phylogenetically informative morphological traits (Yang and Rannala 2012). Non-coding regions evolve rapidly and can be employed to identify differences between species where there is little morphological difference (Andrews et al. 2016), while genes encoding the ribosomal proteins, critical to all life, evolve very slowly and can be used to define ancient divergence events. These phylogenetic reconstructions challenged many long-held hypotheses pertaining to the topology and the tempo of diversification events at the base of the tree of life (Woese and Fox 1977; Dornburg et al. 2018). Recent advances in next-generation sequencing technology have allowed larger amounts of DNA to be reliably and efficiently sequenced (Shendure et al. 2017). Full genomes not only provide more data in the form of direct comparisons between homologous regions but also allow researchers to use systemic differences and gene losses and duplications to infer phylogenies (Ghiurcuta and Moret 2014). Increasing the number of loci used in phylogenetic reconstructions has made it clear that using single genes can result in incorrect phylogenetic inferences due to incomplete lineage sorting, convergent evolution, and long branch attraction (Yang and Rannala 2012). However, full genome sequencing remains costly, and genomes are available for only a small proportion of vertebrate species. Alternatively, a large number of genes or loci can be used in place of the full genome to approximate the data it encodes, and by comparing the topologies recovered using many different genes, inconsistencies associated with any one gene or locus can be overcome (Arcila et al. 2017).

Another advantage of sequence data, specifically that coding for proteins, is that it evolves following rules defined by the physical and chemical properties of DNA and of the amino acid sequences for which it codes. This makes it possible to develop statistical models based on robust maximum likelihood and Bayesian theoretical approaches for phylogenetic reconstructions (Yang and Rannala 2012). In general, these more complex models perform better than parsimony approaches and provide a testable framework where support for each node in the tree can be established by comparing bootstrap replicates or across the posterior distribution in maximum likelihood and Bayesian approaches respectively (Yang and Rannala 2012). Both approaches can utilize the same models of molecular evolution, which differ in the number of parameters used to describe variation in nucleotide, codon or amino acid frequencies as well as the rates of substitutions such as transitions and transversions or non-synonymous and synonymous changes (Liò and Goldman 1998). Empirical models can also be derived from large sequence datasets (Yang and Rannala 2012). Maximum likelihood and Bayesian approaches differ in how the final tree is estimated. In both cases, searching the entirety of tree space is computationally impossible and heuristic approaches must be used (Yang and Rannala 2012). Maximum likelihood reconstructions finish once the likelihood score can no longer be improved upon. Ideally this represents the best tree, but these approaches are sensitive to local sub-optimal peaks (Yang and Rannala 2012). In contrast, Bayesian approaches are more about the journey than the destination. Bayesian phylogenetic approaches utilize a Markov chain Monte Carlo algorithm to sample the distribution of parameters making up tree space. After the chain has converged on a region of tree space, or “peak”, assumed to represent the tree with the highest probability, the proportion of trees sampled should reflect the actual probability distribution for each parameter (Yang and Rannala 2012). Most resistance towards Bayesian approaches stems from the incorporation of priors, a set of parameters provided a priori, which if incorrectly applied can bias the resulting phylogeny. However, flat priors can be used to test the influence of this prior information on the tree, and in many cases, logical priors can improve the fit and decrease the time required for the phylogenetic estimation (Nascimento et al. 2017).
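The idea that the proportion of sampled trees reflects their posterior probability can be illustrated with a toy Metropolis sampler. This is only a sketch under strong assumptions: tree space is reduced to three hypothetical topologies, the unnormalized posterior weights are made up for illustration, and the function name `metropolis` is not taken from any phylogenetics package.

```python
import random

def metropolis(weights, n_steps, seed=0):
    """Sample states in proportion to unnormalized posterior weights."""
    rng = random.Random(seed)
    states = list(weights)
    current = states[0]
    counts = {s: 0 for s in states}
    for _ in range(n_steps):
        # Symmetric proposal: pick any topology uniformly at random.
        proposal = rng.choice(states)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < min(1.0, weights[proposal] / weights[current]):
            current = proposal
        counts[current] += 1
    return {s: c / n_steps for s, c in counts.items()}

# Hypothetical unnormalized posterior weights for three topologies.
toy_weights = {"((A,B),C)": 6.0, "((A,C),B)": 3.0, "((B,C),A)": 1.0}
freq = metropolis(toy_weights, n_steps=100_000)
for topology, share in freq.items():
    print(topology, round(share, 3))
```

With enough steps, the sampled shares should approach 0.6, 0.3 and 0.1, the normalized weights. Real analyses sample branch lengths and model parameters jointly with topology, which this sketch ignores.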

The assumption that mutations tend to occur at a (mostly) consistent rate allows ancient divergence times to be estimated using what is known as a molecular clock, which is especially useful for dating divergence times in species where fossil evidence is rare (Zuckerkandl and Pauling 1962). The neutral and nearly-neutral theories of molecular evolution largely support this technique for dating trees, assuming that, in general, the majority of the genome evolves due to non-adaptive processes (Kimura 1980; Ohta 1992; Nei et al. 2010). The assumptions of neutral evolution in protein-coding genes have been tested at a population (intraspecific) and interspecific scale (Nielsen 2005). Substitutions become fixed within a population at a rate that depends on the selection coefficient, the effective population size, and the effects of balancing selection, whether favouring heterozygotes or arising from fluctuating environmental pressures. Investigations of the selection pressures acting on populations have primarily focussed on comparing variation within and between species (Nielsen 2005). At an interspecific level, past selective events defining the evolution of homologous sequences are reconstructed across sites within a protein sequence using species trees for orthologous sequences or gene trees when comparing paralogous proteins. The selective pressures acting on genes at this scale are often inferred by comparing the rates of non-synonymous to synonymous substitutions (which are discussed in further detail in the following section). At both scales, examination of specific lineages and/or positions within a protein diverging from the assumptions of neutrality has helped identify evolutionary events where the structural and functional properties of a protein have changed (Yang 2000; Yang and Nielsen 2002; Nielsen 2005).
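The molecular clock described above reduces, in its simplest strict form, to dividing the pairwise genetic distance by twice the substitution rate, because both lineages accumulate changes independently after the split. The sketch below assumes a strict clock and a known rate; the function name and the numbers in the usage example are hypothetical and not drawn from this thesis.

```python
def divergence_time(distance, rate):
    """Years since two lineages split under a strict molecular clock.

    distance: substitutions per site separating the two sequences
    rate: substitutions per site per year along a single lineage
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate substitutions, hence the factor of two.
    return distance / (2.0 * rate)

# Hypothetical values: 2% divergence at 1e-9 substitutions/site/year.
print(divergence_time(0.02, 1e-9))
```

Relaxed-clock methods allow the rate to vary among branches and are generally preferred when rate heterogeneity is suspected.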

1.5.3. Codon models of molecular evolution

Codon models of molecular evolution allow for the comparison of the rates of non-synonymous to synonymous substitutions (dN/dS), and represent one way of establishing differences in selective constraint in gene evolution (Anisimova and Kosiol 2008). The majority of non-synonymous substitutions in a protein-coding gene are expected to be deleterious to protein function; therefore synonymous substitutions are fixed more frequently than non-synonymous substitutions. Because of this, most genes are said to be evolving under purifying selection with dN/dS well below one (Figure 1.5a) (Yang and Bielawski 2000). Rates of non-synonymous substitutions relative to synonymous substitutions can approach one at specific sites in a protein or along branches in a phylogeny that are under relaxed selective pressures (Figure 1.5b). Substitutions at these sites have little effect on protein function or occur in a protein that is no longer utilized in a specific lineage (Anisimova and Liberles 2007). In some cases, dN/dS exceeds one, indicating that non-synonymous substitutions are fixed more rapidly than expected under neutrality. This is termed positive selection and is often taken as evidence for adaptive molecular evolution (Yang and Bielawski 2000). Because selection often acts on only a small number of sites, detecting positive selection was difficult with early counting methods based on ancestral reconstructions, as well as with simplistic maximum likelihood models that assume the same dN/dS across the entire protein (Figure 1.5a). However, positive selection has been demonstrated using these methodologies in cases where evolutionary arms races favour substitutions that lead to greater sequence diversity, such as the major histocompatibility complex in humans, the coat proteins of HIV, and sperm lysin (Nielsen and Yang 1998; Hughes and Nei 1988; Yang et al. 2000).
The power of these models is improved when multiple site classes are considered, requiring only a subset of sites to be under positive selection to infer a significant shift in selection pressures (Figure 1.5a).
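As a concrete illustration of what a non-synonymous to synonymous comparison involves, the sketch below counts differences between two aligned coding sequences, in the spirit of the early counting methods rather than the maximum likelihood codon models discussed in this section. It makes several simplifying assumptions that are not from the source: only codons differing at a single position are scored, mutations to stop codons are counted as non-synonymous, no correction for multiple hits is applied, and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                s += 1.0 / 3.0  # one of three possible changes at this site
    return s

def pn_ps(seq1, seq2):
    """Proportions of non-synonymous and synonymous differences (pN, pS)."""
    S = N = syn_diff = nonsyn_diff = 0.0
    for i in range(0, len(seq1) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        s = (syn_sites(c1) + syn_sites(c2)) / 2.0
        S += s
        N += 3.0 - s
        # Simplification: codons with two or three differences are skipped.
        if sum(a != b for a, b in zip(c1, c2)) == 1:
            if CODE[c1] == CODE[c2]:
                syn_diff += 1
            else:
                nonsyn_diff += 1
    return nonsyn_diff / N, syn_diff / S

# Toy alignment: TTT>TTC is synonymous (F), ATG>ATA is not (M to I).
pN, pS = pn_ps("TTTGGAATG", "TTCGGAATA")
print(pN, pS, pN / pS)
```

A ratio well below one in such a comparison is what purifying selection is expected to produce; the codon models described here estimate the analogous quantity by maximum likelihood across a phylogeny.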

To test for deviations from neutrality at specific sites in these random-sites models, models including a class of positively selected sites are compared, using a likelihood ratio test (LRT), to models where dN/dS for all site classes is fixed at, or cannot exceed, one (Yang and Swanson 2002) (Figure 1.5b). Models estimating rates independently for each site in an alignment (Fixed Effects Likelihood: FEL) or with site classes estimated from a prior distribution (Fast, Unconstrained Bayesian AppRoximation: FUBAR) have also been implemented in the HYPHY package (Pond et al. 2005). FUBAR also estimates dN and dS independently, which can help determine whether or not selection may also be acting on synonymous substitutions, as is the case in species with significant codon bias (Du et al. 2014).
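The likelihood ratio test used to compare nested models can be sketched in a few lines. The test statistic is twice the difference in log-likelihood between the alternative and null models, compared against a chi-square distribution with degrees of freedom equal to the difference in free parameters. The log-likelihood values below are hypothetical, the function names are illustrative, and this sketch only handles one or two degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (df = 1 or 2)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the LRT statistic and its chi-square p-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods for a null and an alternative model
# that differ by two free parameters.
stat, p = likelihood_ratio_test(-1000.0, -995.0, df=2)
print(stat, p)
```

A small p-value indicates that the model allowing a class of sites with dN/dS above one fits significantly better than the model without it.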

Most bouts of adaptive evolution are also expected to occur over brief evolutionary time scales. To account for this, models have been developed that allow dN/dS estimates to vary on a priori selected branches on the phylogeny (Figure 1.5a) (Yang 1998). As in the random-sites models described above, hypotheses of episodic changes in selection pressures can be tested by comparing models with more phylogenetic partitions to those with fewer, or no, partitions using an LRT (Figure 1.5b) (Yang 1998; Yang and Nielsen 2002). Some of these models can be overly conservative because they assume uniform shifts in dN/dS across every site in a protein, but in extreme cases they are sensitive enough to detect positive selection, such as in the lysozyme gene during the dietary switch in colobine monkeys (Yang 1998) and in toxin genes of predatory snails (Duda and Palumbi 1999). These models have since been refined to include site variation in dN/dS (Yang and Nielsen 2002), improving their power and also making them more effective at determining what regions of a protein are most highly conserved or under the strongest selection pressures during adaptation (Figure 1.5a). Models have also been designed to remove any a priori inference of what branches are under selection, providing a mechanism for more exploratory studies without predefined hypotheses of where selection regimes are expected to change (Kosakovsky Pond et al. 2011). Modern implementations of these methods are effective even in the absence of any functional studies or crystal structures and can predict important regions of a protein solely based on the conservation of amino acid residues at a site over evolutionary time (Delport et al. 2008). In a similar fashion to how codon bias can interfere with the assumptions of the selective constraint on synonymous substitutions, long branch lengths and changes in effective population size can prevent the accurate estimation of dN/dS on specific lineages. Saturation, preventing the detection of codon substitutions, occurs more rapidly at synonymous than non-synonymous sites, especially when there is a strong codon bias. This is most problematic along long branches and is expected to lead to underestimates of dN/dS (Smith and Smith 1996).

Alternatively, a smaller effective population size can increase dN/dS by reducing the selective constraint acting on slightly deleterious substitutions (Nielsen 2005). These confounding factors can be mitigated by sampling a larger number of species, shrinking the branch lengths, and by comparing to selection pressures observed in other genes, as the effective population size and saturation of dS would have mostly uniform influence across the genome (Nielsen 2005).

1.5.4. Analyses of convergent evolution in protein sequences

In some species a single substitution in a specific gene can substantially improve fitness. When this is the case, similar adaptive processes might be expected across species facing the same selective pressure, a process known as convergent or parallel evolution (Losos 2011). These one-off substitutions are difficult to detect using the aforementioned codon models of molecular evolution because they do not leave signatures of non-synonymous to synonymous substitutions at a site or along a branch that exceed the number of substitutions accumulating through neutral processes (Anisimova and Liberles 2012). However, when the number of repeated evolutionary events is sufficiently large, analyses of convergent molecular evolution can identify substitutions that occur significantly more frequently concomitant with shifts in selective pressures (Stern 2013). Parallel and convergent amino acid substitutions, to and from the same amino acid in the former and simply to the same amino acid in the latter, are more likely to be observed in highly conserved proteins (Orr 2005b; Storz 2016). Such is the case in the evolution of resistance to toxic plants in insect sodium channels, ribonuclease proteins in leaf-eating monkeys, and in substitutions observed in the binding pocket of rhodopsin in deep-sea fishes (Hope et al. 1997; Zhang 2006). Adaptation to the same selective pressures might also proceed through divergent substitutions in the same protein or through entirely different molecular mechanisms. Detecting evidence of the former requires models that consider information about the structural and functional ramifications of substitutions at specific sites in a protein (Liberles et al. 2012).
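The distinction drawn above between parallel, convergent and divergent substitutions can be made explicit with a small classifier. It takes the ancestral and derived amino acids at one site on two independent branches; the function name and the example residues are illustrative assumptions, not results from this thesis.

```python
def classify_substitutions(anc1, der1, anc2, der2):
    """Classify changes at one site on two independent branches.

    parallel:   same ancestral state and same derived state
    convergent: different ancestral states, same derived state
    divergent:  both branches change, but to different states
    """
    if anc1 == der1 or anc2 == der2:
        return "no substitution on at least one branch"
    if der1 != der2:
        return "divergent"
    return "parallel" if anc1 == anc2 else "convergent"

# Illustrative single-letter amino acid states.
print(classify_substitutions("F", "Y", "F", "Y"))  # parallel
print(classify_substitutions("F", "Y", "W", "Y"))  # convergent
print(classify_substitutions("F", "Y", "F", "L"))  # divergent
```

In practice the ancestral states are themselves inferred from a phylogeny, so the classification inherits the uncertainty of the ancestral reconstruction.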

1.5.5. Experimental characterization of ancestral protein function

In many cases the effects of specific substitutions on protein function are unclear when provided only with sequence data. Only a small proportion of proteins have been crystallized, preventing informed predictions based on the structure of a protein, and the actual functional effects of specific substitutions have been investigated in even fewer proteins and often at only a small number of sites (Siltberg-Liberles et al. 2011). In addition, substitutions may not be expected to have the same effect across species or in the ancestral state of the protein (Starr and Thornton 2016), preventing predictions of past evolutionary events. The idea that ancestral sequences could be resurrected from ancestral sequence reconstructions for comparisons to extant species was first proposed over 50 years ago (Pauling and Zuckerkandl 1963), but has become more tenable over time with advances in sequencing and gene synthesis technologies (Liberles 2007). Studies investigating substitutions experimentally in ancestrally reconstructed proteins have been instrumental in determining the functional properties of proteins, and the modes by which proteins have adapted to different environments (Liberles 2007). This includes the evolution of the uricase gene, presumably in response to dietary changes in primates, the paleohabitat of bacteria, as well as the visual environments of ancient archosaurs and cetaceans (Chang et al. 2002; Gaucher et al. 2003; Kratzer et al. 2014; Dungan and Chang 2017). Experimental characterization of substitutions that shift protein function can also help identify functionally important domains in a protein. For example, convergent spectral-shifting substitutions at amino acids in rhodopsin predicted the location of the retinal binding pocket before the structure was crystallized (Hunt et al. 1996; Palczewski et al. 2000).
The experimental characterization of ancestral proteins is especially effective in investigations of visual pigments, because of the close association between environmental light conditions, spectral sensitivity and the amino acid sequence (Bowmaker and Hunt 2006).

1.6. THESIS OBJECTIVES

The goal of this thesis is to investigate patterns and processes of molecular evolution in rhodopsin underlying visual adaptation to freshwater environments. By comparing rhodopsin evolution across lineages of fishes making independent transitions into freshwater with distant common ancestry and disparate ecologies, and by comparing the functional effects of substitutions associated with freshwater transitions through ancestral protein resurrection in vitro, differences in rates and in the specific amino acid substitutions can be put into ecological and structural context. Data generated from these analyses will provide insight into the structural and functional differences in rhodopsin pigments adapted for vision in highly turbid and tannin-stained freshwater environments.

Specific aims:
• Investigate how transitions from marine to freshwater environments influence the rate of molecular evolution of the dim-light sensitive visual pigment, rhodopsin, in fishes.
• Compare rates of rhodopsin molecular evolution in distantly related clades of ancestrally marine fishes making independent transitions into freshwater.
• Evaluate how ecological factors, such as depth, might alter the strength of selection acting on rhodopsin during transitions from marine to freshwater.
• Investigate if the evolution of electroreception, an alternative sensory modality in Gymnotiformes, has altered selection pressures acting on rhodopsin.
• Test the performance of different models of molecular evolution for identifying lineage-specific shifts in rates of non-synonymous substitutions.
• Determine if the increased rate of molecular evolution in rhodopsin following freshwater invasions is independent of evolutionary trends in other genes.
• Compare what sites and regions of rhodopsin are under positive selection in marine and freshwater fishes, and on branches representing transitions between these habitats.
• Characterize the shift in spectral sensitivity resulting from convergent amino acid substitutions in rhodopsin occurring in freshwater fishes with marine ancestry.
• Identify if other aspects of rhodopsin structure and function are involved in adaptation to more turbid and tannin-stained freshwater environments.
• Investigate how a naturally occurring rhodopsin mutation, F220C, associated with disease in humans persists in other species.

1.7. THESIS OVERVIEW

Comparing rhodopsin sequences across species has proven to be an effective method for characterizing the structural and functional properties of rhodopsin most critical for photosensitivity in different light environments (Bowmaker 2008). This thesis expands this body of work by investigating adaptations improving rhodopsin’s photosensitivity in highly turbid and tannin-stained riverine systems. Red shifts in spectral sensitivity are observed in freshwater fishes, the opposite direction to the shifts identified in previous studies, which have focussed on deep-sea and lake fishes or fishes inhabiting environments dominated by a more blue-shifted underwater spectrum (Hunt et al. 1996; Carleton et al. 2016). This study also adds to the growing body of research suggesting that other aspects of rhodopsin function (kinetics and thermal stability) may also contribute to adaptive evolution (Sugawara et al. 2010; Castiglione et al. 2017; Hauser et al. 2017). Evidence supporting more rapid kinetics in freshwater fishes suggests these lineages may have more rapid dark adaptation, possibly critical because of the narrow interface between brightly-lit and nearly pitch-black environments in rivers.

Chapter two is a comparison of rates of molecular evolution in marine and freshwater anchovies inhabiting the many continental rivers of the Amazon basin of South America and the coastal regions surrounding it. The freshwater species form a monophyletic clade and all descend from a single marine ancestor. I hypothesized that this transition would result in positive selection on the freshwater clade, as the environmental pressures due to the more red-shifted riverine visual environment have changed for this group. Models allowing rates of non-synonymous to synonymous substitutions (dN/dS) in rhodopsin to differ in the marine and freshwater sister clades were employed to test this hypothesis. dN/dS estimates were found to be much higher for the freshwater clade, and no evidence for positive selection was found for the marine sister clade or in control genes not associated with visual function. Included among the positively selected sites are known spectral tuning sites, and other sites that may also influence the protein’s functional properties. The higher rate of non-synonymous substitutions in rhodopsin, but not in non-visual genes, in the freshwater clade of anchovies is evidence for the increased fixation of specific substitutions improving the function of the rhodopsin pigment for detecting light in freshwater environments. This represents the first study showing that a group of fishes making an evolutionary transition from a marine to freshwater environment have adapted at the molecular level to the different selection pressures imposed by freshwater visual environments.

In Chapter three, I investigate fishes from the family Sciaenidae, another clade of largely marine fishes that have made transitions into freshwater environments. Sciaenids (drums and croakers) are a globally distributed family of fishes. By reconstructing the phylogenetic tree of 114 sciaenid species using four nuclear and two mitochondrial genes, I recovered three independent transitions into freshwater, consistent with previous findings (Lo et al. 2015). Comparing dN/dS estimates for marine species and branches representing transitions into freshwater environments revealed a significant increase coinciding with the South American transitional event. The 20 amino acid substitutions along this branch include multiple red-shifting substitutions and residues in very close proximity to the chromophore. Ancestral amino acid sequences bounding the transitional branch were reconstructed and expressed in vitro. Spectroscopic assays indicate that the freshwater ancestor is red-shifted compared to the marine node preceding it, reflecting the shift in underwater visual environment. I also test the direct effect of four sites closest to the chromophore. These residues mostly recapitulate the difference in peak spectral sensitivity observed but also indicate that the rate of retinal release is much faster for the freshwater ancestor, evidence for faster dark adaptation. Faster dark adaptation might be critical in freshwater, where the interface between very dark and bright environments is very small. By characterizing the substitutions in rhodopsin occurring on the transitional branch in vitro, we can conclude that the phenotype of the freshwater ancestral rhodopsin is better suited to detecting light in freshwater environments. No previous study has resurrected and studied the functional effects of substitutions associated with marine to freshwater transitions, a substantial shift in ecology long expected to have major evolutionary impacts (Carrete Vega and Wiens 2012). The major functional differences in the marine and freshwater croaker rhodopsin reconstructions lend support to the results obtained from codon models of molecular evolution suggesting positive selection in this study and in the anchovies investigated in Chapter two.

Chapter four expands the study to include more Clupeiformes, the order to which anchovies belong, and another order of very shallow-dwelling fishes, the Beloniformes, to test if depth determines the strength of selection acting on rhodopsin upon transition from marine to freshwater environments. Clupeiformes are a prototypical pelagic (midwater) clade of fishes and Sciaenids are primarily benthic. Given that light is attenuated in a depth-dependent manner, I hypothesized that selection pressures are stronger on these lineages than on the shallower-dwelling Beloniformes, which inhabit depths illuminated by broad-spectrum light. Estimates of dN/dS on marine to freshwater lineages support this hypothesis, with more divergent results for Clupeiformes and Beloniformes. More red-shifting substitutions are also observed in these clades, reflecting the more substantial shift in wavelengths of light available at greater depths. As observed in Chapter three, positively selected substitutions at sites in the protein that destabilize the active state and are likely to increase the rate of retinal release are observed in Clupeiformes but not Beloniformes. The depth-dependent attenuation of light in opposite directions in marine and freshwater environments provides a rare natural system in which the effects of the strength of selection imposed by the environment on vertebrate species can be tested. This chapter shows that a greater difference between the optimal spectral sensitivities for detecting light in each environment results in increased differences in rates of molecular evolution associated with transitions from one environment to the other, and that the substitutions observed in lineages making a transition have larger effects on protein function. This indicates that the molecular evolution of the visual system during transitions in visual environments is driven by multiple ecological factors.

Chapter five investigates the molecular evolution of rhodopsin in an order of Amazonian fishes that have evolved active electroreception to contend with the limited visibility in these rivers. I hypothesized that the alternate sensory modality would decrease the strength of selection maintaining normal rhodopsin function, consistent with what has been observed in cone opsin genes in this clade and in bats relying heavily on echolocation (Liu et al. 2016; Gutierrez, Schott, et al. 2018). Gymnotiformes also possess a naturally occurring RP-associated amino acid identity in rhodopsin, supporting this hypothesis. However, dN/dS estimates for rhodopsin are significantly lower for this clade than for other fishes. Positive selection is also observed on one branch representing the ancestor of a deep-dwelling clade, suggesting rhodopsin is not only highly conserved but under strong adaptive pressures within Gymnotiformes. The RP-associated amino acid is also under positive selection on the branch leading to the Gymnotiformes clade, and mapping its location on a structure alongside amino acid substitutions preceding its first occurrence suggests that it is accommodated by epistatic interactions at other sites. This chapter reveals that no sensory trade-off has occurred in the rhodopsin gene of Gymnotiformes, and that dim-light vision remains an important part of these fishes' sensory repertoire. This contrasts with findings for cone opsin genes in Gymnotiformes and in other species with alternative sensory modalities (Gutierrez, Castiglione, et al. 2018). Through structural analysis of rhodopsin, it also explains how a mutation causing disease in humans persists in Gymnotiformes; this framework could become increasingly important with the advancement of predictive approaches employed in precision medicine.

Taken together, these chapters reveal to what extent and how rapidly rhodopsin is able to adapt to different visual environments, and its importance to the overall sensory system even in environments where visibility is extremely limited. They also highlight the importance of more holistic views of rhodopsin function in adaptive analyses, supporting previous studies in suggesting that functions aside from spectral sensitivity are involved in visual adaptation. In addition, epistatic interactions are likely to alter the extent to which some substitutions might alter rhodopsin’s properties. These intricacies reaffirm the importance of expanding analyses of rhodopsin structure and function to non-model species because of the wealth of combinations in sequence, structure and functional properties they provide, perpetuated through millions of years of evolutionary trial and error.

1.8. FIGURES

Figure 1.1. The eye and the visual cycle. a) Schematic of a teleost eye based on (Walls 1942). b) Schematic of a vertebrate retina with cell types labelled and retinal layers enumerated: 1. nerve fibre layer; 2. ganglion cell layer; 3. inner plexiform layer; 4. inner nuclear layer; 5. outer plexiform layer; 6. outer nuclear layer; 7. photoreceptor layer; 8. retinal pigment epithelium. Cell morphologies based on (Masland 2001b). c) Rod outer segment discs with track-like rhodopsin structures, based on (Schertler 2015). d) Crystal structure of dark-state rhodopsin (PDB: 1U19). e) Photoisomerization of the retinal chromophore and the subsequent structural changes and intermediate states during formation of the active Meta-II state of rhodopsin. f) Downstream signalling cascade in the rod photoreceptor cell. Ultimately, decreases in cGMP concentration cause CNG channels to close, hyperpolarizing the cell until the signalling cascade is shut off by phosphorylation of rhodopsin by GRK1 and binding of arrestin. g) The retinal cycle regenerating 11-cis retinal from all-trans retinal.

Figure 1.2. Underwater visual environments and spectral sensitivity of aquatic species. a) Attenuation of light underwater in different water types from measurements in (Jerlov 1976) and (Costa et al. 2012). Top right shows the intensity of light from 450 to 700 nm at the water’s surface from (Warrant and Johnsen 2013). b) Peak spectral sensitivities of rhodopsin pigments in marine and freshwater fishes. c) Peak spectral sensitivities by depth from microspectrophotometry estimates in (MacNichol and Levine 1979) and (Crescitelli 1990).

Figure 1.3. Rhodopsin functional domains and sequence diversity. Rhodopsin dark-state crystal structure with a window into the binding pocket of the protein from the side (a) and looking down from the N-terminal domain (b) (Palczewski et al. 2000). c) Shifts in spectral sensitivity caused by substitutions made in bovine rhodopsin characterized in vitro. d) Comparison of dark-state and light-activated meta-II rhodopsin (Palczewski et al. 2000; Choe et al. 2011). Differences in structures measured by the root mean square deviation (RMSD). Sites undergoing larger dislocations upon activation are shown in darker colours on the meta-II crystal structure. e) Histogram showing the amino acid identities at each site in a rhodopsin dataset of over 3000 vertebrate species. Amino acids are coloured and stacked by decreasing frequency. Helices are labelled above and the locations of substitutions associated with retinitis pigmentosa (RP) are shown below. Open circles represent sites where RP-associated substitutions are observed in nature.

Figure 1.4. Chronogram of aquatic vertebrate lineages. a) Chronogram based on the 1900-species tree generated in (Betancur-R et al. 2017). Marine and freshwater habitat preferences were reconstructed along the branches. Species without habitat data on Fishbase were removed and clades were collapsed to show ordinal diversity. Pie charts show the distribution of marine and freshwater fishes in each clade. Italicized orders had no habitat data and represent new orders not yet accepted on Fishbase. The approximate timing of the rhodopsin retrotransposition event (rho) and the teleost whole genome duplication (TGD) are shown on the tree. b) Examples of marine-derived lineages with invasion times corresponding to Miocene marine incursion events (Bloom and Lovejoy 2011).

Figure 1.5. Frequently used models of codon evolution. a) Distribution of dN/dS for sites in an alignment of five related species. dN/dS estimates range from sites under highly purifying selection to positive selection (bottom right). The inclusion of these sites in the site classes assumed by random-sites, branch-sites and clade models is depicted along the y axis. Different ways of partitioning the tree for two-ratio, branch-sites and clade models are shown along the x axis and the complexity of each model (in number of parameters) is plotted. In partition 1 (p1) the foreground partition (thick bracket) is allowed to exceed one, while the background is restricted to be below or equal to one. Asterisks label sites where a site class is fixed to equal one. b) Decision tree showing which LRT should be conducted for the inference of different types of selection pressures acting on a gene. Arrows point from null to alternative. Dotted arrow from M2a to M2aREL indicates the relaxation of parameters described in (Weadick and Chang 2011).
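
The null-to-alternative comparisons in the decision tree of Figure 1.5b can be illustrated with a minimal sketch of a likelihood ratio test (LRT) between two nested codon models. The sketch assumes the common convention that twice the difference in log-likelihood is compared against a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters; some tests use a different null distribution, so this is an approximation rather than the procedure used in this thesis. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form starting points: df = 2 (exponential tail) or df = 1 (erfc).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, np_null, np_alt):
    """Return (statistic, df, p_value) for a nested null/alternative model pair."""
    df = np_alt - np_null
    if df < 1:
        raise ValueError("the alternative model must have more free parameters")
    # Clamp at zero: small negative values can arise from imperfect optimization.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, df, chi2_sf(stat, df)


# Hypothetical log-likelihoods and parameter counts (not results from this thesis).
stat, df, p = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-997.0,
                                    np_null=3, np_alt=5)
print(f"2*dlnL = {stat:.2f}, df = {df}, p = {p:.4f}")
```

In practice the log-likelihoods and parameter counts would be taken from the output of the codon-model software for each pair of models connected by an arrow in the decision tree.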

1.9. REFERENCES

Aho A-C, Donner K, Hyden C, Larsen LO, Reuter T. 1988. Low retinal noise in animals with low body temperature allows high visual sensitivity. Nature 334:348.

Andrews KR, Good JM, Miller MR, Luikart G, Hohenlohe PA. 2016. Harnessing the power of RADseq for ecological and evolutionary genomics. Nat Rev Genet 17:81–92.

Anisimova M, Kosiol C. 2008. Investigating protein-coding sequence evolution with probabilistic codon substitution models. Mol. Biol. Evol. 26:255–271.

Anisimova M, Liberles DA. 2007. The quest for natural selection in the age of comparative genomics. Heredity 99:567.

Anisimova M, Liberles DA. 2012. Detecting and understanding natural selection. Codon evolution: mechanisms and models:73–96.

Arcila D, Ortí G, Vari R, Armbruster JW, Stiassny MLJ, Ko KD, Sabaj MH, Lundberg J, Revell LJ, Betancur-R R. 2017. Genome-wide interrogation advances resolution of recalcitrant groups in the tree of life. Nat Ecol Evol 1:0020.

Arendt D, Wittbrodt J. 2001. Reconstructing the eyes of Urbilateria. Philos Trans R Soc London B 356:1545–1563.

Banks MS, Sprague WW, Schmoll J, Parnell JAQ, Love GD. 2015. Why do animal eyes have pupils of different shapes? Sci. Adv. 1:e1500391–e1500391.

Betancur-R R, Ortí G, Pyron RA. 2015. Fossil-based comparative analyses reveal ancient marine ancestry erased by extinction in ray-finned fishes. Ecol Lett 18:441–450.

Betancur-R R, Wiley EO, Arratia G, Acero A, Bailly N, Miya M, Lecointre G, Ortí G. 2017. Phylogenetic classification of bony fishes. BMC Evol Biol 17:1–40.

Blair JE, Hedges SB. 2005. Molecular Phylogeny and Divergence Times of Deuterostome Animals. Mol. Biol. Evol. 22:2275–2284.

Bloom D, Lovejoy N. 2011. The Biogeography of Marine Incursions in South America. University of California Press

Bloom DD, Lovejoy NR. 2014. The evolutionary origins of diadromy inferred from a time-calibrated phylogeny for Clupeiformes (herring and allies). Proc. R. Soc. B 281:20132081.

Bloom DD, Lovejoy NR. 2017. On the origins of marine-derived freshwater fishes in South America. J. Biogeogr. 44:1927–1938.

Bowmaker JK, Hunt DM. 2006. Evolution of vertebrate visual pigments. Curr. Biol. 16:R484–R489.

Bowmaker JK. 2008. Evolution of vertebrate visual pigments. Vision Res. 48:2022–2041.

Bozzano A. 2001. The photoreceptor system in the retinae of two dogfishes, Scyliorhinus canicula and Galeus melastomus: possible relationship with depth distribution and predatory lifestyle. J. Fish Biol. 59:1258–1278.

Braasch I, Gehrke AR, Smith JJ, Kawasaki K, Manousaki T, Pasquier J, Amores A, Desvignes T, Batzel P, Catchen J, et al. 2016. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nat Genet 48:427–437.

Brazeau MD, Friedman M. 2015. The origin and early phylogenetic history of jawed vertebrates. Nature 520:490–497.

Bridges C. 1964. Periodicity of absorption properties in pigments based on vitamin A2 from fish retinae. Nature 203:303–304.

Cañestro C. 2012. Two Rounds of Whole-Genome Duplication: Evidence and Impact on the Evolution of Vertebrate Innovations. In: Soltis PS, Soltis DE, editors. Polyploidy and Genome Evolution. Berlin, Heidelberg: Springer Berlin Heidelberg. pp. 309–339.

Carleton KL, Dalton BE, Escobar-Camacho D, Nandamuri SP. 2016. Proximate and ultimate causes of variable visual sensitivities: Insights from cichlid fish radiations. Genesis 54:299–325.

Carrete Vega G, Wiens JJ. 2012. Why are there so few fish in the sea? Proc. Biol. Sci. 279:2323–2329.

Castiglione GM, Hauser FE, Liao BS, Lujan NK, Van Nynatten A, Morrow JM, Schott RK, Bhattacharyya N, Dungan SZ, Chang BSW. 2017. Evolution of nonspectral rhodopsin function at high altitudes. Proc. Natl. Acad. Sci. U.S.A. 114:7385–7390.

Chang BS, Jönsson K, Kazmi MA, Donoghue MJ, Sakmar TP. 2002. Recreating a functional ancestral archosaur visual pigment. Mol. Biol. Evol. 19:1483–1489.

Chen W-J, Lavoué S, Mayden RL. 2013. Evolutionary origin and early biogeography of otophysan fishes (Ostariophysi: Teleostei). Evolution 67:2218–2239.

Choe H-W, Kim YJ, Park JH, Morizumi T, Pai EF, Krauß N, Hofmann KP, Scheerer P, Ernst OP. 2011. Crystal structure of metarhodopsin II. Nature 471:651–655.

Collin SP. 2008. A web-based archive for topographic maps of retinal cell distribution in vertebrates. Clin Exp Optometry 91:85–95.

Cortesi F, Musilová Z, Stieb SM, Hart NS, Siebeck UE, Malmstrøm M, Tørresen OK, Jentoft S, Cheney KL, Marshall NJ, et al. 2015. Ancestral duplications and highly dynamic opsin gene evolution in percomorph fishes. Proc. Natl. Acad. Sci. U.S.A. 112:1493–1498.

Costa MPF, Novo EMLM, Telmer KH. 2012. Spatial and temporal variability of light attenuation in large rivers of the Amazon. Hydrobiologia 702:171–190.

Crescitelli F, Dartnall HJ. 1954. A photosensitive pigment of the carp retina. The Journal of Physiology 125:607–627.

Crescitelli F, McFall-Ngai M, Horwitz J. 1985. The visual pigment sensitivity hypothesis: further evidence from fishes of varying habitats. J Comp Physiol A 157:323–333.

Crescitelli F. 1958. The natural history of visual pigments. Ann. N.Y. Acad. Sci. 74:230–255.

Crescitelli F. 1990. Adaptations of visual pigments to the photic environment of the deep sea. Journal of Experimental Zoology 256:66–75.

Crescitelli F. 1991. The scotopic photoreceptors and their visual pigments of fishes: functions and adaptations. Vision Res. 31:339–348.

Curcio CA, Allen KA. 1990. Topography of ganglion cells in human retina. J. Comp. Neurol. 300:5–25.

Darwin C. 1859. On the Origin of Species by Means of Natural Selection, Or, The Preservation of Favoured Races in the Struggle for Life.

Delport W, Scheffler K, Seoighe C. 2008. Models of coding sequence evolution. Brief. Bioinformatics 10:97–109.

Dornburg A, Su Z, Townsend JP. 2018. Optimal rates for phylogenetic inference and experimental design in the era of genome-scale datasets. Syst. Biol.:syy047.

Douglas RH, Partridge JC, Dulai KS, Hunt DM. 1999. Enhanced retinal longwave sensitivity using a chlorophyll-derived photosensitiser in Malacosteus niger, a deep-sea dragon fish with far red bioluminescence. Vision Res. 39:2817–2832.

Du J, Dungan SZ, Sabouhanian A, Chang BS. 2014. Selection on synonymous codons in mammalian rhodopsins: a possible role in optimizing translational processes. BMC Evol Biol 14:96.

Duda TF, Palumbi SR. 1999. Molecular genetics of ecological diversification: duplication and rapid evolution of toxin genes of the venomous gastropod Conus. Proc. Natl. Acad. Sci. U.S.A. 96:6820–6823.

Dungan SZ, Chang BSW. 2017. Epistatic interactions influence terrestrial–marine functional shifts in cetacean rhodopsin. Proc. R. Soc. B 284:20162743–20162749.

Dungan SZ, Kosyakov A, Chang BSW. 2016. Spectral Tuning of Killer Whale (Orcinus orca) Rhodopsin: Evidence for Positive Selection and Functional Adaptation in a Cetacean Visual Pigment. Mol. Biol. Evol. 33:323–336.

Endler JA. 1993. The color of light in forests and its implications. Ecol. Monogr. 63:1–27.

Enright JM, Toomey MB, Sato S-Y, Temple SE, Allen JR, Fujiwara R, Kramlinger VM, Nagy LD, Johnson KM, Xiao Y, et al. 2015. Cyp27c1 Red-Shifts the Spectral Sensitivity of Photoreceptors by Converting Vitamin A1 into A2. Curr. Biol. 25:3048–3057.

Erclik T, Hartenstein V, McInnes RR, Lipshitz HD. 2009. Eye evolution at high resolution: The neuron as a unit of homology. Dev. Biol. 332:70–79.

Ernst OP, Lodowski DT, Elstner M, Hegemann P, Brown LS, Kandori H. 2014. Microbial and Animal Rhodopsins: Structures, Functions, and Molecular Mechanisms. Chem. Rev. 114:126–163.

Evans DH. 2008. Teleost fish osmoregulation: what have we learned since August Krogh, Homer Smith, and Ancel Keys. Am. J. Physiol. Regul. Integr. Comp. Physiol. 295:R704–R713.

Fasick JI, Robinson PR. 1998. Mechanism of spectral tuning in the dolphin visual pigments. Biochemistry 37:433–438.

Felce JH, Latty SL, Knox RG, Mattick SR, Lui Y, Lee SF, Klenerman D, Davis SJ. 2017. Receptor Quaternary Organization Explains G Protein-Coupled Receptor Family Structure. Cell Rep. 20:2654–2665.

Fitzgibbon J, Hope A, Slobodyanyuk SJ, Bellingham J. 1995. The rhodopsin-encoding gene of bony fish lacks introns. Gene 164:273–277.

Fotiadis D, Jastrzebska B, Philippsen A, Müller DJ, Palczewski K, Engel A. 2006. Structure of the rhodopsin dimer: a working model for G-protein-coupled receptors. Curr. Opin. Struct. Biol. 16:252–259.

Fyhrquist N, Donner K, Hargrave PA, McDowell JH, Popp MP, Smith WC. 1998. Rhodopsins from three frog and toad species: sequences and functional comparisons. Exp. Eye Res. 66:295–305.

Gaucher EA, Thomson JM, Burgan MF, Benner SA. 2003. Inferring the palaeoenvironment of ancient bacteria on the basis of resurrected proteins. Nature 425:285.

Ghiurcuta CG, Moret BME. 2014. Evaluating synteny for improved comparative studies. Bioinformatics 30:i9–i18.

Giles S, Xu G-H, Near TJ, Friedman M. 2017. Early members of “living fossil” lineage imply later origin of modern ray-finned fishes. Nature 549:265–268.

Graw J. 2010. Chapter Ten - Eye Development. Elsevier Inc.

Gregg RG, McCall MA, Massey S. 2012. Function and Anatomy of the Mammalian Retina.

Gregory RL. 2015. Eye and Brain: The Psychology of Seeing - Fifth Edition.

Gutierrez EA, Castiglione GM, Morrow JM, Schott RK, Loureiro LO, Lim BK, Chang BSW. 2018. Functional shifts in bat dim-light visual pigment are associated with differing echolocation abilities and reveal molecular adaptation to photic-limited environments. Mol. Biol. Evol.:msy140.

Gutierrez EA, Schott RK, Preston MW, Loureiro LO, Lim BK, Chang BSW. 2018. The role of ecological factors in shaping bat cone opsin evolution. Proc. Biol. Sci. 285:20172835–20172838.

Gutierrez EA, Van Nynatten A, Chang BSW, Lovejoy NR. 2016. Sensory Systems: Molecular Evolution in Vertebrates.

Hall MI, Kirk EC, Kamilar JM. 2012. Eye shape and the nocturnal bottleneck of mammals. Proc. R. Soc. B 279:4962–4968.

Hasegawa E, Miyaguchi D. 1997. Changes in scotopic spectral sensitivity of Ayu Plecoglossus altivelis. Fish. Sci. 63:509–513.

Hauser FE, Ilves KL, Schott RK, Castiglione GM, López-Fernández H, Chang BSW. 2017. Accelerated Evolution and Functional Divergence of the Dim Light Visual Pigment Accompanies Cichlid Colonization of Central America. Mol. Biol. Evol. 34:2650–2664.

Hauser FE, Schott RK, Castiglione GM, Van Nynatten A, Kosyakov A, Tang PL, Gow DA, Chang BSW. 2016. Comparative sequence analyses of rhodopsin and RPE65 reveal patterns of selective constraint across hereditary retinal disease mutations. Vis. Neurosci. 33:e002.

Hauser FE, van Hazel I, Chang BSW. 2014. Spectral tuning in vertebrate short wavelength-sensitive 1 (SWS1) visual pigments: Can wavelength sensitivity be inferred from sequence data? J. Exp. Zool. B Mol. Dev. Evol. 322:529–539.

Heesy CP, Hall MI. 2010. The nocturnal bottleneck and the evolution of mammalian vision. Brain Behav Evol 75:195–203.

Hope AJ, Partridge JC, Dulai KS, Hunt DM. 1997. Mechanisms of wavelength tuning in the rod opsins of deep-sea fishes. Proc. R. Soc. B 264:155–163.

Howland HC, Merola S, Basarab JR. 2004. The allometry and scaling of the size of vertebrate eyes. Vision Res. 44:2043–2065.

Hughes AL, Nei M. 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335:167.

Hunt DM, Slobodyanyuk SJ, Fitzgibbon J, Bowmaker JK. 1996. Spectral tuning and molecular evolution of rod visual pigments in the species flock of cottoid fish in Lake Baikal. Vision Res. 36:1217–1224.

Iannaccone A, Man D, Waseem N, Jennings BJ, Ganapathiraju M, Gallaher K, Reese E, Bhattacharya SS, Klein-Seetharaman J. 2006. Retinitis pigmentosa associated with rhodopsin mutations: Correlation between phenotypic variability and molecular effects. Vision Res. 46:4556–4567.

Imai H, Kefalov V, Sakurai K, Chisaka O, Ueda Y, Onishi A, Morizumi T, Fu Y, Ichikawa K, Nakatani K, et al. 2007. Molecular Properties of Rhodopsin and Rod Function. J. Biol. Chem. 282:6677–6684.

Imai H, Kojima D, Oura T, Tachibanaki S, Terakita A, Shichida Y. 1997. Single amino acid residue as a functional determinant of rod and cone visual pigments. Proc. Natl. Acad. Sci. U.S.A. 94:2322–2326.

Ingram NT, Sampath AP, Fain GL. 2016. Why are rods more sensitive than cones? The Journal of Physiology 594:5415–5426.

Jastrzebska B, Comar WD, Kaliszewski MJ, Skinner KC, Torcasio MH, Esway AS, Jin H, Palczewski K, Smith AW. 2016. A G Protein-Coupled Receptor Dimerization Interface in Human Cone Opsins. Biochemistry:acs.biochem.6b00877.

Jerlov NG. 1976. Marine Optics. Elsevier Inc.

Johnsen S, Cronin TW, Marshall NJ, Warrant EJ. 2014. Visual Ecology. Princeton University Press

Johnson KM, Phan TTN, Albertolle ME, Guengerich FP. 2017. Human mitochondrial cytochrome P450 27C1 is localized in skin and preferentially desaturates trans-retinol to 3,4-dehydroretinol. J. Biol. Chem. 292:13672–13687.

Jordan DM, Frangakis SG, Golzio C, Cassa CA, Kurtzberg J, Genomics TFFN, Davis EE, Sunyaev SR, Katsanis N. 2015. Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524:225–229.

Katritch V, Cherezov V, Stevens RC. 2013. Structure-function of the G protein–coupled receptor superfamily. Annu. Rev. Pharmacol. Toxicol. 53:531–556.

Kefalov VJ. 2012. Rod and cone visual pigments and phototransduction through pharmacological, genetic, and physiological approaches. J. Biol. Chem. 287:1635–1641.

Kimura M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111–120.

King MC, Wilson AC. 1975. Evolution at two levels in humans and chimpanzees. Science 188:107–116.

Knierim B, Hofmann KP, Ernst OP, Hubbell WL. 2007. Sequence of late molecular events in the activation of rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 104:20290–20295.

Kolb H, Nelson R. 1993. OFF‐alpha and OFF‐beta ganglion cells in cat retina: II. Neural circuitry as revealed by electron microscopy of HRP stains. J. Comp. Neurol. 329:85–110.

Kosakovsky Pond SL, Murrell B, Fourment M, Frost SDW, Delport W, Scheffler K. 2011. A Random Effects Branch-Site Model for Detecting Episodic Diversifying Selection. Mol. Biol. Evol. 28:3033–3043.

Kratzer JT, Lanaspa MA, Murphy MN, Cicerchi C, Graves CL, Tipton PA, Ortlund EA, Johnson RJ, Gaucher EA. 2014. Evolutionary history and metabolic insights of ancient mammalian uricases. Proc. Natl. Acad. Sci. U.S.A.:201320393.

Krupnick JG, Benovic JL. 1998. The role of receptor kinases and arrestins in G protein–coupled receptor regulation. Annu. Rev. Pharmacol. Toxicol. 38:289–319.

Lamb TD, Collin SP, Pugh EN. 2007. Evolution of the vertebrate eye: opsins, photoreceptors, retina and eye cup. Nat. Rev. Neurosci. 8:960–976.

Lamb TD. 2013. Evolution of phototransduction, vertebrate photoreceptors and retina. Prog Retin Eye Res 36:52–119.

Land MF, Fernald RD. 1992. The Evolution of Eyes. Annu. Rev. Neurosci. 15:1–29.

Land MF, Nilsson D-E. 2012. Animal Eyes. OUP Oxford

Land MF. 2006. Visual Optics: The Shapes of Pupils. Curr. Biol. 16:R167–R168.

Lewallen EA, Bohonak AJ, Bonin CA, van Wijnen AJ, Pitman RL, Lovejoy NR. 2017. Phylogenetics and biogeography of the two‐wing flyingfish (Exocoetidae: Exocoetus). Ecol Evol 7:1751–1761.

Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg Bauer E, Colwell LJ, De Koning AJ, Dokholyan NV, Echave J. 2012. The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci. 21:769–785.

Liberles DA. 2007. Ancestral sequence reconstruction. Oxford University Press on Demand

Lin J-J, Wang F-Y, Li W-H, Wang T-Y. 2017. The rises and falls of opsin genes in 59 ray-finned fish genomes and their implications for environmental adaptation. Sci. Rep. 7:1–13.

Lin SW, Kochendoerfer GG, Carroll KS, Wang D, Mathies RA, Sakmar TP. 1998. Mechanisms of spectral tuning in blue cone visual pigments visible and raman spectroscopy of blue-shifted rhodopsin mutants. J. Biol. Chem. 273:24583–24591.

Liò P, Goldman N. 1998. Models of molecular evolution and phylogeny. Genome Res. 8:1233–1244.

Liu D-W, Lu Y, Yan HY, Zakon HH. 2016. South American Weakly Electric Fish (Gymnotiformes) Are Long-Wavelength-Sensitive Cone Monochromats. Brain Behav Evol 88:204–212.

Liu J, Liu MY, Nguyen JB, Bhagat A, Mooney V, Yan ECY. 2011. Thermal properties of rhodopsin: insight into the molecular mechanism of dim-light vision. J. Biol. Chem. 286:27622–27629.

Lo P-C, Liu S-H, Chao NL, Nunoo FKE, Mok H-K, Chen W-J. 2015. A multi-gene dataset reveals a tropical New World origin and Early Miocene diversification of croakers (Perciformes: Sciaenidae). Mol. Phylogenet. Evol. 88:132–143.

Losos JB. 2011. Convergence, Adaptation, and Constraint. Evolution 65:1827–1840.

Lovejoy NR, Bermingham E, Martin AP. 1998. Marine incursion into South America. Nature 396:421–422.

Luk HL, Bhattacharyya N, Montisci F, Morrow JM, Melaccio F, Wada A, Sheves M, Fanelli F, Chang BSW, Olivucci M. 2016. Modulation of thermal noise and spectral sensitivity in Lake Baikal cottoid fish rhodopsins. Sci. Rep. 6:1–9.

Lunzer M, Golding GB, Dean AM. 2010. Pervasive Cryptic Epistasis in Molecular Evolution. PLoS Genet 6:e1001162.

Lythgoe JN. 1979. The Ecology of Vision. Clarendon Press

Lythgoe JN. 1984. Visual pigments and environmental light. Vision Res. 24:1539–1550.

MacNichol EF Jr., Levine JS. 1979. Visual Pigments in Teleost Fishes: Effects of Habitat, Microhabitat, and Behaviour on Visual System Evolution. Sensory processes 3:95–131.

Masland RH. 2001a. Neuronal diversity in the retina. Curr. Opin. Neurobiol. 11:431–436.

Masland RH. 2001b. The fundamental plan of the retina. Nature neuroscience 4:877.

Mathger LM, Hanlon RT, Håkansson J, Nilsson D-E. 2013. The W-shaped pupil in cuttlefish (Sepia officinalis): Functions for improving horizontal vision. Vision Res. 83:19–24.

Mendes HF, van der Spuy J, Chapple JP, Cheetham ME. 2005. Mechanisms of cell death in rhodopsin retinitis pigmentosa: implications for therapy. Trends Mol. Med. 11:177–185.

Morrow JM, Lazic S, Dixon Fox M, Kuo C, Schott RK, de A Gutierrez E, Santini F, Tropepe V, Chang BSW. 2017. A second visual rhodopsin gene, rh1-2, is expressed in zebrafish photoreceptors and found in other ray-finned fishes. J. Exp. Biol. 220:294–303.

Morshedian A, Toomey MB, Pollock GE, Frederiksen R, Enright JM, McCormick SD, Cornwall MC, Fain GL, Corbo JC. 2017. Cambrian origin of the CYP27C1-mediated vitamin A1-to-A2 switch, a key mechanism of vertebrate sensory plasticity. R. Soc. open sci. 4:170362–170369.

Nakamura Y, Yasuike M, Mekuchi M, Iwasaki Y, Ojima N, Fujiwara A, Chow S, Saitoh K. 2017. Rhodopsin gene copies in Japanese eel originated in a teleost-specific genome duplication. Zoological Lett 3:1–12.

Nascimento FF, Reis MD, Yang Z. 2017. A biologist’s guide to Bayesian phylogenetic analysis. Nat Ecol Evol 1:1446–1454.

Nei M, Suzuki Y, Nozawa M. 2010. The Neutral Theory of Molecular Evolution in the Genomic Era. Annu. Rev. Genom. Hum. Genet. 11:265–289.

Nelson JS. 2007. Fishes of the World. Wiley

Nielsen R. 2005. Molecular Signatures of Natural Selection. Annu. Rev. Genet. 39:197–218.

Nielsen R, Yang Z. 1998. Likelihood Models for Detecting Positively Selected Amino Acid Sites and Applications to the HIV-1 Envelope Gene. Genetics 148:929–936.

Nilsson D-E, Pelger S. 1994. A pessimistic estimate of the time required for an eye to evolve. Proc. R. Soc. B 256:53–58.

Novales Flamarique I. 2017. A vertebrate retina with segregated colour and polarization sensitivity. Proc. Biol. Sci. 284:20170759.

Ohno S, Wolf U, Atkin NB. 1968. Evolution from fish to mammals by gene duplication. Hereditas 59:169–187.

Ohta T. 1992. The nearly neutral theory of molecular evolution. Annual Review of Ecology and Systematics 23:263–286.

Orr HA. 2005a. The genetic theory of adaptation: a brief history. Nat Rev Genet 6:119–127.

Orr HA. 2005b. The probability of parallel evolution. Evolution 59:216–220.

Palczewska G, Vinberg F, Stremplewski P, Bircher MP, Salom D, Komar K, Zhang J, Cascella M, Wojtkowski M, Kefalov VJ, et al. 2014. Human infrared vision is triggered by two-photon chromophore isomerization. Proc. Natl. Acad. Sci. U.S.A. 111:E5445–E5454.

Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, et al. 2000. Crystal Structure of Rhodopsin: A G Protein-Coupled Receptor. Science 289:739–745.

Partridge JC, Cummings ME. 1999. Adaptation of visual pigments to the aquatic environment. Adaptive Mechanisms in the Ecology of Vision:251–283.

Patterson C. 2010. Chanoides, a marine Eocene otophysan fish (Teleostei: Ostariophysi). J Vertebr Paleontol 4:430–456.

Pauling L, Zuckerkandl E. 1963. Chemical Paleogenetics. Molecular “Restoration Studies” of Extinct Forms of Life. Acta Chem. Scand. 17 supl.:9–16.

Pond SLK, Frost SDW, Muse SV. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679.

Pugh EN Jr, Nikonov S, Lamb TD. 1999. Molecular mechanisms of vertebrate photoreceptor light adaptation. Curr. Opin. Neurobiol. 9:410–418.

Ramón y Cajal S. 1904. Textura del Sistema Nervioso del Hombre y de los Vertebrados. Madrid

Rossotti H. 1985. Colour. Princeton University Press

Sakmar TP, Franke RR, Khorana HG. 1989. Glutamic acid-113 serves as the retinylidene Schiff base counterion in bovine rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 86:8309–8313.

Sanger F, Nicklen S, Coulson AR. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. U.S.A. 74:5463–5467.

Schertler GFX. 2015. Rhodopsin on Tracks: New Ways to Go in Signaling. Structure 23:606–608.

Schmitz L, Motani R. 2010. Morphological differences between the eyeballs of nocturnal and diurnal amniotes revisited from optical perspectives of visual environments. Vision Res. 50:936–946.

Schwanzara SA. 1967. The visual pigments of freshwater fishes. Vision Res. 7:121–148.

Sekharan S, Katayama K, Kandori H, Morokuma K. 2012. Color vision:“OH-site” rule for seeing red and green. J. Am. Chem. Soc. 134:10706–10712.

Sekharan S, Mooney VL, Rivalta I, Kazmi MA, Neitz M, Neitz J, Sakmar TP, Yan ECY, Batista VS. 2013. Spectral Tuning of Ultraviolet Cone Pigments: An Interhelical Lock Mechanism. J. Am. Chem. Soc. 135:19064–19067.

Shendure J, Balasubramanian S, Church GM, Gilbert W, Rogers J, Schloss JA, Waterston RH. 2017. DNA sequencing at 40: past, present and future. Nature 550:345–353.

Siebeck UE, Marshall NJ. 2001. Ocular media transmission of fish — can coral reef fish see ultraviolet light? Vision Res. 41:133–149.

Siltberg-Liberles J, Grahnen JA, Liberles DA. 2011. The evolution of protein structures and structural ensembles under functional constraint. Genes 2:748–762.

Smith EM, Gregory TR. 2009. Patterns of genome size diversity in the ray-finned fishes. Hydrobiologia 625:1–25.

Smith JM, Smith NH. 1996. Synonymous nucleotide divergence: what is “saturation?” Genetics 142:1033–1036.

Starr TN, Thornton JW. 2016. Epistasis in protein evolution. Protein Sci. 25:1204–1218.

Stern DL. 2013. The genetic causes of convergent evolution. Nat Rev Genet 14:751–764.

Stock DW, Whitt GS. 1992. Evidence from 18S ribosomal RNA sequences that lampreys and hagfishes form a natural group. Science 257:787–789.

Storz JF. 2016. Causes of molecular convergence and parallelism in protein evolution. Nat Rev Genet 17:239–250.

Storz JF. 2018. Compensatory mutations and epistasis for protein function. Curr. Opin. Struct. Biol. 50:18–25.

Strauss O. 2005. The Retinal Pigment Epithelium in Visual Function. Physiol. Rev. 85:845–881.

Sugawara T, Imai H, Nikaido M, Imamoto Y, Okada N. 2010. Vertebrate Rhodopsin Adaptation to Dim Light via Rapid Meta-II Intermediate Formation. Mol. Biol. Evol. 27:506–519.

The Chimpanzee Sequencing and Analysis Consortium. 2005. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437:69–87.

Toyama M, Hironaka M, Yamahama Y, Horiguchi H, Tsukada O, Uto N, Ueno Y, Tokunaga F, Seno K, Hariyama T. 2008. Presence of Rhodopsin and Porphyropsin in the Eyes of 164 Fishes, Representing Marine, Diadromous, Coastal and Freshwater Species—A Qualitative and Comparative Study. Photochem. Photobiol. 84:996–1002.

Tsukamoto H, Terakita A, Shichida Y. 2010. A pivot between helices V and VI near the retinal binding site is necessary for activation in rhodopsins. J. Biol. Chem.:jbc–M109.

Velotta JP, McCormick SD, Schultz ET. 2015. Trade-offs in osmoregulation and parallel shifts in molecular function follow ecological transitions to freshwater in the Alewife. Evolution 69:2676–2688.

Wainwright PC, Longo SJ. 2017. Functional Innovations and the Conquest of the Oceans by Acanthomorph Fishes. Curr. Biol. 27:R550–R557.

Wald G, Brown PK, Smith Brown P. 1957. Visual Pigments and Depths of Habitat of Marine Fishes. Nature 180:969–971.

Wald G. 1939. The porphyropsin visual system. The Journal of General Physiology 22:775–794.

Walls GL. 1942. The vertebrate eye and its adaptive radiation. Bloomfield Hills.

Wang F-Y, Fu W-C, Wang I-L, Yan HY, Wang T-Y. 2014. The Giant Mottled Eel, Anguilla marmorata, Uses Blue-Shifted Rod Photoreceptors during Upstream Migration. PLoS ONE 9:e103953.

Wang W, Geiger JH, Borhan B. 2013. The photochemical determinants of color vision. Bioessays 36:65–74.

Warrant EJ, Johnsen S. 2013. Vision and the light environment. Curr. Biol. 23:R990–R994.

Weadick CJ, Chang BSW. 2011. An Improved Likelihood Ratio Test for Detecting Site-Specific Functional Divergence among Clades of Protein-Coding Genes. Mol. Biol. Evol. 29:1297–1300.

Willoughby JR, Harder AM, Tennessen JA, Scribner KT, Christie MR. 2018. Rapid genetic adaptation to a novel environment despite a genome-wide reduction in genetic diversity. Mol Ecol 17:675.

Woese CR, Fox GE. 1977. Phylogenetic structure of the prokaryotic domain: The primary kingdoms. Proc. Natl. Acad. Sci. U.S.A. 74:5088–5090.

Wu J, Seregard S, Algvere PV. 2006. Photochemical Damage of the Retina. Surv. Ophthalmol. 51:461–481.

Yanagawa M, Kojima K, Yamashita T, Imamoto Y, Matsuyama T, Nakanishi K, Yamano Y, Wada A, Sako Y, Shichida Y. 2015. Origin of the low thermal isomerization rate of rhodopsin chromophore. Sci. Rep. 5:11081.

Yancey PH, Gerringer ME, Drazen JC, Rowden AA, Jamieson A. 2014. Marine fish may be biochemically constrained from inhabiting the deepest ocean depths. Proc. Natl. Acad. Sci. U.S.A. 111:4461–4465.

Yang Z, Bielawski JP. 2000. Statistical methods for detecting molecular adaptation. Trends in Ecology & Evolution 15:496–503.

Yang Z, Nielsen R. 2002. Codon-Substitution Models for Detecting Molecular Adaptation at Individual Sites Along Specific Lineages. Mol. Biol. Evol. 19:908–917.

Yang Z, Rannala B. 2012. Molecular phylogenetics: principles and practice. Nat Rev Genet 13:303–314.

Yang Z, Swanson WJ, Vacquier VD. 2000. Maximum-Likelihood Analysis of Molecular Adaptation in Abalone Sperm Lysin Reveals Variable Selective Pressures Among Lineages and Sites. Mol. Biol. Evol. 17:1446–1455.

Yang Z, Swanson WJ. 2002. Codon-Substitution Models to Detect Adaptive Evolution that Account for Heterogeneous Selective Pressures Among Site Classes. Mol. Biol. Evol. 19:49–57.

Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573.

Yang Z. 2000. Maximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus A. Journal of Molecular Evolution 51:423–432.

Yau K-W, Hardie RC. 2009. Phototransduction Motifs and Variations. Cell 139:246–264.

Yokoyama R, Knox BE, Yokoyama S. 1995. Rhodopsin from the fish, Astyanax: role of tyrosine 261 in the red shift. Invest. Ophthalmol. Vis. Sci. 36:939–945.

Yokoyama S. 2000. Color vision of the coelacanth (Latimeria chalumnae) and adaptive evolution of rhodopsin (RH1) and rhodopsin-like (RH2) pigments. Journal of Heredity 91:215–220.

Zhang J. 2006. Parallel adaptive origins of digestive RNases in Asian and African leaf monkeys. Nat Genet 38:819.

Zhou X, Sundholm D, Wesołowski TA, Kaila VRI. 2014. Spectral Tuning of Rhodopsin and Visual Cone Pigments. J. Am. Chem. Soc. 136:2723–2726.

Zhou XE, Melcher K, Xu HE. 2012. Structure and activation of rhodopsin. Acta Pharmacol Sin 33:291–299.

Zuckerkandl E, Pauling L. 1962. Molecular disease, evolution and genetic heterogeneity. Academic Press.

Zuckerkandl E, Pauling L. 1965. Molecules as documents of evolutionary history. J. Theor. Biol. 8:357–366.

CHAPTER TWO: OUT OF THE BLUE: ADAPTIVE VISUAL PIGMENT EVOLUTION ACCOMPANIES AMAZON INVASION

This chapter was published as: Van Nynatten A, Bloom D, Chang BSW, Lovejoy NR. 2015. Out of the blue: adaptive visual pigment evolution accompanies Amazon invasion. Biology Letters 11:20150349.

Author contributions: A.V.N. collected the rhodopsin sequence dataset, analysed the data, participated in study design and drafted the manuscript; D.B. collected field data and participated in manuscript preparation; B.S.W.C. and N.R.L. conceived of the study, coordinated the study and helped write the manuscript.

2.1. ABSTRACT

Incursions of marine water into South America during the Miocene prompted colonization of freshwater habitats by ancestrally marine species, and present a unique opportunity to study the molecular evolution of adaptations to varying environments. Freshwater and marine environments are distinct in both spectra and average intensities of available light. Here, we investigate the molecular evolution of rhodopsin, the photosensitive pigment in the eye that activates in response to light, in a clade of South American freshwater anchovies derived from a marine ancestral lineage. Using likelihood-based comparative sequence analyses, we found evidence for positive selection in the rhodopsin of freshwater anchovy lineages at sites known to be important for aspects of rhodopsin function such as spectral tuning. No evidence was found for positive selection in marine lineages, nor in three other genes not involved in vision. Our results suggest that an increased rate of rhodopsin evolution was driven by diversification into freshwater habitats, thereby constituting a rare example of molecular evolution mirroring large-scale palaeogeographical events.

2.2. INTRODUCTION

Evolutionary transitions of species colonizing new ecological niches provide excellent systems for studying molecular adaptation. Visual pigments, light-sensitive molecules mediating the initial steps in the visual transduction cascade, are amenable to these studies because they represent a direct interface between an organism and its environment. Mutations in the opsin protein component of visual pigments can shift peak sensitivity towards the wavelengths of light most prevalent in the environment (Bowmaker and Hunt 2006). Rhodopsin is the visual pigment predominantly expressed in rod photoreceptors and functions in dim-light vision. Rhodopsin is particularly important in aquatic environments, where light attenuation is much greater than in air, and has been extensively studied in deep-sea fishes, where its peak spectral sensitivity has been shifted to match the predominantly blue environment (Hunt et al. 2001). In contrast to oceans, many large rivers are most transparent to red light due to the selective scattering of short wavelengths by suspended particulate matter. This causes freshwater systems to appear dimmer and red-shifted when compared to marine systems of the same depth (Lythgoe 1979).

During the Miocene in South America, profound palaeogeographic and climatic changes caused massive incursions of seawater into formerly freshwater continental habitats (Lovejoy et al. 1998; Lovejoy et al. 2006). These incursions resulted in complex habitat mosaics with varying salinity levels that facilitated evolutionary transitions between marine and freshwater in several lineages of fishes, including anchovies (Bloom and Lovejoy 2011). The New World clade of anchovies (subfamily Engraulinae) includes marine species distributed along the coasts of North, Central, and South America, as well as freshwater species in the Amazon, Orinoco, and other large Neotropical rivers. Recent phylogenetic analyses revealed that the South American river anchovies are the product of a single freshwater invasion by a marine ancestor (Bloom and Lovejoy 2012; Bloom and Lovejoy 2014). Subsequent radiation throughout the basins of South America produced a profusion of morphologically and ecologically distinct species not seen in marine habitats, such as the miniaturized Amazonsprattus scintilla and the piscivorous Lycengraulis batesii. A few freshwater anchovy lineages even reinvaded marine habitats (Bloom and Lovejoy 2012).

In this study we use the striking marine to freshwater habitat transition as a unique natural experiment to study the effects of different light environments on rhodopsin gene evolution. At the shallow depths occupied by the majority of marine anchovies, the spectral attenuation of light is negligible (Lythgoe 1979). In contrast, at similar depths in South American rivers, the amount of available light is substantially decreased and richer in longer wavelengths. In addition, the degree of spectral attenuation can vary among South American rivers, broadly classified as white water, black water, or clear water based on their optical qualities (Costa et al. 2012). Given the contrast in spectra of downwelling light between marine and freshwater habitats, as well as the diversity of visual environments in South American rivers, we predicted that the freshwater-invading anchovy lineage would show evidence of positive Darwinian selection in the rhodopsin gene. To test this hypothesis, we sequenced rhodopsin from New World anchovies, and used codon-based models of molecular evolution to compare the strength of selection acting on freshwater invaders versus their marine relatives.

2.3. METHODS

Genomic DNA was extracted and used as a template to amplify the rhodopsin gene from 54 individuals, representing 35 species of New World anchovies (Appendix 1). Sample collection and DNA extraction methodology has been described previously (Bloom and Lovejoy 2012). We amplified a ~800 bp fragment of the rhodopsin1 coding region spanning all seven transmembrane helices using the primers Rh193F (CNTATGAATAYCCTCAGTACTACC) and Rh1073R (CCRCAGCACARCGTGGTGATCATG) (Chen et al. 2013). PCR products were purified using a QIAquick PCR Purification Kit (Qiagen) and sequenced by Sanger sequencing using the same primers described above. High quality sequencing reads were aligned using MUSCLE (Edgar 2004). Methods describing the generation of the sequence data and alignments for the non vision-related genes Rag1, Rag2 and Cytb (referred to below as control genes) can be found in Bloom and Lovejoy (2012).
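Both primers contain IUPAC ambiguity codes (N, Y, R), so each primer name denotes a small pool of oligonucleotides rather than a single sequence. The following minimal sketch, which is not part of the original protocol, counts and enumerates the sequences implied by each primer. The function names `degeneracy` and `expand` are illustrative assumptions; only the primer sequences themselves come from the text above.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence encoded by the primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]

# Primer sequences as given in the Methods.
RH193F = "CNTATGAATAYCCTCAGTACTACC"
RH1073R = "CCRCAGCACARCGTGGTGATCATG"

if __name__ == "__main__":
    for name, seq in (("Rh193F", RH193F), ("Rh1073R", RH1073R)):
        print(f"{name}: {len(seq)} nt, {degeneracy(seq)} variants")
```

Under these codes, Rh193F (one N, one Y) expands to eight sequences and Rh1073R (two R positions) to four.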

We used the aligned datasets and the most recent species tree for anchovies (Bloom and Lovejoy 2012) to determine the strength and form of selection acting on rhodopsin and the control genes in South American anchovies. We estimated the ratio of the non-synonymous to synonymous substitution rates (dN/dS) in the anchovy rhodopsin dataset using codon-based maximum likelihood models incorporated in PAML 4 (Yang 2007). Since the models implemented in PAML do not incorporate rate variation in synonymous sites (dS), we also analyzed the data using FUBAR models available through the Datamonkey webserver, which do allow for synonymous rate variation (Pond et al. 2005). While some of the models implemented in PAML have recently been criticized on statistical grounds (Friedman and Hughes 2007; Suzuki 2008; Nozawa et al. 2009; Murrell et al. 2012), these criticisms have largely been refuted and the models shown to have robust statistical properties (Yang et al. 2009; Weadick and Chang 2011; Yang and Reis 2011; Zhai et al. 2012; Gharib and Robinson-Rechavi 2013). Clade model C (CmC) from the PAML 4 software package (Yang 2007) was used to test for differences in dN/dS in marine and freshwater anchovy lineages. The CmC model divides codons into three site classes based on estimated dN/dS values. The third site class is the divergent site class and is allowed to differ among foreground and background phylogenetic partitions based on a priori hypotheses (Bielawski and Yang 2004). We conducted two separate analyses on rhodopsin and control gene datasets, one with the freshwater invading clade identified as the foreground partition and one with only exclusively freshwater anchovies included in the foreground partition (marine reinvading anchovies included in the background). The null model used for the CmC analysis was M2a_REL, which has the same number of site classes as CmC but assumes a uniform dN/dS across the entire phylogeny (Weadick and Chang 2011). Statistical support for a divergent site class was determined with a Likelihood Ratio Test (LRT) comparing the likelihoods of CmC with M2a_REL. Positive selection (dN/dS > 1) in the foreground partition was examined using a null model constraining the dN/dS of the divergent site class to one in the CmC analysis and comparing it to the unconstrained CmC model with an LRT (Chang et al. 2012).
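The LRT arithmetic described above can be sketched in a few lines. This is a minimal illustration, assuming the test statistic 2(lnL_alt - lnL_null) is compared against a chi-square distribution with one degree of freedom, as the "(1)" entries in the P (df) columns of the tables indicate. The function names are hypothetical, and the log-likelihoods in the example are the rounded rhodopsin values reported in Table 2.1.

```python
import math

def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test statistic: twice the gain in log-likelihood."""
    return 2.0 * (lnl_alt - lnl_null)

def chi2_sf_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df.

    For 1 df, P(X > stat) = erfc(sqrt(stat / 2)), so no external
    statistics library is needed.
    """
    if stat <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(stat / 2.0))

def lrt_pvalue(lnl_null: float, lnl_alt: float) -> float:
    """P-value for nested models differing by one free parameter."""
    return chi2_sf_df1(lrt_statistic(lnl_null, lnl_alt))

if __name__ == "__main__":
    # Rhodopsin, Table 2.1: M2a_REL (null) versus CmC (alternative).
    stat = lrt_statistic(-4099.1, -4095.6)
    print(f"2*deltaLnL = {stat:.1f}, P = {lrt_pvalue(-4099.1, -4095.6):.3f}")
```

Because the tabulated log-likelihoods are rounded to one decimal place, P-values recomputed this way may differ slightly from those reported in the tables.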

Selection pressures acting independently on marine and freshwater lineages were analyzed by dividing rhodopsin and control gene datasets into a dataset containing the freshwater invading clade of anchovies and a dataset containing the marine clade and shared marine relatives. We tested for positive selection using random sites models M8 and its nested null model M8a incorporated in PAML 4 (Yang 2007). Models M8 and M8a were compared using an LRT, with significant results (p < 0.01) indicating support for the inclusion of the positively selected site class in M8 (Swanson et al. 2003). Sites under positive selection in the M8 model were identified using a Bayes Empirical Bayes approach (Yang 2005). We also tested for positive selection in the rhodopsin dataset using FUBAR, a model implemented in HyPhy on the Datamonkey webserver (Pond et al. 2005; Delport et al. 2010; Murrell et al. 2013). FUBAR allows for a larger range of discrete site classes and independently estimates values for dN and dS (Murrell et al. 2013). Sites identified as undergoing positive selection were determined to be significant when posterior probabilities were greater than 0.75.

2.4. RESULTS

A dN/dS value greater than one (positive selection) is indicative of evolutionarily advantageous amino acid substitutions and is often observed in proteins undergoing adaptive functional change (Anisimova and Kosiol 2008). We first implemented models in PAML that allow variation in dN/dS across sites but assume a uniform distribution across the entire phylogeny. These analyses identified a subset of sites under positive selection in the rhodopsin gene. We then defined “marine” and “freshwater” partitions using the phylogeny (Figure 2.1a; note that the freshwater partition includes five secondarily marine species), and implemented clade models that allow for a class of sites with dN/dS values that differ between defined partitions (Bielawski and Yang 2004). Incorporation of model parameters that allow dN/dS to differ between marine and freshwater partitions provided a statistically better fit to the data than simpler models with uniform dN/dS values across the tree (Table 2.1). These clade models indicate significant positive selection (dN/dS = 7.0) in the freshwater partition, but provide no evidence for positive selection in the marine partition. Transferring the five secondarily marine species from the freshwater to the marine partition results in an even higher freshwater value (dN/dS = 9.5) and improves support for the model allowing variation between partitions (Table 2.2). In contrast, analyses of non vision-related genes provided no evidence for positive selection, nor any significant differences in dN/dS between freshwater and marine partitions (Figure 2.1b; Table 2.1; Table 2.2).
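As a reminder of what the numerator and denominator of dN/dS count, the sketch below classifies a change between two codons as synonymous or non-synonymous. It is an illustration only and is not how PAML or FUBAR estimate dN/dS; those programs fit codon substitution models on a phylogeny by likelihood or Bayesian methods. It assumes the standard nuclear genetic code, which suits a nuclear gene such as rhodopsin but not the mitochondrial gene Cytb. The function names and example codons are hypothetical and are not taken from the anchovy alignment.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(codon: str) -> str:
    """Translate one codon; '*' denotes a stop codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]

def classify_change(ancestral: str, derived: str) -> str:
    """Label a codon change as identical, synonymous or non-synonymous."""
    if ancestral.upper() == derived.upper():
        return "identical"
    if translate(ancestral) == translate(derived):
        return "synonymous"      # contributes to dS
    return "non-synonymous"      # contributes to dN

if __name__ == "__main__":
    # Hypothetical codon pairs chosen to show both outcomes.
    for anc, der in (("GCT", "GCC"), ("GCT", "TCT")):
        print(anc, "->", der, ":", translate(anc), "->", translate(der),
              classify_change(anc, der))
```

An excess of non-synonymous over synonymous changes, after normalising by the number of sites of each kind, is what a dN/dS above one summarises.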

To confirm our clade model results, we also examined selection pressures acting on the rhodopsin gene by using PAML to implement random sites models to estimate dN/dS (Yang 2007). For these analyses, marine and freshwater partitions were considered as separate datasets. As with the clade model analyses, we found clear evidence for positive selection occurring in the freshwater anchovy dataset (Table 2.3), with positively selected sites identified at positions in the rhodopsin gene previously implicated in tuning spectral sensitivity to longer-wavelength light (Figure 2.1c; Figure 2.2; Table 2.5) (Hunt et al. 2001). In contrast, random sites analysis did not provide evidence for positive selection in the marine partition (Table 2.3), nor for positive selection in either the freshwater or marine partitions for any of the non vision-related genes (Table 2.4). To ensure that these results were not due to artifacts in dS estimation, we also conducted analyses using HyPhy that allow for independent estimation of dN and dS (Murrell et al. 2013) (Figure 2.1c; Figure 2.2; Table 2.5).

2.5. DISCUSSION

Positive selection has occurred in the rhodopsin gene of freshwater anchovies, but not their marine relatives. This result is consistent with our hypothesis that the invasion of the much dimmer and red-shifted freshwater rivers of South America is accompanied by visual adaptation in the dim-light sensitive visual pigment rhodopsin. The relatively low dN/dS observed in marine anchovies may be due to the negligible attenuation of light at depths inhabited by the majority of anchovy lineages (Warrant and Johnsen 2013), or because the peak absorbance of rhodopsin is already tuned to the ideal wavelength of light for marine environments. The lack of evidence for positive selection in the three non vision-related genes indicates that the increased dN/dS found for rhodopsin in freshwater is not due to differences in population structure or genome-wide shifts in evolutionary rates.

South American rivers are also more spectrally diverse than marine ecosystems (Costa et al. 2012). Lineages distributed across highly stratified or disparate freshwater light environments may have increased dN/dS values due to divergent selection pressures acting on the visual systems of species occupying spectrally distinct habitats (Schott et al. 2014). In freshwater anchovies, different adaptations may be required for optimal visual performance in the highly turbid Amazon, the clear waters of the Rio Tapajos, or the tannin-stained blackwaters of the Rio Negro. Also, the ecological diversification of freshwater anchovies (for example, into actively hunting piscivores) has likely altered visual demands and correspondingly shifted patterns of natural selection in rhodopsin (Bloom and Lovejoy 2011).

Amino acid substitutions within the chromophore binding pocket of rhodopsin can change the wavelength of its maximal absorbance (Bowmaker and Hunt 2006). Of the positively selected amino acid sites identified in this study, sites 124 and 299 are spectral tuning sites in rhodopsin (Figure 2.3) (Hunt et al. 2001), and site 108 is a spectral tuning site in the homologous cone opsin SWS2 (Chinen et al. 2005). Sites 165 and 213 have been identified as positively selected in recent studies of fishes inhabiting environments of varying turbidity (Larmuseau et al. 2011; Schott et al. 2014). Aspects of rhodopsin function other than spectral tuning have not been as well investigated, but are also likely to be adaptive under different light conditions. For example, the thermal stability of both the dark and light-activated state of rhodopsin may be important for visual perception in dim-light environments where the signal to noise ratio is very low (Liu et al. 2011). In addition, potential site reversions to marine states in secondarily marine lineages (for example, site 96 in Anchoa spinifer) are attractive candidates for mutagenesis studies.

Several other groups of marine fishes (including stingrays, pufferfishes, needlefishes, and flatfishes) have independently invaded the optically diverse freshwaters of South America. Our results suggest that these taxa represent a superb natural experiment for future studies of habitat-driven molecular adaptation.

2.6. TABLES

Table 2.1. Clade model C (PAML) analyses of rhodopsin and non vision-related genes.

Parameter columns: dN/dS 0, dN/dS 1, dN/dS 2.

| Gene | Model | lnL | dN/dS 0 | dN/dS 1 | dN/dS 2 | Null | P (df) |
|---|---|---|---|---|---|---|---|
| Rhodopsin | M2a_REL | -4099.1 | 0.03 (91.6%) | 1.00 (7.4%) | 2.82 (1.0%) | | |
| Rhodopsin | Constrained CmC (m/f) | -4099.7 | 0.03 (91.7%) | 1.00 (6.9%) | 3.38/1.00 (1.4%) | | |
| Rhodopsin | CmC (m/f) | -4095.6 | 0.03 (91.6%) | 1.00 (7.7%) | 0.89/7.04* (0.7%) | M2a_REL | 0.008 (1) |
| | | | | | | Constrained CmC | 0.004 (1) |
| Rag1 | M2a_REL | -6444.0 | 0.01 (85.2%) | 1.00 (0.6%) | 0.19 (14.2%) | | |
| Rag1 | CmC (m/f) | -6444.0 | 0.01 (85.2%) | 1.00 (0.6%) | 0.19/0.19 (14.2%) | M2a_REL | 0.972 (1) |
| Rag2 | M2a_REL | -3505.2 | 0.02 (89.2%) | 1.00 (0.5%) | 0.31 (10.3%) | | |
| Rag2 | CmC (m/f) | -3504.0 | 0.02 (89.9%) | 1.00 (0.7%) | 0.21/0.41 (9.4%) | M2a_REL | 0.112 (1) |
| Cytb | M2a_REL | -11291.8 | 0.00 (94.4%) | 1.00 (0.0%) | 0.06 (5.4%) | | |
| Cytb | CmC (m/f) | -11290.0 | 0.00 (94.4%) | 1.00 (0.0%) | 0.07/0.11 (5.6%) | M2a_REL | 0.065 (1) |

Note: lnL, ln likelihood; m/f, marine partition (background) / freshwater partition including marine reinvading species (foreground); *, significantly different marine vs. freshwater dN/dS values at p<0.01; df, degrees of freedom.

Table 2.2. Clade model C (PAML) analyses of rhodopsin and non vision-related genes, with marine reinvading anchovies included in marine partition.

Parameter columns: dN/dS 0, dN/dS 1, dN/dS 2.

| Gene | Model | lnL | dN/dS 0 | dN/dS 1 | dN/dS 2 | Null | P (df) |
|---|---|---|---|---|---|---|---|
| Rhodopsin | M2a_REL | -4099.1 | 0.03 (91.6%) | 1.00 (7.4%) | 2.82 (1.0%) | | |
| Rhodopsin | Constrained CmC (m/f) | -4098.0 | 0.03 (91.5%) | 1.00 (6.7%) | 3.33/1.00 (1.6%) | | |
| Rhodopsin | CmC (m/f) | -4094.7 | 0.03 (91.6%) | 1.00 (7.7%) | 0.84/9.48* (0.7%) | M2a_REL | 0.003 (1) |
| | | | | | | Constrained CmC | 0.001 (1) |
| Rag1 | M2a_REL | -6444.0 | 0.01 (85.2%) | 1.00 (0.6%) | 0.19 (14.2%) | | |
| Rag1 | CmC (m/f) | -6443.9 | 0.01 (85.6%) | 1.00 (0.6%) | 0.21/0.18 (13.8%) | M2a_REL | 0.623 (1) |
| Rag2 | M2a_REL | -3505.2 | 0.02 (89.2%) | 1.00 (0.5%) | 0.31 (10.3%) | | |
| Rag2 | CmC (m/f) | -3503.9 | 0.02 (86.3%) | 1.00 (1.3%) | 0.16/0.32 (12.4%) | M2a_REL | 0.114 (1) |
| Cytb | M2a_REL | -11291.8 | 0.00 (94.4%) | 1.00 (0.0%) | 0.06 (5.4%) | | |
| Cytb | CmC (m/f) | -11291.6 | 0.00 (94.4%) | 1.00 (0.0%) | 0.09/0.10 (5.6%) | M2a_REL | 0.751 (1) |

Note: lnL, ln likelihood; m/f, marine partition including marine reinvading species (background) / freshwater partition excluding marine reinvading species (foreground); *, significantly different marine vs. freshwater dN/dS values at p<0.01, df, degrees of freedom.

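Table 2.3. Random sites (PAML) analyses of rhodopsin.

Parameter columns: p, q, dN/dS 1.

| Dataset | Model | lnL | p | q | dN/dS 1 | Null | P (df) |
|---|---|---|---|---|---|---|---|
| Freshwater | M8a | -2682.2 | 0.07 | 0.64 | 1.00 (3.9%) | | |
| Freshwater | M8 | -2677.0 | 0.09 | 1.07 | 5.23 (0.8%) | M8a | 0.001 (1) |
| Marine | M8a | -2329.9 | 0.21 | 9.80 | 1.00 (7.3%) | | |
| Marine | M8 | -2329.3 | 0.10 | 2.79 | 1.42 (5.3%) | M8a | 0.258 (1) |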

Note: lnL, ln likelihood; p and q, parameters of beta distribution of site classes in models M8a and M8; df, degrees of freedom.

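Table 2.4. Results of random sites analyses (PAML) for non vision-related genes.

Parameter columns: p, q, dN/dS 1.

| Gene | Dataset | Model | lnL | p | q | dN/dS 1 | Null | P (df) |
|---|---|---|---|---|---|---|---|---|
| Rag1 | Freshwater | M8a | -3449.8 | 0.10 | 2.63 | 1.00 (0.0%) | | |
| Rag1 | Freshwater | M8 | -3449.8 | 0.10 | 2.63 | 1.00 (0.0%) | M8a | 1.000 (1) |
| Rag1 | Marine | M8a | -2503.1 | 0.17 | 2.80 | 1.00 (0.0%) | | |
| Rag1 | Marine | M8 | -2503.1 | 0.17 | 2.80 | 1.00 (0.0%) | M8a | 1.000 (1) |
| Rag2 | Freshwater | M8a | -3047.1 | 0.20 | 2.00 | 1.00 (0.0%) | | |
| Rag2 | Freshwater | M8 | -3047.1 | 0.20 | 2.00 | 1.00 (0.0%) | M8a | 1.000 (1) |
| Rag2 | Marine | M8a | -3082.3 | 0.08 | 2.88 | 1.00 (1.4%) | | |
| Rag2 | Marine | M8 | -3081.8 | 0.06 | 1.66 | 3.52 (0.3%) | M8a | 0.334 (1) |
| Cytb | Freshwater | M8a | -5928.3 | 0.01 | 0.32 | 1.00 (0.0%) | | |
| Cytb | Freshwater | M8 | -5928.3 | 0.01 | 0.31 | 1.00 (0.0%) | M8a | 0.996 (1) |
| Cytb | Marine | M8a | -6669.8 | 0.05 | 5.10 | 1.00 (0.0%) | | 1.000 (1) |
| Cytb | Marine | M8 | -6669.8 | 0.05 | 5.10 | 1.00 (0.0%) | M8a | 1.000 (1) |

Note: lnL, ln likelihood; p and q, parameters of beta distribution of site classes in models M8a and M8; df, degrees of freedom.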

Table 2.5. Positively selected sites identified using PAML and HyPhy (FUBAR) analyses.

Model Site (Bovine numbering)

PAML M8 108, 124*, 165, 213

FUBAR 108, 160, 162, 165, 213, 299*

Note: *, spectral tuning sites in rhodopsin

2.7. FIGURES

Figure 2.1. Phylogeny and molecular evolution in New World anchovies. a) Phylogenetic relationships among marine (blue) and freshwater (red) species of the subfamily Engraulinae (Lovejoy et al. 2006). Colours on branches show optimization of habitat type; partitions used for clade model analyses indicated by vertical bars. b) dN/dS estimates for the divergent site class in marine and freshwater partitions from clade model analyses implemented in PAML. c) Sites with > 0.75 posterior probability for inclusion in the positively selected site class as inferred by FUBAR. Fish images after Whitehead et al. 1988.


Figure 2.2. Distribution of amino acids at variable sites in anchovy dataset. Sites under positive selection in the freshwater clade of anchovies are indicated with an asterisk. Amino acids coloured by properties of their side chains: small in yellow, polar in purple, aromatic in orange, hydrophobic in blue, positively charged in red, and negatively charged in green. Phylogenetic tree indicates freshwater species (red) and marine species (blue).

Figure 2.3. Positively selected sites in freshwater clade. Sites with significant support for being in the positively selected site class (BEB > 0.75) in clade or random-sites models shown on the 3D crystal structure of dark state rhodopsin (1U19: Okada et al. 2004). Distances shown from amino acids with hydroxyl-bearing side-chains to the chromophore.

2.8. REFERENCES

Anisimova M, Kosiol C. 2008. Investigating protein-coding sequence evolution with probabilistic codon substitution models. Mol. Biol. Evol. 26:255–271.

Bielawski JP, Yang Z. 2004. A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution. Journal of Molecular Evolution 59:1–12.

Bloom D, Lovejoy N. 2011. The Biogeography of Marine Incursions in South America. University of California Press

Bloom DD, Lovejoy NR. 2012. Molecular phylogenetics reveals a pattern of biome conservatism in New World anchovies (family Engraulidae). J. Evol. Biol. 25:701–715.

Bloom DD, Lovejoy NR. 2014. The evolutionary origins of diadromy inferred from a time-calibrated phylogeny for Clupeiformes (herring and allies). Proc. R. Soc. B 281:20132081.

Bowmaker JK, Hunt DM. 2006. Evolution of vertebrate visual pigments. Curr. Biol. 16:R484–R489.

Chang BS, Du J, Weadick C, Muller J, Bickelmann C, Yu DD, Morrow JM, Cannarozii GM, Schneider A. 2012. The future of codon models in studies of molecular function: ancestral reconstruction and clade models of functional divergence. Codon evolution: mechanisms and models:145–163.

Chen W-J, Lavoué S, Mayden RL. 2013. Evolutionary origin and early biogeography of otophysan fishes (ostariophysi: teleostei). Evolution 67:2218–2239.

Chinen A, Matsumoto Y, Kawamura S. 2005. Spectral differentiation of blue opsins between phylogenetically close but ecologically distant goldfish and zebrafish. J. Biol. Chem. 280:9460–9466.

Costa MPF, Novo EMLM, Telmer KH. 2012. Spatial and temporal variability of light attenuation in large rivers of the Amazon. Hydrobiologia 702:171–190.

Delport W, Poon AFY, Frost SDW, Kosakovsky Pond SL. 2010. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 26:2455–2457.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797.

Friedman R, Hughes AL. 2007. Likelihood-ratio tests for positive selection of human and mouse duplicate genes reveal nonconservative and anomalous properties of widely used methods. Mol. Phylogenet. Evol. 42:388–393.

Gharib WH, Robinson-Rechavi M. 2013. The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC. Mol. Biol. Evol. 30:1675–1686.

Hunt DM, Bowmaker JK, Dulai KS, Partridge JC, Cottrill P. 2001. The molecular basis for spectral tuning of rod visual pigments in deep-sea fish. J. Exp. Biol. 204:3333–3344.

Larmuseau M, Vanhove M, Huyse T, Volckaert F, Decorte R. 2011. Signature of selection on the rhodopsin gene in the marine radiation of American seven-spined gobies (Gobiidae, Gobiosomatini). J. Evol. Biol. 24:1618–1625.

Liu J, Liu MY, Nguyen JB, Bhagat A, Mooney V, Yan ECY. 2011. Thermal properties of rhodopsin: insight into the molecular mechanism of dim-light vision. J. Biol. Chem. 286:27622–27629.

Lovejoy NR, Albert JS, Crampton WGR. 2006. Miocene marine incursions and marine/freshwater transitions: Evidence from Neotropical fishes. J South Am Earth Sci 21:5–13.

Lovejoy NR, Bermingham E, Martin AP. 1998. Marine incursion into South America. Nature 396:421–422.

Lythgoe JN. 1979. The Ecology of Vision.

Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky Pond SL, Scheffler K. 2013. FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection. Mol. Biol. Evol. 30:1196–1205.

Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Pond SLK. 2012. Detecting individual sites subject to episodic diversifying selection. PLoS Genet 8:e1002764.

Nozawa M, Suzuki Y, Nei M. 2009. Response to Yang et al.: Problems with Bayesian methods of detecting positive selection at the DNA sequence level. Proc. Natl. Acad. Sci. U.S.A. 106:E96–E96.

Okada T, Sugihara M, Bondar A-N, Elstner M, Entel P, Buss V. 2004. The Retinal Conformation and its Environment in Rhodopsin in Light of a New 2.2Å Crystal Structure. J. Mol. Biol. 342:571–583.

Pond SLK, Frost SDW, Muse SV. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679.

Schott RK, Refvik SP, Hauser FE, López-Fernández H, Chang BSW. 2014. Divergent positive selection in rhodopsin from lake and riverine cichlid fishes. Mol. Biol. Evol. 31:1149–1165.

Suzuki Y. 2008. False-positive results obtained from the branch-site test of positive selection. Genes & genetic systems 83:331–338.

Swanson WJ, Nielsen R, Yang Q. 2003. Pervasive adaptive evolution in mammalian fertilization proteins. Mol. Biol. Evol. 20:18–20.

Warrant EJ, Johnsen S. 2013. Vision and the light environment. Curr. Biol. 23:R990–R994.

Weadick CJ, Chang BSW. 2011. An Improved Likelihood Ratio Test for Detecting Site-Specific Functional Divergence among Clades of Protein-Coding Genes. Mol. Biol. Evol. 29:1297–1300.

Whitehead JP. 1985. Clupeoid Fishes of the World (suborder Clupeoidei): An Annotated and Illustrated Catalogue of the Herrings, Sardines, Pilchards, Sprats, Shads, Anchovies, and Wolfherrings. United Nations Development Programme.

Yang Z, Nielsen R, Goldman N. 2009. In defense of statistical methods for detecting positive selection. Proc. Natl. Acad. Sci. U.S.A.:pnas–0904550106.

Yang Z, Reis dos M. 2011. Statistical Properties of the Branch-Site Test of Positive Selection. Mol. Biol. Evol. 28:1217–1228.

Yang Z. 2005. Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection. Mol. Biol. Evol. 22:1107–1118.

Yang Z. 2007. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol. Biol. Evol. 24:1586–1591.

Zhai W, Nielsen R, Goldman N, Yang Z. 2012. Looking for Darwin in genomic sequences— validity and success of statistical methods. Mol. Biol. Evol. 29:2889–2893.


2.9. SUPPLEMENTAL INFORMATION

Table S2.1. Accession numbers for sequences generated in this study

Species                     Tissue Number  Accession Number
Amazonsprattus scintilla    DDB3081        KT201093
Amazonsprattus n sp 1       DDB1848        KT201094
Anchoviella n sp 1          DDB733         KT201095
Pterengraulis atherinoides  DDB501         KT201096
Anchoviella carrikeri       DDB806         KT201097
Jurengraulis juruensis      DDB696         KT201098
Anchovia surinamensis       DDB4259        KT201099
Anchoviella manamensis      DDB4168        KT201100
Anchoviella alleni          DDB729         KT201101
Anchoviella cf guianensis   DDB655         KT201102
Lycengraulis batesii        DDB3087        KT201103
Anchoviella n sp 2          DDB739         KT201104
Anchoviella guianensis      DDB4147        KT201105
Lycengraulis poeyi          DDB3286        KT201106
Anchoa spinifer             DDB3461        KT201107
Anchoviella lepidentostole  DDB3608        KT201108
Lycengraulis grossidens     DDB4304        KT201109
Anchoviella brevirostris    DDB3602        KT201110
Anchoa cayorum              DDB3020        KT201111
Anchoa colonensis           DDB3369        KT201112
Anchoa delicatissima        DDB3076        KT201113
Anchoa filifera             DDB3409        KT201114
Engraulis ringens           DDB3743        KT201115
Anchoa sp                   DDB4271        KT201116
Anchoa walkeri              DDB3285        KT201117
Anchovia clupeoides         DDB4274        KT201118
Anchoviella balboae         DDB3290        KT201119
Anchoviella elongata        DDB3364        KT201120
Cetengraulis edentulus      DDB3648        KT201121
Cetengraulis mysticetus     DDB3453        KT201122
Engraulis encrasicolus      DDB3123        KT201123
Engraulis eurystole         DDB3902        KT201124
Engraulis mordax            DDB3080        KT201125
Anchoa cubana               DDB2666        KT201126
Encrasicholina devisi       DDB3239        KT201127
Pterengraulis atherinoides  DDB502         KT201128
Anchoviella alleni          DDB734         KT201129
Anchoviella carrikeri       DDB812         KT201130
Anchoviella alleni          DDB730         KT201131
Anchoviella manamensis      DDB4170        KT201132
Anchovia surinamensis       DDB4260        KT201133
Jurengraulis juruensis      DDB804         KT201134
Lycengraulis batesii        DDB3948        KT201135
Anchoviella balboae         DDB3291        KT201136
Anchoviella brevirostris    DDB3603        KT201137
Anchoa cayorum              DDB3073        KT201138
Anchovia clupeoides         DDB4275        KT201139
Anchoa colonensis           DDB3370        KT201140
Anchoviella elongata        DDB3365        KT201141
Anchoa sp                   DDB4272        KT201142
Anchoa walkeri              DDB3287        KT201143
Cetengraulis edentulus      DDB3392        KT201144
Engraulis encrasicolus      DDB3125        KT201145
Lycengraulis grossidens     DDB4122        KT201146

CHAPTER THREE: TURNING RED: FUNCTIONAL TUNING OF RHODOPSIN IN THE FACE OF STRONG SELECTION PRESSURES IN FRESHWATER CROAKERS

Contributors: Alexander Van Nynatten, Gianni M Castiglione, Eduardo A Gutierrez, Nathan R Lovejoy, Belinda SW Chang.

Author contributions: AVN, BSWC and NRL designed the study. AVN collected the rhodopsin sequence dataset and analysed the data. AVN performed the spectroscopic experiments, data analysis and interpretation, with assistance from GMC, EAG, NRL, and BSWC.

3.1. ABSTRACT

Rhodopsin, the light-sensitive visual pigment expressed in rod photoreceptors, is exquisitely well adapted for vision in dimly lit environments. However, rhodopsin’s ability to detect light diminishes underwater, where the visible spectrum narrows with depth, unless its sensitivity overlaps with the wavelengths of light available. Blue-shifting amino acid substitutions are observed in many deep-sea fishes, matching rhodopsin to the most prevalent wavelengths of light in these environments. In contrast, rivers are illuminated by a red-shifted spectrum. We recently found positive selection in the rhodopsin gene of a freshwater clade of anchovies with marine ancestry, but the molecular mechanisms underlying spectral tuning in this system were not investigated. Using codon models, we show that positive selection in rhodopsin is also associated with a marine to freshwater transition in croakers, a globally distributed family of mostly marine fishes. To determine whether this evolutionary transition from marine to freshwater in croakers was accompanied by adaptive shifts in visual abilities, we resurrected ancestral rhodopsin sequences and experimentally tested the functional properties of ancestral pigments encompassing the marine to freshwater transition using spectroscopic assays. We found that the ancestral freshwater rhodopsin is red-shifted compared to the marine ancestor, and that substitutions along the transitional branch result in faster kinetics associated with dark adaptation. These changes shift spectral sensitivity to more closely match environmental light conditions and alter rhodopsin kinetics to allow for more rapid restoration of visual sensitivity, likely to be advantageous in freshwater because of the relatively narrow interface and frequent transitions between bright-light and dim environments. This study is the first to experimentally show that positively selected substitutions in ancestral visual pigments improve protein function following a marine to freshwater transition, providing insight into the molecular underpinnings of the many morphological and physiological changes associated with this major change in habitat.

3.2. INTRODUCTION

Understanding the molecular underpinnings of adaptation is critical to fully appreciate the evolutionary processes associated with major habitat transitions. These transition events are often characterized by marked physiological and morphological adaptation improving an organism’s fitness in an environment starkly different from its ancestral domain (Pough et al. 1999). Ultimately, these differences are governed by changes in the genome (Amemiya et al. 2013; Foote et al. 2015), but to what extent adaptation is mediated by mutations in protein-coding genes is unclear (Nei 2007). Proteins involved in maintaining osmotic homeostasis have been shown to differ in marine and freshwater species, with convergent functional shifts observed in species making independent evolutionary transitions from one water type to the other (Yancey et al. 1982; di Prisco and Tamburrini 1992; Lee et al. 2011). More recently, comparative genomic studies have identified specific regions of the genome under selection in lineages that have made recent habitat transitions (Jones et al. 2012; Lee 2016). Most studies investigating functional adaptation to marine or freshwater environments have focussed on genes involved in osmoregulation (Velotta et al. 2015; Lee 2016; Willoughby et al. 2018), but salinity is only one of the many physical properties that differ between marine and freshwater environments (Carrete Vega and Wiens 2012). The underwater visual environment typical of freshwater rivers is strongly red-shifted compared to marine environments due to increased turbidity and higher concentrations of organic compounds (Figure 3.1a). The visual pigments of freshwater fishes also tend to be red-shifted compared to those of marine species, optimizing the sensitivity of these pigments for detecting light in freshwater environments (MacNichol and Levine 1979). The genes encoding these pigments are positively selected in freshwater fishes with recent marine ancestry, possible evidence for adaptive evolution (Van Nynatten et al. 2015; Marques et al. 2017). However, whether or not positively selected substitutions observed in these genes optimize the proteins’ functional properties for freshwater environments has not been experimentally investigated for opsin proteins or any other gene involved in adaptation to freshwater environments.

Visual pigments mediate the initial step in the visual transduction cascade and are particularly amenable to the functional characterization of molecular evolution. The spectral sensitivity of a visual pigment is controlled by electrostatic interactions between the light-sensitive chromophore and amino acids in the opsin G protein-coupled receptor (GPCR) forming the chromophore binding pocket (Ernst et al. 2014). Specific amino acid substitutions in this binding pocket domain can shift the spectral sensitivity so that it better aligns with the wavelengths of light illuminating an environment (Bowmaker and Hunt 2006). This trend, known as spectral tuning, is most obvious in rhodopsin, the visual pigment expressed in dim-light specialized rod photoreceptors (Lythgoe 1984). In deep-sea fishes, convergent amino acid substitutions blue-shift the spectral sensitivity to match the most prevalent down-welling wavelengths of light (Bowmaker and Hunt 2006). Comparative sequence analyses have also identified convergent substitutions in rhodopsin that optimize its kinetic properties for the dim-light environments characteristic of the deep sea (Sugawara et al. 2010). In contrast to marine systems, freshwater rivers typically become increasingly red-shifted with depth (Figure 3.1a) (MacNichol and Levine 1979). Light is also more rapidly attenuated in freshwater rivers, making the underwater visual environment much darker at comparable depths (Figure 3.1a). However, while the specific amino acid substitutions shifting the spectral sensitivity of deep-sea fishes towards the blue end of the spectrum are well established, the molecular mechanisms underlying visual adaptation in freshwater fishes are not as well understood.

The diversity of fishes in marine and freshwater systems offers an excellent opportunity to investigate visual adaptations, particularly for visual pigment proteins (MacNichol and Levine 1979; Crescitelli 1990). However, comparative sequence analyses between marine and freshwater fishes are often hampered because many freshwater species have no extant marine sister taxa (Carrete Vega and Wiens 2012). This makes freshwater species with more recent marine ancestry ideal natural systems for studying adaptation to freshwater environments (Bloom and Lovejoy 2017). These marine-derived lineages contribute to the immense biodiversity found in many large freshwater rivers and may have invaded and adapted to freshwater environments during major changes in paleogeography, such as the formation of the Pebas wetland system in South America (Bloom and Lovejoy 2017). This environment of intermediate salinity, formed during Miocene marine incursion events, is thought to have established many endemic marine-derived freshwater lineages including dolphins, stingrays, anchovies, needlefishes and croakers (Lovejoy et al. 1998). We have previously shown that the rhodopsin gene of the South American freshwater anchovies has a higher ratio of non-synonymous to synonymous substitution rates (dN/dS) than that of their marine sister clade (Van Nynatten et al. 2015). A similar trend has been observed in the short-wavelength sensitive visual pigment (SWS2) in freshwater stickleback (Marques et al. 2017). This suggests that freshwater fishes with marine ancestry have adapted to the red-shifted freshwater visual environment. What remains to be done is to identify which amino acid substitutions are critical for dim-light vision in riverine environments, to compare them with those in other marine-derived freshwater lineages, and to experimentally characterize the functional effects of these mutations.

Drum and croakers, members of the family Sciaenidae (herein referred to as croakers), are a globally distributed family of predatory fishes that inhabit a wide range of habitats and rely heavily on vision (Deary et al. 2016). Marine croakers tend to have visual pigments well-tuned to their environment (Horodysky et al. 2008; Xu et al. 2016), and deep-dwelling species optimize how visual information is processed to maximize sensitivity (Horodysky et al. 2008). In addition to marine habitats, endemic lineages of croakers can be found in the Amazon basin of South America, various rivers and lakes of North America, and in the Mekong river in South East Asia (Sasaki 1989) (Figure 3.1b). These freshwater lineages are the result of three independent invasion events and represent an ideal system to study visual adaptation (Lo et al. 2015). The Amazonian lineage invaded freshwater around the same time as the clade of freshwater anchovies and has similarly speciated throughout the many different rivers of the Amazon basin (Lo et al. 2015; Bloom and Lovejoy 2017). However, many life history traits differ between croakers and anchovies. Croakers tend to inhabit deeper water environments and are more active predators, relying more heavily on vision for detecting prey than the primarily pelagic filter-feeding anchovies (Figure 3.1a) (Deary et al. 2016). Because the spectrum of available light shifts progressively in opposite directions with depth in marine and freshwater environments, deeper dwelling species might be under even stronger selection pressures. This increased strength of selection might be compounded in active predators, which rely more heavily on vision for detecting prey (Deary et al. 2016).

We estimate dN/dS using models of codon evolution in PAML and HYPHY to identify sites undergoing positive selection in croaker rhodopsin (Pond et al. 2005; Yang 2007), and test for bursts in dN/dS on branches representing marine to freshwater transitions on a phylogeny reconstructed for 114 species (Yang and Nielsen 2002; Bielawski and Yang 2004; Kosakovsky Pond et al. 2011). We also reconstruct the amino acid sequences of the nodes bounding the transitional branch into South American rivers. We investigate shifts in functional properties of rhodopsin concomitant with the marine to freshwater transition in croakers by reconstructing the ancestral nodes bounding this transition and experimentally expressing the pigments in the laboratory. We used spectroscopic assays to characterize the spectral sensitivities of these ancestral croaker pigments. We also experimentally characterize the effects of specific substitutions along the transitional branch revealing any functional differences arising during the marine to freshwater transition event. This represents the first experimental investigation of ancestral protein sequences involved in the adaptation to freshwater environments by ancestrally marine species.

3.3. METHODS

3.3.1. Sequencing and Sequence Alignment

DNA was extracted from 43 tissue samples preserved in ethanol, representing 36 species, using a QIAGEN DNeasy kit (Qiagen Inc, Santa Clara CA, USA). Extracted DNA was used as a template for Polymerase Chain Reaction (PCR) amplification of four nuclear loci, the Recombination activating gene 1 (RAG1), Early growth response protein 1 and 2 (EGR1 and EGR2), and rhodopsin, as well as two mitochondrial loci, Cytochrome c oxidase subunit I (COI) and Cytochrome b (Cytb). Primer sequences and thermocycler conditions for each gene are published in Chen et al. (2003) and Lo et al. (2015). Amplified DNA was purified using ExoSAP-IT PCR Product Cleanup Reagent (Applied Biosystems, Foster City, CA, USA). Purified PCR products were sequenced by Sanger sequencing at the Hospital for Sick Children TCAG Sequencing Facility. Chromatograms were compared for forward and reverse reads in Geneious (Kearse et al. 2012) and only high-quality sequences were kept. Supplementary table S3.1 lists all sequences generated in this study. These sequences were combined with sequences for the same six molecular markers generated in a previous phylogenetic reconstruction of croakers (Lo et al. 2015) to form a dataset of 139 taxa (Supplementary table S3.1). Sequences for each gene in the final dataset were aligned using MUSCLE (Edgar 2004). Terminal gaps were removed when present in more than half of the samples. We concatenated the alignments of all six molecular markers for a total dataset length of 6460 bp.
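The trimming and concatenation steps described above can be sketched in a few lines. This is an illustration only, not the procedure used in the study: the function names, the gap and missing-data characters, and the reading of "terminal gaps" as leading or trailing alignment columns in which more than half of the sequences carry a gap are all assumptions made for the example.

```python
def trim_terminal_gaps(alignment, gap="-", max_gap_fraction=0.5):
    """Drop leading/trailing columns where more than max_gap_fraction
    of the sequences have a gap. `alignment` maps taxon -> aligned string."""
    seqs = list(alignment.values())
    length = len(seqs[0])

    def mostly_gap(col):
        gaps = sum(1 for s in seqs if s[col] == gap)
        return gaps / len(seqs) > max_gap_fraction

    start = 0
    while start < length and mostly_gap(start):
        start += 1
    end = length
    while end > start and mostly_gap(end - 1):
        end -= 1
    return {name: s[start:end] for name, s in alignment.items()}


def concatenate(alignments, missing="?"):
    """Join per-gene alignments into one supermatrix; taxa lacking a gene
    are padded with the (assumed) missing-data character."""
    taxa = sorted(set().union(*alignments))
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        width = len(next(iter(aln.values())))
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy data: two short "genes", one missing for sp_C.
gene1 = {"sp_A": "--ATGC-", "sp_B": "-TATGCA", "sp_C": "--ATGC-"}
gene2 = {"sp_A": "GG", "sp_B": "GA"}
supermatrix = concatenate([trim_terminal_gaps(gene1), trim_terminal_gaps(gene2)])
```

Internal gaps are left untouched in this sketch; only the alignment ends are trimmed.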

3.3.2. Phylogenetic Reconstructions

PartitionFinder (Lanfear et al. 2016) was used to estimate the best fitting model of molecular evolution for each gene and the optimal partitioning scheme for the concatenated dataset (Supplementary table S3.2). A Bayesian phylogeny was generated with MrBayes 3.2 (Supplementary figure S3.1) (Ronquist et al. 2012), sampling from an MCMC chain every 1000 steps from two independent runs of 5,000,000 generations and discarding the first 25% of samples as burn-in. RAxML (Stamatakis 2014) was used to reconstruct maximum likelihood phylogenies of the concatenated dataset and of a rhodopsin-only dataset, with 1000 bootstrap replicates to obtain support for nodes (Supplementary figures S3.2-3). Some topological differences were observed and are discussed further in the supplementary materials. We ran our analyses of molecular evolution on all three topologies.

3.3.3. Molecular Evolutionary Analyses

To model the selection pressures acting on croaker rhodopsin, we employed codon models of molecular evolution implemented in PAML (Yang 2007) and HYPHY (Pond et al. 2005), estimating the ratio of non-synonymous to synonymous substitution rates (dN/dS). We compared random-sites models in PAML that allow for a positively selected site class (dN/dS > 1), M2a and M8, with nested null models without a positively selected site class (dN/dS ≤ 1), M1a, M7 and M8a, using likelihood ratio tests (LRT) to obtain a measure of support for the additional positively selected site class parameter (Yang 1998; Yang 2000). We also used FUBAR (HYPHY) to estimate positive selection on rhodopsin. FUBAR independently estimates the rate of synonymous substitutions and allows for greater flexibility in dN/dS site class parameter estimates (Murrell et al. 2013).
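As an illustration of the likelihood ratio tests described above, the following sketch computes a test statistic and p-value from two log-likelihoods. It is not part of PAML or HYPHY. It assumes the usual convention that the statistic is twice the log-likelihood difference between nested models, compared against a chi-square distribution whose degrees of freedom equal the number of additional parameters. The function names and the log-likelihood values in the usage lines are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form base cases: df = 1 (odd series) or df = 2 (even series).
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    # Recurrence: Q(x; k+2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a pair of nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods chosen to give a statistic of 9.6 with df = 1.
stat, p = likelihood_ratio_test(lnl_null=-5000.0, lnl_alt=-4995.2, df=1)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

A statistic of zero or below, which can occur when the alternative model fails to converge, is reported here with a p-value of 1.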

We used branch-site and clade models in PAML to test for episodic bouts of positive and divergent selection on a priori selected branches, herein referred to as “foreground” branches. The Branch-Sites model was used to compare support for a positively selected class of sites on branches representing the three freshwater invasion events (Figure 3.2ab), and on each ecological partition in the New World Clade (Figure 3.2ab), with a null model constraining the dN/dS estimate on these branches to one. Clade models C and D (CmC and CmD) were also used to test for episodic shifts in selection pressures on these branches, with comparison to the nested null models M2aREL and M3 respectively (Bielawski and Yang 2004; Weadick and Chang 2011). As a complement to the models in PAML that allow for different rates across a priori selected branches, we employed the aBSREL model in HYPHY, which estimates dN/dS on each branch of the phylogeny without any a priori input (Kosakovsky Pond et al. 2011). These tests were replicated on a control gene dataset that was pruned to contain only species with sequence data for each of the nuclear genes included in this study.

PAML was also used to reconstruct the ancestral characters under the codon models described above and under the Dayhoff, JTT and WAG empirical amino acid models of molecular evolution. For all PAML analyses the specific sites under positive selection were inferred by Bayes Empirical Bayes analysis when available or Naïve Empirical Bayes when not (Nielsen and Yang 1998; Yang 2005). We visualized the location of positively selected sites and of substitutions observed along transitional lineages on the crystal structures of rhodopsin in its dark state (1U19) and active state (3PQR), and on homology models of these structures built from ancestral croaker rhodopsin sequences using the Modeller plug-in for UCSF Chimera (Pettersen et al. 2004).

3.3.4. Protein expression and functional characterization

Rhodopsin coding sequences representing the ancestrally reconstructed rhodopsin sequences at the nodes bounding the marine to freshwater transition in South America were synthesized using GeneArt (Invitrogen). To maximize the yield of protein expressed in human HEK293T cells we converted codon identities in the croaker sequences to adhere to human codon biases where possible. Because most sequences available for croakers only span the seven-transmembrane region of rhodopsin, we completed the N and C terminal segments of the gene with human sequence. We do not expect this to have a significant effect on our analyses because the same sequence was used in both the marine and freshwater constructs and these parts of the protein are not expected to affect the spectral tuning or kinetic properties of rhodopsin. The first two codons and the last codon in the sequence were converted to match 5’ and 3’ restriction sites for insertion into the P1D3-hrGFP II expression vector (Morrow and Chang 2010). Because expression levels of these ancestral pigments were not sufficient for kinetic assays, we also expressed bovine (Bos taurus) rhodopsin, using the complete coding sequence in the pJET1.2 cloning vector (ThermoFisher Scientific) as described in a previous study (Castiglione et al. 2017). Site-directed mutagenesis primers were designed to introduce, via PCR (QuikChange II, Agilent), single amino acid substitutions converting the amino acid identities in bovine rhodopsin at sites 119, 122, 124 and 261 to match the marine and freshwater croaker sequences. All sequences were verified using a 3730 DNA Analyzer (Applied Biosystems) at the Centre for Analysis of Genome Evolution and Function (CAGEF) at the University of Toronto.

All croaker and bovine rhodopsin sequences were transferred to the pIRES-hrGFP II expression vector (Stratagene) for subsequent transient transfection of HEK293T cells (8 µg per 10 cm plate) using Lipofectamine 2000 (Invitrogen). Media was changed after 24 hours, and cells were harvested 48 hours post-transfection. Cells were washed twice with harvesting buffer (PBS, 10 µg/mL aprotinin, 10 µg/mL leupeptin), and rhodopsins were regenerated for 2 hours in the dark with 5 µM 11-cis-retinal generously provided by Dr. Rosalie Crouch (Medical University of South Carolina). After regeneration the samples were incubated at 4 °C in solubilization buffer (50 mM Tris pH 6.8, 100 mM NaCl, 1 mM CaCl2, 1% dodecylmaltoside, 0.1 mM PMSF) for 2 hours and immunoaffinity purified overnight using the 1D4 monoclonal antibody coupled to the UltraLink Hydrazide Resin (ThermoFisher Scientific). Resin was washed three times with wash buffer 1 (50 mM Tris pH 7.0, 100 mM NaCl, 0.1% dodecylmaltoside) and twice with wash buffer 2 (50 mM sodium phosphate, 0.1% dodecylmaltoside; pH 7.0). Rhodopsins were eluted from the UltraLink resin using 5 mg/mL of a 1D4 peptide, consisting of the last 9 amino acids of bovine rhodopsin (TETSQVAPA).

The UV-visible absorption spectra of purified rhodopsin samples were recorded in the dark at 20 °C using a Cary 4000 double-beam absorbance spectrophotometer (Agilent). All peak spectral sensitivities were determined by fitting dark spectra to a standard template curve for A1 visual pigments (Govardovskii et al. 2000). Rhodopsin samples were light-activated for 30 seconds using a fiber optic lamp (Dolan-Jenner), resulting in a shift in peak spectral sensitivity to ~380 nm, characteristic of the biologically active metarhodopsin II intermediate (Van Eps et al. 2017). Pigments were also exposed to hydroxylamine (NH2OH; 50 mM) to test accessibility of the rhodopsin binding pocket before and after light activation, as previously described (van Hazel et al. 2016; Dungan et al. 2016; Sakmar et al. 1989).
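The template fitting used to estimate peak spectral sensitivity can be sketched as follows. This is a minimal illustration, not the fitting routine used in the study. It implements only the α-band of the A1 template, using the constants commonly quoted from Govardovskii et al. (2000), which should be checked against the original publication before any real use. The β-band and any baseline correction are omitted, and the function names, grid search and synthetic spectrum are assumptions made for the example.

```python
import math


def a1_alpha_band(wavelength_nm, lmax_nm):
    """A1 alpha-band template, approximately 1.0 at wavelength == lmax."""
    x = lmax_nm / wavelength_nm
    a = 0.8795 + 0.0459 * math.exp(-((lmax_nm - 300.0) ** 2) / 11940.0)
    return 1.0 / (math.exp(69.7 * (a - x))
                  + math.exp(28.0 * (0.922 - x))
                  + math.exp(-14.9 * (1.104 - x))
                  + 0.674)


def fit_lmax(wavelengths, absorbances, lo=450.0, hi=560.0, step=0.1):
    """Grid search over lmax; the amplitude is solved by least squares."""
    best = None
    for i in range(int(round((hi - lo) / step)) + 1):
        lmax = lo + i * step
        template = [a1_alpha_band(w, lmax) for w in wavelengths]
        scale = (sum(a * t for a, t in zip(absorbances, template))
                 / sum(t * t for t in template))
        sse = sum((a - scale * t) ** 2 for a, t in zip(absorbances, template))
        if best is None or sse < best[0]:
            best = (sse, lmax, scale)
    return best[1], best[2]


# Synthetic, noise-free spectrum generated from the template itself.
wavelengths = [450.0 + 2.0 * i for i in range(101)]          # 450-650 nm
spectrum = [0.08 * a1_alpha_band(w, 504.0) for w in wavelengths]
lmax, amplitude = fit_lmax(wavelengths, spectrum)
```

Fitting is restricted here to wavelengths at and above 450 nm, where the α-band dominates; real spectra would also need a decision about which portion of the curve to fit.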

Retinal release following rhodopsin photoactivation was monitored using a Cary Eclipse fluorescence spectrophotometer equipped with a Xenon flash lamp (Agilent), according to a protocol modified from previous studies (Farrens and Khorana 1995; Schafer et al. 2016). Rhodopsin samples (0.1–0.2 μM) were bleached for 30 seconds at 20 °C with a fiber optic lamp (Dolan-Jenner) using a filter to restrict wavelengths of light below 475 nm to minimize heat. Fluorescence measurements were recorded at 30-second intervals with a 2-second integration time, using an excitation wavelength of 295 nm (1.5 nm slit width) and an emission wavelength of 330 nm (10 nm slit width). There was no noticeable activation by the excitation beam prior to rhodopsin activation. This assay detects increasing fluorescence as a result of decreased quenching of intrinsic tryptophan fluorescence at W265 by the retinal chromophore (Farrens and Khorana 1995), and is a reliable proxy for tracking the decay of MII (Schafer et al. 2016). Data were fit to a three-variable, first-order exponential equation, $y = y_0 + a(1 - e^{-bx})$, and half-life values were calculated using the rate constant $b$ ($t_{1/2} = \ln 2 / b$). All curve fitting resulted in $r^2$ values greater than 0.95. Differences in retinal release half-life values were statistically assessed using a two-tailed t test with unequal variance.
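The exponential fit and half-life calculation described above can be sketched without any fitting library. The study does not name its curve-fitting software, so this is not the authors' procedure. The sketch relies on the model being linear in y0 and a once the rate constant b is fixed, so it searches over b alone (a coarse log-spaced grid followed by golden-section refinement). The function names, search bounds, time units (minutes) and synthetic data are assumptions made for the example.

```python
import math


def solve_linear_part(xs, ys, b):
    """For fixed b, least-squares y0 and a in y = y0 + a*(1 - exp(-b*x))."""
    us = [1.0 - math.exp(-b * x) for x in xs]
    n = len(xs)
    mean_u, mean_y = sum(us) / n, sum(ys) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    if suu == 0.0:                       # degenerate: curve is flat for this b
        return mean_y, 0.0, float("inf")
    a = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys)) / suu
    y0 = mean_y - a * mean_u
    sse = sum((y - (y0 + a * u)) ** 2 for u, y in zip(us, ys))
    return y0, a, sse


def fit_release_curve(xs, ys, b_lo=1e-4, b_hi=10.0, n_grid=200):
    """Return (y0, a, b, r_squared); assumes one SSE minimum near the best grid point."""
    grid = [b_lo * (b_hi / b_lo) ** (i / (n_grid - 1)) for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: solve_linear_part(xs, ys, grid[i])[2])
    lo, hi = grid[max(best - 1, 0)], grid[min(best + 1, n_grid - 1)]
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):                  # golden-section search on b
        c, d = hi - phi * (hi - lo), lo + phi * (hi - lo)
        if solve_linear_part(xs, ys, c)[2] < solve_linear_part(xs, ys, d)[2]:
            hi = d
        else:
            lo = c
    b = (lo + hi) / 2.0
    y0, a, sse = solve_linear_part(xs, ys, b)
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return y0, a, b, 1.0 - sse / sst


def half_life(b):
    """t1/2 = ln 2 / b, in the same time units as x."""
    return math.log(2.0) / b


# Synthetic, noise-free trace sampled every 30 s (0.5 min) for 60 min.
times = [0.5 * i for i in range(121)]
signal = [1.0 + 2.0 * (1.0 - math.exp(-0.05 * t)) for t in times]
y0, a, b, r2 = fit_release_curve(times, signal)
t_half = half_life(b)
```

With real, noisy traces the grid bounds would need to bracket the plausible range of rate constants for the pigment being measured.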

3.4. RESULTS

3.4.1. Positive selection in croaker rhodopsin

We investigated whether the diverse array of visual environments inhabited by croakers has placed strong selective pressures on rhodopsin by estimating dN/dS across sites in rhodopsin (Yang 2000; Yang 2007). We found a high average dN/dS estimate for the rhodopsin gene in croakers (M0: dN/dS = 0.27, Table 3.1), greater than the values observed for the other nuclear genes sequenced and than rates typical for rhodopsin in teleost fishes and for protein-coding genes in general (Fay and Wu 2003; Rennison et al. 2012). This elevated average dN/dS is the result of pervasive positive selection at 9-19 sites in rhodopsin (Table 3.2). Random-sites models in PAML including a positively selected site class fit the data significantly better than nested null models without one (M1a and M2a: χ2 = 203.6; df = 2; p < 0.00001, M7 and M8: χ2 = 202.2; df = 2; p < 0.00001, Table 3.1). Parameter estimates for the positively selected site class (dN/dS and proportion of sites) are very similar to the values reported by Schott et al. (2014) for the highly visual cichlid fishes, a model system for studying visual evolution. All analyses were conducted using the largest phylogenetic reconstruction of croakers to date (Figure 3.1c). We replicated all analyses of molecular evolution using Bayesian and maximum likelihood species trees (Supplementary tables S3.3-4), inferred using a concatenated alignment 6460 bp long comprising four nuclear gene fragments and two mitochondrial markers, as well as a maximum likelihood gene tree of rhodopsin (Supplementary figures S3.1-3).

3.4.2. Increased positive selection in rhodopsin during invasion of freshwater rivers

To investigate whether there is evidence for positive selection concurrent with transitions to red-shifted freshwater environments we utilized the branch-sites and clade models in PAML with transitional branches set as the foreground (Figure 3.2ab) (Yang and Nielsen 2002; Bielawski and Yang 2004). When partitioned separately, only the branch representing the transition into the Amazon basin of South America had a dN/dS estimate significantly different from null expectations (Branch-Site null and Branch-Site: χ2 = 9.6; df = 1; p = 0.0019, M3 and CmD: χ2 = 6.6; df = 1; p = 0.0100, Table 3.3) (Figure 3.2c). Branch-site REL, a complementary model implemented in the HYPHY package that does not require a priori partitioning of the phylogeny, also identifies this branch as being under positive selection (Kosakovsky Pond et al. 2011). Removal of freshwater lineages from the dataset had little effect on estimates of pervasive positive selection across the tree (Supplementary table S3.7), which suggests different ecological factors are driving positive selection in marine lineages and on the branch representing the Amazonian invasion.

The dN/dS estimate for the South American transitional branch was significantly higher than estimates for the marine and freshwater lineages of the New World Clade (Figure 3.2d). However, branch-sites analyses with each of the three partitions set as the sole foreground significantly supported the inclusion of a positively selected site class (Figure 3.2d). We compared each ecological partition using clade models and found that the dN/dS estimate for the transitional branch was significantly different from those of the freshwater clade and marine lineages. Models where the transitional branch was set independent of the marine and freshwater partitions fit the data as well as a more parameter-rich partitioning scheme with all three partitions independent of one another (CmD: transitional branch and CmD: transitional branch / freshwater clade: χ2 = 2.64, df = 1, p = 0.1042, Table 3.4), indicating that the marine and freshwater partitions have comparable dN/dS estimates and that these estimates are lower than that of the transitional branch.

Positive selection was not observed along the transitional branch in non-visual nuclear genes sequenced in this study (Figure 3.2e) (Supplementary tables S3.8-9). Branch-sites tests of the non-visual control genes typically converged on the null model (dN/dS ≤ 1), and positive selection was only supported when marine lineages were set as the foreground in the RAG1 dataset (Branch-Site null and Branch-Site: χ2 = 6.29, df = 1, p = 0.0121, Supplementary table S3.8).

3.4.3. Different sets of positively selected sites in marine and freshwater lineages

We investigated whether the selection pressures acting on specific sites differed in the marine lineages, the freshwater clade and on the transitional branch by inferring positively selected sites in each foreground partition using Bayes Empirical Bayes (BEB) analysis in PAML. Sites with posterior probability above 0.7 for inclusion in the positively selected site class are mostly unique to each phylogenetic partition and are distributed throughout the rhodopsin crystal structure (Figure 3.3a) (Table 3.2). An increased proportion of the positively selected sites on the transitional branch fall within 10 Å of the chromophore, the furthest distance from the chromophore of known spectral tuning sites (Bowmaker and Hunt 2006) (Figure 3.3b). These sites also undergo larger movements upon rhodopsin photoactivation, measured using the root-mean-square deviation between the dark state (1U19) and active state (3PQR) crystal structures of rhodopsin (Figure 3.3b). Positively selected sites in marine and freshwater lineages are further from the chromophore (Figure 3.3bc).

The multiple independent sets of positively selected sites within the dataset interfere with the assumptions of the frequently employed Clade model C (CmC) (Bielawski and Yang 2004). CmC did not identify any difference in dN/dS on the transitional branch, at odds with inferences using branch-sites, branch-sites REL (HYPHY), CmD, and even the overly conservative two-ratio model (M0 and two-ratio: transitional branch: χ2 = 20.0; df = 1; p < 0.0001, Supplementary table S3.10) (Bielawski and Yang 2004). CmD was also significantly better fitting than CmC when the transitional branch was set as the foreground (CmC and CmD: χ2 = 28.04, df = 1, p < 0.0001, Supplementary table S3.10). CmD fits the data better because of the increased flexibility afforded by its freely estimated second site class (fixed to equal one in CmC), which accommodates a class of positively selected sites independent of the positively selected sites identified in the divergent site class. Conversely, in CmC, positively selected sites pervasive across the phylogeny are forced into the divergent site class, masking the detection of the numerous non-synonymous substitutions along the South American transitional branch. More detail on this and on the effects of pervasive positive selection on the topology of the species tree is available in the supplementary materials (Supplementary figures S3.4-5).

3.4.4. Experimental comparison of marine and freshwater ancestral croaker rhodopsins

To investigate the substitutions and functional shifts associated with the transition from marine to freshwater in South America we reconstructed the amino acid substitutions in rhodopsin along the transitional branch. The large number of amino acid substitutions reconstructed using likelihood models was consistent with the high dN/dS estimates for the transitional branch. In total, 20 substitutions are reconstructed along the transitional branch with high posterior probability, including known red-shifting substitutions and substitutions at sites defining the kinetic properties of rhodopsin photoactivation (Chan et al. 1992; Imai et al. 1997) (Supplementary table S3.11). When scaled by amino acid substitutions, the branch length for the transitional branch is longer than all other branches in the phylogeny (Supplementary figure S3.6). This is inconsistent with branch lengths measured using the concatenated nucleotide dataset (Supplementary figures S3.1-3). Eight substitutions on this branch occur only along this branch, and an additional six substitutions occur on only one or two other branches in the phylogeny (Supplementary table S3.11).

We used heterologous protein expression and spectroscopic assays to functionally characterize the spectral sensitivity of resurrected ancestral croaker sequences representing the nodes bounding the transition into South American freshwater (Figure 3.1c). The ancestral freshwater pigment was red-shifted relative to the marine ancestor with peak spectral sensitivities of 504 nm and 498 nm respectively (Figure 3.4a) (Table 3.5). The peak spectral sensitivity of the ancestral marine croaker rhodopsin is consistent with microspectrophotometry data for marine croaker rods (Beatty 1973; Dartnall and Lythgoe 1965) (Table 3.5). Expression levels for the marine and freshwater ancestral pigments were not high enough to test the kinetic properties of the protein. Instead, we used site-directed mutagenesis to investigate the effects of the four positively selected substitutions nearest the chromophore, L119F, E122I, S124A, and F261Y, on the retinal release rate of rhodopsin (Figure 3.3b). We mutated bovine rhodopsin to match the identities observed at these sites in the marine and freshwater ancestrally reconstructed sequences. The amino acids at these sites are mostly conserved between bovine rhodopsin and the marine croaker ancestor and differ only at site 124. The ancestral freshwater sequence is identical to bovine rhodopsin at site 124 but differs at sites 119, 122 and 261 (Figure 3.4b). Compared to the ancestral croaker sequences, the spectral sensitivities of the freshwater-like and marine-like bovine pigments were slightly red- and blue-shifted respectively (506 nm and 496 nm, Table 3.5). Retinal release rates also differed in freshwater- and marine-like bovine rhodopsin. The substitutions at sites 119, 122 and 261 in the freshwater-like bovine rhodopsin roughly halved the half-life of retinal release compared to wild type bovine rhodopsin (7.44 vs. 16.08 min, Figure 3.4b), to about one third of that estimated for marine-like bovine rhodopsin (7.44 vs. 24.3 min, Figure 3.4c) (Castiglione and Chang 2018). Despite conferring a more cone-like retinal release rate, substitutions occurring during the transition into freshwater did not alter the pigment's permeability to hydroxylamine, suggesting an inaccessible binding pocket typical of most rhodopsin pigments (Figure 3.4d) (Sakmar et al. 1989).
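The half-lives above can be restated as rate constants, which makes the relative speed of retinal release easier to compare. This is a minimal sketch that assumes retinal release follows simple first-order (single-exponential) kinetics, so that k = ln(2) / t1/2; the function and variable names are illustrative and the fitting procedure used for the assays is not reproduced here.

```python
import math

def rate_constant(half_life_min):
    """First-order rate constant (per minute) from a half-life in minutes."""
    return math.log(2) / half_life_min

# Retinal release half-lives (min) reported in the text for bovine rhodopsin
half_lives = {
    "freshwater-like": 7.44,
    "wild type": 16.08,
    "marine-like": 24.3,
}
rates = {name: rate_constant(t) for name, t in half_lives.items()}

freshwater_t = half_lives["freshwater-like"]
for name, t in half_lives.items():
    ratio = freshwater_t / t
    print(f"{name}: t1/2 = {t} min, k = {rates[name]:.4f} per min, "
          f"freshwater-like half-life is {ratio:.2f}x this value")
```

Under this assumption the freshwater-like pigment releases retinal a little over twice as fast as wild type and roughly three times as fast as the marine-like pigment.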

3.5. DISCUSSION

In this study we investigate rates of molecular evolution in the dim-light sensitive rhodopsin gene of croakers during evolutionary transitions from marine to freshwater habitats, using the largest phylogenetic reconstruction of this family of fishes to date. We find rhodopsin is under pervasive positive selection, consistent with the divergent underwater environments inhabited by these visual predators. We also detect an episodic burst of positive selection on the branch representing the transition into the turbid and tannin-stained rivers of the Amazon basin. Sites positively selected on this branch differ from those positively selected across the rest of the phylogeny and appear in domains closer to the chromophore and at positions in the protein expected to have more extreme effects on the spectral sensitivity and kinetic properties of rhodopsin. In vitro spectroscopic assays of resurrected ancestral sequences bracketing this transitional event revealed a red shift in the freshwater ancestral rhodopsin, consistent with the prevailing wavelengths of light in the freshwater visual environment. Four substitutions along this branch, L119F, E122I, S124A and F261Y, recapitulate the red shift and also result in much faster retinal release rates when made in bovine rhodopsin. This shows that freshwater croakers have red-shifted rhodopsin pigments with more efficient dark adaptation, contrasting with adaptations observed in deep-sea species (Hunt et al. 1996; Dungan and Chang 2017). The divergent adaptive mechanisms in marine and freshwater fishes highlight the utility of investigating species that have undertaken major historical habitat transitions for understanding functional protein evolution.

3.5.1. Freshwater croakers have rhodopsin tuned to the underwater visual environment

Experimental characterization of the resurrected croaker rhodopsins revealed the freshwater ancestral pigment was red shifted relative to the marine ancestor. This spectral shift is primarily due to substitutions at four sites nearest the chromophore. The amino acid identities at these four sites in the marine ancestor are virtually conserved throughout the croaker dataset except for the South American freshwater croakers, where the red-shifting residues are conserved throughout the clade. In general, freshwater fishes tend to have more red-shifted visual pigments than marine fishes (MacNichol and Levine 1979). However, many of these differences in spectral sensitivity have been attributed to differences in the chromophore or the expression of a freshwater specific copy of rhodopsin (Wald 1939; Toyama et al. 2008; Nakamura et al. 2017). Swapping the standard A1 chromophore (11-cis-retinal) with the dehydrogenated derivative A2 (11-cis-3,4-dehydroretinal) red shifts rhodopsin roughly 20 nm (Allison et al. 2004). The enzyme responsible for the chromophore conversion was recently identified (Enright et al. 2015) and shown to be an ancestral vertebrate trait (Morshedian et al. 2017). This red-shifting mechanism is utilized by many freshwater species (Enright et al. 2015), including freshwater fishes where the A2 chromophore is found more frequently (Toyama et al. 2008). It has also been extensively studied in diadromous fishes making transitions from marine to freshwater environments (Beatty 1966). Diadromous eels employ another adaptive strategy to optimize vision in both marine and freshwater environments. These species express different copies of rhodopsin with specific substitutions that optimize it for either marine or freshwater visual environments (Nakamura et al. 2017).
However, Sciaenids and most other fishes express only one copy of rhodopsin in the retina and would not be able to utilize this mechanism (Nakamura et al. 2017). In addition, it is unknown whether marine-derived lineages of freshwater fishes can convert the chromophore, having had no need for this mechanism for millions of years. To date, all croakers measured have A1 pigments only (Dartnall and Lythgoe 1965; Beatty 1973; Horodysky et al. 2008). This could place even stronger selection pressures on adaptation in the rhodopsin gene and could explain the rapid rates of molecular evolution in croaker rhodopsin during the invasion of freshwater environments, but exactly how rhodopsin sequence variation functions alongside, or as an alternative to, these other mechanisms requires further investigation.

While the substantial number of substitutions along the transitional lineage suggests some complexity is involved in freshwater visual adaptation, the effects of three substitutions on the functional properties are unambiguous. Substitutions at sites 119, 124 and 261 have all been shown in previous studies to red shift the spectral sensitivity of bovine rhodopsin (Chan et al. 1992; Castiglione and Chang). We show that these substitutions have the same red-shifting effect in a croaker rhodopsin background; however, the magnitude of the spectral shift is slightly different from that observed in the bovine background. This might represent epistatic interactions with other substitutions observed along the marine to freshwater transition or differences between croakers and cows. These substitutions are all in close proximity to the chromophore but alter the spectral sensitivity through different mechanisms. The addition of an OH group in close proximity to the chromophore's beta-ionone ring by the F261Y substitution delocalizes charge following the 11-cis to all-trans isomerization. This lowers the activation energy and red shifts the peak spectral sensitivity of the visual pigment by 10 nm in bovine rhodopsin (Chan et al. 1992). Sites 119 and 124 do not directly face into the binding pocket. The leucine most frequent at site 119 points towards helix IV, and the substitution to a larger phenylalanine residue might instead shift spectral sensitivity by altering the stability of the inactive or active state structure of rhodopsin through steric interactions with neighboring helices during activation, as has been observed in other opsin proteins and other GPCRs (Zhou et al. 2012; Sekharan et al. 2013; Yamazaki et al. 2014). The peak spectral sensitivity of rhodopsin is inversely proportional to the energy barrier between these two states (Ernst et al. 2014).
The side chain of site 124 is also not directed into the binding pocket and is instead thought to alter the spectral sensitivity of rhodopsin by repositioning other residues nearer the chromophore (Morrow and Chang 2015) or by stabilization of the active or inactive state through H-bond networks (Okada et al. 2004). These substitutions ultimately result in a red-shifted rhodopsin, matching the spectral sensitivity of many other freshwater fishes. Interestingly, the identities at sites 119, 124 and 261 in the freshwater croaker clade match the identities in other more ancient freshwater lineages. Because all freshwater fishes are believed to be ancestrally marine, this represents extensive convergent evolution to the red-shifted underwater visual environment (Betancur-R et al. 2015).
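The size of the red shift between the ancestral pigments can also be expressed as a difference in photon energy. The sketch below is illustrative only: it assumes the relevant quantity is the energy of a photon at the peak wavelength, E = hc / λ, which is not the same as the activation barrier discussed above, and it uses the rounded peak sensitivities of 498 nm and 504 nm reported in the text. The function name is hypothetical.

```python
PLANCK_EV_S = 4.135667696e-15  # Planck constant in eV*s
LIGHT_M_S = 299_792_458.0      # speed of light in m/s

def photon_energy_ev(wavelength_nm):
    """Energy (eV) of a photon with the given wavelength in nanometres."""
    return PLANCK_EV_S * LIGHT_M_S / (wavelength_nm * 1e-9)

marine_nm, freshwater_nm = 498.0, 504.0  # ancestral peak sensitivities from the text
e_marine = photon_energy_ev(marine_nm)
e_freshwater = photon_energy_ev(freshwater_nm)

print(f"marine ancestor:     {e_marine:.4f} eV")
print(f"freshwater ancestor: {e_freshwater:.4f} eV")
print(f"difference:          {(e_marine - e_freshwater) * 1000:.1f} meV")
```

A shift of 6 nm at these wavelengths corresponds to a photon energy difference of a few hundredths of an electronvolt, which illustrates how small energetic changes near the chromophore can move the absorbance peak.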

Croakers have also independently invaded the freshwater lakes and rivers of North America and the Mekong river in South East Asia (Lo et al. 2015). In both cases this has resulted in monotypic lineages, contrasting with the speciation observed following the South American transition. This makes dating the transition event less accurate, but both species are thought to have invaded freshwater more recently than the South American lineage (Figure 3.1b) (Lo et al. 2015). There is no evidence for episodic bursts of positive selection in either of these species; however, a convergent red-shifting substitution is seen at site 124 in the North American species. Aplodinotus grunniens is distributed widely across North America, inhabiting clear lakes and turbid rivers. It is unlikely that a one-size-fits-all functional optimum exists for this species, and further sampling should be done to determine if intraspecific variation is present in the visual pigments of lake and river populations, as observed in other groups of fishes (Marques et al. 2017; Schott et al. 2014). Additionally, Boesemania microlepis inhabits estuaries as well as the Mekong river and may not have fully adapted to freshwater visual environments (Sheaves et al. 2008).

3.5.2. Rates of molecular evolution and distribution of positively selected sites differ in marine and freshwater lineages

Using models of molecular evolution that allow for variation across sites and phylogenetic partitions revealed that rates of molecular evolution and the sites included in the positively selected site class are not the same on the transitional branch, in marine lineages, and in the freshwater clade. This suggests that different selection pressures are acting on rhodopsin in each of these environments. Positive selection is most significant along the transitional branch, where the number of amino acid substitutions is greater than on any other branch in the croaker phylogeny. In general, positively selected sites in the marine and freshwater lineages are not expected to have as significant an effect on the spectral sensitivity of the protein as the positively selected sites observed along the transitional branch. These sites are positioned further from the chromophore than those detected on the transitional branch. However, spectral tuning substitutions have been observed outside the chromophore binding pocket (Sekharan et al. 2013), and substitutions that alter other aspects of rhodopsin structure and function have been shown to be adaptive in dim-light environments (Sugawara et al. 2010). These non-spectral adaptations are often observed in clades of aquatic species inhabiting a wide range of depths illuminated by vastly different intensities of light (Schott et al. 2014; Dungan et al. 2016). Marine croakers also inhabit a wide array of ecological niches. Some croakers are benthic specialists, with many morphological and physiological adaptations to life near the substrate, whereas other species are more pelagic, and previous studies have shown gross morphological and functional differences in the visual performance of croakers inhabiting different depths (Horodysky et al. 2008). These differences likely translate to the molecular level, making positively selected sites in marine croakers tantalizing targets for future functional characterization.
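The limited overlap between partitions can be checked directly from the branch-site (BEB) site lists for the New World dataset in Table 3.2. The sketch below is a simple set comparison, with dictionary keys chosen here for illustration; it uses only sites reported with posterior probability above 0.7.

```python
from itertools import combinations

# Branch-site (BEB) positively selected sites, New World dataset (Table 3.2)
sites = {
    "marine lineages": {82, 162, 165, 214, 218},
    "freshwater clade": {50, 52, 165, 173},
    "transitional branch": {63, 119, 122, 124, 158, 159, 169, 214, 261, 282, 304},
}

# Pairwise intersections show which sites are shared between partitions
shared = {}
for a, b in combinations(sites, 2):
    shared[(a, b)] = sites[a] & sites[b]
    print(f"{a} / {b}: shared sites = {sorted(shared[(a, b)])}")

# Sites found in exactly one partition
all_sites = set().union(*sites.values())
unique = {s for s in all_sites if sum(s in v for v in sites.values()) == 1}
print(f"{len(unique)} of {len(all_sites)} sites are unique to a single partition")
```

Only sites 165 and 214 appear in more than one partition, and none of the sites nearest the chromophore on the transitional branch (119, 122, 124, 261) are shared with the marine or freshwater partitions.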

Spectral tuning during marine to freshwater transitions might also depend on depth. We observe much stronger evidence for positive selection in croakers than in our previous analysis of anchovies that invaded the same South American rivers at a comparable time (Bloom and Lovejoy 2017). The number of sites included in the positively selected site class is greater for croakers and the dN/dS estimate for the positively selected site class is higher. In addition, we see no evidence for positive selection in marine anchovies (Van Nynatten et al. 2015), and in contrast with croakers, substitutions expected to shift the spectral sensitivity of rhodopsin are not found on the transitional branch but are unevenly distributed throughout the freshwater clade of anchovies (Van Nynatten et al. 2015). Croakers are generally deeper dwelling than the primarily pelagic anchovies, inhabiting depths where the wavelengths of light illuminating the environment would be more blue or red shifted in marine and freshwater environments respectively. They are also more active predators and rely heavily upon vision for detecting prey (Deary et al. 2016), while anchovies are mostly filter feeders. These different life history traits may place croaker rhodopsin under stronger selection pressures than anchovy rhodopsin and could explain why an elevated dN/dS is observed along the transitional branch in croakers but is more pervasive within the freshwater clade of anchovies. The reduction in apparent stochasticity due to the heightened selection pressures might be even more apparent with regard to red-shifting substitutions in rhodopsin, following a similar trend to the dN/dS estimates in croakers and anchovies, because the small number of substitutions capable of conferring the desired functional effect will take longer to become fixed when selection pressures are weaker. These ecological differences might also explain why positive selection is observed in marine croakers and not anchovies, since marine croakers are expected to be under stronger selection pressures for optimizing rhodopsin function to different marine environments as well. Understanding how ecological diversity influences the selection pressures acting on protein evolution is key to explaining functional divergence in proteins like rhodopsin, where an expanding set of the protein’s properties are being implicated in adaptation to different light environments (Jastrzebska et al. 2016; Castiglione et al. 2017).

Diverse selection pressures can also influence the detection of positive selection, and multiple independent classes of positively selected sites interfere with some clade models in PAML. The lack of overlap between positively selected sites in marine lineages and sites under divergent positive selection on the transitional branch contradicts some assumptions of the frequently used Clade model C (CmC) in PAML. In this model, a single site class is devoted both to positively selected sites and to sites with divergent selection pressures in two or more partitions identified in the data (Bielawski and Yang 2004). This does not pose a problem if all, or even the majority, of positively selected sites are a result of the divergent selection observed along the selected partition. However, if two largely independent classes of positively selected sites are observed, as is the case in the croaker dataset, this site class must try to accommodate both. In our croaker rhodopsin dataset this results in an inflated dN/dS estimate for the divergent site class parameter, relegating many of the non-synonymous substitutions along the transitional branch into the middle site class. This issue can be circumvented by using the more flexible but less frequently used clade model D (CmD), originally suggested to be more realistic than CmC, which freely estimates dN/dS values for all three site classes (Bielawski and Yang 2004). For our dataset, CmD is consistently better fitting than CmC, and inferences of positive selection along the transitional branch are consistent with other analyses in PAML and HYPHY. Removing outlier sites under very high rates of positive selection improved the consistency of parameter estimates for CmC with other models of molecular evolution, but CmD still provides a better fit, likely due to its greater flexibility and ability to accommodate the remaining pervasively positively selected sites.
These results suggest more flexible models of molecular evolution might be required as datasets become larger and span greater phylogenetic and ecological diversity.

3.5.3. Faster dark adaptation in freshwater croakers and ecological implications of substitutions on the transitional lineage

We also investigated the effects of substitutions along the transitional branch on non-spectral properties of rhodopsin. Specifically, we tested if these substitutions alter the kinetic rate of rhodopsin activation. Rhodopsin kinetics might be equally important for visual adaptation, especially in species inhabiting dimly lit environments (Sugawara et al. 2010). The E122I substitution drastically decreases the Meta-II stability of chicken rhodopsin (Imai et al. 1997), and we show that this substitution has the same effect on Meta-II stability in a red-shifted freshwater-like bovine rhodopsin. Many possible molecular mechanisms have been proposed for the evolutionary significance of substitutions at site 122. Accelerating the rate of Meta-II formation and increasing its duration have been implicated in adaptation to dim-light environments in a number of deep-water species, where it is hypothesized to increase signal amplification following rhodopsin activation (Hauser et al. 2017; Dungan and Chang 2017). Alternatively, in other environments faster dark adaptation, making rods more cone-like, might be adaptive as it would allow rods to recover faster from a light bleach. This might be critical in freshwater, where descending only a few meters in the water column results in a major change in the intensity of light. Ambient light levels decrease an order of magnitude more quickly in most freshwater rivers than in marine waters (Crampton 2007). Transitions from intensities of light that would totally bleach rhodopsin to intensities where rhodopsin is the only pigment sensitive enough to detect light would be experienced much more frequently by freshwater fishes, resulting in temporary blindness. In contrast, marine fishes would have to traverse tens or hundreds of meters vertically to experience a similar change in light intensity.
As observed for the red-shifting substitutions at sites 119, 124 and 261, the importance of an E122I substitution in the visual performance of freshwater fishes is supported by its presence in three other diverse and ancient lineages of benthic freshwater fishes. This suggests that it is not only the optimal spectral sensitivity that differs between marine and freshwater environments, where the optimal properties are opposite to one another, but that the optimal kinetic properties of rhodopsin also differ in dim marine and freshwater environments.
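The contrast in vertical light gradients described above can be illustrated with a simple exponential attenuation model. This is a sketch under stated assumptions: light intensity is taken to decay as I(z) = I0 * exp(-K * z), and the two attenuation coefficients are hypothetical values chosen only to differ by one order of magnitude; they are not measurements from this study or from Crampton (2007).

```python
import math

def depth_for_fraction(fraction, k_per_m):
    """Depth (m) at which light falls to `fraction` of its surface intensity,
    assuming exponential attenuation I(z) = I0 * exp(-k * z)."""
    return math.log(1.0 / fraction) / k_per_m

K_MARINE = 0.1  # hypothetical attenuation coefficient (per m), clear marine water
K_RIVER = 1.0   # hypothetical coefficient (per m), ten-fold higher for a turbid river

for fraction in (0.1, 0.01, 0.001):
    marine_depth = depth_for_fraction(fraction, K_MARINE)
    river_depth = depth_for_fraction(fraction, K_RIVER)
    print(f"{fraction:>6.1%} of surface light: "
          f"marine {marine_depth:5.1f} m, river {river_depth:4.1f} m")
```

With these illustrative coefficients, a thousand-fold drop in intensity takes about 7 m in the river but about 69 m in marine water, in line with the few meters versus tens of meters contrast drawn in the text.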

3.6. TABLES

Table 3.1. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset using the Bayesian species tree

| Model | np | lnL | Parameter estimates: dN/dS (proportion of sites) | Null | LRT | p value |
|---|---|---|---|---|---|---|
| M0 | 225 | -7129.66 | 0.27(1.00) | | | |
| M1a | 226 | -6627.71 | 0.02(0.85) 1.00F(0.15) | M0 | 1003.91 | 0.0000 |
| M2a | 228 | -6525.91 | 0.02(0.84) 1.00F(0.13) 4.53(0.04) | M1a | 203.60 | 0.0000 |
| M7 | 226 | -6618.10 | β (p = 0.01, q = 0.04) | | | |
| M8a | 227 | -6608.15 | β (p = 0.13, q = 2.94), 1.00F(0.12) | | | |
| M8 | 228 | -6517.02 | β (p = 0.06, q = 0.38), 4.16(0.04) | M7 | 202.16 | 0.0000 |
| | | | | M8a | 182.26 | 0.0000 |

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result

Table 3.2. Positively selected sites in different tests of positive selection

| Dataset (Bayesian phylogeny) | Test | dN/dS | Positively selected sites |
|---|---|---|---|
| Full rhodopsin dataset | M2a (BEB) | 4.53 | 82, 162, 165, 169, 173, 209, 214, 218, 262 |
| Full rhodopsin dataset | M8 (BEB) | 4.16 | 33, 82, 112, 162, 165, 169, 173, 209, 214, 218, 262, 299 |
| Full rhodopsin dataset | FUBAR | | 33, 37, 39, 40, 112, 115, 151, 165, 168, 173, 209, 214, 218, 256, 263, 266, 271, 290, 299 |
| Full rhodopsin dataset | Branch-Site (BEB), South American marine to freshwater transitional branch as foreground | 0.02/18.42, 1.00F/18.42 | 119, 122, 159, 261, 282 |
| Dataset with freshwater lineages removed | M2a (BEB) | 4.50 | 82, 112, 162, 165, 169, 209, 214, 218, 262 |
| Dataset with freshwater lineages removed | M8 (BEB) | 4.31 | 33, 82, 112, 162, 165, 169, 173, 209, 214, 218, 262, 299 |
| New World only dataset | M2a | | 162, 165, 214, 218 |
| New World only dataset | Branch-Site (BEB), marine lineages as foreground | 0.02/9.53, 1.00F/9.53 | 82, 162, 165, 214, 218 |
| New World only dataset | Branch-Site (BEB), freshwater clade as foreground | 0.02/10.75, 1.00F/10.75 | 50, 52, 165, 173 |
| New World only dataset | Branch-Site (BEB), marine to freshwater transitional branch as foreground | 0.02/23.52, 1.00F/23.52 | 63, 119, 122, 124, 158, 159, 169, 214, 261, 282, 304 |

Note: dN/dS estimates for the positively selected site class reported for each model (background/foreground); F, dN/dS estimate fixed at reported value; all sites reported have posterior probabilities greater than 0.7

Table 3.3. Branch-sites and clade model analyses (PAML) of the world-wide croaker rhodopsin dataset using the Bayesian species tree

| Model | np | lnL | Parameter estimates: dN/dS (proportion of sites); background dN/dS / foreground dN/dS | Null | LRT | p value |
|---|---|---|---|---|---|---|
| M2a_rel | 228 | -6525.91 | 0.02(0.84) 1.00F(0.13) 4.53(0.04) | | | |
| M3 | 229 | -6525.91 | 0.02(0.83) 0.99(0.13) 4.51(0.04) | | | |
| All freshwater transitions as foreground | | | | | | |
| Br-site alt | 228 | -6620.94 | 0.02(0.82) 1.00F(0.14) 0.02/5.62(0.04) 1.00F/5.62(0.01) | Br-site null | 4.98 | 0.0256 |
| Br-site null | 227 | -6623.43 | 0.02(0.79) 1.00F(0.13) 0.02/1.00F(0.07) 1.00F/1.00F(0.01) | | | |
| CmC | 229 | -6525.88 | 0.02(0.84) 1.00F(0.13) 4.55/4.12(0.04) | M2a_rel | 0.05 | 0.8168 |
| CmD | 230 | -6523.47 | 0.04(0.86) 8.81(0.01) 1.44/3.95(0.13) | M3 | 4.88 | 0.0272 |
| North American transition as foreground | | | | | | |
| Br-site alt | 228 | -6627.46 | 0.02(0.83) 1.00F(0.15) 0.02/1.42(0.02) 1.00F/1.42(0.00) | Br-site null | 0.01 | 0.9031 |
| Br-site null | 227 | -6627.47 | 0.02(0.82) 1.00F(0.15) 0.02/1.00F(0.03) 1.00F/1.00F(0.00) | | | |
| CmC | 229 | -6525.39 | 0.02(0.84) 1.00F(0.13) 4.55/2.16(0.04) | M2a_rel | 1.03 | 0.3100 |
| CmD | 230 | -6525.14 | 0.02(0.83) 4.42(0.04) 0.96/1.94(0.13) | M3 | 1.54 | 0.2143 |
| Asian transition as foreground | | | | | | |
| Br-site alt | 228 | -6627.71 | 0.02(0.85) 1.00F(0.15) 0.02/1.00(0.00) 1.00F/1.00(0.00) | Br-site null | 0.00 | 1.0000 |
| Br-site null | 227 | -6627.71 | 0.02(0.85) 1.00F(0.15) 0.02/1.00F(0.00) 1.00F/1.00F(0.00) | | | |
| CmC | 229 | -6524.26 | 0.02(0.84) 1.00F(0.13) 4.56/0.00(0.04) | M2a_rel | 3.30 | 0.0693 |
| CmD | 230 | -6525.69 | 1.00(0.13) 4.52(0.04) 0.03/0.00(0.84) | M3 | 0.44 | 0.5071 |
| South American transition as foreground | | | | | | |
| Br-site alt | 228 | -6617.67 | 0.02(0.82) 1.00F(0.14) 0.02/18.42(0.04) 1.00F/18.42(0.01) | Br-site null | 9.64 | 0.0019 |
| Br-site null | 227 | -6622.49 | 0.02(0.75) 1.00F(0.13) 0.02/1.00F(0.10) 1.00F/1.00F(0.02) | | | |
| CmC | 229 | -6524.93 | 0.02(0.84) 1.00F(0.13) 4.36/9.67(0.04) | M2a_rel | 1.95 | 0.1622 |
| CmD | 230 | -6522.59 | 0.03(0.85) 6.86(0.02) 1.30/6.42(0.13) | M3 | 6.63 | 0.0100 |

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table 3.4. Branch-sites and clade model analyses (PAML) of the New World clade croaker rhodopsin dataset using the Bayesian species tree

| Model | np | lnL | Parameter estimates: dN/dS (proportion of sites); background dN/dS / foreground dN/dS | Null | LRT | p value |
|---|---|---|---|---|---|---|
| Unpartitioned null models | | | | | | |
| m0 | 92 | -3535.69 | 0.24(1.00) | | | |
| M2a_rel | 95 | -3328.05 | 0.02(0.84) 1.00F(0.14) 8.99(0.01) | | | |
| M3 | 96 | -3326.68 | 0.03(0.86) 1.37(0.13) 13.34(0.01) | | | |
| Freshwater clade and transitional branch | | | | | | |
| Two-ratio | 93 | -3533.54 | 0.22/0.36(1.00) | m0 | 4.30 | 0.0381 |
| Br-site alt | 95 | -3360.78 | 0.02(0.84) 1.00F(0.12) 0.02/6.28(0.04) 1.00/6.28(0.01) | Br-site null | 15.92 | 0.0001 |
| Br-site null | 94 | -3368.74 | 0.01(0.81) 1.00F(0.12) 0.01/1.00F(0.07) 1.00/1.00F(0.01) | | | |
| CmC | 96 | -3327.21 | 0.02(0.84) 1.00F(0.14) 9.81/5.42(0.02) | M2a | 1.68 | 0.1949 |
| CmD | 97 | -3323.74 | 0.01(0.80) 4.28(0.04) 0.40/1.44(0.16) | M3 | 5.88 | 0.0153 |
| Transitional branch | | | | | | |
| Two-ratio | 93 | -3525.69 | 0.22/2.24(1.00) | m0 | 20.0 | 0.0000 |
| Br-site alt | 95 | -3362.17 | 0.02(0.82) 1.00F(0.12) 0.02/23.52(0.05) 1.00/23.52(0.01) | Br-site null | 12.66 | 0.0004 |
| Br-site null | 94 | -3368.50 | 0.01(0.70) 1.00F(0.11) 0.01/1.00F(0.16) 1.00/1.00F(0.03) | | | |
| CmC | 96 | -3328.03 | 0.02(0.84) 1.00F(0.14) 8.94/10.95(0.01) | M2a | 0.04 | 0.8415 |
| CmD | 97 | -3317.23 | 0.00(0.77) 4.12(0.05) 0.39/6.91(0.19) | M3 | 18.9 | 0.0000 |
| Freshwater clade only | | | | | | |
| Two-ratio | 93 | -3535.64 | 0.25/0.23(1.00) | m0 | 0.10 | 0.7518 |
| Br-site alt | 95 | -3366.34 | 0.02(0.85) 1.00F(0.14) 0.02/10.75(0.01) 1.00/10.75(0.00) | Br-site null | 13.42 | 0.0002 |
| Br-site null | 94 | -3373.05 | 0.02(0.84) 1.00F(0.14) 0.02/1.00F(0.02) 1.00/1.00F(0.00) | | | |
| CmC | 96 | -3327.01 | 0.02(0.84) 1.00F(0.14) 9.73/4.93(0.01) | M2a | 2.08 | 0.1492 |
| CmD | 97 | -3326.67 | 0.03(0.86) 13.32(0.01) 1.35/1.42(0.13) | M3 | 0.02 | 0.8875 |
| Marine lineages only | | | | | | |
| Br-site alt | 95 | -3337.09 | 0.02(0.83) 1.00F(0.15) 0.02/9.53(0.02) 1.00/9.53(0.00) | Br-site null | 74.26 | 0 |
| Br-site null | 94 | -3374.22 | 0.02(0.85) 1.00F(0.15) 0.02/1.00F(0.00) 1.00/1.00F(0.00) | | | |
| Transitional branch and freshwater clade (separate foregrounds) | | | | | | |
| Two-ratio | 94 | -3525.68 | 0.21/2.23/0.23(1.00) | m0 | 20.02 | 0.0000 |
| CmC | 97 | -3327.01 | 0.02(0.84) 1.00F(0.14) 9.74/9.48/4.93(0.01) | M2a | 2.08 | 0.1492 |
| CmD | 98 | -3315.91 | 0.01(0.78) 4.19(0.05) 0.36/8.21/0.72(0.17) | M3 | 21.54 | 0.0000 |
| | | | | CmD, freshwater clade and transitional branch | 15.66 | 0.0000 |
| | | | | CmD, transitional branch | 2.64 | 0.1042 |
| | | | | CmD, freshwater clade only | 21.52 | 0.0000 |

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table 3.5. Peak spectral sensitivity of croaker rhodopsin

| Pigment | Spectral sensitivity of rhodopsin (nm) |
|---|---|
| Bovine rhodopsin | 498.7, 498.8 |
| Bovine rhodopsin (A124S) | 496.4 (Castiglione and Chang 2018) |
| Bovine rhodopsin (L119F, E122I, F261Y) | 505.6, 505.7, 505.6 |
| Ancestral marine croaker rhodopsin | 498.1, 497.3 |
| Ancestral freshwater croaker rhodopsin | 503.8, 504.5 |
| Published MSP estimates of croaker rhodopsin | |
| Cynoscion regalis | 496.0 (Beatty 1973) |
| Micropogon undulatus | 496.5 (Beatty 1973) |
| Leiostomus xanthurus | 499.0 (Beatty 1973) |
| Sciaena umbra | 501.0 (Dartnall and Lythgoe 1965) |

3.7. FIGURES

Figure 3.1. Phylogenetic reconstruction and ecological distribution of croakers. a) Schematic representation of underwater light environments based on data from (Jerlov 1976) and (Costa et al. 2012). Violin plot of mid-point depth values for marine croakers bisected at the median value. b) Divergence time estimates for invasions into freshwater from (Lo et al. 2015). c) Bayesian phylogeny used for PAML analyses. Branch lengths scaled by the number of substitutions per codon. Asterisk indicates the branch supported to be under episodic positive selection by Branch-site REL (HYPHY), two-ratio, branch-sites, and clade model D in PAML.

Figure 3.2. Tests for positive selection on rhodopsin associated with habitat transitions. a) Cladogram representation of croaker species tree. Transitions into freshwater, set as the foreground, enumerated and represented as purple arrows. Grey box represents the New World clade of croakers. b) Partitioning schemes tested using branch-sites and clade models in PAML coloured to match cladogram. Coloured bars represent difference in foreground dN/dS and background dN/dS estimates and are coloured to match the foreground partition. Partition (i) sets all transitional branches as the foreground. Partitions (ii-iv) set each transitional branch as an independent foreground lineage. Partition (v) sets both the transitional branch and South American freshwater clade as the foreground and is equivalent to (viii) in CmD. c) Branch-sites (Br-site) (top) and CmD (bottom) estimates for marine to freshwater transitional branches in the world-wide croaker rhodopsin dataset. d) Branch-sites (top) and CmD (bottom) estimates for marine lineages, the freshwater clade, and the transitional branch in the New World clade croaker rhodopsin dataset. e) Branch-sites and CmD results for non-visual control genes.

Figure 3.3. Habitat specificity of positively selected sites in croaker rhodopsin. a) Positively selected sites on transmembrane helices shown on the rhodopsin dark state crystal structure (1U19) looking down from the intradiscal face. b) Distance from the chromophore and root mean square deviation (RMSD) between the dark (1U19) and light (3PQR) state rhodopsin crystal structures for each positively selected site. Histograms show the number of positively selected sites in each partition within 10 angstroms of the chromophore, the maximum distance from the chromophore of known spectral tuning substitutions in rhodopsin (Bowmaker and Hunt 2006). c) Positively selected sites on each transmembrane helix shown arranged in order with the chromophore to the left. Sites under positive selection in more than one partition are indicated by an asterisk of the same colour as the secondary partition.

Figure 3.4. Functional characterization of substitutions along the transitional branch. a) Estimates of the spectral sensitivity of the marine (blue) and freshwater (yellow) ancestral croaker rhodopsin pigments expressed in vitro. b) Retinal release rate for bovine rhodopsin and bovine rhodopsin with substitutions matching the marine and freshwater croaker sequences at sites 119, 122, 124, and 261. Marine substitutions previously published in (Castiglione and Chang). c) Retinal release rates for bovine rhodopsin with substitutions making the pigment match that of marine and freshwater ancestors. d) Absorbance values of the ancestral freshwater pigment at its peak spectral sensitivity (504 nm) and the peak spectral sensitivity of the retinal oxime (360 nm) over a time series after treatment with hydroxylamine.

3.8. REFERENCES

Allison WT, Haimberger TJ, Hawryshyn CW, Temple SE. 2004. Visual pigment composition in zebrafish: Evidence for a rhodopsin–porphyropsin interchange system. Vis. Neurosci. 21:945–952.

Amemiya CT, Alföldi J, Lee AP, Fan S, Philippe H, MacCallum I, Braasch I, Manousaki T, Schneider I, Rohner N. 2013. The African coelacanth genome provides insights into tetrapod evolution. Nature 496:311.

Beatty DD. 1966. A Study Of The Succession Of Visual Pigments In Pacific Salmon (Oncorhynchus). Can. J. Zool. 44:429–455.

Beatty DD. 1973. Visual pigments of several species of teleost fishes. Vision Res. 13:989–992.

Betancur-R R, Ortí G, Pyron RA. 2015. Fossil-based comparative analyses reveal ancient marine ancestry erased by extinction in ray-finned fishes. Ecol Lett 18:441–450.

Bielawski JP, Yang Z. 2004. A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution. J. Mol. Evol. 59:1–12.

Bloom DD, Lovejoy NR. 2017. On the origins of marine-derived freshwater fishes in South America. J. Biogeogr. 44:1927–1938.

Bowmaker JK, Hunt DM. 2006. Evolution of vertebrate visual pigments. Curr. Biol. 16:R484–R489.

Carrete Vega G, Wiens JJ. 2012. Why are there so few fish in the sea? Proc. Biol. Sci. 279:2323–2329.

Castiglione GM, Chang BS. Functional trade-offs and environmental variation determined ancient trajectories during the evolution of dim-light vision. eLife. In press

Castiglione GM, Hauser FE, Liao BS, Lujan NK, Van Nynatten A, Morrow JM, Schott RK, Bhattacharyya N, Dungan SZ, Chang BSW. 2017. Evolution of nonspectral rhodopsin function at high altitudes. Proc. Natl. Acad. Sci. U.S.A. 114:7385–7390.

Chan T, Lee M, Sakmar TP. 1992. Introduction of Hydroxyl-bearing Amino Acids Causes Bathochromic Spectral Shifts in Rhodopsin. J. Biol. Chem. 267:9478–9480.

Costa MPF, Novo EMLM, Telmer KH. 2012. Spatial and temporal variability of light attenuation in large rivers of the Amazon. Hydrobiologia 702:171–190.

Crampton WG. 2007. Diversity and adaptation in deep channel Neotropical electric fishes. In: Fish life in special environments. Enfield, New Hampshire: Science Publishers, Inc. pp. 283–339.

Crescitelli F. 1990. Adaptations of visual pigments to the photic environment of the deep sea. J. Exp. Zool. 256:66–75.

Dartnall H, Lythgoe JN. 1965. The spectral clustering of visual pigments. Vision Res. 5:81–100.

Deary AL, Metscher B, Brill RW, Hilton EJ. 2016. Shifts of sensory modalities in early life history stage estuarine fishes (Sciaenidae) from the Chesapeake Bay using X-ray micro computed tomography. Environ Biol Fish 99:361–375.

di Prisco G, Tamburrini M. 1992. The hemoglobins of marine and freshwater fish: the search for correlations with physiological adaptation. Comp. Biochem. Physiol. B 102:661–671.

Dungan SZ, Chang BSW. 2017. Epistatic interactions influence terrestrial–marine functional shifts in cetacean rhodopsin. Proc. R. Soc. B 284:20162743–20162749.

Dungan SZ, Kosyakov A, Chang BSW. 2016. Spectral Tuning of Killer Whale (Orcinus orca) Rhodopsin: Evidence for Positive Selection and Functional Adaptation in a Cetacean Visual Pigment. Mol. Biol. Evol. 33:323–336.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797.

Enright JM, Toomey MB, Sato S-Y, Temple SE, Allen JR, Fujiwara R, Kramlinger VM, Nagy LD, Johnson KM, Xiao Y, et al. 2015. Cyp27c1 Red-Shifts the Spectral Sensitivity of Photoreceptors by Converting Vitamin A1 into A2. Curr. Biol. 25:3048–3057.

Ernst OP, Lodowski DT, Elstner M, Hegemann P, Brown LS, Kandori H. 2014. Microbial and Animal Rhodopsins: Structures, Functions, and Molecular Mechanisms. Chem. Rev. 114:126–163.

Farrens DL, Khorana HG. 1995. Structure and function in rhodopsin. Measurement of the rate of metarhodopsin II decay by fluorescence spectroscopy. J. Biol. Chem. 270:5073–5076.

Fay JC, Wu CI. 2003. Sequence divergence, functional constraint, and selection in protein evolution. Annu. Rev. Genom. Hum. Genet. 4:213–235.

Foote AD, Liu Y, Thomas GW, Vinař T, Alföldi J, Deng J, Dugan S, van Elk CE, Hunter ME, Joshi V. 2015. Convergent evolution of the genomes of marine mammals. Nat Genet 47:272.

Govardovskii VI, Fyhrquist N, Reuter T, Kuzmin DG, Donner K. 2000. In search of the visual pigment template. Vis. Neurosci. 17:509–528.

Hauser FE, Ilves KL, Schott RK, Castiglione GM, López-Fernández H, Chang BSW. 2017. Accelerated Evolution and Functional Divergence of the Dim Light Visual Pigment Accompanies Cichlid Colonization of Central America. Mol. Biol. Evol. 34:2650–2664.

Horodysky AZ, Brill RW, Warrant EJ, Musick JA, Latour RJ. 2008. Comparative visual function in five sciaenid fishes inhabiting Chesapeake Bay. J. Exp. Biol. 211:3601–3612.

Hunt DM, Slobodyanyuk SJ, Fitzgibbon J, Bowmaker JK. 1996. Spectral tuning and molecular evolution of rod visual pigments in the species flock of cottoid fish in Lake Baikal. Vision Res. 36:1217–1224.

Imai H, Kojima D, Oura T, Tachibanaki S, Terakita A, Shichida Y. 1997. Single amino acid residue as a functional determinant of rod and cone visual pigments. Proc. Natl. Acad. Sci. U.S.A. 94:2322–2326.

Jastrzebska B, Comar WD, Kaliszewski MJ, Skinner KC, Torcasio MH, Esway AS, Jin H, Palczewski K, Smith AW. 2016. A G Protein-Coupled Receptor Dimerization Interface in Human Cone Opsins. Biochemistry:acs.biochem.6b00877–39.

Jerlov NG. 1976. Marine Optics.

Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, et al. 2012. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484:55–61.

Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, et al. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647–1649.

Kosakovsky Pond SL, Murrell B, Fourment M, Frost SDW, Delport W, Scheffler K. 2011. A Random Effects Branch-Site Model for Detecting Episodic Diversifying Selection. Mol. Biol. Evol. 28:3033–3043.

Lanfear R, Frandsen PB, Wright AM, Senfeld T, Calcott B. 2016. PartitionFinder 2: New Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological Phylogenetic Analyses. Mol. Biol. Evol. 34:772–773.

Lee CE, Kiergaard M, Gelembiuk GW, Eads BD, Posavi M. 2011. Pumping ions: rapid parallel evolution of ionic regulation following habitat invasions. Evolution 65:2229–2244.

Lee CE. 2016. Evolutionary mechanisms of habitat invasions, using the copepod Eurytemora affinis as a model system. Evolutionary Applications 9:248–270.

Lo P-C, Liu S-H, Chao NL, Nunoo FKE, Mok H-K, Chen W-J. 2015. A multi-gene dataset reveals a tropical New World origin and Early Miocene diversification of croakers (Perciformes: Sciaenidae). Mol. Phylogenet. Evol. 88:132–143.

Lovejoy NR, Bermingham E, Martin AP. 1998. Marine incursion into South America. Nature 396:421–422.

Lythgoe JN. 1984. Visual pigments and environmental light. Vision Res. 24:1539–1550.

MacNichol EF Jr., Levine JS. 1979. Visual Pigments in Teleost Fishes: Effects of Habitat, Microhabitat, and Behaviour on Visual System Evolution. Sensory processes 3:95–131.

Marques DA, Taylor JS, Jones FC, Di Palma F, Kingsley DM, Reimchen TE. 2017. Convergent evolution of SWS2 opsin facilitates adaptive radiation of threespine stickleback into different light environments. PLoS Biol 15:e2001627–24.

Morrow JM, Chang BSW. 2010. The p1D4-hrGFP II expression vector: A tool for expressing and purifying visual pigments and other G protein-coupled receptors. Plasmid 64:162–169.

Morrow JM, Chang BSW. 2015. Comparative Mutagenesis Studies of Retinal Release in Light-Activated Zebrafish Rhodopsin Using Fluorescence Spectroscopy. Biochemistry 54:4507–4518.

Morshedian A, Toomey MB, Pollock GE, Frederiksen R, Enright JM, McCormick SD, Cornwall MC, Fain GL, Corbo JC. 2017. Cambrian origin of the CYP27C1-mediated vitamin A1-to-A2 switch, a key mechanism of vertebrate sensory plasticity. R. Soc. open sci. 4:170362–170369.

Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky Pond SL, Scheffler K. 2013. FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection. Mol. Biol. Evol. 30:1196–1205.

Nakamura Y, Yasuike M, Mekuchi M, Iwasaki Y, Ojima N, Fujiwara A, Chow S, Saitoh K. 2017. Rhodopsin gene copies in Japanese eel originated in a teleost-specific genome duplication. Zoological Lett 3:1–12.

Nei M. 2007. The new mutation theory of phenotypic evolution. Proc. Natl. Acad. Sci. U.S.A. 104:12235–12242.

Nielsen R, Yang Z. 1998. Likelihood Models for Detecting Positively Selected Amino Acid Sites and Applications to the HIV-1 Envelope Gene. Genetics 148:929–936.

Okada T, Sugihara M, Bondar A-N, Elstner M, Entel P, Buss V. 2004. The Retinal Conformation and its Environment in Rhodopsin in Light of a New 2.2 Å Crystal Structure. J. Mol. Biol. 342:571–583.

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF Chimera-A visualization system for exploratory research and analysis. J. Comput. Chem. 25:1605–1612.

Pond SLK, Frost SDW, Muse SV. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679.

Pough FH, Janis CM, Heiser JB. 1999. Vertebrate life. Upper Saddle River, NJ: Prentice Hall.

Rennison DJ, Owens GL, Taylor JS. 2012. Opsin gene duplication and divergence in ray-finned fish. Mol. Phylogenet. Evol. 62:986–1008.

Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Höhna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. 2012. MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space. Syst. Biol. 61:539–542.

Sakmar TP, Franke RR, Khorana HG. 1989. Glutamic acid-113 serves as the retinylidene Schiff base counterion in bovine rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 86:8309–8313.

Sasaki K. 1989. Phylogeny of the family Sciaenidae, with notes on its zoogeography (Teleostei, Perciformes). Mem. Fac. Fish., Hokkaido Univ. 36:1–137.

Schafer CT, Fay JF, Janz JM, Farrens DL. 2016. Decay of an active GPCR: Conformational dynamics govern agonist rebinding and persistence of an active, yet empty, receptor state. Proc. Natl. Acad. Sci. U.S.A. 113:11961–11966.

Schott RK, Refvik SP, Hauser FE, López-Fernández H, Chang BSW. 2014. Divergent positive selection in rhodopsin from lake and riverine cichlid fishes. Mol. Biol. Evol. 31:1149–1165.

Sekharan S, Mooney VL, Rivalta I, Kazmi MA, Neitz M, Neitz J, Sakmar TP, Yan ECY, Batista VS. 2013. Spectral Tuning of Ultraviolet Cone Pigments: An Interhelical Lock Mechanism. J. Am. Chem. Soc. 135:19064–19067.

Sheaves M, Duc NH, Khoa NX. 2008. Ecological attributes of a tropical river basin vulnerable to the impacts of clustered hydropower developments. Mar. Freshwater Res. 59:971–986.

Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.

Sugawara T, Imai H, Nikaido M, Imamoto Y, Okada N. 2010. Vertebrate Rhodopsin Adaptation to Dim Light via Rapid Meta-II Intermediate Formation. Mol. Biol. Evol. 27:506–519.

Toyama M, Hironaka M, Yamahama Y, Horiguchi H, Tsukada O, Uto N, Ueno Y, Tokunaga F, Seno K, Hariyama T. 2008. Presence of Rhodopsin and Porphyropsin in the Eyes of 164 Fishes, Representing Marine, Diadromous, Coastal and Freshwater Species—A Qualitative and Comparative Study. Photochem. Photobiol. 84:996–1002.

Van Eps N, Caro LN, Morizumi T, Kusnetzow AK, Szczepek M, Hofmann KP, Bayburt TH, Sligar SG, Ernst OP, Hubbell WL. 2017. Conformational equilibria of light-activated rhodopsin in nanodiscs. Proc. Natl. Acad. Sci. U.S.A. 114:E3268–E3275.

van Hazel I, Dungan SZ, Hauser FE, Morrow JM, Endler JA, Chang BSW. 2016. A comparative study of rhodopsin function in the great bowerbird (Ptilonorhynchus nuchalis): Spectral tuning and light-activated kinetics. Protein Sci. 25:1308–1318.

Van Nynatten A, Bloom D, Chang BSW, Lovejoy NR. 2015. Out of the blue: adaptive visual pigment evolution accompanies Amazon invasion. Biol. Lett. 11:20150349.

Velotta JP, McCormick SD, Schultz ET. 2015. Trade‐offs in osmoregulation and parallel shifts in molecular function follow ecological transitions to freshwater in the Alewife. Evolution 69:2676–2688.

Wald G. 1939. The porphyropsin visual system. J. Gen. Physiol. 22:775–794.

Weadick CJ, Chang BSW. 2011. An Improved Likelihood Ratio Test for Detecting Site-Specific Functional Divergence among Clades of Protein-Coding Genes. Mol. Biol. Evol. 29:1297–1300.

Willoughby JR, Harder AM, Tennessen JA, Scribner KT, Christie MR. 2018. Rapid genetic adaptation to a novel environment despite a genome-wide reduction in genetic diversity. Mol Ecol 17:675.

Xu T, Xu G, Che R, Wang R, Wang Y, Li J, Wang S, Shu C, Sun Y, Liu T. 2016. The genome of the miiuy croaker reveals well-developed innate immune and sensory systems. Sci. Rep. 6:21902.

Yamazaki Y, Nagata T, Terakita A, Kandori H, Shichida Y, Imamoto Y. 2014. Intramolecular Interactions That Induce Helical Rearrangement upon Rhodopsin Activation. J. Biol. Chem. 289:13792–13800.

Yancey PH, Clark ME, Hand SC, Bowlus RD, Somero GN. 1982. Living with water stress: evolution of osmolyte systems. Science 217:1214–1222.

Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573.

Yang Z. 2000. Maximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus A. J. Mol. Evol. 51:423–432.

Yang Z. 2005. Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection. Mol. Biol. Evol. 22:1107–1118.

Yang Z. 2007. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol. Biol. Evol. 24:1586–1591.

Zhou XE, Melcher K, Xu HE. 2012. Structure and activation of rhodopsin. Acta Pharmacol Sin 33:291–299.

3.9. SUPPLEMENTAL INFORMATION

3.9.1. Phylogenetic Reconstructions

Model searches in PartitionFinder (Lanfear et al. 2016) were constrained to the models available in RAxML and MrBayes, and the best-fitting partitioning scheme was chosen using the Bayesian Information Criterion (BIC) (Supplementary table S2). Tracer was used to visualize log-likelihood values for each parameter estimate across successive samples in the Bayesian phylogenetic reconstruction, to determine whether stationarity had been reached and to ensure that each parameter estimate had an effective sample size greater than 200.
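The effective sample size criterion can be illustrated with a short Python sketch. It estimates the effective sample size of a single parameter trace by summing the positive autocorrelations, which is a simplified estimator written for this example and not the algorithm implemented in Tracer. The function name `effective_sample_size` and the two toy traces are hypothetical.

```python
def effective_sample_size(trace):
    """Rough ESS estimate: n / (1 + 2 * sum of positive autocorrelations).

    Simplified illustration only; this is not Tracer's implementation.
    """
    n = len(trace)
    if n < 2:
        raise ValueError("trace must contain at least two samples")
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Toy traces (hypothetical): one that mixes well and one that drifts.
well_mixed = [0.0, 1.0] * 500
drifting = list(range(1000))
for name, trace in (("well mixed", well_mixed), ("drifting", drifting)):
    ess = effective_sample_size(trace)
    print(name, "passes ESS > 200:", ess > 200)
```

A trace with strong autocorrelation between successive samples yields a small effective sample size even when many samples were drawn, which is why the threshold is applied to each parameter estimate rather than to the raw number of samples.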

In the Bayesian phylogeny, the sister group to the freshwater clade is the Paralonchurus clade, which was previously identified as the sister group by Lo et al. (2015), albeit with only one Paralonchurus species. In the maximum likelihood species tree and the maximum likelihood rhodopsin gene tree, Umbrina bussingi was reconstructed as the sister species. When the two highly positively selected sites 165 and 214 were removed from the rhodopsin partition of our concatenated alignment, this topological incongruence between the Bayesian and maximum likelihood species trees disappeared, and the maximum likelihood species tree was reconstructed with Paralonchurus as the sister group (Supplementary figure S4). The same is true when the first and second codon positions are removed from the rhodopsin partition (Supplementary figure S5).
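The two alignment manipulations described above can be sketched in Python. This is a minimal illustration, assuming the rhodopsin partition starts at a first codon position and that sites are given as 0-based codon indices within the alignment (mapping these to sites 165 and 214 depends on the alignment's numbering offset, which is not specified here). The function names and the toy sequence are hypothetical.

```python
def third_codon_positions(seq):
    """Keep only third codon positions (drops first and second positions).

    Assumes the sequence starts at a first codon position.
    """
    return seq[2::3]


def drop_codons(seq, codon_indices):
    """Remove whole codons, given 0-based codon indices within the alignment."""
    drop = set(codon_indices)
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(c for i, c in enumerate(codons) if i not in drop)


# Toy aligned sequence of three codons (hypothetical): ATG GCC TTA
toy = "ATGGCCTTA"
print(third_codon_positions(toy))  # third positions of ATG, GCC, TTA
print(drop_codons(toy, [1]))       # alignment with the second codon removed
```

Applying the same function to every sequence in the partition keeps the alignment columns consistent across taxa.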

3.9.2. Molecular Evolutionary Analyses

We pruned our dataset to include only species with rhodopsin sequence data and removed all terminal gaps, outgroups and duplicate taxa. This resulted in an 825 bp dataset spanning all seven transmembrane helices of rhodopsin, with sequences from 114 taxa. Branch-site models in PAML include a site class that allows positive selection at a subset of sites on the foreground lineage, but constrain dN/dS on all other branches to be less than or equal to one. Explicit support for positive selection on the foreground lineage can be established with LRTs against a nested null branch-site model in which the dN/dS estimate for the positively selected site class is constrained to equal one on foreground lineages (Nielsen and Yang 1998).
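The likelihood ratio tests used throughout this section can be sketched as follows. This is a minimal illustration, assuming the test statistic 2(lnL_alt - lnL_null) is compared with a chi-square distribution whose degrees of freedom equal the difference in the number of parameters (np) between the nested models. The function names are hypothetical, and the chi-square tail probability is computed with closed-form expressions from the standard library rather than with any routine from PAML. The example values are the lnL estimates for the branch-site alternative and null models with all transitional branches in the foreground (Table S3.5).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-z) * sum_{k=0}^{df/2 - 1} z^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= z / k
            total += term
        return math.exp(-z) * total
    # Odd df: erfc(sqrt(z)) plus a finite correction series
    total = math.erfc(math.sqrt(z))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-z) * z ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def likelihood_ratio_test(lnl_alt, lnl_null, df=1):
    """Return the LRT statistic and its chi-square p value."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


# Branch-site alternative (np = 230) versus null (np = 229), so df = 1.
stat, p = likelihood_ratio_test(-6579.40, -6581.63, df=1)
print(f"LRT = {stat:.2f}, p = {p:.4f}")
```

Table S3.5 reports LRT = 4.45 and p = 0.0349 for this comparison; the rounded lnL values used here give a very similar result.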


We also employed clade models C and D (CmC, CmD) in PAML (Bielawski and Yang 2004). These models allow multiple independent foreground lineages and do not constrain dN/dS on background lineages to be less than or equal to one, which allows positive selection in the background. Support for the divergent site class parameter included in CmC can be established with LRTs against the nested null model M2aREL, which assumes uniform dN/dS estimates for each site class across the entire tree. Support for positive selection in the foreground estimate of the divergent site class can also be tested explicitly using a modified CmC null model in which the foreground dN/dS is constrained to equal one. CmD differs from CmC in that the second site class, which has a uniform dN/dS estimate across the phylogeny, is freely estimated rather than constrained to equal one as in CmC. This makes CmD more flexible and allows it to accommodate more than one class of positively selected sites. The nested null model for CmD is M3, which has three freely estimated site classes but assumes uniform selection across the tree.

CmC and CmD can be used to determine the number of foreground partitions that best fits the data by comparing nested partitioning schemes (Schott et al. 2014). We partitioned our croaker phylogeny so that all freshwater-invading lineages formed the foreground, and tested this scheme using branch-site and clade model analyses. We also isolated each freshwater invasion event as its own foreground lineage using both of the aforementioned models in PAML. We then compared the clade model results for each freshwater invasion event with a more parameter-rich partitioning scheme in which each freshwater invasion event is its own isolated foreground partition with an independent dN/dS estimate.
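Foreground partitions are passed to PAML as labels in the Newick tree file, with `#1` marking an individual branch and `$1` marking a whole clade. The sketch below shows one way the terminal branches of freshwater lineages could be labelled, assuming a topology-only Newick string; it does not label internal branches, which would need a `#` label after the closing parenthesis of the relevant clade or a `$` clade label. The function name, the toy topology and the choice of taxa are hypothetical and are not the tree used in this chapter.

```python
import re


def label_foreground_tips(newick, foreground, label="#1"):
    """Append a PAML branch label to the named tips of a topology-only tree."""
    foreground = set(foreground)

    def tag(match):
        prefix, name = match.group(1), match.group(2)
        if name in foreground:
            return f"{prefix}{name} {label}"
        return prefix + name

    # A tip name follows "(" or "," and starts with a letter or underscore.
    return re.sub(r"([(,]\s*)([A-Za-z_][A-Za-z0-9_.]*)", tag, newick)


# Toy topology (hypothetical), with two freshwater taxa as foreground.
tree = "((Plagioscion_auratus,Pachyurus_junki),Cynoscion_regalis);"
freshwater = {"Plagioscion_auratus", "Pachyurus_junki"}
print(label_foreground_tips(tree, freshwater))
```

Using a different label for each invasion event (for example `#1`, `#2`) would correspond to the more parameter-rich scheme in which each event is its own foreground partition.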

3.9.3. Comparing CmC with CmD when two classes of positively selected sites are present

CmD can also be compared directly with CmC, because the two models differ only in how they estimate the second site class; this comparison establishes support for a second site class parameter different from one. M3 can also be set to have more than three site classes, and the number of site classes required to fit the data effectively can be determined by comparing models with more site classes against models with fewer (Bielawski and Yang 2004). We used this approach to test whether the positively selected sites in the marine dataset (pervasive positively selected sites) were interfering with our estimation of episodic selection using CmC, which was not a significantly better fit than M2aREL (its requisite null model) and was a significantly worse fit than CmD (Supplementary table S10). M3 with four site classes fit significantly better than M3 with three site classes (Supplementary table S10). The fourth site class was occupied by two sites, 165 and 214, and had a very high dN/dS estimate (13.64; Supplementary table S10). We suspected that these highly positively selected sites were shifting the estimate for the third site class in CmC, the only site class in this model allowed to exceed one, beyond what was appropriate for modelling the divergent selection detected on the transitional branch using Branch-sites, CmD, Branch-sites REL and even the Two-Ratio model. Removing these two sites and re-running the analyses returned parameter estimates consistent with those reported for the other branch and clade models (Supplementary table S10). This suggests that CmD should be used when more than one class of positively selected sites is present, as it has the flexibility to model them.

3.9.4. Supplementary tables

Table S3.1. GenBank accession numbers for sequences used in phylogenetic reconstructions and selection analyses.

Species  code  Voucher number  cyt b  COI  RAG1  RH1  EGR1  EGR2  Dataset usage

Cynoscion acoupa 1 NA xxxxx - xxxxx xxxxx xxxxx xxxxx

Cynoscion albus 2 ROM T13765 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Cynoscion arenarius 3 11630 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Cynoscion nebulosus 4 11628 - xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Cynoscion nothus 5 11626 xxxxx - xxxxx xxxxx xxxxx - Rho

Cynoscion praedatorius 6 ROM T13978 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Isopisthus parvipinnis 7 ROM T7713 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Larimus fasciatus 8 11625 xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Macrodon ancylodon 9 ROM T7717 xxxxx - xxxxx xxxxx xxxxx xxxxx

Menticirrhus paitensis 10 ROM T13731 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Menticirrhus sp 11 NA xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Micropogonias furnieri 12 ROM T8290 xxxxx - xxxxx xxxxx xxxxx xxxxx

Nebris microps 13 ROM T7728 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Nebris sp 14 ROM T13741 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Ophioscion scierus 15 ROM T13732 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Ophioscion sp 16 ROM T08723 xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Pachypops fourcroi 17 ANSP 40562 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Pachypops sp 18 ANSP 197648 xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Pachyurus bonariensis 19 NA xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Pachyurus cf paucirastrus 20 ANSP 199599 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Pachyurus junki 21 ANSP 193039 xxxxx xxxxx xxxxx xxxxx xxxxx

Pachyurus junki 22 ANSP 198701 xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Pachyurus schomburgkii 23 NA xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Paralonchurus dumerilii 24 ROM T13891 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Paralonchurus dumerilii 25 ROM T13728 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Paralonchurus sp 26 ROM T7723 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Petilipinnis grunniens 27 ANSP 187423 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Plagioscion auratus 28 ANSP 197653 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Plagioscion montei 29 NA xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Plagioscion squamosissimus 30 NA xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx

Plagioscion squamosissimus 31 NA xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Plagioscion squamosissimus 32 SIUC 37984 - xxxxx xxxxx xxxxx xxxxx xxxxx

Plagioscion squamosissimus 33 NA xxxxx xxxxx xxxxx - xxxxx xxxxx

Plagioscion squamosissimus 34 NA - xxxxx xxxxx - xxxxx xxxxx

Plagioscion squamosissimus 35 INHS 54286 - xxxxx xxxxx - xxxxx -

Pogonias cromis 36 NA xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho

Sciaenops ocellatus 37 NA xxxxx xxxxx

Stellifer chrysoleuca 38 ROM T13888 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Stellifer lanceolatus 39 NA xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Stellifer sp 40 ROM T13734 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Stellifer sp 41 ROM T7720 xxxxx - xxxxx xxxxx xxxxx xxxxx Rho NWC

Stellifer stellifer 42 ROM T7719 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC Con

Umbrina sp 43 NA xxxxx xxxxx xxxxx xxxxx xxxxx Rho NWC

Aplodinotus grunniens 44 NA KP722606 KP722699 KP722882 KP722970 KP722788 KP723062 Rho

Argyrosomus japonicus 45 NA KP722607 KP722700 KP722883 KP722971 KP722789 KP723063 Rho

Argyrosomus regius 46 NA KP722608 KP722701 KP722884 KP722972 KP722790 KP723064 Rho

Atractoscion nobilis 47 NA KP722609 EU547246 KP722885 KP722973 KP722791 KP723065 Rho

Atrobucca nibe 48 NA KP722610 KP722702 KP722886 KP722974 KP722792 KP723066 Rho

Austronibea oedogenys 49 NA KP722611 KP722887 KP722975 KP722793 KP723067 Rho

Bahaba taipingensis 50 NA KP722612 KP722703 KP722888 KP722976 KP722794 KP723068 Rho

Bairdiella cf armata 51 NA KP722613 KP722704 KP722889 KP722977 KP722795 KP723069 Rho NWC Con

Bairdiella ronchus 52 NA KP722614 KP722705 KP722978 KP722796 KP723070 Rho NWC

Boesemania microlepis 53 NA KP722615 KP722706 KP722890 KP722979 KP722797 KP723071 Rho

Cheilotrema saturnum 54 NA KP722616 KP722707 KP722891 KP722980 KP722798 KP723072 Rho NWC Con

Chrysochir aureus 55 NA KP722617 KP722708 KP722892 KP722981 KP722799 KP723073 Rho

Cilus gilberti 56 NA KP722618 KP722709 KP722893 KP722982 KP722800 KP723074 Rho NWC Con

Collichthys lucidus 57 NA KP722619 KP722710 KP722894 KP722983 KP722801 KP723075 Rho

Corvula macrops 58 NA KP722620 KP722711 KP722895 KP722984 KP722802 KP723076

Corvula macrops 59 NA KP722712 KP722896 KP722985 KP722803 KP723077 Rho NWC

Cynoscion acoupa 60 NA KP722621 KP722713 KP722897 KP722986 KP722804 KP723078 Rho

Cynoscion guatucupa 61 NA KP722622 KP722714 KP722898 KP722987 KP722805 KP723079 Rho

Cynoscion parvipinnis 62 NA KP722623 KP722715 KP722899 KP722988 KP722806 KP723080 Rho

Cynoscion praedatorius 63 NA KP722624 KP722716 KP722900 KP722989 KP722807 KP723081 Rho

Cynoscion regalis 64 NA KP722625 KP722717 KP722901 KP722990 KP722808 KP723082 Rho

Cynoscion reticulatus 65 NA KP722626 KP722718 KP722902 KP722991 KP722809 KP723083 Rho

Daysciaena albida 66 NA KP722627 KP722719 KP722903 KP722992 KP722810 KP723084 Rho

Dendrophysa russelii 67 NA KP722628 KP722720 KP722904 KP722993 KP722811 KP723085 Rho

Dicentrarchus labrax 68 NA AP009166 AP009166 KP722969 KP723061 KP722881 KP723151

Equetus lanceolatus 69 NA KP722629 KP722721 KP722905 KP722994 KP722812 KP723086 Rho NWC Con

Genyonemus lineatus 70 NA KP722630 KP722722 KP722906 KP722995 KP722813 Rho NWC

Isopisthus remifer 71 NA KP722631 KP722723 KP722907 KP722996 KP722814 KP723087 Rho

Johnius amblycephalus 72 NA KP722632 KP722724 KP722908 KP722997 KP722815 KP723088 Rho

Johnius belangerii 73 NA KP722633 KP722725 KP722909 KP722998 KP722816 KP723089 Rho

Johnius borneensis 74 NA KP722634 KP722910 KP722999 KP722817 KP723090 Rho

Johnius carouna 75 NA KP722635 KP722726 KP722911 KP723000 KP722818 KP723091 Rho

Johnius distinctus 76 NA KP722636 KP722912 KP723001 KP722819 KP723092 Rho

Johnius heterolepis 77 NA KP722637 KP722913 KP723002 KP722820 KP723093 Rho

Johnius macropterus 78 NA KP722638 KP722727 KP722914 KP723003 KP722821 KP723094 Rho

Johnius majan 79 NA KP722639 KP722728 KP722915 KP723004 KP722822 KP723095 Rho

Johnius trewavasae 80 NA KP722640 KP722729 KP722916 KP723005 KP722823 KP723096 Rho

Larimichthys crocea 81 NA KP722641 KP722730 KP722917 KP723006 KP722824 KP723097 Rho

Larimichthys polyactis 82 NA KP722642 KP722731 KP722918 KP72300 KP722825 KP723098 Rho

Larimus pacificus 83 NA KP722643 KP722732 KP722919 KP723008 KP722826 KP723099 Rho NWC Con

Leiostomus xanthurus 84 NA KP722644 KP722733 KP722920 KP723009 KP722827 KP723100 Rho

Macrodon ancylodon 85 NA KP722645 KP722734 KP722921 KP723010 KP722828 KP723101 Rho

Megalonibea fusca 86 NA KP722646 KP722735 KP722922 KP723011 KP722829 KP723102 Rho

Menticirrhus americanus 87 NA KP722647 KP722736 KP722923 KP723012 KP722830 KP723103 Rho

Menticirrhus undulatus 88 NA KP722648 KP722737 KP722924 KP723013 KP722831 KP723104 Rho

Micropogonias ectenes 89 NA KP722649 KP722738 KP722925 KP723014 KP722832 KP723105 Rho

Micropogonias furnieri 90 NA KP722650 KP722739 KP722926 KP723015 KP722833 KP723106 Rho

Micropogonias undulatus 91 NA KP722651 KP722740 KP722927 KP723016 KP722834 KP723107 Rho

Miichthys miiuy 92 NA KP722652 KP722741 KP722928 KP723017 KP722835 KP723108 Rho

Monotaxis grandoculis 93 NA EU036430 FN689114 EF095651 Y18673 KC442100 KC442134

Nebris microps 94 NA KP722653 KP722742 KP722929 KP723018 KP722836 KP723109 Rho NWC

Nibea albiflora 95 NA KP722654 KP722743 KP722930 KP723019 KP722837 KP723110 Rho

Nibea microgenys 96 NA KP722656 KP722745 KP722931 KP723020 KP722838 KP723111 Rho

Nibea soldado 97 NA KP722657 KP722746 KP722932 KP723021 KP722839 KP723112 Rho

Nibea squamosa 98 NA KP722658 KP722747 KP722933 KP723022 KP722840 KP723113 Rho

Odontoscion xanthops 99 NA KP722659 KP722748 KP723023 KP722841 Rho NWC

Ophioscion punctatissimus 100 NA KP722660 KP722749 KP722934 KP723024 KP722842 KP723114 Rho NWC Con

Ophioscion scierus 101 NA KP722661 KP722750 KP723025 KP722843 KP723115

Ophioscion vermicularis 102 NA KP722662 KP722751 KP722935 KP723026 KP722844 KP723116 Rho NWC

Otolithes ruber 103 NA KP722663 KP722752 KP722936 KP723027 KP722845 KP723117 Rho

Pachypops fourcroi 104 NA KP722664 KP722753 KP722937 KP723028 KP722846 KP723118 Rho NWC Con

Pachyurus bonariensis 105 NA KP722665 KP722754 KP72293 KP723029 KP722847 KP723119 Rho NWC Con

Panna microdon 106 NA KP722666 KP722755 KP722939 KP723030 KP722848 KP723120 Rho

Paralonchurus brasiliensis 107 NA KP722667 KP722756 KP722940 KP723031 KP722849 KP723121 Rho NWC Con

Pareques sp 108 NA KP722668 KP722757 KP723032 KP722850 Rho NWC

Pennahia argentata 109 NA KP722669 KP722758 KP722941 KP723033 KP722851 KP723122 Rho

Pennahia macrocephalus 110 NA KP722670 KP722759 KP722942 KP723034 KP722852 KP723123 Rho

Pennahia pawak 111 NA KP722671 KP722760 KP722943 KP723035 KP722853 KP723124 Rho

Plagioscion auratus 112 NA KP722672 KP722761 KP722944 KP723036 KP722854 KP723125 Rho NWC

Plagioscion squamosissimus 113 NA KP722673 KP722762 KP722945 KP723037 KP722855 KP723126

Plagioscion surinamensis 114 NA KP722674 KP722763 KP722946 KP723038 KP722856 KP723127 Rho NWC Con

Plagioscion ternetzi 115 NA KP722675 KP722764 KP722857

Pogonias cromis 116 NA KP722676 KP722765 KP722947 KP723039 KP722858 KP723128

Protonibea diacanthus 117 NA KP722677 KP722766 KP722948 KP723040 KP722859 KP723129 Rho

Pseudotolithus brachygnathus 118 NA KP722678 KP722767 KP722949 KP723041 KP722860 KP723130 Rho

Pseudotolithus elongatus 119 NA KP722679 KP722768 KP722950 KP723042 KP722861 KP723131 Rho

Pseudotolithus senegallus 120 NA KP722680 KP722769 KP722951 KP723043 KP722862 KP723132 Rho

Pseudotolithus typus 121 NA KP722681 KP722770 KP723044 KP722863 KP723133 Rho

Pteroscion peli 122 NA KP722682 KP722771 KP722952 KP723045 KP722864 KP723134 Rho

Pterotolithus maculatus 123 NA KP722683 KP722772 KP722953 KP723046 KP722865 KP723135 Rho

Roncador stearnsii 124 NA KP722684 KP722773 KP722954 KP723047 KP722866 KP723136 Rho NWC Con

Sciaena deliciosa 125 NA KP722685 KP722774 KP722955 KP723048 KP722867 KP723137 Rho NWC Con

Sciaena umbra 126 NA KP722686 KP722775 KP722956 KP723049 KP722868 KP723138 Rho

Sciaenops ocellatus 127 NA KP722687 KP722776 KP722957 KP723050 KP722869 KP723139 Rho

Seriphus politus 128 NA KP722688 KP722777 KP722958 KP723051 KP722870 KP723140 Rho NWC Con

Sparus aurata 129 NA AF240735 FN689315 EF095657 Y18665 KC442100 KC442134

Stellifer ericymba 130 NA KP722689 KP722778 KP722959 KP723052 KP722871 KP723141 Rho NWC Con

Stellifer microps 131 NA KP722690 KP722779 KP722960 KP723053 KP722872 KP723142 Rho NWC Con

Stellifer oscitans 132 NA KP722691 KP722780 KP722961 KP723054 KP722873 KP723143 Rho NWC Con

Stellifer rastrifer 133 NA KP722692 KP722781 KP722962 KP723055 KP722874 KP723144 Rho NWC Con

Totoaba macdonaldi 134 NA KP722693 KP722782 KP722963 KP723056 KP722875 KP723145 Rho

Umbrina bussingi 135 NA KP722694 KP722783 KP722964 KP723057 KP722876 KP723146 Rho NWC

Umbrina canariensis 136 NA KP722695 KP722784 KP722965 KP722877 KP723147

Umbrina cirrosa 137 NA KP722696 KP722785 KP722966 KP723058 KP722878 KP723148 Rho

Umbrina roncador 138 NA KP722697 KP722786 KP722967 KP723059 KP722879 KP723149 Rho NWC

Umbrina xanti 139 NA KP722698 KP722787 KP722968 KP723060 KP722880 KP723150 Rho NWC Con

NOTE. Rho, rhodopsin sequence included in the dataset for analyses of molecular evolution; Con, sequence used in the control gene dataset; NWC, rhodopsin sequence used in analyses of just the New World clade of croakers; xxxxx indicates that the sequence was generated in this study.

Table S3.2. Results for PartitionFinder analyses of the concatenated dataset.

Gene  Length  MrBayes partition  MrBayes model  RAxML partition  RAxML model

Rh1 827 1 K80+I+G 1 GTR+I+G

Cytb 1104 2 GTR+I+G 2 GTR+I+G

COI 648 3 GTR+I+G 3 GTR+I+G

EGR1 896 4 HKY+I+G 4 GTR+I+G

EGR2 1124 4 HKY+I+G 4 GTR+I+G

EGR2 intron 422 5 K80+I+G 5 GTR+G

Rag1 1439 6 SYM+I+G 6 GTR+I+G


Table S3.3. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood species tree

Model  np   lnL       Parameter estimates: dN/dS (proportion of sites)  Null  LRT    p value
M0     227  -7075.70  0.26(1.00)                                        n/a
M1a    228  -6586.36  0.02(0.85) 1.00F(0.15)                            M0    978.7  0.0000
M2a    230  -6490.44  0.03(0.84) 1.00F(0.13) 4.54(0.03)                 M1a   191.8  0.0000
M7     228  -6579.96  β (p = 0.06, q = 0.30)
M8a    229  -6569.04  β (p = 0.13, q = 3.07) 1.00F(0.12)                M7    21.8   0.0000
M8     230  -6483.41  β (p = 0.06, q = 0.39) 4.13(0.04)                 M7    193.1  0.0000
                                                                        M8a   171.2  0.0000

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result

Table S3.4. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood rhodopsin gene tree

Model | np | lnL | Parameter estimates: dN/dS (proportion of sites) | Null | LRT | p value
M0 | 227 | -6673.45 | 0.24(1.00) | | |
M1a | 228 | -6277.64 | 0.02(0.84) 1.00F(0.16) | M0 | 791.6 | 0.0000
M2a | 230 | -6206.98 | 0.02(0.84) 1.00F(0.16) 8.94(0.01) | M1a | 141.3 | 0.0000
M7 | 228 | -6271.43 | β (p = 0.06, q = 0.32) | | |
M8a | 229 | -6263.12 | β (p = 0.12, q = 2.61) 1.00F(0.11) | M7 | 16.6 | 0.0000
M8 | 230 | -6202.27 | β (p = 0.01, q = 0.34) 8.29(0.01) | M7 | 138.3 | 0.0000
 | | | | M8a | 121.7 | 0.0000

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result

Table S3.5. Branch-sites and clade model analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood species tree

Partition | Model | np | lnL | Parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | Null | LRT | p value
 | M2aREL | 230 | -6490.44 | 0.03(0.84) 1.00F(0.13) 4.54(0.03) | | |
 | M3 | 231 | -6490.39 | 0.03(0.84) 1.04(0.13) 4.65(0.03) | | |
All transitional branches | Br-site alt | 230 | -6579.40 | 0.02(0.82) 1.00F(0.14) 0.02/5.49(0.04) 1.00F/5.49(0.01) | Br-site null | 4.45 | 0.0349
 | Br-site null | 229 | -6581.63 | 0.02(0.78) 1.00F(0.13) 0.02/1.00F(0.08) 1.00F/1.00F(0.01) | | |
 | CmC | 231 | -6489.73 | 0.03(0.84) 1.00F(0.13) 4.64/2.33(0.03) | M2aREL | 1.43 | 0.2317
 | CmD | 232 | -6486.78 | 0.03(0.86) 8.71(0.01) 1.38/3.71(0.13) | M3 | 7.21 | 0.0072
North American transition | Br-site alt | 230 | -6586.12 | 0.02(0.83) 1.00F(0.15) 0.02/1.42(0.02) 1.00F/1.42(0.00) | Br-site null | 0.01 | 0.9057
 | Br-site null | 229 | -6586.13 | 0.02(0.82) 1.00F(0.15) 0.02/1.00(0.02) 1.00F/1.00F(0.00) | | |
 | CmC | 231 | -6489.95 | 0.03(0.84) 1.00F(0.13) 4.55/2.19(0.03) | M2aREL | 0.98 | 0.3219
 | CmD | 232 | -6490.10 | 0.04(0.86) 8.75(0.01) 1.47/2.58(0.13) | M3 | 0.57 | 0.4498
Asian transition | Br-site alt | 230 | -6586.36 | 0.02(0.85) 1.00F(0.15) 0.02/1.00F(0.00) 1.00F/1.00(0.00) | Br-site null | 0.00 | 1.0000
 | Br-site null | 229 | -6586.36 | 0.02(0.85) 1.00F(0.15) 0.02/1.00F(0.00) 1.00F/1.00F(0.00) | | |
 | CmC | 231 | -6488.84 | 0.03(0.84) 1.00F(0.13) 4.56/0.00(0.03) | M2aREL | 3.20 | 0.0735
 | CmD | 232 | -6490.12 | 0.04(0.86) 8.74(0.01) 1.48/3.59(0.13) | M3 | 0.54 | 0.4631
South American transition | Br-site alt | 230 | -6575.55 | 0.02(0.82) 1.00F(0.14) 0.02/23.12(0.03) 1.00F/23.12(0.01) | Br-site null | 9.52 | 0.0020
 | Br-site null | 229 | -6580.31 | 0.02(0.72) 1.00F(0.12) 0.02/1.00F(0.13) 1.00F/1.00F(0.02) | | |
 | CmC | 231 | -6490.25 | 0.03(0.84) 1.00F(0.13) 4.48/8.02(0.03) | M2aREL | 0.38 | 0.5370
 | CmD | 232 | -6487.31 | 0.03(0.86) 8.57(0.01) 1.40/5.78(0.13) | M3 | 6.16 | 0.0130

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table S3.6. Branch-sites and clade model analyses (PAML) of the world-wide croaker rhodopsin dataset using the maximum likelihood rhodopsin gene tree

Partition | Model | np | lnL | Parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | Null | LRT | p value
 | M2aREL | 230 | -6206.98 | 0.02(0.84) 1.00F(0.16) 8.94(0.01) | | |
 | M3 | 231 | -6204.77 | 0.03(0.84) 1.21(0.15) 9.50(0.01) | | |
All transitional branches | Br-site alt | 230 | -6271.17 | 0.02(0.82) 1.00F(0.14) 0.02/6.30(0.03) 1.00F/6.30(0.01) | Br-site null | 4.4 | 0.0359
 | Br-site null | 229 | -6273.37 | 0.02(0.77) 1.00F(0.14) 0.02/1.00F(0.07) 1.00F/1.00F(0.01) | | |
 | CmC | 231 | -6206.35 | 0.02(0.84) 1.00F(0.16) 9.25/3.26(0.01) | M2aREL | 1.25 | 0.2635
 | CmD | 232 | -6200.74 | 0.03(0.84) 9.49(0.01) 1.15/3.12(0.15) | M3 | 8.05 | 0.0045
North American (NA) transition | Br-site alt | 230 | -6277.4 | 0.02(0.82) 1.00F(0.15) 0.02/1.00(0.03) 1.00F/1.00(0.01) | Br-site null | 0.00 | 1.0000
 | Br-site null | 229 | -6277.4 | 0.02(0.82) 1.00F(0.15) 0.02/1.00F(0.03) 1.00F/1.00F(0.01) | | |
 | CmC | 231 | -6205.51 | 0.02(0.84) 1.00F(0.16) 9.15/0.00(0.01) | M2aREL | 2.94 | 0.0863
 | CmD | 232 | -6204.66 | 0.03(0.84) 9.50(0.01) 1.20/1.59(0.15) | M3 | 0.21 | 0.6457
Asian transition | Br-site alt | 230 | -6277.64 | 0.02(0.84) 1.00F(0.16) 0.02/1.00(0.00) 1.00F/1.00(0.00) | Br-site null | 0.00 | 1.0000
 | Br-site null | 229 | -6277.64 | 0.02(0.84) 1.00F(0.16) 0.02/1.00F(0.00) 1.00F/1.00F(0.00) | | |
 | CmC | 231 | -6206.32 | 0.02(0.84) 1.00F(0.16) 9.04/0.00(0.01) | M2aREL | 1.31 | 0.2519
 | CmD | 232 | -6204.23 | 0.03(0.85) 9.50(0.01) 1.20/2.88(0.15) | M3 | 1.07 | 0.3006
South American (SA) transition | Br-site alt | 230 | -6267.03 | 0.02(0.82) 1.00F(0.15) 0.02/25.27(0.03) 1.00F/25.27(0.01) | Br-site null | 10.3 | 0.0013
 | Br-site null | 229 | -6272.18 | 0.02(0.72) 1.00F(0.13) 0.02/1.00F(0.12) 1.00F/1.00F(0.02) | | |
 | CmC | 231 | -6206.81 | 0.02(0.84) 1.00F(0.16) 8.85/17.73(0.01) | M2aREL | 0.34 | 0.5606
 | CmD | 232 | -6200.16 | 0.03(0.84) 9.47(0.01) 1.16/5.42(0.15) | M3 | 9.21 | 0.0024

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table S3.7. Random-sites analyses (PAML) of the world-wide croaker rhodopsin dataset with freshwater species removed using the Bayesian species tree

Model | np | lnL | Parameter estimates: dN/dS (proportion of sites) | Null | LRT | p value
M0 | 199 | -6372.04 | 0.26(1.00) | | |
M1a | 200 | -5917.10 | 0.02(0.86) 1.00F(0.14) | M0 | 909.9 | 0.0000
M2a | 202 | -5827.38 | 0.02(0.85) 1.00F(0.11) 4.50(0.04) | M1a | 179.4 | 0.0000
M7 | 200 | -5928.78 | β (p = 0.01, q = 0.03) 1.00F(0.10) | | |
M8a | 201 | -5904.11 | β (p = 0.12, q = 3.23) 1.00F(0.11) | | |
M8 | 202 | -5821.39 | β (p = 0.03, q = 0.24) 4.31(0.04) | M7 | 214.8 | 0.0000
 | | | | M8a | 165.4 | 0.0000

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result

Table S3.8. Branch-sites analyses (PAML) of the control gene dataset using the Bayesian species tree

Gene | Foreground | np | lnL | Parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | LRT | p value
Rh1 | marine | 62 | -2587.14 | 0.03(0.86) 1.00(0.12) 0.03/10.23(0.02) 1.00/10.23(0.00) | 54.38 | < 0.0001
 | marine (null) | 61 | -2614.33 | 0.03(0.88) 1.00(0.12) 0.03/1.00(0.00) 1.00/1.00(0.00) | |
 | Transitional branch | 62 | -2604.32 | 0.02(0.84) 1.00(0.10) 0.02/18.53(0.06) 1.00/18.53(0.01) | 8.97 | 0.0027
 | Transitional branch (null) | 61 | -2608.81 | 0.02(0.69) 1.00(0.09) 0.02/1.00(0.20) 1.00/1.00(0.02) | |
 | South American freshwater | 62 | -2609.18 | 0.03(0.88) 1.00(0.10) 0.03/7.55(0.02) 1.00/7.55(0.00) | 8.10 | 0.0044
 | South American freshwater (null) | 61 | -2613.24 | 0.03(0.87) 1.00(0.10) 0.03/1.00(0.03) 1.00/1.00(0.00) | |
Rag1 | marine | 62 | -3365.04 | 0.04(0.89) 1.00(0.09) 0.04/3.51(0.01) 1.00/3.51(0.00) | 6.29 | 0.0121
 | marine (null) | 61 | -3368.19 | 0.03(0.88) 1.00(0.10) 0.03/1.00(0.02) 1.00/1.00(0.00) | |
 | Transitional branch | 62 | -3367.71 | 0.03(0.58) 1.00(0.07) 0.03/1.00(0.31) 1.00/1.00(0.04) | 0.00 | 1.0000
 | Transitional branch (null) | 61 | -3367.71 | 0.03(0.58) 1.00(0.07) 0.03/1.00(0.31) 1.00/1.00(0.04) | |
 | South American freshwater | 62 | -3368.11 | 0.03(0.87) 1.00(0.11) 0.03/1.00(0.02) 1.00/1.00(0.00) | 0.00 | 1.0000
 | South American freshwater (null) | 61 | -3368.11 | 0.03(0.87) 1.00(0.11) 0.03/1.00(0.02) 1.00/1.00(0.00) | |
EGR1 | marine | 62 | -1793.39 | 0.00(0.93) 1.00(0.07) 0.00/1.00(0.00) 1.00/1.00(0.00) | 0.00 | 1.0000
 | marine (null) | 61 | -1793.39 | 0.00(0.93) 1.00(0.07) 0.00/1.00(0.00) 1.00/1.00(0.00) | |
 | Transitional branch | 62 | -1793.39 | 0.00(0.93) 1.00(0.07) 0.00/1.00(0.00) 1.00/1.00(0.00) | 0.00 | 1.0000
 | Transitional branch (null) | 61 | -1793.39 | 0.00(0.93) 1.00(0.07) 0.00/1.00(0.00) 1.00/1.00(0.00) | |
 | South American freshwater | 62 | -1792.15 | 0.00(0.91) 1.00(0.05) 0.00/1.00(0.04) 1.00/1.00(0.00) | 0.00 | 1.0000
 | South American freshwater (null) | 61 | -1792.15 | 0.00(0.91) 1.00(0.05) 0.00/1.00(0.04) 1.00/1.00(0.00) | |
EGR2 | marine | 62 | -1867.65 | 0.05(0.96) 1.00(0.04) 0.05/4.99(0.00) 1.00/4.99(0.00) | 0.00 | 1.0000
 | marine (null) | 61 | -1867.65 | 0.05(0.96) 1.00(0.04) 0.05/1.00(0.00) 1.00/1.00(0.00) | |
 | Transitional branch | 62 | -1867.65 | 0.05(0.89) 1.00(0.03) 0.05/2.06(0.08) 1.00/2.06(0.00) | 0.00 | 1.0000
 | Transitional branch (null) | 61 | -1867.65 | 0.05(0.95) 1.00(0.04) 0.05/1.00(0.01) 1.00/1.00(0.00) | |
 | South American freshwater | 62 | -1867.65 | 0.05(0.96) 1.00(0.04) 0.05/1.00(0.00) 1.00/1.00(0.00) | 0.00 | 1.0000
 | South American freshwater (null) | 61 | -1867.65 | 0.05(0.96) 1.00(0.04) 0.05/1.00(0.00) 1.00/1.00(0.00) | |

Note: lnL, ln likelihood; LRT, likelihood ratio test result

Table S3.9. Clade model analyses (PAML) of the control gene dataset using the Bayesian species tree

Gene | Model / foreground | np | lnL | Parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | Null | LRT | p value
Rh1 | M2aREL (null) | 62 | -2577.2 | 0.03(0.87) 1.00(0.11) 8.95(0.02) | | |
 | M3 (null) | 63 | -2576.12 | 0.05(0.90) 1.61(0.09) 12.70(0.01) | | |
 | CmC / Marine lineages | 63 | -2576.63 | 0.03(0.87) 1.00(0.11) 5.91/9.92(0.02) | M2aREL | 1.15 | 0.2843
 | CmD / Marine lineages | 64 | -2570.42 | 0.03(0.86) 5.73(0.03) 2.27/0.39(0.12) | M3 | 11.41 | 0.0007
 | CmC / Transitional branch | 63 | -2577.19 | 0.03(0.87) 1.00(0.11) 8.89/10.25(0.02) | M2aREL | 0.02 | 0.8846
 | CmD / Transitional branch | 64 | -2569.00 | 0.03(0.85) 7.43(0.02) 0.68/11.88(0.13) | M3 | 14.25 | 0.0002
 | CmC / Freshwater clade | 63 | -2576.48 | 0.03(0.87) 1.00(0.11) 9.78/5.32(0.02) | M2aREL | 1.43 | 0.2318
 | CmD / Freshwater clade | 64 | -2575.34 | 0.04(0.90) 9.59(0.01) 1.18/2.11(0.09) | M3 | 1.56 | 0.2111
Rag1 | M2aREL (null) | 62 | -3367.07 | 0.04(0.89) 1.00(0.08) 2.08(0.02) | | |
 | M3 (null) | 63 | -3367.04 | 0.04(0.89) 0.80(0.08) 1.89(0.04) | | |
 | CmC / Marine lineages | 63 | -3364.94 | 0.04(0.89) 1.00(0.10) 0.00/3.55(0.01) | M2aREL | 4.26 | 0.0389
 | CmD / Marine lineages | 64 | -3364.92 | 0.04(0.89) 1.04(0.09) 0.00/3.55(0.01) | M3 | 4.24 | 0.0396
 | CmC / Transitional branch | 63 | -3367.03 | 0.04(0.90) 1.00(0.08) 2.08/0.00(0.02) | M2aREL | 0.08 | 0.7740
 | CmD / Transitional branch | 64 | -3366.56 | 0.09(0.48) 1.40(0.09) 0.00/0.74(0.43) | M3 | 0.96 | 0.3269
 | CmC / Freshwater clade | 63 | -3365.07 | 0.04(0.89) 1.00(0.10) 3.55/0.00(0.01) | M2aREL | 4.01 | 0.0453
 | CmD / Freshwater clade | 64 | -3365.05 | 0.04(0.89) 1.05(0.09) 3.56/0.00(0.01) | M3 | 3.98 | 0.0460
EGR1 | M2aREL (null) | 62 | -1792.98 | 0.00(0.91) 1.00(0.00) 0.69(0.09) | | |
 | M3 (null) | 63 | -1792.98 | 0.00(0.20) 0.00(0.72) 0.69(0.09) | | |
 | CmC / Marine lineages | 63 | -1792.03 | 0.00(0.00) 1.00(0.05) 0.03/0.00(0.95) | M2aREL | 1.89 | 0.1696
 | CmD / Marine lineages | 64 | -1791.97 | 0.00(0.00) 0.85(0.06) 0.03/0.00(0.94) | M3 | 2.04 | 0.1527
 | CmC / Transitional branch | 63 | -1792.86 | 0.00(0.91) 1.00(0.00) 0.70/0.00(0.09) | M2aREL | 0.24 | 0.6223
 | CmD / Transitional branch | 64 | -1792.86 | 0.00(0.49) 0.00(0.42) 0.70/0.00(0.09) | M3 | 0.26 | 0.6068
 | CmC / Freshwater clade | 63 | -1791.96 | 0.00(0.00) 1.00(0.05) 0.00/0.04(0.95) | M2aREL | 2.04 | 0.1534
 | CmD / Freshwater clade | 64 | -1791.89 | 0.00(0.00) 0.85(0.06) 0.00/0.03(0.94) | M3 | 2.19 | 0.1390
EGR2 | M2aREL (null) | 62 | -1866.40 | 0.06(0.99) 1.00(0.00) 4.01(0.01) | | |
 | M3 (null) | 63 | -1866.40 | 0.06(0.99) 4.01(0.01) 4.01(0.00) | | |
 | CmC / Marine lineages | 63 | -1866.19 | 0.06(0.99) 1.00(0.00) 6.00/3.12(0.01) | M2aREL | 0.42 | 0.5185
 | CmD / Marine lineages | 64 | -1866.19 | 0.06(0.40) 0.06(0.59) 6.00/3.12(0.01) | M3 | 0.42 | 0.5185
 | CmC / Transitional branch | 63 | -1866.40 | 0.06(0.99) 1.00(0.00) 4.01/4.49(0.01) | M2aREL | 0.00 | 1.0000
 | CmD / Transitional branch | 64 | -1866.40 | 0.06(0.92) 0.06(0.07) 4.01/3.14(0.01) | M3 | 0.00 | 1.0000
 | CmC / Freshwater clade | 63 | -1866.19 | 0.06(0.99) 1.00(0.00) 3.12/6.00(0.01) | M2aREL | 0.42 | 0.5185
 | CmD / Freshwater clade | 64 | -1866.19 | 0.06(0.31) 0.06(0.68) 3.12/6.00(0.01) | M3 | 0.42 | 0.5185

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; CmC, Clade model C; CmD, Clade model D

Table S3.10. Clade model analyses (PAML) of the New World Clade rhodopsin dataset with and without highly positively selected sites removed using the Bayesian species tree

Model | np | lnL | Parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | Null | LRT | p value

Full rhodopsin dataset
M3 | 96 | -3326.68 | 0.03(0.86) 1.37(0.13) 13.34(0.01) | | |
M3 - 4 site classes | 98 | -3315.31 | 0.00(0.74) 0.40(0.20) 2.63(0.05) 13.64(0.01) | M3 | 22.74 | < 0.0001
M2aREL | 95 | -3328.05 | 0.02(0.84) 1.00F(0.14) 8.99(0.01) | | |
CmC | 96 | -3328.03 | 0.02(0.84) 1.00F(0.14) 8.94/10.95(0.01) | M2a | 0.04 | 0.8415
CmD | 97 | -3317.25 | 0.00(0.77) 4.12(0.05) 0.39/6.91(0.19) | M3 | 18.86 | < 0.0001
 | | | | CmC | 21.56 | < 0.0001

Rhodopsin dataset with sites 165 and 214 removed (sites in 4th site class M3 four site classes)
M3 | 96 | -3145.94 | 0.00(0.75) 0.40(0.20) 2.63(0.05) | | |
M3 - 4 site classes | 98 | -3145.94 | 0.00(0.75) 0.40(0.20) 2.63(0.05) 3.82(0.00) | M3 | 0.00 | 1.0000
M2aREL | 95 | -3150.45 | 0.02(0.85) 1.00F(0.11) 3.05(0.04) | | |
CmC | 96 | -3145.42 | 0.01(0.81) 1.00F(0.11) 0.24/14.76(0.08) | M2a | 10.06 | 0.0015
CmD | 97 | -3131.94 | 0.00(0.75) 2.76(0.05) 0.34/6.32(0.20) | M3 | 28.04 | < 0.0001

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; CmC, Clade model C; CmD, Clade model D

Table S3.11. Substitutions on transitional branch and the frequency of their occurrence on other branches in the tree

Substitution on transitional branch (WAG)  WAG (PPm|PPf)  Dayhoff (PPm|PPf)  JTT (PPm|PPf)  M0 (PPm|PPf)  Number of substitutions at the same site  Number of identical substitutions
S33N  1|1  1|1  1|1  0.999|0.998  15  13
G39A  1|0.794  1|0.792  1|0.794  1|1(G)  2  2
F50L  1|0.901  1|0.797  1|0.924  1(L)|1  6  1
F52L  1|0.999  1|0.999  1|0.999  1|0.992(F)  3  1
L63I  1|1  1|1  1|1  1|0.999  4  3
L119F  1|1  1|1  1|1  0.999|0.997  1  1
E122I  1|1  1|1  1|1  1|0.994  1  1
S124A  1|0.999  1|0.999  1|0.999  1|0.999  3  3
G158A  1|1  1|1  1|1  1|0.999  14  8
F159L  1|1  1|1  1|1  1|1  3  2
S165V  0.996|0.805  0.995|0.901  0.996|0.801  0.363|0.605(F)  14  1
V169G  1|1  1|1  1|1  1|1  17  1
V173I  1|0.999  1|0.999  1|0.999  1|0.995  7  2
V209T  1|1  1|1  1|1  0.851|0.997  21  3
V214T  1|1  1|1  1|1  0.85|0.998  33  4
V218I  1|0.999  1|0.999  1|0.999  0.929|0.999  16  12
V259I  1|0.999  1|0.999  1|0.999  0.93|0.997  13  6
F261Y  1|0.999  1|0.999  1|1  1|0.999  1  1
E282D  1|1  1|1  1|1  1|0.999  2  1
M304A  0.998|1  0.997|1  0.998|1  0.999|0.998  6  4

Note: PPm, best supported amino acid identity in marine ancestor; PPf, best supported amino acid identity in freshwater ancestor. If the identity differs from the identity according to the WAG reconstruction, the alternative amino acid is shown in parentheses.

3.9.5. Supplementary figures

Figure S3.1. Bayesian species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci. Posterior probability values displayed at nodes. Marine lineages in blue, freshwater lineages in yellow, transitional branches shown as purple arrows, and outgroup taxa shown in grey. Green and pink branches show the two possible sister groups to the South American freshwater clade. Branches proportional to number of substitutions per site.


Figure S3.2. Maximum likelihood species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci. Bootstrap support values from 1000 replicate analyses displayed at nodes. Marine lineages in blue, freshwater lineages in yellow, transitional branches shown as purple arrows, and outgroup taxa shown in grey. Green and pink branches show the two possible sister groups to the South American freshwater clade. Branches proportional to number of substitutions per site.


Figure S3.3. Maximum likelihood rhodopsin gene tree. Bootstrap support values from 1000 replicate analyses displayed at nodes. Marine lineages in blue, freshwater lineages in yellow, transitional branches shown as purple arrows, and outgroup taxa shown in grey. Green and pink branches show the two possible sister groups to the South American freshwater clade. Branches proportional to number of substitutions per site.


Figure S3.4. Maximum likelihood species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci but with sites 165 and 214 removed from the rhodopsin dataset. Bootstrap support values from 1000 replicate analyses displayed at nodes. Marine lineages in blue, freshwater lineages in yellow, transitional branches shown as purple arrows, and outgroup taxa shown in grey. Green and pink branches show the two possible sister groups to the South American freshwater clade. Branches proportional to number of substitutions per site.


Figure S3.5. Maximum likelihood species tree reconstructed using a concatenated alignment of four nuclear loci and two mitochondrial loci but with the first and second codon positions removed from the rhodopsin dataset. Bootstrap support values from 1000 replicate analyses displayed at nodes. Marine lineages in blue, freshwater lineages in yellow, transitional branches shown as purple arrows, and outgroup taxa shown in grey. Green and pink branches show the two possible sister groups to the South American freshwater clade. Branches proportional to number of substitutions per site.


Figure S3.6. Bayesian phylogeny with amino acid branch lengths. Branch lengths estimated in PAML using the WAG substitution matrix and labelled on each branch.

3.9.6. Supplementary references

Bielawski JP, Yang Z. 2004. A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution. J. Mol. Evol. 59:1–12.

Lanfear R, Frandsen PB, Wright AM, Senfeld T, Calcott B. 2016. PartitionFinder 2: New Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological Phylogenetic Analyses. Mol. Biol. Evol. 34:772–773.

Lo P-C, Liu S-H, Chao NL, Nunoo FKE, Mok H-K, Chen W-J. 2015. A multi-gene dataset reveals a tropical New World origin and Early Miocene diversification of croakers (Perciformes: Sciaenidae). Mol. Phylogenet. Evol. 88:132–143.

Nielsen R, Yang Z. 1998. Likelihood Models for Detecting Positively Selected Amino Acid Sites and Applications to the HIV-1 Envelope Gene. Genetics 148:929–936.

Schott RK, Refvik SP, Hauser FE, López-Fernández H, Chang BSW. 2014. Divergent positive selection in rhodopsin from lake and riverine cichlid fishes. Mol. Biol. Evol. 31:1149–1165.

CHAPTER FOUR: DEPTH-DEPENDENT MOLECULAR EVOLUTION OF RHODOPSIN IN FISHES THAT HAVE MADE EVOLUTIONARY TRANSITIONS FROM MARINE TO FRESHWATER ENVIRONMENTS

Contributors: Alexander Van Nynatten, Nathan Lovejoy, Belinda SW Chang.

Author contributions: AVN, BSWC and NRL designed the study. AVN collected the rhodopsin sequence dataset and analysed the data.

4.1. ABSTRACT

The spectrum of light illuminating underwater environments narrows with depth, becoming disproportionately composed of short wavelengths in marine habitats and long wavelengths in freshwater habitats. We hypothesize that this depth-dependent divergence in light environments places stronger selection pressures on the visual systems of deep-dwelling lineages of fishes making transitions from marine to freshwater environments. To test this, we compared rates of molecular evolution in the dim-light sensitive visual pigment, rhodopsin, using codon models of molecular evolution and amino acid reconstructions. We focus our analyses on three clades of fishes making multiple transitions into freshwater: the surface-dwelling Beloniformes (needlefishes, flyingfishes and halfbeaks), the primarily pelagic Clupeiformes (anchovies, herring and sardines) and the benthopelagic Sciaenids (drums and croakers). We have previously shown that positive selection and substitutions that red shift the spectral sensitivity of rhodopsin accompany transitions into freshwater in both anchovies and croakers. In contrast, similar transitions in Beloniformes were accompanied by neither elevated rates nor adaptive substitutions. We attribute this to the shallow depths inhabited by Beloniformes, which tend to be illuminated by broader spectrum light in both water types, consistent with our depth-dependent hypothesis. However, positive selection is found in rhodopsin irrespective of habitat in Beloniformes, possibly reflecting their dietary diversity. Expanding the sampling of Clupeiformes revealed convergent substitutions at known red-shifting sites, suggesting that, like Sciaenids, these pelagic fishes have adapted to the wavelengths of light characterizing freshwater visual environments, but through less acute bouts of positive selection.

4.2. INTRODUCTION

Ancient invasions of freshwater by ancestrally marine species have left signatures of selection in the genomes of fishes and invertebrates (Jones et al. 2012; Lee 2016). The decreased salinity of freshwater environments subjects freshwater species to different physiological constraints than their marine ancestors (Evans 2008). Substitutions in the promoters and protein-coding regions of genes involved in osmoregulatory functions have been found in marine-derived lineages (Lee et al. 2011; Rahi et al. 2017; Bassham et al. 2018; Willoughby et al. 2018). However, osmoregulation is just one aspect that differs between marine and freshwater environments. The light illuminating freshwater environments is more red-shifted than that in marine environments because of increased concentrations of suspended particulate matter and dissolved organic material (Figure 4.1a) (Lythgoe 1979). Similar to the molecular changes observed in genes involved in osmoregulation, shifts in selective constraint due to the divergence in marine and freshwater light environments have been found in the visual systems of anchovies, croakers, and sticklebacks (Van Nynatten et al. 2015; Marques et al. 2017). These include amino acid substitutions in the opsin protein component of the dim-light specialized visual pigment rhodopsin that red shift its spectral sensitivity to better detect the wavelengths of light most prominent in freshwater environments (see Chapter 3).

Most investigations of spectral tuning in fishes have focussed on changes associated with depth (Bowmaker and Hunt 2006). As depth increases, the visible spectrum narrows until only a small band of light is of detectable intensity (Figure 4.1a). In marine environments, blue light penetrates deepest, which is thought to exert strong selection on the visual pigments of deep-sea fishes to shift their spectral sensitivities to match. Indeed, investigations using extracted pigments, microspectrophotometry, and experimental characterizations of opsin sequences using spectroscopic assays find that deep-sea fishes tend to have blue-shifted visual pigments (Crescitelli 1990; Yokoyama et al. 2008). Studies comparing independent transitions into deep-sea habitats have shown that specific substitutions in the rhodopsin protein shift the spectral sensitivity towards the blue end of the spectrum (Hope et al. 1997). Convergent substitutions have also been observed in deep-sea species that are thought to

optimize the kinetic properties of rhodopsin for dimly lit deep-sea environments (Dungan et al. 2016). An extensive analysis of freshwater fishes using microspectrophotometry by MacNichol and Levine (1979) suggests that a similar depth-dependent shift in spectral sensitivity occurs in freshwater, but towards the red end of the spectrum (Figure 4.1b). However, the molecular basis of shifts in spectral sensitivity in freshwater fishes has not yet been investigated experimentally.

Evolutionary analyses of the molecular mechanisms associated with marine to freshwater transitions, and of the functional properties of rhodopsin for vision in freshwater environments, are limited. These studies are complicated in freshwater fishes because many species have no extant marine sister lineages for comparative sequence analysis (Betancur-R et al. 2015). Lineages that invaded freshwater more recently, known as marine-derived lineages, offer an ideal natural system for studying the molecular evolution of rhodopsin in freshwater environments (Lovejoy et al. 1998). These lineages have proven to be a powerful system for studying visual evolution and have been used to compare rates of molecular evolution in marine and freshwater sister clades of anchovies (Van Nynatten et al. 2015) and to compare the spectral sensitivities of ancestral marine and freshwater croaker rhodopsin sequences resurrected in vitro (Chapter 3). Anchovies and croakers invaded the Amazon basin at roughly the same time during the Miocene (Lo et al. 2015; Bloom and Lovejoy 2017), but selection acting on croaker rhodopsin is much stronger (Chapter 3). Sciaenids are generally deeper dwelling than the predominantly pelagic anchovies (Figure 4.1c and Supplementary table S4.1). Because the divergence in optimal spectral sensitivities for detecting light in marine and freshwater environments widens with depth, we expect the rhodopsin protein of deeper-dwelling fishes, such as Sciaenids, to be under stronger selective pressures (Figure 4.1b). This hypothesis suggests that shallow epipelagic fishes might experience little if any selection pressure for spectral tuning during marine to freshwater transitions.

Beloniformes are an emblematic epipelagic order of fishes that includes halfbeaks, needlefishes and flying fishes (Lovejoy 2004), and comprises multiple lineages that have transitioned from marine to freshwater environments (Lovejoy 2004; Lovejoy et al. 2006) (Figure 4.2). These shallow water specialists, famous for leaping out of the water and even

gliding across its surface, provide an ideal natural system for investigating how depth influences the selection pressures acting on rhodopsin by comparison to the benthopelagic Sciaenidae investigated in Chapter 3 and the primarily pelagic Clupeiformes, an order of fishes comprising anchovies, sardines and herring (Figure 4.1c). Clupeiformes have also invaded freshwater on numerous occasions (Figure 4.2) (Bloom and Lovejoy 2014), including an invasion by South American anchovies where positive selection in rhodopsin was reported (Van Nynatten et al. 2015). These three clades exist across a gradient of depths in marine and freshwater environments (Figure 4.1c and Supplementary table S4.1). They also differ from one another in other aspects of ecology such as diet. Sciaenids are more active predators than anchovies, relying heavily on vision for detecting prey (Deary et al. 2016), whereas Beloniformes display substantial diversity in jaw morphology consistent with the broad array of dietary niches these fishes have established (Lovejoy 2004).

We investigate the extent to which depth influences the selection pressures and substitutions occurring in rhodopsin following transitions into red-shifted freshwater environments. We do so by comparing the ratio of non-synonymous to synonymous substitution rates (dN/dS) in a new dataset of Beloniformes, as well as in expanded datasets of Clupeiformes and Sciaenids. We implement models that allow dN/dS to differ on freshwater branches and on branches representing transitions into freshwater. By comparing these models to null models assuming no variation in dN/dS between habitats, we assess the significance of marine to freshwater transitions for the rate of molecular evolution in rhodopsin. To look for convergence during these transitions, we reconstruct ancestral amino acid sequences and compare the substitutions observed with null estimates assuming that habitat has no influence on amino acid identity. This extended sampling of ecologically distinct fishes making comparable transitions into freshwater provides an ideal natural system for investigating the influence of the red-shifted freshwater environment on visual evolution.

4.3. METHODS

4.3.1. Sequence dataset assembly

A contiguous segment of the rhodopsin gene spanning all seven of rhodopsin’s transmembrane helices was amplified from genomic DNA for 15 Clupeiformes and 56 Beloniformes species (Supplementary table S4.2) using primers and polymerase chain reaction protocols from Chen et al. (2003). Forward and reverse Sanger sequencing reads, sequenced at the Hospital for Sick Children TCAG Sequencing Facility, were aligned and compared using GENEIOUS (Kearse et al. 2012). When possible, sequence data were compared for multiple individuals of the same species. Methods used to generate the sciaenid rhodopsin dataset can be found in Chapter 3. An additional three Clupeiformes rhodopsin sequences were downloaded from GenBank, and anchovy sequences generated in a previous study (Van Nynatten et al. 2015) were added to form a dataset totaling 55 Clupeiformes. The 55 Clupeiformes species contain representatives from five of the six Clupeiformes families, and 12 of the 39 freshwater inhabiting genera, representing freshwater invasions in South America and Africa (Figure 4.2). Our 56 sequence Beloniformes rhodopsin dataset comprises species from four of the five Beloniformes families, excluding ricefishes, and eight of eleven freshwater inhabiting genera invading North America, South America, South East Asia and . Our 114 species Sciaenidae dataset has representatives for all freshwater inhabiting genera. Rhodopsin sequences were aligned using the DECIPHER package in R (Wright 2015). Terminal regions where sequence data were missing in more than 50% of taxa were removed.
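The terminal trimming step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used in the study: the function name `trim_terminal_gaps`, the set of characters treated as missing data, and the toy alignment are all hypothetical; only the 50% threshold is taken from the text.

```python
def trim_terminal_gaps(alignment, max_missing=0.5, missing_chars="-?N"):
    """Trim leading and trailing alignment columns in which more than
    `max_missing` of the sequences have missing data.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    Which characters count as missing is an assumption of this sketch.
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    def too_sparse(col):
        # Fraction of taxa with missing data at this column.
        missing = sum(1 for seq in seqs if seq[col] in missing_chars)
        return missing / len(seqs) > max_missing

    start = 0
    while start < length and too_sparse(start):
        start += 1
    end = length
    while end > start and too_sparse(end - 1):
        end -= 1
    # Internal columns are left untouched; only the two ends are trimmed.
    return {name: alignment[name][start:end] for name in names}


# Hypothetical three-taxon alignment with ragged ends.
toy = {
    "sp1": "--ATGGCC--",
    "sp2": "-AATGGCCA-",
    "sp3": "--ATGGCCAT",
}
trimmed = trim_terminal_gaps(toy)
```

Only the ends are trimmed in this sketch, so gaps inside the retained region (such as the final position of `sp1`) are kept, which matches the wording "terminal regions" in the text.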

4.3.2. Ancestral habitat reconstructions

All three clades of fishes in this study are believed to have a marine ancestry (Bloom and Lovejoy 2017). However, reconstructing the ancestral habitat preference on each clade individually does not reflect this because of early diverging freshwater lineages in Clupeiformes and Beloniformes. To more accurately polarize ancestral habitat, we spliced species trees for each of these clades (Lovejoy 2004; Bloom and Lovejoy 2014) (Chapter 3) onto the best supported phylogeny representing species relationships across Actinopterygians (Betancur-R et al. 2017). We categorize habitat preference using RFishbase (Boettiger et al.

2012). FishBase includes a binary classification for each species indicating whether it is associated with freshwater, brackish or saltwater environments. We categorize a species as freshwater if it is associated with freshwater and not associated with saltwater environments. For any species in our dataset not represented in FishBase, we used the water-type classifications of Lovejoy (2004) and Bloom and Lovejoy (2014). Ancestral habitat preference (marine or freshwater) was reconstructed along the spliced species tree using an implementation of Felsenstein's tree-pruning algorithm in the phytools package in R, assuming equal rates for marine to freshwater and freshwater to marine transitions and branch lengths based on the codon substitution rate estimated using M0 in PAML (Yang 2007; Revell 2011). We also used RFishbase to collect mid-point depth estimates by taking the midpoint of the minimum and maximum depth values for each species in our dataset. When depth estimates were not available, as was the case for most freshwater fishes, we used habitat descriptions, categorizing each species as either epipelagic, pelagic, benthopelagic, or demersal.
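The habitat rule and the mid-point depth estimate can be written out explicitly. The following Python sketch is illustrative only: the function names, the argument names, and the example values are hypothetical and do not correspond to actual FishBase field names, and treating every species that fails the freshwater rule as marine is an assumption based on the binary (marine or freshwater) coding used for the reconstruction.

```python
def classify_habitat(in_freshwater, in_saltwater):
    """Apply the rule from the text: a species is coded as freshwater only
    if it is associated with freshwater and not with saltwater.
    Coding everything else as marine is an assumption of this sketch."""
    if in_freshwater and not in_saltwater:
        return "freshwater"
    return "marine"


def midpoint_depth(min_depth, max_depth):
    """Mid-point of the minimum and maximum recorded depths (metres).
    Returns None when either bound is unavailable, in which case the text
    falls back on categorical habitat descriptions."""
    if min_depth is None or max_depth is None:
        return None
    return (min_depth + max_depth) / 2.0


# Hypothetical records: (freshwater flag, saltwater flag, min depth, max depth)
records = {
    "species_a": (True, False, None, None),   # strictly freshwater, no depth data
    "species_b": (True, True, 0.0, 50.0),     # enters freshwater but also saltwater
    "species_c": (False, True, 10.0, 130.0),  # strictly marine
}
summary = {
    name: (classify_habitat(fresh, salt), midpoint_depth(dmin, dmax))
    for name, (fresh, salt, dmin, dmax) in records.items()
}
```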

4.3.3. Molecular evolutionary analyses

We used maximum likelihood codon-based models implemented in PAML to estimate the ratio of non-synonymous to synonymous substitution rates (dN/dS) in rhodopsin. Analyses were performed separately on the Clupeiformes, Beloniformes and Sciaenidae rhodopsin datasets to isolate inferences to each clade of fishes. Analyses of the Sciaenidae rhodopsin dataset have been previously reported in Chapter 3. We tested for pervasive positive selection in each of these datasets by comparing the fit of models incorporating a positively selected site class with nested null models using likelihood ratio tests (LRT) against a χ2 distribution with degrees of freedom equal to the difference in the number of parameters between the models (Yang 1998). We also investigated episodic positive selection and episodic shifts in selection pressures associated with the transition from marine to freshwater environments using the branch-site and clade models in PAML (Yang and Swanson 2002; Bielawski and Yang 2004). For these tests we selected each transitional branch and freshwater clade as the sole foreground lineage in the test. For the branch-site tests this is the only branch or set of branches allowed to have dN/dS estimates greater than one. We compared the branch-site model to its nested null where the foreground partition is prevented from exceeding one, assuming no positive selection in the dataset (Yang

and Swanson 2002). Clade models C and D are more flexible, and foreground lineages are not forced to have dN/dS estimates greater than one (Bielawski and Yang 2004). Models C and D were compared to the null models M2aREL and M3, respectively, which are restricted versions of the more parameter-rich clade models that assume uniform selection across the entire dataset (Bielawski and Yang 2004; Weadick and Chang 2011). For random-sites and clade model analyses, positively selected sites were inferred using Bayes Empirical Bayes analysis (Yang 2005). Ancestral reconstructions were also performed in PAML using the Actinopterygian tree to more accurately polarize ancestral amino acid states. Models M0 and M3 were used for codon reconstructions as they impose no constraints on particular partitions of the phylogeny.
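The likelihood ratio tests used throughout these analyses reduce to a short calculation: twice the difference in log-likelihood between the alternative and null models, compared against a χ2 distribution with degrees of freedom equal to the difference in parameter counts. The Python sketch below is an illustration rather than the procedure run in the study; the function names are hypothetical, and the χ2 survival function is written out with the standard library for integer degrees of freedom. The example values are the M1a and M2a rows of Table S3.3.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df,
    using the closed-form series for even and odd degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    tail = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


def likelihood_ratio_test(lnl_null, lnl_alt, np_null, np_alt):
    """Return (LRT statistic, degrees of freedom, p value) for nested models."""
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    df = np_alt - np_null
    return statistic, df, chi2_sf(statistic, df)


# M2a (alternative) versus M1a (null), values from Table S3.3.
stat, df, p = likelihood_ratio_test(-6586.36, -6490.44, 228, 230)
```

With these inputs the statistic is about 191.8 on two degrees of freedom, in line with the LRT value tabulated for M2a versus M1a. The branch-site test is often evaluated against a mixture distribution rather than a plain χ2; that refinement is not shown here.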

The two copies of Aplodinotus grunniens rhodopsin were amplified using the same protocol and primers as described above. The bands were separated using gel electrophoresis, excised and extracted using a QIAquick Gel Extraction Kit (Qiagen Inc, Santa Clara CA, USA). Purified PCR products were sequenced by Sanger sequencing at the Hospital for Sick Children TCAG Sequencing Facility. Chromatograms were compared for forward and reverse reads in Geneious (Kearse et al. 2012) and only high-quality sequences were kept. Both sequences were aligned to the Sciaenidae rhodopsin dataset using MUSCLE (Edgar 2004). A rhodopsin gene tree was reconstructed using IQ-TREE (Nguyen et al. 2015) with the substitution model set to the Kimura-2-parameter model with an invariant site class and four gamma rate categories, inferred to be the best-fitting model using the Bayesian Information Criterion in ModelTest. For tests of molecular evolution, regions where gaps are found in the shorter fragment were removed from the alignment. Two-ratio models in PAML, which assume uniform rates across sites but allow specific branches to have different dN/dS estimates, were compared to simpler models that either assume the same dN/dS across all branches in the tree or constrain the foreground dN/dS to equal one, as expected under neutral evolution.

Convergent substitutions on transitional and freshwater branches were inferred using ancestral amino acid reconstructions in PAML employing the Dayhoff, JTT, and WAG empirical amino acid matrices (Yang et al. 1995). The best supported amino acid identities for ancestral nodes were used to count substitutions on marine, freshwater and transitional branches using custom scripts and the Phytools package in R (Revell 2011). The frequency of convergent substitutions was compared to null estimates generated by simulating the evolution of 100 sequences of the same length as our rhodopsin fragment on a tree with the same branch lengths using EVOLVER (Yang 2007). Amino acid substitution rates were based on the WAG matrix, the best fitting for our rhodopsin dataset, and amino acid frequencies were based on estimates from ancestral reconstructions of rhodopsin using the WAG matrix in PAML. We chose to focus our analyses on convergent substitutions that result in more substantial changes in amino acid properties as these are most likely to impact rhodopsin structure and function (Luk et al. 2016). We define conservative substitutions as substitutions to identities within the same functional class: non-polar alkyl (G, A, V, L, I, M, P), non-polar aromatic (W, F), polar uncharged (Y, S, T, N, Q, C), polar basic (H, K, R), and polar acidic (D, E). The statistical significance of convergent substitutions was tested using Pagel’s discrete test, implemented in the R package corHMM. The null model used is a modified version of the “SYM” model packaged in corHMM, assuming the rate of an amino acid substitution from one identity to another is identical in both habitat backgrounds (marine/freshwater) (Pagel 1994). These analyses compare substitutions across all 364 species in Supplementary figure S4.1, including species used to reconstruct the backbone for habitat reconstructions on the three focal clades analyzed using PAML. The locations of positively selected and convergent sites were mapped onto the dark state (1U19) rhodopsin crystal structure (Okada et al. 2004), and distances to the chromophore in the dark and Meta-II state (3PQR) (Choe et al. 2011) of rhodopsin were measured in UCSF Chimera (Pettersen et al. 2004).
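The distinction between conservative and non-conservative substitutions can be made concrete with a short Python sketch. The five functional classes are taken directly from the definition in this section; everything else (the names `AA_CLASSES`, `is_conservative` and `count_nonconservative`, and the toy input in the usage lines) is hypothetical and is not the custom R code used for the analyses.

```python
# Functional classes of amino acids as defined in the methods text.
AA_CLASSES = {
    "non-polar alkyl": "GAVLIMP",
    "non-polar aromatic": "WF",
    "polar uncharged": "YSTNQC",
    "polar basic": "HKR",
    "polar acidic": "DE",
}
# Reverse lookup: one-letter amino acid code -> class name.
AA_TO_CLASS = {aa: name for name, members in AA_CLASSES.items() for aa in members}

def is_conservative(ancestral, derived):
    """True if both residues belong to the same functional class."""
    return AA_TO_CLASS[ancestral] == AA_TO_CLASS[derived]

def count_nonconservative(changes):
    """Count non-conservative substitutions per habitat category.

    `changes` is an iterable of (branch_habitat, ancestral, derived) tuples
    for a single site, one tuple per branch of the tree.
    """
    counts = {}
    for habitat, ancestral, derived in changes:
        if ancestral != derived and not is_conservative(ancestral, derived):
            counts[habitat] = counts.get(habitat, 0) + 1
    return counts

# Toy input for illustration only (not data from this study).
toy_site = [
    ("freshwater", "F", "Y"),
    ("transitional", "F", "Y"),
    ("marine", "I", "V"),      # conservative, so not counted
    ("freshwater", "F", "F"),  # no change on this branch
]
print(count_nonconservative(toy_site))
```

Under this scheme a phenylalanine to tyrosine change is non-conservative (aromatic to polar uncharged), whereas isoleucine to valine or leucine stays within the non-polar alkyl class.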

4.4. RESULTS

4.4.1. Divergent selection more frequently associated with freshwater transitions in Clupeiformes than Beloniformes

We used branch-sites and clade models in PAML to test for divergent selection on branches representing transitions from marine to freshwater environments and on freshwater clades. We found that dN/dS in rhodopsin is significantly different from null estimates in at least one of the branch-sites, Clade model C or Clade model D tests for every transition from marine to freshwater environments in Clupeiformes (Figure 4.2) (Table 4.1). For half of the freshwater transition events in the Clupeiformes clade, dN/dS was significantly elevated compared to background rates in more than one model (Table 4.1) (Supplementary table S4.3). Transition events 1 and 4, both monotypic freshwater lineages in our dataset, representing species invading Africa and South America respectively, were not significant when using clade models, but both show significant positive selection under the branch-sites test (Table 4.1). Beloniformes have made a comparable number of transitions into freshwater as Clupeiformes (Figure 4.2), but divergent selection was observed in only two freshwater lineages, the South East Asian clade of halfbeaks and a monotypic lineage invading Australia (Table 4.1) (Supplementary table S4.4). Clade models disagreed on the significance of the divergent site class assigned to the Australian freshwater invasion, with the more flexible model, CmD, suggesting no difference in dN/dS on this branch (Supplementary table S4.4).

4.4.2. Non-conservative substitutions positively selected on transitional and freshwater lineages

To determine which sites in rhodopsin are under positive selection on transitional branches and freshwater clades we used Bayes Empirical Bayes (BEB) analyses. Positively selected substitutions are observed at highly conserved sites on branches identified as under positive selection using the branch-sites test. A substitution from a cysteine to an alanine was observed at site 185 in Denticeps clupeoides, the monotypic freshwater African lineage of Clupeiformes. This is the only branch in our dataset where this substitution is observed, and an alanine at this position is rare among actinopterygians in general. Site 185 is within 10 Å of the chromophore but points away from the binding pocket and is not expected to change the electrostatic environment or alter spectral sensitivity (Figure 4.3b). The C185A substitution observed on this branch has been shown to decrease protein stability by creating a more open binding pocket in bovine rhodopsin (McKibbin et al. 2007). In Rhinosardinia, a South American freshwater clupeiform lineage, the C264A substitution is the only substitution included in the positively selected site class. This site is also highly conserved in rhodopsin and is part of a network of sites on helices 6 and 7 that form the binding pocket for a conserved water molecule that likely plays a role in the activation process of rhodopsin (Figure 4.3c) (Okada et al. 2002). Positively selected sites in the South East Asian freshwater clade of Beloniformes include sites 113 and 163. The methionine to glycine substitution on the branch leading to the common ancestor of this clade is only observed in this lineage in Beloniformes; in bovine rhodopsin the substituted site forms a hydrogen bond network with sites 206, 211, and 122 (Figure 4.3d). This network is believed to hold helices 3 and 5 together and stabilize the dark state structure (Angel et al. 2009). An I133V substitution is also positively selected within this clade, but variation at this site is more prevalent across species and the functional ramifications are not known (Figure 4.3).

4.4.3. Beloniformes rhodopsin under pervasive positive selection irrespective of habitat

We investigated selection pressures acting on rhodopsin in the three clades of fishes irrespective of transitions from marine to freshwater environments, using the random-sites models in PAML. High dN/dS estimates, significantly supporting the inclusion of a class of positively selected sites, were inferred for the Beloniformes rhodopsin dataset when these models were implemented with no phylogenetic partitioning (M1a and M2a: χ2 = 156.6; df = 2; p < 0.0001, M7 and M8: χ2 = 191.92; df = 2; p < 0.0001, Table 4.2). The dN/dS estimate for the rhodopsin gene of Beloniformes is twice the value estimated for Clupeiformes, and similar to results previously reported for other predatory fishes (Chapter 3) (Schott et al. 2014; Phillips et al. 2015) (Table 4.2). These high dN/dS estimates are a result of multiple sites under positive selection. Six sites have BEB support values for positive selection greater than 0.95 in Beloniformes (Figure 4.4). Only one site exceeds this cut off in Clupeiformes (Figure 4.4), where a class of positively selected sites is also supported (M1a and M2a: χ2 = 44.88; df = 2; p < 0.0001, M7 and M8: χ2 = 42.75; df = 2; p < 0.0001, Table 4.3).

4.4.4. Rhodopsin duplication in North American freshwater drum

Using the same protocols and primers as for other Sciaenids, we amplified two copies of rhodopsin of different lengths from the genomic DNA of the North American freshwater drum, Aplodinotus grunniens. Aligning both copies of Aplodinotus grunniens rhodopsin with the sciaenid rhodopsin dataset revealed one copy was 108 bp shorter due to multiple gaps within the gene. A maximum likelihood gene tree strongly supports that these duplicate rhodopsin copies are sister to one another, suggesting the second copy arose following the divergence of Aplodinotus grunniens (fast band in Supplementary figure S4.2). The dN/dS estimate for the shorter rhodopsin copy was significantly greater than the background estimate (M0 vs. two-ratio: χ2 = 9.30; df = 1; p = 0.0023, Table 4.4), but not significantly different from a model assuming neutral evolution along this branch (two-ratio null vs. two-ratio: χ2 = 0.37; df = 1; p = 0.5430, Table 4.4), strongly suggesting the shorter copy is a pseudogene. The other, full-length fragment clusters with the sequence available for Aplodinotus grunniens rhodopsin on Genbank. The only rhodopsin sequence available for this species on Genbank has numerous polymorphic sites, possibly as a result of contamination during sequencing from the pseudogenized sequence. Replacing the sequence from Genbank used in our analyses in Chapter 3 with the newly generated rhodopsin sequence had little effect on the dN/dS estimates for random-sites, branch-sites and clade models (Supplementary table S4.5).

4.4.5. Convergent substitutions at functionally important sites in rhodopsin

We used ancestral reconstructions of rhodopsin amino acid sequences to identify substitutions occurring more frequently than expected on transitional and freshwater branches. Ancestral reconstructions identify multiple convergent substitutions in rhodopsin on freshwater and transitional branches in Beloniformes, Clupeiformes and Sciaenids. Excluding conservative substitutions (see methods), substitutions are observed on freshwater and transitional branches more frequently than on marine branches at 49 sites in rhodopsin, including substitutions very close to the chromophore (Figure 4.5a). The red-shifting F261Y substitution is observed on three of the five transitional branches in Clupeiformes, an additional three times within the clade of South American freshwater anchovies, and on the branch leading to the South American freshwater clade in Sciaenidae (Figure 4.5b). The seven independent occurrences of this substitution exceed the expected number of convergent substitutions on freshwater and transitional branches based on a null distribution of 100 sequence simulations generated using EVOLVER (Figure 4.5b) (Yang 2007). A tyrosine at site 261 is also more frequently found in freshwater fishes in general, red-shifting rhodopsin’s peak spectral sensitivity by 10 nm (Yokoyama et al. 1995). Convergent substitutions at sites further from the chromophore are also observed following transitions into freshwater (Figure 4.5b). Substitutions at site 213 from a polar to a non-polar residue are observed six times in total, and substitutions at sites 159, 270, and 287 also occur more often than expected from simulations (Figure 4.5b) (Table 4.6).
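The comparison of an observed number of convergent substitutions against a simulated null distribution can be sketched as an empirical one-sided test. This is a minimal sketch under stated assumptions: the function name `empirical_p_value` is hypothetical, the add-one correction is a common convention that is assumed here rather than taken from the thesis, and the simulated counts are placeholders, not the EVOLVER output. Only the observed count of seven F261Y occurrences comes from the text.

```python
def empirical_p_value(observed, simulated):
    """One-sided empirical p value.

    Share of null replicates with a count at least as large as the observed
    count, with an add-one correction so the estimate is never exactly zero.
    """
    n_extreme = sum(1 for s in simulated if s >= observed)
    return (n_extreme + 1) / (len(simulated) + 1)

# Placeholder counts for 100 simulated replicates (NOT results from this study).
simulated_counts = [0] * 55 + [1] * 30 + [2] * 12 + [3] * 3

# Seven independent occurrences of F261Y are reported in the text.
observed_count = 7

p = empirical_p_value(observed_count, simulated_counts)
print(f"empirical p = {p:.4f}")
```

With 100 replicates the smallest attainable value under this convention is 1/101, so a larger number of simulations would be needed to resolve smaller p values.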

Convergent substitutions at sites further from the retinal-binding pocket known to affect rhodopsin structure and function are also observed on freshwater branches. At site 124, both red-shifting and blue-shifting substitutions, S124A and A124S (Castiglione and Chang 2018), are observed multiple times (Figure 4.5b). A substitution from a V to a T at site 271, shown to improve thermal stability (personal communication: Amir Sabouhanian), is observed twice in Clupeiformes (Figure 4.5b). The newly generated sequence for Aplodinotus grunniens allows for further inferences of convergent evolution within Sciaenids, indicating the red-shifting L119F substitution occurs on the branch leading to the functional copy of Aplodinotus grunniens rhodopsin and on the branch leading to the South American freshwater clade (Figure 4.5b). An F212A substitution is observed within 4 Å of the chromophore in the African freshwater Clupeiformes lineage and has been shown to decrease the stability of the active state (Tsukamoto et al. 2010), similar to the effect of the E122I substitution observed in Sciaenids (Chapter 3). We used Pagel’s test (Pagel 1994) to estimate the statistical significance of these substitutions in relation to habitat changes, classifying the substitutions as binary state changes using the same amino acid categories described in the methods and a modified null model assuming equal rates of substitution irrespective of habitat transitions (personal communication: Amir Sabouhanian). Substitutions at sites 119, 212, 213, 261, 270 and 287 were all statistically significant (Table 4.5). However, only site 261 exceeded the cut off following Bonferroni correction for multiple testing.
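The effect of a Bonferroni correction on a set of per-site tests can be illustrated with a short Python sketch. The function name `bonferroni` is hypothetical, the p values below are placeholders chosen only to illustrate the arithmetic (they are not the values in Table 4.5), and the number of tests is assumed to equal the six sites listed, which may differ from the number of tests actually corrected for in this study.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and the sites that pass it.

    `p_values` maps site number -> uncorrected p value. The threshold is
    alpha divided by the number of tests.
    """
    threshold = alpha / len(p_values)
    passing = sorted(site for site, p in p_values.items() if p <= threshold)
    return threshold, passing

# Placeholder p values for the six sites named in the text (NOT Table 4.5).
placeholder_p = {119: 0.030, 212: 0.040, 213: 0.020,
                 261: 0.001, 270: 0.045, 287: 0.010}

threshold, passing = bonferroni(placeholder_p)
print(f"threshold = {threshold:.4f}, sites passing = {passing}")
```

In this toy case every site is below the uncorrected 0.05 level, but only one falls below the corrected threshold, which mirrors the pattern described for site 261.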

Convergent conservative substitutions are also observed on transitional and freshwater branches in the retinal binding pocket. An I189V substitution, 3.62 Å from the chromophore, is observed on two lineages of freshwater Clupeiformes (Supplementary figure S4.3). This same substitution is observed in the freshwater copy of rhodopsin of the mottled eel (Wang et al. 2014). An I112L substitution is also observed on six transitional branches but only once in marine lineages (Supplementary figure S4.3). No substitutions in the retinal binding pocket (<5 Å from the chromophore) are observed on transitional or freshwater branches in Beloniformes. Moreover, convergent substitutions that are more frequent on freshwater and transitional branches are less often observed in Beloniformes than in Clupeiformes and Sciaenidae.

4.5. DISCUSSION

We investigated whether the depth-dependent divergence in the wavelengths of light illuminating marine and freshwater environments altered the selection pressures acting on rhodopsin in species making evolutionary transitions between the two water types. We compared the molecular evolution of the rhodopsin gene in three teleost clades, representing an array of depths, by estimating dN/dS using codon-based models and reconstructing ancestral amino acid sequences. As expected, differences in the rate of molecular evolution are infrequently observed in Beloniformes, the most shallow-dwelling lineage, where the attenuation of light in both marine and freshwater environments is negligible. However, positive selection is observed when analyses are not partitioned by habitat, similar to results for other predatory fishes (Schott et al. 2014; Phillips et al. 2015). In contrast, in the pelagic and primarily filter-feeding Clupeiformes we observe lower average dN/dS but a statistically significant elevation in rates of non-synonymous substitutions along multiple marine to freshwater transitions, as well as convergent substitutions known to red shift the spectral sensitivity of rhodopsin. This is similar to the trend observed in the deepest-dwelling lineage, Sciaenidae, but the number of sites included in the positively selected site class on transitional branches is less substantial in Clupeiformes. Sciaenids have also invaded freshwater on two other occasions, but the rhodopsin sequence available for the North American invading species, Aplodinotus grunniens, had previously hindered comparative analyses because of multiple polymorphic sites. We observe a pseudogenized copy of rhodopsin in this species that might have contributed to this noise, and the improved resolution from sequences collected in this study supports convergent evolution at red-shifting sites. Substitutions expected to increase the rate of retinal release were found in multiple freshwater lineages of Clupeiformes. This matches the kinetic shift identified in marine and freshwater croaker pigments experimentally characterized in Chapter 3 and represents convergent functional adaptation across these distantly related marine-derived lineages. We do not observe amino acid substitutions expected to alter the spectral or non-spectral properties of rhodopsin in marine-derived Beloniformes. This provides further evidence that spectral and kinetic adaptations to deeper, light-limited underwater visual environments differ between marine and freshwater species and that these changes are under strong convergent selection pressures in fishes transitioning between these two environments.

4.5.1. Stronger selective pressure acting on rhodopsin in deeper dwelling fishes

The depth-dependent divergence in the wavelengths of light illuminating marine and freshwater environments is evident in the rates of molecular evolution in the rhodopsin gene of fishes making this transition. Signatures of selection suggesting adaptive evolution in rhodopsin during the transition from marine to freshwater environments are much more frequent in Sciaenidae and Clupeiformes than in shallow-water Beloniformes. This is also observed in the number and type of amino acid substitutions occurring during marine to freshwater transitions. This trend continues when we compare the substitutions that occur in the deepest-dwelling lineage, the Sciaenids, to those in Clupeiformes. The number of sites included in the positively selected site class is greater in the South American transition event in Sciaenidae than in any transitional lineage in the Clupeiformes clade. In addition, while we do observe convergent substitutions in Sciaenidae and Clupeiformes, the combined effect of the numerous sites included in the positively selected site class along the transitional lineage into South America in Sciaenids is expected to have the most significant effect on the spectral and non-spectral properties of rhodopsin. To date, many investigations of visual evolution in fishes have focussed solely on differences in the depth species inhabit (Bowmaker and Hunt 2006) or on the optical properties of different visual environments (Schott et al. 2014; Hauser et al. 2017), but do not account for the compounding nature of these two environmental factors. These results suggest that more attention must be given to differences in microhabitat in addition to the focal environmental transitions typically investigated in these studies. Recent advances in population genetics support this idea, suggesting that adaptation is not binary and that selection gradients should be considered in evolutionary analyses (Riesch et al. 2018). As analyses of visual evolution expand to include variation in opsin expression (Torres-Dowdall et al. 2017) and the molecular evolution of non-spectral properties of rhodopsin (Hauser et al. 2017; Castiglione et al. 2018), a more fine-tuned description of habitat might be necessary to discern what aspects of these complex phenotypes are adaptive in association with specific environmental pressures.

Selection gradients might also influence inferences of convergent evolution. We observe multiple convergent substitutions occurring on the transitional branches and freshwater-inhabiting lineages in Sciaenidae and Clupeiformes. Some of these substitutions are at sites that are known to red shift the spectral sensitivity of rhodopsin, an obvious adaptive advantage for detecting light in freshwater environments (Lythgoe 1979). Other convergent substitutions have been shown to influence rhodopsin function through non-spectral adaptation (Castiglione et al. 2017) (Chapter 3). However, some of these substitutions are not statistically significant and our inferences of their utility in freshwater rely on the robust functional characterization of rhodopsin. Few proteins are as well understood as rhodopsin (Ernst et al. 2014), so analyses of other proteins would require stronger support from tests of convergence to establish the evolutionary importance of a specific substitution. The probability of observing convergent evolution is known to be higher than expected in recently diverged species due to standing genetic variation (Thompson et al. 2018) and lower than expected in distantly related species due to epistatic changes altering the functional effects of a given substitution (Zou and Zhang 2015). These analyses may also be underpowered when multiple lineages do not experience the same expected difference in selection pressure associated with a change in the environment. These lineages decrease the proportion of branches where convergent substitutions are observed, not because substitutions on other branches are insignificant but because differences in selection pressures experienced following a transition event in one species are not the same across all species making the transition. Understanding how these microhabitat differences influence inferences of convergent evolution might have an equally important role in improving the power of these studies as understanding how variation in selection pressures alters the rates of molecular evolution across a protein.

4.5.2. Pseudogenized copy of rhodopsin in Aplodinotus grunniens

We found a pseudogenized copy of rhodopsin in the North American freshwater drum. Opsin duplication events have been attributed to improved visual proficiencies in other species (Carleton et al. 2016), as well as in vertebrates in general (Lamb 2013). However, with a few exceptions (Morrow et al. 2017), most species have only one rhodopsin expressed in the retina despite many duplication events in the evolutionary history of teleosts (Nakamura et al. 2017). Unlike the cone system, the dim-light visual system mediated by rhodopsin and rods is not set up to integrate or interpret differences in intensities from receptors with different peak spectral sensitivities (Gregg et al. 2012). The only advantage of a duplicate rhodopsin gene given the limitations of the rod pathway would be to increase dosage, which is already very high for rhodopsin. However, the sequence collected in this study did differ at polymorphic sites when compared to the only Aplodinotus rhodopsin sequence on Genbank. The functional sequence copy generated in this study revealed an L119F substitution convergent with that on the South American transitional lineage, a substitution that causes a red shift in rhodopsin’s spectral sensitivity (Castiglione and Chang 2018). Whether or not this substitution is ubiquitous across Aplodinotus’ distribution needs further investigation. Aplodinotus has the largest range of any freshwater fish, spanning multiple different visual environments (Barney 1926), and spectral tuning substitutions may not be uniform across the distribution.

4.5.3. Rhodopsin evolves under positive selection in predatory fishes

We observed a trend in which active predators have higher average dN/dS estimates for rhodopsin. Vision is critical for most active predators (Potier et al. 2018), and the importance of diet in evolution is a cornerstone of evolutionary theory (Darwin 1859). Eyes evolved in many species independently during the Cambrian, likely as a direct result of a transition to more active strategies or for avoiding predation by other visual species (Land and Nilsson 2012). Sciaenids and Beloniformes, mostly predatory fishes, have dN/dS estimates much higher than the predominantly filter-feeding Clupeiformes. Beloniformes have a diverse set of prey types and morphological adaptations for each (Lovejoy 2004). A similar association between high dN/dS estimates and a diverse diet is also observed in cichlids and croakers (Chapter 3) (López-Fernández et al. 2012; Schott et al. 2014). This trend could also help explain the significant support for positive selection observed in South American freshwater anchovies (Van Nynatten et al. 2015) and South East Asian halfbeaks, both clades that have undergone dynamic adaptation for food acquisition in their new habitats (Lovejoy 2004). In addition to prey type shaping morphological adaptation in mouth structure in fishes (Wainwright and Longo 2017), the different colours, activity patterns and habitat preferences associated with a prey type can alter the visual system. In anchovies, specialized cone photoreceptors allow for the detection of polarized light, improving the detection of planktonic prey (Novales Flamarique 2017). Likewise, in other species UV-sensitive pigments are frequent in the larval stages, improving the contrast between the background light and small invertebrates (Flamarique 2013). Alterations to rhodopsin kinetics might also be important, as suggested by the novel heating organs observed in swordfishes (Fritsches et al. 2005). Studies comparing feeding strategies and visual evolution have focussed on differences in eye morphology or cone opsins, but these results suggest that rhodopsin may be equally important, especially in nocturnal species.

4.5.4. Convergent functional shifts in the rhodopsin protein of deeper-dwelling freshwater fishes

We identify red-shifting substitutions in the rhodopsin gene of Clupeiformes and Sciaenids, consistent with a shift towards the wavelengths of light most prevalent in freshwater environments. In many cases, the same red-shifting substitutions are observed in independent freshwater transitions. The same amino acid identities at these sites are also observed in other endemic freshwater lineages, suggesting the same molecular mechanisms are used to adapt rhodopsin to freshwater environments across divergent lineages. We do not observe any substitutions expected to be red shifting associated with the transition into freshwater for Beloniformes, consistent with our depth-dependent hypothesis. Indeed, both marine and freshwater surface-dwelling fishes have λmax estimates centred around 500 nm (Figure 4.1b). The spectral sensitivity of rhodopsin may be restricted from becoming much more red shifted because of the decreased thermal stability associated with red-shifted pigments (Ala-Laurila et al. 2004). This could make red-shifted rhodopsins only beneficial when other wavelengths of light are no longer present, as is the case for deeper-dwelling freshwater fishes (MacNichol and Levine 1979). The low signal to noise ratio and the mostly uniform and warm temperature with depth typical of large tropical rivers likely compound this non-spectral selective pressure acting on rhodopsin, and might necessitate substitutions at secondary sites for improving thermal stability. Previous studies have suggested that an S299A substitution is critical for maintaining thermal stability in frogs using the red-shifted and less stable A2 chromophore (Fyhrquist et al. 1998). All species with the red-shifting tyrosine at site 261 in this dataset have an alanine at site 299. In most cases this substitution occurs simultaneously with the substitution at site 261, but in some cases it precedes it. We also observe a V271T substitution along two freshwater lineages. This same substitution has been functionally characterized in vitro and demonstrated to increase thermal stability in deep-sea fishes (personal communication: Amir Sabouhanian). This suggests that some of the many other substitutions observed along these branches, some positively selected and some convergent, might have non-spectral epistatic effects mitigating some of the less optimal functional ramifications of a red-shifted rhodopsin pigment.

We also observe substitutions occurring along marine to freshwater transitions that have been shown to increase the rate at which rhodopsin releases its retinal chromophore, controlling the rate at which it can be regenerated following light activation. The ecological relevance of the duration rhodopsin stays in its active Meta-II state, where the chromophore has isomerized but remains within the binding pocket, is still unclear. Because arrestin and rhodopsin kinase deactivate rhodopsin signalling almost immediately after activation, a longer Meta-II state would seem to have no impact on the amplification of signal. However, this mechanism has been suggested in other studies, and substitutions lengthening the Meta-II state have arisen convergently in species inhabiting dim-light environments (Sugawara et al. 2010; Castiglione et al. 2017; Hauser et al. 2017). Conversely, in the freshwater lineages investigated in this study we find evidence for more rapid retinal release and a shorter Meta-II state. This is consistent with the experimental results collected from expressed ancestral marine and freshwater pigments bounding the Sciaenid transition into freshwater in Chapter 3. Faster retinal release decreases the time it takes to restore the sensitivity of rhodopsin that has been bleached in brightly lit environments. This is known as dark adaptation and may be critical for vision in freshwater environments because of the narrow band between bright- and dim-light regimes due to the rapid attenuation of light (Crampton 2007). Species traversing down this narrow interface would experience a dramatic decrease in light intensity, and more rapid adaptation to this change would limit the time spent with insensitive bleached rods. In Chapter 3, we show this kinetic shift is mostly mediated by a substitution to a more cone-like residue at site 122. In this study, we see other substitutions in independent freshwater invasions that might represent convergent functional adaptation (Castiglione et al. 2018).

Unlike the convergent substitutions observed at spectral tuning sites, the substitutions that alter rhodopsin kinetics are found throughout the rhodopsin protein. Site 122 is very near the chromophore and likely has the greatest impact on the kinetic properties of rhodopsin through direct interactions with the chromophore. Sites 185 and 212 are also close to the chromophore but face away from it and instead contribute more to maintaining a closed binding pocket (McKibbin et al. 2007; Tsukamoto et al. 2010). Substitutions changing the kinetic properties of rhodopsin have also been observed in other domains (Sugawara et al. 2010; Castiglione et al. 2017; Hauser et al. 2017), in some cases with different substitutions amounting to the same kinetic changes in divergent species under similar environmental pressures (Castiglione et al. 2018). This convergent functional adaptation via different molecular mechanisms suggests that a more nuanced view of rhodopsin structure and function is needed to understand visual adaptation, as well as a consideration of the ecological tendencies of the fish making the transitions into a novel visual environment.

4.6. TABLES

Table 4.1. Significant tests of positive and divergent selection on freshwater lineages

partition   test                                 parameter estimates: dN/dS (proportion of sites)                          p value
                                                 (background dN/dS / foreground dN/dS)

Clupeiformes
Branch 1    branch-sites vs. branch-sites null   0.02(0.84)  1.00F(0.14)  0.02/90.36(0.02)   1.00/90.36(0.00)              0.0064
Clade 2     CmC vs. M2aREL                       0.00(0.73)  1.00F(0.06)  0.19/0.78(0.21)                                  0.0053
Clade 2     CmD vs. M3                           0.00(0.74)  1.42(0.05)   0.23/0.98(0.21)                                  0.0037
Branch 3    CmC vs. M2aREL                       0.00(0.73)  1.00F(0.06)  0.19/0.78(0.21)                                  0.0061
Branch 3    CmD vs. M3                           0.00(0.74)  1.33(0.05)   0.22/0.85(0.21)                                  0.0083
Branch 4    branch-sites vs. branch-sites null   0.02(0.85)  1.00F(0.15)  0.02/204.40(0.01)  1.00/204.40(0.00)             0.0087
Branch 5    CmD vs. M3                           0.01(0.76)  0.28(0.20)   1.56/24.69(0.04)                                 0.0059
Branch 6    CmC vs. M2aREL                       0.00(0.72)  1.00F(0.07)  0.20/0.00(0.22)                                  0.0091
Branch 6    CmD vs. M3                           0.01(0.77)  0.31(0.20)   1.78/28.55(0.03)                                 0.0155

Beloniformes
Clade 1     branch-sites vs. branch-sites null   0.02(0.85)  1.00F(0.13)  0.02/6.35(0.02)    1.00/6.35(0.00)               0.0031
Branch 2    CmC vs. M2aREL                       0.02(0.85)  1.00F(0.13)  6.00/46.99(0.02)                                 0.0317

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table 4.2. PAML results for Beloniformes for Random-sites analyses.

model | np | lnL | parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | null | LRT | p value
m0 | 113 | -5157.18 | 0.20(1.00) | | |
m1 | 114 | -4827.12 | 0.02(0.86); 1.00F(0.14) | m0 | 660.13 | < 0.0001
m2 | 116 | -4748.82 | 0.02(0.85); 1.00F(0.13); 6.15(0.02) | m1 | 156.6 | < 0.0001
m7 | 114 | -4832.46 | (β: p = 0.014, q = 0.03) | | |
m8a | 115 | -4810.89 | (β: p = 0.08, q = 1.82); 1.00F(0.10) | | |
m8 | 116 | -4736.5 | (β: p = 0.04, q = 0.30); 5.76(0.02) | m7 | 191.92 | < 0.0001
m8 | | | | m8a | 148.77 | < 0.0001

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table 4.3. PAML results for Clupeiformes for Random-sites analyses.

model | np | lnL | parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | null | LRT | p value
m0 | 111 | -7268.65 | 0.09(1.00) | | |
m1 | 112 | -6886.59 | 0.02(0.85); 1.00F(0.15) | m0 | 764.12 | < 0.0001
m2 | 114 | -6864.15 | 0.02(0.85); 1.00F(0.15); 8.89(0.00) | m1 | 44.88 | < 0.0001
m7 | 112 | -6792.25 | (β: p = 0.08, q = 0.52) | | |
m8a | 113 | -6784.05 | (β: p = 0.12, q = 2.06); 1.00F(0.05) | | |
m8 | 114 | -6770.87 | (β: p = 0.09, q = 0.69); 3.61(0.01) | m7 | 42.75 | < 0.0001
m8 | | | | m8a | 26.34 | < 0.0001

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D

Table 4.4. PAML results for rhodopsin duplicates in the North American croaker invasion.

model | np | lnL | parameter estimates: dN/dS (proportion of sites) (background dN/dS / foreground dN/dS) | null | LRT | p value
m0 | 231 | -5505.9453 | 0.21(1.00) | | |
two ratio fixed | 231 | -5501.481 | 0.21/1.00F(1.00) | | |
two ratio | 232 | -5501.2961 | 0.20/0.76(1.00) | m0 | 9.30 | 0.0023
two ratio | | | | two ratio fixed | 0.37 | 0.5430

Note: lnL, ln likelihood; F, dN/dS estimate fixed at reported value; LRT, likelihood ratio test result; Br-site, Branch-site; CmC, Clade model C; CmD, Clade model D
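The LRT and p values in Table 4.4 can be checked from the reported lnL values. The following is a minimal Python sketch, assuming the test statistic is twice the difference in lnL and that it is compared against a chi-square distribution with degrees of freedom equal to the difference in np between the two models (1 for both comparisons here). The function names chi2_sf and lrt are illustrative and are not part of PAML.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")

def lrt(lnl_null, lnl_alt, df):
    """Return (statistic, p value) for a pair of nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# lnL values copied from Table 4.4
LNL_M0 = -5505.9453
LNL_TWO_RATIO_FIXED = -5501.481
LNL_TWO_RATIO = -5501.2961

# two ratio vs. m0 (232 - 231 = 1 extra parameter)
stat_m0, p_m0 = lrt(LNL_M0, LNL_TWO_RATIO, df=1)
# two ratio vs. two ratio fixed (foreground dN/dS fixed at 1.00)
stat_fixed, p_fixed = lrt(LNL_TWO_RATIO_FIXED, LNL_TWO_RATIO, df=1)

print(f"two ratio vs m0:    LRT = {stat_m0:.2f}, p = {p_m0:.4f}")
print(f"two ratio vs fixed: LRT = {stat_fixed:.2f}, p = {p_fixed:.4f}")
```

Under these assumptions the first comparison rejects a single dN/dS ratio across the tree, while the second does not distinguish the foreground estimate from 1.00, in line with the values reported in the table.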

Table 4.5. Results from Pagel’s Discrete analysis.

site | substitution | class 0 | class 1 | lnL alternate | lnL null | LRT | p value
115 | F115Y | FW | Y | -162.86 | -166.52 | 7.33 | 0.12
119 | L119F | LVIMTHM | F | -131.66 | -138.97 | 14.62 | 0.01
122 | E122I | EQ | IV | -140.34 | -144.02 | 7.37 | 0.12
124 | A124S | AG | S | -191 | -193.43 | 4.86 | 0.30
159 | F159L | FWSTC | LVG | -183.05 | -186.45 | 6.8 | 0.15
212 | F212A | AL | F | -125.56 | -131.72 | 12.31 | 0.02
213 | T213L/I | TSCWMF | LAIVG | -238.44 | -243.67 | 10.46 | 0.03
261 | F261Y | F | Y | -163 | -182.68 | 39.37 | 0.00
270 | S270G | SHY | G | -220.48 | -225.57 | 10.19 | 0.04
271 | V271T | VM | T | -161.62 | -165.92 | 8.59 | 0.07
287 | F287L | FYS | LVA | -167.67 | -174.3 | 13.26 | 0.01

Note: Substitution refers to substitutions identified more frequently on freshwater branches. Class 0 and class 1 are the two binary character bins for the amino acid identities: identities in the same class as the final amino acid are coded as 1, and all other observed residues are coded as 0. The null hypothesis assumes that the rates of 0 to 1 and 1 to 0 changes at a given site are equal in both states of the habitat variable, which is also coded as 0 for marine and 1 for freshwater. The null has four fewer parameters than the alternate.
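The p values in Table 4.5 are consistent with comparing each LRT statistic to a chi-square distribution with four degrees of freedom, matching the four additional parameters of the alternate model. The sketch below assumes that reference distribution and uses the closed-form chi-square tail for even degrees of freedom. The names chi2_sf_even and pagel_lnl are hypothetical, and only three rows of the table are included for illustration.

```python
import math

def chi2_sf_even(stat, df):
    """Upper-tail chi-square probability for even df (closed-form series)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    term = 1.0
    total = 1.0
    # sum_{k=0}^{df/2 - 1} (stat/2)^k / k!
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# (lnL alternate, lnL null) for three substitutions, copied from Table 4.5
pagel_lnl = {
    "F115Y": (-162.86, -166.52),
    "L119F": (-131.66, -138.97),
    "F261Y": (-163.00, -182.68),
}

DF = 4  # the null has four fewer parameters than the alternate

p_values = {}
for substitution, (lnl_alt, lnl_null) in pagel_lnl.items():
    stat = 2.0 * (lnl_alt - lnl_null)
    p_values[substitution] = chi2_sf_even(stat, DF)
    print(f"{substitution}: LRT = {stat:.2f}, p = {p_values[substitution]:.4f}")
```

Small differences from the tabulated LRT values are expected because the lnL values in the table are rounded to two decimal places.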

Table 4.6. Convergent substitutions in rhodopsin in independent freshwater invasions.

Substitution | Function | Observed in
F261Y | ~ 9 nm redshift (Chan et al. 1992) | Three freshwater anchovy lineages, three out of four other marine-derived clupeiform lineages, South American freshwater sciaenids transitional branch, also in ancient freshwater lineages (Characiphysi, Osteoglossiformes)
S124A | ~ 2 nm redshift, faster retinal release (Castiglione and Chang 2018) | North and South American freshwater sciaenids, two freshwater clupeiforms and also in marine beloniforms, clupeiforms and sciaenids
A124S | ~ 2 nm blueshift, slower retinal release (Castiglione and Chang 2018) | Freshwater anchovies and some marine lineages
L119F | ~ 5 nm redshift, slower retinal release (Castiglione and Chang 2018) | North and South American freshwater sciaenids and one marine sciaenid lineage, also in ancient freshwater lineages (Characiphysi, Osteoglossiformes)
E122I | ~ 2 nm blueshift, much faster retinal release (Castiglione and Chang 2018) | South American freshwater sciaenids only, but also seen in other ancient freshwater fishes (Characiphysi, Osteoglossiformes)
F212A | Decreased active state stability (Tsukamoto et al. 2010) | African freshwater clupeiform lineage, and some distantly related cold temperature freshwater fishes

4.7. FIGURES

Figure 4.1. Distribution of wavelengths of light and families of fishes with depth. a) Schematic of downwelling light in marine and freshwater environments. b) Comparison of spectral sensitivity differences for rhodopsin between epipelagic, pelagic and benthopelagic fishes. Data from MacNichol and Levine 1979 and Crescitelli 1990. c) Mid-point depths for species belonging to families in the orders Beloniformes and Clupeiformes and the family Sciaenidae. Data shown only for marine species. For freshwater data see Supplementary table S4.1.


Figure 4.2. Transition events in Beloniformes and Clupeiformes. Beloniformes (top), Clupeiformes (bottom). Labelled branches indicate a transition event and are numbered sequentially from the base. Red branches indicate freshwater species and blue branches marine species; branches with arrows represent transitional lineages. Yellow boxes show freshwater clades that were investigated with PAML, using the same numbering scheme. Locations of freshwater invasions are shown on the map. Branch lengths represent the number of substitutions per codon.


Figure 4.3. Positively selected sites on the rhodopsin dark-state structure. a) Positively selected sites found in Beloniformes and Clupeiformes rhodopsin shown on the structure (1U19). Atoms common to both amino acids, before and after the substitution under positive selection, are shown in burgundy. Carbon atoms (dark grey) and sulfur atoms (gold) that are lost in the species where positive selection is observed are differentiated from the atoms in burgundy that remain. Water molecules are shown in blue. b) Positively selected sites identified in Denticeps clupeoides shown in detail. Extracellular loop where it resides is shown in grey. Site 187, a partner in forming disulfide bonds, is also shown in dark grey. c) Positively selected sites in Rhinosardinia amazonica shown in the context of helices 6 and 7. Side chains of other highly conserved residues in close proximity are also shown. Distances to the water are shown by black lines. Distance from 264 to water is labelled. d) Positively selected site in the South East Asian Beloniformes clade shown alongside residues forming an H-bond network (blue lines), which are shown in dark grey and labelled.

Figure 4.4. dN/dS estimates for rhodopsin by site in each clade of fishes.

Postmean dN/dS estimates for m2a analysis (PAML) of rhodopsin for a) Beloniformes, b) Clupeiformes, c) Sciaenidae. Sites with BEB posterior probability support for positive selection above 0.95 are labelled and coloured a darker grey.


Figure 4.5. Amino acid substitutions on freshwater and transitional branches. a) Sites where non-conservative substitutions in rhodopsin occur on transitional and freshwater branches in Beloniformes, Clupeiformes and Sciaenids. Coloured by distance to the chromophore. Right image is from the top with the N terminal cap and EL2 removed. b) Non-conservative substitutions occurring at least once on a transitional branch. Frequency of occurrence of the exact substitution in different habitats is shown on each axis. Grey points represent substitutions from 100 null estimates generated using EVOLVER (PAML). Substitutions above the dotted line occur more frequently on freshwater or transitional branches. Convergent substitutions and those in close proximity to the chromophore are labelled. Three real substitutions and one simulated substitution appear outside the bounds of the figure (see Supplementary figure S4.3).

4.8. REFERENCES

Ala-Laurila P, Donner K, Koskelainen A. 2004. Thermal activation and photoactivation of visual pigments. Biophysical J. 86:3653–3662.

Angel TE, Gupta S, Jastrzebska B, Palczewski K, Chance MR. 2009. Structural waters define a functional channel mediating activation of the GPCR, rhodopsin. Proc. Natl. Acad. Sci. U.S.A. 106:14367–14372.

Barney RL. 1926. The Distribution of the Fresh‐Water Sheepshead, Aplodinotus Grunniens Rafinesque, in Respect to the Glacial History of North America. Ecology 7:351–364.

Bassham S, Catchen J, Lescak E, Hippel von FA, Cresko WA. 2018. Repeated Selection of Alternatively Adapted Haplotypes Creates Sweeping Genomic Remodeling in Stickleback. Genetics 300610.

Betancur-R R, Ortí G, Pyron RA. 2015. Fossil-based comparative analyses reveal ancient marine ancestry erased by extinction in ray-finned fishes. Ecol. Lett. 18:441–450.

Betancur-R R, Wiley EO, Arratia G, Acero A, Bailly N, Miya M, Lecointre G, Ortí G. 2017. Phylogenetic classification of bony fishes. BMC Evol. Biol. 17:1–40.

Bielawski JP, Yang Z. 2004. A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution. J. Mol. Evol. 59:1–12.

Bloom DD, Lovejoy NR. 2014. The evolutionary origins of diadromy inferred from a time-calibrated phylogeny for Clupeiformes (herring and allies). Proc. R. Soc. B 281:20132081.

Bloom DD, Lovejoy NR. 2017. On the origins of marine-derived freshwater fishes in South America. J. Biogeogr. 44:1927–1938.

Boettiger C, Lang DT, Wainwright PC. 2012. rfishbase: exploring, manipulating and visualizing FishBase data from R. J. Fish Biol. 81:2030–2039.

Bowmaker JK, Hunt DM. 2006. Evolution of vertebrate visual pigments. Curr. Biol. 16:R484–R489.

Carleton KL, Dalton BE, Escobar-Camacho D, Nandamuri SP. 2016. Proximate and ultimate causes of variable visual sensitivities: Insights from cichlid fish radiations. Genesis 54:299–325.

Castiglione GM, Chang BS. 2018. Functional trade-offs and environmental variation determined ancient trajectories during the evolution of dim-light vision. eLife.

Castiglione GM, Hauser FE, Liao BS, Lujan NK, Van Nynatten A, Morrow JM, Schott RK, Bhattacharyya N, Dungan SZ, Chang BSW. 2017. Evolution of nonspectral rhodopsin function at high altitudes. Proc. Natl. Acad. Sci. U.S.A. 114:7385–7390.

Castiglione GM, Schott RK, Hauser FE, Chang BSW. 2018. Convergent selection pressures drive the evolution of rhodopsin kinetics at high altitudes via nonparallel mechanisms. Evolution 72:170–186.

Chan T, Lee M, Sakmar TP. 1992. Introduction of Hydroxyl-bearing Amino Acids Causes Bathochromic Spectral Shifts in Rhodopsin. J. Biol. Chem. 267:9478–9480.

Chen W-J, Bonillo C, Lecointre G. 2003. Repeatability of clades as a criterion of reliability: a case study for molecular phylogeny of Acanthomorpha (Teleostei) with larger number of taxa. Mol. Phylogenet. Evol. 26:262–288.

Choe H-W, Kim YJ, Park JH, Morizumi T, Pai EF, Krauß N, Hofmann KP, Scheerer P, Ernst OP. 2011. Crystal structure of metarhodopsin II. Nature 471:651–655.

Crampton WG. 2007. Diversity and adaptation in deep channel Neotropical electric fishes. In: Fish life in special environments. Enfield, New Hampshire: Science Publishers, Inc. pp. 283–339.

Crescitelli F. 1990. Adaptations of visual pigments to the photic environment of the deep sea. J. Exp. Zool. 256:66–75.

Darwin C. 1859. On the Origin of Species by Means of Natural Selection, Or, The Preservation of Favoured Races in the Struggle for Life.

Deary AL, Metscher B, Brill RW, Hilton EJ. 2016. Shifts of sensory modalities in early life history stage estuarine fishes (Sciaenidae) from the Chesapeake Bay using X-ray micro computed tomography. Environ. Biol. Fish. 99:361–375.

Dungan SZ, Kosyakov A, Chang BSW. 2016. Spectral Tuning of Killer Whale (Orcinus orca) Rhodopsin: Evidence for Positive Selection and Functional Adaptation in a Cetacean Visual Pigment. Mol. Biol. Evol. 33:323–336.

Edgar RC. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32:1792–1797.

Ernst OP, Lodowski DT, Elstner M, Hegemann P, Brown LS, Kandori H. 2014. Microbial and Animal Rhodopsins: Structures, Functions, and Molecular Mechanisms. Chem. Rev. 114:126–163.

Evans DH. 2008. Teleost fish osmoregulation: what have we learned since August Krogh, Homer Smith, and Ancel Keys. Am. J. Physiol. Regul. Integr. Comp. Physiol. 295:R704–R713.

Flamarique IN. 2013. Opsin switch reveals function of the ultraviolet cone in fish foraging. Proc. R. Soc. B 280:20122490.

Fritsches KA, Brill RW, Warrant EJ. 2005. Warm eyes provide superior vision in swordfishes. Curr. Biol. 15:55–58.

Fyhrquist N, Donner K, Hargrave PA, McDowell JH, Popp MP, Smith WC. 1998. Rhodopsins from three frog and toad species: sequences and functional comparisons. Exp. Eye Res. 66:295–305.

Gregg RG, McCall MA, Massey S. 2012. Function and Anatomy of the Mammalian Retina. In: Retina Fifth Edition. Vol. 1. pp. 360–400.

Hauser FE, Ilves KL, Schott RK, Castiglione GM, López-Fernández H, Chang BSW. 2017. Accelerated Evolution and Functional Divergence of the Dim Light Visual Pigment Accompanies Cichlid Colonization of Central America. Mol. Biol. Evol. 34:2650–2664.

Hope AJ, Partridge JC, Dulai KS, Hunt DM. 1997. Mechanisms of wavelength tuning in the rod opsins of deep-sea fishes. Proc. R. Soc. B 264:155–163.

Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, et al. 2012. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484:55–61.

Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, et al. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647–1649.

Lamb TD. 2013. Evolution of phototransduction, vertebrate photoreceptors and retina. Prog Retin Eye Res 36:52–119.

Land MF, Nilsson D-E. 2012. Animal Eyes. OUP Oxford.

Lee CE, Kiergaard M, Gelembiuk GW, Eads BD, Posavi M. 2011. Pumping ions: rapid parallel evolution of ionic regulation following habitat invasions. Evolution 65:2229–2244.

Lee CE. 2016. Evolutionary mechanisms of habitat invasions, using the copepod Eurytemora affinis as a model system. Evol. Appl. 9:248–270.

Lo P-C, Liu S-H, Chao NL, Nunoo FKE, Mok H-K, Chen W-J. 2015. A multi-gene dataset reveals a tropical New World origin and Early Miocene diversification of croakers (Perciformes: Sciaenidae). Mol. Phylogenet. Evol. 88:132–143.

Lovejoy NR, Albert JS, Crampton WGR. 2006. Miocene marine incursions and marine/freshwater transitions: Evidence from Neotropical fishes. J South Am Earth Sci 21:5–13.

Lovejoy NR, Bermingham E, Martin AP. 1998. Marine incursion into South America. Nature 396:421–422.

Lovejoy NR. 2004. Phylogeny and Jaw Ontogeny of Beloniform Fishes. Integr. Comp. Biol. 44:366–377.

López-Fernández H, Winemiller KO, Montaña C, Honeycutt RL. 2012. Diet-morphology correlations in the radiation of South American geophagine cichlids (Perciformes: Cichlidae: Cichlinae). PLoS ONE 7:e33997.

Luk HL, Bhattacharyya N, Montisci F, Morrow JM, Melaccio F, Wada A, Sheves M, Fanelli F, Chang BSW, Olivucci M. 2016. Modulation of thermal noise and spectral sensitivity in Lake Baikal cottoid fish rhodopsins. Sci. Rep. 6:1–9.

Lythgoe JN. 1979. The Ecology of Vision.

MacNichol EF Jr., Levine JS. 1979. Visual Pigments in Teleost Fishes: Effects of Habitat, Microhabitat, and Behaviour on Visual System Evolution. Sensory Processes 3:95–131.

Marques DA, Taylor JS, Jones FC, Di Palma F, Kingsley DM, Reimchen TE. 2017. Convergent evolution of SWS2 opsin facilitates adaptive radiation of threespine stickleback into different light environments. PLoS Biol 15:e2001627–24.

McKibbin C, Toye AM, Reeves PJ, Khorana HG, Edwards PC, Villa C, Booth PJ. 2007. Opsin Stability and Folding: The Role of Cys185 and Abnormal Disulfide Bond Formation in the Intradiscal Domain. J. Mol. Biol. 374:1309–1318.

Morrow JM, Lazic S, Dixon Fox M, Kuo C, Schott RK, de A Gutierrez E, Santini F, Tropepe V, Chang BSW. 2017. A second visual rhodopsin gene, rh1-2, is expressed in zebrafish photoreceptors and found in other ray-finned fishes. J. Exp. Biol. 220:294–303.

Nakamura Y, Yasuike M, Mekuchi M, Iwasaki Y, Ojima N, Fujiwara A, Chow S, Saitoh K. 2017. Rhodopsin gene copies in Japanese eel originated in a teleost-specific genome duplication. Zoological Lett. 3:1–12.

Nguyen L-T, Schmidt HA, Haeseler von A, Minh BQ. 2015. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol. Biol. Evol. 32:268–274.

Novales Flamarique I. 2017. A vertebrate retina with segregated colour and polarization sensitivity. Proc. Biol. Sci. 284:20170759.

Okada T, Fujiyoshi Y, Silow M, Navarro J, Landau EM, Shichida Y. 2002. Functional role of internal water molecules in rhodopsin revealed by X-ray crystallography. Proc. Natl. Acad. Sci. U.S.A. 99:5982–5987.

Okada T, Sugihara M, Bondar A-N, Elstner M, Entel P, Buss V. 2004. The Retinal Conformation and its Environment in Rhodopsin in Light of a New 2.2Å Crystal Structure. J. Mol. Biol. 342:571–583.

Pagel M. 1994. Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proc. R. Soc. B 255:37–45.

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF Chimera-A visualization system for exploratory research and analysis. J. Comput. Chem. 25:1605–1612.

Phillips GAC, Carleton KL, Marshall NJ. 2015. Multiple Genetic Mechanisms Contribute to Visual Sensitivity Variation in the Labridae. Mol. Biol. Evol. 33:201–215.

Potier S, Mitkus M, Kelber A. 2018. High resolution of colour vision, but low contrast sensitivity in a diurnal raptor. Proc. Biol. Sci. 285:20181036.

Rahi ML, Amin S, Mather PB, Hurwood DA. 2017. Candidate genes that have facilitated freshwater adaptation by palaemonid prawns in the Macrobrachium: identification and expression validation in a model species (M. koombooloomba). PeerJ 5:e2977.

Revell LJ. 2011. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3:217–223.

Riesch R, Plath M, Bierbach D. 2018. Ecology and evolution along environmental gradients. Curr. Zool. 64:193–196.

Sasaki K. 1989. Phylogeny of the family Sciaenidae, with notes on its zoogeography (Teleostei, Perciformes). Mem. Fac. Fish., Hokkaido Univ. 36:1–137.

Schott RK, Refvik SP, Hauser FE, López-Fernández H, Chang BSW. 2014. Divergent positive selection in rhodopsin from lake and riverine cichlid fishes. Mol. Biol. Evol. 31:1149–1165.

Sugawara T, Imai H, Nikaido M, Imamoto Y, Okada N. 2010. Vertebrate Rhodopsin Adaptation to Dim Light via Rapid Meta-II Intermediate Formation. Mol. Biol. Evol. 27:506–519.

Thompson KA, Osmond MM, Schluter D. 2018. Patterns of speciation and parallel genetic evolution under adaptation from standing variation. bioRxiv:368324.

Torres-Dowdall J, Pierotti MER, Härer A, Karagic N, Woltering JM, Henning F, Elmer KR, Meyer A. 2017. Rapid and Parallel Adaptive Evolution of the Visual System of Neotropical Midas Cichlid Fishes. Mol. Biol. Evol. 34:2469–2485.

Tsukamoto H, Terakita A, Shichida Y. 2010. A pivot between helices V and VI near the retinal binding site is necessary for activation in rhodopsins. J. Biol. Chem.:jbc–M109.

Van Nynatten A, Bloom D, Chang BSW, Lovejoy NR. 2015. Out of the blue: adaptive visual pigment evolution accompanies Amazon invasion. Biol. Lett. 11:20150349.

Wainwright PC, Longo SJ. 2017. Functional Innovations and the Conquest of the Oceans by Acanthomorph Fishes. Curr. Biol. 27:R550–R557.

Wang F-Y, Fu W-C, Wang I-L, Yan HY, Wang T-Y. 2014. The Giant Mottled Eel, Anguilla marmorata, Uses Blue-Shifted Rod Photoreceptors during Upstream Migration. PLoS ONE 9:e103953.

Weadick CJ, Chang BSW. 2011. An Improved Likelihood Ratio Test for Detecting Site-Specific Functional Divergence among Clades of Protein-Coding Genes. Mol. Biol. Evol. 29:1297–1300.

Willoughby JR, Harder AM, Tennessen JA, Scribner KT, Christie MR. 2018. Rapid genetic adaptation to a novel environment despite a genome-wide reduction in genetic diversity. Mol Ecol 17:675.

Wright ES. 2015. DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment. BMC Bioinformatics 16:322.

Yang Z, Kumar S, Nei M. 1995. New Method of Inference of Ancestral Nucleotide and Amino Acid Sequences. Genetics 141:1–10.

Yang Z, Swanson WJ. 2002. Codon-Substitution Models to Detect Adaptive Evolution that Account for Heterogeneous Selective Pressures Among Site Classes. Mol. Biol. Evol. 19:49–57.

Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15:568–573.

Yang Z. 2005. Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection. Mol. Biol. Evol. 22:1107–1118.

Yang Z. 2007. PAML 4: Phylogenetic Analysis by Maximum Likelihood. Mol. Biol. Evol. 24:1586–1591.

Yokoyama R, Knox BE, Yokoyama S. 1995. Rhodopsin from the fish, Astyanax: role of tyrosine 261 in the red shift. Invest. Ophthalmol. Vis. Sci. 36:939–945.

Yokoyama S, Tada T, Zhang H, Britt L. 2008. Elucidation of phenotypic adaptations: Molecular analyses of dim-light vision proteins in vertebrates. Proc. Natl. Acad. Sci. U.S.A. 105:13480–13485.

Zou Z, Zhang J. 2015. Are Convergent and Parallel Amino Acid Substitutions in Protein Evolution More Prevalent Than Neutral Expectations? Mol. Biol. Evol. 32:2085–2096.

4.9. SUPPLEMENTAL INFORMATION

4.9.1. Supplementary tables

Table S4.1. Habitat classification of freshwater Beloniformes, Clupeiformes and Sciaenidae species.

Habitat description | Belonidae | Zenarchopteridae | Hemiramphidae | Engraulidae | Denticipitidae | Pristigasteridae | Clupeidae | Sciaenidae
epipelagic | 9 | 1 | 0 | 0 | 0 | 0 | 2 | 0
pelagic | 1 | 45 | 3 | 15 | 1 | 7 | 46 | 8
benthopelagic | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 9
demersal | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 2

Table S4.2. New rhodopsin sequences for fishes included in the rhodopsin dataset.

species | sequence | habitat
Aeoliscus_strigatus | EU637931.1 | marine
Albulichthys_albuloides | KC690011.1 | freshwater
Ameiurus_nebulosus | JN230991.1 | freshwater
Amia_calva | AF137208.1 | freshwater
Anoplogaster_cornuta | JN412582.1 | marine
Aphanopus_carbo | EU637938.1 | marine
Apteronotus_albifrons | JN230983.1 | freshwater
Archoplites_interruptus | AY742563.1 | freshwater
Argentina_sialis | JN230995.1 | marine
Argyropelecus_gigas | JN412572.1 | marine
Aristostomias_scintillans | JX255574.1 | marine
Aseraggodes_heemstrai | JQ938062.1 | marine
Ateleopus_japonicus | KC442218.1 | marine
Aulostomus_chinensis | AY141279.1 | marine
Barbourisia_rufa | AY368333.1 | marine
Bathylagus_euryops | AY141255.1 | marine
Bathysaurus_ferox | JN412585.1 | marine
Bonapartia_pedaliota | JN544534.1 | marine
Calliurichthys_grossi | KF265142.1 | marine
Callorhinchus_milii | NM_001292252.1 | marine
Ceratias_holboelli | AY141263.1 | marine
Ceratoscopelus_warmingii | JN412573.1 | marine
Chanos_chanos | JN230981.1 | marine
Chascanopsetta_lugubris | KF312131.1 | marine
Chauliodus_sloani | JX255575.1 | marine
Chlorurus_sordidus | KP881291.1 | marine
Cirrhilabrus_punctatus | KP881287.1 | marine
Citharichthys_arctifrons | JQ938042.1 | marine
Citharoides_macrolepis | KF312133.1 | marine
Clinus_superciliosus | JF320879.1 | marine
Coris_gaimard | KP881290.1 | marine
Coryphaenoides_rupestris | AY368319.1 | marine
Ctenopharyngodon_idella | GU218588.1 | freshwater
Cyclopsetta_chittendeni | JQ938043.1 | marine
Dactyloptena_orientalis | KC442232.1 | marine
Danio_tinwini | JQ614153.1 | freshwater
Danionella_mirifica | FJ531347.1 | freshwater
Diaphus_rafinesquii | JN412587.1 | marine
Diodon_holocanthus | KC442241.1 | marine
Dormitator_maculatus | KU765098.1 | marine
Elacatinus_oceanops | AY846604.1 | marine
Elassoma_zonatum | KF751789.1 | freshwater
Electrona_antarctica | AY141258.1 | marine
Esox_lucius | XM_010902101.2 | freshwater
Etropus_crossotus | JQ938044.1 | marine
Eustomias_polyaster | KC163318.1 | marine
Fistularia_petimba | AY141324.1 | marine
Foetorepus_altivelis | KF265120.1 | marine
Gadus_morhua | AF137211.1 | marine
Gasterosteus_aculeatus | EU637962.1 | marine
Gnatholepis_cauerensis | JF261539.1 | marine
Gonorynchus_greyi | EU409632.1 | marine
Gonostoma_elongatum | KC163332.1 | marine
Gyrinocheilus_aymonieri | FJ197071.2 | freshwater
Harpadon_microchir | KC442220.1 | marine
Hippocampus_comes | XM_019890602.1 | marine
Howella_brodiei | EU637966.1 | marine
Indostomus_paradoxus | JQ938022.1 | freshwater
Jordanella_floridae | KJ697384.1 | freshwater
Latimeria_chalumnae | XM_005997817.2 | marine
Lefua_echigonia | FJ197028.1 | freshwater
Lota_lota | KX146111.1 | freshwater
Macroramphosus_scolopax | AY141280.1 | freshwater
Mancopsetta_maculata | KF312129.1 | marine
Melamphaes_suborbitalis | JN231006.1 | marine
Menidia_menidia | EU637977.1 | marine
Microdesmus_bahianus | HQ536888.1 | marine
Microdevario_kubotai | JQ614188.1 | marine
Microrasbora_rubescens | GQ365225.1 | freshwater
Minytrema_melanops | FJ197034.1 | freshwater
Monopterus_albus | XM_020585428.1 | freshwater
Morone_saxatilis | KX145720.1 | freshwater
Mugil_cephalus | Y18668.1 | marine
Myripristis_berndti | U57538.1 | marine
Naso_lituratus | EU637984.1 | marine
Neoceratodus_forsteri | EF526295.1 | marine
Notothenia_coriiceps | XM_010780757.1 | freshwater
Oncorhynchus_nerka | AY214156.1 | marine
Opistognathus_maxillosus | JQ937975.1 | marine
Opsanus_pardus | DQ874822.1 | marine
Paralichthys_dentatus | KU980166.1 | marine
Pentapodus_caninus | KY363166.1 | marine
Perca_fluviatilis | AY141295.1 | marine
Perccottus_glenii | KX224215.1 | freshwater
Pervagor_nigrolineatus | KF025937.1 | freshwater
Photostomias_guernei | JN412566.1 | marine
Plecoglossus_altivelis | JX255568.1 | marine
Poeciliopsis_elongata | KJ697427.1 | marine
Poecilopsetta_beanii | JQ938054.1 | freshwater
Pollichthys_mauli | JN544535.1 | marine
Polymixia_lowei | KC442227.1 | marine
Polyodon_spathula | AF369050.1 | marine
Pontinus_longispinis | EU637996.1 | freshwater
Pseudopleuronectes_americanus | AY631036.1 | marine
Pygocentrus_nattereri | XM_017683920.1 | marine
Rasbora_borapetensis | HM223982.1 | freshwater
Rasbora_pauciperforata | JQ614284.1 | freshwater
Rhombosolea_plebeia | JQ938068.1 | freshwater
Sagamichthys_abei | JN230975.1 | marine
Samariscus_latus | KF312146.1 | marine
Sargocentron_tiere | U57545.1 | marine
Sarpa_salpa | Y18664.1 | marine
Securicula_gora | HM224015.1 | marine
Sinibotia_robusta | JN177209.1 | freshwater
Solea_solea | EU638009.1 | freshwater
Speoplatyrhinus_poulsoni | HQ729685.1 | marine
Stokellia_anisodon | JX255573.1 | freshwater
Synodus_foetens | JN231001.1 | marine
Tactostoma_macropus | KC163302.1 | marine
Tanichthys_micagemmae | HM224017.1 | marine
Tetraodon_nigroviridis | JQ682445.1 | freshwater
Trachipterus_arcticus | KC442225.1 | freshwater
Trichiurus_lepturus | DQ874796.1 | marine
Trinectes_maculatus | EU638015.1 | marine
Umbra_limi | JN230999.1 | marine
Vaillantella_maassi | FJ197031.1 | freshwater
Valenciennea_strigata | EU638017.1 | freshwater
Xeneretmus_latifrons | EU638018.1 | marine
Xiphias_gladius | EU638019.1 | marine
Xyrichtys_novacula | EU638020.1 | marine
Zeus_faber | EU638023.1 | marine
Potamorrhaphis guianensis | this chapter | marine
Belonion apodion | this chapter | freshwater
Belonion dibranchodon | this chapter | freshwater
Dermogenys collettei | this chapter | freshwater
Hemirhamphodon pogonognathus | this chapter | freshwater
Nomorhamphus brembachi | this chapter | freshwater
Nomorhamphus weberi | this chapter | marine
Potamorrhaphis eigenmanni | this chapter | freshwater
Potamorrhaphis petersi | this chapter | freshwater
Pseudotylosurus angusticeps | this chapter | freshwater
Pseudotylosurus microps | this chapter | freshwater
Strongylura fluviatilis | this chapter | freshwater
Strongylura hubbsi | this chapter | marine
Strongylura krefftii | this chapter | freshwater
Xenentodon cancila | this chapter | freshwater
Hyporhamphus quoyi | this chapter | marine
Strongylura marina | this chapter | marine
Strongylura timucu | this chapter | marine
Prognichthys gibbifrons | this chapter | marine
Cheilopogon dorsomacula | this chapter | marine
Cheilopogon exsiliens | this chapter | marine
Cheilopogon furcatus | this chapter | marine
Cheilopogon xenopterus | this chapter | marine
Oxyporhamphus | this chapter | marine
Prognichthys tringa | this chapter | marine
Ablennes hians | this chapter | marine
Belone belone | this chapter | marine
Belone svetovidovi | this chapter | marine
Chriodorus atherinoides | this chapter | marine
Euleptorhamphus viridis | this chapter | marine
Exocoetus monocirrhus | this chapter | marine
Exocoetus obtusirostris | this chapter | marine
Exocoetus volitans | this chapter | marine
Hemiramphus balao | this chapter | marine
Hemiramphus brasiliensis | this chapter | marine
Hemiramphus lutkei | this chapter | marine
Hirundichthys affinis | this chapter | marine
Hyporhamphus dussumieri | this chapter | marine
Hyporhamphus limbatus | this chapter | marine
Hyporhamphus mexicanus | this chapter | marine
Hyporhamphus naos | this chapter | freshwater
Hyporhamphus snyderi | this chapter | marine
Melapedalion breve | this chapter | marine
Parexocoetus brachypterus | this chapter | marine
Petalichthys capensis | this chapter | marine
Platybelone argalus | this chapter | marine
Prognichthys sealei | this chapter | marine
Strongylura exilis | this chapter | marine
Strongylura incisa | this chapter | marine
Strongylura leiura | this chapter | marine
Strongylura notata | this chapter | marine
Strongylura scapularis | this chapter | marine
Strongylura senegalensis | this chapter | marine
Strongylura strongylura | this chapter | marine
Tylosurus crocodilus | this chapter | marine
Tylosurus gavialoides | this chapter | marine
Tylosurus punctulatus | this chapter | marine
Pterengraulis atherinoides | ch2 | marine
Amazonsprattus scintilla | ch2 | freshwater
Amazonsprattus sp | ch2 | freshwater
Anchovia surinamensis | ch2 | freshwater
Anchoviella manamensis | ch2 | freshwater
Denticeps clupeoides | JN230976 | freshwater
Ilisha amazonica | this chapter | freshwater
Pellona castelnaeana | this chapter | freshwater
Rhinosardinia bahiensis | ch2 | freshwater
Anchoviella alleni | ch2 | freshwater
Anchoviella sp | ch2 | freshwater
Anchoviella carrikeri | ch2 | freshwater
Anchoviella guianensis | ch2 | freshwater
Anchoviella guianensis | ch2 | freshwater
Anchoviella spII | ch2 | freshwater
Jurengraulis juruensis | ch2 | freshwater
Lycengraulis batesii | ch2 | freshwater
Pristigaster whiteheadi | this chapter | freshwater
Alosa chrysochloris | this chapter | freshwater
Alosa mediocris | this chapter | marine
Alosa aestivalis | KX146146 | marine
Alosa sapidissima | KX145751 | marine
Sardina pilchardus | Y18677 | marine
Anchoviella lepidentostole | ch2 | marine
Dorosoma cepedianum | JN230979 | marine
Ilisha megaloptera | this chapter | marine
Lycengraulis grossidens | ch2 | marine
Odontognathus mucronatus | this chapter | marine
Anchoviella brevirostris | ch2 | marine
Encrasicholina devisi | ch2 | marine
Engraulis encrasicolus | ch2 | marine
Engraulis japonicus | ch2 | marine
Opisthonema libertate | this chapter | marine
mystax | this chapter | marine
Anchoa lyolepis | ch2 | marine
Anchoa cayorum | ch2 | marine
Anchoa colonensis | ch2 | marine
Anchoa cubana | ch2 | marine
Anchoa delicatissima | ch2 | marine
Anchoa filifera | ch2 | marine
Anchoa sp | ch2 | marine
Anchoa spinifer | ch2 | marine
Anchoa walkeri | ch2 | marine
Anchovia clupeoides | ch2 | marine
Anchoviella balboae | ch2 | marine
Anchoviella elongata | ch2 | marine
Cetengraulis edentulus | ch2 | marine
Cetengraulis mysticetus | ch2 | marine
Engraulis eurystole | ch2 | marine
Engraulis mordax | ch2 | marine
Engraulis ringens | ch2 | marine
Harengula humeralis | this chapter | marine
Lycengraulis poeyi | ch2 | marine
Pellona harroweri | this chapter | marine
Sardinella aurita | this chapter | marine
microlepis | ch3 | marine
Pachypops fourcroi | ch3 | freshwater
Pachypops sp | ch3 | freshwater
Pachyurus bonariensis | ch3 | freshwater
Pachyurus paucirastrus | ch3 | freshwater
Pachyurus junki | ch3 | freshwater
Pachyurus schomburgkii | ch3 | freshwater
Petilipinnis grunniens | ch3 | freshwater
Plagioscion auratus | ch3 | freshwater
Plagioscion montei | ch3 | freshwater
Plagioscion squamosissimus | ch3 | freshwater
Plagioscion surinamensis | ch3 | freshwater
Aplodinotus grunniens | this chapter | freshwater
Cynoscion nothus | ch3 | freshwater
Cynoscion parvipinnis | ch3 | marine
Cynoscion regalis | ch3 | marine
Atrobucca nibe | ch3 | marine
Austronibea oedogenys | ch3 | marine
Bahaba taipingensis | ch3 | marine
Chrysochir aureus | ch3 | marine
Collichthys lucidus | ch3 | marine
Cynoscion praedatorius | ch3 | marine
Daysciaena albida | ch3 | marine
Dendrophysa russelii | ch3 | marine
Equetus lanceolatus | ch3 | marine
Isopisthus remifer | ch3 | marine
Johnius amblycephalus | ch3 | marine
Johnius belangerii | ch3 | marine
Johnius borneensis | ch3 | marine
Johnius carouna | ch3 | marine
Johnius distinctus | ch3 | marine
Johnius heterolepis | ch3 | marine
Johnius macropterus | ch3 | marine
Johnius majan | ch3 | marine
Johnius trewavasae | ch3 | marine
Larimichthys crocea | ch3 | marine
Larimichthys polyactis | ch3 | marine
Megalonibea fusca | ch3 | marine
Miichthys miiuy | ch3 | marine
Nebris microps | ch3 | marine
Nebris sp | ch3 | marine
Nibea albiflora | ch3 | marine
Nibea microgenys | ch3 | marine
Nibea soldado | ch3 | marine
Nibea squamosa | ch3 | marine
Otolithes ruber | ch3 | marine
Panna microdon | ch3 | marine
Paralonchurus brasiliensis | ch3 | marine
Paralonchurus dumerilii | ch3 | marine
Paralonchurus sp | ch3 | marine
Pareques sp | ch3 | marine
Pennahia argentata | ch3 | marine
Pennahia macrocephalus | ch3 | marine
Pennahia pawak | ch3 | marine
Protonibea diacanthus | ch3 | marine
Pseudotolithus brachygnathus | ch3 | marine
Pseudotolithus elongatus | ch3 | marine
Pseudotolithus senegallus | ch3 | marine
Pseudotolithus typus | ch3 | marine
Pteroscion peli | ch3 | marine
maculatus | ch3 | marine
Umbrina bussingi | ch3 | marine
Umbrina sp | ch3 | marine
Argyrosomus japonicus | ch3 | marine
Argyrosomus regius | ch3 | marine
Atractoscion nobilis | ch3 | marine
Bairdiella armata | ch3 | marine
Bairdiella ronchus | ch3 | marine
Corvula macrops | ch3 | marine
Cynoscion acoupa | ch3 | marine
Cynoscion arenarius | ch3 | marine
Cynoscion guatucupa | ch3 | marine
Cynoscion nebulosus | ch3 | marine
Cynoscion reticulatus | ch3 | marine
Cynoscion albus | ch3 | marine
Isopisthus parvipinnis | ch3 | marine
Larimus fasciatus | ch3 | marine
Leiostomus xanthurus | ch3 | marine
Macrodon ancylodon | ch3 | marine
Menticirrhus americanus | ch3 | marine
Menticirrhus paitensis | ch3 | marine
Menticirrhus sp | ch3 | marine
Menticirrhus undulatus | ch3 | marine
Micropogonias ectenes | ch3 | marine
Micropogonias furnieri | ch3 | marine
Micropogonias undulatus | ch3 | marine
Ophioscion punctatissimus | ch3 | marine
Ophioscion scierus | ch3 | marine
Ophioscion sp | ch3 | marine
Ophioscion vermicularis | ch3 | marine
Pogonias cromis | ch3 | marine
Sciaena umbra | ch3 | marine
Sciaenops ocellatus | ch3 | marine
Stellifer chrysoleuca | ch3 | marine
Stellifer ericymba | ch3 | marine
Stellifer lanceolatus | ch3 | marine
Stellifer microps | ch3 | marine
Stellifer oscitans | ch3 | marine
Stellifer rastrifer | ch3 | marine
Stellifer sp | ch3 | marine
Stellifer sp | ch3 | marine
Stellifer stellifer | ch3 | marine
Totoaba macdonaldi | ch3 | marine
Umbrina cirrosa | ch3 | marine
Cheilotrema saturnum | ch3 | marine
Cilus gilberti | ch3 | marine
Genyonemus lineatus | ch3 | marine
Roncador stearnsii | ch3 | marine
Sciaena deliciosa | ch3 | marine
Seriphus politus | ch3 | marine
Umbrina roncador | ch3 | marine
Umbrina xanti | ch3 | marine

Table S4.3. Branch-sites and clade model (PAML) results for the 55 species Clupeiformes rhodopsin dataset.

| Model | Partition | np | lnL | Parameter estimates | Null | LRT | p value |
|---|---|---|---|---|---|---|---|
| m2aREL | NA | 114 | -6791.5 | 0.00(0.72) 1.00(0.07) 0.19(0.21) | | | |
| m3 | NA | 115 | -6789.95 | 0.00(0.73) 0.23(0.22) 1.31(0.05) | | | |
| BrSalt | br1 | 114 | -6875.02 | 0.02/0.02(0.84) 1.00/1.00(0.14) 0.02/91.50(0.02) 1.00/91.50(0.00) | BrSnull | 7.5877 | 0.0059 |
| BrSnull | br1 | 113 | -6878.82 | 0.02/0.02(0.82) 1.00/1.00(0.14) 0.02/1.00(0.03) 1.00/1.00(0.01) | | | |
| CmC | br1 | 115 | -6791.49 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.19/0.20(0.21) | m2aREL | 0.018862 | 0.8908 |
| CmD | br1 | 116 | -6789.91 | 0.00/0.00(0.73) 1.32/1.32(0.05) 0.23/0.21(0.21) | m3 | 0.087634 | 0.7672 |
| BrSalt | br2 | 114 | -6885.2 | 0.02/0.02(0.77) 1.00/1.00(0.13) 0.02/1.00(0.08) 1.00/1.00(0.01) | BrSnull | 0 | 1.0000 |
| BrSnull | br2 | 113 | -6885.2 | 0.02/0.02(0.77) 1.00/1.00(0.13) 0.02/1.00(0.08) 1.00/1.00(0.01) | | | |
| CmC | br2 | 115 | -6790.42 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.19/0.58(0.21) | m2aREL | 2.163134 | 0.1414 |
| CmD | br2 | 116 | -6788.81 | 0.00/0.00(0.73) 1.33/1.33(0.05) 0.23/0.75(0.22) | m3 | 2.277662 | 0.1312 |
| BrSalt | br3 | 114 | -6865.65 | 0.02/0.02(0.73) 1.00/1.00(0.12) 0.02/1.00(0.13) 1.00/1.00(0.02) | BrSnull | 0 | 1.0000 |
| BrSnull | br3 | 113 | -6865.65 | 0.02/0.02(0.73) 1.00/1.00(0.12) 0.02/1.00(0.13) 1.00/1.00(0.02) | | | |
| CmC | br3 | 115 | -6784.07 | 0.00/0.00(0.71) 1.00/1.00(0.07) 0.17/1.02(0.22) | m2aREL | 14.844214 | 0.0001 |
| CmD | br3 | 116 | -6783.23 | 0.00/0.00(0.72) 1.25/1.25(0.06) 0.20/1.10(0.22) | m3 | 13.446796 | 0.0002 |
| BrSalt | br4 | 114 | -6880.7 | 0.02/0.02(0.85) 1.00/1.00(0.15) 0.02/206.41(0.01) 1.00/206.41(0.00) | BrSnull | 6.935206 | 0.0085 |
| BrSnull | br4 | 113 | -6884.17 | 0.02/0.02(0.84) 1.00/1.00(0.15) 0.02/1.00(0.01) 1.00/1.00(0.00) | | | |
| CmC | br4 | 115 | -6790.77 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.19/0.33(0.22) | m2aREL | 1.458706 | 0.2271 |
| CmD | br4 | 116 | -6789.87 | 0.00/0.00(0.73) 0.23/0.23(0.22) 1.30/1.72(0.05) | m3 | 0.172944 | 0.6775 |
| BrSalt | br5 | 114 | -6886.59 | 0.02/0.02(0.85) 1.00/1.00(0.15) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.000 |
| BrSnull | br5 | 113 | -6886.59 | 0.02/0.02(0.85) 1.00/1.00(0.15) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | br5 | 115 | -6791.44 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.19/0.16(0.21) | m2aREL | 0.104828 | 0.7461 |
| CmD | br5 | 116 | -6786.16 | 0.01/0.01(0.76) 0.28/0.28(0.20) 1.56/24.69(0.04) | m3 | 7.595286 | 0.0059 |
| BrSalt | br6 | 114 | -6886.59 | 0.02/0.02(0.85) 1.00/1.00(0.15) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.0000 |
| BrSnull | br6 | 113 | -6886.59 | 0.02/0.02(0.85) 1.00/1.00(0.15) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | br6 | 115 | -6788.1 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.20/0.00(0.22) | m2aREL | 6.797828 | 0.0091 |
| CmD | br6 | 116 | -6787.02 | 0.01/0.01(0.77) 0.31/0.31(0.20) 1.78/28.55(0.03) | m3 | 5.862952 | 0.0155 |
| BrSalt | fw2 | 114 | -6879.49 | 0.02/0.02(0.83) 1.00/1.00(0.15) 0.02/9.32(0.02) 1.00/9.32(0.00) | BrSnull | 2.87387 | 0.0900 |
| BrSnull | fw2 | 113 | -6880.93 | 0.02/0.02(0.78) 1.00/1.00(0.13) 0.02/1.00(0.07) 1.00/1.00(0.01) | | | |
| CmC | fw2 | 115 | -6788.48 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.19/0.67(0.21) | m2aREL | 6.028356 | 0.0141 |
| CmD | fw2 | 116 | -6786.28 | 0.00/0.00(0.74) 1.34/1.34(0.05) 0.22/0.87(0.21) | m3 | 7.343142 | 0.0067 |
| BrSalt | fw3 | 114 | -6865.86 | 0.02/0.02(0.74) 1.00/1.00(0.12) 0.02/1.00(0.12) 1.00/1.00(0.02) | BrSnull | 0 | 1.0000 |
| BrSnull | fw3 | 113 | -6865.86 | 0.02/0.02(0.74) 1.00/1.00(0.12) 0.02/1.00(0.12) 1.00/1.00(0.02) | | | |
| CmC | fw3 | 115 | -6784.46 | 0.00/0.00(0.71) 1.00/1.00(0.07) 0.17/0.96(0.22) | m2aREL | 14.064036 | 0.0002 |
| CmD | fw3 | 116 | -6783.62 | 0.00/0.00(0.72) 1.25/1.25(0.06) 0.20/1.03(0.22) | m3 | 12.664876 | 0.0004 |
| BrSalt | fw6 | 114 | -6885.01 | 0.02/0.02(0.85) 1.00/1.00(0.14) 0.02/1.00(0.01) 1.00/1.00(0.00) | BrSnull | 2E-06 | 0.9989 |
| BrSnull | fw6 | 113 | -6885.01 | 0.02/0.02(0.85) 1.00/1.00(0.14) 0.02/1.00(0.01) 1.00/1.00(0.00) | | | |
| CmC | fw6 | 115 | -6791.08 | 0.00/0.00(0.72) 1.00/1.00(0.07) 0.18/0.22(0.21) | m2aREL | 0.827732 | 0.3629 |
| CmD | fw6 | 116 | -6789.84 | 0.00/0.00(0.73) 1.30/1.30(0.05) 0.22/0.24(0.22) | m3 | 0.230558 | 0.6311 |

Table S4.4. Branch-sites and clade model (PAML) results for the 56 species Beloniformes rhodopsin dataset.

| Model | Partition | np | lnL | Parameter estimates | Null | LRT | p value |
|---|---|---|---|---|---|---|---|
| m2aREL | NA | 116 | -4748.82 | 0.02(0.85) 1.00(0.13) 6.15(0.02) | | | |
| m3 | NA | 117 | -4745.75 | 0.01(0.81) 0.68(0.17) 5.68(0.03) | | | |
| Transitional branches | | | | | | | |
| BrSalt | 1 | 116 | -4826.6 | 0.02/0.02(0.83) 1.00/1.00(0.14) 0.02/1.17(0.03) 1.00/1.17(0.00) | BrSnull | 0 | 0.9681 |
| BrSnull | 1 | 115 | -4826.61 | 0.02/0.02(0.83) 1.00/1.00(0.14) 0.02/1.00(0.03) 1.00/1.00(0.01) | | | |
| CmC | 1 | 117 | -4748.44 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.07/11.80(0.02) | m2aREL | 0.76 | 0.3835 |
| CmD | 1 | 118 | -4745.74 | 0.01/0.01(0.81) 5.67/5.67(0.03) 0.68/0.74(0.17) | m3 | 0.02 | 0.8751 |
| BrSalt | 2 | 116 | -4827.12 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 0.9989 |
| BrSnull | 2 | 115 | -4827.12 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | 2 | 117 | -4746.51 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.00/46.99(0.02) | m2aREL | 4.61 | 0.0317 |
| CmD | 2 | 118 | -4745.73 | 0.01/0.01(0.81) 5.68/5.68(0.03) 0.68/0.56(0.17) | m3 | 0.03 | 0.8525 |
| BrSalt | 3 | 116 | -4825.91 | 0.02/0.02(0.83) 1.00/1.00(0.14) 0.02/3.25(0.03) 1.00/3.25(0.00) | BrSnull | 0.29 | 0.5912 |
| BrSnull | 3 | 115 | -4826.06 | 0.02/0.02(0.80) 1.00/1.00(0.13) 0.02/1.00(0.06) 1.00/1.00(0.01) | | | |
| CmC | 3 | 117 | -4748.27 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.05/14.85(0.02) | m2aREL | 1.09 | 0.2970 |
| CmD | 3 | 118 | -4745.49 | 0.01/0.01(0.80) 0.16/0.16(0.68) 5.59/8.27(0.02) | m3 | 0.52 | 0.4708 |
| BrSalt | 4 | 116 | -4827.13 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.0000 |
| BrSnull | 4 | 115 | -4827.13 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | 4 | 117 | -4747.91 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.06/25.37(0.02) | m2aREL | 1.81 | 0.1779 |
| CmD | 4 | 118 | -4745.74 | 0.01/0.01(0.81) 5.68/5.68(0.03) 0.68/0.79(0.17) | m3 | 0.02 | 0.8757 |
| BrSalt | 5 | 116 | -4827.19 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.0000 |
| BrSnull | 5 | 115 | -4827.19 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | 5 | 117 | -4748.25 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.19/0.00(0.02) | m2aREL | 1.14 | 0.2856 |
| CmD | 5 | 118 | -4745.68 | 0.01/0.01(0.81) 5.67/5.67(0.03) 0.68/1.09(0.17) | m3 | 0.14 | 0.7124 |
| BrSalt | 6 | 116 | -4827.13 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.0000 |
| BrSnull | 6 | 115 | -4827.13 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | 6 | 117 | -4748.61 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.09/13.95(0.02) | m2aREL | 0.42 | 0.5152 |
| CmD | 6 | 118 | -4745.68 | 0.01/0.01(0.81) 5.68/5.68(0.03) 0.69/0.46(0.17) | m3 | 0.13 | 0.7205 |
| Freshwater clades | | | | | | | |
| BrSalt | 1 | 116 | -4813.92 | 0.02/0.02(0.85) 1.00/1.00(0.13) 0.02/6.35(0.02) 1.00/6.35(0.00) | BrSnull | 8.72 | 0.0031 |
| BrSnull | 1 | 115 | -4818.28 | 0.02/0.02(0.83) 1.00/1.00(0.13) 0.02/1.00(0.03) 1.00/1.00(0.00) | | | |
| CmC | 1 | 117 | -4747.92 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.40/3.68(0.02) | m2aREL | 1.79 | 0.1809 |
| CmD | 1 | 118 | -4744.61 | 0.01/0.01(0.81) 0.67/0.67(0.17) 5.84/3.14(0.03) | m3 | 2.27 | 0.1315 |
| BrSalt | 3 | 116 | -4823.91 | 0.02/0.02(0.83) 1.00/1.00(0.14) 0.02/4.42(0.02) 1.00/4.42(0.00) | BrSnull | 1.43 | 0.2313 |
| BrSnull | 3 | 115 | -4824.63 | 0.02/0.02(0.80) 1.00/1.00(0.13) 0.02/1.00(0.06) 1.00/1.00(0.01) | | | |
| CmC | 3 | 117 | -4748.54 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.04/9.11(0.02) | m2aREL | 0.56 | 0.4540 |
| CmD | 3 | 118 | -4745.24 | 0.01/0.01(0.80) 0.68/0.68(0.17) 5.59/13.30(0.03) | m3 | 1.02 | 0.3125 |
| BrSalt | 5 | 115 | -4827.18 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.0000 |
| BrSnull | 5 | 116 | -4827.18 | 0.02/0.02(0.86) 1.00/1.00(0.14) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | 5 | 117 | -4748.78 | 0.02/0.02(0.85) 1.00/1.00(0.13) 6.19/5.46(0.02) | m2aREL | 0.07 | 0.7907 |
| CmD | 5 | 118 | -4745.69 | 0.01/0.01(0.81) 0.68/0.68(0.17) 5.72/4.90(0.03) | m3 | 0.11 | 0.7419 |

Table S4.5. Analysis of Sciaenidae rhodopsin dataset with new Aplodinotus grunniens sequence using random-sites, branch-sites and clade models (PAML).

| Model | np | lnL | Parameter estimates | Null | LRT | p value |
|---|---|---|---|---|---|---|
| Random-sites models | | | | | | |
| m0 | 225 | -7135.98 | 0.27(1.00) | | | |
| m1 | 226 | -6634.58 | 0.02(0.84) 1.00(0.16) | m0 | 1002.8 | 0 |
| m2 | 228 | -6532.14 | 0.02(0.83) 1.00(0.13) 4.57(0.04) | m1 | 204.88 | 0 |
| m2aREL | 228 | -6532.14 | 0.02(0.83) 1.00(0.13) 4.57(0.04) | | | |
| m3 | 229 | -6531.66 | 0.02(0.82) 0.87(0.14) 4.21(0.04) | m2aREL | 0.96 | 0.3272 |
| Partitioned models: South American branch | | | | | | |
| branch-sites | 228 | -6624.88 | 0.02/0.02(0.82) 1.00/1.00(0.14) 0.02/18.66(0.03) 1.00/18.66(0.01) | BrSnull | 9.66 | 0.0019 |
| branch-site null | 227 | -6629.71 | 0.02/0.02(0.75) 1.00/1.00(0.13) 0.02/1.00(0.10) 1.00/1.00(0.02) | | | |
| CmC | 229 | -6531.15 | 0.02/0.02(0.83) 1.00/1.00(0.13) 4.39/9.79(0.04) | M2aREL | 1.98 | 0.1594 |
| CmD | 230 | -6526.05 | 0.02/0.02(0.82) 4.47/4.47(0.04) 0.88/4.50(0.14) | M3 | 11.22 | 0.0008 |
| Partitioned models: Aplodinotus grunniens branch | | | | | | |
| branch-sites | 228 | -6634.57 | 0.02/0.02(0.84) 1.00/1.00(0.15) 0.02/1.00(0.00) 1.00/1.00(0.00) | BrSnull | 0 | 1.0000 |
| branch-site null | 227 | -6634.57 | 0.02/0.02(0.84) 1.00/1.00(0.15) 0.02/1.00(0.00) 1.00/1.00(0.00) | | | |
| CmC | 229 | -6531.37 | 0.02/0.02(0.83) 1.00/1.00(0.13) 4.59/1.89(0.04) | M2aREL | 1.54 | 0.2146 |
| CmD | 230 | -6531.24 | 0.87/0.87(0.14) 4.21/4.21(0.04) 0.02/0.00(0.82) | M3 | 0.84 | 0.3594 |

4.9.2. Supplementary figures


Figure S4.1. Ancestral habitat reconstructions. Extant marine and freshwater fishes are represented in blue and red, respectively. The colours of ancestral branches represent the probability of freshwater inhabitancy (red) from binary-state ancestral reconstructions of tip values using an equal-rates model. The phylogeny is based on species trees in Lovejoy (2004), Bloom and Lovejoy (2014), Chapter Three and Betancur-R et al. (2017). Branch lengths are based on codon branch lengths estimated by M0 in PAML.


Figure S4.2. Bootstrap consensus tree of Sciaenidae rhodopsin. Bootstrap support values from 1000 replicates using an SH-like aLRT test are displayed at each node. Bolded branches indicate Aplodinotus grunniens rhodopsin sequences.

Figure S4.3. Substitutions on transitional branches and freshwater clades vs. null estimates. Frequency of substitutions in rhodopsin occurring on the transitional branches and freshwater clades and in marine lineages. Points are coloured by proximity to the chromophore (see Figure 4.5 for scale). Grey points represent null estimates of the expected number of substitutions, from 100 replicates generated using EVOLVER (PAML) on the same tree with the same branch lengths.

CHAPTER FIVE: RHODOPSIN SUBJECT TO SELECTIVE CONSTRAINT IN GYMNOTIFORM FISHES TO MAINTAIN VISUAL SENSITIVITY IN LIGHT-LIMITED UNDERWATER ENVIRONMENT

Contributors: Alexander Van Nynatten, Francesco Janzen, Kristen Brochu, William GR Crampton, Belinda SW Chang, Nathan R Lovejoy

Author contributions: AVN, BSWC and NRL designed the study. AVN, KB and FJ collected the rhodopsin sequence dataset. AVN analysed the data and wrote the manuscript with edits, feedback and guidance from BSWC, WGRC and NRL.

5.1. ABSTRACT

Functional variation in the dim-light-specialized visual pigment, rhodopsin, frequently occurs in species inhabiting light-limited environments. Variation in visual function can arise through two processes: relaxation of selection or adaptive evolution improving photon detection in a given environment. Here, we investigate the molecular evolution of rhodopsin in Gymnotiformes, an order of South American fishes with sophisticated electrosensory capabilities. These nocturnal fishes are thought to have poor vision, potentially resulting from a sensory trade-off between vision and electrolocation. We surveyed rhodopsin from 147 gymnotiform species, spanning the order, and analyzed rates of molecular evolution. In contrast to our expectation, we detected strong selective constraint in gymnotiform rhodopsin with rates of non-synonymous to synonymous substitutions lower in gymnotiforms than other vertebrate lineages. In addition, we found evidence for positive selection on the branch leading to gymnotiforms and in one clade of deep-channel specialized gymnotiform species. On the gymnotiform branch, positively selected sites include a substitution associated with visual disease in humans, but its effect is likely masked by epistatic substitutions at nearby sites. Our results suggest that rhodopsin remains an important component of the gymnotiform sensory system alongside electrolocation, and that photosensitivity of rhodopsin is well adapted for vision in dim-light environments.

5.2. INTRODUCTION

Animals rely on the information collected by multiple sensory systems to navigate and orient themselves in their environment, avoid predators, and find mates. Vertebrates have specialized sensory organs to detect light, sound, electric and magnetic fields, heat and many different chemical compounds (Gutierrez et al. 2016). Most of the sensory systems possessed by extant vertebrates predate, and likely contributed to, their rapid diversification during the Cambrian explosion (Shu et al. 2003). When conditions are favourable, species integrate information from multiple sensory modalities, reducing noise that may be associated with a specific sensory system or compensating for a lack of signal from any one modality, resulting in a robust representation of the environment (Munoz and Blumstein 2012).

Many vertebrate lineages are characterized by sensory adaptations to environments where the array of stimuli is diminished. Sensory systems are generally energetically expensive, and suboptimal sensory systems are rapidly reduced and eventually lost (Niven and Laughlin 2008). For example, the invasion of terrestrial environments resulted in a loss of the lateral line and electrosensory organs found across fishes because neither sensory system is effective out of water (Gutierrez et al. 2016). Regressive evolution is also observed in the rudimentary eyes of vertebrate lineages inhabiting lightless caves and deep-sea habitats (Fernholm and Holmberg 1975; Niemiller et al. 2013). Adaptations that improve sensory systems in specific environments are also common (Gutierrez et al. 2016). In some cases, entirely new sensory solutions arise. Remarkably, the same novel sensory system has sometimes evolved convergently in distantly related species experiencing similar selection pressures. Examples include the advent of echolocation in whales and bats (Parker et al. 2013) and active electrolocation in distantly related nocturnal fishes (Lavoué et al. 2012).

Investigations of molecular evolution mostly mirror the macroscopic differences observed in species undergoing adaptive or regressive sensory evolution. Genes associated with the visual system, including but not limited to the light sensitive opsin genes, have been pseudogenized in many cave-dwelling fishes, deep-sea fishes, subterranean mammals, aquatic mammals, and snakes (Meredith et al. 2013; Emerling and Springer 2014; Schott et al. 2018).

The reduced opsin complement observed in mammals is also thought to be a remnant of relaxed selection on the visual system that occurred during a period of nocturnal ancestry (Heesy and Hall 2010). Conversely, evolutionary rates reflecting adaptive evolution have been found in the prestin gene of bats and cetaceans, where convergent amino acid substitutions are thought to facilitate the high-frequency sound sensitivity required for echolocation (Parker et al. 2013). A similar trend is observed in the adaptive molecular evolution of the sodium channel gene (Nav1.4a), expressed in the electrogenic organ of fishes capable of active electrolocation (Zakon et al. 2006).

Regressive evolution in one sensory modality might relate to adaptive evolution in alternative sensory systems better suited for the environment. This evolutionary process is known as a sensory trade-off (Nummela et al. 2013). The reduced visual performance in bats and cetaceans following the convergent evolution of echolocation is one common example of a sensory trade-off (Parker et al. 2013), and this process is also thought to explain the principal reliance on either olfaction or vision in mammals (Nummela et al. 2013). In mormyroids, an electrogenic clade of fishes found in Africa, the evolution of these two sensory modalities appears dichotomous, with species having well-adapted visual systems or electrosensory systems, but not both, also suggesting a possible sensory trade-off (Stevens et al. 2013). Molecular evidence for sensory trade-offs has been reported in high duty cycle echolocating bats, the naked mole-rat, the star-nosed mole, and the blind Mexican cave fish. In these species, the molecular evolution of vision genes is relaxed coincident with an expansion of auditory, mechanosensory and gustatory sensory systems (Wilkens 2010; Emerling and Springer 2014; Gutierrez et al. 2018).

The electric knifefishes of South and Central America (Order Gymnotiformes) and the electric elephantfishes of Africa (Mormyroidea) are the only two vertebrate clades capable of active electrolocation (Albert and Crampton 2006). Unlike passive electrolocation, active electrolocation involves the production of stereotypical electric fields through specialized electric organs made up of modified muscle or neuronal tissue (Albert and Crampton 2006). Much like echolocation, active electrolocation is thought to benefit species in the absence of light, but instead of sound, disturbances in weak self-generated electric fields are monitored by a sophisticated electrosensory system consisting of specialized high-frequency-sensitive tuberous electroreceptors (Albert and Crampton 2006). Gymnotiforms inhabit a wide range of habitats in the Neotropics, ranging from clear shallow streams to the depths of large turbid and tannin-stained rivers, and are mostly nocturnally active (Albert and Crampton 2005). Active electrolocation provides a clear advantage over vision in highly turbid or tannin-stained river environments where down-welling light is rapidly attenuated, especially at night (Takiyama et al. 2015), and evolved in gymnotiforms and mormyroids convergently for life in similar environments on different continents (Lavoué et al. 2012). Gymnotiforms have small eyes and are thought to have poor vision (Takiyama et al. 2015), but to what extent this apparent reduction in visual proficiency has influenced the molecular evolution of genes associated with vision has not been investigated.

The first step of the visual transduction pathway is mediated by light-sensitive visual pigments expressed in the outer segments of rod and cone photoreceptors (Bowmaker 2008). Most vertebrates have multiple visual pigment classes but rely exclusively on rhodopsin for vision in dim-light settings (Bowmaker 2008). Rhodopsin is well adapted for vision in dim light. It is more thermally stable than other visual pigments, improving the signal-to-noise ratio, and remains in its active state longer, activating more second messenger molecules and increasing signal amplification (Ernst et al. 2014). Maintaining these functional properties places the rhodopsin gene under significant evolutionary constraint, which can be seen in its high level of sequence conservation across vertebrates (Hauser et al. 2016). The deleterious effect of substitutions in a rod-dominated retina can be seen in the numerous disease-associated rhodopsin mutations in humans (Daiger et al. 2013). However, substitutions altering the functional properties of rhodopsin have been reported in multiple vertebrate lineages, especially in those inhabiting light-limited environments (Bowmaker 2008). These substitutions have been attributed to relaxed selection in some cave-dwelling lineages that exist in environments that are devoid of light (Niemiller et al. 2013). In contrast, adaptive evolution in rhodopsin has been observed in many species inhabiting dim-light environments. In these species, convergent substitutions are observed at sites that shift the spectral sensitivity to more closely match the wavelengths of light illuminating an environment (Van Nynatten et al. 2015) or alter kinetic properties of rhodopsin to maximize photosensitivity (Hauser et al. 2017).

In this study, we investigate the molecular evolution of gymnotiform rhodopsin, and test whether the evolution of active electrolocation in gymnotiforms has reduced selective constraint on rhodopsin as part of a sensory trade-off or instead whether these fishes have undergone adaptive evolution, improving the photosensitivity of rhodopsin to the dim-light environments they inhabit. Using models that measure site-specific substitution rates across a gene, we compare evolutionary rates in gymnotiform rhodopsin with other fishes that are known to rely heavily on vision. We also investigate if the diverse array of photic niches inhabited by gymnotiforms, including the highly turbid and tannin-stained deep channels of the Amazon basin, has influenced rates and patterns of rhodopsin evolution within the gymnotiform clade.

5.3. METHODS

5.3.1. Gymnotiform rhodopsin dataset

We amplified and sequenced rhodopsin from 147 gymnotiform species using primers and protocols from Chen et al. (2003). Sequences were aligned using the DECIPHER package in R (Wright 2015). We used the online implementation of IQ-TREE (Nguyen et al. 2015) to reconstruct a maximum likelihood gene tree for rhodopsin using eight other ostariophysan fishes as outgroups. The best-fitting substitution model (HKY+F+I+G4) was determined with a Bayesian Information Criterion comparison of the 88 models available in IQ-TREE. Node support was assessed for a consensus tree generated from 1000 bootstrap replicates (Supplementary figure S5.1).

5.3.2. Vertebrate rhodopsin dataset

We used BlastPhyME (Schott et al. 2016) to assemble a large dataset of rhodopsin sequences available on GenBank (Supplementary table S5.1). An alignment was generated for these sequences using the DECIPHER package in R (Wright 2015). Depth data for fishes in our rhodopsin dataset were collected using the rfishbase package in R (Boettiger et al. 2012).

To avoid any phylogenetic incongruence in the rhodopsin data, analyses of molecular evolution on the vertebrate rhodopsin dataset consisted only of rhodopsin sequences belonging to species represented in the robust multi-gene species tree generated by Betancur-R et al. (2017). For computational tractability, we kept all sequences longer than 700 bp for Characiphysian fishes but only representative species for other major vertebrate lineages spanning Gnathostomata (Supplementary figure S5.1).

5.3.3. Analyses of molecular evolution

To estimate rates of non-synonymous to synonymous substitutions (dN/dS) in rhodopsin we employed the maximum likelihood models available in PAML and HYPHY. Random-sites models in PAML (M0, M1a and M2a) have one to three site classes, respectively (Yang and Swanson 2002). M2a has one site class where dN/dS can exceed one. Support for positive selection (dN/dS > 1) is tested by comparing the fit of M2a to M1a. Models M7 and M8 are extensions of M1a and M2a with ten site classes whose dN/dS estimates are defined by a beta distribution (Yang and Swanson 2002). The branch-sites test was used to compare dN/dS estimates for the branch leading to the gymnotiform clade with the rest of the tree (Yang et al. 2000). This model allows positive selection at a subset of sites only on the selected branch or clade (herein called the foreground), and its fit is compared to a nested null model that assumes no positive selection by fixing the dN/dS estimate for the foreground to one (Yang and Nielsen 2002). Clade models C and D (CmC and CmD) allow dN/dS to differ in a subset of sites among pre-determined phylogenetic partitions, but do not restrict the estimate to be greater than one (Bielawski and Yang 2004). The fit of these models was compared with the nested null models M2aREL and M3, respectively, in which the dN/dS estimate for the divergent site class in CmC and CmD is constrained to be uniform across the phylogeny (Weadick and Chang 2011). BUSTED, RELAX and the adaptive branch-site REL (aBSREL) models were accessed through the Datamonkey webserver (Pond et al. 2005; Delport et al. 2010; Kosakovsky Pond et al. 2011; Wertheim et al. 2014; Murrell et al. 2015). We employed BUSTED in a similar fashion to the branch-sites test in PAML, selecting the branch leading to the gymnotiform clade as the foreground. RELAX was used to test whether selection pressures have relaxed or intensified on the gymnotiform clade, with cypriniforms used as the reference. The aBSREL model was used on a dataset composed only of gymnotiforms without the specification of any foreground lineage. Ancestral sequences were reconstructed using the best-fitting codon models and the Dayhoff, Jones and WAG empirical amino acid substitution matrices. Substitutions were modelled onto the rhodopsin meta II active-state structure (Choe et al. 2011) using UCSF Chimera (Pettersen et al. 2004).
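The nested-model comparisons described above are likelihood ratio tests. As a minimal sketch only (not the scripts used in this thesis), the Python snippet below computes the test statistic as twice the difference in log-likelihood and compares it with a chi-square distribution whose degrees of freedom equal the difference in the number of parameters (np). The function names are illustrative, and the closed-form chi-square tail is implemented only for one or two degrees of freedom, which covers the comparisons used here. The example values are the branch-sites and M2 vs. M1 rows of Table S4.5.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (df = 1 or 2 only)."""
    if x <= 0:
        return 1.0
    if df == 1:
        # For one degree of freedom the tail reduces to the complementary error function.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # For two degrees of freedom the distribution is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(lnl_alt, lnl_null, np_alt, np_null):
    """Return (LRT statistic, degrees of freedom, p value) for two nested models."""
    df = np_alt - np_null
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, df, chi2_sf(stat, df)


# Branch-sites model vs. its null, South American branch (Table S4.5).
brs_stat, brs_df, brs_p = likelihood_ratio_test(-6624.88, -6629.71, 228, 227)
print(f"branch-sites: LRT = {brs_stat:.2f}, df = {brs_df}, p = {brs_p:.4f}")

# M2 vs. M1 random-sites comparison (Table S4.5); two extra parameters.
m2_stat, m2_df, m2_p = likelihood_ratio_test(-6532.14, -6634.58, 228, 226)
print(f"M2 vs M1: LRT = {m2_stat:.2f}, df = {m2_df}, p = {m2_p:.3g}")
```

Under these assumptions the first comparison gives the LRT of 9.66 listed in Table S4.5. Small discrepancies from other tabulated values can arise when the statistic is computed from rounded log-likelihoods.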

5.4. RESULTS

5.4.1. Rhodopsin subject to strong purifying selection within the gymnotiform clade

We collected rhodopsin sequences from 147 gymnotiform species and analyzed their molecular evolution but, contrary to our expectation, found no evidence for relaxed selection.

Comparing dN/dS estimates using the random-sites models in PAML suggests gymnotiform rhodopsin is evolving under strong purifying selection (m0: dN/dS = 0.03, Table 5.1). We also found no evidence of pervasive positive selection (m2a and m1a: p = 1.00; m8 and m7: p = 1.00, Table 5.1). Models incorporating a class of neutrally evolving sites fit better than nested null models (m1a and m0: p < 0.0001; m2aREL and m3ns2: p = 0.0227, Table 5.1), but only a small number of sites are included in the neutrally evolving site class (m1a; m2aREL, Table 5.1), well within expected values for functional protein coding genes (Yang and Swanson 2002).

We used clade models C and D in PAML to test whether dN/dS estimates for gymnotiform rhodopsin are significantly different from those of other vertebrates included in our 50-species vertebrate rhodopsin dataset (Bielawski and Yang 2004). We find that a subset of sites in the rhodopsin gene of gymnotiforms is under stronger purifying selection, with dN/dS estimates half that of other vertebrates (Table 5.2; Figure 5.1a). Models estimating dN/dS separately for the gymnotiform clade fit significantly better than nested null models assuming a uniform dN/dS across the entire vertebrate phylogeny (CmC and M2aREL: p < 0.0001, CmD and M3: p < 0.0001; Table 5.2). Using RELAX (Wertheim et al. 2014) we directly compared dN/dS in the gymnotiform clade with the Cypriniformes (minnows and their allies), a comparably diverse group of highly visual fishes also belonging to the superorder Ostariophysi (Figure 5.1a; Supplementary figure S5.2). These analyses indicate that a subset of sites in rhodopsin is evolving under stronger purifying selection (dN/dS << 1) in gymnotiforms than in cypriniforms. Clade models C and D with the gymnotiform and cypriniform clades set as independent foreground partitions are not better fitting than models with gymnotiforms representing the only foreground lineage (Figure 5.1a; Table 5.2), indicating that rhodopsin is evolving under more selective constraint in gymnotiforms than in other fish lineages.

5.4.2. Positive selection on branch leading to the Gymnotiformes

We used the branch-sites model in PAML to test for adaptive evolution on the branch leading to the gymnotiform clade (Yang and Nielsen 2002). We find evidence that positive selection occurred on the branch leading to the common ancestor of gymnotiforms (Branch-sites and Branch-sites null: p = 0.0455; Table 5.2). Parameter estimates for the positively selected site class in the branch-sites model suggest a small number of sites are under positive selection with a very high dN/dS (Branch-sites foreground dN/dS = 83.14, Table 5.2). BUSTED, a similar test in HYPHY, supports this finding (p < 0.05; dN/dS = 72.27).

5.4.3. Positive selection in rhodopsin associated with deep-water adaptation

Using models that do not specify phylogenetic partitions a priori, we find evidence for positive selection on a branch leading to a clade of deep-channel specialists within the family Apteronotidae that inhabit large rivers in the Amazon basin (Branch-sites REL – full adaptive model and baseline model: corrected p value = 0.0014) (Pond et al. 2005; Delport et al. 2010; Kosakovsky Pond et al. 2011). Ancestral codon and amino acid reconstructions strongly support the T214F substitution along this branch (PP = 1.00), a substitution that occurs on no other branch in the gymnotiform phylogeny (Figure 5.2a). In fact, a phenylalanine at site 214 in rhodopsin is rare, found in just 15 other vertebrate sequences of 2754 examined (Supplementary table S5.1). These 15 species represent lineages from seven different families and six orders of fishes, indicating multiple independent substitutions. Fishes with a phenylalanine at site 214 typically inhabit deeper water, with the majority of species with this residue residing below the 200 m cutoff characterizing the mesopelagic or twilight zone (Figure 5.2b). A Wilcoxon–Mann–Whitney test indicates that fishes with a phenylalanine at site 214 are significantly deeper-dwelling than fishes with a threonine at site 214 (p = 0.0002), the residue found in most gymnotiforms and present in the common ancestor of the gymnotiform clade.
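The depth comparison above is a two-sample rank test. As a rough illustration of how such a test works, the sketch below implements a two-sided Wilcoxon–Mann–Whitney test in plain Python using midranks and the normal approximation with a tie correction (no continuity correction). This is not the analysis code used here, the function name is illustrative, and the depth values are invented purely for the example; they are not taken from Supplementary table S5.1. With samples this small the normal approximation is itself only approximate.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for sample x, z score, two-sided p value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, z, p


# HYPOTHETICAL depths in metres, invented for illustration only.
phe214_depths = [250, 400, 600, 800, 1200]
thr214_depths = [5, 10, 20, 30, 50, 100, 150]

u_stat, z_score, p_value = mann_whitney_u(phe214_depths, thr214_depths)
print(f"U = {u_stat:.1f}, z = {z_score:.2f}, p = {p_value:.4f}")
```

Because every invented depth in the first sample exceeds every depth in the second, U takes its maximum value (the product of the two sample sizes).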

5.4.4. An amino acid in gymnotiform rhodopsin causes visual disease in humans

We were surprised to observe an amino acid in rhodopsin (cysteine at site 220), present in all gymnotiforms, that in humans causes retinitis pigmentosa (RP), a genetic visual disease (Bunge et al. 1993; Daiger et al. 2013). Unlike many mutations that cause RP, F220C does not interfere with intracellular trafficking or prevent the formation of a light-sensitive visual pigment. Instead, the substitution ablates rhodopsin dimerization and oligomerization through interactions at the dimer interface (Ploier et al. 2016). To determine whether the F220C substitution is likely to have the same effect in the gymnotiform sequence background, we reconstructed ancestral rhodopsin sequences for the most recent common ancestor of gymnotiforms and humans as well as the nodes in between this ancestor and the gymnotiform clade (Figure 5.1a; Supplementary figure S5.3). We then modelled these substitutions onto the 3D crystal structure of rhodopsin, focusing on substitutions occurring in close proximity to site 220 on the dimer interface (Figure 5.1b). Six of the eight amino acid substitutions along this evolutionary trajectory arose in the common ancestor of Otomorpha or the common ancestor of Characiphysi (Figure 5.1a). Sites 217 and 258 are particularly interesting because of their proximity to site 220 and because the phenylalanine residues found at both of these sites in gymnotiforms mirror the amino acid lost at site 220 (Figure 5.3). Both phenylalanines are completely conserved across gymnotiforms. Phenylalanine residues at sites 217 and 258 are also observed in distantly related fishes with a cysteine at site 220 (Supplementary table S5.1). In the meta-II active-state structure of rhodopsin (3PQR), the side chain of site 258 is closer to the phenylalanine at site 220 than that of any other residue (Figure 5.3) (Choe et al. 2011). The close proximity of these two side chains in the active-state structure would almost certainly cause a steric conflict if phenylalanines were present at both sites simultaneously (Figure 5.3). In gymnotiform evolution, the phenylalanine at site 258 appears only after a non-deleterious substitution occurs at site 220, and a rhodopsin sequence with phenylalanines at both sites is not observed in nature (Supplementary table S5.1). We expect that these and other substitutions along the dimerization interface have altered the functional implications of the F220C mutation, supporting the inference of positive selection at this site in branch-sites tests of the branch leading to gymnotiforms. Positive selection at a site can imply adaptive evolution, suggesting that a substitution causing disease in human rhodopsin might improve its function in other species because of epistatic interactions with neighboring residues.

5.5. DISCUSSION

In contrast with the expectations of a sensory trade-off between vision and active electrolocation, we find no evidence for relaxed selection in the rhodopsin gene of Gymnotiformes. Instead, the dN/dS estimate for gymnotiform rhodopsin is much lower than values previously reported for other groups of fishes (Van Nynatten et al. 2015; Hauser et al. 2017), indicating that, if anything, selection pressures on rhodopsin are intensified in Gymnotiformes. This result is the opposite of an emerging trend found in bat visual pigment evolution (Gutierrez et al. 2018; Wu et al. 2018). Echolocating bats, like gymnotiforms, have an alternative sensory modality better suited for dim-light environments. In some bats, echolocation has become more sophisticated, providing a higher resolution representation of the nocturnal environment (Gutierrez et al. 2018). These bats with high-duty cycle (HDC) echolocation display molecular, morphological and physiological differences indicating a lessened reliance on vision, and multiple studies have suggested this reduction in visual capacity is a result of a sensory trade-off with echolocation (Thiagavel et al. 2018; Gutierrez et al. 2018; Wu et al. 2018). As in bats, poor visual performance in gymnotiforms has been suggested on the basis of their small eyes (Takiyama et al. 2015). However, reduction in eye size does not appear to affect the rate of rhodopsin evolution in gymnotiforms. This could be due to differences in the two alternative sensory modalities, echolocation and electrolocation. The effective range of echolocation in bats far exceeds that of electrolocation in gymnotiforms (Albert and Crampton 2006; Madsen and Surlykke 2013), and while the visual field is reduced in turbid and tannin-stained waters, visual detection distances may exceed those of electrolocation in some conditions (Crampton 2007). Previous studies have shown that electric fishes use active electrolocation and vision for the respective detection of near and far objects (Schumacher et al. 2017), and other studies show that vision improves the ability of gymnotiform fishes to navigate their environments (Rose and Canfield 1993). Thus, one explanation of our results is that vision and electrolocation interact via a division of labour rather than a trade-off, with fishes using electrolocation for close-range detection and vision for the perception of more distant objects.

Sensory trade-offs between vision and alternative sensory modalities might not extend across the entire visual system, but may instead involve some of its components and not others. Most vertebrates have duplex retinas, where bright-light vision is mediated by cones and dim-light vision by rods (Bowmaker 2008). So far, many investigations of sensory trade-offs have focussed on the molecular evolution of cone opsins. Gymnotiforms appear to lack two cone opsin genes encoding the short-wavelength sensitive opsins SWS1 and SWS2 that are sensitive to wavelengths of light most rapidly attenuated in turbid and tannin-stained waters (Liu et al. 2016), while LWS, the long-wavelength sensitive opsin gene, and rhodopsin remain intact in the genome and are expressed in the eye. The retention of LWS and rhodopsin matches the opsin complement observed in echolocating species (bats and cetaceans) and subterranean mammals that rely more heavily on specialized mechanoreceptors (Meredith et al. 2013; Emerling and Springer 2014; Gutierrez et al. 2018). Cone opsins have also been lost in other vertebrate lineages, most often taxa inhabiting light-limited environments. This includes deep-sea fishes, nocturnal birds and mammals (Zhao et al. 2009; Emerling and Springer 2015), aquatic mammals, and fossorial mammals and reptiles (Emerling and Springer 2014; Schott et al. 2018). In addition to losses of cone opsins, relaxed selection and even the total loss of other genes associated with the visual transduction cascade or retinal cycle have also been reported in some of these taxa (Emerling and Springer 2014; Schott et al. 2018). However, with regard to rhodopsin, it is only in cave-dwelling fishes, which inhabit environments where light is entirely absent, that rates of molecular evolution are consistent with expectations of a sensory trade-off (Niemiller et al. 2013). Cave species exist at the extreme metabolic limit and display many regressive phenotypes associated with saving energy (Niemiller et al. 2013). The energetic cost associated with both vision and electrolocation is quite high (Salazar et al. 2013), but it would appear, at least in gymnotiforms, that losing rhodopsin, the best adapted opsin for sensing light in dim-light environments, is more detrimental than the energetic cost of maintaining it.

The strong purifying selection acting on rhodopsin might be a result of its high relative expression in the retina. High expression levels have been shown to increase purifying selection on proteins, potentially to reduce the deleterious effects associated with the accumulation of misfolded proteins (Drummond et al. 2005). Indeed, it is the high proportion of rods in the human retina that causes mutations in rhodopsin to so frequently result in disease (Daiger et al. 2013). The effect that expression levels might have on the molecular evolution of genes has not been widely investigated, and taking it into account might help explain some differences in the rates of molecular evolution observed in rod and cone genes (Schott et al. 2018). Further research is required to disentangle the influences that the relative expression of a gene and its evolutionary importance have on the rate of molecular evolution. Analyses of other genes that are highly expressed in rods but have a less direct effect on vision in dim-light environments might help investigate this relationship.

On the branch leading to the gymnotiform clade we find evidence for positive selection. This suggests that substitutions on this branch were adaptive and likely improved the functional properties of rhodopsin for the habitat of the common ancestor. The origination of gymnotiforms pre-dates the rise of the Eastern Cordillera of the Andes and the formation of the modern Amazonian watersheds (Albert and Crampton 2005). Whether or not these ancient rivers were similar in optical qualities to the rivers making up the present-day Amazon river basin is unclear, complicating inferences of the ecological importance of positively selected substitutions on the gymnotiform ancestral branch. However, some geomorphological evidence does suggest that the major water types characteristic of the contemporary Amazon basin (white water, black water and clear water) have existed since the origination of gymnotiforms (Crampton 2011). Catfishes (Order: Siluriformes) represent the most likely sister group to gymnotiforms (Arcila et al. 2017). These fishes are characterized by their barbels, another sensory adaptation to optically challenging environments, suggesting the common ancestor of these two lineages might have inhabited dim-light environments. Catfishes also possess electrosensory organs, but they are less sophisticated and are not paired with active electrogenesis, limiting their use for electrolocation and navigation (Albert and Crampton 2005). The evolution of more sophisticated electrosensory systems in gymnotiforms might have facilitated transitions into deeper, darker, and more turbid environments as well as a more nocturnal life history, necessitating further adaptations in rhodopsin for improved vision in dim-light environments.

Substitutions improving sensitivity in dim-light environments have been observed in many fishes inhabiting environments with challenging optical conditions (Sugawara et al. 2010), most notably the deep sea (Hunt et al. 2001). We also find evidence for positive selection on a branch within the gymnotiform clade representing a transition into deep-channel habitats by members of the family Apteronotidae. These species primarily inhabit large turbid and tannin-stained rivers where ambient light levels drop, within 10 m of the surface, to intensities similar to those observed in the deep sea (Crampton 2007). In fact, many of these species have only recently been discovered due to the secrecy afforded to them by the depths they inhabit (Crampton 2007). Deep-channel gymnotiforms have multiple morphological adaptations to deep-water habitats (Crampton 2007). As in deep-sea fishes, feeding morphology has changed significantly, suggesting specialized diets and feeding habits (Crampton 2007). The light-limited environment has also resulted in reductions in skin pigmentation and eye size. Interestingly, we find that the T214F substitution observed along this branch occurs frequently in deep-sea fishes, evidence of possible convergent evolution in dim-light environments. Similar instances of positive selection in rhodopsin have been associated with other transitions into dim-light environments made by deep-diving whales and cichlids (Sugawara et al. 2010).

Most studies investigating molecular aspects of visual pigment evolution have focused on spectral tuning, a shift in the sensitivity of a visual pigment to more closely match wavelengths of light available in an environment (Hunt et al. 2001). The substitutions that are positively selected on the gymnotiform branch at site 220 and on the deep-channel gymnotiform lineage at site 214 are not at sites known to shift the spectral sensitivity of rhodopsin. Instead, we expect that these substitutions influence rhodopsin dimerization, a non-spectral property of rhodopsin that is thought to be important in maintaining thermal stability and improving signal amplification (Gunkel et al. 2015; Jastrzebska et al. 2016). In the long-wavelength sensitive opsin (LWS), a paralog of rhodopsin, substitutions at sites on helix 5, including sites 214 and 220 (bovine rhodopsin numbering), reduce dimer formation (Jastrzebska et al. 2016). Most freshwater fishes have a red-shifted rhodopsin. Red-shifted pigments are better matched to the underwater riverine visual environment but are also inherently noisier due to frequent thermally induced rhodopsin activation events, decreasing sensitivity to light (Ernst et al. 2014). It is thought that dimerization might increase the thermal stability of rhodopsin (Jastrzebska et al. 2016), and it may be especially important for freshwater fishes like gymnotiforms, which use red-shifted rhodopsins that are more prone to thermal noise in warm tropical rivers. The track-like oligomeric structures made up of rows of rhodopsin dimers observed in the outer segments of rod photoreceptors have also been hypothesized to improve signal amplification following light activation by sequestering the downstream signaling molecules that rhodopsin activates (Gunkel et al. 2015). Signal amplification through other mechanisms, including alteration of rhodopsin structure and function, has been shown to be adaptive in dim-light environments. Unlike spectral tuning, the same non-spectral adaptations to dim-light environments may be convergent in marine, freshwater and terrestrial species (Sugawara et al. 2010). The substitution at site 214 appears to represent an example of this, as convergent substitutions are observed in deep-water marine and freshwater fishes.

Surprisingly, one of the positively selected substitutions we see in our dataset is to an amino acid associated with disease in humans. The substitution to a cysteine at site 220 is conserved throughout the gymnotiform clade, but when observed in humans leads to the genetic degenerative eye disease RP (Bunge et al. 1993). When introduced into bovine rhodopsin in vitro, the F220C mutation eliminates dimerization (Ploier et al. 2016). The persistence of this disease-associated substitution in gymnotiforms is likely accommodated by substitutions at other sites maintaining the dimerization potential of the protein and masking its effect. Closer examination of this disease-associated substitution at site 220 in the context of the ancestral reconstructions of rhodopsin sequences indicates potential compensation by substitutions in close proximity on helices 5 and 6. These substitutions evolved prior to the disease-associated substitution (Figure 5.1a), and identical substitutions are observed in distantly related teleost lineages with the same residue at site 220 (Supplementary table S5.1). Dimerization might be particularly context dependent, as it is typically facilitated by an interface composed of multiple residues (Liberles et al. 2012). The repeated evolution of dimer-forming GPCRs from divergent monomeric ancestors would suggest that a dimerization interface can be derived through divergent evolutionary paths (Felce et al. 2017). Differences in the dimerization interface of rhodopsin might also be expected given the substantial sequence variation in helix 5 across vertebrates. However, the overall importance of this interface for normal rhodopsin function can be inferred from the high frequency of RP-associated substitutions occurring at sites on helix 5 (Daiger et al. 2013).

Understanding how rhodopsin sequence diversity impacts its function is critical for investigations of sensory evolution, especially in species inhabiting dim-light environments where rhodopsin represents the principal component of the visual system. Rhodopsin is also critically important for vision in humans, and comparative sequence analyses with other vertebrate species can improve our functional understanding of disease-associated substitutions. The accumulation of genetic data for non-model species has revealed many substitutions in other species that are pathogenic in humans, suggesting that substitutions elsewhere in the gene or genome mask or compensate for disease-associated substitutions (Kondrashov et al. 2002; Xu and Zhang 2014; Jordan et al. 2015). Combining mutation data from human disease databases with natural sequence variation has great potential to improve our functional understanding of proteins in both contexts. These types of studies will become increasingly powerful as techniques for genome screening, the acquisition of sequence data for non-model species, and bioinformatics tools improve. Rhodopsin represents a model system for further studies of this style. Large disease databases already exist for rhodopsin, and numerous rhodopsin sequences are available from a wide array of vertebrate species possessing rhodopsin proteins with diverse functional properties. Indeed, it is the functional flexibility of rhodopsin that has allowed its adaptation to many different environments and appears to have cemented its place in vertebrate vision and in the overall integration of sensory information, even in species with novel sensory systems for challenging optical environments like the gymnotiforms.

5.6. TABLES

Table 5.1. Random-sites (PAML) analyses of the 147 species Gymnotiformes rhodopsin dataset using the maximum likelihood rhodopsin gene tree. Parameter estimates are dN/dS (proportion of sites), given as background dN/dS / foreground dN/dS where applicable.

Model                np   lnL       Parameter estimates                  Null      LRT     p value
M0                   293  -8565.65  0.03(1.00)
M1a                  294  -8391.98  0.01(0.96), 1.00(0.04)               M0        347.47  < 0.0001
M2a                  296  -8391.98  0.01(0.96), 1.00(0.04), 13.68(0.00)  M1a       0       1.0000
M3 (2 site classes)  295  -8281.88  0.00(0.91), 0.25(0.09)
M2aREL               296  -8279.29  0.00(0.90), 1.00(0.01), 0.21(0.09)   M3 (ns2)  5.19    0.0227
M3                   297  -8265.96  0.00(0.88), 0.10(0.07), 0.38(0.05)   M2aREL    26.66   < 0.0001
M7                   294  -8271.57  p = 0.07, q = 1.39
M8                   296  -8271.57  p = 0.07, q = 1.39, 1.00(0.00)       M7        0       1.0000

Note: lnL, ln likelihood; LRT, likelihood ratio test result; Br-site, Branch-site.

Table 5.2. Random-sites, branch-site, and clade model (PAML) analyses of the 50 species vertebrate rhodopsin dataset. Parameter estimates are dN/dS (proportion of sites), given as background dN/dS / foreground dN/dS where applicable. Rows listing only a null model are additional tests of the model named at the start of the row.

Model               np   lnL        Parameter estimates                                         Null               LRT     p value
M0                  99   -14551.75  0.05(1.00)
M1                  100  -14345.03  0.04(0.93), 1.00(0.07)                                      M0                 413.44  < 0.0001
M2                  102  -14345.03  0.04(0.93), 1.00(0.03), 1.00(0.04)                          M1                 0       1.0000
M7                  100  -13947.32  p = 0.24, q = 3.07
M8a                 101  -13945.23  p = 0.25, q = 3.50
M8                  102  -13945.23  p = 0.26, q = 3.50, 1.00(0.01)                              M7                 4.18    0.1237
M8                                                                                              M8a                0       1.0000
M2aREL              102  -14001.94  0.01(0.73), 1.00(0.01), 0.17(0.26)
M3                  103  -13959.18  0.01(0.64), 0.10(0.28), 0.36(0.08)

Gymnotiformes branch
Branch Site (null)  101  -14342.37  0.04(0.90), 1.00(0.07), 0.04/1.00(0.03), 1.00/1.00(0.00)
Branch Site         102  -14340.37  0.04(0.91), 1.00(0.07), 0.04/83.14(0.02), 1.00/83.14(0.00)  BrS null           4       0.0455
CmC                 103  -14001.94  0.01(0.73), 1.00(0.01), 0.17/0.17(0.26)                     M2aREL             0       1.0000
CmD                 104  -13959.02  0.01(0.64), 0.10(0.28), 0.36/0.26(0.08)                     M3                 0.32    0.5716

Gymnotiformes clade
Branch Site (null)  101  -14340.81  0.04(0.92), 1.00(0.07), 0.04/1.00(0.01), 1.00/1.00(0.00)
Branch Site         102  -14340.59  0.04(0.92), 1.00(0.07), 0.04/1.79(0.00), 1.00/1.79(0.00)    BrS null           0.44    0.5071
CmC                 103  -13988.66  0.01(0.72), 1.00(0.01), 0.18/0.08(0.27)                     M2aREL             26.56   < 0.0001
CmD                 104  -13951.80  0.01(0.63), 0.09(0.26), 0.33/0.12(0.11)                     M3                 14.76   < 0.0001

Gymnotiformes branch and clade
CmC                 104  -13988.58  0.01(0.72), 1.00(0.01), 0.19/0.16/0.08(0.27)                M2aREL             26.72   < 0.0001
CmD                 105  -13951.56  0.01(0.63), 0.09(0.26), 0.34/0.23/0.12(0.11)                M3                 15.24   0.0001

Cypriniformes clade
CmC                 103  -14001.45  0.01(0.73), 1.00(0.01), 0.18/0.16(0.26)                     M2aREL             0.98    0.3222
CmD                 104  -13958.94  0.01(0.64), 0.36(0.08), 0.10/0.09(0.28)                     M3                 0.48    0.4884

Gymnotiformes and Cypriniformes clade
CmC                 104  -13987.62  0.01(0.72), 1.00(0.01), 0.20/0.09/0.16(0.27)                M2aREL             28.64   < 0.0001
CmC                                                                                             Gymnotiformes CmC  2.08    0.1492
CmC                                                                                             Cypriniformes CmC  27.66   < 0.0001
CmD                 105  -13950.85  0.01(0.64), 0.10(0.27), 0.40/0.16/0.32(0.09)                M3                 16.66   0.0002
CmD                                                                                             Gymnotiformes CmD  1.90    0.1675
CmD                                                                                             Cypriniformes CmD  23.24   < 0.0001

Note: lnL, ln likelihood; LRT, likelihood ratio test result; CmC, Clade model C; CmD, Clade model D
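The LRT and p value columns in Tables 5.1 and 5.2 are consistent with the usual likelihood ratio test arithmetic: twice the difference in ln likelihood between nested models, compared against a chi-squared distribution with degrees of freedom equal to the difference in parameter counts (np). A minimal Python sketch under that assumption follows. The function names are hypothetical, only 1 and 2 degrees of freedom are handled (using closed-form chi-squared tail probabilities), and the input values are taken from Table 5.2.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic: 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(stat, df):
    """Upper-tail chi-squared probability for df = 1 or df = 2 (closed forms)."""
    if stat <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# (label, lnL null, np null, lnL alternative, np alternative), from Table 5.2.
comparisons = [
    ("M1 vs M0", -14551.75, 99, -14345.03, 100),
    ("Branch Site vs null (Gymnotiformes branch)", -14342.37, 101, -14340.37, 102),
    ("M8 vs M7", -13947.32, 100, -13945.23, 102),
]

for label, lnl_null, np_null, lnl_alt, np_alt in comparisons:
    stat = lrt_statistic(lnl_null, lnl_alt)
    df = np_alt - np_null
    p = chi2_sf(stat, df)
    print(f"{label}: LRT = {stat:.2f}, df = {df}, p = {p:.4f}")
# The three comparisons give LRT values of about 413.44, 4.00 and 4.18,
# with p values of about 0.0000, 0.0455 and 0.1237 respectively.
```

Tests against a null on the boundary of parameter space (for example M8 against M8a) are often evaluated with a mixture distribution instead; that refinement is omitted from this sketch.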

5.7. FIGURES

Figure 5.1. Intensified purifying selection on gymnotiform rhodopsin. a) Vertebrate species tree based on Betancur-R et al. 2017. Rhodopsin amino acid substitutions in close proximity to site 220 are labelled. PAML dN/dS estimates for the third (divergent) site class in model CmD with three partitions are shown (top right). b) Amino acid differences between the common ancestor (node 1) and gymnotiform rhodopsin shown on the active-state crystal structure of rhodopsin, with helix five (dark grey) and helix six (light grey) facing out (Choe et al. 2011). Ancestral residues are shown in grey and gymnotiform residues in gold. The ancestral phenylalanine at site 220, the wild-type residue at the site implicated in retinitis pigmentosa, is coloured red.

Figure 5.2. Variation at rhodopsin site 214 in gymnotiforms and other fishes. a) Ancestral reconstruction of amino acid states at site 214 in gymnotiform rhodopsin. b) Boxplots showing the midpoint depths inhabited by fishes, grouped by amino acid identity at site 214. The mesopelagic zone (200–1000 m) is indicated by dashed lines. ** indicates significance (p < 0.001) for the Wilcoxon rank-sum test.

Figure 5.3. Epistatic interactions on helices 5 and 6 near the RP-associated F220C mutation. Amino acid differences near site 220 between gymnotiform rhodopsin and that of the common ancestor of humans and gymnotiforms, modelled onto the rhodopsin active-state crystal structure. Distances from site 220 to neighbouring residues are shown; colours match those of Figure 5.1. Variability at site 214 in gymnotiforms is also shown; colours match those of Figure 5.2.

5.8. REFERENCES

Albert JS, Crampton W. 2006. Electroreception and Electrogenesis. In: The Physiology of Fishes. pp. 429–470.

Albert JS, Crampton WG. 2005. Diversity and phylogeny of Neotropical electric fishes (Gymnotiformes). In: Electroreception. Springer. pp. 360–409.

Arcila D, Ortí G, Vari R, Armbruster JW, Stiassny MLJ, Ko KD, Sabaj MH, Lundberg J, Revell LJ, Betancur-R R. 2017. Genome-wide interrogation advances resolution of recalcitrant groups in the tree of life. Nat Ecol Evol 1:0020.

Betancur-R R, Wiley EO, Arratia G, Acero A, Bailly N, Miya M, Lecointre G, Ortí G. 2017. Phylogenetic classification of bony fishes. BMC Evol Biol 17:1–40.

Bielawski JP, Yang Z. 2004. A Maximum Likelihood Method for Detecting Functional Divergence at Individual Codon Sites, with Application to Gene Family Evolution. J. Mol. Evol. 59:1–12.

Boettiger C, Lang DT, Wainwright PC. 2012. rfishbase: exploring, manipulating and visualizing FishBase data from R. J. Fish Biol. 81:2030–2039.

Bowmaker JK. 2008. Evolution of vertebrate visual pigments. Vision Res. 48:2022–2041.

Bunge S, Wedemann H, David D, Terwilliger DJ, van den Born LI, Aulehla-Scholz C, Samanns C, Horn M, Ott J, Schwinger E, et al. 1993. Molecular Analysis and Genetic Mapping of the Rhodopsin Gene in Families with Autosomal Dominant Retinitis Pigmentosa. Genomics 17:230–233.

Chen W-J, Bonillo C, Lecointre G. 2003. Repeatability of clades as a criterion of reliability: a case study for molecular phylogeny of Acanthomorpha (Teleostei) with larger number of taxa. Mol. Phylogenet. Evol. 26:262–288.

Choe H-W, Kim YJ, Park JH, Morizumi T, Pai EF, Krauß N, Hofmann KP, Scheerer P, Ernst OP. 2011. Crystal structure of metarhodopsin II. Nature 471:651–655.

Crampton WG. 2007. Diversity and adaptation in deep channel Neotropical electric fishes. In: Fish life in special environments. Enfield, New Hampshire: Science Publishers, Inc. pp. 283–339.

Crampton WG. 2011. An ecological perspective on diversity and distributions. Historical biogeography of neotropical freshwater fishes:165–189.

Daiger SP, Sullivan LS, Bowne SJ. 2013. Genes and mutations causing retinitis pigmentosa. Clin. Genet. 84:132–141.

Delport W, Poon AFY, Frost SDW, Kosakovsky Pond SL. 2010. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 26:2455–2457.

Drummond DA, Raval A, Wilke CO. 2005. A Single Determinant Dominates the Rate of Yeast Protein Evolution. Mol. Biol. Evol. 23:327–337.

Emerling CA, Springer MS. 2014. Eyes underground: Regression of visual protein networks in subterranean mammals. Mol. Phylogenet. Evol. 78:260–270.

Emerling CA, Springer MS. 2015. Genomic evidence for rod monochromacy in sloths and armadillos suggests early subterranean history for Xenarthra. Proc. Biol. Sci. 282:20142192–20142192.

Ernst OP, Lodowski DT, Elstner M, Hegemann P, Brown LS, Kandori H. 2014. Microbial and Animal Rhodopsins: Structures, Functions, and Molecular Mechanisms. Chem. Rev. 114:126–163.

Felce JH, Latty SL, Knox RG, Mattick SR, Lui Y, Lee SF, Klenerman D, Davis SJ. 2017. Receptor Quaternary Organization Explains G Protein-Coupled Receptor Family Structure. Cell Rep. 20:2654–2665.

Fernholm B, Holmberg K. 1975. The eyes in three genera of hagfish (Eptatretus, Paramyxine and Myxine)—a case of degenerative evolution. Vision Res. 15:253–IN254.

Gunkel M, Schöneberg J, Alkhaldi W, Irsen S, Noé F, Kaupp UB, Al-Amoudi A. 2015. Higher-Order Architecture of Rhodopsin in Intact Photoreceptors and Its Implication for Phototransduction Kinetics. Structure 23:628–638.

Gutierrez EA, Schott RK, Preston MW, Loureiro LO, Lim BK, Chang BSW. 2018. The role of ecological factors in shaping bat cone opsin evolution. Proc. Biol. Sci. 285:20172835–20172838.

Gutierrez EA, Van Nynatten A, Chang BSW, Lovejoy NR. 2016. Sensory Systems: Molecular Evolution in Vertebrates.

Hauser FE, Ilves KL, Schott RK, Castiglione GM, López-Fernández H, Chang BSW. 2017. Accelerated Evolution and Functional Divergence of the Dim Light Visual Pigment Accompanies Cichlid Colonization of Central America. Mol. Biol. Evol. 34:2650–2664.

Hauser FE, Schott RK, Castiglione GM, Van Nynatten A, Kosyakov A, Tang PL, Gow DA, Chang BSW. 2016. Comparative sequence analyses of rhodopsin and RPE65 reveal patterns of selective constraint across hereditary retinal disease mutations. Vis. Neurosci. 33:e002.

Heesy CP, Hall MI. 2010. The nocturnal bottleneck and the evolution of mammalian vision. Brain Behav Evol 75:195–203.

Hunt DM, Dulai KS, Partridge JC, Cottrill P, Bowmaker JK. 2001. The molecular basis for spectral tuning of rod visual pigments in deep-sea fish. J. Exp. Biol. 204:3333–3344.

Jastrzebska B, Comar WD, Kaliszewski MJ, Skinner KC, Torcasio MH, Esway AS, Jin H, Palczewski K, Smith AW. 2016. A G Protein-Coupled Receptor Dimerization Interface in Human Cone Opsins. Biochemistry:acs.biochem.6b00877–39.

Jordan DM, Frangakis SG, Golzio C, Cassa CA, Kurtzberg J, Task Force for Neonatal Genomics, Davis EE, Sunyaev SR, Katsanis N. 2015. Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524:225–229.

Kondrashov AS, Sunyaev S, Kondrashov FA. 2002. Dobzhansky–Muller incompatibilities in protein evolution. Proc. Natl. Acad. Sci. U.S.A. 99:14878–14883.

Kosakovsky Pond SL, Murrell B, Fourment M, Frost SDW, Delport W, Scheffler K. 2011. A Random Effects Branch-Site Model for Detecting Episodic Diversifying Selection. Mol. Biol. Evol. 28:3033–3043.

Lavoué S, Miya M, Arnegard ME, Sullivan JP, Hopkins CD, Nishida M. 2012. Comparable Ages for the Independent Origins of Electrogenesis in African and South American Weakly Electric Fishes. PLoS ONE 7:e36287–18.

Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg Bauer E, Colwell LJ, De Koning AJ, Dokholyan NV, Echave J. 2012. The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci. 21:769–785.

Liu D-W, Lu Y, Yan HY, Zakon HH. 2016. South American Weakly Electric Fish (Gymnotiformes) Are Long-Wavelength-Sensitive Cone Monochromats. Brain Behav Evol 88:204–212.

Lythgoe JN. 1979. The Ecology of Vision.

Madsen PT, Surlykke A. 2013. Functional Convergence in Bat and Toothed Whale Biosonars. Physiology 28:276–283.

Meredith RW, Gatesy J, Emerling CA, York VM, Springer MS. 2013. Rod Monochromacy and the Coevolution of Cetacean Retinal Opsins. PLoS Genet 9:e1003432–12.

Munoz NE, Blumstein DT. 2012. Multisensory perception in uncertain environments. Behavioral Ecology 23:457–462.

Murrell B, Weaver S, Smith MD, Wertheim JO, Murrell S, Aylward A, Eren K, Pollner T, Martin DP, Smith DM, et al. 2015. Gene-Wide Identification of Episodic Selection. Mol. Biol. Evol. 32:1365–1371.

Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol. Biol. Evol. 32:268–274.

Niemiller ML, Fitzpatrick BM, Shah P, Schmitz L, Near TJ. 2013. Evidence for Repeated Loss of Selective Constraint in Rhodopsin of Amblyopsid Cavefishes (Teleostei: Amblyopsidae). Evolution 67:732–748.

Niven JE, Laughlin SB. 2008. Energy limitation as a selective pressure on the evolution of sensory systems. J. Exp. Biol. 211:1792–1804.

Nummela S, Pihlström H, Puolamäki K, Fortelius M, Hemilä S, Reuter T. 2013. Exploring the mammalian sensory space: co-operations and trade-offs among senses. J Comp Physiol A 199:1077–1092.

Parker J, Tsagkogeorga G, Cotton JA, Liu Y, Provero P, Stupka E, Rossiter SJ. 2013. Genome-wide signatures of convergent evolution in echolocating mammals. Nature 502:228–231.

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF Chimera-A visualization system for exploratory research and analysis. J. Comput. Chem. 25:1605–1612.

Ploier B, Caro LN, Morizumi T, Pandey K, Pearring JN, Goren MA, Finnemann SC, Graumann J, Arshavsky VY, Dittman JS, et al. 2016. Dimerization deficiency of enigmatic retinitis pigmentosa-linked rhodopsin mutants. Nat. Commun. 7:1–11.

Pond SLK, Frost SDW, Muse SV. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679.

Rose GJ, Canfield JG. 1993. Longitudinal tracking responses of the weakly electric fish, Sternopygus. J Comp Physiol A 171:791–798.

Salazar VL, Krahe R, Lewis JE. 2013. The energetics of electric organ discharge generation in gymnotiform weakly electric fish. J. Exp. Biol. 216:2459–2468.

Schott RK, Gow D, Chang BS. 2016. BlastPhyMe: A toolkit for rapid generation and analysis of protein-coding sequence datasets. bioRxiv:1–15.

Schott RK, Van Nynatten A, Card DC, Castoe TA, Chang BSW. 2018. Shifts in Selective Pressures on Snake Phototransduction Genes Associated with Photoreceptor Transmutation and Dim-Light Ancestry. Mol. Biol. Evol. 71:1944–1389.

Schumacher S, de Perera TB, von der Emde G. 2017. Electrosensory capture during multisensory discrimination of nearby objects in the weakly electric fish Gnathonemus petersii. Sci. Rep. 7:43665.

Shu D-G, Morris SC, Han J, Zhang Z-F, Yasui K, Janvier P, Chen L, Zhang X-L, Liu J-N, Li Y. 2003. Head and backbone of the Early Cambrian vertebrate Haikouichthys. Nature 421:526.

Stevens JA, Sukhum KV, Carlson BA. 2013. Independent evolution of visual and electrosensory specializations in different lineages of mormyrid electric fishes. Brain Behav Evol 82:185–198.

Sugawara T, Imai H, Nikaido M, Imamoto Y, Okada N. 2010. Vertebrate Rhodopsin Adaptation to Dim Light via Rapid Meta-II Intermediate Formation. Mol. Biol. Evol. 27:506–519.

Takiyama T, Luna da Silva V, Moura Silva D, Hamasaki S, Yoshida M. 2015. Visual Capability of the Weakly Electric Fish Apteronotus albifrons as Revealed by a Modified Retinal Flat-Mount Method. Brain Behav Evol 86:122–130.

Thiagavel J, Cechetto C, Santana SE, Jakobsen L, Warrant EJ, Ratcliffe JM. 2018. Auditory opportunity and visual constraint enabled the evolution of echolocation in bats. Nat. Commun. 9:98.

Van Nynatten A, Bloom D, Chang BSW, Lovejoy NR. 2015. Out of the blue: adaptive visual pigment evolution accompanies Amazon invasion. Biol. Lett. 11:20150349.

Weadick CJ, Chang BSW. 2011. An Improved Likelihood Ratio Test for Detecting Site-Specific Functional Divergence among Clades of Protein-Coding Genes. Mol. Biol. Evol. 29:1297–1300.

Wertheim JO, Murrell B, Smith MD, Kosakovsky Pond SL, Scheffler K. 2014. RELAX: Detecting Relaxed Selection in a Phylogenetic Framework. Mol. Biol. Evol. 32:820–832.

Wilkens H. 2010. Genes, modules and the evolution of cave fish. Heredity 105:413–422.

Wright ES. 2015. DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment. BMC Bioinformatics 16:322.

Wu J, Jiao H, Simmons NB, Lu Q, Zhao H. 2018. Testing the sensory trade-off hypothesis in New World bats. Proc. Biol. Sci. 285:20181523.

Xu J, Zhang J. 2014. Why human disease-associated residues appear as the wild-type in other species: genome-scale structural evidence for the compensation hypothesis. Mol. Biol. Evol. 31:1787–1792.

Yang Z, Nielsen R. 2002. Codon-Substitution Models for Detecting Molecular Adaptation at Individual Sites Along Specific Lineages. Mol. Biol. Evol. 19:908–917.

Yang Z, Swanson WJ, Vacquier VD. 2000. Maximum-Likelihood Analysis of Molecular Adaptation in Abalone Sperm Lysin Reveals Variable Selective Pressures Among Lineages and Sites. Mol. Biol. Evol. 17:1446–1455.

Yang Z, Swanson WJ. 2002. Codon-Substitution Models to Detect Adaptive Evolution that Account for Heterogeneous Selective Pressures Among Site Classes. Mol. Biol. Evol. 19:49–57.

Zakon HH, Lu Y, Zwickl DJ, Hillis DM. 2006. Sodium channel genes and the evolution of diversity in communication signals of electric fishes: convergent molecular evolution. Proc. Natl. Acad. Sci. U.S.A. 103:3675–3680.

Zhao H, Rossiter SJ, Teeling EC, Li C, Cotton JA, Zhang S. 2009. The evolution of color vision in nocturnal mammals. Proc. Natl. Acad. Sci. U.S.A. 106:8980–8985.

5.9. SUPPLEMENTAL INFORMATION

5.9.1. Supplementary tables

Table S5.1. Accession numbers, depth data and amino acid identities at sites on helix 5 and 6 in fishes

Species Accession site214 site217 site220 site258 Mid-point depth
Arenigobius bifrenatus JF261553.1 I F S V 0.5
Danio margaritatus GQ365223.1 I V F V 0.5
Parablennius parvicornis HM630108.2 I F G V 0.5
Parablennius sanguinolentus HM630109.1 I F G V 0.5
Pethia tiantian JQ614274.1 I F F V 0.5
Poeciliopsis infans KJ697430.1 I F F V 0.5
Priapella olmecae KJ697443.1 I A F V 0.5
erythrodon AB084941.1 I T F V 0.5
Ameca splendens KJ697352.1 I F F V 1
Kuhlia marginata HE798245.1 I I F V 1
Kuhlia munda HE798263.1 I I F V 1
Lucania parva KJ697399.1 I V F V 1
Microlipophrys canevae HM630115.1 I V G V 1
Parablennius incognitus JQ697343.1 I V G V 1
Tuberoschistura baenzigeri FJ650481.1 I T F V 1
Apeltes quadracus KX145991.1 I I F V 1.5
Carinotetraodon salivator JQ682399.1 I A F V 1.5
Ctenogobiops tongaensis HQ536928.1 I I F V 1.5
Eviota afelei JF261541.1 I T F V 1.5
Istiblennius lineatus KF265115.1 I V G V 1.5
Menidia menidia EU637977.1 I I F V 1.5
Microlipophrys dalmatinus HM630116.1 I F G V 1.5
Psammogobius biocellatus JF261547.1 I I F V 1.5
Barbulifer ceuthoecus HQ536891.1 I F F V 2.5
Favonigobius melanobranchus JF261582.1 I T F V 2.5
Favonigobius reichei EU637960.1 I ? F V 2.5
Istiblennius edentulus KF265116.1 I I F V 2.5
Kuhlia rupestris HE798255.1 I C F V 2.5
Lentipes kaaea HQ639176.1 I T F V 2.5
Maccullochella peelii KF017162.1 I F F V 2.5
Microlipophrys adriaticus HM630111.1 I F G V 2.5
Ophiocara porocephala EU637988.1 I T F V 2.5
Opsanus tau JQ938025.1 I T F V 2.5
chrysurus KX766125.1 I T F V 2.5
Protogobius attiti HQ639145.1 I I F V 2.5
Puntius semifasciolatus JQ614261.1 I I F V 2.5
Rhyacichthys guilberti HQ639144.1 I V F V 2.5
Selenotoca multifasciata EU638002.1 I T F V 2.5
Sicyopterus lagocephalus HQ639162.1 I T F V 2.5
Smilosicyopus chloe HQ639195.1 I T F V 2.5
Smilosicyopus pentecost KF669065.1 I T F V 2.5
Stiphodon atratus HQ639199.1 I T F V 2.5
Stiphodon rutilaureus HQ639156.1 I T F V 2.5
Stiphodon sapphirinus HQ639152.1 I T F V 2.5
Trimmatom eviotops HQ536905.1 I T F V 2.5
Xiphophorus monticolus KJ525802.1 I A F V 2.5
Bathygobius fuscus KF265114.1 I A F V 3
Cryptocentrus albidorsus HQ536913.1 I F F V 3
Caffrogobius saldanha JF261548.1 I I F V 3.5
Labeotropheus fuelleborni AY775113.1 I T F V 3.5
Micropterus salmoides KX145895.1 I S F G 3.5
Neolamprologus tetracanthus AB458074.1 I T F V 3.5
Tigrigobius macrodon HQ536889.1 I F F V 3.5
Acanthurus guttatus KC623870.1 T T F V 4
Callionymus schaapii EU637946.1 I T F V 4
puncticulatus AY846615.1 I F F V 4
Lipophrys pholis HM630123.1 I V G V 4
Cabillus tongarevae JF261540.1 I V F V 4.5
Cryptocentrus inexplicatus HQ536950.1 I F F A 4.5
Opistognathus maxillosus JQ937975.1 I V F V 4.5
Stethojulis terina KF265113.1 I I F V 4.5
Acanthurus achilles KC623863.1 T T F V 5
Acrossocheilus paradoxus FJ531342.1 I V F V 5
Bathygobius cocosensis HQ536902.1 I A F V 5
Chrysiptera cyanea KX766117.1 I T F V 5
chiquita AY846630.1 I I F V 5
Halichoeres chloropterus KP881297.1 I T F V 5
Hemiculter leucisculus KF029654.1 I V F G 5
Labrisomus nuchipinnis KY126047.1 I I F V 5
Pomacentrus adelus KX766124.1 I T F V 5
Scartella cristata JQ697371.1 I V G V 5
Soleichthys heterorhinos JQ938064.1 I T F V 5
Tigrigobius gemmatus AY846623.1 I V F V 5
Channa striata AY141277.1 I F F V 5.5
Dischistodus perspicillatus KX766119.1 I A F V 5.5
Gymnothorax tile KY026033.1 I T S V 5.5
Parablennius intermedius JQ697342.1 I V G V 5.5
Chilomycterus schoepfii KF027978.1 I F F V 6
Chrysiptera brownriggii KX766116.1 I T F V 6
nasuta AB457981.1 I V F V 6
Ophthalmotilapia ventralis AB457982.1 I V F V 6
Pomatoschistus microps FN430607.1 I V F V 6
Symphurus orientalis KF312137.1 I T F V 6
Cryptocentrus nigrocellatus HQ536916.1 I F F V 6.5
Eleutheronema rhadinum KF312124.1 I T F V 6.5
Neopomacentrus azysron KX766121.1 I T F V 6.5
Parachromis managuensis KP715365.1 I A F V 6.5
Aruma histrio AY846631.1 I I F V 7
Encrasicholina devisi KT201127.1 L C T F 7
Menticirrhus undulatus KP723013.1 V F F V 7
Acanthurus lineatus KC623874.1 T T F V 7.5
Chilomycterus antennatus JQ682365.1 I F F V 7.5
Heteroclinus adelaidae KF525059.1 I V F V 7.5
Microdesmus longipinnis JF261527.1 I I F V 7.5
Neopomacentrus bankieri HQ286553.1 I T F V 7.5
Pomacentrus moluccensis KU745431.1 I T F V 7.5
Sphaeramia nematoptera EU638010.1 I I F V 7.5
Strophidon sathete HQ444183.1 I T S V 7.5
Amphiprion ocellaris XM_023282761.1 I A F V 8
Bathygobius soporator JF261554.1 I M F V 8
bennetti JQ682383.1 T V F V 8
Caracanthus unipinna KC222253.1 I T F V 8
Cottus bairdii KX146140.1 I T F V 8
Gerres cinereus EF095624.1 I V F V 8
Halidesmus scapularis JQ937978.1 I T F V 8
Petroscirtes breviceps KF265117.1 I F G V 8
Plectroglyphidodon dickii HQ286555.1 V I G V 8
Acanthurus blochii KC623867.1 T T F V 8.5
Canthigaster amboinensis JQ682382.1 T T F V 8.5
Coryogalops anomolus JF261557.1 I F F V 8.5
chilotes AY673746.1 I T F V 8.5
Cryptocentrus leptocephalus HQ536970.1 I F F V 9
Heteroclinus kuiteri KF525061.1 I V F V 9
Lacantunia enigmatica JX470082.1 T F T F 9
Parablennius tentacularis JQ697366.1 I ? G V 9
Parablennius zvonimiri JQ697368.1 I V G V 9
Paragobiodon modestus JF261551.1 I F F V 9
Scorpaenodes corallinus KC222249.1 I T F V 9
Sphyraena argentea KF312127.1 I V F V 9
Amphiprion melanopus HM107824.1 I T F V 9.5
Callogobius bifasciatus HQ536904.1 I I F V 9.5
Clinus musaicus JF320877.1 I A F V 9.5
Halichoeres ornatissimus KP881295.1 I T F V 9.5
Heteroclinus macrophthalmus KF525060.1 I V F V 9.5
Pomacentrus pavo KX766126.1 I T F V 9.5
kiyoae KF265145.1 T I F V 9.5
Synchiropus splendidus KF265140.1 T F F V 9.5
Acanthocybium solandri DQ874804.1 I V F V 10
Amblycirrhitus bimacula KC222241.1 V T F V 10
Carassius auratus KX146007.1 I I F V 10
Dascyllus aruanus KU745449.1 I T F V 10
Forsterygion lapillum AY141272.1 I V G V 10
Heteroclinus wilsoni KF525078.1 I V F V 10
Hypophthalmichthys molitrix KX224222.1 I C F G 10
HE798267.1 I I F V 10
Labeo calbasu GQ913525.1 I I F V 10
Oreochromis niloticus AY775108.1 I T F V 10
Scorpaenodes guamensis KC222248.1 I A F V 10
Telmatochromis temporalis AB458081.1 I I F V 10
Tigrigobius pallens AY846625.1 I F F V 10
Xenotilapia rotundiventralis AB457997.1 I T F V 10
Abudefduf sexfasciatus HQ286548.1 I T F V 10.5
Acanthurus auranticavus KC623864.1 T T F V 10.5
Chromis viridis HQ286550.1 I T F V 10.5
Chrysiptera rex HQ286551.1 I T F V 10.5
Ctenogobiops formosa HQ536911.1 I I F V 10.5
Cynoscion acoupa KP722986.1 V F F V 10.5
Gunnellichthys monostigma HQ536898.1 I F F V 10.5
Kuhlia mugil JF764596.1 V I F V 10.5
Paracirrhites hemistictus KC222238.1 I T F V 10.5
Pomacentrus coelestis KU745437.1 I T F V 10.5
Pomacentrus wardi KX766127.1 I T F V 10.5
Scaevius milii KY363175.1 T V F V 10.5
Scolopsis lineata KY363186.1 I M F V 10.5
Scomberomorus regalis DQ874805.1 I V F V 10.5
Syngnathus typhle AY368326.1 T V F V 10.5
Thalassoma lunare KP881298.1 I T F V 10.5
Arothron manilensis JQ682376.1 I T F V 11
Centropomus undecimalis KC442233.1 I I F V 11
Ctenochaetus truncatus KC623888.1 T T F V 11
Parma oligolepis HQ286554.1 V T F V 11
bifasciatus KY363165.1 I C F V 11
Seriphus politus KP723051.1 V F F V 11
Chaetodon semilarvatus AY368312.1 I T F V 11.5
Chanodichthys erythropterus KF029649.1 I V F G 11.5
Cirrhitus pinnulatus KC222240.1 I T F V 11.5
Coryphopterus dicrus JF261530.1 I T F V 11.5
Elacatinus figaro AY846611.1 I F F V 11.5
Eleutheronema tetradactylum JQ938020.1 V T F V 11.5
Kuhlia xenura HE798261.1 I I F V 11.5
Neoniphon argenteus U57540.1 V I F V 11.5
Roncador stearnsii KP723047.1 V F F V 11.5
Amblyeleotris fasciata HQ536957.1 I T F V 12
Canthigaster punctatissima JQ682392.1 T I F V 12
Elacatinus lori AY846586.1 I T F V 12
Acanthurus leucosternon KC623873.1 T T F V 12.5
Assessor flavissimus EU637944.1 I T F V 12.5
Cirrhinus molitorella KC631257.1 I V F V 12.5
Elacatinus prochilos AY846610.1 I T F V 12.5
neophytus JF261586.1 I T F V 12.5
Hemibarbus labeo EU919548.1 I C F G 12.5
Lagocephalus suezensis JQ682408.1 T T F V 12.5
Mahidolia mystacina HQ536971.1 I F F V 12.5
Megalobrama amblycephala KF029657.1 I V F G 12.5
Neoglyphidodon nigroris KX766123.1 I T F V 12.5
Parablennius pilicornis JQ697358.1 I V G V 12.5
Sphoeroides lispus JQ682417.1 I T F V 12.5
Amphiprion akindynos HQ286549.1 V T F V 13
Cryptocentrus cinctus HQ536927.1 I F F V 13
Istigobius decoratus JF261567.1 I F F V 13
Paraluteres prionurus KF027986.1 I T F V 13
Scolopsis bilineata KY363181.1 T T F V 13
Valenciennea strigata HQ536900.1 I F F V 13
Acanthurus chirurgus KC623868.1 T T F V 13.5
Canthigaster compressa JQ682385.1 T V F V 13.5
Priolepis eugenius JF261533.1 I T F V 13.5
Scarus psittacus EF095633.1 I T F V 13.5
Scolopsis ciliata KY363183.1 I T F V 13.5
Scolopsis margaritifera KY363188.1 I ? F V 13.5
Scolopsis vosmeri KY363200.1 T T F V 13.5
Arothron nigropunctatus JQ682379.1 I T F V 14
Platax teira JQ937980.1 I I F V 14
Salmo trutta JX255557.1 I F T V 14
Cerdale floridana HQ536887.1 I F F V 15
Chromis nitida KX766118.1 I T G V 15
aurantiacus KF525049.1 I V F V 15
Cristiceps australis KF525044.1 I V F V 15
Ctenogobiops mitodes HQ536945.1 I I F V 15
Ctenopharyngodon idella KX224231.1 I C F G 15
Esox lucius XM_010902101.2 I I F V 15
Haemulon aurolineatum EF095619.1 I T F V 15
Istigobius rigilius HQ536946.1 I F F V 15
Lepadichthys lineatus KY126049.1 I F F V 15
Neogobius melanostomus KX145843.1 M I G V 15
Nerophis lumbriciformis EU637987.1 M T F V 15
Ostracion whitleyi JQ861047.1 I T F V 15
Siganus vulpinus EU638007.1 I F F V 15
Silurus glanis KX224240.1 I F T F 15
Stellifer ericymba KP723052.1 V F F V 15
Synanceia verrucosa EU638011.1 I T F V 15
Xenophallus umbratilis KJ697456.1 I I F V 15
Acanthurus nigricauda KC623878.1 T T F V 15.5
Amblygobius nocturnus HQ536939.1 I F F V 15.5
Canthigaster jactator JQ682388.1 T I F V 15.5
Canthigaster janthinoptera JQ682389.1 T I F V 15.5
Chanos chanos FJ197072.1 I T T F 15.5
Johnius macropterus KP723003.1 C F F V 15.5
Neosynchiropus ocellatus KF265136.1 T C F V 15.5
Oplopomus oplopomus HQ536936.1 I T F V 15.5
Perca fluviatilis AY141295.1 I T F V 15.5
Plagioscion surinamensis KP723038.1 T F Y V 15.5
Scolopsis aurata KY363179.1 T T F V 15.5
Sebastapistes tinkhami KC222243.1 I C F V 15.5
Signigobius biocellatus HQ536947.1 I F F V 15.5
Stegastes gascoynei HQ286557.1 V F F V 15.5
Torquigener pleurogramma JQ682452.1 I I F V 15.5
Cetengraulis mysticetus KT201122.1 I F T F 16
Haplochromis piceatus LC130221.1 I T F V 16
Holocentrus rufus KC442230.1 F F F V 16
Pentapodus trivittatus KY363174.1 I C F V 16
Spinibarbus hollandi EU606011.1 I I F V 16
Valenciennea longipinnis HQ536923.1 I F F V 16
Dorosoma cepedianum KX145707.1 I I T F 16.5
Elacatinus xanthiprora AY846591.1 I F F V 16.5
Fusigobius signipinnis JF261545.1 I T F V 16.5
Gnathodentex aureolineatus KC222236.1 T T F V 16.5
Gobiesox strumosus KY126051.1 I A F V 16.5
Myripristis violacea U57539.1 I V F V 16.5
Onigocia bimaculata KC222254.1 I T F V 16.5
Prionurus laticlavius KC623895.1 T T F V 16.5
Ptereleotris zebra EU637999.1 I T F V 16.5
Arothron mappa JQ682377.1 I T F V 17
Elacatinus horsti AY846583.1 I T F V 17
Anchoa filifera KT201114.1 M F T F 17.5
Haplotaxodon trifasciatus AB458084.1 I T L V 17.5
Mylopharyngodon piceus GU218587.1 I C F G 17.5
Neopomacentrus cyanomos KX766122.1 I T F V 17.5
Parablennius gattorugine JQ697341.1 I F C F 17.5
Tigrigobius dilepis AY846617.1 I F F V 17.5
Tigrigobius janssi AY846574.1 I F F V 17.5
Tomiyamichthys lanceolatus HQ536937.1 I I F V 17.5
Umbrina xanti KP723060.1 V F F V 17.5
Canthigaster figueiredoi JQ682387.1 T T F V 18
Gomphosus varius KP881294.1 I T F V 18
Microgobius microlepis JF261576.1 I F F V 18
Neolamprologus obscurus AB458071.1 I T F V 18
Paracirrhites forsteri KC222239.1 V T F V 18
Pentapodus caninus KY363166.1 T T F V 18.5
Pentapodus emeryii KY363168.1 T T F V 18.5
Amblyeleotris yanoi HQ536929.1 I T F V 19
Callogobius sclateri HQ536903.1 I T F V 19
Grammistops ocellatus KC222228.1 I T F V 19
Pomacentrus nagasakiensis KU745425.1 I T F V 19
Cyprichromis zonatus AB457938.1 I I F V 19.5
Amblyeleotris gymnocephala JF261546.1 I T F V 20
Amblyeleotris periophthalma HQ536926.1 I T F V 20
Ameiurus nebulosus KX146011.1 T F T F 20
Elacatinus chancei AY846581.1 I F F V 20
Johnius belangerii KP722998.1 C F F V 20
Kyphosus vaigiensis KC222237.1 I I F V 20
Lycengraulis grossidens KT201146.1 I F T F 20
Medialuna californiensis KF017151.1 I A F V 20
Menticirrhus americanus DQ874821.1 V F F V 20
Pentapodus aureofasciatus KY363163.1 T T F V 20
Stellifer microps KP723053.1 T F F V 20
Stellifer rastrifer KP723055.1 T F F V 20
Tripterygion delaisi EU638016.1 V T G V 20
Zoarces viviparus KF017149.1 I V F V 20
Acanthurus tennentii KC623882.1 T T F V 20.5
Amblyglyphidodon curacao KX766114.1 I T F V 20.5
Canthigaster rostrata JQ682394.1 T T F V 20.5
Dendrochirus biocellatus KC222252.1 I T F V 20.5
Johnius amblycephalus KP722997.1 C F F V 20.5
Johnius trewavasae KP723005.1 C V F V 20.5
Labroides dimidiatus KP881296.1 I T F V 20.5
Nibea soldado KP723021.1 I F F V 20.5
Acanthurus coeruleus KC623869.1 T T F V 21
Parablennius rouxi JQ697360.1 I V G V 21
Paracanthurus hepatus KC623893.1 T T F V 21
Pomacentrus amboinensis HQ286556.1 I T F V 21
Acanthurus leucocheilus KC623872.1 T T F V 21.5
Epibulus insidiator KP881288.1 I S F V 21.5
Xenotilapia papilio AB457995.1 I T F V 21.5
Aeoliscus strigatus EU637931.1 I A F V 22
Ctenogobiops tangaroai HQ536906.1 T T F V 22
Amblyeleotris steinitzi HQ536919.1 I T F V 22.5
Amblyeleotris wheeleri HQ536907.1 I T F V 22.5
Bodianus mesothorax KP881292.1 I C F V 22.5
Diplogrammus goramensis KF265127.1 T C F V 22.5
Hippocampus comes XM_019890602.1 T T F V 22.5
Pempheris schwenkii AB495203.1 I C F V 22.5
Scomberomorus maculatus DQ874798.1 I V F V 22.5
Asterropteryx ensifera HQ536931.1 I T F V 23
Cheilotrema saturnum KP722980.1 V F F V 23
Ctenogobiops aurocingulus HQ536914.1 I I F V 23
Elacatinus oceanops AY846603.1 I T F V 23
Embiotoca jacksoni EF095628.1 T T F V 23
Hexagrammos decagrammus JQ937987.1 I V F V 23
Neoniphon sammara U57536.1 V I F V 23
Zebrasoma velifer KC623901.1 T T F V 23
Amblyglyphidodon leucogaster KX766115.1 I T F V 23.5
Fusigobius duospilus JF261549.1 I C F V 23.5
Bryaninops yongei JF261556.1 M T F V 24
Stonogobiops xanthorhinica HQ536930.1 I F F V 24
Zebrasoma flavescens KC623899.1 T T F V 24
Coryphopterus personatus JF261528.1 I T F V 24.5
Acanthopagrus berda JQ638362.1 I L F V 25
Amblyeleotris guttata HQ536912.1 I T F V 25
Anchoa colonensis KT201112.1 I F T F 25
Anchoa delicatissima KT201113.1 I F T F 25
Anchoa walkeri KT201117.1 I F T F 25
Anchovia clupeoides KT201118.1 I F T F 25
Cheilopogon heterurus EU637950.1 I V F V 25
Chlorurus sordidus KP881291.1 I T F V 25
Cirrhitichthys falco KF017157.1 I T F V 25
Coilia dussumieri JN230978.1 I I S F 25
Elops saurus JN230971.1 I I F V 25
Hemicaranx amblyrhynchus JQ938004.1 I T F V 25
Heteroclinus johnstoni KF525080.1 I V F V 25
Lates calcarifer EU637970.1 I F F V 25
Lycengraulis poeyi KT201106.1 I F T F 25
Nebris microps KP723018.1 V F F V 25
Oligoplites saurus JQ938006.1 I V F V 25
Otolithes ruber KP723027.1 C F F V 25
Paralonchurus brasiliensis KP723031.1 V F F V 25
Photoblepharon palpebratum EU637993.1 I V F V 25
Pterocaesio digramma EU638000.1 I T F V 25
Rhinecanthus aculeatus KF027976.1 A A F V 25
Rhinesomus triqueter JQ861042.1 T T F V 25
Anchoviella brevirostris KT201110.1 I C T F 25.5
Anchoviella lepidentostole KT201108.1 I F T F 25.5
Arothron hispidus JQ682374.1 I T F V 25.5
Coris gaimard KP881290.1 I T F V 25.5
Dascyllus reticulatus KU745443.1 V T F V 25.5
Gymnothorax favagineus HQ444181.1 I T S V 25.5
Labrus bergylta XM_020629505.1 I I F V 25.5
Myripristis murdjan KC442231.1 I V F V 25.5
Naso vlamingii KC623892.1 T T F V 25.5
Ptereleotris microlepis HQ536975.1 I T F V 25.5
Stereolepis gigas DQ336173.1 I T F V 25.5
Amblygobius phalaena HQ536897.1 I F F V 26
Balistapus undulatus JQ861029.1 T T F V 26
Diodon hystrix JQ682370.1 I F F V 26
Gnatholepis cauerensis JF261539.1 I I F V 26
Lactophrys trigonus JQ861041.1 T T F V 26
Scolopsis monogramma KY363191.1 T M F V 26
Scolopsis taenioptera KY363196.1 T M F V 26
Scorpaenodes minor KC222247.1 I T F V 26.5
Belonoperca chabanaudi KC222231.1 I T F V 27
Pomacanthus maculosus EU637995.1 I T F V 27
Culaea inconstans KX145902.1 I V F V 27.5
Eucinostomus gula EF095621.1 I M F V 27.5
Pentapodus setosus KY363171.1 T T F V 27.5
Scolopsis xenochroa KY363204.1 T T F V 27.5
Scophthalmus rhombus EU638005.1 I V F V 27.5
Acanthurus bariene KC623866.1 T T F V 28
Anchoa spinifer KT201107.1 I F T F 28
Bairdiella ronchus KP722978.1 V F F V 28
Canthigaster papua JQ682391.1 T V F V 28
Canthigaster valentini JQ682397.1 T I F V 28
Dactylopus dactylopus KF265133.1 T F F V 28
Dascyllus trimaculatus HQ286552.1 V T F V 28
Gymnothorax reticularis HQ444182.1 I T S V 28
Scorpaenopsis possi KC222244.1 I C F V 28.5
Elacatinus louisae AY846589.1 I T F V 29
Lactoria diaphana JQ861039.1 T T F V 29
Alosa aestivalis KX146146.1 I I T F 30
Anchoa cubana KT201126.1 I F T F 30
Coryphopterus hyalinus JF261529.1 I T F V 30
Halobatrachus didactylus AY368323.1 I I F V 30
Macrodon ancylodon KP723010.1 T F F V 30
Micropogonias furnieri KP723015.1 I F F V 30
Scolopsis bimaculata KY363182.1 T M F V 30
Sillago sihama EU638008.1 I T F V 30
Tetrosomus concatenatus JQ861049.1 T T F V 30
Torquigener flavimaculosus JQ682450.1 T T F V 30
Arothron stellatus JQ682380.1 I T F V 30.5
Carangoides ferdau JQ937968.1 I T F V 30.5
Gramma loreto JQ937971.1 I F F V 30.5
Sargocentron diadema U57537.1 I T F V 30.5
Zebrasoma scopas KC623900.1 T T F V 30.5
Paralabrax clathratus KF017150.1 I V F V 31
Scolopsis affinis KY363177.1 T T F V 31.5
Acanthurus pyroferus KC623881.1 T T F V 32
Chelonodon patoca JQ682401.1 I I F V 32
Cyprichromis coloratus AB457874.1 I T F V 32.5
Trixiphichthys weberi KF028006.1 I T F V 32.5
Acanthochromis polyacanthus XM_022201932.1 I T F V 33
Rhamphochromis esox AB185236.1 I I F V 33.5
Samariscus latus KF312146.1 I S F V 33.5
Rhinomuraena quaesita HQ444180.1 I T S V 34
Alabes scotti KY126048.1 I F F V 35
Aurigequula fasciata EU637972.1 I T F V 35
Callionymus valenciennei KF265157.1 T T F V 35
Cynoscion reticulatus KP722991.1 A F F V 35
Cyprichromis pavo AB457930.1 I I F V 35
Equetus lanceolatus KP722994.1 L F F V 35
Gobiodon quinquestrigatus JF261550.1 I F F V 35
Isopisthus remifer KP722996.1 C F F V 35
Lates niloticus EU637971.1 I F F V 35
Pseudorhombus oligodon KF312138.1 I T F V 35
Scolopsis taeniata KY363192.1 T T F V 35
Sphoeroides annulatus JQ682415.1 I T F V 35
Sphoeroides lobatus JQ682418.1 T T F V 35
Trimma caesiura JF261599.1 I T F V 35
Umbrina roncador KP723059.1 V F F V 35
Diodon nicthemerus JQ682372.1 I F F A 35.5
Holacanthus ciliaris AY141322.1 T T F V 35.5
KX145912.1 I F F V 35.5
Polydactylus octonemus JQ937976.1 I T F V 35.5
Priolepis cincta HQ536901.1 I T F V 35.5
Scorpaenopsis diabolus KC222246.1 I T F V 35.5
Halichoeres chrysus KP881293.1 I T F V 36
Sphoeroides spengleri JQ682423.1 T T F V 36
Gymnachirus melas JQ938030.1 T I F V 36.5
Arothron meleagris JQ682378.1 I T F V 37
Nemipterus nematophorus KY363133.1 I T F V 37.5
Neolamprologus bifasciatus AB458135.1 T T F V 37.5
Sarpa salpa Y18664.1 V M F V 37.5
Trinectes maculatus DQ874795.1 I I F V 37.5
Lythrypnus dalli JF261570.1 I A G V 38
Nemateleotris magnifica HQ536886.1 I T F V 38
Larimus pacificus KP723008.1 V F F V 39.5
Canthigaster leoparda JQ682390.1 T T F V 40
Cirrhilabrus punctatus KP881287.1 I T F V 40
Clarias gariepinus JX470077.1 V F T F 40
Gnathanodon speciosus EU637963.1 I T F V 40
Neolamprologus ventralis AB458000.1 I T F V 40
Pentanemus quinquarius AY141317.1 I T F V 40
Acanthostracion quadricornis JQ861037.1 T T F V 40.5
Atule mate JQ937997.1 V T F V 40.5
Scorpaenopsis macrochir KC222245.1 I T F V 40.5
Nemipterus marginatus KY363128.1 I T F V 41
Acanthostracion polygonius JQ861035.1 T T F V 41.5
Engraulis ringens KT201115.1 L C T F 41.5
Xystreurys liolepis KF312139.1 I T F V 42
Boulengerochromis microlepis AB084928.1 I T F V 42.5
Coryphaena hippurus DQ874824.1 I V F V 42.5
Drepane africana AY141321.1 I T F V 42.5
Nemipterus japonicus KY363127.1 I A F V 42.5
Ostracion rhinorhynchos JQ861045.1 I T F V 42.5
Paraplagusia japonica KF312136.1 I T F V 42.5
Echeneis naucrates AY141315.1 I T F V 43
Valenciennea puellaris HQ536910.1 I F F V 43
Callionymus formosanus KF265129.1 T C F V 44
Pterois antennata KC222250.1 I T F V 44
Polydactylus sextarius KF312125.1 I F F V 44.5
Acanthurus triostegus KC623884.1 G I F V 45
Collichthys lucidus KP722983.1 T F F V 45
Diplodus annularis Y18662.1 I T F V 45
Enoplosus armatus JF913270.1 I T F V 45
Glaucosoma scapulare AB495201.1 I T F V 45
Naso lituratus EU637984.1 T T F V 45
Nemipterus hexodon KY363118.1 I T F V 45
Neolamprologus variostigma AB458070.1 T T F V 45
Pomatoschistus marmoratus FN430608.1 I V F V 45
Bothus robinsi JQ938034.1 I F F V 45.5
Diodon liturosus JQ682371.1 I F F V 45.5
Etropus microstomus JQ938045.1 I T A V 45.5
Xyrichtys novacula EU638020.1 T T F V 45.5
Callionymus bairdi KF265146.1 T C F V 46
Ammodytes tobianus AY141306.1 I V F V 48.5
Gobius niger JF261597.1 I A G V 48.5
Lythrypnus zebra JF261573.1 I I F V 48.5
Balistes capriscus DQ874818.1 T T F V 50
Cubiceps gracilis EU637952.1 I C F V 50
Decapterus punctatus JQ938000.1 V T F V 50
Foetorepus calauropomus KF265124.1 T F F V 50
Gasterosteus aculeatus KX145873.1 I M F V 50
Micropogonias undulatus KP723016.1 I F F V 50
Notothenia angustata DQ498787.1 V V S V 50
Oncopterus darwinii JQ938081.1 I I G V 50
Paracottus knerii U97264.1 I S F V 50
Pennahia macrocephalus KP723034.1 L F F V 50
Pholis gunnellus AY141298.1 I V L V 50
EU637989.1 I T F V 50
Protonibea diacanthus KP723040.1 C F F V 50
Pseudotolithus elongatus KP723042.1 I F F V 50
Sphyraena guachancho DQ874817.1 I T F V 50
Stegastes partitus XM_008296826.1 V F F V 50
Thysanophrys chiltonae JQ937988.1 I T F V 50
Umbrina cirrosa KP723058.1 I F F V 50
Dactyloptena orientalis KC222256.1 T T F V 50.5
Dactylopterus volitans AY141282.1 I T F V 50.5
Lactoria cornuta JQ861038.1 I T F V 50.5
Monotaxis grandoculis KP723061.1 T T F V 50.5
Psettodes erumei KC442235.1 I T F V 50.5
Sphyraena barracuda DQ874816.1 I I F V 50.5
Rhinogobiops nicholsii JF261588.1 I A F V 53
Xenochromis hecqui AB458089.1 I T F V 53
Brachypleura novaezeelandiae KF312132.1 I T F V 55
Chloroscombrus chrysurus AY141313.1 I T F V 55
Dicentrarchus labrax Y18673.1 I C F G 55
Pungitius pungitius KX145923.1 I V F V 55
Repomucenus huguenini KF265152.1 T T F V 55
Sardina pilchardus Y18677.1 C T T F 55
Pseudotriacanthus strigilifer KF028004.1 I T F V 56
Ctenochaetus strigosus KC623887.1 T T F V 57
Lactarius lactarius KF312123.1 V T F V 57.5
Miichthys miiuy KP723017.1 T F F V 57.5
Pampus argenteus AY141309.1 L T F V 57.5
Nemipterus peronii KY363140.1 I T F V 58.5
Nemipterus furcosus KY363115.1 I T F V 59
Sphoeroides dorsalis JQ682416.1 T T F V 59
Chilomycterus reticulatus JQ682367.1 I F F V 60
Larimichthys crocea KP723006.1 L F F V 60
Larimichthys polyactis KP723007.1 L F F V 60
Lutjanus analis EF095620.1 I I F V 60
Mesogobius batrachocephalus JF261591.1 M T A V 60
Mugil cephalus Y18668.1 I V F V 60
Nemipterus tambuloides KY363142.1 V T F V 60
Parastromateus niger EF095616.1 I T F V 60
Pentapodus nagasakiensis KY363170.1 T T F V 60
Saurida elongata KC442219.1 G T S M 60
Selene dorsalis EU638006.1 I T F V 60
Atractoscion nobilis KP722973.1 V F F V 61
Sargocentron spiniferum U57544.1 F F F V 61.5
Naso brevirostris KC623889.1 T T F V 62
Aulostomus chinensis AY141279.1 T T F V 62.5
Priolepis hipoliti JF261572.1 I T F V 65.5
Johnius borneensis KP722999.1 C F F V 66
Nemipterus nematopus KY363137.1 I T F V 66
Samaris cristatus KF312145.1 I F F V 67
Nemipterus zysron KY363150.1 I T F V 67.5
Cottus ricei KX145997.1 V T F V 68.5
Oplegnathus punctatus KF017153.1 I T F V 69
Baileychromis centropomoides AB185217.1 I T F V 70
Conger japonicus JX255593.1 I I S F 70
Repomucenus virgis KF265154.1 T T F V 70
Trichopsetta ventralis JQ938037.1 V F F V 70
Microcanthus strigatus EU637978.1 I T F V 70.5
Ranzania laevis KF027981.1 T T F V 70.5
Repomucenus calcaratus KF265150.1 T T F V 72.5
Scomberomorus cavalla DQ874799.1 I V F V 72.5
Caranx sexfasciatus JQ938003.1 I T F V 73
Glaucosoma buergeri AB495205.1 I T F V 73
Nemipterus aurora KY363102.1 I T F V 73
Omegophora armilla JQ682411.1 I T F A 73
Tetrosomus gibbosus JQ861050.1 T T F V 73.5
Pseudopleuronectes americanus AY631036.1 V T F V 74
Verasper variegatus LC209606.1 I T F V 74.5
Elagatis bipinnulata JQ938001.1 V T F V 75
Lithognathus mormyrus Y18667.1 I M F V 75
Psettodes belcheri JQ938077.1 I T F V 75
Pseudotolithus typus KP723044.1 V F F V 75
Solea solea EU638009.1 I I F V 75
Sparus aurata Y18665.1 I T F V 75.5
Trachinus draco AY141304.1 I V F V 75.5
Cephalopholis sonnerati KC222229.1 I T F V 80
Diplodus vulgaris Y18663.1 I L F V 80
Pennahia argentata KP723033.1 I F F V 80
Gonorynchus greyi EU409632.1 I T T F 80.5
Myripristis berndti U57538.1 F V F V 81
Limanda limanda KF312142.1 T T F V 85
Canthigaster coronata JQ682386.1 T T F V 85.5
Serranus accraensis AY141289.1 I T F V 87.5
Catostomus catostomus KX146028.1 I I F V 90
Genyonemus lineatus KP722995.1 L F F V 91.5
Sargocentron punctatissimum U57543.1 V V F V 91.5
Sargocentron microstoma U57542.1 I T F V 92
Sargocentron tiere U57545.1 F F F V 92
Lutjanus sebae EU637974.1 I T F V 92.5
Lethrinus olivaceus KC222235.1 T T F V 93
Pseudorhombus pentophthalmus JQ938046.1 T T F V 94
Lagocephalus laevigatus JQ682406.1 T I F V 95
inermis KY363159.1 T T F V 95.5
Paralichthys dentatus KU980166.1 I I F V 96.5
Sphoeroides maculatus JQ682419.1 I V F V 96.5
Cynoscion guatucupa KP722987.1 L F F V 97.5
Caranx ignobilis JQ937999.1 I T F V 99
Euthynnus affinis LC016733.1 I I F V 100
Istiophorus platypterus JQ937992.1 V T F V 100
Makaira nigricans DQ874810.1 V T F V 100
Oncorhynchus mykiss NM_001124319.1 I F S V 100
Pallidochromis tokolosh AB185229.1 I T F V 100
Pteroscion peli KP723045.1 A F F V 100
Remora osteochir JQ938002.1 V T F V 100
Synodus foetens DQ874823.1 I V S V 100
Taurulus bubalis U97275.1 I F F V 100
Glaucosoma hebraicum AB495204.1 I T F V 100.5
Sciaena umbra KP723049.1 V F F V 100.5
Thunnus orientalis AB290449.1 I I F V 100.5
Carangoides plagiotaenia JQ937998.1 V T F V 101
Diodon holocanthus JQ682369.1 I F F V 101
Diplotaxodon macrops AB185220.1 I T F V 102
Pomatoschistus minutus FJ410467.1 A I F V 102
Acanthurus monroviae KC623876.1 T T F V 102.5
Aracana aurita JQ861032.1 T T F V 105
Arnoglossus laterna KF312130.1 V I F V 105
Merlangius merlangus AY141260.1 I A F V 105
Pagrus major JQ638374.1 T T F V 105
Paralichthys olivaceus XM_020096205.1 I F F V 105
Priacanthus arenatus EU637997.1 T T F V 105
Salmo salar NM_001123537.1 I F T V 105
Neoniphon aurolineatus U57541.1 V F F V 109
Sargocentron xantherythrum U57546.1 V T F V 109
Epinephelus aeneus AY141291.1 I T F V 110
Epinephelus bruneus LC064406.1 I T F V 110
Leocottus kesslerii L42953.1 I I F V 110
Nemipterus virgatus KY363148.1 I T F V 110.5
Pogonoperca punctata AY141292.1 I T F V 113
Antennarius striatus KC442240.1 T T F V 114.5
Caprichthys gymnura JQ861033.1 T T F V 120
Atrobucca nibe KP722974.1 T F F V 122.5
Alosa sapidissima KX145751.1 I I T F 125
Canthigaster callisterna JQ682384.1 T T F V 125
Mene maculata AY141316.1 T T F V 125
Oncorhynchus gorbuscha AY214151.1 I F S V 125
Oncorhynchus keta AY214141.1 I F S V 125
Oncorhynchus kisutch XM_020459364.1 I F S V 125
Oncorhynchus nerka AY214156.1 I F S V 125
Trachinotus ovatus AY141314.1 I A F V 125
Paranotothenia magellanica DQ498788.1 V I S V 127.5
Symphysanodon katayamai KF017147.1 I T F V 137
Neomerinthe hemingwayi DQ874819.1 I T F V 137.5
Parachaenichthys georgianus HQ170083.1 I I S V 137.5
Balistes vetula KF027975.1 T T F V 138.5
Sarda sarda DQ874800.1 I I F V 140
Ostracion cubicus JQ861043.1 I T F V 140.5
KY363156.1 T T F V 144.5
Champsocephalus esox HQ170039.1 I V S V 150
Citharus linguatula AY141323.1 I T F V 150
Niphon spinosus EU637934.1 I T F V 150
Scomber japonicus AY141311.1 I V F V 150
Heteropriacanthus cruentatus KC222233.1 A T F V 151.5
Allomycterus pilatus JQ682364.1 I F F A 155
Anoplocapros inermis JQ861030.1 T T F V 155
Benthosema pterotum JN231002.1 G I F G 155
Engraulis mordax KT201125.1 L F T F 155
Argyrosomus regius EU637942.1 V F F V 157.5
Parascolopsis aspinosa KY363153.1 T T F V 160
Umbrina bussingi KP723057.1 T C L V 161
Neobythites sivicola JN231007.1 I T F V 162
acuta DQ498786.1 V I S V 165
Nemipterus bathybius KY363104.1 I T F V 167.5
Argentina sialis JN230995.1 L T F L 168
Plagiopsetta glossa JQ938056.1 I C F V 171.5
Canthigaster rivulata JQ682393.1 T T F V 175
Sardinella aurita JN230980.1 I C T F 175
Triodon macropterus KF028010.1 T T F V 175
Pegusa lascaris KF312148.1 I V F V 177.5
Seriola dumerili XM_022739072.1 V T F V 180.5
Clupea harengus XM_012841175.1 I L S V 182
Myoxocephalus thompsonii KX145806.1 I V F V 183
Porichthys notatus JQ938026.1 I I F V 183
Arnoglossus imperialis JQ938032.1 V F F V 185
Champsodon snyderi EU637949.1 T T F V 185
Laeops kitaharae JQ938035.1 V F F V 185
Psenopsis anomala AY141310.1 I T F V 185
Terapon jarbua KF017155.1 I I F V 185
Kentrocapros rosapinto JQ861036.1 T T F V 187.5
Oncorhynchus tshawytscha AY214136.1 I F S V 187.5
Uranoscopus albesca AY141305.1 I T F V 190
Citharoides macrolepis KF312133.1 I T F V 191
Coryphaena equiselis EU637951.1 I I F V 200
Engraulis encrasicolus JN230977.1 C V T F 200
Engraulis japonicus AB731902.1 C F T F 200
Trematomus newnesi HM166276.1 V V S V 200
Anguilla japonica AJ249202.1 I T S V 200.5
Anguilla marmorata KJ462782.1 I T S V 200.5
chiloensis EU637932.1 A T F V 201.5
Parachaenichthys charcoti HQ170081.1 I I S V 202.5
Psilodraco breviceps KU647483.1 V T S V 202.5
Zanclorhynchus spinifer EU638021.1 I T F V 202.5
Zeus faber EU638023.1 I T F V 202.5
Engraulis eurystole KT201124.1 C F T F 203
Cygnodraco mawsoni HQ170075.1 I I S V 205
Citharichthys arctifrons JQ938042.1 I T G V 206
Grammatostomias circularis KC163319.1 V F S V 206
Mullus surmuletus EU637982.1 I I F V 207
Bothus podas AY368313.1 V F F V 207.5
Cepola macrophthalma EU637948.1 I T F V 207.5
Xeneretmus latifrons EU638018.1 T T F V 209
Trematomus nicolai DQ498799.1 V V S V 210.5
Osmerus mordax KX145811.1 I I S V 212.5
Pterycombus brama EU638001.1 I T F V 212.5
Polyipnus stereope JN230997.1 F M T V 215
Callionymus lyra AY141270.1 T V F V 217.5
Synchiropus goodenbeani KF265122.1 F F F V 221.5
Isopsetta isolepis JQ938050.1 T L F V 222.5
Poecilopsetta plinthus KF312144.1 I T F V 230
Lagocephalus lagocephalus EU637968.1 T T F V 243
Paracallionymus costatus KF265141.1 T T F V 247
Chauliodus danae JN412564.1 T S S V 250
Lampris guttatus KC442226.1 T C F V 250
Stomias brevibarbatus JN544528.1 I F A V 250
Mola mola AF137215.1 T T F V 255
Pontinus longispinis EU637996.1 I T F V 257.5
Sphoeroides pachygaster JQ682421.1 T T F V 265
Callanthias ruber EU637945.1 T T F V 275
Cottocomephorus inermis U97266.1 T I S V 275
Eopsetta jordani KF312140.1 T T F V 275
Gymnodraco acuticeps KU647485.1 V I S V 275
Notothenia coriiceps AY141302.1 V I S V 275
Pagothenia borchgrevinki HM166265.1 V V S V 275
Parophrys vetulus JQ938052.1 T L F V 275
Patagonotothen ramsayi DQ498783.1 I V S V 275
Trematomus hansoni DQ498789.1 V V S V 277.5
Schedophilus medusophagus EU638003.1 V T F V 279
Trichiurus lepturus LC223133.1 I V F V 294.5
Hime japonica KC442221.1 T V S V 297.5
Gadus morhua AF137211.1 T F F V 300
Gempylus serpens DQ874812.1 V V F V 300
Lepidopus fitchi EU407252.1 V T S M 300
Anarhichas lupus EU637936.1 I V F V 300.5
Chionodraco hamatus DQ498792.1 I V S V 302
Lopholatilus chamaeleonticeps EU637973.1 V T F V 310
Prionodraco evansii HQ170086.1 V C S V 310
Macroramphosus scolopax AY141280.1 T T F V 312.5
Phycis phycis EU637994.1 I V F V 313.5
Polyprion americanus JQ937977.1 I T F V 320
Polymixia lowei KC442227.1 I L F V 325
Zenopsis conchifer AY368314.1 I T F V 325
Pagetopsis macropterus HQ170060.1 I I S V 330
Enchelyopus cimbrius EU637958.1 I I F M 335
Cryodraco antarcticus HQ170052.1 I V S V 345
Anguilla anguilla L78008.1 V T S V 350
Argentina striata JX255562.1 L T F L 350
Champsocephalus gunnari HQ170042.1 I V S V 350
Pollichthys mauli JN544535.1 I F C V 350
Trematomus bernacchii EU638014.1 V V S V 350
Lota lota KX146040.1 I I F V 350.5
Lophiodes iwamotoi KF060342.1 T T F V 355
Trematomus eulepidotus HM166268.1 V V S V 360
Scorpaena onaria AY141288.1 I S F V 361.5
Trematomus pennellii JQ693498.1 V V S V 366
Lepidoblepharon ophthalmolepis KF312135.1 I T F V 369
Ateleopus japonicus KC442218.1 I I T V 370
Capros aper AY141262.1 T T F V 370
Foetorepus agassizii KF265123.1 F F F V 374
Chaenocephalus aceratus HQ170035.1 I I S V 387.5
Polymixia japonica JN231005.1 I L F V 394
Xiphias gladius EU638019.1 I T F V 400
Lepidorhombus boscii JQ938059.1 T T F V 403.5
Trematomus scotti HM166282.1 V V S V 406.5
Microstomus achne LC209607.1 T T F V 407.5
Scopelarchus analis EF517404.1 F I S V 410
Lyopsetta exilis JQ938073.1 T T F V 412.5
Seriola lalandi XM_023422803.1 V T F V 414
Racovitzia glacialis HQ170090.1 V S S V 414.5
Batrachocottus multiradiatus U97267.1 I S F V 425
Cyclopterus lumpus AY368316.1 I V F V 434
Gerlachea australis HQ170077.1 I T S V 435
Polymixia nobilis AY368320.1 I L F V 435
Ruvettus pretiosus DQ874813.1 I V F V 450
Trachipterus arcticus KC442225.1 I C F V 450
Verasper moseri AB930176.1 I T F V 450.5
Lentipes concolor KF016038.1 I I F V 457.5
Neopagetopsis ionah EU637986.1 I I S V 460
Bolinichthys indicus JN412574.1 G V F C 462.5
Antigonia capros KF027970.1 T T F V 475
Neoscopelus microchir KC442224.1 T V F V 475
Synagrops bellus JF913271.1 T T F V 485

Lophiodes mutilus KF060338.1 T T F V 497
Trematomus tokarevi HM166284.1 V V S V 497.5
Chaenodraco wilsoni HQ170037.1 I V S V 500
Chionodraco myersi HQ170048.1 I V S V 500
Lamprogrammus shcherbachevi EU637969.1 I M F V 500
Photonectes braueri KC163325.1 V F S V 500
Scomber scombrus DQ874797.1 I V F V 500
Trematomus lepidorhinus HM166272.1 T V T V 500
Awaous guamensis HQ639148.1 I F F V 500.5
Lophius piscatorius AY368325.1 I T F V 510
Synagrops japonicus KF017148.1 I I F V 525
Trachurus trachurus EU638013.1 I V F V 525
Polyipnus asteroides JN544533.1 F M T V 549
Dacodraco hunteri HQ170057.1 I V T V 550
Limnocottus bergianus U97270.1 I S F V 550
Idiacanthus antrostomus KC163334.1 V F S V 551.5
Merluccius merluccius JN231004.1 T T F V 552.5
Chlorophthalmus acutifrons KC442222.1 I M S V 575
Conger myriaster AB043818.1 I T S V 575
Limnocottus pallidus U97271.1 I S F V 575
Diaphus metopoclampus JN544536.1 G G F V 587.5
Trematomus loennbergii HM166274.1 T V S V 595.5
Hippoglossus stenolepis KF312141.1 I T F V 600
Rachycentron canadum KF312126.1 I T F V 600
Phycis blennoides JN412579.1 T V F V 605
Mancopsetta maculata KF312129.1 V S A V 607.5
Aristostomias scintillans KC163301.1 I I F V 609.5
Astronesthes chrysophekadion KC442217.1 I T F V 610
Epigonus telescopus EU637959.1 I T F V 637.5
Hoplostethus mediterraneus JN412583.1 I T F V 637.5
Grammatostomias flagellibarba KC163323.1 V F S V 640.5
Akarotaxis nudiceps HQ170067.1 V C S V 643
Argyropelecus gigas JN412572.1 F L S V 650
Bonapartia pedaliota JN544534.1 T T T V 650
Batrachocottus nikolskii U97268.1 I S F V 652
Vomeridens infuscipinnis HQ170093.1 V S S V 656.5
Beryx splendens AY141265.1 I V F V 662.5
Grammicolepis brachiusculus EU637964.1 I T F V 663
Dolloidraco longedorsalis HQ170034.1 V C S V 674
Stomias gracilis JX255576.1 I F S V 725
Melamphaes suborbitalis JN231006.1 G I S F 750
Alepocephalus bicolor JN230974.1 I V F V 759.5
Lycodapus antarcticus EU637976.1 T T F V 761.5
Careproctus rhodomelas LC050194.1 A S F V 766.5
Sagamichthys abei JN230975.1 I V F V 768.5
Lampanyctus alatus JN412575.1 G V F G 770
Bathydraco marri HQ170070.1 V C S V 775
Alepocephalus antipodianus EU637933.1 I V F V 795
Alepocephalus owstoni JX255563.1 I V F V 800

Stomias atriventer KC311787.1 I F A V 800
Trachyrincus murrayi AY368318.1 F V F V 815
Cottunculus thomsonii AY368315.1 I T F V 850
Abyssocottus korotneffi U97272.1 I S F V 860
Poecilopsetta beanii JQ938054.1 I T F V 895.5
Maurolicus muelleri MF805834.1 I T T V 897.5
Echiodon cryomargarites EU637956.1 T A F V 920.5
Trigonolampa miriceps KC163327.1 V F S V 930
Howella brodiei EU637966.1 I F F L 964.5
Sebastolobus altivelis DQ490124.1 I T F V 979
Astronesthes macropogon JN544530.1 I I A V 1000
Cottinella boulengeri U97273.1 I S F V 1000
Idiacanthus fasciola JN412568.1 I F A V 1000
Nemichthys curvirostris KY026032.1 I T T V 1000
Aristostomias tittmanni KC311788.1 I I F V 1007.5
Melanostomias bartonbeani KC163313.1 I F S V 1012.5
Tactostoma macropus KC163302.1 V F S V 1015
Hippoglossus hippoglossus KF941294.1 I F F V 1025
Alepocephalus bairdii JN412584.1 I T F V 1032.5
Photostomias goodyeari KC163304.1 G I F V 1042.5
Diaphus watasei JN231003.1 G C F V 1052.5
Argyropelecus aculeatus JN412571.1 F L T V 1078
Neocyttus helgae AY141261.1 I T F V 1099
Dissostichus mawsoni DQ498794.1 V T S V 1100
Triplophos hemingi KC163329.1 I L S V 1100
Heterophotus ophistoma KC163311.1 I F S V 1105
Diaphus rafinesquii JN412587.1 G C F V 1106.5
Borostomias panamensis KC163305.1 I F S V 1150
Talismania longifilis JX255564.1 I V F V 1150
Astronesthes gemmifer KC163307.1 V I S V 1200
Ceratoscopelus warmingii JN412573.1 G C F V 1223.5
Notacanthus bonaparte JN544543.1 I T F V 1243.5
Aphanopus carbo EU637938.1 I V F V 1250
Chionobathyscus dewitti HQ170044.1 I I S V 1250
Ichthyococcus ovatus JN412569.1 I F C V 1250
Bathydraco macrolepis HQ170068.1 V C S V 1275
Benthosema suborbitale JN412576.1 I V Y G 1275
Bathydraco scotiae HQ170074.1 V C S V 1290
Bathydraco antarcticus HQ170073.1 V C S V 1370
Coryphaenoides rupestris AY368319.1 I T F V 1390
Sternoptyx pseudobscura KC163312.1 I M T V 1400
Cataetyx laticeps JN412580.1 I T F V 1450
Borostomias antarcticus KC163322.1 V F S V 1465
Mora moro AY368322.1 F S F V 1475
Flagellostomias boureei KC163308.1 I F S V 1500
Saccopharynx ampullaceus KY026028.1 F V S V 1500
Bathypterois dubius AY141257.1 A T T V 1530
Alepocephalus agassizii JN544545.1 I T F V 1550
Bathophilus pawneei KC163309.1 I F S V 1550

Lycodes terraenovae JF764597.1 T T F V 1617
Chascanopsetta lugubris KF312131.1 V F F V 1635
Stenobrachius leucopsarus EU407251.1 G A F V 1715.5
Coryphaenoides guentheri JN412578.1 I T F V 1830.5
Sigmops bathyphilus AY141256.1 G L A V 1850
Bathylagus euryops AY141255.1 T S S M 1868.5
Dissostichus eleginoides DQ498780.1 V T S V 1950
Bathysaurus ferox JN412585.1 I T F V 2050
Echiostoma barbatum KC163306.1 I V S V 2115
Photostomias guernei JN412566.1 G I F V 2119
Malacosteus niger AJ224691.1 I C I M 2193
Sigmops gracilis KC442216.1 I F S V 2194.5
Halosauropsis macrochir JN544541.1 I T F V 2200
Chauliodus macouni EU407250.1 V F S V 2207.5
Coryphaenoides leptolepis JN544537.1 I T F V 2305
Pachystomias microdon EF517408.1 I I T M 2330
Ceratias holboelli AY141263.1 V T F V 2400
Coryphaenoides profundicolus JN544538.1 A T F V 2436
Bathophilus vaillanti JN544532.1 I F S V 2450
Bathytroctes microlepis JN544540.1 H S F V 2450
Chauliodus sloani JN412563.1 V T S V 2450
Rhadinesthes decimus KC163321.1 I F S V 2450
Anoplogaster cornuta JN412582.1 I T F V 2497
Leptostomias gladiator KC163328.1 V F S V 2500
Vinciguerria nimbaria JN412570.1 I V S V 2510
Histiobranchus bathybius JN544542.1 I T F V 2867.5
Photonectes margarita KC163314.1 I F S V 2947.5
Coryphaenoides carapinus JN544539.1 I T F V 2997
Bathysaurus mollis JN412586.1 I S S V 3226.5
Bassozetus compressus JN412581.1 I T F V 3295
Conocara salmoneum JN412577.1 I V F T 3450
Eurypharynx pelecanoides JN544544.1 I T S V 4062.5

5.9.2. Supplementary figures

Figure S5.1. Rhodopsin gene tree for gymnotiforms. Branch lengths are equal to the number of nucleotide substitutions per site. Cypriniformes, Characiformes and Siluriformes species were used as outgroups. The gray box labels the monophyletic clade, composed mostly of deep-channel gymnotiforms, identified as under positive selection by Branch-Site REL.


Figure S5.2. Vertebrate phylogeny. Phylogeny of rhodopsin sequences based on the topology of the species tree generated by Betancur-R et al. 2017. Branch lengths are based on the number of nucleotide substitutions per codon under the M0 model in PAML.


Figure S5.3. Support for ancestral amino acid reconstructions. Results shown for the WAG amino acid substitution matrix and at nodes representing the ancestors between gymnotiforms and Osteichthyes. Sites on helix 5 and 6 coloured and outlined. Sites with amino acid reconstructions with posterior probabilities less than 0.5 labelled.

CHAPTER SIX: GENERAL CONCLUSIONS

6.1. GENERAL SUMMARY

This thesis expands our perspective of visual system evolution to include adaptations improving dim-light vision in optically challenging freshwater rivers. The rate at which wavelengths of light are attenuated differs between marine and freshwater environments, which can lead to different selection pressures on rhodopsin function. In Chapters two and three, I demonstrate that signatures of positive selection, along with amino acid substitutions that alter the functional properties of rhodopsin, accompany invasions of freshwater by ancestrally marine teleost lineages. Functional characterization of these rhodopsin substitutions in Chapter three revealed red-shifted spectral sensitivity and possibly more efficient dark adaptation, consistent with a receptor tuned to a red-shifted and dim freshwater environment. In Chapter four, I show that shifts in selection pressures and amino acid substitutions at rhodopsin spectral tuning sites are convergent across multiple independent invasions of freshwater, and are most pronounced in deeper-dwelling lineages. In Chapter five, I find that electrogenic fishes retain a functional copy of rhodopsin, despite having an alternative sensory modality suited to optically challenging environments. Similar to the lineages of freshwater fishes with marine ancestry investigated in Chapters two to four, these freshwater fishes have undergone adaptive evolution, whereby specific amino acid substitutions in rhodopsin improve the perception of light in warmer, dimmer and red-shifted rivers. These results suggest that a holistic view of the functional properties of rhodopsin and the ecology of the species possessing these pigments is necessary for studying visual evolution. In this final chapter, I discuss some of the ecological factors that differ across freshwater fishes and environments and how these factors might have shaped visual evolution. I also discuss some unanswered questions related to adaptation in other aspects of the visual system, and how the functional effects of some substitutions might differ across species and improve rhodopsin function through previously overlooked mechanisms.

6.2. ENVIRONMENTAL AND ECOLOGICAL EFFECTS ON RHODOPSIN EVOLUTION

Freshwater visual environments are more heterogeneous than marine waters, and as this thesis demonstrates, this diversity is reflected in adaptations in the rhodopsin pigments of freshwater fishes. Marine environments, ranging from clear off-shore waters to turbid and tannin-stained coastal regions, have been described by 10 categories that approximate the optical diversity of most marine environments (Jerlov 1976). While most large lakes and some clear rivers may fit within these broad classifications, many rivers are too heterogeneous to be categorized by this system. Rivers are broadly classified as being white, black or clear, the specific colour determined by the river’s substrate. Rivers running over land rich in organic material take on a tea-like (black water) appearance, whereas fast-flowing rivers running over silty substrate take on a milky appearance because of the suspended particulate matter. In both cases, the underwater light environments would fall well beyond the most red-shifted marine category (Jerlov 1976). Moreover, the degree of red-shifting can vary seasonally as water levels rise and fall, as well as spatially across the course of a river (Costa et al. 2012).

In Chapter two I found that rhodopsin is positively selected in a South American freshwater clade of marine-derived anchovies. Reconstructing amino acid substitutions in this clade indicated that some anchovies have more red-shifted rhodopsin than others. Other studies have found similarly high rates of positive selection and variation in spectral sensitivity in African rift lake cichlids inhabiting a dramatic diversity of depths (Carleton et al. 2016). Anchovies are predominantly pelagic, suggesting the uneven selection pressures acting across the clade are not due to differences in depth, but instead might reflect the substantial optical diversity in Amazonian rivers (Costa et al. 2012) or the impressive range of ecological niches (e.g., miniaturized and piscivorous lineages) occupied by this clade of fishes (Bloom and Lovejoy 2012). Such ecological factors have been suggested as drivers of visual adaptation in South American cichlids, which also possess a positively selected rhodopsin gene and inhabit environments similar to those of the freshwater clade of anchovies (Schott et al. 2014; Hauser et al. 2017; Torres-Dowdall et al. 2017). While this thesis shows that transitions from marine to freshwater environments can result in increased rates of molecular evolution, it is only just scratching the surface of the immense variability in visual environments within freshwater systems. A more detailed understanding of smaller-scale differences in freshwater visual environments, or at least of the diversity of visual environments freshwater species encounter, could dramatically improve the resolution of future analyses of the molecular evolution of rhodopsin and other genes involved in vision.

Optical variability in freshwater environments could also alter the magnitude of the shift in the wavelengths of light illuminating marine and freshwater environments. Similar to depth, this could alter the strength of selection associated with transitions into freshwater environments and might explain some of the variation in selection pressures observed in Chapters three and four. For example, in Chapter three I observed positive selection in rhodopsin evolution in the South American, but not North American, marine to freshwater transition events of drum (Sciaenidae). The habitats occupied by South American freshwater drum are optically diverse and include black, clear, and whitewater, while the expansive range of the North American freshwater drum (Aplodinotus grunniens) includes the Laurentian Great Lakes, a relatively clear environment. Clear lakes are more likely to resemble marine habitats with respect to the attenuation of light, making transitions from marine habitats to clear lakes less likely to alter evolutionary rates. This is the case in the lake-dwelling cottoids and cichlids inhabiting Lake Baikal and the African Rift Valley Lakes, respectively (Hunt et al. 1996; Carleton et al. 2016). In Lake Baikal, the peak spectral sensitivities of rhodopsin range from 516 to 484 nm in surface- and deepest-dwelling lineages, respectively (Hunt et al. 1996).

Ecological differences might be expected to interact with visual environment to influence visual evolution. In Chapter four I found that rhodopsin in Beloniformes is under strong positive selection despite the fact that this clade inhabits very shallow water environments where spectral attenuation is minimal. Beloniformes do vary in diet, possibly representing another dimension to visual adaptation (Lovejoy 2004). Indeed, dietary evolution has been repeatedly shown to drive molecular, morphological and physiological evolution in vertebrates (Fritsches et al. 2005; López-Fernández et al. 2012; Li and Zhang 2013). This includes studies focussed on vision. For example, acuity is highest in predatory birds, where the need for high visual performance is great (Land and Nilsson 2012). However, the trend of high acuity in predatory species does not appear to apply to fishes (Caves et al. 2017). Yet other adaptations are observed in lineages of predatory fishes, some specific to different prey types. Highly visual predatory fishes devote a larger portion of the brain to the processing of visual information than fishes relying on more passive predation strategies (Deary et al. 2016). Other studies have shown that opsin complement varies with prey type, most notably in larval fishes, where ontogenetic differences in the expression of UV cones help to distinguish prey from the surface (Novales-Flamarique and Hawryshyn 1994). Morphological adaptations altering the kinetic properties of vision are also observed in billfishes, where novel heater organs placed next to the eye have opened up new ecological niches for foraging (Fritsches et al. 2005). Similar kinetic and spectral adaptations might be expected at the molecular level in opsins, optimizing performance for different prey types and foraging strategies.

Understanding how the factors mentioned above influence visual evolution could become increasingly important for assessing the risk of habitat loss due to anthropogenic modification of water systems (Hufbauer et al. 2012). Many fishes are sensitive to changes in turbidity, and are among the most at risk of extinction due to human interference (van der Sluijs et al. 2010). Increases in turbidity have been shown to affect some fishes more profoundly than others (Gray et al. 2014), and visual predators in particular are known to be at greater risk (Lunt and Smee 2015). Interestingly, an inverse effect is observed in some tropical fishes, where increased water clarity resulting from the construction of large dams decreases the survivability of native predatory species with eyes adapted for turbid environments (Santos et al. 2018). The role that variation in the visual systems of these fishes plays in mediating these differences has not been investigated. In addition to the threats posed to native species, the turbidification of rivers may also disproportionately benefit non-native species that evolved in more turbid tropical rivers. One example is the spread of the goldfish (Carassius auratus) through North American rivers and lakes; goldfish possess a highly red-shifted complement of visual pigments ideal for vision in turbid environments (Parry and Bowmaker 2000). Understanding the molecular basis of vision in native and non-native fishes could help to explain some of the complexity associated with anthropogenic pressures on fish populations, and to clarify whether invasion events and the turbidification of rivers have a compounding influence on sensitive species.

6.3. ALTERNATIVE AVENUES FOR ADAPTATION TO FRESHWATER VISUAL ENVIRONMENTS

In this thesis, I focussed on the evolution of rhodopsin structure and function, but adaptation in other aspects of the visual system is also expected to optimize vision in freshwater. Adaptations to turbid and tannin-stained freshwater environments have also been attributed to differences in chromophore usage and differential expression of opsin classes (Torres-Dowdall et al. 2017). While the utility of the A1 to A2 chromophore switch for vision in freshwater fishes has been understood for over fifty years (Beatty 1966), the enzyme responsible for the conversion, Cyp27C1, was only recently identified (Enright et al. 2015). This enzyme has also been shown to be conserved across vertebrates (Morshedian et al. 2017), but freshwater species incorporate the A2 chromophore far more frequently than marine or terrestrial species (Toyama et al. 2008), and the activity of this enzyme is highest in fishes inhabiting more turbid environments (Torres-Dowdall et al. 2017). Diadromous species undergoing habitat transitions from marine to freshwater, and amphibious species transitioning from freshwater to terrestrial environments, both preferentially express the A2 chromophore during the freshwater phase of their lifecycle (Enright et al. 2015). The A2 chromophore shifts the λmax of rhodopsin further to the red than is possible through amino acid substitutions in rhodopsin alone.

Spectral sensitivity curves from microspectrophotometry data suggest many freshwater fishes incorporate the A2 chromophore (Bridges 1964; Schwanzara 1967). However, only a small subset of A2-incorporating fishes are derived from marine ancestors. It is not clear whether this mode of adaptation to red-shifted underwater visual environments is possible for freshwater lineages with marine ancestry, as the enzymatic function of Cyp27C1 may have been lost during long evolutionary periods of disuse. Even some freshwater species do not express Cyp27C1 naturally, although it can be induced through treatment with thyroid hormone (Allison et al. 2004). The A2 chromophore is also more prone to thermal noise, and may be a hindrance in warmer waters. In general, freshwater fishes express the A2 chromophore less frequently in the summer months when water is warmer (Allen and McFarland 1973). Amazonian rivers are always warm and may not be suitable for the use of A2; however, specific substitutions in the rhodopsin binding pocket might mitigate some thermal effects. Interestingly, the S299A substitution shown to improve thermal stability in the rhodopsin of frogs that use the A2 chromophore (Fyhrquist et al. 1998) is repeatedly observed in fishes making marine to freshwater transitions in Chapter four. Whether the S299A substitution has the same impact on thermal stability in the rhodopsin pigments of freshwater fishes with marine ancestry, or whether other convergent substitutions are also involved, is unknown, but likely dictates if the A2 chromophore is beneficial in these lineages. The increased number of rhodopsin sequences now available for freshwater fishes should make comparative sequence analyses possible for species known to possess A1, A2 or both chromophores at some point during their life cycle. These studies might help to identify conserved structures in rhodopsin, or other aspects of the visual system, associated with the usage of the A1 or A2 chromophore. Specifically, differences in the binding pocket of rhodopsin might be expected, given that similar differences have been observed in other GPCRs in regard to ligand specificity (Wolf and Grünewald 2015).

The complement of cone opsins is also expected to vary across environments and lineages in freshwater species (O'Quin et al. 2010). Next-generation sequencing approaches have made investigations of the relative expression of different opsins more tractable. Cones are more active in bright light and may be more critical to the shallower-dwelling Beloniformes inhabiting brighter environments than to the deeper-dwelling Clupeiformes and sciaenids, which might help to explain some of the difference in divergent selection observed in Chapter four. Visual evolution in cones is more complicated than in the dim-light sensitive rods, as these cells must prioritize both the detection of and discrimination between different wavelengths of light to maximize contrast detection (Lythgoe 1979). Cone opsin complements do differ in cichlids, guppies, and stickleback inhabiting different environments (O'Quin et al. 2010; Brawand et al. 2014; Sandkam et al. 2018). In stickleback, standing allelic variation in cone opsin sensitivities segregates differently in species that colonized clear or tannin-stained lakes following the last glacial maximum (Marques et al. 2017). To what extent cone opsin complements have shifted to match the prevailing light environments in marine-derived freshwater lineages has not yet been investigated, but might provide another example of adaptations to different light regimes. Croakers and anchovies have reduced cone opsin complements, possessing three cone classes each: LWS, Rh2A and SWS2 in croakers (Xu et al. 2016), and Rh2-1, Rh2-2 and LWS in anchovies (Kondrashev et al. 2012). The reduced opsin complement of these lineages, and the even more extreme reduction in the Gymnotiformes, might explain the higher selection pressures on the remaining pigments, including rhodopsin.

Comparison of dN/dS rates in species with different numbers of opsins might help to determine if this parameter affects visual evolution. Interestingly, I found evidence for two copies of the Rh1 gene in the North American freshwater drum, one of which is a pseudogene. Most fishes retain only a single copy of rhodopsin (Musilová et al. 2018), but duplication events have been instrumental in establishing the expanded cone opsin complements in many species (Cortesi et al. 2015; Lin et al. 2017; Nakamura et al. 2017). Whether the two rhodopsin copies in the North American freshwater drum are the result of a genome-wide duplication event, or instead represent a tandem duplication or allelic diversity, remains to be investigated.

Another adaptive strategy in highly turbid freshwater environments is the development of an entirely new sensory system. Electrosensory systems, well-suited for highly turbid habitats, arose twice in teleosts: in the Gymnotiformes and mormyroids. In Gymnotiformes, contrary to our expectations, the evolution of the electrosensory system has not resulted in the loss of visual capacity with respect to rhodopsin function. In fact, in Chapter five I show that the rhodopsin gene, critical for dim-light vision, is highly conserved. One future direction would be to extend this study to include the convergent electrogenic clade of fishes, the Mormyroidea, and expand the dataset to include other genes involved in the visual system. This could reveal if the trend of high conservation in rhodopsin is observed in other clades with a similar sensory modality and if the results are consistent across the entirety of the visual system. Although it may seem ideal to have input from multiple sensory systems, previous studies have shown relaxation of selection in cone opsins in bats with higher duty cycle echolocation, a process deemed to be part of a sensory trade-off (Gutierrez, Schott, et al. 2018). Similar studies comparing Gymnotiformes and mormyroids with different types of electric signals might reveal sensory trade-offs in cones or other aspects of the visual transduction cascade and provide evidence for the ultimate importance of vision in the totality of the sensory system in species with an alternative sensory modality specialized for freshwater habitats.

6.4. THE CHANGING EVOLUTIONARY LANDSCAPE DURING RHODOPSIN EVOLUTION

In previous sections I have described how the ancestral ecology of a fish can affect its adaptation to freshwater visual environments. Similarly, the ancestral sequence and structure of rhodopsin will alter the functional effects of specific amino acid substitutions during these adaptive events. The interdependence of sites in protein evolution, known as epistasis, is critical in explaining why such a small fraction of sequence space has been explored through evolution (Maynard Smith 1970) and why substitutions have different functional effects in different sequence backgrounds (Gong et al. 2013). However, the effects of epistasis on functional protein evolution are still not well understood and are not incorporated in many models of protein evolution (Liberles et al. 2012). Opsin proteins are one of the few systems where epistatic interactions have been investigated and represent a potential model system for these studies. The numerous opsin sequences available, well-resolved crystal structures in active and inactive states, and the ability to reconstruct and functionally characterize the effects of specific substitutions in vitro have revealed epistatic functional shifts involved in ancient divergence events in paralogous visual opsins (Kojima et al. 2017; Gerrard et al. 2018) and species-specific epistatic interactions in rhodopsin (Castiglione et al. 2017; Dungan and Chang 2017). In Chapters three, four and five I find further evidence for epistasis occurring in the molecular evolution of rhodopsin.
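One common way to make this interdependence concrete is to ask how far a double mutant departs from the sum of its single-mutant effects. The Python sketch below illustrates that additive-expectation calculation for a functional measurement such as peak spectral sensitivity. The function name `pairwise_epistasis` and the λmax values in the usage lines are hypothetical, chosen only for illustration, and are not taken from any assay reported in this thesis.

```python
def pairwise_epistasis(wild_type, single_a, single_b, double_ab):
    """Return the deviation of a double mutant from additivity.

    All arguments are measurements of the same functional property
    (for example lambda-max in nm). A value of zero means the two
    substitutions act additively; a non-zero value suggests that the
    effect of one substitution depends on the presence of the other.
    """
    effect_a = single_a - wild_type   # shift caused by substitution A alone
    effect_b = single_b - wild_type   # shift caused by substitution B alone
    additive_expectation = wild_type + effect_a + effect_b
    return double_ab - additive_expectation


# Hypothetical lambda-max values (nm), for illustration only.
additive_case = pairwise_epistasis(500.0, 502.0, 503.0, 505.0)
epistatic_case = pairwise_epistasis(500.0, 502.0, 503.0, 509.0)
print(additive_case, epistatic_case)
```

In practice such a comparison would also need to account for measurement error in each pigment, which this sketch ignores.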

In Chapter three, positive selection is observed at site 165 of rhodopsin in both the South American freshwater and marine phylogenetic partitions of croakers. Interestingly, the amino acid residues at this site have different properties before and after the marine to freshwater transition event. The residues at this site are highly variable in croakers and across fishes in general, but are mostly constrained to polar amino acids in marine croakers. In the freshwater clade the residues at this position remain variable but are restricted to non-polar leucine, valine and isoleucine. Unlike some other substitutions observed during the marine to freshwater transition, non-polar residues at site 165 do not appear to be more frequent in freshwater fishes, but instead are found in nearly all fishes with a phenylalanine at site 119. The L119F substitution has occurred multiple times independently in fishes, including along the marine to freshwater transition in croakers. The exclusion of polar residues at site 165 might represent a form of epistasis imposed by the large phenylalanine at site 119 projecting into close proximity of site 165. Replacing leucine with phenylalanine at site 119 in the dark-state crystal structure shortens the distance between the side chain of site 119 and the nearest residue on helix 4, site 165, from 5.58 angstroms to 3.0 angstroms. This could represent a steric conflict that is alleviated upon activation, when the distance between the two sites lengthens from 3.0 angstroms to 4.2 angstroms, and may have kinetic consequences. In zebrafish, a C165L substitution does slightly but significantly alter rhodopsin kinetics (Morrow and Chang 2015). To determine if this effect is magnified by epistatic interactions with site 119, further functional characterization must also be carried out in a background sequence where the residue at site 119 is a phenylalanine.
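A distance of this kind can be computed as the minimum interatomic distance between the atoms of two side chains. The Python sketch below shows that calculation under the assumption that the atomic coordinates (in angstroms) of each side chain have already been extracted from a structure file. It is not necessarily how the distances reported here were measured, and the function name `min_side_chain_distance` and the toy coordinates are hypothetical and do not come from any rhodopsin structure.

```python
import math


def min_side_chain_distance(atoms_a, atoms_b):
    """Return the smallest distance between any atom in atoms_a and
    any atom in atoms_b.

    Each argument is a sequence of (x, y, z) coordinates in angstroms.
    """
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


# Toy coordinates for illustration only (not from a real structure).
residue_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
residue_b = [(4.0, 4.0, 0.0), (9.0, 0.0, 0.0)]
print(min_side_chain_distance(residue_a, residue_b))
```

Comparing this value before and after an in silico substitution, and between dark-state and active-state structures, would give the kind of contrast described for sites 119 and 165.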

In Chapter four a substitution is observed at site 185, a highly conserved site in the second extracellular loop of rhodopsin (EL2). The C185A substitution occurs along a transitional lineage under positive selection alongside 18 other amino acid substitutions, some at known functionally relevant sites. EL2 forms a cap covering the retinal binding pocket of rhodopsin and must fold correctly to form a functional pigment (Ernst et al. 2014). This folding is facilitated by the formation of a disulfide bond between conserved cysteines at sites 110 and 187. While site 185 is not directly involved in this bond formation, substitutions at neighbouring sites can cause an aberrant disulfide bond to form between sites 110 and 185, leading to protein misfolding (McKibbin et al. 2007). Replacing the cysteine at site 185 with an alanine prevents this pathogenic bond from forming in vitro (McKibbin et al. 2007). Experimental characterization of the substitutions observed along the transitional branch will reveal if the C185A substitution compensates for other substitutions that destabilize rhodopsin folding.

Understanding protein epistasis is also important for predicting the functional effects of mutations associated with disease in humans in the context of precision medicine. As sequence data for more non-model species become available, substitutions previously thought to be deleterious are appearing in natural populations (Xu and Zhang 2014). In Chapter five, I report a similar observation in Gymnotiformes, where a retinitis pigmentosa-associated substitution is conserved throughout the clade. I suggest that this substitution is masked by other substitutions in close proximity in the 3D structure, which are also found in other species with the same disease-associated substitution.

Rhodopsin has a relative wealth of crystal structure data, with both the dark-state and active-state crystal structures available (Palczewski et al. 2000; Choe et al. 2011). However, this is not the case for most proteins (Jaskolski et al. 2014). Crystal structures are also often only available for one species, preventing comparisons between orthologous structures even in relatively well-characterized proteins like rhodopsin (Burley et al. 2018). Proteins are also not static entities, and in many cases, substitutions may have state-specific functional effects. New methods are being developed to visualize proteins in more native settings and over small time scales in vivo and in silico (Dror et al. 2012; Bai et al. 2015). Simultaneously, more parameter-rich models of molecular evolution are being developed to account for the structural properties of proteins (Chi et al. 2018). Better structural data and improved models of molecular evolution should provide a better substrate for making informed hypotheses as to what effect natural variation might have on the function of proteins in different species, extending our understanding of evolutionary processes to include a more reasonable representation of a protein’s structure and function.

6.5. TOWARDS A MORE HOLISTIC VIEW OF ADAPTATION IN RHODOPSIN STRUCTURE AND FUNCTION

In this thesis, I find evidence for shifts in rhodopsin spectral sensitivity and differences in rhodopsin kinetics that suggest adaptation to riverine environments is not entirely analogous to the trends observed in marine environments and large lakes. In addition, substitutions that are expected to affect thermal stability and dimerization were also observed. Increasingly sophisticated assays and techniques have furthered our understanding of the functional properties of rhodopsin during its activation (Kang et al. 2015). Simultaneous advances in crystallography and microscopy are also providing more detailed descriptions of the intermediate structures formed during rhodopsin activation, as well as providing more natural depictions of how these structures assemble in the outer segments of photoreceptor cells (Gunkel et al. 2015). Likewise, the rapid increase in sequence data for non-model species has provided a great opportunity to investigate parallel changes in protein structure and function that might have been previously overlooked. An improved understanding of how these lesser-known properties affect rhodopsin in an adaptive context might help explain some of the natural variation observed across species once believed to be neutral (Yokoyama et al. 2008; Castiglione et al. 2017; Castiglione et al. 2018).

The series of steps in the activation of rhodopsin is becoming increasingly populated as novel assays isolate intermediate conformations detectable only on the femtosecond timescale (Yamazaki et al. 2014; Kang et al. 2015). Simultaneously, mutagenesis studies are revealing how specific substitutions influence the equilibrium and ultimately the duration rhodopsin remains in the comparatively long-lasting active (Meta-II) and inactive dark states (Schafer and Farrens 2015; Schafer et al. 2016; Yue et al. 2017). Many different hypotheses have been suggested for the adaptive importance of shifts in rhodopsin equilibrium that prolong or shorten the time spent in any given conformation (Sugawara et al. 2010; Sommer et al. 2014). Longer Meta-II duration is thought to improve signal amplification by increasing the number of G-proteins activated (Sugawara et al. 2010). This theory is complicated by the fact that very soon after Meta-II is formed it is shut off by rhodopsin kinase and arrestin, long before the active state degrades (Ernst et al. 2014). However, the trend remains that dim-light species tend to have longer Meta-II durations (Sugawara et al. 2010; Dungan and Chang 2017; Hauser et al. 2017; Gutierrez, Castiglione, et al. 2018). A longer Meta-II duration might also act as a reservoir for the large amount of toxic all-trans chromophore generated during light bleaches in rods, decreasing exposure while it is transformed into less toxic intermediate forms or until it is shuttled out of the photoreceptors and into the RPE (Sommer et al. 2014; Kiser and Palczewski 2016). Deeper-dwelling fishes tend to have higher proportions of rods (Hunt et al. 2015), which might make photoprotective adaptations more important (Castiglione and Chang 2018). Whether or not this trend translates to freshwater fishes is not clear, as we do not observe functional shifts that would suggest photoprotective adaptations in freshwater fishes with marine ancestry.
Instead, it would appear that freshwater fishes opt for faster retinal release and a less stable Meta-II, presumably to increase the rate of dark adaptation when traversing the narrow interface between bright and dark environments. These studies suggest that adaptations in rhodopsin kinetics are multifaceted. Further studies on a more diverse array of species, with functional assays designed to infer differences across more intermediate states, are needed to fully understand the effects rhodopsin kinetics has on visual adaptation.

The thermal stability of rhodopsin determines the absolute threshold for vision, and is critical for reliable vision in dim-light environments (Liu et al. 2011). The depths of tropical rivers are much warmer than marine environments and lakes because the mixing of the water column prevents the formation of a thermocline (Crampton 2007). Red-shifted pigments are also inherently more thermally unstable (Luk et al. 2016). Rhodopsin pigments would appear to have an upper spectral sensitivity boundary at 545 nm, even in fishes that have inhabited freshwater for hundreds of millions of years, well short of the optimal value for detecting light in freshwater. The LWS cone opsin is capable of sensing much longer wavelength light than this, which suggests that other factors, such as sensitivity, are preventing rhodopsin from exploring substitutions that would red-shift the pigment this far in freshwater environments. The incorporation of the A2 chromophore, the noisier of the two chromophore types (Ala-Laurila et al. 2007; Luk et al. 2016), might compound these thermal constraints, but specific substitutions may help alleviate some of these thermal issues in species inhabiting warmer waters. Amino acid differences in frog and toad rhodopsin, the former utilizing the A2 chromophore more than the latter, might be responsible for observed differences in thermostability (Fyhrquist et al. 1998). Further functional characterization of specific substitutions identified in this study is necessary to find which substitutions might be improving thermal stability. Interestingly, we do see convergent S299A substitutions, also observed in frogs, in multiple freshwater and transitional lineages, highlighting a useful starting point for these future studies.
It is also possible that the incorporation of the A2 chromophore might provide some mechanism for fishes to deal with anthropogenic turbidification of rivers, but dealing with the associated thermal noise may be challenging in a warming world. Understanding which substitutions facilitate the use of an A2 chromophore in rhodopsin might help explain some of this expected variation in adaptability and sensitivity to environmental variation (Allison et al. 2004).

Recently, increasing attention has been devoted to the tendency of rhodopsin to form track-like oligomeric structures in the outer segments of photoreceptors (Gunkel et al. 2015). This appears to be a relatively rare property in GPCRs (Felce et al. 2017). In opsins, it has been suggested to act as a scaffold to sequester and amplify the activation of downstream signalling components, providing a molecular basis for the single-photon response (Gunkel et al. 2015). Interestingly, the ability of rhodopsin to form dimers is not conserved across other visual pigments. At some point following the recent duplication and neofunctionalization of paralogous LWS and MWS opsins in humans, the ability to form dimers was gained or lost, respectively, depending upon the ancestral state (Jastrzebska et al. 2016). These opsins differ at only a few positions, and MWS can be transformed into a dimer-forming pigment by swapping out residues at just three sites on the helices making up the interface (Jastrzebska et al. 2016). Dimerization in the longer wavelength sensitive copy (LWS) might be required because of the increased thermal noise associated with the red-shifted pigment. How dimerization potential in rhodopsin varies across different species has not been investigated, but many studies, including my investigation of Gymnotiform rhodopsin in Chapter five, find strong evidence for selection at sites that form the dimerization interface of rhodopsin (Liang et al. 2003; Schott et al. 2014; Morrow et al. 2017; Stieb et al. 2017). Future studies should look more closely at variation at these sites across species, but unlike investigations of spectral sensitivity, kinetics, and thermal stability, specific substitutions may not be the primary driving force for adaptation. Instead, sets of substitutions, potentially of small individual effect, could alter the electrostatic or hydrophobic tendencies of the dimerization interface, collectively altering dimerization potential.
Investigations of these properties require more data because of the extensive number of permutations possible, but should become more feasible with machine learning approaches and the wealth of sequence data becoming available.
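
As a rough illustration of how substitutions of small individual effect could collectively shift the hydrophobic character of an interface, the sketch below sums per-site changes on the Kyte-Doolittle hydropathy scale. The substitution list is a placeholder built from substitutions named elsewhere in this chapter, not a curated set of dimerization-interface sites, and the function name and the simple unweighted sum are assumptions made purely for illustration.

```python
# Kyte-Doolittle hydropathy index; higher values are more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_hydropathy_shift(substitutions):
    """Sum hydropathy changes for substitutions written like 'S299A'.

    The first character is the ancestral residue and the last character
    is the derived residue; the site number in between is ignored.
    """
    total = 0.0
    for sub in substitutions:
        ancestral, derived = sub[0], sub[-1]
        total += KYTE_DOOLITTLE[derived] - KYTE_DOOLITTLE[ancestral]
    return total


# Placeholder set (assumption): these substitutions are mentioned in this
# chapter but are not claimed to lie on the dimerization interface.
example_substitutions = ["C165L", "C185A", "S299A"]
shift = net_hydropathy_shift(example_substitutions)
print(f"Net hydropathy shift: {shift:+.1f}")
```

A positive total indicates a net move toward more hydrophobic residues across the set. A realistic analysis would weight sites by solvent exposure and structural context rather than summing them equally.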

6.6. GENERAL CONCLUSIONS AND SIGNIFICANCE

This thesis is the first study to report on the molecular adaptation of rhodopsin in freshwater fishes with marine ancestry. More broadly, it is also the first study to investigate the molecular mechanisms underlying the ancestral functional adaptation associated with marine to freshwater transitions. The relationship between positive selection and depth investigated for marine to freshwater transitions in this thesis represents a rare natural system to test how the strength of selection acts on protein evolution. In the deepest-dwelling clade investigated, the croakers, a large red-shift was observed concomitant with a marine to freshwater transition, resulting in a match of spectral sensitivity with the prevailing wavelengths of light in freshwater environments. Substitutions on this branch and others detected in independent invasions of freshwater would make dark adaptation more efficient, likely improving vision in freshwater fishes frequently traversing the narrow interface between bright light and dark environments. This form of non-spectral adaptation has not been previously explored but is consistent with a growing number of studies suggesting visual adaptation is more complex than substitutions in the binding pocket altering spectral sensitivity. This prospect is also supported by convergent positively selected sites at positions outside the binding pocket on multiple transitional branches, and by epistatic interactions allowing mutations causing disease in humans to persist in some fishes. Taken together, this thesis reveals some of the complexity in adaptations in rhodopsin to different visual environments by expanding these studies to include adaptations to red-shifted freshwater rivers, and by structurally and functionally testing properties of rhodopsin mutations at sites not expected to solely shift spectral sensitivity.

6.7. REFERENCES

Ala-Laurila P, Donner K, Crouch RK, Cornwall MC. 2007. Chromophore switch from 11-cis-dehydroretinal (A2) to 11-cis-retinal (A1) decreases dark noise in salamander red rods. The Journal of Physiology 585:57–74.

Allen DM, McFarland WN. 1973. The effect of temperature on rhodopsin-porphyropsin ratios in a fish. Vision Res. 13:1303–1309.

Allison WT, Haimberger TJ, Hawryshyn CW, Temple SE. 2004. Visual pigment composition in zebrafish: Evidence for a rhodopsin–porphyropsin interchange system. Vis. Neurosci. 21:945–952.

Bai X-C, McMullan G, Scheres SHW. 2015. How cryo-EM is revolutionizing structural biology. Trends Biochem. Sci. 40:49–57.

Barney RL. 1926. The Distribution of the Fresh‐Water Sheepshead, Aplodinotus Grunniens Rafinesque, in Respect to the Glacial History of North America. Ecology 7:351–364.

Beatty DD. 1966. A Study Of The Succession Of Visual Pigments In Pacific Salmon (Oncorhynchus). Can. J. Zool. 44:429–455.

Bloom DD, Lovejoy NR. 2012. Molecular phylogenetics reveals a pattern of biome conservatism in New World anchovies (family Engraulidae). J. Evol. Biol. 25:701–715.

Brawand D, Wagner CE, Li YI, Malinsky M, Keller I, Fan S, Simakov O, Ng AY, Lim ZW, Bezault E, et al. 2014. The genomic substrate for adaptive radiation in African cichlid fish. Nature 513:375–381.

Bridges C. 1964. Periodicity of absorption properties in pigments based on vitamin A2 from fish retinae. Nature 203:303–304.

Burley SK, Berman HM, Christie C, Duarte JM, Feng Z, Westbrook J, Young J, Zardecki C. 2018. RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education. Protein Sci. 27:316–330.

Carleton KL, Dalton BE, Escobar-Camacho D, Nandamuri SP. 2016. Proximate and ultimate causes of variable visual sensitivities: Insights from cichlid fish radiations. Genesis 54:299–325.

Castiglione GM, Chang BS. 2018. Functional trade-offs and environmental variation shaped ancient trajectories in the evolution of dim-light vision. eLife 7:e35957.

Castiglione GM, Hauser FE, Liao BS, Lujan NK, Van Nynatten A, Morrow JM, Schott RK, Bhattacharyya N, Dungan SZ, Chang BSW. 2017. Evolution of nonspectral rhodopsin function at high altitudes. Proc. Natl. Acad. Sci. U.S.A. 114:7385–7390.

Castiglione GM, Schott RK, Hauser FE, Chang BSW. 2018. Convergent selection pressures drive the evolution of rhodopsin kinetics at high altitudes via nonparallel mechanisms. Evolution 72:170–186.

Caves EM, Sutton TT, Johnsen S. 2017. Visual acuity in ray-finned fishes correlates with eye size and habitat. J. Exp. Biol.:jeb–151183.

Chi PB, Kim D, Lai JK, Bykova N, Weber CC, Kubelka J, Liberles DA. 2018. A new parameter‐rich structure‐aware mechanistic model for amino acid substitution during evolution. Proteins: Structure, Function, and Bioinformatics 86:218–228.

Choe H-W, Kim YJ, Park JH, Morizumi T, Pai EF, Krauß N, Hofmann KP, Scheerer P, Ernst OP. 2011. Crystal structure of metarhodopsin II. Nature 471:651–655.

Cortesi F, Musilová Z, Stieb SM, Hart NS, Siebeck UE, Malmstrøm M, Tørresen OK, Jentoft S, Cheney KL, Marshall NJ, et al. 2015. Ancestral duplications and highly dynamic opsin gene evolution in percomorph fishes. Proc. Natl. Acad. Sci. U.S.A. 112:1493–1498.

Costa MPF, Novo EMLM, Telmer KH. 2012. Spatial and temporal variability of light attenuation in large rivers of the Amazon. Hydrobiologia 702:171–190.

Crampton WG. 2007. Diversity and adaptation in deep channel Neotropical electric fishes. In: Fish life in special environments. New Hampshire: Science Publishers, Inc., Enfield. pp. 283–339.

Darwin C. 1859. On the Origin of Species by Means of Natural Selection, Or, The Preservation of Favoured Races in the Struggle for Life.

Deary AL, Metscher B, Brill RW, Hilton EJ. 2016. Shifts of sensory modalities in early life history stage estuarine fishes (Sciaenidae) from the Chesapeake Bay using X-ray micro computed tomography. Environ Biol Fish 99:361–375.

Dror RO, Dirks RM, Grossman JP, Xu H, Shaw DE. 2012. Biomolecular simulation: a computational microscope for molecular biology. Annu Rev Biophys 41:429–452.

Dungan SZ, Chang BSW. 2017. Epistatic interactions influence terrestrial–marine functional shifts in cetacean rhodopsin. Proc. R. Soc. B 284:20162743–20162749.

Enright JM, Toomey MB, Sato S-Y, Temple SE, Allen JR, Fujiwara R, Kramlinger VM, Nagy LD, Johnson KM, Xiao Y, et al. 2015. Cyp27c1 Red-Shifts the Spectral Sensitivity of Photoreceptors by Converting Vitamin A1 into A2. Curr. Biol. 25:3048–3057.

Ernst OP, Lodowski DT, Elstner M, Hegemann P, Brown LS, Kandori H. 2014. Microbial and Animal Rhodopsins: Structures, Functions, and Molecular Mechanisms. Chem. Rev. 114:126–163.

Felce JH, Latty SL, Knox RG, Mattick SR, Lui Y, Lee SF, Klenerman D, Davis SJ. 2017. Receptor Quaternary Organization Explains G Protein-Coupled Receptor Family Structure. Cell Rep. 20:2654–2665.

Fritsches KA, Brill RW, Warrant EJ. 2005. Warm eyes provide superior vision in swordfishes. Curr. Biol. 15:55–58.

Fyhrquist N, Donner K, Hargrave PA, McDowell JH, Popp MP, Smith WC. 1998. Rhodopsins from three frog and toad species: sequences and functional comparisons. Exp. Eye Res. 66:295–305.

Gerrard E, Mutt E, Nagata T, Koyanagi M, Flock T, Lesca E, Schertler GF, Terakita A, Deupi X, Lucas RJ. 2018. Convergent evolution of tertiary structure in rhodopsin visual proteins from vertebrates and box jellyfish. Proc. Natl. Acad. Sci. U.S.A. 115:6201–6206.

Gong LI, Suchard MA, Bloom JD. 2013. Stability-mediated epistasis constrains the evolution of an influenza protein. eLife 2:e00631.

Gray SM, Bieber FME, Mcdonnell LH, Chapman LJ, Mandrak NE. 2014. Experimental evidence for species‐specific response to turbidity in imperilled fishes. Aquatic Conserv: Mar. Freshw. Ecosyst. 24:546–560.

Gunkel M, Schöneberg J, Alkhaldi W, Irsen S, Noé F, Kaupp UB, Al-Amoudi A. 2015. Higher-Order Architecture of Rhodopsin in Intact Photoreceptors and Its Implication for Phototransduction Kinetics. Structure 23:628–638.

Gutierrez EA, Castiglione GM, Morrow JM, Schott RK, Loureiro LO, Lim BK, Chang BSW. 2018. Functional shifts in bat dim-light visual pigment are associated with differing echolocation abilities and reveal molecular adaptation to photic-limited environments. Mol. Biol. Evol.:msy140.

Gutierrez EA, Schott RK, Preston MW, Loureiro LO, Lim BK, Chang BSW. 2018. The role of ecological factors in shaping bat cone opsin evolution. Proc. Biol. Sci. 285:20172835–20172838.

Harrington KA, Hrabik TR, Mensinger AF. 2015. Visual Sensitivity of Deepwater Fishes in . PLoS ONE 10:e0116173–14.

Hauser FE, Ilves KL, Schott RK, Castiglione GM, López-Fernández H, Chang BSW. 2017. Accelerated Evolution and Functional Divergence of the Dim Light Visual Pigment Accompanies Cichlid Colonization of Central America. Mol. Biol. Evol. 34:2650–2664.

Hufbauer RA, Facon B, Ravigne V, Turgeon J, Foucaud J, Lee CE, Rey O, Estoup A. 2012. Anthropogenically induced adaptation to invade (AIAI): contemporary adaptation to human‐altered habitats within the native range can promote invasions. Evolutionary Applications 5:89–101.

Hunt DM, Rawlinson NJF, Thomas GA, Cobcroft JM. 2015. Investigating photoreceptor densities, potential visual acuity, and cone mosaics of shallow water, temperate fish species. Vision Res. 111:13–21.

Hunt DM, Slobodyanyuk SJ, Fitzgibbon J, Bowmaker JK. 1996. Spectral tuning and molecular evolution of rod visual pigments in the species flock of cottoid fish in Lake Baikal. Vision Res. 36:1217–1224.

Jaskolski M, Dauter Z, Wlodawer A. 2014. A brief history of macromolecular crystallography, illustrated by a family tree and its Nobel fruits. The FEBS journal 281:3985–4009.

Jastrzebska B, Comar WD, Kaliszewski MJ, Skinner KC, Torcasio MH, Esway AS, Jin H, Palczewski K, Smith AW. 2016. A G Protein-Coupled Receptor Dimerization Interface in Human Cone Opsins. Biochemistry:acs.biochem.6b00877–39.

Jerlov NG. 1976. Marine Optics. Elsevier Inc

Kang Y, Zhou XE, Gao X, He Y, Liu W, Ishchenko A, Barty A, White TA, Yefanov O, Han GW. 2015. Crystal structure of rhodopsin bound to arrestin by femtosecond X-ray laser. Nature 523:561.

Kiser PD, Palczewski K. 2016. Retinoids and retinal diseases. Annu Rev Vis Sci 2:197–234.

Kojima K, Yamashita T, Imamoto Y, Kusakabe TG, Tsuda M, Shichida Y. 2017. Evolutionary steps involving counterion displacement in a tunicate opsin. Proc. Natl. Acad. Sci. U.S.A.:201701088.

Kondrashev SL, Miyazaki T, Lamash NE, Tsuchiya T. 2012. Three cone opsin genes determine the properties of the visual spectra in the Japanese anchovy Engraulis japonicus (Engraulidae, Teleostei). J. Exp. Biol.:jeb–078980.

Land MF, Nilsson D-E. 2012. Animal Eyes. OUP Oxford

Li D, Zhang J. 2013. Diet shapes the evolution of the vertebrate bitter taste receptor gene repertoire. Mol. Biol. Evol. 31:303–309.

Liang Y, Fotiadis D, Filipek S, Saperstein DA, Palczewski K, Engel A. 2003. Organization of the G protein-coupled receptors rhodopsin and opsin in native membranes. J. Biol. Chem. 278:21655–21662.

Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg Bauer E, Colwell LJ, De Koning AJ, Dokholyan NV, Echave J. 2012. The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci. 21:769–785.

Lin J-J, Wang F-Y, Li W-H, Wang T-Y. 2017. The rises and falls of opsin genes in 59 ray-finned fish genomes and their implications for environmental adaptation. Sci. Rep. 7:1–13.

Liu J, Liu MY, Nguyen JB, Bhagat A, Mooney V, Yan ECY. 2011. Thermal properties of rhodopsin: insight into the molecular mechanism of dim-light vision. J. Biol. Chem. 286:27622–27629.

Lovejoy NR. 2004. Phylogeny and Jaw Ontogeny of Beloniform Fishes. Integr. Comp. Biol. 44:366–377.

López-Fernández H, Winemiller KO, Montaña C, Honeycutt RL. 2012. Diet-morphology correlations in the radiation of South American geophagine cichlids (Perciformes: Cichlidae: Cichlinae). PLoS ONE 7:e33997.

Luk HL, Bhattacharyya N, Montisci F, Morrow JM, Melaccio F, Wada A, Sheves M, Fanelli F, Chang BSW, Olivucci M. 2016. Modulation of thermal noise and spectral sensitivity in Lake Baikal cottoid fish rhodopsins. Sci. Rep. 6:1–9.

Lunt J, Smee DL. 2015. Turbidity interferes with foraging success of visual but not chemosensory predators. PeerJ 3:e1212–e1212.

Lythgoe JN. 1979. The Ecology of Vision. Clarendon Press

Marques DA, Taylor JS, Jones FC, Di Palma F, Kingsley DM, Reimchen TE. 2017. Convergent evolution of SWS2 opsin facilitates adaptive radiation of threespine stickleback into different light environments. PLoS Biol 15:e2001627–24.

Maynard Smith J. 1970. Natural selection and the concept of a protein space. Nature 225:563.

McKibbin C, Toye AM, Reeves PJ, Khorana HG, Edwards PC, Villa C, Booth PJ. 2007. Opsin Stability and Folding: The Role of Cys185 and Abnormal Disulfide Bond Formation in the Intradiscal Domain. J. Mol. Biol. 374:1309–1318.

Morrow JM, Chang BSW. 2015. Comparative Mutagenesis Studies of Retinal Release in Light-Activated Zebrafish Rhodopsin Using Fluorescence Spectroscopy. Biochemistry 54:4507–4518.

Morrow JM, Lazic S, Dixon Fox M, Kuo C, Schott RK, de A Gutierrez E, Santini F, Tropepe V, Chang BSW. 2017. A second visual rhodopsin gene, rh1-2, is expressed in zebrafish photoreceptors and found in other ray-finned fishes. J. Exp. Biol. 220:294–303.

Morshedian A, Toomey MB, Pollock GE, Frederiksen R, Enright JM, McCormick SD, Cornwall MC, Fain GL, Corbo JC. 2017. Cambrian origin of the CYP27C1-mediated vitamin A1-to-A2 switch, a key mechanism of vertebrate sensory plasticity. R. Soc. open sci. 4:170362–170369.

Musilová Z, Cortesi F, Matschiner M, Davies WIL, Stieb SM, de Busserolles F, Malmstroem M, Toerresen OK, Mountford JK, Hanel R, et al. 2018. Vision using multiple distinct rod opsins in deep-sea fishes. bioRxiv:424895.

Nakamura Y, Yasuike M, Mekuchi M, Iwasaki Y, Ojima N, Fujiwara A, Chow S, Saitoh K. 2017. Rhodopsin gene copies in Japanese eel originated in a teleost-specific genome duplication. Zoological Lett 3:1–12.

Novales-Flamarique H, Hawryshyn C. 1994. Ultraviolet Photoreception Contributes to Prey Search Behaviour in Two Species of Zooplanktivorous Fishes. J. Exp. Biol. 186:187.

O'Quin KE, Hofmann CM, Hofmann HA, Carleton KL. 2010. Parallel evolution of opsin gene expression in African cichlid fishes. Mol. Biol. Evol. 27:2839–2854.

Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, et al. 2000. Crystal Structure of Rhodopsin: A G Protein-Coupled Receptor. Science 289:739–745.

Parry JWL, Bowmaker JK. 2000. Visual pigment reconstitution in intact goldfish retina using synthetic retinaldehyde isomers. Vision Res. 40:2241–2247.

Sandkam B, Dalton B, Breden F, Carleton K. 2018. Reviewing guppy color vision: integrating the molecular and physiological variation in visual tuning of a classic system for sensory drive. Curr. Zool. 38:1–11.

Santos AFGN, García-Berthou E, Hayashi C, Santos LN. 2018. Water turbidity increases biotic resistance of native Neotropical piscivores to alien fish. Hydrobiologia 817:293–305.

Schafer CT, Farrens DL. 2015. Conformational selection and equilibrium governs the ability of retinals to bind opsin. J. Biol. Chem. 290:4304–4318.

Schafer CT, Fay JF, Janz JM, Farrens DL. 2016. Decay of an active GPCR: Conformational dynamics govern agonist rebinding and persistence of an active, yet empty, receptor state. Proc. Natl. Acad. Sci. U.S.A. 113:11961–11966.

Schott RK, Refvik SP, Hauser FE, López-Fernández H, Chang BSW. 2014. Divergent positive selection in rhodopsin from lake and riverine cichlid fishes. Mol. Biol. Evol. 31:1149–1165.

Schwanzara SA. 1967. The visual pigments of freshwater fishes. Vision Res. 7:121–148.

Sommer ME, Hofmann KP, Heck M. 2014. Not just signal shutoff: the protective role of arrestin-1 in rod cells. :101–116.

Stieb SM, Cortesi F, Sueess L, Carleton KL, Salzburger W, Marshall NJ. 2017. Why UV vision and red vision are important for damselfish (): structural and expression variation in opsin genes. Mol Ecol 26:1323–1342.

Sugawara T, Imai H, Nikaido M, Imamoto Y, Okada N. 2010. Vertebrate Rhodopsin Adaptation to Dim Light via Rapid Meta-II Intermediate Formation. Mol. Biol. Evol. 27:506–519.

Torres-Dowdall J, Pierotti MER, Härer A, Karagic N, Woltering JM, Henning F, Elmer KR, Meyer A. 2017. Rapid and Parallel Adaptive Evolution of the Visual System of Neotropical Midas Cichlid Fishes. Mol. Biol. Evol. 34:2469–2485.

Toyama M, Hironaka M, Yamahama Y, Horiguchi H, Tsukada O, Uto N, Ueno Y, Tokunaga F, Seno K, Hariyama T. 2008. Presence of Rhodopsin and Porphyropsin in the Eyes of 164 Fishes, Representing Marine, Diadromous, Coastal and Freshwater Species—A Qualitative and Comparative Study. Photochem. Photobiol. 84:996–1002.

van der Sluijs I, Gray SM, Amorim MCP, Barber I, Candolin U, Hendry AP, Krahe R, Maan ME, Utne-Palm AC, Wagner H-J, et al. 2010. Communication in troubled waters: responses of fish communication systems to changing environments. Evol Ecol 25:623–640.

Wolf S, Grünewald S. 2015. Sequence, structure and ligand binding evolution of rhodopsin-like G protein-coupled receptors: a crystal structure-based phylogenetic analysis. PLoS ONE 10:e0123533.

Xu J, Zhang J. 2014. Why human disease-associated residues appear as the wild-type in other species: genome-scale structural evidence for the compensation hypothesis. Mol. Biol. Evol. 31:1787–1792.

Xu T, Xu G, Che R, Wang R, Wang Y, Li J, Wang S, Shu C, Sun Y, Liu T. 2016. The genome of the miiuy croaker reveals well-developed innate immune and sensory systems. Sci. Rep. 6:21902.

Yamazaki Y, Nagata T, Terakita A, Kandori H, Shichida Y, Imamoto Y. 2014. Intramolecular Interactions That Induce Helical Rearrangement upon Rhodopsin Activation. J. Biol. Chem. 289:13792–13800.

Yokoyama S, Tada T, Zhang H, Britt L. 2008. Elucidation of phenotypic adaptations: Molecular analyses of dim-light vision proteins in vertebrates. Proc. Natl. Acad. Sci. U.S.A. 105:13480–13485.

Yue WWS, Frederiksen R, Ren X, Luo D-G, Yamashita T, Shichida Y, Cornwall MC, Yau K-W. 2017. Spontaneous activation of visual pigments in relation to openness/closedness of chromophore-binding pocket. eLife 6:e18492.
