Evolution and Molecular Mechanisms of Photoreceptor Transmutation in Reptiles

by

Ryan K Schott

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Ecology and Evolutionary Biology University of Toronto

© Copyright by Ryan K Schott 2018


Abstract

Natural light levels vary drastically, and to deal with this variance, vertebrates typically utilize a duplex retina that contains rod photoreceptors for dim-light vision and cone photoreceptors for bright-light vision. Squamate reptiles, however, are unique in the predominance of simplex retinas that contain only rods or cones. Evolutionary transitions between rods and cones, termed photoreceptor transmutations, have been proposed to explain the evolution of these simplex retinas, but little previous work has focused on the molecular evolutionary underpinnings of the observed morphological changes. The goal of this thesis is to expand knowledge of the evolution and molecular mechanisms of photoreceptor transmutation. In the first study, I provide strong support for the hypothesis that the morphologically all-cone retina of diurnal colubrid snakes evolved through transmutation of the rods to resemble the appearance, and function, of cones. In the second, I developed a new method of targeted sequence capture that enables efficient sequencing of complete coding regions across divergent taxa, which further provided data for the final two studies. In the third, I analyzed the effect that photoreceptor transmutation and snake origins had on phototransduction evolution. I found results consistent with a strong effect of transmutation, including positive selection on cone-specific genes that may indicate adaptation during the evolution of rod-like cones. Furthermore, the low degree of gene loss in snakes, and a lack of relaxed selection early during their evolution, support a dim-light ancestor that lacked strong fossorial adaptations. In the final study, I used whole eye transcriptome sequencing to demonstrate that geckos do not utilize only cone phototransduction machinery as previously thought, and instead appear to co-express both rod and cone genes in cone photoreceptors. I also expand upon the third study to show that geckos experienced a shift in selective pressures similar to that in snakes, which is also associated with transmutation. As a whole, this thesis provides the first molecular evidence for photoreceptor transmutation in snakes, produces a new methodology for efficiently producing sequence data relevant for molecular evolutionary studies, revises our view of transmutation in geckos, and provides the first evidence for molecular changes associated with photoreceptor transmutation.


Acknowledgments

I am indebted to all the individuals who have given me the supervision, guidance, and assistance that have made this thesis possible. I am especially thankful to my supervisor, Belinda Chang, for encouraging me to study the visual system despite my initial desire to work on a completely untenable project involving the genetics of horned lizard horns. Belinda allowed me to develop my own research projects and provided me with valuable opportunities to collaborate that have contributed substantially to my academic development. Without her support and guidance this would not have been possible. I am also grateful to my supervisory committee, David Evans, Jennifer Mitchell, and Zhaolei Zhang, for their insight into my various projects. David deserves special thanks: he began as my undergraduate and then master’s supervisor, and it is with his guidance and support that I have been able to come this far. I am thankful to my external examiners, Stephen Wright, Santiago Claramunt, and David Gower, for their valuable comments and feedback on my work. Special thanks to the members of the Chang Lab, especially Frances Hauser, Gianni Castiglione, and Nihar Bhattacharyya, for discussions and assistance with many aspects of this thesis, not to mention the many hours of therapeutic ranting.

Finally, I would like to thank my family, especially my wife Alicia, for their love, support, and patience throughout my academic pursuits, despite the fact that my wife jokingly (I hope!) hates science. And to my daughter Inara, who was a great distraction in getting this thing finally finished, but one I wouldn’t trade for the world.


Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 General Introduction
  1.1 The Eye
  1.2 The Retina
  1.3 Photoreceptors
  1.4 Phototransduction
  1.5 Visual Cycle
  1.6 Visual Pigments
  1.7 The Duplex Retina and Diurnal and Nocturnal Vision
  1.8 Photoreceptor Transmutation
  1.9 Visual System of Geckos
  1.10 Visual System of Snakes
  1.11 Overview
  1.12 References
Chapter 2 Evolutionary transformation of rod photoreceptors in the all-cone retina of a diurnal garter snake
  2.1 Abstract
  2.2 Significance
  2.3 Introduction
  2.4 Results
    2.4.1 Thamnophis proximus has an ‘all-cone’ retina
    2.4.2 Thamnophis proximus possesses three visual pigments
    2.4.3 RH1, LWS, and SWS1 expressed in Thamnophis proximus eye RNA and RH1 maintained under normal selective pressures
    2.4.4 Thamnophis proximus is functional with a highly blue-shifted λmax
    2.4.5 Rhodopsin and rod transducin are expressed in ‘cone’ photoreceptor cells
    2.4.6 A subset of small single ‘cones’ have rod ultrastructure
  2.5 Discussion
  2.6 Materials and Methods
    2.6.1 Animals
    2.6.2 Microspectrophotometry
    2.6.3 Phylogenetic and molecular evolutionary analyses
    2.6.4 Rhodopsin expression and spectroscopic assay
    2.6.5 Immunohistochemistry
    2.6.6 Electron microscopy
  2.7 Acknowledgements
  2.8 References
  2.9 Supplementary Results
    2.9.1 Thamnophis proximus possesses three visual pigments
    2.9.2 Rhodopsin and rod transducin are expressed in ‘cone’ photoreceptor cells
  2.10 Supplementary Materials and Methods
    2.10.1 Animals
    2.10.2 Microspectrophotometry
    2.10.3 isolation and sequencing
    2.10.4 Phylogenetic and molecular evolutionary analyses
    2.10.5 Rhodopsin expression and spectroscopic assay
    2.10.6 Immunohistochemistry
    2.10.7 Electron microscopy
  2.11 Supplementary Figures
  2.12 Supplementary Tables
Chapter 3 Targeted capture of complete coding regions across divergent species
  3.1 Abstract
  3.2 Introduction
  3.3 Results and Discussion
    3.3.1 Reference sequences have a large effect on cross-species guided assembly
    3.3.2 Different assemblers performed best on similar and divergent reads
    3.3.3 Increased probe diversity and tiling substantially increase gene recovery and completeness
    3.3.4 Short exons had only a small effect on completeness of recovered genes
    3.3.5 Incomplete and erroneous probe sequences caused substantial reductions in gene completeness
    3.3.6 Completeness of recovered genes decreased with increasing sequence divergence
    3.3.7 Targeted capture performed similarly or better, and cost as little or less, than RNA-Seq and WGS
    3.3.8 Captured phylogenetic markers produced an accurate species tree
  3.4 Conclusions
  3.5 Materials and Methods
    3.5.1 Probe Design
    3.5.2 Sample Preparation and Sequencing
    3.5.3 Reference File Creation for Guided Assembly
    3.5.4 Assembly and Analysis Pipeline
    3.5.5 Method Analysis and Evaluation
    3.5.6 Phylogenetic Analysis
  3.6 Data Availability
  3.7 Acknowledgements
  3.8 References
  3.9 Supplementary Tables
Chapter 4 Shifts in selective pressures on snake phototransduction genes associated with photoreceptor transmutation and dim-light ancestry
  4.1 Abstract
  4.2 Introduction
  4.3 Results
    4.3.1 Loss of (GRK1) in Snakes
    4.3.2 Distinct Selection Pressures on Snake Phototransduction Genes
    4.3.3 Long-term Shifts in Selection Pressures on Caenophidian Snake Phototransduction Genes
    4.3.4 Positive Selection in Caenophidians Primarily in Cone-specific Phototransduction Genes
    4.3.5 No Evidence for a Relaxation of Constraint on the Branch Leading to Snakes
  4.4 Discussion
  4.5 Conclusions
  4.6 Methods
    4.6.1 Animals
    4.6.2 Transcriptome Sequencing
    4.6.3 Visual Transduction Gene Datasets
    4.6.4 Molecular Evolutionary Analyses
  4.7 Acknowledgements
  4.8 References
  4.9 Supplementary Figures
  4.10 Supplementary Tables
  4.11 Supplementary Files
Chapter 5 Gene loss and divergent selection in gecko visual transduction genes
  5.1 Abstract
  5.2 Introduction
  5.3 Results
    5.3.1 Geckos still possess and express several rod transduction phototransduction genes
    5.3.2 Divergent, Elevated Selection in Gecko Visual Transduction Genes
    5.3.3 Geckos and snakes have experienced similar divergent selective pressures that are associated with photoreceptor transmutation
  5.4 Discussion
  5.5 Conclusions
  5.6 Methods
    5.6.1 Animals
    5.6.2 Transcriptome Sequencing
    5.6.3 Visual Transduction Gene Datasets
    5.6.4 Expression Analyses
    5.6.5 Molecular Evolutionary Analyses
  5.7 References
  5.8 Supplementary Figure
Chapter 6 Conclusions
  6.1 Summary and Conclusions
  6.2 Future Directions
  6.3 References


List of Tables

Table 1.1. Major components of the vertebrate visual phototransduction cascade found in rod and cone photoreceptors.
Table S2.1. Estimates of peak absorbance (λmax) from individual photoreceptor cells for each of the different cone types as measured by microspectrophotometry (MSP).
Table S2.2. Results of analyses of selection of rhodopsin using PAML random sites, branch, and clade models.
Table S2.3. Primers used for isolation of Thamnophis proximus
Table 3.1. Comparison of pros and cons of different high-throughput sequencing strategies.
Table 3.2. Comparison of BWA assembly with the Anolis and Snake references.
Table 3.3. Comparison of BWA assembly with the Anolis and Gekko references.
Table 3.4. Comparison of average completeness of recovered coding regions obtained using different assemblers.
Table 3.5. Comparison of the performance of the assembly and annotation pipeline on RNA-Seq and whole genome data with the targeted capture approach.
Table 3.6. Cost comparison of targeted capture, RNA-Seq, and whole genome sequencing experiments.
Table S3.1. Detailed information on the 166 targeted genes and the probes designed based on them. Genes that did not have an Anolis probe are highlighted in orange.
Table S3.2. List of the 16 squamate reptiles that were sequenced in order of phylogenetic divergence from Anolis.
Table S3.3. Assembly statistics using each of the four different assemblers against each of the three reference sets.
Table S3.4. Recovery and completeness of individual genes for each of the assembly methods for the 16 species.
Table S3.5. Average gene recovery for each species using each of the different assembly methods and references.
Table S3.6. Average gene completeness for each species using each of the different assembly methods and references.
Table S3.7. Completeness of opsin genes using an additional reference, demonstrating increased completeness as a result of increased probe diversity.
Table S3.8. Differences in completeness between genes with only long exons compared to those that also had short exons.
Table S3.9. Pairwise sequence identity of the 16 species compared to Anolis for a set of independent genes compared to average completeness levels for those species.
Table S3.10. Comparison of pairwise sequence identity between Anolis and Gekko for those genes present (and complete) in both species to the recovered completeness of the Gekko hybrid enrichment data assembled using BWA and the Gekko reference.
Table S3.11. Comparison of targeted capture and RNA-Seq data for Thamnophis sirtalis.
Table S3.12. Test of targeted capture assembly pipeline on and whole genome sequence data of varying coverage.
Table S3.13. Detailed cost breakdown of targeted capture, RNA-Seq, and whole genome sequencing experiments.
Table S3.14. List of the 23 phylogenetic genes captured and their respective average and minimum completeness values.
Table 4.1. Major components of the vertebrate visual phototransduction cascade and their presence or absence in snakes and other reptile groups.
Table 4.2. Summary of the selection analyses performed on the reptile and snake datasets using the random sites models to test for pervasive positive selection and the clade models to test for divergent (and positive) selection.
Table S4.1. Comparison of selective constraint between snakes and other reptiles, and between caenophidian snakes and other snakes for each of the analyzed genes.
Table S4.2. Summary of the selection analyses performed on the branch leading to snakes, and to caenophidian snakes, using the branch, branch-site, and CmC models.
Table 5.1. Major components of the vertebrate visual phototransduction cascade and their presence or absence in geckos and other reptile groups.
Table 5.2. Summary of the selection analyses.


List of Figures

Figure 1.1. Basic anatomy of a vertebrate eye.
Figure 1.2. Neuronal organization of a vertebrate retina.
Figure 1.3. Basic vertebrate photoreceptor anatomy.
Figure 1.4. Generalized schematic of vertebrate rod phototransduction cascade.
Figure 1.5. Representative ancestral complement of vertebrate photoreceptors and visual pigments based on those present in birds.
Figure 1.6. Walls’ view of photoreceptor transmutation in geckos (A) and snakes (B) depicting the transition from all-cone to all-rod and back to all-cone again in geckos and from duplex to all-cone to all-rod in snakes.
Figure 1.7. Schematic view of photoreceptor transmutation in geckos.
Figure 1.8. Schematic phylogeny of snakes illustrating the three major groups and the hypothesized position where photoreceptor transmutation begins to occur.
Figure 1.9. Photoreceptors and visual pigment absorption spectra of Python regius (A) and Thamnophis sirtalis (B).
Figure 2.1. Illustration of evolutionary pathways for two alternative hypotheses for the evolution of an all-cone retina from a duplex ancestor in diurnal colubrids.
Figure 2.2. Light and scanning electron microscopy of Thamnophis proximus retina.
Figure 2.3. Normalized absorbance spectra of (A) middle-wavelength visual pigment from intact photoreceptor cells measured by MSP and (B) in vitro expressed rhodopsin (RH1) from Thamnophis proximus.
Figure 2.4. Immunohistochemical staining of control (mouse, A–D) and Thamnophis proximus (E–K) transverse retinal cryosections with rhodopsin (4D2) and rod-specific-transducin (K20) antibodies.
Figure 2.5. Transmission electron microscope (TEM) image of the outer segment of a Thamnophis proximus photoreceptor cell with rod ultrastructure.
Figure 2.6. Absorption spectra of Python (A) and Thamnophis proximus (B).
Figure S2.1. Scanning electron microscope images of Thamnophis proximus retina at increasing magnifications illustrating the all-cone photoreceptor population.
Figure S2.2. Normalized visual pigment absorbance spectra measured using microspectrophotometry (MSP) on intact photoreceptor cells from the long- (A) and short- (B) wavelength visual pigments of Thamnophis proximus.
Figure S2.3. Rhodopsin gene tree estimated using Bayesian inference illustrating the position of Thamnophis proximus RH1.
Figure S2.4. LWS gene tree estimated using Bayesian inference illustrating the position of Thamnophis proximus.
Figure S2.5. SWS1 gene tree estimated using Bayesian inference illustrating the position of Thamnophis proximus.
Figure S2.6. Dark absorption spectrum of in vitro expressed Thamnophis proximus rhodopsin.
Figure S2.7. Transmission electron microscope (TEM) images of Thamnophis proximus photoreceptor cells.
Figure 3.1. Cross-species hybrid capture methods.
Figure 3.2. Species relationships of the 16 species sequenced and the enrichment, percent of genes recovered, and the average completeness of those genes that were recovered.
Figure 3.3. Analyses of the effect of divergence on completeness of recovered coding regions.
Figure 3.4. Analyses of the effect of divergence on completeness when removing the effect of cross-species assembly.
Figure 3.5. Bayesian multigene phylogeny of the 16 species with two outgroups.
Figure 4.1. Schematic illustration of major snake retina types.
Figure 4.2. Generalized schematic of vertebrate rod phototransduction cascade.
Figure 4.3. Partitions used to analyze shifts in selective constraint in snakes relative to other reptiles.
Figure 4.4. Tests for shifts in selective pressures on phototransduction genes.
Figure 4.5. Tests for shifts in selective pressures on phototransduction genes between caenophidian and non-caenophidian snakes (Fig. S4.3).
Figure S4.1. Species topology and representative taxon sampling used for the selection analyses on visual transduction genes.
Figure S4.2. Snake species tree topology used for SWS1 to illustrate the expanded taxon sampling available for the visual opsin genes.
Figure S4.3. Additional partitioning schemes.
Figure S4.4. Tests for shifts in selective pressures on phototransduction genes between reptiles and snakes, and snakes and caenophidian snakes (Fig. 4.3, Fig. S4.3).
Figure 5.1. Schematic view of photoreceptor transmutation in geckos.
Figure 5.2. Relative expression levels (TPM) of phototransduction genes in Gekko gecko eye.
Figure 5.3. Comparison of relative expression levels (TPM) of rod phototransduction genes between the diurnal anole (Anolis) and nocturnal gecko (Gekko).
Figure 5.4. Clade partitions used to test the hypothesis that geckos and snakes experienced similar divergent selective pressures on phototransduction genes as a result of photoreceptor transmutation.
Figure 5.5. Analysis of divergent selection on visual transduction genes between reptiles and geckos, and reptiles and geckos + snakes.
Figure 6.1. Multiple sequence alignment of bovine rhodopsin (Bos RH1) with squamate visual opsins, highlighting the area (black box) that the 4D2 antibody binds.


Chapter 1 General Introduction

It is through sensory systems that animals are able to detect and respond to the environments they inhabit. Animals have evolved a wide array of different systems for this purpose, but vision is perhaps one of the most important. While many organisms are sensitive to light, spatial (image) vision is restricted to animals (Land 2005). Vision in animals evolved well before the emergence of vertebrates and prior to the Cambrian explosion (Land 2005). Here I briefly summarize the major components and processes of the visual system in vertebrates.

1.1 The Eye

The eye has always been of special importance to biologists due to its highly complex structure and function that upon initial consideration could seem to defy the process of evolution. This was freely admitted by Darwin (1859) in his famous quote: "to suppose that the eye, with all its inimitable contrivances . . . could have been formed by natural selection, seems, I freely confess, absurd in the highest possible degree". However, Darwin (1859) also wrote that:

"if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possessor, can be shown to exist, and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperable by our imagination, can hardly be considered real."

It is only recently, and through the integration of research across many fields, that such gradations as described by Darwin have become apparent (for reviews see Lamb et al. 2007; Fernald 2009; Lamb 2013; Nilsson 2013). While there are several distinct types of eyes, all eyes appear to share a common developmental basis with transcription factors such as PAX6 and RAX, which control patterning and development of the eye and related parts of the brain (Lamb et al. 2007), as well as sharing a common origin for the photoreceptors and visual pigments utilized to absorb light (Lamb 2013). The vertebrate camera-style eye appears to have emerged prior to the divergence of lampreys from jawed vertebrates (Lamb 2013).

The typical vertebrate eye consists of the cornea, iris, ciliary body, pupil, lens, retina, retinal pigment epithelium, choroid, and sclera (Fig. 1.1; Walls 1942). The cornea and sclera make up the outer case of the eye, together forming the fibrous tunic (Walls 1942). The sclera mainly functions to maintain the shape of the eye and is what gives the eye its white colour (Walls 1942). The cornea, which covers the anterior portion of the eye, provides a transparent layer for light to pass through, but is also the primary structural barrier to the eye and serves to refract light, along with the lens, to focus it on the retina (Walls 1942; DelMonte and Kim 2011). The iris is a circular structure located in the anterior portion of the eye behind the cornea and functions to control the size of the pupil, which is the aperture through which light passes (Kolb 2012). Controlling the size and shape of the pupil is one of the main ways that vertebrates can adapt to different light intensities (Walls 1942; Kolb 2012). The lens is located behind the iris and, like the cornea, provides refractive power, projecting an inverted image onto the retina (Walls 1942; Kolb 2012). The ciliary body contains the ciliary muscles that allow the eye to accommodate (that is, focus on objects at varying distances) by adjusting either the shape of the lens or the distance of the lens from the retina, depending on the vertebrate group in question (Walls 1942; Kolb 2012). The ciliary body also contains the ciliary processes, which are formed by the inward folding of the choroid and provide nutrients to the anterior portion of the eye (Walls 1942; Kolb 2012). The choroid is a vascular layer that lies between the sclera and the retinal pigment epithelium and provides the blood supply to most of the eye (Walls 1942; Kolb 2012). It is also highly pigmented, absorbing light that is not absorbed by the retina, thus preventing reflection within the eye (Walls 1942; Kolb 2012). Like the choroid, the retinal pigment epithelium, which lies between the choroid and retina, is highly pigmented and absorbs excess light (Walls 1942; Kolb 2012). This tissue provides the majority of the blood supply necessary to maintain vision and it is within this tissue that most of the retinoid cycle of vision takes place (see Section 1.5; Walls 1942; Lamb and Pugh 2004; Kolb 2012). The retina is the neural tissue at the back of the eye that absorbs the light used for vision.

Figure 1.1. Basic anatomy of a vertebrate eye. From Kolb (2012).


1.2 The Retina

The vertebrate retina is a thin layer of neural tissue that consists of three cellular (nuclear) layers (outer nuclear layer, inner nuclear layer, and ganglion cell layer) and two synaptic (plexiform) layers (outer plexiform and inner plexiform layers; Fig. 1.2; Dowling 2009; Gregg et al. 2013). The ganglion cell layer is the layer closest to the interior surface of the eye and contains the ganglion cells, as well as displaced amacrine cells (Dowling 2009; Gregg et al. 2013). The next nuclear layer is the inner nuclear layer, which contains the horizontal, bipolar, amacrine, and Müller cells, followed by the outer nuclear layer, which contains the photoreceptor cells (Dowling 2009; Gregg et al. 2013). The two synaptic layers lie between the ganglion cell and inner nuclear layers (inner plexiform layer) and between the inner and outer nuclear layers (outer plexiform layer; Dowling 2009; Gregg et al. 2013). The inner plexiform layer contains the synaptic connections of the ganglion and amacrine cells with the bipolar cells, while the outer plexiform layer contains the synaptic connections of the bipolar and horizontal cells with the photoreceptor cells (Dowling 2009; Gregg et al. 2013).

The photoreceptors are the cells responsible for the absorption of light and the consequent visual transduction cascade that produces the electrical signal that ultimately results in vision. The bipolar and horizontal cells are the second-order neurons of the retina. The bipolar cells are output neurons and pass information to the inner retina. Horizontal cells are interneurons that extend laterally and are extensively coupled (Gregg et al. 2013). Similarly, ganglion and amacrine cells are the output neurons and interneurons of the inner retina, respectively. There are multiple types of each of these neurons that have distinct functions (for a review see Dowling 2009; Gregg et al. 2013). Visual information is processed in both the outer and inner plexiform layers: the outer layer deals with spatial analyses and has separate ON and OFF channels, whereas the inner layer processes temporal changes in visual stimuli (e.g., movement detection; Dowling 2009). Additional processing also takes place within higher visual centres of the brain (Dowling 2009).

Figure 1.2. Neuronal organization of a vertebrate retina. From Palczewski (2012).


1.3 Photoreceptors

Vertebrate photoreceptors are derived from cilia, as opposed to photoreceptors derived from microvilli found in most other animals (Fain et al. 2010). Photoreceptor cells consist of the outer segment, which contains the light-sensitive visual pigments and most of the phototransduction machinery; the inner segment, which contains the Golgi, ER, and mitochondria; the cell body, which contains the nucleus; and the synaptic terminal, which releases the neurotransmitter glutamate onto the bipolar and horizontal cells (Fig. 1.3; Chen and Sampath 2013). In vertebrates there are two types of photoreceptor cells: rods and cones. Rods mediate dim-light vision and cones mediate bright-light and colour vision. This forms the basis of the duplex theory of vision originally proposed by Schultze (1866) (Ebrey and Koutalos 2001). Although some vertebrates have simplex retinas (all-rod or all-cone), this is relatively rare.

The rod and cone photoreceptor cells differ in their morphology, physiology, and molecular components, although historically photoreceptors were often identified based solely on their morphology. Rods are generally larger with cylindrical outer segments composed of a series of stacked discs that are isolated from both the external environment and the inner segment by the plasma membrane. In contrast, cones have tapering outer segments that are open to the external environment throughout their length (Lamb 2013). This results in a very different surface area to volume ratio in rods and cones (Ebrey and Koutalos 2001), as well as allowing diffusion of newly synthesized proteins throughout the outer segment in cones, but restricting them to their discs in rods (Young 1976; Lamb 2013). Another morphological difference between rods and cones is in the synaptic terminals, where cones have large pedicles and rods smaller spherules (Lamb 2013). Cones may also contain oil droplets, which are not found in rods. Oil droplets are organelles located in the distal region of the inner segment in front of the outer segment (Bowmaker 2008). They are composed of lipids and a variable concentration of carotenoids, and so can vary from being colourless to red. Oil droplets act as long-pass filters and microlenses, which are thought to improve colour vision and cone sensitivity, respectively (Stavenga and Wilts 2014; Toomey et al. 2016). Oil droplets appear to have been lost in various vertebrate groups, including snakes and most geckos (although diurnal geckos tend to have colourless oil droplets; Bowmaker 2008).

While rods and cones respond to light in largely the same way, they differ substantially in some respects. Rods are much more sensitive and produce much less noise than cones, giving them the ability to respond to a single photon of light (Lamb 2013). Cones, however, have much faster response and recovery kinetics, can respond over a much wider range of intensities than rods, and never saturate, even in very bright light (Lamb 2010, 2013).

The extent to which the morphological differences of rods and cones contribute to their differences in physiology is not well understood. Morshedian and Fain (2015) found that the enclosed discs of rods were not necessary for single photon response in lampreys, and instead suggested that enclosed discs may contribute to more efficient outer segment renewal. In rods (and cones) new discs are synthesized at the base of the outer segment and old discs are shed from the distal end, which is thought to offset the optical and metabolic demands placed on photoreceptors (Jonnal et al. 2010). However, Lamb et al. (1981) demonstrated that the enclosed discs of rods contribute to increased sensitivity by increasing the longitudinal spread of the cytoplasmic messenger cGMP (Lamb 2013). In contrast, the open discs of cones increase the surface-to-volume ratio, which contributes to the rapid response kinetics (Lamb 2013). Furthermore, recent theoretical work by Hárosi and Novales Flamarique (2012) has suggested that the tapering morphology of the outer segment in cones may help to reduce self-screening of the visual pigments, increase signal-to-noise ratios, and allow light to be focused more efficiently on the outer segment by the ellipsoid.


Figure 1.3. Basic vertebrate photoreceptor anatomy. Illustration from Chen and Sampath (2013).


1.4 Phototransduction

Visual phototransduction is the process by which light is converted to an electrical signal in the photoreceptor cells. In the dark, a steady inward current, referred to as the dark current, holds the photoreceptor cell at a depolarized membrane potential of about -30 to -40 mV. This current is maintained by the action of cyclic nucleotide-gated (CNG) cation channels, which move Na+ and Ca2+ into the cell, and the light-insensitive Na+/Ca2+-K+ exchanger, which moves Ca2+ and K+ out of the cell in exchange for Na+. The depolarizing dark current activates voltage-gated L-type calcium channels at the photoreceptor synapse, which bring Ca2+ into the synapse and result in continuous release of the neurotransmitter glutamate (for review see Lagnado and Schmitz 2015).

Vision is initiated by the absorption of light by the visual pigments contained in the disc membrane of the outer segments of the photoreceptor cells (Fig. 1.4; for detailed reviews see Wensel 2008; Hurley 2009; Fain et al. 2010; Lamb 2013). Visual pigments are composed of an opsin covalently bound to a light-absorbing chromophore (11-cis retinal, often referred to as A1; or in some species 11-cis-3,4-dehydroretinal, referred to as A2). Absorption of light induces a cis-trans isomerization that converts the chromophore from the 11-cis to the all-trans form. This causes a conformational change in the opsin protein that enables it to bind to and activate the G-protein, transducin. Transducin is composed of three subunits, α, β, and γ. In its inactive state the α-subunit binds GDP, but upon binding the activated visual pigment, GDP is exchanged for GTP. This causes the α-subunit to dissociate from the βγ-subunits. The α-subunit then binds the inhibitory γ-subunit of phosphodiesterase (PDE), activating the catalytic α/β-subunits of PDE, which hydrolyze the second messenger cGMP. The reduction in cGMP levels causes the CNG channels to close, resulting in hyperpolarization of the photoreceptor cell. Hyperpolarization causes reduced activity of voltage-gated Ca2+ channels at the synaptic terminal, resulting in slowed (or halted) glutamate release onto the synapse. This results in a signal being sent through the retina, where it is further processed and eventually sent to the brain, resulting in vision. The photoreceptor cell then needs to be reset to its depolarized, dark state through a series of deactivation steps, many of which are regulated by negative feedback from decreased Ca2+ levels.

In the dark, when Ca2+ levels are high, the Ca2+-binding protein recoverin binds G-protein-coupled receptor kinase (GRK); however, the light-induced decrease in Ca2+ results in dissociation of GRK from recoverin. GRK preferentially phosphorylates activated visual pigment, partially reducing its ability to activate transducin. This ability is abolished by the binding of arrestin, which has a much higher affinity for phosphorylated, active visual pigment.

Transducin is deactivated by hydrolysis of its bound GTP, which is catalyzed by the binding of the regulator of G-protein signaling (RGS9) complex. Hydrolysis of GTP causes transducin to dissociate from the PDE γ-subunit, which again binds the α/β PDE subunits inhibiting their activity. Lowered Ca2+ concentration also results in activation of guanylate cyclase activating proteins (GCAPs) through replacement of Ca2+ with Mg2+. The GCAPs activate guanylate cyclases, which synthesize cGMP. Increasing cGMP concentration reopens the CNGs, while at the same time the reduced Ca2+ concentration increases the affinity of CNGs for cGMP, thus restoring the Ca2+ concentration and depolarizing current, and in turn deactivating the GCAPs and guanylate cyclases. To fully reset the cell to the dark state the visual pigment needs to be regenerated with new 11-cis retinal and dephosphorylated. How dephosphorylation occurs in rods and cones is only beginning to be understood (Yamaoka et al. 2015). Dephosphorylation has been shown to occur faster in cones than rods (Yamaoka et al. 2015) and recent preliminary results suggest that phosphatase 2A (PP2A) may be the primary visual pigment phosphatase and necessary to reset visual pigments to their dark state in both rods and cones (Kolesnikov et al. 2017).


While the molecular mechanisms are very similar in rods and cones, many of the proteins involved in signal amplification and shutoff are distinct, duplicated copies that are exclusive to rod or cone photoreceptors (Lamb 2013). A list of phototransduction proteins, and the genes that encode them, is provided in Table 1.1. Differences in the function and expression levels of these proteins contribute to the physiological differences between rods and cones (for reviews see Kawamura and Tachibanaki 2012; Ingram et al. 2016). For example, cone opsin kinase (GRK7) has a much higher activity than rhodopsin kinase (GRK1), which appears to contribute to the higher sensitivity of rods, because slower phosphorylation allows more transducin to be activated (Tachibanaki et al. 2005; Wada et al. 2006; Vogalis et al. 2011). Differences in activity are not the only possible difference between rod and cone phototransduction proteins. Rod and cone transducin are functionally very similar in terms of activation and inactivation (Deng et al. 2009; Gopalakrishna et al. 2012; Tachibanaki et al. 2012; Mao et al. 2013; but see Chen et al. 2010), but differ in their light-induced translocation between the inner and outer segment of the photoreceptor (Lobanova et al. 2010). In rods, bright light induces translocation of transducin from the outer segment to the inner segment, which reduces their sensitivity, enabling the rods to operate in brighter light conditions than would otherwise be possible (Sokolov et al. 2002). In cones, transducin does not translocate under natural conditions; however, this does not seem to be related to functional or structural differences between rod and cone transducins, but rather is a result of differences between the lifetime of activated rod and cone visual pigments and the rate of transducin inactivation (Lobanova et al. 2007; Lobanova et al. 2010). Transducin inactivation is mediated by the RGS9 complex, which is the same in rods and cones. The expression level of the RGS9 complex in cones, however, is higher than in rods, and this contributes to the much faster response kinetics of cones (Cowan et al. 1998; Zhang et al. 2003). In addition to morphological and phototransduction differences, rods and cones also differ in their ability to regenerate visual pigments after light activation, which is mediated by the visual cycle.

Figure 1.4. Generalized schematic of vertebrate rod phototransduction cascade. In the dark the components of the cascade are largely inactive, except for the cyclic nucleotide gated channel (CNG), and the light-insensitive Na+/Ca2+-K+ exchanger. These proteins, located on the plasma membrane, result in a stable depolarizing current in the dark. Light activation of rhodopsin, shown here as a dimer, results in a conformational change that opens a binding site for the G protein transducin, facilitating the exchange of GDP with GTP within its α-subunit. Dissociation of the transducin α-subunit allows it to activate phosphodiesterase (PDE) via binding of its inhibitory γ-subunit. Activated PDE hydrolyzes cGMP to GMP, resulting in a decrease in cGMP concentration, which in turn results in closing the CNGs. This results in hyperpolarization of the cell, slowing the release of glutamate into the synapse and eventually resulting in a visual signal being sent to the brain. Recovery begins with deactivation of rhodopsin. Reduction in Ca2+ concentration causes dissociation of recoverin from rhodopsin kinase (GRK), allowing it to phosphorylate the activated rhodopsin, reducing its activity. Phosphorylated rhodopsin is further deactivated by the binding of arrestin. Transducin is deactivated by hydrolysis of its bound GTP, which is catalyzed by the binding of the regulator of G-protein signalling complex (RGS9-GNB5-RGS9BP). Hydrolysis of GTP causes transducin to dissociate from the PDE inhibitory subunit, which deactivates PDE. Finally, the lowered Ca2+ concentration results in activation of guanylate cyclase activating proteins (GCAPs) through replacement of Ca2+ with Mg2+. The GCAPs activate guanylate cyclases, which synthesize cGMP. Increasing cGMP concentration reopens the CNGs, thus restoring the Ca2+ concentration and in turn deactivating the GCAPs and guanylate cyclases. The cone phototransduction cascade is similar, but involves cone-specific copies of several proteins. The genes that encode the proteins involved in phototransduction, including which genes are specific to rods or cones, are outlined in Table 1.1.


Table 1.1. Major components of the vertebrate visual phototransduction cascade found in rod and cone photoreceptors.

Protein | Gene Symbol | Photoreceptor | Gene Name
Opsin | RH1 | Rod | Rhodopsin (RHO)
Opsin | LWS | Cone | Long-wave Sensitive Cone Opsin
Opsin | RH2 | Cone | Middle-wave Sensitive Cone Opsin
Opsin | SWS1 | Cone | Short-wave Sensitive Cone Opsin 1
Opsin | SWS2 | Cone | Short-wave Sensitive Cone Opsin 2
Transducin | GNAT1 | Rod | G Protein α-subunit 1
Transducin | GNB1 | Rod | G Protein β-subunit 1
Transducin | GNGT1 | Rod | G Protein γ-subunit 1
Transducin | GNAT2 | Cone | G Protein α-subunit 2
Transducin | GNB3 | Cone | G Protein β-subunit 3
Transducin | GNGT2 | Cone | G Protein γ-subunit 2
cGMP Phosphodiesterase | PDE6A | Rod | Phosphodiesterase α-subunit 6A
cGMP Phosphodiesterase | PDE6B | Rod | Phosphodiesterase β-subunit 6B
cGMP Phosphodiesterase | PDE6G | Rod | Phosphodiesterase γ-subunit 6G
cGMP Phosphodiesterase | PDE6C | Cone | Phosphodiesterase β-subunit 6C
cGMP Phosphodiesterase | PDE6H | Cone | Phosphodiesterase γ-subunit 6H
Cyclic Nucleotide Gated Channel | CNGA1 | Rod | CNG α-subunit 1
Cyclic Nucleotide Gated Channel | CNGB1 | Rod | CNG β-subunit 1
Cyclic Nucleotide Gated Channel | CNGA3 | Cone | CNG α-subunit 3
Cyclic Nucleotide Gated Channel | CNGB3 | Cone | CNG β-subunit 3
Na+/Ca2+-K+ Exchanger | SLC24A1 | Rod | Solute Carrier Family 24 Member 1
Na+/Ca2+-K+ Exchanger | SLC24A2 | Cone | Solute Carrier Family 24 Member 2
Arrestin | SAG | Rod | Rod Arrestin (S-Antigen)
Arrestin | ARR3 | Cone | Cone Arrestin (X-arrestin)
G Protein-Coupled Receptor Kinase | GRK1 | Rod | Rhodopsin Kinase
G Protein-Coupled Receptor Kinase | GRK7 | Cone | Cone Opsin Kinase
Regulator of G-Protein Signalling Complex | RGS9 | Both | Regulator of G-Protein Signaling 9
Regulator of G-Protein Signalling Complex | RGS9BP | Both | RGS9 Binding Protein
Regulator of G-Protein Signalling Complex | GNB5 | Both | G Protein β-subunit 5
Guanylate Cyclase Activating Protein | GUCA1A | Both | Guanylate Cyclase Activator 1A
Guanylate Cyclase Activating Protein | GUCA1B | Both | Guanylate Cyclase Activator 1B
Guanylate Cyclase Activating Protein | GUCA1C | Cone | Guanylate Cyclase Activator 1C
Guanylate Cyclase | GUCY2D | Both | Guanylate Cyclase 2D
Guanylate Cyclase | GUCY2F | Both | Guanylate Cyclase 2F
Recoverin | RCVRN | Both | Recoverin


1.5 Visual Cycle

For regeneration of the visual pigments to occur, the all-trans retinal produced by light absorption needs to be converted back to 11-cis. This occurs through a process known as the visual cycle (or retinoid cycle of vision; for reviews see Saari 2000; Lamb and Pugh 2004; Wang and Kefalov 2011; Saari 2012; Tang et al. 2013). There are two distinct visual cycles: the retinal pigment epithelium (RPE) visual cycle, which supplies 11-cis retinal to both rods and cones, and the retina visual cycle, which is specific for cones. After activation, the visual pigment decays into apo-opsin (free opsin without chromophore) and all-trans retinal. The released all-trans retinal is reduced to all-trans retinol by NADPH-dependent retinol dehydrogenases (RDH8, RDH12, and potentially others) and transported to the RPE through chaperoning by interphotoreceptor retinoid binding protein (IRBP) or, for the cone-specific retina visual cycle, to the Müller glial cells by diffusion or an unknown mechanism.

In the RPE, all-trans retinol is bound by another chaperone protein, cellular retinol binding protein (CRBP), which facilitates its transport further into the RPE. All-trans retinol is next esterified by lecithin:retinol acyltransferase (LRAT) to all-trans retinyl ester, which is then hydrolyzed and isomerized by the retinoid isomerohydrolase (RPE65) into 11-cis retinol. The 11-cis retinol is then bound to another chaperone protein, cellular retinaldehyde binding protein (CRALBP), and then oxidized to 11-cis retinal by 11-cis retinol dehydrogenase (RDH5). 11-cis retinal is then transported back to the photoreceptor outer segment, chaperoned again by IRBP.

The cone-specific retina visual cycle is much more poorly understood than the RPE visual cycle, but is necessary for the continued function of cone photoreceptors in bright light (Wang and Kefalov 2011; Saari 2012; Tang et al. 2013). All-trans retinal released from activated cone visual pigments is similarly reduced to all-trans retinol, but at a faster rate than in rods. This involves RDH8 and RDH12 and, likely, also other cone-specific RDHs. All-trans retinol is transported or diffuses to Müller glial cells where it is converted to 11-cis retinol. The 11-cis retinol moves back to the cone photoreceptor where it is oxidized back to 11-cis retinal, which can then regenerate cone visual pigments. The mechanisms and proteins that govern this process are not well characterized, but are an area of active research. Kaylor et al. (2013) have identified dihydroceramide desaturase-1 (DES1) as a retinol isomerase that may govern the conversion of all-trans to 11-cis retinol with the involvement of CRALBP, which may mediate the selective production of 11-cis retinol over other forms such as 9-cis (Sato and Kefalov 2016). Kaylor et al. (2014) further found that multifunctional O-acyltransferase (MFAT) may be a retinyl-ester synthase that is also involved in this process. The identity of the cis retinol oxidase that converts 11-cis retinol to 11-cis retinal in the photoreceptor has not been confirmed. Some evidence suggests the process may involve RPE65, possibly as a retinol binding protein (Tang et al. 2013). Sato et al. (2013, 2015) have shown that in the inner segment of carp cone photoreceptors, RDH13L is the enzyme responsible for the cis retinol oxidase activity. RDH14, a functional homologue to RDH13L, may be responsible for this process in amphibians and mammals, but in the outer segment, rather than the inner segment as in carp (Sato et al. 2017).

Kaylor et al. (2017) have recently identified a possible third mechanism for the regeneration of 11-cis retinal that is light-driven. It had been shown previously that retinal binds reversibly with phosphatidylethanolamine (PE) to form a retinyl-lipid capable of transferring the retinal from the lipid membrane to apo-opsin (Poincelot et al. 1969; Kimbel et al. 1970). Additionally, Shichi and Somers (1974) found that all-trans-retinal-PE undergoes photoisomerization to 11-cis-retinal-PE in blue light. Building on these previous findings, Kaylor et al. (2017) demonstrated that this photoisomerization occurs in photoreceptor membranes. Furthermore, they found that synthesis of 11-cis retinal, regeneration of rhodopsin, and cone sensitivity were all increased in blue light. The authors suggest that this mechanism could contribute significantly to cone pigment regeneration and be necessary for sustained vision under natural conditions. Further work will be needed to explore this possibility.

1.6 Visual Pigments

Visual pigments initiate the first step in vision through the absorption of a photon, which is converted into an electrical signal via the phototransduction cascade. They are composed of an opsin protein covalently bound to a light-absorbing chromophore (retinal). Vision is initiated through a cis to trans photoisomerization of the chromophore, which induces a conformational change in the opsin protein to its active form. Different visual pigments absorb light maximally at different wavelengths. These differences are controlled both by the chromophore usage and by the structure of the opsin protein. Adjustment of the wavelength of maximal absorbance (λmax) is one of the ways that vertebrates can adapt their visual system to spectral environments (Loew and Lythgoe 1978; Shand 1993; Bowmaker et al. 1994; Lythgoe et al. 1994; McDonald and Hawryshyn 1995; Cronin et al. 1996; Loew et al. 2002).

Opsins are members of the G-protein coupled receptor family and are transmembrane proteins with seven α-helical domains (Bowmaker 2008). The ancestral vertebrate likely had five visual opsin genes, one rod opsin (rhodopsin or RH1) and four spectrally distinct cone opsins (LWS, RH2, SWS1, SWS2; Fig. 1.5; Bowmaker 2008). Rod and cone opsins are generally present only in rods and cones, respectively, but there are exceptions such as in amphibian green rods, which express SWS2, and gecko ‘rods’, which contain only cone opsins (Kojima et al. 1992; Ma et al. 2001). The visual opsin genes have been lost or duplicated in many vertebrate lineages, providing either a decreased or increased breadth of spectral sensitivity. The distinct, but overlapping, absorbance spectra provide the basis for colour vision (Daw 2009). The absorbance spectra are controlled, in part, by the opsin protein, where changes in the protein structure, especially around the area of the chromophore, can change the wavelengths of light absorbed. Opsin structure also controls other aspects of visual pigment function, such as the rate of thermal activation and light-activated decay (Ernst et al. 2014). In general, cone opsins have faster light-activated decay and faster regeneration, but also higher rates of thermal activation, and therefore dark noise (Imai et al. 2005; Luo et al. 2011; Chen et al. 2012). An exception to this has recently been found with frog SWS2, which is expressed both in blue cones and green rods, and has thermal activation rates almost as low as rhodopsin and much lower than other SWS2 opsins (Kojima et al. 2017). Beyond differences in opsin structure, changes in the chromophore can also affect visual pigment function.

Vertebrates use two different chromophores: 11-cis retinal (A1) and 11-cis-3,4-dehydroretinal (A2). Terrestrial vertebrates and marine fishes tend to exclusively use A1, while freshwater animals and those that migrate, or transition between freshwater and marine/terrestrial habitats, tend to use A2 or an A1/A2 mixture (Bridges 1972). A2 has an extra double bond in the β-ionone ring, which results in red-shifted λmax and increased thermal activation rates relative to pigments with A1 (Bridges 1967; Donner et al. 1990; Makino et al. 1999; Ala-Laurila et al. 2007). The use of A2 in freshwater species matches with the red-shifted transmission spectra of many freshwater environments (Lythgoe 1979). The usage of the A1 and A2 chromophores, therefore, provides a distinct mechanism with which to tune the function of visual pigments that does not require changes to the underlying protein structure.


Figure 1.5. Representative ancestral complement of vertebrate photoreceptors and visual pigments based on those present in birds. The ancestral vertebrate had five distinct visual pigments encoded by five opsin genes (see text). These are contained in distinct photoreceptor types and may be accompanied by oil droplets. Each visual pigment has a distinct absorbance spectrum that is determined by the protein sequence and the structure of the bound chromophore. Modified from Bowmaker (2008).

1.7 The Duplex Retina and Diurnal and Nocturnal Vision

Diurnal and nocturnal animals are exposed to very different light intensities, which can differ by up to 11 orders of magnitude (Warrant 2008). The inherent trade-off between resolution and sensitivity (Warrant 2008) imposes divergent selective pressures on the visual system in dim and bright light. In dim light achieving high sensitivity is paramount, while in bright light sensitivity is less critical, but increased resolution (spatial, temporal, chromatic) may provide large benefits.

To deal with this dichotomy, vertebrates have rod and cone photoreceptor cells, where rods mediate dim-light vision and cones mediate bright-light vision (Schultze 1866; Ebrey and Koutalos 2001). Rods have extremely high sensitivity and low noise, but have much slower response kinetics and become saturated in bright light, while cones are less sensitive and noisier, but have fast responses and do not saturate even in very bright light (Lamb 2010, 2013). A combination of rods and cones in the retina can allow vision in both bright and dim light to varying degrees, and most vertebrates, therefore, have a duplex retina. The retinas of many vertebrates appear adapted to particular diel activity patterns where the proportion of rods to cones varies, with more rods present in the retinas of more highly nocturnal vertebrates and vice versa (Walls 1942). While some highly diurnal and highly nocturnal species have taken this to the extreme by evolving simplex retinas (all-cone, or all-rod), most vertebrates maintain at least a small population of rods or cones, as even relatively small amounts can provide at least some vision in dim- or bright-light conditions, respectively.

1.8 Photoreceptor Transmutation

Photoreceptor transmutation is the evolutionary process by which it is purported that a cone can be converted into a rod or vice versa. This theory of photoreceptor transmutation was originally proposed by the comparative ophthalmologist Gordon Walls to explain evolutionary transitions inferred between simplex retinas in squamate reptiles (Walls 1934), and challenged the previous view that rods and cones were always separate and distinct, which had been established since Schultze (1866). Walls (1934, 1942) observed a preponderance of simplex retinas in squamates that, specifically in snakes and geckos, varied in what appeared to be a transitional morphological series from all-cone in highly diurnal species to all-rod in highly nocturnal species. Those species in the middle of the series that had intermediate photoreceptor morphologies formed the basis for the theory that photoreceptors could transform, through evolution, from one cell type to another. In his seminal book, Walls (1942) further expanded this theory in the following ways.


Despite the prevailing view at the time, Walls (1942) believed that rods evolved from cones through transmutation, a view that is supported by recent studies (for a review see Morshedian and Fain 2017). Thus, early in vertebrate evolution, the duplex retina was established enabling both dim-light and bright-light vision, which has been maintained in most lineages since. However, some vertebrates have simplex retinas, which Walls (1942) postulated evolved from the loss of either rods or cones. Specifically, Walls believed that in adapting to diurnal, terrestrial environments early lizards lost rods, resulting in an all-cone retina, with a similar situation occurring in caenophidian snakes. In some lineages, through pressure to adapt to nocturnal conditions, Walls (1942) believed that lizards and snakes with all-cone retinas transmuted their cones, through a series of evolutionary intermediates preserved in extant species, into rods in secretive and nocturnal species. In geckos this process was reversed as some gecko lineages reverted back to diurnality and all-cone retinas. A basic outline of Walls’ view of photoreceptor transmutation in snakes and geckos is shown in Figure 1.6. These views were further supported in snakes and geckos by additional work done by Underwood (1967, 1968, 1970), who additionally distinguished between outer segment and synaptic pedicle transmutation. Underwood (1968, 1970), while generally supporting Walls’ views, suggested that the distinction between rods and cones might be less clear than previously thought. We now have considerably more knowledge of the molecular underpinnings of vision, but little work has been done to examine Walls’ theory in light of this. Of the two best examples of photoreceptor transmutation, geckos and snakes, only transmutation in geckos has been tested in any detail. The visual systems of both these groups are outlined in the following sections.


Figure 1.6. Walls’ view of photoreceptor transmutation in geckos (A) and snakes (B) depicting the transition from all-cone to all-rod and back to all-cone again in geckos and from duplex to all-cone to all-rod in snakes. In both cases Walls (1942) and Underwood (1970) identified species with intermediate photoreceptor morphologies. Illustrations from Underwood (1970).


1.9 Visual System of Geckos

The best studied example of photoreceptor transmutation occurs in geckos, a highly diverse group of squamate lizards. Most geckos are nocturnal and have retinas that contain only rod-like photoreceptors. Based on comparative retinal and photoreceptor morphology, Walls (1942) proposed that the all-‘rod’ retinas of nocturnal geckos were derived from the all-cone retinas of ancestral diurnal lizards (Fig. 1.7). Furthermore, he hypothesized that extant diurnal geckos reverted to diurnality, and consequently their all-rod retinas were transmuted back to all-cone retinas (Walls 1942). Support for this hypothesis has come from several avenues of research. Röll (2000) found that nocturnal gecko ‘rod’ photoreceptors were actually cones at all levels of their ultrastructure, having such cone features as open outer segment membranes, confirming previous work by Tansley (1964). Molecular studies of the gecko visual system have revealed that geckos lack rhodopsin (RH1), instead expressing only cone pigments (LWS, RH2, SWS1) in their photoreceptors (Kojima et al. 1992). Furthermore, the ‘rods’ were found to use cone phototransduction machinery, not the phototransduction machinery that is specific to normal rod photoreceptors (Zhang et al. 2006). The photoreceptors of nocturnal geckos, however, function more similarly to rods of other vertebrates than they do to cones (Kleinschmidt and Dowling 1975; Zhang et al. 2006). Together these studies provide strong support for the hypothesis that gecko ‘rods’ were ‘transmuted’ from diurnal lizard cones. As such, the photoreceptors of nocturnal geckos can be appropriately referred to as rod-like cones. Röll (2001) also found support for the tertiary diurnality of some geckos through an analysis of lens crystallins. An extensive phylogenetic study of temporal activity patterns in geckos confirmed the nocturnal ancestry of geckos followed by multiple independent transitions to diurnality (Gamble et al. 2015). While each of these studies supports Walls' (1942) transmutation theory, very little is known about the extent and nature of the impact of photoreceptor transmutation on the evolution and function of the visual system.

In addition to the unique features that appear to be linked to photoreceptor transmutation, the gecko visual system also differs substantially from that of typical lizards (Fig. 1.7). Instead of a single type of double cone and multiple types of single cones that contain different visual pigments (Crescitelli 1972; Loew 1994), geckos have only one type of single cone and three types of double cones. The single cones all contain an LWS pigment, as do the primary members of the double cones. One of the double cone types has equally sized outer segments (both of which contain LWS) and is also referred to as a twin cone. This is similar to, but presumably not homologous with, the twin cones of teleost fishes (Walls 1942). The other two types of double cones are unequal, with the smaller, accessory member containing a UV (SWS1) or RH2 pigment (Loew 1994; Loew et al. 1996). The evolutionary origins of these cell types, and their functional relevance, are unclear. Behavioural evidence demonstrates that nocturnal geckos are able to discriminate colours under light levels where humans are colourblind (Roth and Kelber 2004). This suggests the double cones of geckos are able to contribute to colour vision, but the mechanism behind this is unknown.

Figure 1.7. Schematic view of photoreceptor transmutation in geckos. Different rod and cone photoreceptor types are depicted based on their gross morphology and the visual pigments contained therein, identified based on the wavelength of maximal absorption. The ancestral tetrapod most likely had large single cones and double cones that contained LWS (red); small single cones that contained RH2 (green), SWS2 (blue), and SWS1 (purple); and rods that contained RH1 (white). At some point a diurnal lizard ancestor of geckos lost rods and RH1. This ancestral lineage transitioned to a nocturnal lifestyle, which was accompanied by photoreceptor transmutation. All of the cone photoreceptors were modified to resemble rods. Small single photoreceptors were lost, as was SWS2. RH2 and SWS1 were maintained, but instead are found in the accessory members of double cones. Several gecko lineages have independently re-evolved diurnality and this was accompanied by a return to an all-cone retina. Schematic is based on Walls (1942); Pedler and Tilly (1964); Tansley (1964); Underwood (1970); Kleinschmidt and Dowling (1975); Kojima et al. (1992); Loew et al. (1996); Röll (2000, 2001); Zhang et al. (2006).

1.10 Visual System of Snakes

Snakes are the most diverse group of squamates and have a near worldwide distribution, inhabiting many environments with diverse lifestyles. Many snakes are visual predators (Drummond 1985) and some have good visual acuity and binocular vision (Baker et al. 2007), although many rely heavily on chemical and tactile cues as well as, or instead of, vision (Greene 1997). Extant snakes can be roughly divided into three groups (Fig. 1.8). The first is the blind snakes and thread snakes (Scolecophidia), which are small, highly fossorial snakes, many of which superficially resemble earthworms. The second is a grade of snakes referred to as ‘Henophidia’ that includes pythons, boas, and sunbeam snakes, as well as several other lineages whose visual systems have not been studied. These snakes are largely nocturnal and some are also fossorial. The third group is the caenophidian snakes, which include colubrids, elapids, and vipers among others. This is the most diverse group of snakes, with species that range from diurnal to nocturnal, and terrestrial to aquatic, arboreal, and fossorial.

In terms of their visual system, snakes are unique, but also very understudied. Snake eyes differ substantially from those of other squamates and reptiles (Walls 1940; Walls 1942). Compared to other squamates, snakes have lost the ciliary muscles and also accommodate by moving the lens back and forth, rather than compressing it (Walls 1940; Caprette et al. 2004). Snakes also have considerable retinal variation: blindsnakes have rudimentary retinas; pythons, boas, and sunbeam snakes have duplex retinas; and colubrids, elapids, and vipers have a mix of duplex, all-cone, and all-rod retinas, based on outer segment morphology (Walls 1942; Underwood 1970). Colubrids and elapids that are secretive, semi-nocturnal, and nocturnal have photoreceptors that appear adapted to dimmer light environments, with outer segment morphologies intermediate between those of cones and rods, and in some species photoreceptors that all fully resemble rods (Walls 1942). Based on this comparative eye and photoreceptor morphology, Walls (1942) proposed that the ancestors of snakes went through a fossorial phase during which their eyes and photoreceptors were severely reduced. After re-acquiring a terrestrial lifestyle, the eyes were again expanded, evolving several unique features that compensated for the losses incurred when the eyes were reduced, explaining the substantial differences between snake and other squamate eyes (Walls 1942). The photoreceptor cells were enlarged and differentiated into re-evolved rods and cones (e.g., in henophidian-grade species). As some snakes became highly diurnal, such as in colubrids, the rods were lost and double cones evolved leading to the all-cone retina. Walls (1942) proposed that this all-cone retina was the ancestral colubrid condition and that the rod-like and all-rod retinas of secretive, partially nocturnal, and nocturnal colubrids were achieved through transmutation of the cones into rods.


Previous work done on henophidian-grade snakes (Python, Boa, and Xenopeltis) and a diurnal colubrid (Thamnophis sirtalis) using electron microscopy, microspectrophotometry (MSP), cDNA sequencing, and opsin expression to study the visual receptors and pigments of snakes has provided mixed support for Walls' (1942) transmutation theory (Sillman et al. 1997, 1999, 2001; Sillman et al. 2001; Davies et al. 2009). Henophidian snakes, at least those studied thus far, express rhodopsin (RH1) in their rods that absorbs maximally (λmax) at about 495 nm, LWS in the double and large single cones at ~550 nm, and SWS1 in the small single cones at ~360 nm (Fig. 1.9; Sillman et al. 1999, 2001; Davies et al. 2009). While the SWS2 and RH2 opsins appear to have been lost in snakes, Davies et al. (2009) report that no evidence was found to support transmutation.

The pattern in diurnal colubrids is somewhat different. From both a morphological and a physiological perspective, diurnal colubrids (such as Thamnophis sirtalis) are reported to have all-cone retinas (Walls 1942; Underwood 1970; Wong 1989; Jacobs et al. 1992; Sillman et al. 1997). In T. sirtalis, these consist of double cones and large single cones that express a long-wavelength pigment (presumably LWS) with a λmax at ~554 nm, and two types of small single cone, one with a UV pigment (SWS1) at ~360 nm and another with a middle-wavelength pigment at 482 nm (Sillman et al. 1997). The identity of the middle-wavelength pigment is unclear. The all-cone retina and unidentified pigment are both suggestive of photoreceptor transmutation; however, further data, such as those available for geckos, are needed to support this hypothesis.


Figure 1.8. Schematic phylogeny of snakes illustrating the three major groups and the hypothesized position where photoreceptor transmutation begins to occur. Phylogeny based on Davies et al. (2009).


Figure 1.9. Photoreceptors and visual pigment absorption spectra of Python regius (A) and Thamnophis sirtalis (B). Python regius has a duplex retina with LWS-containing large single cones, SWS1-containing small single cones, and RH1-containing rods. Thamnophis sirtalis also has the LWS and SWS1 single cones, but additionally has a double cone and a second type of small single cone that contains a visual pigment with a λmax of 482 nm. Importantly, T. sirtalis does not have visible rods and the identity of the 482 nm pigment is unclear. The spectra are based on the visual pigment template of Govardovskii et al. (2000) using λmax values obtained through MSP by Sillman et al. (1997, 1999).
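Template-based spectra of the kind described in the Figure 1.9 caption can be approximated with a short calculation. The following is a minimal sketch, assuming the A1 alpha-band form and constants of the Govardovskii et al. (2000) template as they are commonly reproduced; the constants should be checked against the original paper before reuse. The function name, the wavelength grid, and the omission of the beta band are illustrative choices and are not taken from this thesis. The λmax values are the approximate T. sirtalis values quoted in the text (554, 482, and 360 nm).

```python
import math

# Constants of the A1 alpha-band template (Govardovskii et al. 2000),
# as commonly reproduced; verify against the original paper before reuse.
A, B, C, D = 69.7, 28.0, -14.9, 0.674
b, c = 0.922, 1.104


def a1_alpha_band(wavelength_nm, lambda_max_nm):
    """Relative absorbance of an A1 pigment (alpha band only, peak close to 1)."""
    x = lambda_max_nm / wavelength_nm
    # The parameter 'a' depends weakly on lambda_max in the template.
    a = 0.8795 + 0.0459 * math.exp(-((lambda_max_nm - 300.0) ** 2) / 11940.0)
    return 1.0 / (
        math.exp(A * (a - x)) + math.exp(B * (b - x)) + math.exp(C * (c - x)) + D
    )


if __name__ == "__main__":
    # Approximate lambda_max values for T. sirtalis as quoted in the text.
    pigments = {"LWS": 554.0, "482 nm pigment": 482.0, "SWS1": 360.0}
    for name, lmax in pigments.items():
        # Evaluate the template on a coarse 10 nm grid (illustrative only).
        curve = [(w, a1_alpha_band(w, lmax)) for w in range(300, 701, 10)]
        peak_w, peak_s = max(curve, key=lambda p: p[1])
        print(f"{name}: grid peak near {peak_w} nm (relative absorbance {peak_s:.3f})")
```

Because only the alpha band is modelled, the short-wavelength limb of each curve will be underestimated relative to a full template that also includes the beta band.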


1.11 Overview

This thesis consists of four studies, each of which furthers the goal of elucidating the evolution and molecular mechanisms of photoreceptor transmutation. In the first study, I synthesize results from multiple experiments (performed by myself and others) to test the hypothesis that the apparent all-cone retina of diurnal colubrids evolved through photoreceptor transmutation of the rods to resemble the appearance, and function, of cones, rather than through the loss of the rods as supposed by Walls (1942). In the second, I develop a new method of targeted sequence capture that enables efficient sequencing of complete coding regions across divergent taxa. I use this method to sequence visual genes from squamates, providing data necessary for the final two studies. In the third study, I use whole eye transcriptome sequencing, along with the targeted sequence capture data from the previous study and new publicly available whole genome sequences, to test predictions of two hypotheses relating to snake evolution. The first is that ancestral snakes were highly fossorial with degenerated visual systems, as hypothesized by Walls (1942), and so would be predicted to have lost phototransduction genes and experienced a relaxation of selective pressures early during their evolution. The second is that caenophidian snakes, in which photoreceptor transmutation is hypothesized to have been widespread, have experienced divergent and positive selection as they adapted to simplex retinas. In the final study, I explore the impact of photoreceptor transmutation in geckos. Using whole eye transcriptome sequencing of a nocturnal gecko and a diurnal anole, I test the hypothesis that gecko photoreceptors utilize only cone phototransduction machinery. I also expand upon the third study to ask whether geckos experienced divergent selective pressures associated with transmutation similar to those experienced by snakes. As a whole, this thesis provides the first molecular evidence for photoreceptor transmutation in snakes, provides a new methodology for producing sequencing data relevant for molecular evolutionary studies, revises our view of photoreceptor transmutation in geckos, and provides the first evidence for molecular changes that are associated with transmutation.

1.12 References

Ala-Laurila P, Donner K, Crouch RK, Cornwall MC. 2007. Chromophore switch from 11-cis-dehydroretinal (A2) to 11-cis-retinal (A1) decreases dark noise in salamander red rods. Journal of Physiology-London 585:57-74.

Baker RA, Gawne TJ, Loop MS, Pullman S. 2007. Visual acuity of the midland banded water snake estimated from evoked telencephalic potentials. J Comp Physiol A Neuroethol Sens Neural Behav Physiol 193:865-870.

Bowmaker JK. 2008. Evolution of vertebrate visual pigments. Vision Res 48:2022-2041.

Bowmaker JK, Govardovskii VI, Shukolyukov SA, Zueva LV, Hunt DM, Sideleva VG, Smirnova OG. 1994. Visual pigments and the photic environment: the cottoid fish of Lake Baikal. Vision Res 34:591-605.

Bridges CD. 1967. Spectroscopic properties of porphyropsins. Vision Res 7:349-369.

Bridges CDB. 1972. The rhodopsin-porphyropsin visual system. In: Dartnall HJA, editor. Handbook of Sensory Physiology VII/1: Photochemistry of Vision. Berlin-Heidelberg-New York: Springer-Verlag. p. 417-480.

Caprette CL, Lee MSY, Shine R, Mokany A, Downhower JF. 2004. The origin of snakes (Serpentes) as seen through eye anatomy. Biol J Linn Soc 81:469-482.

Chen CK, Woodruff ML, Chen FS, Shim H, Cilluffo MC, Fain GL. 2010. Replacing the rod with the cone transducin alpha subunit decreases sensitivity and accelerates response decay. Journal of Physiology-London 588:3231-3241.


Chen J, Sampath AP. 2013. Structure and Function of Rod and Cone Photoreceptors. In: Ryan SJ, Hinton DR, editors. Retina (Fifth Edition). London: W.B. Saunders. p. 342-359.

Chen MH, Kuemmel C, Birge RR, Knox BE. 2012. Rapid release of retinal from a cone visual pigment following photoactivation. Biochemistry 51:4117-4125.

Cowan CW, Fariss RN, Sokal I, Palczewski K, Wensel TG. 1998. High expression levels in cones of RGS9, the predominant GTPase accelerating protein of rods. Proc Natl Acad Sci U S A 95:5351-5356.

Crescitelli F. 1972. The visual cells and visual pigments of the vertebrate eye. In: Dartnall HJA, editor. Photochemistry of Vision, Handbook of Sensory Physiology. Heidelberg: Springer-Verlag. p. 245-263.

Cronin TW, Marshall NJ, Caldwell RL. 1996. Visual pigment diversity in two genera of mantis shrimps implies rapid evolution (Crustacea; Stomatopoda). Journal of Comparative Physiology a-Sensory Neural and Behavioral Physiology 179:371-384.

Darwin C. 1859. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. London: John Murray.

Davies WL, Cowing JA, Bowmaker JK, Carvalho LS, Gower DJ, Hunt DM. 2009. Shedding light on serpent sight: the visual pigments of henophidian snakes. J Neurosci 29:7519-7525.

Daw NW. 2009. Retinal color mechanisms. In: Squire LR, editor. Encyclopedia of Neuroscience. Oxford: Academic Press. p. 187-194.

DelMonte DW, Kim T. 2011. Anatomy and physiology of the cornea. J Cataract Refract Surg 37:588-598.


Deng WT, Sakurai K, Liu JW, Dinculescu A, Li J, Pang JJ, Min SH, Chiodo VA, Boye SL, Chang B, Kefalov VJ, Hauswirth WW. 2009. Functional interchangeability of rod and cone transducin alpha-subunits. Proc Natl Acad Sci U S A 106:17681-17686.

Donner K, Firsov ML, Govardovskii VI. 1990. The frequency of isomerization-like 'dark' events in rhodopsin and porphyropsin rods of the bull-frog retina. The Journal of Physiology 428:673-692.

Dowling JE. 2009. Retina: An Overview. In: Squire LR, editor. Encyclopedia of Neuroscience. Oxford: Academic Press. p. 159-169.

Drummond H. 1985. The role of vision in the predatory behavior of natricine snakes. Anim Behav 33:206-215.

Ebrey T, Koutalos Y. 2001. Vertebrate photoreceptors. Prog Retin Eye Res 20:49-94.

Ernst OP, Lodowski DT, Elstner M, Hegemann P, Brown LS, Kandori H. 2014. Microbial and animal rhodopsins: structures, functions, and molecular mechanisms. Chem Rev 114:126-163.

Fain GL, Hardie R, Laughlin SB. 2010. Phototransduction and the Evolution of Photoreceptors. Curr Biol 20:R114-R124.

Fernald RD. 2009. Vertebrate Eyes: Evolution. In: Squire LR, editor. Encyclopedia of Neuroscience. Oxford: Academic Press. p. 85-89.

Gamble T, Greenbaum E, Jackman TR, Bauer AM. 2015. Into the light: diurnality has evolved multiple times in geckos. Biol J Linn Soc 115:896-910.

Gopalakrishna KN, Boyd KK, Artemyev NO. 2012. Comparative analysis of cone and rod transducins using chimeric Galpha subunits. Biochemistry 51:1617-1624.

Govardovskii VI, Fyhrquist N, Reuter T, Kuzmin DG, Donner K. 2000. In search of the visual pigment template. Vis Neurosci 17:509-528.


Greene HW. 1997. Snakes: The Evolution of Mystery in Nature. Berkeley: University of California Press.

Gregg RG, McCall MA, Massey SC. 2013. Function and Anatomy of the Mammalian Retina. In: Ryan SJ, Hinton DR, editors. Retina (Fifth Edition). London: W.B. Saunders. p. 360-400.

Hárosi FI, Novales Flamarique I. 2012. Functional significance of the taper of vertebrate cone photoreceptors. J Gen Physiol 139:159-187.

Hurley JB. 2009. Phototransduction. In: Squire LR, editor. Encyclopedia of Neuroscience. Oxford: Academic Press. p. 687-692.

Imai H, Kuwayama S, Onishi A, Morizumi T, Chisaka O, Shichida Y. 2005. Molecular properties of rod and cone visual pigments from purified chicken cone pigments to mouse rhodopsin in situ. Photochem Photobiol Sci 4:667-674.

Ingram NT, Sampath AP, Fain GL. 2016. Why are rods more sensitive than cones? J Physiol 594:5415-5426.

Jacobs GH, Fenwick JA, Crognale MA, Deegan JF. 1992. The all-cone retina of the garter snake - spectral mechanisms and photopigment. J Comp Phys A 170:701-707.

Jonnal RS, Besecker JR, Derby JC, Kocaoglu OP, Cense B, Gao W, Wang Q, Miller DT. 2010. Imaging outer segment renewal in living human cone photoreceptors. Opt Express 18:5257-5270.

Kawamura S, Tachibanaki S. 2012. Explaining the functional differences of rods versus cones. Wiley Interdisciplinary Reviews: Membrane Transport and Signaling 1:675-683.

Kaylor JJ, Cook JD, Makshanoff J, Bischoff N, Yong J, Travis GH. 2014. Identification of the 11-cis-specific retinyl-ester synthase in retinal Muller cells as multifunctional O-acyltransferase (MFAT). Proc Natl Acad Sci U S A 111:7302-7307.


Kaylor JJ, Xu TZ, Ingram NT, Tsan A, Hakobyan H, Fain GL, Travis GH. 2017. Blue light regenerates functional visual pigments in mammals through a retinyl-phospholipid intermediate. Nature Communications 8.

Kaylor JJ, Yuan Q, Cook J, Sarfare S, Makshanoff J, Miu A, Kim A, Kim P, Habib S, Roybal CN, Xu T, Nusinowitz S, Travis GH. 2013. Identification of DES1 as a vitamin A isomerase in Muller glial cells of the retina. Nat Chem Biol 9:30-36.

Kimbel RL, Jr., Poincelot RP, Abrahamson EW. 1970. Chromophore transfer from lipid to protein in bovine rhodopsin. Biochemistry 9:1817-1820.

Kleinschmidt J, Dowling JE. 1975. Intracellular-recordings from gecko photoreceptors during light and dark-adaptation. J Gen Physiol 66:617-648.

Kojima D, Okano T, Fukada Y, Shichida Y, Yoshizawa T, Ebrey TG. 1992. Cone visual pigments are present in gecko rod cells. Proc Natl Acad Sci U S A 89:6841-6845.

Kojima K, Matsutani Y, Yamashita T, Yanagawa M, Imamoto Y, Yamano Y, Wada A, Hisatomi O, Nishikawa K, Sakurai K, Shichida Y. 2017. Adaptation of cone pigments found in green rods for scotopic vision through a single amino acid mutation. Proc Natl Acad Sci U S A 114:5437-5442.

Kolb H. 2012. Gross Anatomy of the Eye. In: Kolb H, Nelson R, Fernandez E, Jones B, editors. Webvision: The Organization of the Retina and Visual System. Utah: Moran Eye Center.

Kolesnikov AV, Orban T, Palczewski K, Kefalov VJ. 2017. Dephosphorylation of visual pigments by PP2A is required for timely dark adaptation of rods and cones. Invest Ophthalmol Visual Sci 58:3575-3575.

Lagnado L, Schmitz F. 2015. Ribbon Synapses and Visual Processing in the Retina. Annual Review of Vision Science, Vol 1 1:235-262.


Lamb TD. 2013. Evolution of phototransduction, vertebrate photoreceptors and retina. Prog Retin Eye Res 36:52-119.

Lamb TD. 2010. Phototransduction: adaptation in cones. In: Dartt DA, Besharse JC, Dana R, editors. Encyclopedia of the Eye. Oxford: Academic Press. p. 354-360.

Lamb TD, Collin SP, Pugh EN, Jr. 2007. Evolution of the vertebrate eye: opsins, photoreceptors, retina and eye cup. Nat Rev Neurosci 8:960-976.

Lamb TD, McNaughton PA, Yau KW. 1981. Spatial spread of activation and background desensitization in toad rod outer segments. J Physiol 319:463-496.

Lamb TD, Pugh EN, Jr. 2004. Dark adaptation and the retinoid cycle of vision. Prog Retin Eye Res 23:307-380.

Land MF. 2005. The optical structures of animal eyes. Curr Biol 15:R319-323.

Lobanova ES, Finkelstein S, Song H, Tsang SH, Chen CK, Sokolov M, Skiba NP, Arshavsky VY. 2007. Transducin translocation in rods is triggered by saturation of the GTPase-activating complex. J Neurosci 27:1151-1160.

Lobanova ES, Herrmann R, Finkelstein S, Reidel B, Skiba NP, Deng WT, Jo R, Weiss ER, Hauswirth WW, Arshavsky VY. 2010. Mechanistic basis for the failure of cone transducin to translocate: why cones are never blinded by light. J Neurosci 30:6815-6824.

Loew ER. 1994. A third, ultraviolet-sensitive, visual pigment in the Tokay gecko (Gekko gekko). Vision Res 34:1427-1431.

Loew ER, Fleishman LJ, Foster RG, Provencio I. 2002. Visual pigments and oil droplets in diurnal lizards: a comparative study of Caribbean anoles. J Exp Biol 205:927-938.

Loew ER, Govardovskii VI, Rohlich P, Szel A. 1996. Microspectrophotometric and immunocytochemical identification of ultraviolet photoreceptors in geckos. Vis Neurosci 13:247-256.


Loew ER, Lythgoe JN. 1978. The ecology of cone pigments in teleost fishes. Vision Res 18:715-722.

Luo D, Yue W, Ala-Laurila P, Yau K. 2011. Activation of visual pigments by light and heat. Science 332:1307-1312.

Lythgoe J. 1979. The Ecology of Vision. Oxford: Clarendon Press.

Lythgoe JN, Muntz WRA, Partridge JC, Shand J, Williams DM. 1994. The ecology of the visual pigments of snappers (Lutjanidae) on the Great Barrier Reef. J Comp Physiol A Sens Neural Behav Physiol 174:461-467.

Ma JX, Znoiko S, Othersen KL, Ryan JC, Das J, Isayama T, Kono M, Oprian DD, Corson DW, Cornwall MC, Cameron DA, Harosi FI, Makino CL, Crouch RK. 2001. A visual pigment expressed in both rod and cone photoreceptors. Neuron 32:451-461.

Makino CL, Groesbeek M, Lugtenburg J, Baylor DA. 1999. Spectral tuning in salamander visual pigments studied with dihydroretinal chromophores. Biophys J 77:1024-1035.

Mao W, Miyagishima KJ, Yao Y, Soreghan B, Sampath AP, Chen JE. 2013. Functional Comparison of Rod and Cone G alpha(t) on the Regulation of Light Sensitivity. J Biol Chem 288:5257-5267.

McDonald CG, Hawryshyn CW. 1995. Intraspecific variation of spectral sensitivity in threespine stickleback (Gasterosteus aculeatus) from different photic regimes. J Comp Physiol A Sens Neural Behav Physiol 176:255-260.

Morshedian A, Fain GL. 2017. The evolution of rod photoreceptors. Philos Trans R Soc Lond B Biol Sci 372.

Morshedian A, Fain GL. 2015. Single-Photon Sensitivity of Lamprey Rods with Cone-like Outer Segments. Curr Biol 25:484-487.

Nilsson DE. 2013. Eye evolution and its functional basis. Vis Neurosci 30:5-20.


Palczewski K. 2012. Chemistry and biology of vision. J Biol Chem 287:1612-1619.

Pedler C, Tilly R. 1964. The nature of the gecko visual cell: a light and electron microscopic study. Vision Res 4:499-510.

Poincelot RP, Millar PG, Kimbel RL, Jr., Abrahamson EW. 1969. Lipid to protein chromophore transfer in the photolysis of visual pigments. Nature 221:256-257.

Röll B. 2000. Gecko vision-visual cells, evolution, and ecological constraints. J Neurocytol 29:471-484.

Röll B. 2001. Multiple origin of diurnality in geckos: evidence from eye lens crystallins. Naturwissenschaften 88:293-296.

Roth LS, Kelber A. 2004. Nocturnal colour vision in geckos. Proc Biol Sci 271 Suppl 6:S485-487.

Saari JC. 2000. Biochemistry of visual pigment regeneration - The Friedenwald Lecture. Invest Ophthalmol Visual Sci 41:337-348.

Saari JC. 2012. Vitamin A metabolism in rod and cone visual cycles. Annu Rev Nutr 32:125-145.

Sato S, Frederiksen R, Cornwall MC, Kefalov VJ. 2017. The retina visual cycle is driven by cis retinol oxidation in the outer segments of cones. Vis Neurosci 34.

Sato S, Fukagawa T, Tachibanaki S, Yamano Y, Wada A, Kawamura S. 2013. Substrate specificity and subcellular localization of the aldehyde-alcohol redox-coupling reaction in carp cones. J Biol Chem 288:36589-36597.

Sato S, Kefalov VJ. 2016. cis Retinol oxidation regulates photoreceptor access to the retina visual cycle and cone pigment regeneration. Journal of Physiology-London 594:6753-6765.


Sato S, Miyazono S, Tachibanaki S, Kawamura S. 2015. RDH13L, an enzyme responsible for the aldehyde-alcohol redox coupling reaction (AL-OL coupling reaction) to supply 11-cis retinal in the carp cone retinoid cycle. J Biol Chem 290:2983-2992.

Schultze M. 1866. Zur Anatomie und Physiologie der Retina. Archiv für mikroskopische Anatomie 2:175-286.

Shand J. 1993. Changes in the spectral absorption of cone visual pigments during the settlement of the goatfish Upeneus tragula: the loss of red sensitivity as a benthic existence begins. J Comp Physiol A Sens Neural Behav Physiol 173:115-121.

Shichi H, Somers RL. 1974. Possible involvement of retinylidene phospholipid in photoisomerization of all-trans-retinal to 11-cis-retinal. J Biol Chem 249:6570-6577.

Sillman AJ, Carver JK, Loew ER. 1999. The photoreceptors and visual pigments in the retina of a boid snake, the ball python (Python regius). J Exp Biol 202:1931-1938.

Sillman AJ, Govardovskii VI, Rohlich P, Southard JA, Loew ER. 1997. The photoreceptors and visual pigments of the garter snake (Thamnophis sirtalis): a microspectrophotometric, scanning electron microscopic and immunocytochemical study. J Comp Phys A 181:89-101.

Sillman AJ, Johnson JL, Loew ER. 2001. Retinal photoreceptors and visual pigments in Boa constrictor imperator. J Exp Zool 290:359-365.

Sokolov M, Lyubarsky AL, Strissel KJ, Savchenko AB, Govardovskii VI, Pugh EN, Arshavsky VY. 2002. Massive light-driven translocation of transducin between the two major compartments of rod cells: A novel mechanism of light adaptation. Neuron 34:95-106.

Tachibanaki S, Arinobu D, Shimauchi-Matsukawa Y, Tsushima S, Kawamura S. 2005. Highly effective phosphorylation by G protein-coupled receptor kinase 7 of light-activated visual pigment in cones. Proc Natl Acad Sci U S A 102:9329-9334.


Tachibanaki S, Yonetsu SI, Fukaya S, Koshitani Y, Kawamura S. 2012. Low Activation and Fast Inactivation of Transducin in Carp Cones. J Biol Chem 287:41186-41194.

Tang PH, Kono M, Koutalos Y, Ablonczy Z, Crouch RK. 2013. New insights into retinoid metabolism and cycling within the retina. Prog Retin Eye Res 32:48-63.

Tansley K. 1964. The gecko retina. Vision Res 4:33-37.

Underwood G. 1970. The Eye. In: Gans C, editor. Biology of the Reptilia. New York: Academic Press. p. 1-97.

Vogalis F, Shiraki T, Kojima D, Wada Y, Nishiwaki Y, Jarvinen JLP, Sugiyama J, Kawakami K, Masai I, Kawamura S, Fukada Y, Lamb TD. 2011. Ectopic expression of cone-specific G-protein-coupled receptor kinase GRK7 in zebrafish rods leads to lower photosensitivity and altered responses. Journal of Physiology-London 589:2321-2348.

Wada Y, Sugiyama J, Okano T, Fukada Y. 2006. GRK1 and GRK7: unique cellular distribution and widely different activities of opsin phosphorylation in the zebrafish rods and cones. J Neurochem 98:824-837.

Walls GL. 1940. Ophthalmological Implications for the Early History of the Snakes. Copeia 1940:1-8.

Walls GL. 1934. The Reptilian Retina: I. A new concept of visual-cell evolution. Am J Ophthalmol 17:892-915.

Walls GL. 1942. The vertebrate eye and its adaptive radiation. Bloomfield Hills, MI: Cranbrook Institute of Science.

Wang JS, Kefalov VJ. 2011. The cone-specific visual cycle. Prog Retin Eye Res 30:115-128.

Warrant EJ. 2008. Nocturnal vision. In: Albright T, Masland RH, editors. The Senses: A Comprehensive Reference. Oxford: Academic Press. p. 53-89.


Wensel TG. 2008. Signal transducing membrane complexes of photoreceptor outer segments. Vision Res 48:2052-2061.

Wong ROL. 1989. Morphology and distribution of neurons in the retina of the American garter snake Thamnophis sirtalis. J Comp Neurol 283:587-601.

Yamaoka H, Tachibanaki S, Kawamura S. 2015. Dephosphorylation during Bleach and Regeneration of Visual Pigment in Carp Rod and Cone Membranes. J Biol Chem 290:24381-24390.

Young RW. 1976. Visual cells and the concept of renewal. Invest Ophthalmol Vis Sci 15:700-725.

Zhang X, Wensel TG, Kraft TW. 2003. GTPase regulators and photoresponses in cones of the eastern chipmunk. J Neurosci 23:1287-1297.

Zhang X, Wensel TG, Yuan C. 2006. Tokay gecko photoreceptors achieve rod-like physiology with cone-like proteins. Photochem Photobiol 82:1452-1460.


Chapter 2 Evolutionary transformation of rod photoreceptors in the all-cone retina of a diurnal garter snake

Citation: Schott RK, J Müller, CGY Yang, N Bhattacharyya, N Chan, M Xu, JM Morrow, A-H Ghenu, ER Loew, V Tropepe, BSW Chang. 2016. Evolutionary transformation of rod photoreceptors in the all-cone retina of a diurnal garter snake. Proceedings of the National Academy of Sciences of the United States of America 113:356–361.

Author Contributions: Conceived and designed the study: JM, BSWC. Produced sequence data: JM, NC, MX. Performed the phylogenetic and molecular evolution experiments: RKS. Performed the electron microscopy experiments: CGYY. Performed the immunohistochemistry experiments: NB, A-HG, VT. Performed the in vitro expression experiments: JMM. Performed the MSP experiments: ERL. Analyzed the data: RKS, BSWC, CGYY, NB. Wrote the manuscript: RKS and BSWC with contributions and approval from all authors.

2.1 Abstract

Vertebrate retinas are generally composed of rod (dim-light) and cone (bright-light) photoreceptors with distinct morphologies that evolved as adaptations to nocturnal/crepuscular and diurnal light environments. Over 70 years ago, the ‘transmutation’ theory was proposed to explain some of the rare exceptions in which a photoreceptor type is missing, suggesting that photoreceptors could evolutionarily transition between cell types. Although studies have shown support for this theory in nocturnal geckos, the origins of all-cone retinas, such as those found in diurnal colubrid snakes, remain a mystery. Here we investigate the evolutionary fate of the rods in a diurnal garter snake, and test two competing hypotheses: 1) that the rods, and their corresponding molecular machinery, were lost or 2) that the rods were evolutionarily modified to resemble, and function as, cones. Using multiple approaches we find evidence for a functional and unusually blue-shifted rhodopsin that is expressed in small single ‘cones’. Moreover, these ‘cones’ express rod transducin and have rod ultrastructural features, providing strong support for the hypothesis that they are not true cones, as previously thought, but rather are modified rods. Several intriguing features of snake rhodopsin are suggestive of a more cone-like function. We propose that these cone-like rods may have evolved to regain spectral sensitivity and chromatic discrimination as a result of ancestral losses of middle-wavelength cone opsins in early snake evolution. This study illustrates how sensory evolution can be shaped not only by environmental constraints, but also by historical contingency in forming new cell types with convergent functionality.

2.2 Significance

This study provides compelling evidence that the previously reported all-cone retina of a diurnal garter snake in fact contains a population of rod photoreceptors with the appearance, and presumably function, of cones. Our results suggest that the evolution of all-cone retinas occurred not through loss of rods, but rather via the evolutionary transmutation of ancestral rods into more ‘cone-like’ photoreceptors, in order to regain functionality that was lost during the early, possibly fossorial, origin of snakes. This study provides a better understanding of the process by which complex molecular/cellular structures and tissue types can evolve, and how, particularly for sensory systems, physiological constraints can be shaped by selective forces to produce evolutionary novelty.

2.3 Introduction

How complex structures can arise has long fascinated evolutionary biologists, and the evolution of the eye, as noted by Charles Darwin (Darwin 1859), is perhaps the most famous example. Within the vertebrate eye, the light-sensing photoreceptors are complex, highly specialized cellular structures that can be divided into two general types based on their distinct morphologies and functions: cones, which are active during the day and contain cone opsin pigments; and rods, which mediate dim-light vision and contain rhodopsin (RH1) (Walls 1942; Bowmaker 2008; Lamb 2013). The visual pigments contained in cone photoreceptors are classified into four different subtypes that mediate vision across the visible spectrum from the ultraviolet to the red (SWS1, SWS2, RH2, LWS) (Bowmaker 2008). Although most vertebrate retinas are duplex, containing both cones and rods, squamate reptiles (lizards and snakes) are unusual, not only in having highly variable photoreceptor morphologies, but also for several instances of the absence of an entire class of photoreceptors, resulting in simplex retinas composed of only cones or rods (Walls 1942).

In a seminal book published in 1942, Walls hypothesized that, during evolution, vertebrate photoreceptors could transform from one type to another, a process that he termed photoreceptor ‘transmutation’. As key examples of his theory, Walls (1942) highlighted anatomical changes in the photoreceptors of snakes and geckos, two groups within which there have been significant shifts in diurnal and nocturnal activity patterns. Although several subsequent studies have investigated this hypothesis in geckos (Tansley 1964; Kojima et al. 1992; Loew 1994; Röll 2000; Zhang et al. 2006), whether the evolutionary transmutation of photoreceptors can happen in snakes remains an open question (Davies et al. 2009). Walls also noted a number of peculiar morphological adaptations in snake eyes, which he proposed were due to a subterranean phase early in snake evolution that led to degeneration of the ophidian visual system, resulting in loss of features common to other terrestrial vertebrates (Walls 1942).

Colubrid snakes are an ideal group in which to study Walls’ hypothesis of transmutation due to their highly variable photoreceptor morphologies, which range from all-cone in at least some diurnal species, such as Thamnophis (garter snakes), to all-rod in some nocturnal species, as well as species with the presumed ancestral condition of duplex retinas (Walls 1942; Underwood 1970). Previous studies in the diurnal colubrid Thamnophis have demonstrated an all-cone retina (Walls 1942; Underwood 1970; Wong 1989; Jacobs et al. 1992; Sillman et al. 1997), consisting of double cones and large single cones that express a long-wavelength pigment (presumably LWS), and two classes of small single cone, one with a short-wavelength pigment (presumably SWS1) and the other with a middle-wavelength pigment, the identity of which is unclear (Sillman et al. 1997). However, the ancestral condition for colubrids is likely to have been a duplex retina containing both rods and cones, similar to snakes such as pythons and boas, which have rods that express RH1, large single cones that express LWS, and small single cones that express SWS1 (Fig. 2.1) (Walls 1942; Underwood 1970; Sillman et al. 1999; Sillman et al. 2001; Davies et al. 2009). The SWS2 and RH2 opsins, present ancestrally in vertebrates, appear to have been lost early in the evolution of snakes, perhaps as a result of their proposed fossorial origins (Davies et al. 2009; Castoe et al. 2013; Simões et al. 2015).

Based on these findings we can formulate two main hypotheses for the evolution of the all-cone retina of diurnal colubrids from the duplex ancestral condition (Fig. 2.1). The first is that the rods were lost, and RH1 and other components of the visual transduction cascade unique to rod photoreceptors were either lost or targeted to cones. The second hypothesis is that the rods were evolutionarily modified to resemble the appearance, and presumably the function, of cones. If the rods were modified to resemble cones, we might expect a subset of cones to possess molecular components, such as RH1, and morphological features consistent with a rod ancestry.

In order to test these hypotheses, we examined the photoreceptors and visual pigments of a diurnal garter snake (Thamnophis proximus) by combining multiple methodologies, including sequencing and molecular evolutionary analyses of opsin genes, microspectrophotometry (MSP) of intact photoreceptor cells, in vitro expression of visual pigments, and scanning and transmission electron microscopy (SEM and TEM) and immunohistochemistry of T. proximus retinas. The combined results of these experiments provide strong evidence that RH1 and other components of the rod visual transduction machinery are expressed in a subset of cone-like photoreceptors with rod ultrastructural features, and that the RH1-expressing ‘cones’ are not true cones, as previously thought, but rather are modified (i.e., ‘transmuted’) cone-like rods. Our results shed new light on the evolutionary origins of the all-cone retinas of diurnal colubrid snakes, demonstrating how ancestral losses can be compensated by evolutionary modification of existing cellular structures.


Figure 2.1. Illustration of evolutionary pathways for two alternative hypotheses for the evolution of an all-cone retina from a duplex ancestor in diurnal colubrids. In Hypothesis 1 the rod photoreceptors, along with RH1, are lost and an additional cone type is derived from duplication of an existing cone or retained from an ancestral condition that was lost in other snakes. In Hypothesis 2 the rod photoreceptor is evolutionarily modified into a cone photoreceptor maintaining expression of RH1 and other rod-specific phototransduction machinery.


2.4 Results

2.4.1 Thamnophis proximus has an ‘all-cone’ retina

Scanning electron microscopy of Thamnophis proximus retina revealed only cells that could be identified as cones based on their gross morphology, including small, tapering outer segments and bulbous inner segments (Fig. 2.2, Fig. S2.1). We found no evidence of rods, such as those in, for example, python and boa retinas, which are quite distinct with long, slender inner and outer segments (Sillman et al. 1999; Sillman et al. 2001). This finding is consistent with earlier studies of a closely related species, T. sirtalis (Wong 1989; Jacobs et al. 1992; Sillman et al. 1997), and with the condition described by Walls (1942) for diurnal colubrids in general. Four cone types were identified in T. proximus: double cones, large single cones, and two seemingly distinct sizes of small single cones (Fig. 2.2C). These four cone types appear to be the same as those reported for T. sirtalis (Sillman et al. 1997) and similar to those described for other caenophidian snakes with all-cone retinas (Hart et al. 2012). Sillman et al. (1997) described two subtypes of small single cone in T. sirtalis, and we also found evidence for this in T. proximus where some small single cones were substantially smaller than the others (see very small single cone, Fig. 2.2C), but this distinction was more subtle than that between the large single cones and small single cones and may be confounded by size variation of individual cells. As far as is known, pythons and boas have only large and small single cones, with no double cones (Sillman et al. 1999; Sillman et al. 2001).

In T. proximus, the large single cones and double cones account for approximately 45% and 44% of the cones, respectively. The small single cones were rarer, accounting for the remaining 11% (~9% small single and ~2% very small single). While four individuals were used for SEM, only a single complete retinal preparation was available to determine proportions. As a result, the level of individual variation in T. proximus photoreceptor proportions is unknown. Despite this, the proportions we found for T. proximus are similar to those found previously for T. sirtalis (Sillman et al. 1997). Samples from different areas of the retina had similar proportions of the three photoreceptor cell types and there did not appear to be any strong distributional pattern or mosaic to the photoreceptors, such as that found in some other vertebrates (Ahnelt and Kolb 2000; Allison et al. 2010; Kram et al. 2010), consistent with T. sirtalis (Sillman et al. 1997).

Figure 2.2. Light and scanning electron microscopy of Thamnophis proximus retina. A and B, retinal cross-sections imaged using light (A) and electron (B) microscopy illustrating the layers of the retina. C, scanning electron microscope image of the retina illustrating the all-cone photoreceptor population with four different photoreceptor cell types. Abbreviations: SCL, sclera; RPE, retinal pigment epithelium; PC, photoreceptor cell layer; ONL, outer nuclear layer; INL, inner nuclear layer; GC, ganglion cell layer; a, accessory member of double cone; p, principal member of double cone; ls, large single cone; ss, small single cone; vss, very small single cone; OS, outer segment; IS, inner segment.

2.4.2 Thamnophis proximus possesses three visual pigments

Microspectrophotometry (MSP) of intact photoreceptors from dissociated retina was used to determine the absorption spectra of the four morphological types of photoreceptor cells (Table S2.1). The double cones and large single cones were found to possess a long-wavelength pigment with a peak absorbance (λmax) of 542 nm (Fig. S2.2), whereas the small single cones could be divided into two categories based on absorption characteristics: some contained a medium-wavelength pigment with a λmax of 482 nm (Fig. 2.3), and others possessed a short-wavelength pigment with a λmax of 366 nm (Fig. S2.2). The absorbance spectra of all three pigments fit the A1 chromophore profile. These results are similar to those found previously for T. sirtalis (Sillman et al. 1997), except that the long-wavelength pigment is blue-shifted by ~12 nm and the short-wavelength pigment is red-shifted by ~6 nm, but they differ from previous MSP results in other snakes (see Supplementary Results). The long- and short-wavelength pigments for both species are likely to be LWS and SWS1, respectively, based on their λmax values and presence in other snakes, but the identity of the 482 nm pigment is unclear.

Figure 2.3. Normalized absorbance spectra of (A) middle-wavelength visual pigment from intact photoreceptor cells measured by MSP and (B) in vitro expressed rhodopsin (RH1) from Thamnophis proximus. The filled circles and smooth curves of (A) are for the best-fit visual pigments calculated from A1-based template data. The λmax values are the average of measurements from multiple cells as shown in Table S2.1. The peak absorbance (λmax) of (B) was estimated by Govardovskii curve fitting.

2.4.3 RH1, LWS, and SWS1 expressed in Thamnophis proximus eye RNA and RH1 maintained under normal selective pressures

Three full-length visual pigment genes were isolated from eye RNA using a combination of degenerate and RACE primers. These were identified using BLAST searches followed by phylogenetic analysis with other reptilian and vertebrate opsin sequences. These analyses identified the three opsin genes in T. proximus to be LWS, SWS1, and RH1 (GenBank accession numbers: KU306727, KU306728, and KU306726, respectively, Figs. S2.3–2.5). Thamnophis proximus RH1 grouped with other snake RH1 sequences, and was most closely related to the king cobra sequence, as expected based on the inferred species relationships (Fig. S2.3) (Pyron et al. 2013). The identification of an RH1 gene in the all-cone retina of T. proximus was surprising. Although terrestrial vertebrates typically have RH1 pigments that absorb maximally around 500 nm (Bowmaker 2008), this finding raised the possibility that the 482 nm pigment identified by MSP may in fact be a highly blue-shifted rhodopsin. Thamnophis proximus RH1 has several distinctive residues including S185 and S292. A292S is known to cause a substantial blue-shift of λmax in other vertebrate rhodopsins (Sugawara et al. 2005), while C185S has been shown to reduce transducin activation in vitro when mutated in bovine RH1 (Karnik et al. 1988).

To determine whether expression in an all-cone retina altered evolutionary constraints on RH1, we analyzed selection patterns with PAML random-sites, branch, branch-site, and clade models (Fig. S2.3; Table S2.2). The M0 model found an average ω (the ratio of nonsynonymous to synonymous substitution rates, or dN/dS) of 0.07 and significant rate variation across sites (M3 vs. M0, Table S2.2), as expected for a protein-coding gene under strong selective constraint. No evidence was found for positive selection on RH1 (ω > 1) either alignment-wide (M2a vs. M1a, M8 vs. M7, Table S2.2) or in snakes, caenophidians, or T. proximus specifically, with the branch-site test. We found no evidence for loss of function in T. proximus RH1, which would be expected to result in an increased ω along this lineage; instead the ω values for T. proximus did not differ significantly from the background with either model (Table S2.2), which is consistent with conserved function. This indicates that the RH1 gene in T. proximus is under strong selective constraint similar to other vertebrates despite it being expressed in an apparently all-cone retina.

2.4.4 Thamnophis proximus rhodopsin is functional with a highly blue-shifted λmax

In order to determine whether the T. proximus RH1 gene isolated from retinal mRNA encodes a functional visual pigment, the gene was ligated into the p1D4-hrGFP II expression vector (Morrow and Chang 2010) and heterologously expressed in HEK293T cells. Thamnophis proximus RH1 properly bound and regenerated with 11-cis-retinal, producing a dark absorbance spectrum with a λmax of 481 nm (Fig. 2.3, Fig. S2.6). This value is consistent with the MSP estimate of 482 nm for a subset of small single cone photoreceptors (Fig. 2.3), strongly implying that RH1 is expressed in these cells. When bleached with light, the λmax of T. proximus RH1 shifted to approximately 380 nm, representing the biologically active metarhodopsin II intermediate, and indicative of proper visual pigment function (Wald et al. 1955).
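As an illustration of how a λmax estimate of this kind can be obtained from an absorbance spectrum, the following minimal Python sketch fits an A1 visual pigment template by grid search. It assumes the α-band template and constants as commonly cited from Govardovskii et al. (2000), which should be checked against that publication before use. The function names, wavelength grid, and synthetic spectrum are hypothetical, and the sketch is not the fitting procedure used to produce Fig. 2.3.

```python
import math


def govardovskii_a1(wavelength_nm, lambda_max):
    """Alpha-band absorbance of an A1 pigment, normalised to ~1 at lambda_max.

    Constants are assumed from the commonly cited Govardovskii et al. (2000)
    template; the beta band is deliberately omitted in this sketch.
    """
    x = lambda_max / wavelength_nm
    a = 0.8795 + 0.0459 * math.exp(-((lambda_max - 300.0) ** 2) / 11940.0)
    return 1.0 / (
        math.exp(69.7 * (a - x))
        + math.exp(28.0 * (0.922 - x))
        + math.exp(-14.9 * (1.104 - x))
        + 0.674
    )


def fit_lambda_max(wavelengths, absorbances, lo=350.0, hi=650.0, step=0.5):
    """Return the candidate lambda_max with the smallest squared error."""
    best, best_err = lo, float("inf")
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        candidate = lo + i * step
        err = sum(
            (govardovskii_a1(w, candidate) - y) ** 2
            for w, y in zip(wavelengths, absorbances)
        )
        if err < best_err:
            best, best_err = candidate, err
    return best


# Hypothetical, noise-free spectrum generated from the template itself.
wavelengths = list(range(420, 621, 5))
synthetic = [govardovskii_a1(w, 481.0) for w in wavelengths]
fitted = fit_lambda_max(wavelengths, synthetic)
print(f"fitted lambda_max: {fitted:.1f} nm")
```

With real spectra, the data would first be baseline-corrected and normalized, and a continuous optimizer would usually replace the grid search; the grid is used here only to keep the sketch dependency-free.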

2.4.5 Rhodopsin and rod transducin are expressed in ‘cone’ photoreceptor cells

To further explore the possibility that components of the rod phototransduction cascade may be expressed in cone photoreceptors, we performed immunohistochemistry on retinal cryosections using two different antibodies: a rhodopsin antibody (4D2), and a rod-specific transducin antibody (K20).

As a positive control, we labelled mouse retina with both anti-rhodopsin (4D2) and anti-rod-transducin (K20) antibodies (Fig. 2.4A‒D). We found RH1 localized to the rod outer segments and rod-transducin localized to the inner segments, which was expected based on previous immunohistochemical characterization of mouse retina using these antibodies (Rosenzweig et al. 2009). As mouse retinas are highly rod-dominated, both RH1 and rod transducin were continuously distributed across the photoreceptor layer (Fig. 2.4D).

In T. proximus, staining for RH1 (4D2) was found in a small proportion of the ‘cone’ photoreceptor cells. Staining was localized to the outer segment (Fig. 2.4F). This is consistent with previously unexplained staining of T. sirtalis retinas (see Supplementary Results) (Sillman et al. 1997). Rod transducin (K20) was also found in a subset of the cone photoreceptor cells, where staining was localized primarily to the inner segment and cell body of the photoreceptor (Fig. 2.4G). The presence of rod transducin in the inner segment is expected from retina exposed to light, unlike cone transducin, which does not translocate to the inner segment (Lobanova et al. 2010). This further supports the specificity of K20 for rod transducin to the exclusion of cone transducin. Double staining and analysis of the confocal z-stack revealed that RH1 and rod transducin are present in the same cells and that there is some overlap of their localizations (Fig. 2.4H‒J). Combined with our MSP, sequencing, and in vitro expression results, the immunohistochemical results support the hypothesis that T. proximus RH1 is expressed in a ‘cone’ photoreceptor cell.

Figure 2.4. Immunohistochemical staining of control (mouse, A–D) and Thamnophis proximus (E–K) transverse retinal cryosections with rhodopsin (4D2) and rod-specific transducin (K20) antibodies. Rhodopsin is found in a subset of ‘cone’ cells localized to the outer segment (F). Rod-specific transducin is also found in a subset of these cells localized primarily to the inner segment (G). Double staining indicates that both rhodopsin and rod-specific transducin are found within the same cells (H) and this is confirmed in individual slices from the Z-stack (I,J). The section in (K) shows the broad distribution of cells containing rhodopsin and rod transducin. Nuclear staining is shown in blue, rhodopsin (4D2) staining is shown in red, and rod-specific transducin (K20) staining is shown in green.

2.4.6 A subset of small single ‘cones’ have rod ultrastructure

To further test the hypothesis that the rhodopsin-bearing ‘cones’ are actually derived from rods, we examined the ultrastructure of the photoreceptors using TEM. Four different cone types were identified: double cones (Fig. S2.7A), large single cones (Fig. S2.7B,C), and two types of small single cone (Fig. S2.7B,C). The double cones, large single cones, and first type of the small single cone had the expected morphology, that is, small tapering outer segments and bulbous inner segments with large ellipsoids (Fig. S2.7A‒C) (Wong 1989; Sillman et al. 1997). These cones also had the expected lamellar structure, where the outer segment discs were open to the plasma membrane on one side (Fig. S2.7D‒F, arrows). The other type of small single cone was noticeably distinct. These cells tended to have less tapered outer segments and inner segments that were less bulbous and closer in width to the outer segments (Fig. 2.5, Fig. S2.7C). Additionally, the outer segment discs of these cells were completely enclosed by plasma membrane (Fig. 2.5, arrows), which is a feature that is otherwise exclusive to, and characteristic of, rods (Sillman et al. 1997; Röll 2000). Collectively, these results suggest that these cells are actually ‘transmuted’ cone-like rods rather than true cones.

Figure 2.5. Transmission electron microscope (TEM) image of the outer segment of a Thamnophis proximus photoreceptor cell with rod ultrastructure. The arrows indicate the complete enclosure of the discs by plasma membrane, which is a feature exclusive to rods.

2.5 Discussion

In this study we present several lines of evidence, both experimental and computational, to support the evolutionary transmutation of rods into ‘cone-like’ photoreceptors in colubrid snakes. We found that despite a lack of apparent rod photoreceptors in its all-cone retina, which we confirmed by SEM, T. proximus possesses a rhodopsin gene (RH1), in addition to two cone opsins (SWS1, LWS). Immunofluorescent staining demonstrated that RH1 is present in the outer segments of a subset of ‘cone’ photoreceptor cells in T. proximus retina. Another rod-specific component of the phototransduction cascade, rod transducin, was found to co-localize in the same subset of photoreceptors. Despite its unusual expression in an all-cone retina, comparative sequence analyses showed T. proximus RH1 to be under strong selective constraint indicative of a functionally conserved protein-coding gene. When heterologously expressed in vitro, T. proximus RH1 was found to encode a photoactive visual pigment that is substantially blue-shifted in its absorption maximum, matching our spectral MSP measurements of intact photoreceptors. Finally, while the general morphology of the photoreceptors was indicative of an all-cone retina, close examination of the ultrastructure of individual cells using TEM revealed that a subset of ‘cones’ in fact had rod features, including outer segment discs that were completely enclosed by plasma membrane.

The finding that RH1 is expressed in a previously reported all-cone retina of the diurnal colubrid T. proximus raises several possible alternative hypotheses to those proposed in Figure 2.1. The simplest is that RH1 is a non-functional pseudogene. Our molecular evolutionary analyses, however, indicate that RH1 has been maintained under strong selective constraint and we found no evidence for a relaxation of selection. This implies that T. proximus RH1 is functional. To confirm this, we heterologously expressed T. proximus RH1 and found that it can bind retinal and activate in response to light. Another alternative is that, along with the loss of rods in diurnal colubrids, RH1 was relegated to a solely non-visual role (e.g., maintenance of circadian rhythm) (Bertolucci and Foa 2004). Immunohistochemical staining of T. proximus retina revealed the presence of RH1 within ‘cone’ photoreceptors, which strongly suggests that this is not the case. Lastly, RH1 may have been co-opted for expression in cones, possibly even co-expressed with a cone opsin. The co-expression of multiple types of cone opsin within individual cone cells has been found in rodents (Lukáts et al. 2005), salamanders (Isayama et al. 2014), and cichlid fishes (Dalton et al. 2014), but co-expression of a cone opsin and RH1 has not been reported. The presence of rod transducin along with rhodopsin implies that other components of the rod transduction machinery would have had to be co-opted as well. However, the finding of rod-specific ultrastructure argues against a simple shift in expression of rod-specific transduction machinery into a different cell type, though this idea could be addressed in future cell developmental studies. Currently, the most parsimonious explanation of our results is that the rhodopsin-containing ‘cones’ of T. proximus are homologous to the rods of pythons and boas; that is, they are actually ‘cone-like’ rods.

While this study is the first molecular evidence of an evolutionary shift from rod to cone morphology, a transition in the opposite direction has been shown in nocturnal geckos. Geckos are hypothesized to have evolved from a lizard ancestor with an all-cone retina and to have evolved an all-rod retina during adaptation to a nocturnal lifestyle (Walls 1942). A series of papers have shown that gecko ‘all-rod’ retinas contain only cone opsins and cone phototransduction machinery (Kojima et al. 1992; Loew 1994; Zhang et al. 2006), that the ‘rods’ have cone ultrastructural features (Röll 2000), and that they function at a level intermediate between true rods and cones (Zhang et al. 2006). These findings support Walls' (1942) contention that gecko ‘rod’ photoreceptors are ‘transmuted’ cones.

The evolutionary alterations in gecko ‘rods’ are similar in nature to those found in our study in a subset of rhodopsin-staining ‘cones’, but in the opposite direction. Thamnophis proximus rhodopsin-staining ‘cones’ have outer segments that resemble cones, but with rod ultrastructural features, and contain rod phototransduction machinery. Several intriguing and atypical features of the rod machinery within these photoreceptors are also consistent with a more cone-like function. The highly blue-shifted absorption spectrum of T. proximus RH1, unique among terrestrial vertebrates, is a shift toward wavelengths generally occupied by the cone opsin RH2, which is suggestive of a more cone-like physiology. Thamnophis proximus RH1 also has the mutation C185S, which has been shown to reduce transducin activation in bovine RH1 (Karnik et al. 1988); reduced transducin activation is more typical of cone opsins. Furthermore, the only electrophysiological study of Thamnophis (performed in T. sirtalis) (Jacobs et al. 1992) found no evidence for a separate rod (scotopic) visual response. Although these data all point to more cone-like characteristics, despite the rod machinery and ultrastructure, it is clear that further study is needed to explore the functional consequences of this evolutionary transition in Thamnophis and other diurnal colubrids.

A common property of photoreceptor ‘transmutation’ appears to be substantial morphological changes to the outer segment. The correlation of rod-like cellular morphology with nocturnal species and cone-like morphology with diurnal species (Walls 1942) suggests a functional relevance to outer segment shape. Enlarged, rod-like outer segments are known to increase sensitivity by increasing cell volume and, as a result, the number of visual pigment molecules available to catch photons (Wen et al. 2009; Lamb 2013). Recent theoretical work has proposed that the small tapering outer segments of cones may help to reduce self-screening of the visual pigments, increase signal-to-noise ratios, and allow light to more efficiently be focused on the outer segment by the ellipsoid (Hárosi and Novales Flamarique 2012). Interestingly, recent work has also suggested that reduction of RH1 expression alone can result in a more cone-like morphology, decreasing the photosensitivity of the cell and increasing the kinetics of the phototransduction cascade (Wen et al. 2009; Makino et al. 2012; Rakshit and Park 2015). A second striking difference in rod and cone morphology is the accessibility of the outer segment discs to the plasma membrane. In cones the discs are open, which contributes to rapid response kinetics, whereas in rods the complete enclosure of the discs results in increased sensitivity to light (Lamb 2013). In the rod-like cones of nocturnal geckos, the discs are partially enclosed, and this may contribute to their intermediate physiological properties. In T. proximus, the discs of the cone-like rods remain enclosed by the plasma membrane, but the extent to which this slows responses, and how it may have been overcome, would be an interesting area for future research.

The question remains as to why diurnal colubrids and nocturnal geckos have modified their rods and cones when many other groups that have transitioned between diurnality and nocturnality have not. Goldsmith (1990) proposed that opsin gene loss might be a prerequisite for photoreceptor transmutation. At the time it was known that geckos had lost the RH1 and SWS2 opsins, but in this context it is interesting to note that snakes have also experienced opsin loss (RH2 and SWS2), likely as a result of their proposed burrowing origins (Walls 1942; Davies et al. 2009; Castoe et al. 2013; Simões et al. 2015). Because the diurnal ancestors of geckos had already lost RH1, the advantage of transmuting cones into rods when adapting to a nocturnal lifestyle is clear. Nearly all highly diurnal animals, however, maintain a population of rods (Bowmaker 2008), presumably because even highly diurnal animals may encounter, or be active in, dim-light environments. In fact, only diurnal squamates are thought to have lost rods and thus have all-cone retinas (with the possible exception of the stellate sturgeon) and only geckos are known to have lost RH1 (Bowmaker 2008). Thus, the change to cone-like rods in diurnal snakes, and the corresponding reduction in dim-light visual capabilities, is unusual.

The extraordinary evolutionary shift from a duplex to an all-cone retina might be explained by the ancestral loss of the SWS2 and RH2 cone opsins in snakes, which results in low sensitivity to a large portion of the visual spectrum due to the lack of appreciable overlap between the LWS and SWS1 cone opsins (Fig. 2.6A). Not only would this largely preclude colour vision, it would also severely limit the amount of visible light snakes would be sensitive to. In primarily nocturnal snakes this may not be an issue, but in highly diurnal snakes, such as Thamnophis, there may be a significant advantage to increasing the range of spectral sensitivity. Inclusion of RH1 in the daylight (photopic) absorption spectrum would greatly enhance the range of spectral sensitivity, and provide the basis for trichromatic colour vision (Fig. 2.6B). This would also help to explain the unusual blue-shifted absorption spectrum of T. proximus RH1. It is the most blue-shifted RH1 found so far in any terrestrial vertebrate, and it is also highly blue-shifted relative to other snake groups that tend to have burrowing and nocturnal habits, such as the sunbeam snake (Davies et al. 2009). The substantial blue-shift could be important for chromatic discrimination and colour vision, resulting in more even spacing in spectral tuning with LWS and SWS1 opsins. This effect on chromatic discrimination could be further enhanced by the slight red- and blue-shifting of SWS1 and LWS, respectively, relative to other snakes, such as the python (Fig. 2.6). It is not known whether diurnal colubrids possess colour vision or if the rod neural pathways in snakes, or more generally reptiles, can contribute to colour vision. However, there is evidence that suggests that rods can contribute to colour vision (McKee et al. 1977; Cao et al. 2008). For example, human cone monochromats (individuals with only SWS1 cones and RH1 rods) are able to perceive colour under mesopic conditions where both the rods and cones are active (Reitner et al. 1991). If rods are similarly able to contribute to colour vision in snakes, the transition to cone-like rods may have provided an additional adaptive advantage, but testing this hypothesis will require both studies of retinal pathways in snakes and behavioural tests for colour vision.
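The spectral gap argument can be illustrated numerically. The minimal Python sketch below takes the three T. proximus λmax values measured by MSP (366, 482, and 542 nm) and compares the highest normalized pigment absorbance at a wavelength inside the gap, with and without the 482 nm pigment. It assumes the A1 α-band template and constants as commonly cited from Govardovskii et al. (2000), and it ignores the β band, ocular media, pigment density, and photoreceptor abundance. The probe wavelength of 430 nm and all function names are hypothetical choices made for illustration, and the comparison uses T. proximus values for both cases rather than Python values.

```python
import math


def a1_template(wavelength_nm, lambda_max):
    """A1 alpha-band template (constants assumed from Govardovskii et al. 2000)."""
    x = lambda_max / wavelength_nm
    a = 0.8795 + 0.0459 * math.exp(-((lambda_max - 300.0) ** 2) / 11940.0)
    return 1.0 / (
        math.exp(69.7 * (a - x))
        + math.exp(28.0 * (0.922 - x))
        + math.exp(-14.9 * (1.104 - x))
        + 0.674
    )


# lambda_max values (nm) reported for T. proximus by MSP in this chapter.
PIGMENTS = {"SWS1": 366.0, "RH1": 482.0, "LWS": 542.0}


def best_absorbance(wavelength_nm, pigment_names):
    """Highest normalised absorbance among the named pigments at one wavelength."""
    return max(a1_template(wavelength_nm, PIGMENTS[name]) for name in pigment_names)


probe_nm = 430  # hypothetical probe wavelength inside the ~380-480 nm gap
without_rh1 = best_absorbance(probe_nm, ["SWS1", "LWS"])
with_rh1 = best_absorbance(probe_nm, ["SWS1", "RH1", "LWS"])
print(f"{probe_nm} nm without RH1: {without_rh1:.2f}, with RH1: {with_rh1:.2f}")
```

Under these simplifying assumptions, the 482 nm pigment contributes most of the absorbance at the probe wavelength, which is the pattern that Fig. 2.6 depicts with full spectra.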

The unexpected results presented in this study that reveal a hidden class of photoreceptors in a previously characterized all-cone retina provide tantalizing clues to the diverse evolutionary pathways through which sensory adaptations may be achieved. Here we have shown that the ‘all-cone’ retina of a diurnal colubrid evolved through modification of the rod photoreceptors, which may have allowed recovery of visual function that was lost during the presumed fossorial origins of snakes. Sensory systems in general may be particularly vulnerable to the need to compensate for ancestral loss of function in response to shifts in ecology. For example, a recent study showed that although sweet taste receptors were lost in the avian ancestor, hummingbirds have reacquired the ability to taste sweet compounds through modification of their savoury taste receptor (Baldwin et al. 2014). The peculiar adaptive transitions necessitated by ancestral loss demonstrate how fascinating evolutionary novelty may arise even out of the limitations imposed by accidents of history.

Figure 2.6. Absorption spectra of Python (A) and Thamnophis proximus (B). Spectra are based on Govardovskii curves and illustrate the large gap in appreciable bright-light spectral sensitivity in Python between ~380–480 nm (A) that is filled by the presence of a blue-shifted rhodopsin expressed in a cone-like photoreceptor in T. proximus (B). This gap is further decreased, with a corresponding increase in spectral overlap between pigments, by slight red-shifting of the SWS1 and slight blue-shifting of the LWS pigments relative to Python. Python λmax values from Sillman et al. (1999).

2.6 Materials and Methods

Also see Supplementary Materials and Methods for detailed descriptions.

2.6.1 Animals

Adult Thamnophis proximus were obtained from a licensed retailer and euthanized. Eyes were extracted and prepared either for MSP, RNA extraction, or electron microscopy. Blood was collected for gDNA extraction.

2.6.2 Microspectrophotometry

Methodology used for MSP measurements and analyses has been described previously (Loew 1994; Sillman et al. 1997).

2.6.3 Phylogenetic and molecular evolutionary analyses

Full-length RH1, LWS, and SWS1 coding sequences were sequenced from total RNA extracted from T. proximus eyes, or from gDNA, utilizing standard PCR, RACE, and GenomeWalker (Clontech) procedures. A representative set of vertebrate RH1, LWS, and SWS1 sequences was aligned with the T. proximus sequences and gene trees were estimated with MrBayes 3 (Ronquist et al. 2012). The RH1 gene tree and alignment were analyzed with the codeml program of the PAML 4 package (Yang 2007) using the random-sites, branch, and branch-site models (Zhang et al. 2005), as well as clade model C (CmC) (Bielawski and Yang 2004). Model pairs were compared using a likelihood ratio test (LRT) with a χ² distribution.
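The likelihood ratio test used to compare nested model pairs can be sketched as follows. The test statistic is twice the difference in log-likelihood between the alternative and null models, and its p-value is taken from a χ² distribution with degrees of freedom equal to the difference in the number of free parameters. The log-likelihood values and degrees of freedom in the example are hypothetical placeholders and are not results from this study; the function names are likewise illustrative, and the χ² tail probability is computed with the standard library only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) times a finite sum of y**k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: complementary error function plus a finite correction sum
    total = math.erfc(math.sqrt(y))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (j - 0.5) / math.gamma(j + 0.5)
    return total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, p-value) for a nested model pair."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods for a null and an alternative model (df = 2).
statistic, p_value = likelihood_ratio_test(-2500.0, -2497.0, df=2)
print(f"2*delta lnL = {statistic:.2f}, p = {p_value:.4f}")
```

In practice the log-likelihoods would be read from the codeml output for each model, and the degrees of freedom depend on the specific model pair being compared.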

2.6.4 Rhodopsin expression and spectroscopic assay

Rhodopsin was expressed and spectroscopically assayed as previously described (Morrow and Chang 2010; Morrow et al. 2011).

2.6.5 Immunohistochemistry

Retinae from T. proximus were processed for immunohistochemistry following sucrose infiltration. Stained cryosections were visualized via a Leica TCS SP8 confocal laser microscope. Primary antibodies used were the K20 antibody (Santa Cruz Biotechnology) and the 4D2 anti-rhodopsin antibody. AlexaFluor-488 goat anti-rabbit (Life Technologies Inc.) and Cy-3 anti-mouse (Jackson Immunoresearch) antibodies were used as secondary antibodies.

2.6.6 Electron microscopy

Hemisections of T. proximus retinae were prepared for SEM and TEM following standard procedures. A detailed protocol is available in the Supplementary Materials and Methods. SEM samples were examined with the Hitachi S2500 and images acquired using a Quartz PCI. TEM sections were examined with a Hitachi H7000 and images acquired using a digital camera (Advanced Microscopy Techniques).

2.7 Acknowledgements

This work was supported by a Natural Sciences and Engineering Research Council (NSERC) Discovery grant (BSWC), an Ontario Graduate Scholarship (RKS), a Vision Science Research Program Scholarship (RKS), a grant from the Deutsche Forschungsgemeinschaft (JM), and NSERC Summer Undergraduate Research Awards (NC, MX). The 11-cis-retinal was generously provided by Rosalie Crouch (Medical University of South Carolina), and the 4D2 anti-rhodopsin antibody by David McDevitt (University of Pennsylvania). We would like to thank the anonymous reviewers for their feedback and suggestions.

2.8 References

Ahnelt PK, Kolb H. 2000. The mammalian photoreceptor mosaic-adaptive design. Prog Retin Eye Res 19:711-777.

Allison WT, Barthel LK, Skebo KM, Takechi M, Kawamura S, Raymond PA. 2010. Ontogeny of cone photoreceptor mosaics in zebrafish. J Comp Neurol 518:4182-4195.

Baldwin MW, Toda Y, Nakagita T, O'Connell MJ, Klasing KC, Misaka T, Edwards SV, Liberles SD. 2014. Evolution of sweet taste perception in hummingbirds by transformation of the ancestral umami receptor. Science 345:929-933.

Bennis M, Molday RS, Versaux-Botteri C, Reperant J, Jeanny JC, McDevitt DS. 2005. Rhodopsin-like immunoreactivity in the 'all cone' retina of the chameleon (Chameleo chameleo). Exp Eye Res 80:623-627.

Bertolucci C, Foa A. 2004. Extraocular photoreception and circadian entrainment in nonmammalian vertebrates. Chronobiol Int 21:501-519.

Bielawski JP, Yang Z. 2004. A maximum likelihood method for detecting functional divergence at individual codon sites, with application to gene family evolution. J Mol Evol 59:121-132.

Bowmaker JK. 2008. Evolution of vertebrate visual pigments. Vision Res 48:2022-2041.

Bugra K, Jacquemin E, Ortiz JR, Jeanny JC, Hicks D. 1992. Analysis of opsin messenger-RNA and protein expression in adult and regenerating newt retina by immunology and hybridization. J Neurocytol 21:171-183.

Cao D, Pokorny J, Smith VC, Zele AJ. 2008. Rod contributions to color perception: linear with rod contrast. Vision Res 48:2586-2592.

Castoe TA, de Koning APJ, Hall KT, Card DC, Schield DR, Fujita MK, Ruggiero RP, Degner JF, Daza JM, Gu WJ, Reyes-Velasco J, Shaney KJ, Castoe JM, Fox SE, Poole AW, Polanco D, Dobry J, Vandewege MW, Li Q, Schott RK, Kapusta A, Minx P, Feschotte C, Uetz P, Ray DA, Hoffmann FG, Bogden R, Smith EN, Chang BSW, Vonk FJ, Casewell NR, Henkel CV, Richardson MK, Mackessy SP, Bronikowski AM, Yandell M, Warren WC, Secor SM, Pollock DD. 2013. The Burmese python genome reveals the molecular basis for extreme adaptation in snakes. Proc Natl Acad Sci U S A 110:20645-20650.

Chang BSW, Du J, Weadick CJW, Muller J, Bickelmann C, Yu DD, Morrow JM. 2012. The future of codon models in studies of molecular function: ancestral reconstruction and clade models of functional divergence. In: Cannarozzi GM, Schneider A, editors. Codon evolution: mechanisms and models. Oxford: Oxford University Press. p. 145-163.

Dalton BE, Loew ER, Cronin TW, Carleton KL. 2014. Spectral tuning by opsin coexpression in retinal regions that view different parts of the visual field. Proc R Soc B 281:20141980.

Darwin C. 1859. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. London: John Murray.

Davies WL, Cowing JA, Bowmaker JK, Carvalho LS, Gower DJ, Hunt DM. 2009. Shedding light on serpent sight: the visual pigments of henophidian snakes. J Neurosci 29:7519-7525.

Elias R, Sezate S, Cao W, McGinnis J. 2004. Temporal kinetics of the light/dark translocation and compartmentation of arrestin and alpha-transducin in mouse photoreceptor cells. Mol Vis 10:672-681.

Goldsmith TH. 1990. Optimization, constraint, and history in the evolution of eyes. Q Rev Biol 65:281-322.

Hárosi FI, Novales Flamarique I. 2012. Functional significance of the taper of vertebrate cone photoreceptors. J Gen Physiol 139:159-187.

Hart NS, Coimbra JP, Collin SP, Westhoff G. 2012. Photoreceptor types, visual pigments, and topographic specializations in the retinas of hydrophiid sea snakes. J Comp Neurol 520:1246-1261.

Hauser FE, van Hazel I, Chang BS. 2014. Spectral tuning in vertebrate short wavelength-sensitive 1 (SWS1) visual pigments: can wavelength sensitivity be inferred from sequence data? J Exp Zool B Mol Dev Evol 322:529-539.

Hicks D, Molday RS. 1986. Differential immunogold dextran labeling of bovine and frog rod and cone cells using monoclonal-antibodies against bovine rhodopsin. Exp Eye Res 42:55-71.

Hicks D, Sparrow J, Barnstable CJ. 1989. Immunoelectron microscopical examination of the surface distribution of opsin in rat rod photoreceptor cells. Exp Eye Res 49:13-29.

Isayama T, Chen Y, Kono M, Fabre E, Slavsky M, DeGrip WJ, Ma JX, Crouch RK, Makino CL. 2014. Coexpression of three opsins in cone photoreceptors of the salamander Ambystoma tigrinum. J Comp Neurol 522:2249-2265.

Jacobs GH, Fenwick JA, Crognale MA, Deegan JF. 1992. The all-cone retina of the garter snake - spectral mechanisms and photopigment. J Comp Phys A 170:701-707.

Karnik SS, Sakmar TP, Chen HB, Khorana HG. 1988. Cysteine residue-110 and residue-187 are essential for the formation of correct structure in bovine rhodopsin. Proc Natl Acad Sci U S A 85:8459-8463.

Kerov V, Artemyev NO. 2011. Diffusion and light-dependent compartmentalization of transducin. Mol Cell Neurosci 46:340-346.

Knight JK, Raymond PA. 1990. Time course of opsin expression in developing rod photoreceptors. Development 110:1115-1120.

Kojima D, Okano T, Fukada Y, Shichida Y, Yoshizawa T, Ebrey TG. 1992. Cone visual pigments are present in gecko rod cells. Proc Natl Acad Sci U S A 89:6841-6845.

Kram YA, Mantey S, Corbo JC. 2010. Avian cone photoreceptors tile the retina as five independent, self-organizing mosaics. PLoS One 5:e8992.

Lamb TD. 2013. Evolution of phototransduction, vertebrate photoreceptors and retina. Prog Retin Eye Res 36:52-119.

Lobanova ES, Herrmann R, Finkelstein S, Reidel B, Skiba NP, Deng WT, Jo R, Weiss ER, Hauswirth WW, Arshavsky VY. 2010. Mechanistic basis for the failure of cone transducin to translocate: why cones are never blinded by light. J Neurosci 30:6815-6824.

Loew ER. 1994. A third, ultraviolet-sensitive, visual pigment in the Tokay gecko (Gekko gekko). Vision Res 34:1427-1431.

Löytynoja A, Vilella AJ, Goldman N. 2012. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics 28:1684-1691.

Lukáts A, Szabo A, Rohlich P, Vigh B, Szél A. 2005. Photopigment coexpression in mammals: comparative and developmental aspects. Histol Histopathol 20:551-574.

Ma JX, Znoiko S, Othersen KL, Ryan JC, Das J, Isayama T, Kono M, Oprian DD, Corson DW,

Cornwall MC, Cameron DA, Harosi FI, Makino CL, Crouch RK. 2001. A visual pigment

expressed in both rod and cone photoreceptors. Neuron 32:451-461.

Makino CL, Wen XH, Michaud NA, Covington HI, DiBenedetto E, Hamm HE, Lem J, Caruso

G. 2012. Rhodopsin Expression Level Affects Rod Outer Segment Morphology and

Photoresponse Kinetics. PLoS One 7.

McDevitt DS, Brahma SK, Jeanny JC, Hicks D. 1993. Presence and foveal enrichment of rod

opsin in the all-cone retina of the american chameleon. Anat Rec 237:299-307.

McKee SP, McCann JJ, Benton JL. 1977. Color-vision from rod and long-wave cone

interactions: conditions in which rods contribute to multi-colored images. Vision Res

17:175-185.

Morrow JM, Chang BS. 2010. The p1D4-hrGFP II expression vector: a tool for expressing and

purifying visual pigments and other G protein-coupled receptors. Plasmid 64:162-169.

Morrow JM, Lazic S, Chang BS. 2011. A novel rhodopsin-like gene expressed in zebrafish

retina. Vis Neurosci 28:325-335.

New ST, Hemmi JM, Kerr GD, Bull CM. 2012. Ocular anatomy and retinal photoreceptors in a

skink, the sleepy lizard (Tiliqua rugosa). Anat Rec 295:1727-1735.

Pyron RA, Burbrink FT, Wiens JJ. 2013. A phylogeny and revised classification of Squamata,

including 4161 species of lizards and snakes. BMC Evol Biol 13:93.

Rakshit T, Park PSH. 2015. Impact of Reduced Rhodopsin Expression on the Structure of Rod

Outer Segment Disc Membranes. Biochemistry 54:2885-2894.

Reitner A, Sharpe LT, Zrenner E. 1991. Is color-vision possible with only rods and blue-

sensitive cones. Nature 352:798-800.

Röll B. 2000. Gecko vision-visual cells, evolution, and ecological constraints. J Neurocytol

29:471-484.

Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed

models. Bioinformatics 19:1572-1574.

Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L,

Suchard MA, Huelsenbeck JP. 2012. MrBayes 3.2: Efficient Bayesian Phylogenetic

Inference and Model Choice Across a Large Model Space. Syst Biol 61:539-542.

Rosenzweig DH, Nair KS, Levay K, Peshenko IV, Crabb JW, Dizhoor AM, Slepak VZ. 2009.

Interaction of retinal guanylate cyclase with the alpha subunit of transducin: potential role

in transducin localization. Biochem J 417:803-812.

Saïdi T, Mbarek S, Chaouacha-Chekir RB, Hicks D. 2011. Diurnal rodents as animal models of human central vision: characterisation of the retina of the sand rat Psammomys obesus. Graefes Arch Clin Exp Ophthalmol 249:1029-1037.

Schott RK, Refvik SP, Hauser FE, Lopez-Fernandez H, Chang BS. 2014. Divergent positive

selection in rhodopsin from lake and riverine cichlid fishes. Mol Biol Evol 31:1149-1165.

Sillman AJ, Carver JK, Loew ER. 1999. The photoreceptors and visual pigments in the retina of

a boid snake, the ball python (Python regius). J Exp Biol 202:1931-1938.

Sillman AJ, Govardovskii VI, Rohlich P, Southard JA, Loew ER. 1997. The photoreceptors and visual pigments of the garter snake (Thamnophis sirtalis): a microspectrophotometric, scanning electron microscopic and immunocytochemical study. J Comp Phys A 181:89-101.

Sillman AJ, Johnson JL, Loew ER. 2001. Retinal photoreceptors and visual pigments in Boa

constrictor imperator. J Exp Zool 290:359-365.

Simões BF, Sampaio FL, Jared C, Antoniazzi MM, Loew ER, Bowmaker JK, Rodriguez A, Hart

NS, Hunt DM, Partridge JC, Gower DJ. 2015. Visual system evolution and the nature of

the ancestral snake. J Evol Biol 28:1309-1320.

Stavenga DG, Wilts BD. 2014. Oil droplets of bird eyes: microlenses acting as spectral filters. Philos Trans R Soc Lond B Biol Sci 369:20130041.

Sugawara T, Terai Y, Imai H, Turner GF, Koblmuller S, Sturmbauer C, Shichida Y, Okada N.

2005. Parallelism of amino acid changes at the RH1 affecting spectral sensitivity among

deep-water cichlids from Lakes Tanganyika and Malawi. Proc Natl Acad Sci U S A

102:5448-5453.

Tansley K. 1964. The gecko retina. Vision Res 4:33-37.

Toomey MB, Lind O, Frederiksen R, Curley RW Jr, Riedl KM, Wilby D, Schwartz SJ, Witt CC, Harrison EH, Roberts NW, Vorobyev M, McGraw KJ, Cornwall MC, Kelber A, Corbo JC. 2016. Complementary shifts in photoreceptor spectral tuning unlock the full adaptive potential of ultraviolet vision in birds. eLife 5.

Underwood G. 1970. The Eye. In: Gans C, editor. Biology of the Reptilia. New York: Academic

Press. p. 1-97.

Vogalis F, Shiraki T, Kojima D, Wada Y, Nishiwaki Y, Jarvinen JLP, Sugiyama J, Kawakami K, Masai I, Kawamura S, Fukada Y, Lamb TD. 2011. Ectopic expression of cone-specific G-protein-coupled receptor kinase GRK7 in zebrafish rods leads to lower photosensitivity and altered responses. Journal of Physiology-London 589:2321-2348.

Wada Y, Okano T, Fukada Y. 2000. Phototransduction molecules in the pigeon deep brain. J

Comp Neurol 428:138-144.

Wald G, Brown PK, Smith PH. 1955. Iodopsin. J Gen Physiol 38:623-681.

Walls GL. 1942. The vertebrate eye and its adaptive radiation. Bloomfield Hills, MI: Cranbrook

Institute of Science.

Wen XH, Shen LX, Brush RS, Michaud N, Al-Ubaidi MR, Gurevich VV, Hamm HE, Lem J,

DiBenedetto E, Anderson RE, Makino CL. 2009. Overexpression of Rhodopsin Alters the

Structure and Photoresponse of Rod Photoreceptors. Biophys J 96:939-950.

Wong ROL. 1989. Morphology and distribution of neurons in the retina of the American garter

snake Thamnophis sirtalis. J Comp Neurol 283:587-601.

Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24:1586-1591.

Zhang J, Nielsen R, Yang Z. 2005. Evaluation of an improved branch-site likelihood method for

detecting positive selection at the molecular level. Mol Biol Evol 22:2472-2479.

Zhang X, Wensel TG, Yuan C. 2006. Tokay gecko photoreceptors achieve rod-like physiology

with cone-like proteins. Photochem Photobiol 82:1452-1460.

2.9 Supplementary Results

2.9.1 Thamnophis proximus possesses three visual pigments

Previous MSP on Python regius (ball python) and Boa constrictor imperator (common northern boa), both of which are nocturnal, found three visual pigments in three photoreceptor types (Sillman et al. 1999; Sillman et al. 2001). A long-wavelength pigment with a λmax of 551 and 549 nm, respectively, was found in large single cones (Sillman et al. 1999; Sillman et al. 2001), an 8–10 nm red-shift relative to T. proximus, but similar to the value for T. sirtalis. Double cones are not present in python or boa retinas. In small single cones a short-wavelength pigment was found with a λmax of 360 and 357 nm, respectively, for Python and Boa (Sillman et al. 1999; Sillman et al. 2001), a 6–9 nm blue-shift relative to T. proximus, but again similar to the value for T. sirtalis. The third pigment was rhodopsin, found in rod cells, with a λmax of 494 and 495 nm, respectively (Sillman et al. 1999; Sillman et al. 2001).

Recently, MSP was performed on two species of sea snakes (Hart et al. 2012). Sea snakes are caenophidian snakes that belong to the family Elapidae and are more closely related to colubrid snakes than they are to pythons and boas. The pattern of cell types in the sea snakes is very similar to that found for Thamnophis, with double cones, large single cones, and two types of small single cones. Three visual pigments were identified. A long-wavelength pigment with a λmax of 555–559 nm was found in the large single cones and double cones. A medium-wavelength pigment with a λmax of 496 nm was found in one of the types of small single cones and a short-wavelength pigment with a λmax of 428–430 nm was found in the other type (Hart et al. 2012). The λmax of the long-wavelength pigment is very similar to that of Thamnophis and of pythons and boas, but that of the short-wavelength pigment, if it is the same pigment present in these snakes, is red-shifted into the violet. This would be the first instance of a violet-type SWS1 in non-avian reptiles, as other non-avian reptile SWS1 pigments, including that of Thamnophis, are UV-type (Hauser et al. 2014). The authors suggest that this may be an adaptation resulting from the attenuation of short wavelengths with increased depth (Hart et al. 2012). The λmax of the medium-wavelength pigment is very similar to that of the rhodopsin pigment of pythons and boas, suggesting that it is rhodopsin, but this pigment is expressed in cones (Hart et al. 2012). Measurements of the medium-wavelength single cones show that they have smaller ellipsoids than expected, which the authors suggest may be evidence that they are 'transmuted' rods. Whether the medium-wavelength pigment identified in Thamnophis, and in the sea snakes, is rhodopsin cannot be determined from MSP data alone.

2.9.2 Rhodopsin and rod transducin are expressed in ‘cone’ photoreceptor cells

In light of our results, previously unexplained staining of T. sirtalis retina by Sillman et al. (1997) is consistent with the expression of rhodopsin in the all-cone retina. Sillman et al. (1997) performed immunohistochemical staining of T. sirtalis retina using a variety of different antibodies including three raised against rhodopsin (AO, B6, and K42-41), but did not stain for rod transducin. Using this approach, they identified two distinct populations of small single cones, one that reacted strongly with AO, B6, and K42-41 and one that reacted only weakly or not at all (Sillman et al. 1997). Based on our results, we can conclude that the population of cones that reacted strongly with the three antibodies raised against rhodopsin were likely rhodopsin-expressing cones homologous to those identified here in T. proximus. Combined with our MSP, sequencing, and in vitro expression results, the immunohistochemical results of both Sillman et al. (1997) and the present study strongly support the hypothesis that Thamnophis rhodopsin is expressed in a cone cell that is actually derived from a rod.

2.10 Supplementary Materials and Methods

2.10.1 Animals

Adult Thamnophis proximus were obtained from a licensed retailer and euthanized under approval of the University of Toronto Animal Care Committee. Eyes were extracted and prepared either for MSP, RNA extraction, or electron microscopy. Blood was collected for genomic DNA extraction.

2.10.2 Microspectrophotometry

The methodology used for MSP measurements and analyses has been described previously (Loew 1994; Sillman et al. 1997). Briefly, after dark adaptation, snakes were sacrificed and the retinas were extracted and fixed on slides. Absorbance spectra of individual photoreceptor cells were measured and plotted as absorption curves, from which the sensitivity range and wavelength of maximum absorption were inferred (Loew 1994; Sillman et al. 1997).

2.10.3 Opsin isolation and sequencing

Total mRNA was extracted from Thamnophis proximus eye tissue using the Qiagen RNeasy kit and QiaShredder columns. cDNA libraries were constructed using the SMART cDNA Library Construction Kit (Clontech). Genomic DNA was extracted using the QIAamp Blood Mini Kit. Degenerate primers were designed from an alignment of tetrapod RH1, RH2, LWS, and SWS1 genes (Table S2.3). Hot start Taq DNA polymerases AmpliTaq Gold (Applied Biosystems) or FastStart (Roche) were used under standard PCR conditions. Reactions were visualized on agarose gels and DNA fragments extracted and purified (Qiagen QIAquick). Gene fragments were cloned and ligated into TOPO-TA vectors (Invitrogen) or pJET (Fermentas) and transformed into One Shot TOP10 or Mach1 competent cells (Invitrogen). Sequencing was performed with BigDye Terminator (ABI) reagents in the forward and reverse directions on a 3730 DNA Analyzer (Applied Biosystems). To minimize artifacts from sequencing, multiple clones were sequenced and sequencing errors and ambiguities were eliminated. To obtain the 5' and 3' ends, RACE (rapid amplification of cDNA ends) was performed with specific, nested primers designed based on the sequenced fragments of each gene (Table S2.3). RACE was performed using the SMART RACE cDNA amplification kit (Clontech) under standard conditions. GenomeWalker (Clontech) was additionally used to initially obtain the 5' end of RH1.

2.10.4 Phylogenetic and molecular evolutionary analyses

A representative set of vertebrate rhodopsin (RH1), LWS, and SWS1 sequences was obtained from GenBank. These sequences were aligned with the RH1, LWS, and SWS1 genes sequenced from Thamnophis proximus using PAGAN codon alignment (Löytynoja et al. 2012). The poorly aligned 5' and 3' ends of the sequence were manually trimmed. In order to confirm the identities of the genes from T. proximus, gene trees were estimated using the resulting PAGAN alignments in MrBayes 3 (Ronquist and Huelsenbeck 2003; Ronquist et al. 2012) using reversible jump MCMC with a gamma rate parameter (nst=mixed, rates=gamma), which explores the parameter space for the nucleotide model and the phylogenetic tree simultaneously. The analyses were each run for five million generations with a 25% burn-in. Convergence was confirmed by checking that the standard deviations of split frequencies approached zero and that there was no obvious trend in the log likelihood plot. The RH1 gene tree was used to further analyze the evolution of T. proximus, and other snake, rhodopsins.

To estimate the strength and form of selection acting on RH1, the gene tree and alignment were analyzed with the codeml package of PAML 4 (Yang 2007) using the random sites models (M0, M1a, M2a, M3, M7, M8a, and M8), the branch and branch-site models (Zhang et al. 2005), and clade model C (CmC) (Bielawski and Yang 2004). Comparisons between the PAML random sites models were used to test for variation in ω (M3 vs M0) and for the presence of a positively selected class of sites (M2a vs M1a, and M8 vs M7 and M8a). All analyses were repeated at least three times with varying initial starting points of κ (transition to transversion rate ratio) and ω (the nonsynonymous to synonymous rate ratio, dN/dS) to avoid potential local optima. The model pairs were compared using a likelihood ratio test (LRT) with a χ² distribution.
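The likelihood ratio test described above can be sketched in a few lines. The following is a minimal illustration, not the code used in this study: the function names `chi2_sf` and `lrt` are hypothetical, the χ² upper-tail probability is computed from the standard closed-form series for integer degrees of freedom, and the log-likelihoods are the rounded values reported in Table S2.2.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite Poisson-type sum.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def lrt(lnl_null, lnl_alt, df):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its p-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# M8 vs M7 (2 degrees of freedom), lnL values from Table S2.2.
stat, p = lrt(-12029.8, -12024.2, 2)
print(f"M8 vs M7: 2*dlnL = {stat:.1f}, p = {p:.3f}")

# CmC_Snake vs M2a_rel (1 degree of freedom).
stat, p = lrt(-12030.6, -12026.4, 1)
print(f"CmC_Snake vs M2a_rel: 2*dlnL = {stat:.1f}, p = {p:.3f}")
```

Because the tabulated log-likelihoods are rounded to one decimal place, p-values recomputed this way may differ slightly in the last digit from those reported.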

The branch, branch-site, and clade models were used to test for changes in selective constraint and positive selection in snakes, caenophidians, and T. proximus by placing them in a separate foreground partition. The branch model estimates a single ω value for each branch and/or clade type specified a priori. This model is useful for testing for overall changes in selective constraint between branches/clades. The branch-site and clade models allow ω to vary both among sites and between branches/clades. The branch-site model has four site classes: 0) 0 < ω0 < 1 for all branches; 1) ω1 = 1 for all branches; 2a) ω2a = ω2b ≥ 1 in the foreground and 0 < ω2a = ω0 < 1 in the background; and 2b) ω2b = ω2a ≥ 1 in the foreground and ω2b = ω1 = 1 in the background. This model provides a test for positive selection on specified branches/clades. CmC assumes that some sites evolve conservatively across the phylogeny (two classes of sites where 0 < ω0 < 1 and ω1 = 1), while a class of sites is free to evolve differently among two or more partitions (e.g., ωD1 > 0 and ωD1 ≠ ωD2 > 0), which can be branches, clades, or a mix of both. Rather than a test for positive selection, this provides a test for divergent selective pressure (although a test for positive selection can be performed if desired; see Chang et al. (2012)). For further explanation of the methods and partitioning see Schott et al. (2014).

2.10.5 Rhodopsin expression and spectroscopic assay

The full-length RH1 sequence was amplified from cDNA and inserted into the pJET1 cloning vector (Fermentas). The sequence was re-amplified using primers that added the BamHI and EcoRI restriction sites to its 5' and 3' ends, respectively, and inserted into the p1D4-hrGFP II expression vector following Morrow and Chang (2010). Expression vectors containing T. proximus rhodopsin were transiently transfected into cultured HEK293T cells using Lipofectamine 2000 (Invitrogen; 12 µg of DNA per 10-cm plate) and harvested after 48 h. A total of 48 plates were used and concentrated using a Centrifugal Filter Device (Amicon). Visual pigments were regenerated with 11-cis-retinal, solubilized in 1% n-dodecyl-β-D-maltoside, and purified with the 1D4 monoclonal antibody as previously described (Morrow and Chang 2010; Morrow et al. 2011). The ultraviolet-visible absorption spectra of purified visual pigments were recorded using a Cary 4000 double beam spectrophotometer (Agilent). Dark-light difference spectra were calculated by subtracting light-bleached absorbance spectra from the respective dark spectra. Pigments were photoexcited with light from a fiber optic lamp (Dolan-Jenner) for 60 s at 25°C.
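The dark-light difference calculation described above amounts to a pointwise subtraction of two spectra sampled at the same wavelengths. The sketch below is illustrative only: the function names and absorbance values are hypothetical toy data, not measurements from this study, and taking the wavelength of the largest difference is a crude stand-in for fitting a visual pigment template to estimate λmax.

```python
def difference_spectrum(dark, bleached):
    """Subtract the light-bleached absorbance from the dark absorbance, pointwise."""
    if len(dark) != len(bleached):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return [d - b for d, b in zip(dark, bleached)]


def peak_wavelength(wavelengths, values):
    """Return the wavelength at which the given spectrum is largest."""
    index = max(range(len(values)), key=values.__getitem__)
    return wavelengths[index]


# Hypothetical toy spectra (absorbance units) on a coarse wavelength grid (nm).
wavelengths = [440, 460, 480, 500, 520, 540]
dark = [0.031, 0.044, 0.052, 0.047, 0.033, 0.018]
bleached = [0.012, 0.011, 0.010, 0.009, 0.008, 0.008]

diff = difference_spectrum(dark, bleached)
print(peak_wavelength(wavelengths, diff))
```

Subtracting the bleached spectrum removes absorbance that does not change on illumination, so the difference spectrum isolates the light-sensitive pigment.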

2.10.6 Immunohistochemistry

Four retinas from two dark-adapted T. proximus were processed for immunohistochemistry. Retinas from CD-1 mice were processed as a positive control. After enucleation of eyes in the light, the eyecups were fixed overnight at 4°C in 4% paraformaldehyde in PBS. Eyes were then infiltrated with increasing concentrations of sucrose in PBS and embedded in a 2:1 solution of 30% sucrose and O.C.T. compound (Tissue-Tek) at -20°C. The eyes were cryosectioned transversely into 20 µm sections using a Leica CM3050 cryostat. Sections were blocked in 2% normal goat serum with 1% BSA in PDT for 1 h, incubated with primary antibody diluted in blocking solution overnight at 4°C, and then with secondary antibody for 1 h at 37°C. Sections were stained with 10 µg/mL Hoechst (Jackson Immunoresearch) and mounted with ProLong Gold Antifade mounting media (Life Technologies). Sections were visualized via a Leica TCS SP8 confocal laser microscope. Primary antibodies used were the K20 antibody (Santa Cruz Biotechnology) and the 4D2 anti-rhodopsin antibody. AlexaFluor-488 goat anti-rabbit (Life Technologies Inc.) and Cy-3 anti-mouse (Jackson Immunoresearch) were used as secondary antibodies.

The antibody used to detect rhodopsin, 4D2, is a monoclonal mouse antibody raised against the N-terminal domain of bovine rhodopsin. It has been shown to selectively label rod outer segments, but not cone outer segments (Hicks and Molday 1986). This specificity has been shown for a wide variety of vertebrates including mammals (Hicks and Molday 1986; Hicks et al. 1989), fish (Knight and Raymond 1990), amphibians (Hicks and Molday 1986; Bugra et al. 1992; Ma et al. 2001), and reptiles (McDevitt et al. 1993; Bennis et al. 2005; New et al. 2012). Furthermore, its specificity has been verified in squamates through immunoblotting of anole retina (McDevitt et al. 1993).

The rod transducin antibody we used, K20, is an affinity-purified rabbit polyclonal antibody raised against amino acid positions 75-125 of the Gαt1 subunit of the human rod G protein transducin. K20 has been shown to be specific for rods in both mammalian (Elias et al. 2004; Lobanova et al. 2010; Saïdi et al. 2011) and non-mammalian (Wada et al. 2000) species without any labelling of cone cells, and has been used to calculate expression of rod transducin in zebrafish retina through immunoblotting (Vogalis et al. 2011) and for labelling of rod transducin in Xenopus rods (Kerov and Artemyev 2011).

2.10.7 Electron microscopy

Eight T. proximus retinas were prepared for SEM and two for TEM. Eyes were hemisected and the retina separated from its pigmented epithelium. Retinas were fixed in 3% glutaraldehyde overnight at room temperature, rinsed with phosphate buffer (0.1 M, pH 7.8), and postfixed in 1.0% osmium tetroxide for 1 h at room temperature. The retinas were then dehydrated with increasing concentrations of ethanol. Tissues for SEM were infiltrated with a hexamethyldisilazane (HMDS) series and allowed to volatilize overnight. The retina was then positioned with the photoreceptors facing outward and sputter-coated with gold-palladium using the Bal-Tec SCD050. The sample was examined with the Hitachi S2500 at 20 kV and images acquired using a Quartz PCI. Tissues for TEM were embedded with modified Spurr's epoxy resin. Semithin sections (0.5–1 µm) were stained with Toluidine blue (Fisher BioReagents) and methylene blue (British Drug House), and ultrathin sections (60–90 nm) were stained with 3% uranyl acetate in 50% methanol and post-stained with Reynolds' lead citrate. Sections were examined with the Hitachi H7000 at 75 kV and images acquired using an AMT 11 megapixel digital camera (Advanced Microscopy Techniques).

2.11 Supplementary Figures

Figure S2.1. Scanning electron microscope images of Thamnophis proximus retina at increasing magnifications illustrating the all-cone photoreceptor population.

Figure S2.2. Normalized visual pigment absorbance spectra measured using microspectrophotometry (MSP) on intact photoreceptor cells from the long- (A) and short- (B) wavelength visual pigments of Thamnophis proximus. The filled circles and smooth curves are for the best-fit visual pigments calculated from vitamin-A1-based template data. The λmax values are the average of measurements from multiple cells as shown in Table S2.1.

Figure S2.3. Rhodopsin gene tree estimated using Bayesian inference illustrating the position of Thamnophis proximus RH1. This tree topology was used for the selection analyses. Numbers at the nodes are posterior probability percentages. Species [GenBank accession number]: Alligator mississippiensis [U23802], Ambystoma tigrinum [U36574], Anas platyrhynchos [XM_005012054], Anolis carolinensis [L31503], Bos taurus [NM_001014890], Bufo bufo [U59921], Bufo marinus [U59922], Caluromys philander [AY313946], Chrysemys picta bellii [XM_008168043], Columba livia [AF149230], Corvus macrorhynchos [AB555651], Cynops pyrrhogaster [AB043890], Falco cherrug [XM_005443603], Felis catus [NM_001009242], Gallus gallus [NM_001030606], Homo sapiens [NM_000539], Latimeria chalumnae [AF131253], Loxodonta africana [NM_001280858], Monodelphis domestica [XM_001366188], Mus musculus [NM_145383], Neoceratodus forsteri [EF526295], Ophiophagus hannah (Castoe et al. 2013), Ornithorhynchus anatinus [NM_001127627], Pelodiscus sinensis [XM_006132837], Python molurus bivittatus (Castoe et al. 2013), Python regius [FJ497236], Rana temporaria [U59920], Sarcophilus harrisii [XM_003762449], Sminthopsis crassicaudata [AY159786], Tachyglossus aculeatus [JX103830], Taeniopygia guttata [NM_001076695], Thamnophis proximus [This study: KU306726], Xenopeltis unicolor [FJ497233], Xenopus laevis [NM_001087048], Xenopus tropicalis [NM_001097334].

Figure S2.4. LWS gene tree estimated using Bayesian inference illustrating the position of Thamnophis proximus. Numbers at the nodes are posterior probability percentages. Species [GenBank accession number]: Alligator mississippiensis [XM_006269029], Ambystoma tigrinum [AF038947], Anolis carolinensis [XM_008103916], Bos taurus [NM_174566], Chrysemys picta bellii [XM_005281282], Columba livia [AH007800], Cynops pyrrhogaster [AB043891], Felis catus [NM_001009871], Gallus gallus [NM_205440], Gekko gecko [M92036], Homo sapiens LWS [NM_020061], Homo sapiens MWS [NM_000513], Loxodonta africana [NM_001280862], Monodelphis domestica [NM_001145081], Mus musculus [NM_008106], Neoceratodus forsteri [EF526297], Ophiophagus hannah (Castoe et al. 2013), Ornithorhynchus anatinus [NM_001127625], Pelodiscus sinensis [XM_006113208], Phelsuma madagascariensis longinsulae [AF074043], Python regius [FJ497238], Sarcophilus harrisii [XM_003774721], Sminthopsis crassicaudata [EU232013], Tachyglossus aculeatus [EU636011], Taeniopygia guttata [AF222333], Thamnophis proximus [This study: KU306727], Uta stansburiana [DQ129869], Xenopeltis unicolor [FJ497235], Xenopus laevis [NM_001090645], Xenopus tropicalis [NM_001102861].

Figure S2.5. SWS1 gene tree estimated using Bayesian inference illustrating the position of Thamnophis proximus. Numbers at the nodes are posterior probability percentages. Species [GenBank accession number]: Ambystoma tigrinum [AF038948], Anolis carolinensis [AH007736], Bos taurus [NM_174567], Chelonia mydas [XM_007067421], Chrysemys picta bellii [XM_005281289], Columba livia [AH007798], Corvus brachyrhynchos [XM_008637700], Cynops pyrrhogaster [AB052889], Falco cherrug [XM_005446545], Felis catus [BK006813], Gallus gallus [NM_205438], Gekko gecko [AY024356], Homo sapiens [NM_001708], Loxodonta africana [NM_001280859], Monodelphis domestica [NM_001145084], Mus musculus [NM_007538], Neoceratodus forsteri [EF526298], Ophiophagus hannah (Castoe et al. 2013), Phelsuma madagascariensis longinsulae [AF074045], Python regius [FJ497237], Rana catesbeiana [AB001983], Sarcophilus harrisii [XM_003771592], Sminthopsis crassicaudata [AY442173], Taeniopygia guttata [AF222331], Thamnophis proximus [This study: KU306728], Uta stansburiana [DQ100325], Xenopeltis unicolor [FJ497234], Xenopus laevis [XLU23463], Xenopus tropicalis [NM_001126076].

Figure S2.6. Dark absorption spectrum of in vitro expressed Thamnophis proximus rhodopsin. Inset, light-dark difference spectrum.

Figure S2.7. Transmission electron microscope (TEM) images of Thamnophis proximus photoreceptor cells. A, double cone showing accessory (a) and principal (p) members. B, large single cone (ls) and small single cone (ss), with the inner (IS) and outer (OS) segments of the photoreceptor cell demarcated. Note the short, tapering outer segments and large, bulbous inner segments. C, large single cone and small single cone of a different type. Note that this type of small single cone has a more rod-like outer segment, and much less bulbous inner segment. D–F, close-ups of the outer segments of a large single cone (D) and small single cone of the first type (E, F), noting the lamellar structure of the open discs (arrows).

2.12 Supplementary Tables

Table S2.1. Estimates of peak absorbance (λmax) from individual photoreceptor cells for each of the different cone types as measured by microspectrophotometry (MSP).

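λmax of individual cells (nm) is given in columns 1–8.

| Cell Type | Pigment Type | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Mean | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SS | UV | 366 | 364 | 364 | 366 | - | - | - | - | 365 | 1.2 |
| SS | MW | 483 | 483 | 482 | 480 | 484 | 480 | 483 | 483 | 482 | 1.5 |
| LS | LW | 542 | 544 | - | - | - | - | - | - | 543 | 1.4 |
| DC-P | LW | 544 | 544 | 545 | 543 | 542 | 545 | 544 | 544 | 544 | 1.0 |
| DC-A | LW | 541 | 544 | 543 | 545 | 544 | - | - | - | 544 | 1.5 |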

Abbreviations: SS, small single cone; LS, large single cone; DC-P, double cone, principal member; DC-A, double cone, accessory member; SD, standard deviation.
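The Mean and SD columns of Table S2.1 can be recomputed from the per-cell values. The snippet below is a minimal sketch using Python's standard `statistics` module; the dictionary and function names are hypothetical, and it assumes the tabulated SD is the sample standard deviation (n − 1 denominator), which matches the four rows shown.

```python
import statistics

# Per-cell lambda-max values (nm) copied from four rows of Table S2.1.
msp_lambda_max = {
    "SS UV": [366, 364, 364, 366],
    "SS MW": [483, 483, 482, 480, 484, 480, 483, 483],
    "LS LW": [542, 544],
    "DC-P LW": [544, 544, 545, 543, 542, 545, 544, 544],
}


def summarize(values):
    """Return (mean rounded to the nearest nm, sample SD rounded to 0.1 nm)."""
    mean = round(statistics.mean(values))
    sd = round(statistics.stdev(values), 1)  # sample SD, n - 1 denominator
    return mean, sd


for label, cells in msp_lambda_max.items():
    mean, sd = summarize(cells)
    print(f"{label}: n = {len(cells)}, mean = {mean} nm, SD = {sd}")
```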

Table S2.2. Results of analyses of selection of rhodopsin using PAML random sites, branch, and clade models.

The three parameter columns (ω0/p, ω1/q, ω2/ωp/ωd) together form the Parameters² group.

| Model & Foreground¹ | lnL | ω0/p | ω1/q | ω2/ωp/ωd | Null | p [df]³ |
|---|---|---|---|---|---|---|
| M0 | -12378.0 | 0.07287 | - | - | N/A | - |
| M1a | -12227.8 | 0.056 (94.1%) | 1 (5.9%) | - | M0 | 0.000* [1] |
| M2a | -12227.8 | 0.06 (94.1%) | 1 (4.6%) | 1 (1.3%) | M1a | 1 [2] |
| M2a_rel | -12030.6 | 0.00 (63%) | 1 (2%) | 0.17 (35%) | M1a | 0.000* [2] |
| M3 | -12022.9 | 0.00 (61%) | 0.14 (34%) | 0.55 (0.05%) | M0 | 0.000* [4] |
| M7 | -12029.8 | 0.23 | 2.21 | - | N/A | - |
| M8 | -12024.2 | 0.25 | 3.20 | 1 (1%) | M7 | 0.004* [2] |
| Br_Snake | -12369.7 | 0.07; Snake: 0.05 | - | - | M0 | 0.000* [1] |
| Br_Caen | -12371.2 | 0.07; Caen: 0.20 | - | - | M0 | 0.000* [1] |
| Br_Tham | -12376.2 | 0.07; Tham: 0.17 | - | - | M0 | 0.055 [1] |
| CmC_Snake | -12026.4 | 0.00 (62.7%) | 1 (2.2%) | 0.16 (35.1%); Snake: 0.29 | M2a_rel | 0.004* [1] |
| CmC_Caen | -12026.9 | 0.00 (62.6%) | 1 (2.1%) | 0.16 (35.1%); Caen: 0.40 | M2a_rel | 0.007* [1] |
| CmC_Tham | -12029.3 | 0.00 (62.6%) | 1 (2.3%) | 0.16 (35.1%); Tham: 0.37 | M2a_rel | 0.107 [1] |

¹ The foreground partition is listed after the underscore for the branch and clade models and consists of either snakes (Snake), caenophidians (Caen), or Thamnophis proximus (Tham). All other sequences are present in the background partition.

² ω values of each site class are shown for models M0–M3 (ω0–ω2), with the proportion of each site class in parentheses. For M7 and M8, the shape parameters, p and q, which describe the beta distribution, are listed instead. In addition, the ω value for the positively selected site class (ωp, with the proportion of sites in parentheses) is shown for M8. The branch model has only a single site class (ω0), but this is allowed to vary between the foreground and background partitions. For CmC, ωD is the divergent site class, which has a separate value for the foreground and background partitions.

³ Significant p-values (α = 0.05) are marked with an asterisk (*). Degrees of freedom are given in square brackets after the p-values.

Abbreviations: lnL, ln likelihood; p, p-value; N/A, not applicable.

Table S2.3. Primers used for isolation of Thamnophis proximus opsins.

| Primer Name | Gene | Sequence |
|---|---|---|
| VertRho23F | RH1 | CCCTTCGAGTATCCCCARTAYTA |
| VertRho113R | RH1 | CCMAGMGTRGCRAAGAAGCCYTC |
| VertRho221R | RH1 | CANASSAGGCGYCCRTAGCAGAA |
| SnakeRho39F | RH1 | GCCTTGGCCGCGTACATGTTTCTT |
| SnakeRho64F | RH1 | CAACACAAGAAACTCAGAACACCC |
| SnakeRho103R | RH1 | GCATCCTACTGTCCCAAAAATGAA |
| VertRG168F | LWS | TGCGCTCCTCCiATHTTYGG |
| VertRG238F | LWS | AAGGAGTCTGARTCiACiCARAARGC |
| VertRG313R | LWS | GCGGAACTGTCGATTCATRAAiACRTADAT |
| GarterRG_5F | LWS | AAAGAGTCTGAATCAACACAGAAG |
| GarterRG_6F | LWS | AAAAGCGCCACCATTTACAACCCA |
| GarterRG_1R | LWS | CAACCGCACGGATAGCCATCCAC |
| GarterRG_2R | LWS | GTGAAAGGCATAGCCTGGATTGG |
| TetUV232F | SWS1 | GCCGTGGCCGCiCARCARCARGA |
| AvesUV306R | SWS1 | GAACTGCTTGTTCATRAARCARTA |
| GarterUV_12F | SWS1 | GTGCTTGGGAGGATGCCCGTAGAA |
| GarterUV_71F | SWS1 | AATATCACATCGCCCCCATGTGG |
| GarterUV_1075R | SWS1 | TTGGTTGCGTTCCAACGTGCAGAG |
| GarterUV_1091R | SWS1 | GGCATAGTCATCGTCTTGGTTGC |
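Several primers in Table S2.3 contain IUPAC ambiguity codes (e.g., R, Y, N), so each represents a pool of distinct oligonucleotides. The sketch below counts the size of that pool for a given primer. It is an illustration only: the function name `degeneracy` is hypothetical, and it assumes that the lowercase "i" in the table denotes inosine, which is treated here as a single base that adds no sequence variants.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer:
        if base in "iI":
            # Assumption: 'i' marks inosine, counted as one (universal) base.
            continue
        total *= len(IUPAC[base.upper()])
    return total


for name, seq in [
    ("VertRho23F", "CCCTTCGAGTATCCCCARTAYTA"),
    ("VertRho221R", "CANASSAGGCGYCCRTAGCAGAA"),
    ("VertRG168F", "TGCGCTCCTCCiATHTTYGG"),
    ("SnakeRho39F", "GCCTTGGCCGCGTACATGTTTCTT"),
]:
    print(f"{name}: {degeneracy(seq)}-fold degenerate")
```

A fully specified primer such as SnakeRho39F has a degeneracy of 1, whereas the degenerate Vert primers cover several sequence variants, which is what allows them to amplify across divergent tetrapod sequences.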

Chapter 3 Targeted capture of complete coding regions across divergent species

Citation: Schott RK, B Panesar, DC Card, M Preston, TA Castoe, BSW Chang. 2017. Targeted capture of complete coding regions across divergent species. Genome Biology and Evolution 9:398–414.

Author Contributions: Conceived and designed the study: RKS, BSWC. Obtained and processed tissue and genetic samples: RKS, DCC, TAC, BSWC. Designed the probes: RKS. Developed the bioinformatics pipeline: RKS, BP, MP. Analyzed the data: RKS. Wrote the manuscript: RKS, DCC, TAC, BSWC, with approval from all authors.

3.1 Abstract

Despite continued advances in sequencing technologies, there is a need for methods that can efficiently sequence large numbers of genes from diverse species. One approach to accomplish this is targeted capture (hybrid enrichment). While these methods are well established for genome resequencing projects, cross-species capture strategies are still being developed and generally focus on the capture of conserved regions, rather than complete coding regions from specific genes of interest. The resulting data are thus useful for phylogenetic studies, but the wealth of comparative data that could be used for evolutionary and functional studies is lost. Here we design and implement a targeted capture method that enables recovery of complete coding regions across broad taxonomic scales. Capture probes were designed from multiple reference species and extensively tiled in order to facilitate cross-species capture. Using novel bioinformatics pipelines we were able to recover nearly all of the targeted genes with high completeness from species that were up to 200 myr divergent. Increased probe diversity and tiling for a subset of genes had a large positive effect on both recovery and completeness. The resulting data produced an accurate species tree, but importantly these same data can also be applied to studies of molecular evolution and function that will allow researchers to ask larger questions in broader phylogenetic contexts. Our method demonstrates the utility of cross-species approaches for the capture of full-length coding sequences, and will substantially improve the ability of researchers to conduct large-scale comparative studies of molecular evolution and function.

3.2 Introduction

It is difficult, in terms of the amount of resources needed, to study the evolution of a large number of complete genes from a large number of taxa, but continued advances in next-generation sequencing (NGS) technology have made this approach more feasible within reasonable time-frames and budgets. Despite these advances, sequencing entire genomes is generally too time-consuming, and too costly, on comparative taxonomic scales, and produces much more data than necessary for most evolutionary questions. PCR, on the other hand, still excels at sequencing small numbers of genes, but quickly becomes cost-ineffective when large numbers of genes are required, while primer design and optimization become time-inefficient across divergent species (Mamanova et al. 2010; Shen et al. 2013). As a result, there is a need for methods that can efficiently sequence a large set of genes of interest from a large number of species. Currently, such data are limited to the relatively small number of sequenced genomes and a growing number of transcriptomes. RNA-Seq (Wang et al. 2009) is becoming increasingly popular for comparative studies (e.g., Kunstner et al. 2010; Brousseau et al. 2014; Gallant et al. 2014; Gerstein et al. 2014; LoVerso and Cui 2015; Yang et al. 2015; Havird and Sloan 2016; Phillips et al. 2016; Wu et al. 2016), but has several downsides, including a reliance on fresh tissue samples and variation in transcript expression levels. However, there are many methods that can be utilized to target, enrich, and capture specific sections of the genome (for reviews see Mamanova et al. 2010; Teer et al. 2010; Mertes et al. 2011). Targeted capture (also called targeted enrichment, hybrid enrichment, or sequence capture) is one of these methods that has been shown to perform well and is gaining popularity (Albert et al. 2007; Hodges et al. 2007; Okou et al. 2007; Porreca et al. 2007; Gnirke et al. 2009; Summerer et al. 2009; Mamanova et al. 2010; Nijman et al. 2010; Teer et al. 2010; Teer and Mullikin 2010; Kenny et al. 2011; Mason et al. 2011; Mertes et al. 2011; Bi et al. 2012; Bundock et al. 2012; Crawford et al. 2012; Cronn et al. 2012; Faircloth et al. 2012; Grover et al. 2012; Lemmon et al. 2012; McCormack et al. 2012; Rohland and Reich 2012; Li et al. 2013; Ilves and Lopez-Fernandez 2014; Penalba et al. 2014; Bragg et al. 2016; Portik et al. 2016).

Targeted capture is a method to selectively enrich the genome for particular regions of interest by using a set of DNA or RNA probes as bait (Gnirke et al. 2009). This can either be done on a microarray (Albert et al. 2007; Hodges et al. 2007; Okou et al. 2007) or in solution (Gnirke et al. 2009), but the principle is the same. The probes are designed to complement the region(s) of interest, whether a small section of the genome or the entire set of protein coding genes (the exome). The probes are then allowed to hybridize with a gDNA library that has been fragmented to produce inserts in the range of 200–700 bp. Inserts that fail to hybridize are washed away, thus selectively enriching the genome for the regions of interest. Sequencing can then proceed normally, including multiplexing many samples to increase efficiency.

Hybrid enrichment was originally proposed, and has been most widely used, to capture and resequence the human exome (e.g., Albert et al. 2007; Hodges et al. 2007; Okou et al. 2007; Porreca et al. 2007; Gnirke et al. 2009), and has since been applied to whole exome sequencing in other model species for applications such as variant discovery and population genetics (for reviews see Warr et al. 2015; Jones and Good 2016). Applications of whole exome sequencing to related species show a decline in performance with even small amounts of divergence (Vallender 2011; Jin et al. 2012; Jones and Good 2016). Consequently, capture across divergent species requires modifications and tends to focus on more conserved sequences or smaller targets. Several cross-species approaches have been developed and used both at broad taxonomic scales to capture highly conserved (e.g., Lemmon et al. 2012) and ultraconserved (e.g., Crawford et al. 2012; Faircloth et al. 2012; McCormack et al. 2012) regions, at narrow taxonomic scales to capture the mitochondrial genome (Mason et al. 2011), and at varying scales to capture individual exons and partial coding sequences (e.g., Bi et al. 2012; Li et al. 2013; Ilves and Lopez-Fernandez 2014; Penalba et al. 2014; Bragg et al. 2016; Hugall et al. 2016; Portik et al. 2016). The focus of each of these methods, however, is solely on producing data for phylogenetic studies. As a result, they use automated methods to select targets favorable for capture, rather than using complete coding regions from specific genes of interest. The resulting data is thus useful for phylogenetic studies, but the wealth of comparative data that could be used for molecular evolutionary and functional studies is lost.

Here we adapt and expand this method to selectively enrich complete coding regions from specific genes of interest across broad taxonomic scales. The focus on specific genes allows selection of sequences associated with aspects of organismal physiology that can be used to address various research questions. The capture of specific complete coding regions, however, presents unique experimental and computational challenges compared to capturing computationally selected conserved regions or exons. Regions that have high divergence, are predicted to have poor hybridization, or that are too short cannot simply be excluded as is typically done (e.g., Lemmon et al. 2012; Ilves and Lopez-Fernandez 2014; Hugall et al. 2016). The computational assembly of the data is also more complex, as the individual exons need to be assembled into a continuous sequence while removing the intronic sequence that will be enriched alongside the targeted exons. To address these issues, we use a unique probe design strategy and develop novel bioinformatics pipelines for the assembly and analysis of complete coding regions. We evaluate the effects of different assembly algorithms and references, and compare our method to alternative approaches, namely whole genome sequencing (WGS) and RNA-Seq.

The method we develop allows complete coding regions to be captured from a set of genes of interest, while maintaining the ability to capture across divergent species. The resulting data is still useful for phylogenetic analyses, but importantly can also be applied to studies of molecular evolution and function that will allow researchers to ask larger questions in a broader phylogenetic context.

3.3 Results and Discussion

Here we adapt solution-based targeted capture following Gnirke et al. (2009), Lemmon et al. (2012), and Faircloth et al. (2012), to selectively sequence complete coding regions from specific genes of interest across divergent species. We took the unique approach of manually curating a set of genes of interest. This differs from the approach employed by other cross-species targeted capture methods, which target ultra (e.g., Faircloth et al. 2012) or highly (e.g., Lemmon et al. 2012) conserved regions, or individual exons with specific properties (e.g., Li et al. 2013; Ilves and Lopez-Fernandez 2014; Bragg et al. 2016; Hugall et al. 2016; Portik et al. 2016) compiled using purely computational means, and also provides benefits over alternative NGS strategies (Table 3.1). The focus of the method presented here is on sequencing the complete coding regions from genes of specific interest for broader evolutionary applications, in addition to phylogenetic reconstruction. A total of 166 genes of interest, composed of 1435 individual exons, were targeted. These included visual, housekeeping, and phylogenetic marker genes (Table S3.1).

Probes were designed from the individual exons comprising the coding regions of each of these genes as shown in Figure 3.1. To facilitate capture of complete coding regions across divergent species, probes were designed from multiple reference species following Lemmon et al. (2012). This resulted in an increased diversity of sequences comprising the probes targeting each sequence and may have allowed hybridization to occur in regions that were otherwise too divergent, or missing, in a specific reference (Fig. 3.1C). Probes were extensively tiled (10x) across the reference sequences, which similarly increased the diversity of probe sequences available for hybridization, and may have both compensated for hybridization issues with individual probes (e.g., secondary structure, GC content) and allowed capture across divergent regions (Fig. 3.1C). Exons that were shorter than the probe length of 120 bp could not be tiled and instead had to be padded with non-homologous sequence to increase the length to 120 bp (Fig. 3.1B). In total we targeted 3888 exons from the reference species, which resulted in 45,895 probes after tiling and boosting to normalize coverage (Schott et al. 2017).
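The tiling and padding rules described above can be illustrated with a minimal sketch. It is not the probe design code used in this study: the function name `tile_exon`, the 12 bp step (120 bp divided by 10x tiling), and the use of "N" as placeholder padding are assumptions made only for illustration, and the boosting step used to normalize coverage is not modeled.

```python
PROBE_LEN = 120  # probe length stated in the text (bp)


def tile_exon(exon, tiling=10, pad_base="N"):
    """Return a list of 120 bp probe sequences for one exon.

    Assumptions (hypothetical, for illustration only):
    - 10x tiling is approximated by a step of PROBE_LEN // tiling bases.
    - Short exons are padded at the 3' end with a placeholder base; the
      actual non-homologous padding sequence is not given in the text.
    """
    if len(exon) < PROBE_LEN:
        # Exon shorter than the probe: pad up to the probe length.
        return [exon + pad_base * (PROBE_LEN - len(exon))]

    step = max(1, PROBE_LEN // tiling)
    last_start = len(exon) - PROBE_LEN
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        # Add a final probe so the 3' end of the exon is always covered.
        starts.append(last_start)
    return [exon[s:s + PROBE_LEN] for s in starts]


if __name__ == "__main__":
    example_exon = "ACGT" * 75  # made-up 300 bp exon
    probes = tile_exon(example_exon)
    print(len(probes), "probes of length", len(probes[0]))
```

An exon exactly 120 bp long yields a single probe that matches it exactly, which mirrors the case shown in Figure 3.1B.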

As a proof-of-concept, we selected 16 squamate reptile (lizard and snake) species for hybrid enrichment and sequencing (Table S3.2). The species sampled here span a broad set of the major lineages of squamates, encompassing approximately 200 myr of divergence (Hedges et al. 2015). To recover the coding sequences after hybrid enrichment and sequencing, we employed a guided assembly strategy utilizing custom assembly and analysis pipelines (Schott et al. 2017). Reads were assembled against a reference composed of the coding sequences of the targeted genes. The primary set of reference sequences was from Anolis, the probe species most closely related to the species we sequenced. Additional sequences were included from the other probe species for genes absent from the Anolis genome. Because assemblies were performed across species, we also used additional references compiled from snake genomic and transcriptomic data and from the Gekko japonicus genome, which were produced or became available after the design and synthesis of the probes. Several different assemblers were also used with different tolerances for mismatches. After assembly, consensus sequences were called and their identities confirmed by BLAST and, when necessary, phylogenetic analysis. We also calculated the completeness of each recovered sequence relative to the reference sequence (capture sensitivity).
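As a rough illustration of how completeness and recovery could be computed from a consensus sequence, the following sketch counts called bases relative to the reference length and applies the 5% cut-off used in this chapter. The function names, the treatment of "N" and "-" as uncalled positions, and the boolean `identity_confirmed` flag (standing in for the BLAST and phylogenetic checks) are assumptions, not the pipeline's actual implementation.

```python
def completeness(consensus, reference_length):
    """Percent of reference positions with a called base (A, C, G or T).

    Assumes the consensus is in reference coordinates and that uncalled
    positions are written as 'N' or '-' (an assumption for this sketch).
    """
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return 100.0 * called / reference_length


def is_recovered(consensus, reference_length, identity_confirmed,
                 min_completeness=5.0):
    """Apply the recovery rule: at least 5% complete and identity confirmed."""
    return identity_confirmed and (
        completeness(consensus, reference_length) >= min_completeness
    )


if __name__ == "__main__":
    toy_consensus = "ACGTNN--"  # made-up example, 4 of 8 positions called
    print(completeness(toy_consensus, 8))
    print(is_recovered(toy_consensus, 8, identity_confirmed=True))
```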

Overall, the method was highly successful and nearly all genes were recovered (92% on average, Fig. 3.2, Tables S3.3–S3.6). The level of recovery we achieved is higher than that of a previous cross-species capture study, which recovered 16%–80% of coding sequence targets in comparisons that varied from having 106–299 myr of divergence (Li et al. 2013). Of the original 166 genes that we targeted, two genes (ALB, SLC24A1) were not recovered from any of the 16 squamate sequences (Table S3.4) or from any of the available squamate genomes (Anolis carolinensis (Alfoldi et al. 2011), Python molurus bivittatus (Castoe et al. 2013), Ophiophagus hannah (Vonk et al. 2013), Thamnophis sirtalis (Castoe et al. 2011), Gekko japonicus (Liu et al. 2015)), suggesting they were likely lost ancestrally in squamates. One gene (CRYD2) appears to be a lineage-specific duplication in some birds (e.g., chicken), based on its absence in the squamate (and other reptilian) genomes, and was also not recovered for any of the 16 species. The probe for one gene, STRA6, was later found to lack any homology with other STRA6 sequences, which explained its lack of capture success. These four genes were thus excluded from further analysis. Three genes, UBC, UBB, and UBI, were found to all represent (at least part of) the same gene (which we term UBC) and thus were combined. This left a total of 160 genes for further analysis.

While the vast majority of genes were recovered in all species, a number of genes were not recovered in particular groups or individual species. In most cases, the lack of recovery appears to be due to gene loss rather than a failure of the method. For example, snakes and geckos are both known to have lost several visual genes (Zhang et al. 2006; Castoe et al. 2013). In colubrid snakes, 17 genes were not recovered, including the 10 opsin genes reported previously to have been lost in snakes (NEUR2, NEUR3, OPN4m, parapinopsin, parietopsin, pinopsin, RH2, SWS2, TMT2, TMTa; Castoe et al. 2013) as well as five lens crystallins (CRYBA1, CRYBA4, CRYBB1, CRYBB3, CRYGN), one phototransduction gene (GRK1), and one HOX gene (HOXD12). None of these genes were found in the Python, Ophiophagus, or Thamnophis genomes. Eleven genes were not recovered in any of the three gecko species, including seven opsins (NEUR2, NEUR3, OPN4m, parapinopsin, parietopsin, RH1, SWS2), a lens crystallin (CRYGN), and three phototransduction genes (CNGA1, PDE6B, PDE6G). These genes were also absent from the Gekko japonicus genome. In Phelsuma, TMTa was also not recovered, and in Sphaerodactylus CRYBA4 was not recovered. Both of these genes are present in the Gekko genome, so it is unclear whether they represent lineage-specific losses or a failure of the hybrid capture. All genes, other than those absent in all squamate taxa, were recovered for the other species, except for CRYD in Anolis and pinopsin in Chamaeleo. CRYD was also absent from the Anolis genome, although other genes absent from the Anolis genome were at least partially recovered.

Enrichment (or capture specificity) of the targeted genes was high, with an average (mean, throughout) of 55% of the reads mapping to the reference (Fig. 3.2, Table S3.3). This level of enrichment is much higher than that reported for other cross-species targeted capture methods (Bi et al. 2012; Lemmon et al. 2012; Ilves and Lopez-Fernandez 2014), which have reported mapping rates ranging from 5–33%. Our positive control, Anolis, had 71.5% of the reads mapping to the reference, which is close to the level seen in human resequencing studies, which achieve up to 80% reads on target (Mamanova et al. 2010). Coverage was also high, with an average depth of coverage across all species and genes of 2159X that ranged from 2903X in Anolis to 1884X in Phyllorhynchus (Table S3.3). The high level of coverage we obtained suggests that we sequenced at much higher depth than will be necessary for future experiments. In subsequent experiments it would likely be possible to multiplex and sequence substantially more species, or many more genes, with the same amount of sequencing and still obtain high coverage.

Among the most important factors for the utility of the approach for wide evolutionary application is the completeness of the recovered coding regions (capture sensitivity). Completeness was generally high, with an overall average of 89.1% using the best method and reference for each gene (Table S3.6). However, some genes were recovered with low completeness, all the way down to our cut-off of 5% (Table S3.4). Completeness varied considerably with the assembler and reference used, as well as among genes and species, as discussed further below. While many cross-species capture studies do not report the completeness of recovered targets, Portik et al. (2016), who used transcriptome-based exon capture, reported an average completeness of 80% for their ingroup sample (up to 56 myr divergence) and 34% for their outgroup sample (up to 103 myr divergence), demonstrating a marked decrease in completeness with divergence not seen with our method.

Figure 3.1. Cross-species hybrid capture methods. A, exons were extracted from the genomes of an average of three reference species (anole, turtle, and chicken). B, probes were designed against each exon. Since probe length was constant at 120 bp, exons shorter than the probe length were padded with non-homologous sequence. Exons the same length as the probe matched exactly, while those longer were extensively tiled across the exon (10X coverage). The overall number of probes covering each base was normalized to ensure even coverage. C, multiple reference species and tiling were designed to help facilitate cross-species capture. For example, a region of high divergence may occur in one species and not another, or could still be captured by tiling across it.

Figure 3.2. Species relationships of the 16 species sequenced and the enrichment, percent of genes recovered, and average completeness of those genes that were recovered. These results represent the combined best for the different assembly methods and references used. Genes were considered recovered if they were at least 5% complete and could be properly identified based on BLAST similarity and/or phylogenetic position. Species most closely related to the reference are shown in red, snakes in green, the plated lizard in orange, and geckos in blue. Tree topology based on Pyron et al. (2013). Divergence times from Hedges et al. (2015).

Table 3.1. Comparison of pros and cons of different high-throughput sequencing strategies.

| Method | Pros | Cons | References |
|---|---|---|---|
| WGS | Produces data that can be used for many different applications. | Expensive. Assembly can be difficult and time consuming. | Koepfli et al. (2015) |
| RNA-Seq | Produces data that can be used for many different applications. Also provides information on expression levels. | Need fresh tissue for RNA. Obtaining sequences depends on expression levels and tissue/temporal-specific expression. | Wang et al. (2009) |
| PCR | Can sequence small numbers of loci (up to ~100) efficiently across divergent species. | Primer design can be difficult, time consuming, and needs conserved regions. Can give biased results. | Meyer et al. (2008); Bybee et al. (2011); Shen et al. (2013) |
| RRL/RAD-Seq | Can produce 1000s of loci for use in intraspecific and shallow phylogenetic studies. | Only useful at very shallow time scales. | Altshuler et al. (2000); Miller et al. (2007); Baird et al. (2008) |
| Targeted Capture (Hybrid Enrichment) | Can sequence 100s to 1000s loci from species at varying levels of divergence. | Initial probe design can be expensive/time consuming and may require genomic resources. | See below |
| Whole Exome Capture | Produces data that can be used for many different applications. | Only applicable with model organisms and their very close relatives. | Albert et al. (2007); Porreca et al. (2007) |
| Conserved Region Capture | Sequence 1000s of loci that can be used for phylogenetic studies at shallow to deep timescales. | Data only useful for phylogenetic studies. | Faircloth et al. (2012); Lemmon et al. (2012) |
| Exon Capture | Sequence 100s to 1000s of individual exons across low to moderate levels of divergence. | Data only useful for phylogenetic studies. Performance decreases sharply with divergence. | Li et al. (2013); Bragg et al. (2016); Portik et al. (2016) |
| Complete CDS Capture | Produces data that can be used for many different applications. Applicable across divergent species. Can target genes with physiological relevance. | Manual curation can be time consuming. Guided assembly can require additional reference sequences for divergent species. | Current study |

Abbreviations: WGS, whole genome sequencing; RRL, reduced representation library; RAD, restriction site associated DNA; CDS, coding sequence.

3.3.1 Reference sequences have a large effect on cross-species guided assembly

Three main sets of reference sequences were used in the assembly of the reads (Schott et al. 2017). The primary set of sequences was based on the coding sequences from Anolis, the same sequences used to develop the probes (Anolis reference). This set necessarily also included sequences from other taxa when a gene was missing or lost in Anolis (see Materials and Methods for more details). When reads were assembled with BWA against this reference, recovery and completeness were high for species more closely related to Anolis, but suffered for the more divergent species, especially the snakes and geckos (Table 3.2). To address this, we produced two more sets of reference sequences: one utilizing a de novo eye transcriptome from Thamnophis sirtalis and the Python molurus bivittatus (Castoe et al. 2013) and Thamnophis sirtalis genome assemblies (snake reference), and one with sequences from the Gekko japonicus genome (Liu et al. 2015) (Gekko reference). These references contained 139 and 110 sequences, respectively, as they included only snake or gecko sequences. When the colubrid snakes were assembled to the snake reference using BWA, we obtained a small increase in the enrichment (percent of reads mapped to reference) and recovery of genes, but a ~20% increase in the completeness of the recovered genes present in both references (Table 3.2). A very similar increase in completeness (~19%) was found for the geckos when assembled to the Gekko reference (Table 3.3).

The effect of using different reference sequences was also demonstrated using our positive control, Anolis. As would be expected, Anolis had extremely high enrichment, recovery, and completeness when assembled with BWA against the Anolis reference (72%, 99%, and 95%, respectively; Table 3.2). However, when Anolis was assembled against the snake reference, a ~27% reduction in both enrichment and completeness occurred (when only genes present in both references were compared). This demonstrates that the use of proper reference sequences is essential for the recovery of complete genes. Furthermore, these results imply that the targeted capture method employed here is highly tolerant to divergence, much more so than current guided-assembly programs.

Table 3.2. Comparison of BWA assembly with the Anolis and Snake references.

Reference sequences: Anolis (160 seqs.); Anolis trim Snake (139 seqs.); Snake (139 seqs.).

| Species | Anolis Enrich. | Anolis Rec. | Anolis Comp. | Anolis trim Snake Enrich. | Anolis trim Snake Rec. | Anolis trim Snake Comp. | Snake Enrich. | Snake Rec. | Snake Comp. |
|---|---|---|---|---|---|---|---|---|---|
| Anole (control) | 71.5% | 99.4% | 94.8% | 56.5% | 99.3% | 95.9% | 30.1% | 98.6% | 66.7% |
| Leopard lizard | 58.6% | 100% | 88.2% | 48.7% | 100% | 89.7% | 32.4% | 98.6% | 70.3% |
| Chameleon | 48.5% | 98.1% | 72.8% | 42.4% | 98.6% | 74.3% | 31.9% | 95.0% | 63.9% |
| Monitor | 56.7% | 100% | 77.7% | 48.6% | 100% | 79.8% | 39.6% | 97.8% | 71.5% |
| Glossy snake | 48.7% | 88.1% | 64.4% | 44.5% | 98.6% | 64.9% | 50.8% | 100% | 85.6% |
| Scarlet snake | 49.6% | 88.1% | 67.8% | 45.9% | 98.6% | 68.3% | 52.1% | 100% | 86.1% |
| King snake | 48.7% | 87.5% | 64.8% | 43.9% | 97.8% | 65.3% | 50.5% | 100% | 85.6% |
| Corn snake | 48.5% | 88.8% | 62.6% | 44.8% | 99.3% | 63.0% | 50.9% | 100% | 85.0% |
| Leaf-nosed sn. | 52.6% | 89.4% | 67.6% | 48.9% | 100% | 67.8% | 56.5% | 100% | 86.2% |
| Long-nosed sn. | 47.3% | 86.9% | 61.5% | 43.6% | 97.1% | 61.9% | 50.7% | 100% | 85.6% |
| Night snake | 49.1% | 87.5% | 64.2% | 45.8% | 97.8% | 64.7% | 52.4% | 100% | 85.0% |
| Garter snake | 49.1% | 86.3% | 63.8% | 45.8% | 96.4% | 64.2% | 51.6% | 100% | 84.7% |
| Plated lizard | 48.3% | 98.8% | 75.2% | 42.5% | 99.3% | 75.9% | 35.8% | 97.8% | 67.4% |
| Tokay gecko | 47.5% | 90.0% | 69.4% | 41.5% | 97.1% | 69.5% | 36.9% | 92.8% | 61.7% |
| Giant day gecko | 47.8% | 90.6% | 69.1% | 39.9% | 96.4% | 69.7% | 33.6% | 94.2% | 61.4% |
| Reef gecko | 53.4% | 90.6% | 65.6% | 46.7% | 97.1% | 66.1% | 38.7% | 93.5% | 58.7% |
| AVERAGE | 51.6% | 91.9% | 70.6% | 45.6% | 98.3% | 71.3% | 43.4% | 98.0% | 75.3% |

To make the comparison between the Anolis and Snake reference even, the Anolis reference was trimmed to contain only the sequences present in the Snake reference (Anolis trim Snake). Enrichment was measured as the percent of reads mapping to the reference (uncorrected for genome size). Recovery was calculated as the percent of the 165 targeted genes that had at least 5% completeness and that could be identified based on BLAST similarity and/or phylogenetic analysis. Completeness was calculated relative to the reference sequence for those genes identified as recovered using the BWA assembly method and the best reference for each gene. Abbreviations— Enrich., Enrichment; Rec., Recovery; Comp., Completeness; sn., snake.

Table 3.3. Comparison of BWA assembly with the Anolis and Gekko references.

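Reference sequences: Anolis (160 seqs.); Anolis trim Gekko (110 seqs.); Gekko (110 seqs.).

| Species | Anolis Enrich. | Anolis Rec. | Anolis Comp. | Anolis trim Gekko Enrich. | Anolis trim Gekko Rec. | Anolis trim Gekko Comp. | Gekko Enrich. | Gekko Rec. | Gekko Comp. |
|---|---|---|---|---|---|---|---|---|---|
| Tokay gecko | 47.5% | 90.0% | 69.4% | 34.1% | 97.3% | 70.5% | 36.9% | 100% | 88.4% |
| Giant day gecko | 47.8% | 90.6% | 69.1% | 33.1% | 97.3% | 69.7% | 37.6% | 100% | 89.6% |
| Reef gecko | 53.4% | 90.6% | 65.6% | 38.3% | 99.1% | 65.8% | 43.0% | 100% | 84.3% |
| AVERAGE | 49.6% | 90.4% | 68.0% | 35.2% | 97.9% | 68.7% | 39.2% | 100% | 87.5% |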

To make the comparison between the Anolis and Gekko reference even, the Anolis reference was trimmed to contain only the sequences present in the Gekko reference (Anolis trim Gekko). Enrichment was measured as the percent of reads mapping to the reference (uncorrected for genome size). Recovery was calculated as the percent of the 165 targeted genes that had at least 5% completeness and that could be identified based on BLAST similarity and/or phylogenetic analysis. Completeness was calculated relative to the reference sequence for those genes identified as recovered using the BWA assembly method and the best reference for each gene. Abbreviations: Enrich., Enrichment; Rec., Recovery; Comp., Completeness.

3.3.2 Different assemblers performed best on similar and divergent reads

Several different assembly programs were used and their effectiveness evaluated: BWA-MEM (Li 2013), NGM (Sedlazeck et al. 2013), Stampy (Lunter and Goodson 2011), and Bowtie 2 (Langmead and Salzberg 2012) (Table 3.4, Tables S3.3–S3.6). BWA and Bowtie 2 are both Burrows-Wheeler transform-based methods and were designed for assembly of reads to their reference genome, while NGM and Stampy are hash-based methods and were designed for assembly to moderately divergent or polymorphic reference genomes. Thus, we are using these methods in unorthodox ways not only in assembly to divergent species, but also in assembly to complete coding regions rather than whole genomes.

Initially we implemented BWA under default parameters, but found these to be too restrictive for divergent assembly (Table 3.4), so we relaxed the mismatch penalty to facilitate cross-species assembly (see Materials and Methods). This improved divergent capture (~10% increase in average completeness), while giving similar results for the positive control (Table 3.4). Similarly, for the more divergent species we found a 10% increase in going from BWA to NGM and another 10% from NGM to Stampy (Table 3.4). For the geckos, we found a 17% increase in completeness going from BWA to Stampy. However, this increase in completeness was nearly eliminated when the gecko reference (rather than the Anolis reference) was used (2% increase in completeness). For Anolis, Stampy actually performed more poorly than BWA, as did NGM (Table 3.4).
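A simplified calculation can help show why relaxing the mismatch penalty allows more divergent reads to map. The sketch below assumes a linear, ungapped scoring scheme with a match score of 1, mismatch penalties of 4 (taken here as the default) and 2 (the relaxed value, "b = 2" in Table 3.4), and a minimum alignment score of 30; these values and the function names are assumptions for illustration, and real aligners also use gaps, clipping, and seeding heuristics that are not modeled.

```python
def ungapped_score(read_len, n_mismatch, match=1, mismatch_penalty=4):
    """Alignment score of an ungapped read with n_mismatch mismatches."""
    return (read_len - n_mismatch) * match - n_mismatch * mismatch_penalty


def max_mismatches(read_len, min_score, match=1, mismatch_penalty=4):
    """Largest number of mismatches for which the score stays >= min_score."""
    n = 0
    while (n + 1 <= read_len and
           ungapped_score(read_len, n + 1, match, mismatch_penalty) >= min_score):
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical 100 bp read and minimum score of 30 (assumed values).
    for penalty in (4, 2):
        tolerated = max_mismatches(100, 30, mismatch_penalty=penalty)
        print(f"penalty {penalty}: up to {tolerated} mismatches per 100 bp")
```

Under these assumptions, halving the penalty raises the tolerated divergence of a read from 14% to 23%, which is in line with the direction of the improvement seen for the divergent species.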

Overall, we found that Stampy performed best when assembling reads to more divergent references. However, this increased ability for divergent assembly appears to have come at the cost of completeness and accuracy in some cases. We found that Stampy incorporated more ambiguous bases than BWA and occasionally resulted in unambiguous differences. This was most apparent in genes with lower completeness, whereas genes with high completeness were most often found to have identical sequences between BWA and Stampy (and NGM). This is not surprising given that genes with lower completeness likely had higher divergence, and had much lower depth of coverage (presumably due to lower enrichment), which makes it more difficult to reliably map reads and more likely that incorrect read placements would be accepted. As a result, we tended to prefer BWA assembled sequences to Stampy, but when the divergence was high Stampy was able to capture much more of the gene. In some cases it was possible to combine BWA and Stampy sequences, using BWA to resolve ambiguities and differences and Stampy to fill in missing, presumably divergent regions.
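The combination of BWA and Stampy sequences described above can be sketched as a position-by-position merge. This is only an illustration of the rule (prefer the BWA call, fall back to Stampy where BWA has none), not the procedure used in this study, which was applied case by case; the function name, the assumption that both consensus sequences share reference coordinates, and the use of "N", "-", and "?" for uncalled positions are all hypothetical.

```python
UNCALLED = set("N-?")  # symbols treated as "no call" (an assumption)


def merge_consensus(bwa, stampy):
    """Merge two consensus sequences in the same reference coordinates.

    BWA calls take priority, which resolves Stampy ambiguities and
    differences; Stampy fills positions that BWA left uncalled.
    """
    if len(bwa) != len(stampy):
        raise ValueError("consensus sequences must have equal length")
    merged = []
    for b, s in zip(bwa.upper(), stampy.upper()):
        if b not in UNCALLED:
            merged.append(b)            # trust the BWA call
        elif s not in UNCALLED:
            merged.append(s)            # fill the gap from Stampy
        else:
            merged.append("N")          # neither assembler called a base
    return "".join(merged)


if __name__ == "__main__":
    # Made-up toy sequences; R is an IUPAC ambiguity code from Stampy.
    print(merge_consensus("ACNN--GT", "ACRTTAGA"))
```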

In addition to running Stampy under default settings, we also adjusted the substitution rate option, which should improve mapping of divergent reads. We changed the default rate of 0.001 to 0.1, which corresponds to an expected divergence of 10%, and ran this for the most divergent species, the geckos. However, changing this parameter did not have a positive effect on the sequences recovered and perhaps resulted in slightly less completeness (Table 3.4).

Compared to BWA and Stampy, NGM resulted in intermediate completeness when reads were aligned to a divergent reference (Table 3.4). When reads were aligned to a more similar reference (e.g., Anolis to Anolis or a colubrid to the snake reference), however, NGM performed worse than BWA. As such, we did not find NGM to be particularly useful for assembly of either divergent or non-divergent reads.

Lastly, we implemented Bowtie 2, which has recently been used to assemble target enrichment sequencing reads from cichlid fishes (Ilves and Lopez-Fernandez 2014). We ran Bowtie 2 in our assembly and analysis pipeline using the ‘very-sensitive’ preset, which was used by Ilves and Lopez-Fernandez (2014), against both the Anolis and snake references. We found that Bowtie 2 performed worse than BWA in all respects (Table 3.4, Tables S3.3–S3.6). For example, with the positive control we found a 29% reduction in enrichment and a 21% reduction in completeness. This is a surprising result, but may be due to issues associated with assembly to coding sequences rather than a complete (mammalian-size) genome, which Bowtie 2 was designed to assemble against. Overall, these results highlight the important fact that no single assembler will be best in all situations.

Table 3.4. Comparison of average completeness of recovered coding regions obtained using different assemblers.

Average completeness (%) by assembler:

| Species | BWA (Default) | BWA (b = 2) | NGM | STAMPY | STAMPY (Div. = 0.1) | BT2 |
|---|---|---|---|---|---|---|
| Anole (control) | 94.1 | 94.8 | 91.9 | 89.8 | - | 74.1 |
| Leopard lizard | 84.9 | 88.2 | 89.5 | 91.1 | - | 37.4 |
| Chameleon | 63.4 | 72.8 | 81.5 | 85.7 | - | 42.0 |
| Monitor | 70.3 | 77.7 | 83.9 | 88.8 | - | 38.2 |
| Glossy snake | 54.7 | 64.4 | 74.3 | 86.5 | - | 28.3 |
| Scarlet snake | 59.2 | 67.8 | 75.2 | 87.1 | - | 32.0 |
| King snake | 54.6 | 64.8 | 73.4 | 86.7 | - | 32.0 |
| Corn snake | 53.9 | 62.6 | 72.3 | 85.5 | - | 26.5 |
| Leaf-nosed sn. | 60.3 | 67.6 | 77.0 | 87.3 | - | 21.8 |
| Long-nosed sn. | 53.2 | 61.5 | 73.0 | 86.4 | - | 31.1 |
| Night snake | 54.4 | 64.2 | 72.8 | 86.4 | - | 27.5 |
| Garter snake | 53.8 | 63.8 | 73.2 | 85.9 | - | 31.0 |
| Plated lizard | 65.7 | 75.2 | 81.8 | 86.7 | - | 37.2 |
| Tokay gecko | 59.2 | 69.4 | 79.6 | 85.4 | 85.3 | 47.6 |
| Giant day gecko | 60.0 | 69.1 | 78.2 | 84.5 | 84.2 | 43.9 |
| Reef gecko | 55.4 | 65.6 | 77.3 | 84.3 | 84.0 | 42.3 |
| AVERAGE | 62.3 | 70.6 | 78.4 | 86.8 | 84.5 | 37.1 |

All assembly was done to the Anolis reference. Abbreviations—Div., divergence parameter; BT2, Bowtie 2.

3.3.3 Increased probe diversity and tiling substantially increase gene recovery and completeness

To evaluate the effect of increased probe diversity and higher levels of tiling, we increased both for a small subset of genes. For the visual opsins we included probe sequence from nine different species (including a colubrid snake and a gecko) and doubled the amount of tiling to 20X. Because we had these additional sequences, we also used additional references to assemble this subset of genes. The result was near complete recovery of the visual opsin genes, with the exception of those genes that appear to have been lost in particular lineages (Table S3.7). These results suggest that when fully complete coding regions are required, both the number of probe and reference sequences and, presumably to a lesser extent, the amount of tiling should be increased. Unfortunately, the experimental design did not allow us to differentiate between the effects of the number of probe/reference sequences and the amount of tiling, and this will need to be evaluated in a future study. It seems likely that the most important factor in the recovery of complete coding regions is the availability of probe and reference sequences that are as similar as possible to the target, but increased tiling may provide additional benefits when this is not possible.

3.3.4 Short exons had only a small effect on completeness of recovered genes

One of the largest drawbacks of targeted capture may be difficulty in capturing short targets. This is especially true when targeted exons are shorter than the probe length, which in our case was 120 bp. Including flanking intron sequence to make up the remaining sequence is ideal when doing targeted resequencing, but for cross-species capture is more problematic due to higher sequence divergence among introns. Instead, exons less than 120 bp were padded with non-homologous sequence. To determine what effect this had on the capture of short exons, we compared completeness between genes that did not have exons less than 120 bp with those that did. We found, with the non-parametric Mann-Whitney test, that genes with exons less than 120 bp had significantly lower completeness; however, this difference was small, with average completeness only being 4% less (Table S3.8). This difference was similar when the cutoff was set to 100 bp and 50 bp (Table S3.8).
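For readers unfamiliar with the test, the comparison above can be sketched with a small pure-Python Mann-Whitney U implementation. This is an illustration only and not the analysis code used here: it uses the normal approximation without tie or continuity corrections, the function name is hypothetical, and the completeness values in the usage example are invented and are not data from this study.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U, p), where U is the smaller of the two U statistics.
    Ties receive average ranks; no tie or continuity correction is applied.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of a tie block
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


if __name__ == "__main__":
    # Invented completeness values (%) for illustration only.
    no_short_exons = [95.0, 92.5, 97.1, 90.3, 94.8, 96.2]
    with_short_exons = [91.2, 88.7, 93.0, 86.5, 90.1, 89.9]
    u_stat, p_value = mann_whitney_u(no_short_exons, with_short_exons)
    print(f"U = {u_stat}, p = {p_value:.4f}")
```

A rank-based test is a reasonable choice here because completeness is bounded at 100% and is unlikely to be normally distributed across genes.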

While significant, the difference in completeness was quite small, and many genes with short exons were captured with high (or full) completeness, including the regions composed of the short exons. For example, ABCA4, which has 14 exons under 100 bp in length with our Anolis probes, was captured with 98.3% completeness in Anolis and 87.7% on average across the 16 species. The area of the sequence that was not captured in Anolis corresponded to a section of sequence where the probe sequence extracted from ENSEMBL and the current predicted sequence on NCBI disagreed. The lack of capture of this area appears to be due to incorrect probe sequence rather than a short exon. USH1C, which also had 14 exons under 100 bp, as well as four under 50 bp, was captured at 98.5% in Anolis and 90.9% overall. Similarly, the areas not captured in Anolis were portions of the sequence that disagreed between the ENSEMBL exons we used to design the probes and the (most recent) NCBI predicted CDS. For our method, short exons do not appear to be a major determinant of the completeness of recovered exons and thus do not represent a substantial obstacle.

3.3.5 Incomplete and erroneous probe sequences caused substantial reductions in gene completeness

As noted above, one reason that sequences may not be captured is if they were missing from the probe sequence, due to either an incomplete or an erroneous sequence. The probe sequences we used were based on the ENSEMBL and NCBI gene predictions available at the time, but the gene predictions have been updated considerably since the probes were developed, especially following the reannotation of the Anolis genome (Eckalbar et al. 2013). Though we manually curated the set of sequences and corrected errors when possible using a multiple sequence alignment, it was not possible to fix all of the errors. The most common incongruence between the sequences used for probe design and the updated gene predictions was in the prediction of the first and last exon of the gene. In many cases the first or last exon was incorrectly predicted, but in some cases it may have represented an alternate transcript variant. In other cases the updated sequence was actually incorrect based on a multiple sequence alignment (e.g., CNGA3, GNB5). More rarely, other sections of the sequence were missing or had insertions (presumably intronic sequence), but these differences were much easier to identify and fix.

When the probe sequence was incomplete or erroneous, that part of the sequence was often not captured. This resulted in an overall lower completeness for those genes with missing (72%) or incomplete (76%) Anolis probes or references compared to those with complete probes and references (93%, Table S3.4). This difference was not as substantial as we expected, likely because the additional reference sequences we used allowed the sequence to still be captured in some cases. This is evident from four genes (CNGA1, CRYZL1, ENO1, PGK1) that lacked an Anolis probe (but had an Anolis reference sequence for assembly) and still had an average completeness of 82% despite being captured with only turtle or chicken probes.

3.3.6 Completeness of recovered genes decreased with increasing sequence divergence

One of the most important aspects to consider when designing a cross-species sequence capture experiment is the level of sequence divergence that can be tolerated. Many previous methods have targeted sequences with low divergence and/or only closely related species (Mason et al. 2011; Bi et al. 2012; Crawford et al. 2012; Faircloth et al. 2012; Lemmon et al. 2012; McCormack et al. 2012; Ilves and Lopez-Fernandez 2014). Instead, we have targeted a broad range of both sequence divergence and relatedness, including species that are up to 200 myr divergent from the closest probe sequences used. In order to evaluate the effect sequence divergence had on the recovery of the coding regions, we developed two approaches for independently calculating divergence. This was necessary because directly measuring divergence between the captured data and the reference would highly bias the results towards the captured data. First, we calculated an average sequence similarity between Anolis (the reference) and each of the 15 other species using three independently sequenced genes as a proxy (Table S3.9). This metric provided a rough estimate of average sequence similarity between each of the species and the probe/reference sequences. However, it was not possible to separate the effects of the hybrid capture and the guided assembly on the completeness of the recovered genes with this approach.

To more directly measure the effect of divergence on the hybrid capture we utilized the Gekko reference, which allowed us to directly and independently calculate sequence identity between the Anolis and Gekko sequences. We then compared similarity to completeness calculated through BWA assembly against the Gekko reference, which removed the effect of the cross-species assembly and allowed a more direct evaluation of the effect of sequence divergence on hybrid capture (Table S3.10).

Our measure of species-level sequence similarity revealed a strong correlation with completeness (Fig. 3.3, Table S3.9; r = 0.95, p < 0.001). The colubrid snakes, which showed the highest level of sequence divergence (despite being more closely related to Anolis than the geckos), also showed the lowest completeness. However, despite their average pairwise identities to Anolis being below 80%, their average completeness was above 85%. Variation within the group may represent specific differences in the sequences or variation in DNA or library quality (e.g., the low completeness of Rhinocheilus). The lowered completeness of Sphaerodactylus is likely due at least in part to divergence from the Gekko sequences used as a reference for assembly. Rather than implying a strict relationship between divergence and completeness, these results highlight multiple factors that correlate with evolutionary divergence, which are likely to affect both the hybrid capture and the computational assembly. Generally, these results imply that acceptable levels of completeness can be expected down to 75% sequence similarity when a reasonably close reference is available for assembly. At 90% similarity most genes can be expected to be nearly complete.
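As a minimal sketch of the statistics involved, the Python below computes a Pearson correlation coefficient and a reduced major axis (RMA) regression line, the type of line shown in Fig. 3.3. The identity and completeness values are invented for illustration and are not the values in Table S3.9; the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rma_regression(x, y):
    """Reduced major axis regression.

    The slope is sign(r) * sd(y) / sd(x) and the line passes through
    the point (mean of x, mean of y). Returns (slope, intercept).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = math.copysign(math.sqrt(syy / sxx), pearson_r(x, y))
    return slope, mean_y - slope * mean_x

# Hypothetical species-level values (percent), for illustration only.
identity = [78.0, 80.0, 85.0, 90.0, 100.0]
completeness = [85.0, 86.0, 90.0, 94.0, 99.0]
r = pearson_r(identity, completeness)
slope, intercept = rma_regression(identity, completeness)
print(round(r, 3), round(slope, 3), round(intercept, 2))
```

Unlike ordinary least squares, RMA treats both variables as measured with error, which is one reason it may be chosen when neither identity nor completeness is a controlled variable.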

The level of sequence divergence tolerated with our approach compares favourably with the transcriptome-based exon capture method of Portik et al. (2016). Using a similar approach, the authors compared average pairwise divergence from the probe design species to completeness. They likewise found a strong linear relationship between divergence and completeness, but with a steeper slope and lower completeness at equivalent levels of divergence (Portik et al. 2016). For example, completeness at 10% divergence (90% similarity) was only ~70% compared to over 90% in our study. At the other end, completeness at 20% divergence (80% similarity) was ~20% compared to over 85% in our study (Portik et al. 2016).

The Gekko-specific metric, which allowed us to more directly compare the effects of sequence divergence on the completeness of recovered genes, showed a weaker, but still highly significant, correlation (Fig. 3.4, r = 0.64, p < 0.001). This suggests that, as noted earlier, sequence divergence has a strong negative effect on the performance of the guided assembly.

When the effect of the cross-species assembly is removed (by assembling Gekko reads to a Gekko reference), the correlation weakens, but still accounts for the majority of variation in completeness. Thus, at the individual gene level, aspects other than just overall sequence similarity can have a large effect on the performance of the hybrid capture. As noted above, the accuracy of the probe sequence can have a large effect, but other aspects, including overall similarity in gene structure, concentration of differences, secondary structure, and GC content, may also contribute significantly. The large amount of variation in completeness is exemplified by the large range of sequence identities, from 80–94%, for those genes that were recovered with 100% completeness. When these genes were removed, the correlation was strengthened somewhat (r = 0.71), but the remaining comparisons still show a large amount of variation.

These results demonstrate that sequence similarity down to 78% between the probe sequence and the target can still result in the capture and enrichment of nearly complete (95%+) coding sequences.

Figure 3.3. Analyses of the effect of divergence on completeness of recovered coding regions. Average gene completeness for each species was compared to average pairwise sequence identity to anole, demonstrating the strong correlation between sequence identity and the completeness of genes recovered. Average completeness was calculated as the average across each gene recovered using the best assembly method and reference for each gene. Pairwise identity was calculated between each species and anole for a set of representative genes obtained independently for each species. The reduced major axis regression lines are shown both including and excluding anole in the regression.

Figure 3.4. Analyses of the effect of divergence on completeness when removing the effect of cross-species assembly. Completeness of each gene captured in Gekko was compared to the pairwise identity between Anolis and Gekko for those genes. Completeness was calculated for genes with complete probe and reference sequences in Anolis that were also found in the de novo transcriptome assembly of Gekko, and the values shown are for BWA assembly against the Gekko reference. Pairwise identity was calculated between the de novo assembled Gekko coding sequences and the Anolis reference sequences, thus removing bias from the guided assembly approach.

3.3.7 Targeted capture performed similarly or better, and cost as little or less, than RNA-Seq and WGS

To demonstrate the usefulness of our method for sequencing complete coding sequences in comparison to other approaches, we assembled RNA-Seq reads from Thamnophis sirtalis eye tissue and previously published whole genome sequencing datasets (Card et al. 2014; Zhang et al. 2014) of varying coverage using our guided assembly pipeline. We evaluated the recovery and completeness of the assembled sequences, as well as the costs of sequencing, in comparison to our targeted capture approach. We found that the RNA-Seq dataset recovered substantially fewer genes, but that the completeness of genes that were recovered was similar to that of the capture data in general, and slightly higher than the capture for Thamnophis specifically (Table 3.5, Table S3.11). The reduced number of genes recovered was due primarily to the fact that RNA-Seq was performed on eye tissue and not all of the 160 genes that were targeted are expressed in the eye, despite the focus on visual genes in our probe set. This highlights one of the main drawbacks of RNA-Seq, but could be overcome if the genes of interest were all sufficiently expressed in a single tissue type or by pooling RNA extracted from multiple tissue types, although the second option may require additional sequencing depth. Additionally, genes may not have been recovered if they were not expressed at sufficient levels to be captured at the sequenced coverage level. Sequencing rare transcripts will be more difficult (i.e., require higher coverage) with RNA-Seq, but this drawback does not apply to targeted capture or other genomic approaches.

While the completeness levels were similar between reference-guided RNA-Seq and targeted capture, we also compared results for de novo assembly of the transcriptome, as this would be the typical procedure with a species that lacks a reference genome. With de novo assembly we recovered 77.5% of the 160 genes with an average completeness of 91.5% for Thamnophis (Table S3.11). However, our guided assembly pipeline is highly stringent in that it requires a minimum of 10X coverage for a base to be called in the consensus. When we relaxed this to the default minimum of 3X we found an increase in recovery and completeness that slightly exceeded the de novo assembly.

We also compared our targeted capture method against non-enriched whole genome sequencing (WGS). We selected four previously sequenced datasets containing an increasing number of reads in order to determine what sequence coverage was necessary to overcome the lack of enrichment. Using the same assembly and analysis pipeline, we found that recovery and completeness were satisfactory only at the highest sequencing coverage tested (265 million reads) (Table 3.5, Table S3.12). When we relaxed the stringent 10X coverage requirement to the default minimum of 3X we obtained a marked increase in both recovery and completeness, but for the low and medium coverage tests (26–184 million reads) this was still well below that found for the sequence capture experiment.

In terms of the costs of producing the data, RNA-Seq and targeted capture are similar, whereas whole genome sequencing was substantially more expensive. A single targeted capture sample (for a run with 16 samples total) and ~30 million reads of RNA-Seq (approximately one sixth of a HiSeq lane) cost essentially the same (Table 3.6, Table S3.13). Sequence capture further excels due to its scalability to larger numbers of samples, which at 96 samples would have reduced the cost by almost 20% per sample. When smaller numbers of samples are required, and RNA-appropriate tissue is available, RNA-Seq would be the preferred method due to the time investment involved in developing a set of probe sequences and the relative ease of de novo assembly. For genome sequencing, even the low coverage genomes cost more than targeted capture (per sample), with the higher coverage genomes costing almost four times as much. Despite the substantially higher cost, WGS may be desirable if there is a very large number of genes of interest and/or if there is only a small number of species to be sequenced.

RNA-Seq may still be preferable in these cases if fresh tissue is available. Otherwise, sequencing a species on the equivalent of a single HiSeq lane (our highest coverage tested, but still much lower than needed to de novo assemble a complete genome) may be a useful way to obtain many nearly complete coding sequences. A cross-species reference-guided genome assembly approach, as proposed by Card et al. (2014), may be able to recover more complete genes from lower coverage genomes as well.

Table 3.5. Comparison of the performance of the assembly and annotation pipeline on RNA-Seq and whole genome data with the targeted capture approach.

Sequencing Method   Species                             Number QC-passed Reads   Percent Reads Mapped   Genes Recovered         Average Completeness
Targeted capture    Average of 16                       8,567,574                51.6%                  147 (92%)               70.6%
RNA-Seq             Thamnophis sirtalis                 29,843,391               4.29%                  112 (70%)               73.5%
WGS                 Centrocercus minimus (SRR1166456)   26,817,128               0.05%                  40 (26%) / 106 (68%)    12.6% / 30.8%
WGS                 Nucifraga columbiana (SRR1166560)   43,587,682               0.05%                  42 (27%) / 135 (87%)    12.6% / 40.4%
WGS                 Tyto alba (SRR959575)               184,107,287              0.02%                  31 (20%) / 85 (55%)     26.8% / 39.0%
WGS                 Struthio camelus (SRR950910)        219,605,123              0.02%                  122 (79%) / 151 (97%)   48.6% / 76.2%
WGS                 Calypte anna (SRR943144)            265,237,073              0.10%                  155 (100%)              82.9% / 88.2%

Data shown is for assembly with BWA against the Anolis (targeted capture, RNA-Seq) or Gallus reference (WGS). Data for individual genes is shown in Tables S3.11 and S3.12. Since the whole genome sequencing (WGS) suffered from low mapping rates (due to no enrichment) we additionally relaxed the requirement in our assembly pipeline to only require the minimum (default) level of coverage of 3X, which is shown after the slash (/). A sequence was considered recovered if it had a minimum of 5% completeness and could be identified via BLAST. Completeness was calculated relative to the reference sequence for those genes identified as recovered.
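To make the effect of enrichment concrete, the following Python sketch converts the read counts and mapping percentages in Table 3.5 into approximate numbers of mapped reads. The function name is hypothetical, and because the percentages in the table are rounded, the results are approximate.

```python
def mapped_reads(total_reads, percent_mapped):
    """Approximate number of reads mapping to the reference."""
    return round(total_reads * percent_mapped / 100.0)

# (QC-passed reads, percent of reads mapped), taken from Table 3.5.
datasets = {
    "Targeted capture (average of 16)": (8_567_574, 51.6),
    "RNA-Seq (Thamnophis sirtalis)": (29_843_391, 4.29),
    "WGS (Calypte anna)": (265_237_073, 0.10),
}

mapped = {name: mapped_reads(*values) for name, values in datasets.items()}
for name, count in mapped.items():
    print(f"{name}: ~{count:,} mapped reads")
```

Under these figures the targeted capture samples yield several million mapped reads from under nine million total reads, whereas the highest-coverage WGS dataset yields only a few hundred thousand from over 265 million.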

Table 3.6. Cost comparison of targeted capture, RNA-Seq, and whole genome sequencing experiments.

Method                Extraction   Kit & Reagents   Library Prep   Fraction of HiSeq Lane   Cost of Sequencing   Per Sample Total Costs   Relative Cost
Targeted Capture 16   $3.81        $482.74          $100.00        1/32                     $65.63               $652                     100.0%
Targeted Capture 96   $3.81        $367.79          $100.00        1/32                     $65.63               $537                     82.4%
RNA-Seq               $14.45       -                $300.00        1/6                      $350.00              $664                     101.9%
WGS                   $3.81        -                $251.00        1/7                      $300.00              $555                     85.1%
WGS                   $3.81        -                $251.00        1/5                      $420.00              $675                     103.5%
WGS                   $3.81        -                $251.00        1/4                      $525.00              $780                     119.6%
WGS                   $3.81        -                $251.00        3/4                      $1,575.00            $1,830                   280.6%
WGS                   $3.81        -                $251.00        1                        $2,100.00            $2,356                   361.2%

Costs were those incurred by us in CDN dollars and are likely to vary depending on country and sequencing centre used, and are expected to change over time. Additional cost details are available in Table S3.13. Costs assume use of an Agilent Custom SureSelect kit (either 16 or 96 samples) and of a paid service at a core facility for targeted capture, library preparation, and/or sequencing, and thus should be generally reproducible by any lab without need for specialized equipment/expertise. They additionally assume the possibility of sequencing on partial HiSeq lanes when necessary.
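The per-sample totals and relative costs in Table 3.6 follow from summing the cost components and dividing by the 16-sample targeted capture total. A minimal Python sketch of that arithmetic, using the component values from the first four rows of the table (the function and variable names are hypothetical):

```python
# (extraction, kit & reagents, library prep, sequencing) in CDN dollars,
# taken from Table 3.6. A dash in the table is treated as zero.
components = {
    "Targeted Capture 16": (3.81, 482.74, 100.00, 65.63),
    "Targeted Capture 96": (3.81, 367.79, 100.00, 65.63),
    "RNA-Seq": (14.45, 0.00, 300.00, 350.00),
    "WGS (1/7 lane)": (3.81, 0.00, 251.00, 300.00),
}

def cost_summary(components, baseline="Targeted Capture 16"):
    """Return {method: (per-sample total, cost relative to baseline in %)}."""
    baseline_total = sum(components[baseline])
    return {
        name: (sum(parts), 100.0 * sum(parts) / baseline_total)
        for name, parts in components.items()
    }

summary = cost_summary(components)
for name, (total, relative) in summary.items():
    print(f"{name}: ${total:.2f} per sample, {relative:.1f}% of baseline")
```

The 96-sample kit gives a relative cost of about 82%, which corresponds to the saving of almost 20% per sample noted in the text.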

3.3.8 Captured phylogenetic markers produced an accurate species tree

The most common application of cross-species sequence capture is for phylogenetic analysis, and our method can also be applied for this purpose. We targeted 23 genes previously used as phylogenetic markers in reptile and squamate phylogenetic studies. Of those, 16 were over 80% complete for all species (Table S3.14) and so were used to construct a multigene phylogenetic species tree using MrBayes. The resulting tree was highly supported and closely matched a recent multigene squamate tree (Fig. 3.5; Pyron et al. 2013). The only topological difference was in the position of Pantherophis, which is unsurprising due to the extremely short branch lengths in this portion of the tree. These results demonstrate that it is possible to obtain high-quality data for phylogenetic reconstruction that can also be used to address a variety of other research questions.

Figure 3.5. Bayesian multigene phylogeny of the 16 species with two outgroups. A total of 16 phylogenetic marker genes were used. The topology agrees strongly with the multigene phylogeny of Pyron et al. (2013), differing only in the placement of the corn snake. Posterior probability support is shown at each node.

3.4 Conclusions

Overall, the cross-species targeted capture method proposed here was highly successful in recovering the 160 genes of interest with high completeness over large evolutionary distances (up to 200 myr of divergence). Our use of complete coding regions from specific genes of interest allowed us to focus on aspects of organismal physiology, producing data useful for both phylogenetics and studies of molecular evolution and function. Recovery of more divergent sequences was lower, but this was primarily due to the difficulty of cross-species guided assembly rather than a failure of the hybrid enrichment. This issue was partly overcome through the use of additional reference sequences obtained from whole genome data. Since the hybrid enrichment appears to have been highly robust to sequence divergence, development of a de novo assembly pipeline that removes the reliance on cross-species assembly is a promising avenue for future research. A de novo approach, however, is not trivial, and our preliminary attempts have produced results worse than or on par with the guided approach developed here.

Differences in assembly methods between cross-species enrichment approaches likely account for a large amount of the variation in the quality of divergent capture and need to be further evaluated. Because our results show substantially increased recovery when additional probe sequences are included, adopting a transcriptome-based targeted capture approach similar to that proposed by Bi et al. (2012) and Portik et al. (2016), where transcriptomes from one or more species are first sequenced and de novo assembled and then used to design a set of probes for hybrid capture, may be highly beneficial. Modifications to the targeted capture protocol, as well as introduction of a second round of capture, as implemented by Li et al. (2013), may further extend the ability to capture divergent sequences. The data produced by our method were more than sufficient to produce a robust phylogenetic tree and will be used for future molecular evolutionary and functional studies. While we initially targeted a modest number of genes and species, the method is easily scalable to much larger numbers of both, which will further increase its efficiency and per sample cost effectiveness. The cross-species targeted capture method developed here will enable the study of a variety of evolutionary questions in virtually any set of genes of interest across divergent groups of species.

3.5 Materials and Methods

3.5.1 Probe Design

As a proof-of-concept for this method we targeted 166 visual, housekeeping, and phylogenetic marker genes. This set of genes included nearly all genes known to function in the phototransduction and visual cycles, as well as genes involved in photoreceptor development and maintenance, non-visual opsins, and lens crystallins (Table S3.1). This set of genes was collected, in part, from proteome and transcriptome papers of photoreceptor outer segments and retinas (Schulz et al. 2004; Kwok et al. 2008). Phylogenetic markers were selected from reptile and squamate phylogenetic studies (e.g., Harshman et al. 2003; Iwabe et al. 2005; Vidal and Hedges 2005; McAliley et al. 2006; Hugall et al. 2007; Barley et al. 2010) and housekeeping genes from the list produced by She et al. (2009). In order to promote cross-species hybridization, probes were designed from a representative set of taxa that have complete genomes spanning reptilian phylogenetic diversity, including Anolis (lizard), Pelodiscus/Chrysemys (turtle), and Gallus (bird), following Lemmon et al. (2012). This ensured that a range of sequence variation was present in the probe sequences to promote hybridization with divergent sequences. For each of the 166 targeted genes, we obtained mRNA or predicted mRNA sequences from Genbank and coding sequences (CDS) and individual exon sequences from ENSEMBL, as available for each of the probe taxa. When exon sequences were not available on ENSEMBL we attempted to obtain them through direct BLAST searches of the genomes. The individual exons were aligned to the complete coding and/or mRNA sequence using custom scripts and manually inspected and corrected as necessary. Sequences from all probe taxa were aligned together, which allowed intronic and UTR sequences present in the exon annotation to be identified and removed. We found such contaminating sequences to be common in the exon sequences obtained from ENSEMBL, as intron-exon boundaries and start and stop codons were often misidentified. This step also allowed us to manually verify the annotation of each gene and exon. Not all of the 166 genes were present in each of the reference genomes, and thus some genes are represented by sequences from only one or two species, and in a few cases substitutes were used (Table S3.1). If sequences for both Pelodiscus and Chrysemys were available, only the longer sequence was kept. If they were of the same length and overall quality, the Pelodiscus sequence was given preference for consistency, as it had the more complete genome in general. Once exons were validated, the mRNA and CDS sequences were removed and only the exons retained. For the opsin genes, we added additional sequences from lizards, snakes, and alligator (as available) to the probe sequences. These sequences were manually broken into their constituent exons based on the multiple sequence alignment including the exons from ENSEMBL for Anolis, Pelodiscus/Chrysemys, and Gallus. Additional probe species were added for the opsins in order to evaluate the effect of increased probe diversity on capture efficiency and to ensure complete capture of these genes. Altogether, the process resulted in a total of 3888 exons from which the probes were designed. A complete breakdown of the genes and exons targeted by the probes is available in Tables S3.1 and S3.8.

The probes consisted of 120 bp of RNA synthesized by Agilent. This length was used because it is the median size of protein coding exons (Clamp et al. 2007), and RNA was used because it hybridizes more strongly with DNA than DNA does. Probes were extensively tiled across the exons (10X coverage, 20X coverage for opsin genes) to increase the likelihood of hybridization of inserts with at least a single probe variant, with the goal of increasing capture of complete exons. Exons that were shorter than the probe length were padded with non-homologous sequence because probe length could not be shorter than 120 bp. The number of probes targeting short exons was boosted in order to normalize coverage of the target region. This resulted in a total of 45,895 probes after tiling and boosting.
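The tiling and padding steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the probe design procedure actually used: it assumes that 10X tiling corresponds to a step of 120 / 10 = 12 bp between probe start positions, that padding is appended after the exon, and that the padding is built from an arbitrary filler string. The function name, the `filler` argument, and the handling of the final probe are all hypothetical, and the boosting of probes for short exons is not modeled.

```python
PROBE_LEN = 120  # probe length in bp, as stated in the text

def tile_exon(exon, depth=10, filler="ACGT"):
    """Return a list of 120 bp probe sequences covering one exon.

    Assumptions (not from the source): the step between probe starts is
    PROBE_LEN // depth, and short exons are padded at their 3' end with
    repeats of `filler` standing in for non-homologous sequence.
    """
    if len(exon) <= PROBE_LEN:
        # Short exons cannot be tiled: emit one padded probe.
        pad_len = PROBE_LEN - len(exon)
        padding = (filler * (pad_len // len(filler) + 1))[:pad_len]
        return [exon + padding]
    step = PROBE_LEN // depth
    last_start = len(exon) - PROBE_LEN
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the exon end is covered
    return [exon[s:s + PROBE_LEN] for s in starts]

long_exon_probes = tile_exon("ACGT" * 60)   # a 240 bp toy exon
short_exon_probes = tile_exon("ATGGCCAAGT")  # a 10 bp toy exon
print(len(long_exon_probes), len(short_exon_probes))
```

With a 12 bp step, each interior base of a long exon is covered by up to ten overlapping probes, which is the sense in which the sketch interprets 10X tiling.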

3.5.2 Sample Preparation and Sequencing

To test the method, 16 squamate reptiles were selected that varied in their divergence from Anolis, including eight snakes, three geckos, and five other lizards (Table S3.2). As a positive control, we included Anolis (the squamate probe species) in this set of 16 species. This range of species spans much of the diversity of squamates and allowed for evaluation of the efficiency of capture and enrichment at different levels of sequence divergence. Genomic DNA (gDNA) was extracted from the muscle and/or liver samples using the DNeasy Blood and Tissue Kit (Qiagen) following the manufacturer’s protocol. Library creation, hybridization and sequencing were performed according to the Agilent SureSelect protocol at the Centre for Applied Genomics (TCAG; Sick Kids Hospital, Toronto). The 16 samples were sequenced on roughly half a HiSeq (Illumina) lane (approximately 1/32 of a lane per sample).

3.5.3 Reference File Creation for Guided Assembly

To facilitate assembly across divergent species, and evaluate the effect of the computational assembly on gene recovery, several different sets of reference files were generated that differed in the primary species used to build the reference. These were an Anolis, a snake, and a Gekko reference, as well as several additional small references targeting just the visual opsin genes. The Anolis reference was constructed using the available Anolis sequences from Genbank for the 166 targeted genes. Due to improvements to the Anolis genome and its associated gene predictions that occurred after building the probe set, the Anolis sequences present in the reference are not necessarily the same as those used to construct the probes. The updated sequences should represent more accurate predictions and thus were used in most cases. In some cases this meant inclusion of an Anolis sequence in the reference that was not present in the probe set. If a sequence still could not be found from Anolis, the next most closely related sequence was used (Python, Pelodiscus, Alligator, Gallus). In some cases when only a partial Anolis sequence was obtainable, the missing portion was added from Python. After an initial survey using this reference, six sequences were removed. Two genes (ALB, SLC24A1) were found to have been ancestrally lost in squamates (not present in any squamate genome or any of the 16 species sequenced in this study). One was identified to be a lineage-specific duplication in some birds (CRYD2). The probe for one gene, STRA6, was found to lack any homology with other STRA6 sequences and thus was not successful at capture. Additionally, three genes initially included, UBC, UBB, and UBI, were found to all represent the same gene (which we term UBC) and thus were combined. This left a total of 160 genes that made up the Anolis reference.

The snake reference was built primarily from sequences obtained from the Python and Thamnophis genomes and a de novo Trinity transcriptome assembly of Thamnophis. The Thamnophis sequences were preferred over the Python as the included snake species are more closely related to Thamnophis than Python. If only a partial sequence was obtainable, Anolis sequence was used to complete it when possible. This resulted in 139 sequences in the snake reference. A third reference was also used and this was based on sequences obtained from the Gekko japonicus genome (110 sequences).

3.5.4 Assembly and Analysis Pipeline

Raw reads were processed by Trimmomatic (Bolger et al. 2014) to remove low quality reads, as well as primer and index contamination under default settings. A complete pipeline was developed for the assembly and analysis of trimmed reads. First, reads were assembled using one of three methods: BWA-MEM (Li 2013), NGM (Sedlazeck et al. 2013), or Stampy (Lunter and Goodson 2011). BWA-MEM is the most conservative method in terms of tolerating mismatches between the reads and the reference, but is also the most accurate, whereas NGM and Stampy both tolerate more mismatches, but at the cost of some accuracy (Lunter and Goodson 2011; Li 2013; Sedlazeck et al. 2013; Turki and Roshan 2014). However, the benefit of allowing more mismatches in assembling reads to divergent reference sequences may outweigh a small reduction in accuracy. BWA-MEM was first run under default parameters, but assembly was found to suffer when applied across species. To address this, we reduced the mismatch penalty from the default of 4 to 2 (-B 2) and used this for subsequent analysis. NGM was run under default parameters. Stampy was also run under default parameters, but a subset of analyses were run to test the effect of changing the substitution rate parameter. Since Bowtie 2 (Langmead and Salzberg 2012) has been used recently to assemble targeted capture data (Ilves and Lopez-Fernandez 2014) we additionally implemented Bowtie 2 under the ‘very sensitive’ preset used by Ilves and Lopez-Fernandez (2014) in our analysis pipeline using the Anolis and snake references.
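The reason a lower mismatch penalty helps with divergent reads can be seen in a toy scoring calculation. The Python sketch below scores an ungapped, full-length alignment as matches minus penalized mismatches and asks how many mismatches a read can carry before its score drops below a reporting threshold. It assumes the BWA-MEM default match score of 1 (-A) and default minimum output score of 30 (-T) alongside the mismatch penalties of 4 and 2 given in the text. This is a deliberate simplification: BWA-MEM performs local alignment with seeding, gaps, and clipping, so the numbers are indicative only, and the function names are hypothetical.

```python
MATCH_SCORE = 1        # assumed BWA-MEM default (-A)
MIN_OUTPUT_SCORE = 30  # assumed BWA-MEM default (-T)

def ungapped_score(read_len, mismatches, mismatch_penalty):
    """Score of a full-length ungapped alignment with a given mismatch count."""
    matches = read_len - mismatches
    return MATCH_SCORE * matches - mismatch_penalty * mismatches

def max_mismatches(read_len, mismatch_penalty, min_score=MIN_OUTPUT_SCORE):
    """Largest mismatch count for which the ungapped score is >= min_score."""
    return (MATCH_SCORE * read_len - min_score) // (MATCH_SCORE + mismatch_penalty)

for penalty in (4, 2):  # default penalty and the relaxed value (-B 2)
    print(
        f"-B {penalty}: a 100 bp read with 20 mismatches scores "
        f"{ungapped_score(100, 20, penalty)}; up to "
        f"{max_mismatches(100, penalty)} mismatches reach the threshold"
    )
```

In this simplified model a 100 bp read at 80% identity scores 0 with the default penalty but 40 with the relaxed penalty, which is consistent with the improvement in cross-species assembly described above.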

Consensus sequences were called using the mpileup-bcf-vcfutils pipeline of Samtools (Li et al. 2009) with a minimum sequence and mapping quality score of 20 (-Q 20 and -q 20) and a minimum depth of coverage of 10 (-d 10). Additionally the parameter ‘l’ was set to 1 in vcfutils, which reduced the number of bases surrounding an indel that were replaced with ‘N’s to one.

Note that while the probes were targeted to individual exons, the assembly was done against complete coding regions. The consensus sequences generated are the assembled coding region of the captured gene. After removing lowercase letters from the consensus sequence (which signify bases that did not meet the quality and depth of coverage standards), the completeness of the recovered coding region, relative to the reference sequence, was calculated using custom scripts.

Consensus sequences were annotated by BLAST to identify the recovered gene in comparison to the gene targeted by the reference.
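The completeness calculation described above can be sketched in a few lines of Python. The custom scripts themselves are not shown in the text, so this is an illustration only: the function name is hypothetical, and the choice to count only uppercase A, C, G, and T (so that 'N' and gap characters do not contribute) is an assumption.

```python
def completeness(consensus, reference_length):
    """Percent of the reference coding region recovered in a consensus.

    Lowercase bases (below the quality or depth thresholds) are ignored,
    as are 'N' and gap characters (the latter is an assumption).
    """
    called = sum(1 for base in consensus if base in "ACGT")
    return 100.0 * called / reference_length

# Toy consensus: 6 confidently called bases against a 12 bp reference.
example = completeness("ATGCatgcNNAT", 12)
print(example)
```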

Completeness calculations and BLAST annotations were manually verified for each gene. Genes were considered recovered when they had at least 5% completeness and the BLAST annotations matched the targeted reference sequence. Where BLAST annotations were ambiguous, simple maximum likelihood gene trees were inferred using either PhyML (Guindon et al. 2010) or MEGA (Tamura et al. 2011) to verify sequence identities. Sequences that did not meet these criteria were removed and not used for further comparative analyses.
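The recovery criteria can likewise be expressed as a simple filter. In the Python sketch below, a gene counts as recovered when its completeness is at least 5% and its BLAST annotation matches the targeted gene, and average completeness is taken over recovered genes only, as in Table 3.5. The record layout and function name are hypothetical, the example values are invented for illustration, and the manual verification and gene tree checks described in the text are not modeled.

```python
def summarize_recovery(genes, min_completeness=5.0):
    """Count recovered genes and average their completeness.

    Each record is a dict with keys 'target' (gene targeted by the
    reference), 'blast_hit' (gene identified by BLAST) and
    'completeness' (percent).
    """
    recovered = [
        g for g in genes
        if g["completeness"] >= min_completeness and g["blast_hit"] == g["target"]
    ]
    mean = (
        sum(g["completeness"] for g in recovered) / len(recovered)
        if recovered else 0.0
    )
    return {
        "recovered": len(recovered),
        "percent_recovered": 100.0 * len(recovered) / len(genes),
        "mean_completeness": mean,
    }

# Invented example records, for illustration only.
records = [
    {"target": "ABCA4", "blast_hit": "ABCA4", "completeness": 98.0},
    {"target": "USH1C", "blast_hit": "USH1C", "completeness": 3.0},   # too incomplete
    {"target": "CNGA3", "blast_hit": "CNGA1", "completeness": 60.0},  # wrong BLAST hit
    {"target": "GNB5", "blast_hit": "GNB5", "completeness": 80.0},
]
result = summarize_recovery(records)
print(result)
```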

3.5.5 Method Analysis and Evaluation

Completeness of the recovered coding regions was compared across the different reference sets and assembly methods. In addition to completeness, we also compared the enrichment efficiency using the simple proxy of the percentage of reads that mapped to the reference. In order to evaluate the effect of sequence divergence on the recovery of the gene, a proxy for average sequence divergence between Anolis and each of the other 15 taxa was calculated. To avoid biasing the results, divergences could not be calculated based on the recovered sequences. Instead, genes that were independently sequenced and available on GenBank for each of the 16 species were needed. Six candidate genes were identified: BDNF, MOS, NTF3, RAG1, R35/GPR149, and ZEB2. To increase sample size, sequences from different species within the same genus were included. Pairwise identity was calculated between the sequence for Anolis and each of the 15 other species using PRANK (Loytynoja and Goldman 2005) to align the sequences followed by USEARCH to calculate a distance matrix (Edgar 2010). Three of the six genes (MOS, NTF3, and R35) had almost complete taxon coverage and similar average identities with the inclusion of additional species from the same genera. Comparison of identities between multiple species in the same genus revealed very little variation. As such, the average of these three genes was used as a proxy for average sequence identity between the species.
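To make the identity proxy concrete, the sketch below computes pairwise identity from two already aligned sequences and averages it across genes. It is an illustration only: it assumes identity is the fraction of matching columns among columns where neither sequence has a gap, which may differ from the exact definition USEARCH applies, and the function names and example sequences are hypothetical.

```python
from statistics import mean


def pairwise_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_proxy(per_gene_identity):
    """Average the per-gene identities (e.g. MOS, NTF3, R35) into one proxy value."""
    return mean(per_gene_identity.values())


# Made-up aligned fragments: 4 matches over 5 ungapped columns.
toy = pairwise_identity("ACGT-A", "ACCTTA")
```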

In addition to estimating sequence identity between species, we also calculated levels of sequence identity of the individual genes by utilizing the Gekko reference in a more specific, but also more robust, comparison. Pairwise sequence identity was calculated between the Anolis and Gekko reference sequences (obtained from the genome and thus independent from the target enrichment sequences) for each of the genes present and complete in both species. We compared these sequence identities to the completeness of the recovered coding regions obtained from assembly of the Gekko targeted capture reads assembled against the Gekko reference. This approach removed the effect of the cross-species assembly, enabling evaluation of the targeted capture efficiency directly.

To compare the effect of increased probe diversity and tiling, we compared the completeness of the visual opsin genes to the overall average. We also investigated the effect of short exons on gene completeness. Because exons shorter than 120 bp were padded with non-homologous sequence, and necessarily could not be tiled, we expected a reduction in the recovery of these exons. To evaluate this, we compared completeness of genes with no exons under 120, 100, or 50 bp to those that had one or more exons under these thresholds. Differences between the two groups were evaluated with the non-parametric Mann-Whitney test as the distributions were highly skewed (non-normal).
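For readers unfamiliar with the test, the sketch below shows one way a Mann-Whitney comparison of two completeness distributions could be computed. It is not the implementation used for the analysis: it assumes a two-sided p-value from the normal approximation with a tie correction and no continuity correction, and the function name and example values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value from a normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool the samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this group of tied values
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values tied: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))


# Made-up completeness values (%) for genes with and without short exons.
with_short = [62.0, 70.5, 81.0]
only_long = [90.0, 95.5, 99.0]
u, p = mann_whitney_u(with_short, only_long)
```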

3.5.6 Phylogenetic Analysis

To evaluate the usefulness of the recovered data for molecular evolutionary studies, a multi-gene species tree was inferred using the captured phylogenetic marker genes. Only genes that had both complete coding sequences in the probe and reference files and that were at least 80% complete were used. This resulted in the selection of 16 out of the 23 genes we had identified as phylogenetic markers. Sequences for each of these genes were aligned using MUSCLE (Edgar 2004) codon alignment implemented in MEGA (Tamura et al. 2011) along with outgroup sequences from Alligator mississippiensis and Chrysemys picta. Individual multiple sequence alignments were concatenated and partitioned into individual genes. The matrix was analyzed using MrBayes (Ronquist et al. 2012) using reversible jump MCMC with a gamma rate and invariant sites parameter (nst=mixed, rates=invgamma), which explores the parameter space for the nucleotide model and the phylogenetic tree simultaneously. The analysis was run for five million generations with a 25% burn-in. Convergence was confirmed by checking that the standard deviations of split frequencies approached zero and that there was no obvious trend in the log likelihood plot.
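The concatenation and partitioning step can be illustrated with a short sketch. It is an assumption-laden example rather than the procedure actually used: the function name, the input layout (gene to taxon to aligned sequence), the toy sequences, and the choice to fill missing taxa with '?' are all hypothetical.

```python
def concatenate_alignments(alignments):
    """Concatenate per-gene alignments and record 1-based partition boundaries.

    alignments maps gene name -> {taxon: aligned sequence}. Taxa missing
    from a gene are filled with '?' (an assumption of this sketch).
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(gene + ": aligned sequences differ in length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "?" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy input with made-up sequences; the second gene lacks one taxon.
toy_alignments = {
    "geneA": {"Anolis": "ATG", "Gekko": "ATA"},
    "geneB": {"Anolis": "CCGG"},
}
matrix, partitions = concatenate_alignments(toy_alignments)
```

The partition tuples give the start and end columns of each gene in the concatenated matrix, which is the information a partitioned analysis needs to assign a separate model to each gene.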

3.6 Data Availability

Data associated with the manuscript including probe information, custom scripts, and reference files are available through DRYAD (Schott et al. 2017).

3.7 Acknowledgements

We thank Jiayang Wu for improvements to the assembly and analysis pipeline scripts. We would also like to thank Dante Cerrullo and Agilent for their help with probe design and Sergio Pereira and the Centre for Applied Genomics at Sick Kids for their assistance with the targeted capture and sequencing. This work was supported by a Natural Sciences and Engineering Research Council (NSERC) Discovery grant (BSWC), an Ontario Graduate Scholarship (RKS), and a Vision Science Research Program Scholarship (RKS).

3.8 References

Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song XZ, Richmond TA, Middle

CM, Rodesch MJ, Packard CJ, Weinstock GM, Gibbs RA 2007. Direct selection of human

genomic loci by microarray hybridization. Nat Methods 4: 903-905.

Alfoldi J, Di Palma F, Grabherr M, Williams C, Kong LS, Mauceli E, Russell P, Lowe CB, Glor

RE, Jaffe JD, Ray DA, Boissinot S, Shedlock AM, Botka C, Castoe TA, Colbourne JK,

Fujita MK, Moreno RG, ten Hallers BF, Haussler D, Heger A, Heiman D, Janes DE,

Johnson J, de Jong PJ, Koriabine MY, Lara M, Novick PA, Organ CL, Peach SE, Poe S,

Pollock DD, de Queiroz K, Sanger T, Searle S, Smith JD, Smith Z, Swofford R, Turner-

Maier J, Wade J, Young S, Zadissa A, Edwards SV, Glenn TC, Schneider CJ, Losos JB,

Lander ES, Breen M, Ponting CP, Lindblad-Toh K 2011. The genome of the green anole

lizard and a comparative analysis with birds and mammals. Nature 477: 587-591.

Altshuler D, Pollara VJ, Cowles CR, Van Etten WJ, Baldwin J, Linton L, Lander ES 2000. An

SNP map of the human genome generated by reduced representation shotgun sequencing.

Nature 407: 513-516.

Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, Selker EU, Cresko WA,

Johnson EA 2008. Rapid SNP discovery and genetic mapping using sequenced RAD

markers. PLoS One 3: e3376.

Barley AJ, Spinks PQ, Thomson RC, Shaffer HB 2010. Fourteen nuclear genes provide

phylogenetic resolution for difficult nodes in the turtle tree of life. Mol Phylogen Evol 55:

1189-1194.

Bi K, Vanderpool D, Singhal S, Linderoth T, Moritz C, Good JM 2012. Transcriptome-based

exon capture enables highly cost-effective comparative genomic data collection at

moderate evolutionary scales. BMC Genomics 13: 403.

Bolger AM, Lohse M, Usadel B 2014. Trimmomatic: a flexible trimmer for Illumina sequence

data. Bioinformatics 30: 2114-2120.

Bragg JG, Potter S, Bi K, Moritz C 2016. Exon capture phylogenomics: efficacy across scales of

divergence. Molecular Ecology Resources 16: 1059-1068.

Brousseau L, Tinaut A, Duret C, Lang TG, Garnier-Gere P, Scotti I 2014. High-throughput

transcriptome sequencing and preliminary functional analysis in four Neotropical tree

species. BMC Genomics 15: 238.

Bundock PC, Casu RE, Henry RJ 2012. Enrichment of genomic DNA for polymorphism

detection in a non-model highly polyploid crop plant. Plant Biotechnol J 10: 657-667.

Bybee SM, Bracken-Grissom H, Haynes BD, Hermansen RA, Byers RL, Clement MJ, Udall JA,

Wilcox ER, Crandall KA 2011. Targeted amplicon sequencing (TAS): a scalable next-gen

approach to multilocus, multitaxa phylogenetics. Genome Biol Evol 3: 1312-1323.

Card DC, Schield DR, Reyes-Velasco J, Fujita MK, Andrew AL, Oyler-McCance SJ, Fike JA,

Tomback DF, Ruggiero RP, Castoe TA 2014. Two Low Coverage Bird Genomes and a

Comparison of Reference-Guided versus De Novo Genome Assemblies. PLoS One 9:

e106649.

Castoe TA, Bronikowski AM, Brodie ED, Edwards SV, Pfrender ME, Shapiro MD, Pollock DD,

Warren WC 2011. A proposal to sequence the genome of a garter snake (Thamnophis

sirtalis). Standards in Genomic Sciences 4: 257-270.

Castoe TA, de Koning APJ, Hall KT, Card DC, Schield DR, Fujita MK, Ruggiero RP, Degner

JF, Daza JM, Gu WJ, Reyes-Velasco J, Shaney KJ, Castoe JM, Fox SE, Poole AW,

Polanco D, Dobry J, Vandewege MW, Li Q, Schott RK, Kapusta A, Minx P, Feschotte C,

Uetz P, Ray DA, Hoffmann FG, Bogden R, Smith EN, Chang BSW, Vonk FJ, Casewell

NR, Henkel CV, Richardson MK, Mackessy SP, Bronikowski AM, Yandell M, Warren WC,

Secor SM, Pollock DD 2013. The Burmese python genome reveals the molecular basis for

extreme adaptation in snakes. Proc Natl Acad Sci U S A 110: 20645-20650.

Clamp M, Fry B, Kamal M, Xie XH, Cuff J, Lin MF, Kellis M, Lindblad-Toh K, Lander ES

2007. Distinguishing protein-coding and noncoding genes in the human genome. Proc Natl

Acad Sci U S A 104: 19428-19433.

Crawford NG, Faircloth BC, McCormack JE, Brumfield RT, Winker K, Glenn TC 2012. More

than 1000 ultraconserved elements provide evidence that turtles are the sister group of

archosaurs. Biol Lett 8: 783-786.

Cronn R, Knaus BJ, Liston A, Maughan PJ, Parks M, Syring JV, Udall J 2012. Targeted

enrichment strategies for next-generation plant biology. Am J Bot 99: 291-311.

Eckalbar WL, Hutchins ED, Markov GJ, Allen AN, Corneveaux JJ, Lindblad-Toh K, Di Palma

F, Alfoldi J, Huentelman MJ, Kusumi K 2013. Genome reannotation of the lizard Anolis

carolinensis based on 14 adult and embryonic deep transcriptomes. BMC Genomics 14: 49.

Edgar RC 2004. MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res 32: 1792-1797.

Edgar RC 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics

26: 2460-2461.

Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC 2012.

Ultraconserved Elements Anchor Thousands of Genetic Markers Spanning Multiple

Evolutionary Timescales. Syst Biol 61: 717-726.

Gallant JR, Traeger LL, Volkening JD, Moffett H, Chen PH, Novina CD, Phillips GN, Anand R,

Wells GB, Pinch M, Guth R, Unguez GA, Albert JS, Zakon HH, Samanta MP, Sussman

MR 2014. Genomic basis for the convergent evolution of electric organs. Science 344:

1522-1525.

Gerstein MB, Rozowsky J, Yan KK, Wang DF, Cheng C, Brown JB, Davis CA, Hillier L, Sisu

C, Li JJ, Pei BK, Harmanci AO, Duff MO, Djebali S, Alexander RP, Alver BH, Auerbach

R, Bell K, Bickel PJ, Boeck ME, Boley NP, Booth BW, Cherbas L, Cherbas P, Di C,

Dobins A, Drenkows J, Ewing B, Fang G, Fastucas M, Feingold EA, Frankish A, Gao GJ,

Good PJ, Guigo R, Hammonds A, Harrow J, Hoskins RA, Howald C, Hu L, Huang HY,

Hubbard TJP, Huynh C, Jhas S, Kasper D, Kato M, Kaufman TC, Kitchen RR, Ladewig E,

Lagarde J, Lai E, Leng L, Lu Z, MacCoss M, May G, McWhirter R, Merrihew G, Miller

DM, Mortazavi A, Murad R, Oliver B, Olson S, Park PJ, Pazin MJ, Perrimon N,

Pervouchine D, Reinke V, Reymond A, Robinson G, Samsonova A, Saunders GI,

Schlesingers F, Sethi A, Slack FJ, Spencer WC, Stoiber MH, Strasbourger P, Tanzer A,

Thompson OA, Wan KH, Wang GL, Wang H, Watkins KL, Wen JY, Wen KJ, Xue CH,

Yang L, Yip K, Zaleskis C, Zhang Y, Zheng H, Brenner SE, Graveley BR, Ceniker SE,

Gingeras TR, Waterston R 2014. Comparative analysis of the transcriptome across distant

species. Nature 512: 445-448.

Gnirke A, Melnikov A, Maguire J, Rogov P, LeProust EM, Brockman W, Fennell T,

Giannoukos G, Fisher S, Russ C, Gabriel S, Jaffe DB, Lander ES, Nusbaum C 2009.

Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted

sequencing. Nat Biotechnol 27: 182-189.

Grover CE, Salmon A, Wendel JF 2012. Targeted sequence capture as a powerful tool for

evolutionary analysis. Am J Bot 99: 312-319.

Guindon S, Dufayard J, Lefort V, Anisimova M, Hordijk W, Gascuel O 2010. New Algorithms

and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance

of PhyML 3.0. Syst Biol 59: 307-321.

Harshman J, Huddleston CJ, Bollback JP, Parsons TJ, Braun MJ 2003. True and false gharials: A

nuclear gene phylogeny of Crocodylia. Syst Biol 52: 386-402.

Havird JC, Sloan DB 2016. The Roles of Mutation, Selection, and Expression in Determining

Relative Rates of Evolution in Mitochondrial versus Nuclear Genomes. Mol Biol Evol 33:

3042-3053.

Hedges SB, Marin J, Suleski M, Paymer M, Kumar S 2015. Tree of Life Reveals Clock-Like

Speciation and Diversification. Mol Biol Evol 32: 835-845.

Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ,

Albert TJ, Hannon GJ, McCombie WR 2007. Genome-wide in situ exon capture for

selective resequencing. Nat Genet 39: 1522-1527.

Hugall AF, Foster R, Lee MSY 2007. Calibration choice, rate smoothing, and the pattern of

tetrapod diversification according to the long nuclear gene RAG-1. Syst Biol 56: 543-563.

Hugall AF, O'Hara TD, Hunjan S, Nilsen R, Moussalli A 2016. An Exon-Capture System for the

Entire Class Ophiuroidea. Mol Biol Evol 33: 281-294.

Ilves KL, Lopez-Fernandez H 2014. A targeted next-generation sequencing toolkit for exon-

based cichlid phylogenomics. Molecular Ecology Resources 14: 802-811.

Iwabe N, Hara Y, Kumazawa Y, Shibamoto K, Saito Y, Miyata T, Katoh K 2005. Sister group

relationship of turtles to the bird-crocodilian clade revealed by nuclear DNA-coded

proteins. Mol Biol Evol 22: 810-813.

Jin X, He MZ, Ferguson B, Meng YH, Ouyang LM, Ren JJ, Mailund T, Sun F, Sun LD, Shen J,

Zhuo M, Song L, Wang JF, Ling F, Zhu YQ, Hvilsom C, Siegismund H, Liu XM, Gong

ZL, Ji F, Wang XZ, Liu BQ, Zhang Y, Hou JG, Wang J, Zhao H, Wang YY, Fang XD,

Zhang GJ, Wang J, Zhang XJ, Schierup MH, Du HL, Wang J, Wang XN 2012. An Effort

to Use Human-Based Exome Capture Methods to Analyze Chimpanzee and Macaque

Exomes. PLoS One 7: e40637.

Jones MR, Good JM 2016. Targeted capture in evolutionary and ecological genomics. Mol Ecol

25: 185-202.

Kenny EM, Cormican P, Gilks WP, Gates AS, O'Dushlaine CT, Pinto C, Corvin AP, Gill M,

Morris DW 2011. Multiplex Target Enrichment Using DNA Indexing for Ultra-High

Throughput SNP Detection. DNA Res 18: 31-38.

Koepfli KP, Paten B, O'Brien SJ 2015. The Genome 10K Project: a way forward. Annu Rev

Anim Biosci 3: 57-111.

Kunstner A, Wolf JBW, Backstrom N, Whitney O, Balakrishnan CN, Day L, Edwards SV, Janes

DE, Schlinger BA, Wilson RK, Jarvis ED, Warren WC, Ellegren H 2010. Comparative

genomics based on massive parallel transcriptome sequencing reveals patterns of

substitution and selection across 10 bird species. Mol Ecol 19: 266-276.

Kwok MCM, Holopainen JM, Molday LL, Foster LJ, Molday RS 2008. Proteomics of

photoreceptor outer segments identifies a subset of SNARE and Rab proteins implicated in

membrane vesicle trafficking and fusion. Mol Cell Proteomics 7: 1053-1066.

Langmead B, Salzberg SL 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:

357-359.

Lemmon AR, Emme SA, Lemmon EM 2012. Anchored Hybrid Enrichment for Massively High-

Throughput Phylogenomics. Syst Biol 61: 727-744.

Li CH, Hofreiter M, Straube N, Corrigan S, Naylor GJP 2013. Capturing protein-coding genes

across highly divergent species. BioTechniques 54: 321-+.

Li H 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.

arXiv 1303.3997v1.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R,

1000 Genome Project Data Processing Subgroup 2009. The Sequence Alignment/Map format and SAMtools.

Bioinformatics 25: 2078-2079.

Liu Y, Zhou Q, Wang YJ, Luo LH, Yang J, Yang LF, Liu M, Li YR, Qian TM, Zheng Y, Li

MY, Li J, Gu Y, Han ZJ, Xu M, Wang YJ, Zhu CL, Yu B, Yang YM, Ding F, Jiang JP,

Yang HM, Gu XS 2015. Gekko japonicus genome reveals evolution of adhesive toe pads

and tail regeneration. Nature Communications 6.

LoVerso PR, Cui F 2015. A Computational Pipeline for Cross-Species Analysis of RNA-seq

Data Using R and Bioconductor. Bioinform Biol Insights 9: 165-174.

Loytynoja A, Goldman N 2005. An algorithm for progressive multiple alignment of sequences

with insertions. Proc Natl Acad Sci U S A 102: 10557-10562.

Lunter G, Goodson M 2011. Stampy: A statistical algorithm for sensitive and fast mapping of

Illumina sequence reads. Genome Res 21: 936-939.

Mamanova L, Coffey AJ, Scott CE, Kozarewa I, Turner EH, Kumar A, Howard E, Shendure J,

Turner DJ 2010. Target-enrichment strategies for next-generation sequencing. Nat Methods

7: 111-118.

Mason VC, Li G, Helgen KM, Murphy WJ 2011. Efficient cross-species capture hybridization

and next-generation sequencing of mitochondrial genomes from noninvasively sampled

museum specimens. Genome Res 21: 1695-1704.

McAliley LR, Willis RE, Ray DA, White PS, Brochu CA, Densmore LD 2006. Are crocodiles

really monophyletic? Evidence for subdivisions from sequence and morphological data.

Mol Phylogen Evol 39: 16-32.

McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC 2012.

Ultraconserved elements are novel phylogenomic markers that resolve placental mammal

phylogeny when combined with species-tree analysis. Genome Res 22: 746-754.

Mertes F, ElSharawy A, Sauer S, van Helvoort J, van der Zaag PJ, Franke A, Nilsson M,

Lehrach H, Brookes AJ 2011. Targeted enrichment of genomic DNA regions for next-

generation sequencing. Briefings in Functional Genomics 10: 374-386.

Meyer M, Stenzel U, Hofreiter M 2008. Parallel tagged sequencing on the 454 platform. Nat

Protoc 3: 267-278.

Miller MR, Atwood TS, Eames BF, Eberhart JK, Yan YL, Postlethwait JH, Johnson EA 2007.

RAD marker microarrays enable rapid mapping of zebrafish mutations. Genome Biol 8:

R105.

Nijman IJ, Mokry M, van Boxtel R, Toonen P, de Bruijn E, Cuppen E 2010. Mutation discovery

by targeted genomic enrichment of multiplexed barcoded samples. Nat Methods 7: 913-915.

Okou DT, Steinberg KM, Middle C, Cutler DJ, Albert TJ, Zwick ME 2007. Microarray-based

genomic selection for high-throughput resequencing. Nat Methods 4: 907-909.

Penalba JV, Smith LL, Tonione MA, Sass C, Hykin SM, Skipwith PL, McGuire JA, Bowie

RCK, Moritz C 2014. Sequence capture using PCR-generated probes: a cost-effective

method of targeted high-throughput sequencing for nonmodel organisms. Molecular

Ecology Resources 14: 1000-1010.

Phillips GAC, Carleton KL, Marshall NJ 2016. Multiple Genetic Mechanisms Contribute to

Visual Sensitivity Variation in the Labridae. Mol Biol Evol 33: 201-215.

Porreca GJ, Zhang K, Li JB, Xie B, Austin D, Vassallo SL, LeProust EM, Peck BJ, Emig CJ,

Dahl F, Gao Y, Church GM, Shendure J 2007. Multiplex amplification of large sets of

human exons. Nat Methods 4: 931-936.

Portik DM, Smith LL, Bi K 2016. An evaluation of transcriptome-based exon capture for frog

phylogenomics across multiple scales of divergence (Class: Amphibia, Order: Anura).

Molecular Ecology Resources 16: 1069-1083.

Pyron RA, Burbrink FT, Wiens JJ 2013. A phylogeny and revised classification of Squamata,

including 4161 species of lizards and snakes. BMC Evol Biol 13: 93.

Rohland N, Reich D 2012. Cost-effective, high-throughput DNA sequencing libraries for

multiplexed target capture. Genome Res 22: 939-946.

Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L,

Suchard MA, Huelsenbeck JP 2012. MrBayes 3.2: Efficient Bayesian Phylogenetic

Inference and Model Choice Across a Large Model Space. Syst Biol 61: 539-542.

Schott RK, Panesar B, Card DC, Preston M, Castoe TA, Chang BSW. 2017. Data from: Targeted

capture of complete coding regions across divergent species. Dryad Digital Repository:

http://dx.doi.org/10.5061/dryad.f5qk7.2.

Schulz HL, Goetz T, Kaschkoetoe J, Weber BHF 2004. The retinome - Defining a reference

transcriptome of the adult mammalian retina/retinal pigment epithelium. BMC Genomics 5:

50.

Sedlazeck FJ, Rescheneder P, von Haeseler A 2013. NextGenMap: fast and accurate read

mapping in highly polymorphic genomes. Bioinformatics 29: 2790-2791.

She XW, Rohl CA, Castle JC, Kulkarni AV, Johnson JM, Chen RH 2009. Definition,

conservation and epigenetics of housekeeping and tissue-enriched genes. BMC Genomics

10: 269.

Shen XX, Liang D, Feng YJ, Chen MY, Zhang P 2013. A Versatile and Highly Efficient Toolkit

Including 102 Nuclear Markers for Vertebrate Phylogenomics, Tested by Resolving the

Higher Level Relationships of the Caudata. Mol Biol Evol 30: 2235-2248.

Summerer D, Wu HG, Haase B, Cheng Y, Schracke N, Stahler CF, Chee MS, Stahler PF, Beier

M 2009. Microarray-based multicycle-enrichment of genomic subsets for targeted next-

generation sequencing. Genome Res 19: 1616-1621.

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S 2011. MEGA5: molecular

evolutionary genetics analysis using maximum likelihood, evolutionary distance, and

maximum parsimony methods. Mol Biol Evol 28: 2731-2739.

Teer JK, Bonnycastle LL, Chines PS, Hansen NF, Aoyama N, Swift AJ, Abaan HO, Albert TJ,

Margulies EH, Green ED, Collins FS, Mullikin JC, Biesecker LG, Sequencing NC 2010.

Systematic comparison of three genomic enrichment methods for massively parallel DNA

sequencing. Genome Res 20: 1420-1431.

Teer JK, Mullikin JC 2010. Exome sequencing: the sweet spot before whole genomes. Hum Mol

Genet 19: R145-R151.

Turki T, Roshan U 2014. MaxSSmap: a GPU program for mapping divergent short reads to

genomes with the maximum scoring subsequence. BMC Genomics 15: 969.

Vallender EJ 2011. Expanding whole exome resequencing into non-human primates. Genome

Biol 12.

Vidal N, Hedges SB 2005. The phylogeny of squamate reptiles (lizards, snakes, and

amphisbaenians) inferred from nine nuclear protein-coding genes. C R Biol 328: 1000-

1008.

Vonk FJ, Casewell NR, Henkel CV, Heimberg AM, Jansen HJ, McCleary RJR, Kerkkamp

HME, Vos RA, Guerreiro I, Calvete JJ, Wuster W, Woods AE, Logan JM, Harrison RA,

Castoe TA, de Koning APJ, Pollock DD, Yandell M, Calderon D, Renjifo C, Currier RB,

Salgado D, Pla D, Sanz L, Hyder AS, Ribeiro JMC, Arntzen JW, van den Thillart G,

Boetzer M, Pirovano W, Dirks RP, Spaink HP, Duboule D, McGlinn E, Kini RM,

Richardson MK 2013. The king cobra genome reveals dynamic gene evolution and

adaptation in the snake venom system. Proc Natl Acad Sci U S A 110: 20651-20656.

Wang Z, Gerstein M, Snyder M 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat

Rev Genet 10: 57-63.

Warr A, Robert C, Hume D, Archibald A, Deeb N, Watson M 2015. Exome Sequencing: Current

and Future Perspectives. G3-Genes Genomes Genetics 5: 1543-1550.

Wu YH, Hadly EA, Teng WJ, Hao YY, Liang W, Liu Y, Wang HT 2016. Retinal transcriptome

sequencing sheds light on the adaptation to nocturnal and diurnal lifestyles in raptors (vol

6, 33578, 2016). Sci Rep 6.

Yang ZZ, Wafula EK, Honaas LA, Zhang HT, Das M, Fernandez-Aparicio M, Huang K,

Bandaranayake PCG, Wu B, Der JP, Clarke CR, Ralph PE, Landherr L, Altman NS,

Timko MP, Yoder JI, Westwood JH, dePamphilis CW 2015. Comparative Transcriptome

Analyses Reveal Core Parasitism Genes and Suggest Gene Duplication and Repurposing as

Sources of Structural Novelty. Mol Biol Evol 32: 767-790.

Zhang GJ, Li B, Li C, Gilbert MTP, Jarvis ED, Wang J, Avian Genome C 2014. Comparative

genomic data of the Avian Phylogenomics Project. Gigascience 3.

Zhang X, Wensel TG, Yuan C 2006. Tokay gecko photoreceptors achieve rod-like physiology

with cone-like proteins. Photochem Photobiol 82: 1452-1460.

3.9 Supplementary Tables

Supplementary Tables are available at: https://academic.oup.com/gbe/article-lookup/doi/10.1093/gbe/evx005#supplementary-data

Table S3.1. Detailed information on the 166 targeted genes and the probes designed based on them. Genes that did not have an Anolis probe are highlighted in orange.

Table S3.2. List of the 16 squamate reptiles that were sequenced in order of phylogenetic divergence from Anolis.

Table S3.3. Assembly statistics using each of the four different assemblers against each of the three reference sets.

Table S3.4. Recovery and completeness of individual genes for each of the assembly methods for the 16 species.

Table S3.5. Average gene recovery for each species using each of the different assembly methods and references.

Table S3.6. Average gene completeness for each species using each of the different assembly methods and references.

Table S3.7. Completeness of opsin genes using an additional reference, demonstrating increased completeness as a result of increased probe diversity.

Table S3.8. Differences in completeness between genes with only long exons compared to those that also had short exons. Only genes present in the Anolis probeset were included.

Table S3.9. Pairwise sequence identity of the 16 species compared to Anolis for a set of independent genes compared to average completeness levels for those species.

Table S3.10. Comparison of pairwise sequence identity between Anolis and Gekko for those genes present (and complete) in both species to the recovered completeness of the Gekko hybrid enrichment data assembled using BWA and the Gekko reference. The Gekko sequences used to determine identity were obtained from the Gekko genome and thus are independent from the targeted capture data that completeness was calculated from.

Table S3.11. Comparison of targeted capture and RNA-seq data for Thamnophis sirtalis. RNA-Seq data was assembled using both the hybrid enrichment pipeline and de novo using Trinity.

Table S3.12. Test of targeted capture assembly pipeline on whole genome sequence data of varying coverage. Genomic reads were assembled with the hybrid enrichment pipeline using BWA to a Gallus reference.

Table S3.13. Detailed cost breakdown of targeted capture, RNA-Seq, and whole genome sequencing experiments. Costs were those incurred by us in CDN dollars and are likely to vary depending on country and sequencing centre used and are expected to change over time. Costs assume use of an Agilent Custom SureSelect kit (either 16 or 96 samples) and of a paid service at a core facility for targeted capture, library preparation, and/or sequencing, and thus should be generally reproducible by any lab without need for specialized equipment/expertise. They additionally assume the possibility of sequencing on partial HiSeq lanes when necessary.

Table S3.14. List of the 23 phylogenetic genes captured and their respective average and minimum completeness values. Of the 23, 16 genes (highlighted in green) had average and minimum completeness levels above 80% and were therefore selected for phylogenetic analysis. Despite high recovered completeness, ENO1 was excluded due to issues with the probe sequence.

Chapter 4 Shifts in selective pressures on snake phototransduction genes associated with photoreceptor transmutation and dim-light ancestry

Citation: Schott RK, A Van Nynatten, DC Card, TA Castoe, BSW Chang. In revision. Shifts in selective pressures on snake phototransduction genes associated with photoreceptor transmutation and dim-light ancestry. Molecular Biology and Evolution MBE-17-0667.

Author Contributions: Conceived and designed the study: RKS, BSWC. Obtained and processed tissue and genetic samples: RKS, DCC, TAC, BSWC. Performed the experiments: RKS. Analyzed the data: RKS. Assisted with data visualization: AVN. Wrote the manuscript: RKS, TAC, BSWC, with approval from all authors.

4.1 Abstract

The visual system of snakes is heavily modified relative to other squamates, a condition thought to reflect either their fossorial or aquatic origins. Further modifications are seen in caenophidian snakes, where evolutionary transitions between rod and cone photoreceptors, termed photoreceptor transmutations, have occurred in many lineages. Little previous work, however, has focused on the molecular evolutionary underpinnings of these morphological changes. To address this, we sequenced seven snake eye transcriptomes and utilized new whole genome and targeted capture sequencing data. We used this data to analyze gene loss and shifts in selection pressures in phototransduction genes that may be associated with snake evolutionary origins and photoreceptor transmutation. We identified the surprising loss of rhodopsin kinase (GRK1), despite a low degree of gene loss overall and a lack of relaxed selection early during snake evolution. These results provide some of the first evolutionary genomic corroboration for a dim-light ancestor that lacks strong fossorial adaptations. Our results also indicate that snakes with photoreceptor transmutation experienced significantly different selection pressures from other reptiles. Significant positive selection was found primarily in cone-specific genes, but not rod-specific genes, contrary to our expectations. These results reveal potential molecular adaptations associated with photoreceptor transmutation, and also highlight unappreciated functional differences between rod- and cone-specific phototransduction proteins. This intriguing example of snake visual system evolution illustrates how the underlying molecular components of a complex system can be reshaped in response to changing selection pressures.

4.2 Introduction

Snakes are a diverse group of squamate reptiles that are fascinating due in part to their contested evolutionary origins. Early work suggested that snakes may have had an aquatic origin based on affinities with extinct marine squamates such as mosasaurs and dolichosaurs (Nopcsa 1908; 1923; for review see Lee and Caldwell 2000). Walls (1940), however, noted that snake eyes were heavily modified compared to other squamates such that they contain no structural features that could identify them as being squamate, or even reptilian, eyes. Walls (1940) hypothesized that these changes were due to a fossorial phase during the early evolution of snakes that led to a degeneration of the eye, followed later by recolonization of terrestrial habitats that necessitated a re-evolution of eye structure and function. While this view was supported by later studies (Bellairs and Underwood 1951; Rieppel 1988), a quantitative morphometric analysis of eye morphology by Caprette et al. (2004) indicated that snake eyes most closely resembled those of primitively aquatic vertebrates, supporting an aquatic origin for snakes. Similarly, phylogenetic and fossil evidence has provided mixed, and often contradictory, support for both hypotheses resulting in an ongoing debate on snake origins (Caldwell and Lee 1997; Lee 2005; Longrich et al. 2012; Hsiang et al. 2015; Simões et al. 2015; Yi and Norell 2015; Lee et al. 2016).

Beyond their implications for snake origins, snake eyes are also very interesting due to the predominance of all-cone and all-rod retinas, a feature that is extremely rare in other vertebrate groups (Walls 1942; Underwood 1970; Schott et al. 2016b). Typical vertebrate retinas are duplex, containing both rod and cone photoreceptors. Rods are much more photosensitive and less noisy than cones, enabling vision in dim light, but have slow response and recovery kinetics causing them to saturate under bright light (Lamb 2013). Cones have much faster response and recovery times and can respond over a wider range of intensities than rods; however, they are less sensitive and noisier, making vision in dim light unreliable (Lamb 2010, 2013). Rod and cone photoreceptor cells differ in both their morphology and molecular components, and these contribute to their differences in physiology (for a review see Ingram et al. 2016). For this reason, only a few groups, most notably snakes and other squamate reptiles, have simplex retinas that contain only rods or only cones. Snakes in particular have a wide range of retinal compositions, including not only all-cone and all-rod retinas, but also retinas with photoreceptor morphologies that are intermediate between typical vertebrate rods and cones (Walls 1942; Underwood 1970).

This diversity of retinal types and photoreceptor morphologies within snakes appears to be restricted to caenophidians, a taxonomically, ecologically, and phenotypically diverse lineage (Walls 1942; Greene 1997; Vidal et al. 2007). While non-caenophidian snakes surveyed to date have retinas containing only reduced rods (scolecophidians), or simple duplex retinas with single cones and rods (‘henophidian’-grade species, such as pythons and boas), caenophidians have retinas that are more complex and variable (Fig. 4.1; Walls 1942; Underwood 1970). It was this variability in photoreceptor morphology that led Walls (1934, 1942) to formulate the transmutation theory, whereby he postulated that rod and cone photoreceptors could evolutionarily transition to the opposite cell type. As a result of photoreceptor transmutation, Walls (1934, 1942) inferred that evolutionary shifts between duplex (rod and cone) and simplex (all-cone or all-rod) retinas were possible and had occurred in a few specific vertebrate groups, most notably geckos and caenophidian snakes.

Recently we have provided the first molecular evidence for photoreceptor transmutation in snakes (Schott et al. 2016b). We demonstrated that the diurnal garter snake, Thamnophis proximus, which has a superficially all-cone retina (based on gross photoreceptor morphology), expresses RH1, the rod visual pigment, in a cone-like photoreceptor that is actually evolutionarily derived from a rod, based on its ultrastructure and rod-specific molecular components (Schott et al. 2016b). This finding supports the evolution of the all-cone retina in snakes through photoreceptor transmutation, rather than the loss of rods as originally proposed by Walls (1942). Instead, the all-cone retina likely evolved from a duplex retina similar to that seen in some vipers, with the other retinal types evolving from this ancestral type, as well as each other, multiple times (Fig. 4.1).

Despite these advances, the extent and impact of photoreceptor transmutation on the evolution and function of the visual system of snakes remains largely unknown. All previous molecular-based work in this area has focused on the visual pigments (Davies et al. 2009; Simões et al. 2015; Schott et al. 2016b; Simões et al. 2016a; Simões et al. 2016b), and has largely ignored the numerous other proteins involved in vertebrate visual systems, including those involved in the phototransduction cascade. In vertebrates, vision begins with phototransduction, the process by which light is converted to an electrical signal in the rod and cone photoreceptors (for an overview of phototransduction see Fig. 4.2; for more detailed reviews see Wensel 2008; Fain et al. 2010; Lamb 2013). The process in rods and cones is similar, but involves some distinct proteins in the two photoreceptor types (Table 4.1), including the light-sensitive visual pigments that begin the phototransduction cascade. Overall, the phototransduction cascade involves over 35 proteins, many of which are either cone or rod specific (Fig. 4.2; Table 4.1).

To study the molecular evolution of phototransduction genes in snakes we sequenced whole eye transcriptomes from seven colubrid caenophidians, including species with all-cone and all-rod retinas based on gross morphology of the outer segments. We also utilized recently sequenced snake and other reptile genomes, as well as new targeted capture sequencing data (Schott et al. 2017), and available resources from Genbank. From these sources we extracted all known reptilian phototransduction gene coding sequences. Using these data we investigated the effect that snake evolutionary origins and photoreceptor transmutation may have had on the evolution of phototransduction genes using codon-based likelihood models implemented in PAML (Yang 2007). We focused on use of the clade models (Bielawski and Yang 2004; Weadick and Chang 2012), which allow variation in selective constraint between (or among) different partitions of a phylogeny. These models have been shown to be extremely useful in testing for long-term shifts in selection pressure associated with changes in ecology and function (Schott et al. 2014; Torres-Dowdall et al. 2015; Van Nynatten et al. 2015; Baker et al. 2016; Dungan et al. 2016; Castiglione et al. 2017; Hauser et al. 2017). Evolutionary transitions between duplex, all-rod, and all-cone retinas through photoreceptor transmutation would presumably require extensive changes to the underlying molecular components of the visual system beyond the morphological changes observed by Walls (1942). These changes likely imposed distinct selection pressures on snake, and in particular caenophidian, visual transduction genes relative to other reptiles.

Furthermore, if snakes had a fossorial origin, as proposed by Walls (1940), that included a degradation of the visual system, we would expect a relaxation of selective constraint on phototransduction genes early in snake evolution, as well as considerable gene loss.

Figure 4.1. Schematic illustration of major snake retina types. The henophidian-type duplex retina contains large single cones with LWS (red), small single cones with SWS1 (pink), and rods with RH1 (white). The caenophidian-type duplex retina additionally has double cones that both contain LWS. The other retina types are variations on the caenophidian-type duplex retina and are inferred to be derived from it (and/or each other) through photoreceptor transmutation. Considerable variation in photoreceptor morphology exists within each of these major retina types. The ‘degenerate’, all-rod retinas of scolecophidians are not shown. Generalized photoreceptor outer segment morphology is shown based on Walls (1942) and Underwood (1970). Contained visual pigments are based on MSP and sequencing data (Sillman et al. 1997; Sillman et al. 1999; Sillman et al. 2001; Davies et al. 2009; Simões et al. 2015; Schott et al. 2016b; Simões et al. 2016a; Simões et al. 2016b). Photoreceptor cartoons based on those of Bowmaker (2008).


Figure 4.2. Generalized schematic of vertebrate rod phototransduction cascade. In the dark the components of the cascade are largely inactive, except for the cyclic nucleotide gated channel (CNG), and the light-insensitive Na+/Ca2+-K+ exchanger. These proteins, located on the plasma membrane, result in a stable depolarizing current in the dark. Light activation of rhodopsin, shown here as a dimer, results in a conformational change that opens a binding site for the G protein transducin, facilitating the exchange of GDP with GTP within its α-subunit. Dissociation of the transducin α-subunit allows it to activate phosphodiesterase (PDE) via binding of its inhibitory γ-subunit. Activated PDE hydrolyzes cGMP to GMP, resulting in a decrease in cGMP concentration, which in turn results in closing the CNGs. This results in hyperpolarization of the cell, slowing the release of glutamate into the synapse and eventually resulting in a visual signal being sent to the brain. Recovery begins with deactivation of rhodopsin. Reduction in calcium concentration causes dissociation of recoverin from G protein-coupled receptor kinase (GRK), allowing it to phosphorylate the activated rhodopsin, reducing its activity. Phosphorylated rhodopsin is further deactivated by the binding of arrestin. Transducin is deactivated by hydrolysis of its bound GTP, which is catalyzed by the binding of the regulator of G-protein signalling complex (RGS9-GNB5-RGS9BP). Hydrolysis of GTP causes transducin to dissociate from the PDE inhibitory subunit, which deactivates PDE. Finally, the lowered Ca2+ concentration results in activation of guanylate cyclase activating proteins (GCAPs) through replacement of Ca2+ with Mg2+. The GCAPs activate guanylate cyclases, which synthesize cGMP. Increasing cGMP concentration reopens the CNGs, thus restoring the Ca2+ concentration and in turn deactivating the GCAPs and guanylate cyclases. The cone phototransduction cascade is similar, but involves cone-specific copies of several proteins. The genes that encode the proteins involved in phototransduction, including which genes are specific to rods or cones, are outlined in Table 4.1.


Table 4.1. Major components of the vertebrate visual phototransduction cascade and their presence or absence in snakes and other reptile groups.

Protein / Gene Symbol   Photoreceptor  Gene Name                                Lost In
Opsin
  RH1                   Rod            Rhodopsin (RHO)
  LWS                   Cone           Long-wave Sensitive Cone Opsin
  RH2                   Cone           Middle-wave Sensitive Cone Opsin         Snakes
  SWS1                  Cone           Short-wave Sensitive Cone Opsin 1
  SWS2                  Cone           Short-wave Sensitive Cone Opsin 2        Snakes
Transducin
  GNAT1                 Rod            G Protein α-subunit 1
  GNB1                  Rod            G Protein β-subunit 1
  GNGT1                 Rod            G Protein γ-subunit 1                    Reptiles
  GNAT2                 Cone           G Protein α-subunit 2
  GNB3                  Cone           G Protein β-subunit 3
  GNGT2                 Cone           G Protein γ-subunit 2
Phosphodiesterase
  PDE6A                 Rod            Phosphodiesterase α-subunit 6A           Reptiles
  PDE6B                 Rod            Phosphodiesterase β-subunit 6B
  PDE6G                 Rod            Phosphodiesterase γ-subunit 6G
  PDE6C                 Cone           Phosphodiesterase α′-subunit 6C
  PDE6H                 Cone           Phosphodiesterase γ-subunit 6H
Cyclic Nucleotide Gated Channel
  CNGA1                 Rod            CNG α-subunit 1
  CNGB1                 Rod            CNG β-subunit 1
  CNGA3                 Cone           CNG α-subunit 3
  CNGB3                 Cone           CNG β-subunit 3
Na+/Ca2+-K+ Exchanger
  SLC24A1               Rod            Solute Carrier Family 24 Member 1        Squamates
  SLC24A2               Cone           Solute Carrier Family 24 Member 2
Arrestin
  SAG                   Rod            Rod Arrestin (S-Antigen)
  ARR3                  Cone           Arrestin 3 (Cone Arrestin; X-Arrestin)
G Protein-Coupled Receptor Kinase
  GRK1                  Rod            GRK 1 (Rhodopsin Kinase)                 Snakes
  GRK7                  Cone           GRK 7 (Cone Opsin Kinase)
Regulator of G-Protein Signalling Complex
  RGS9                  Both           Regulator of G-Protein Signaling 9
  RGS9BP                Both           RGS9 Binding Protein
  GNB5                  Both           G Protein β-subunit 5
Guanylate Cyclase Activating Protein
  GUCA1A                Both           Guanylate Cyclase Activator 1A
  GUCA1B                Both           Guanylate Cyclase Activator 1B
  GUCA1C                Cone           Guanylate Cyclase Activator 1C
Guanylate Cyclase
  GUCY2D                Both           Guanylate Cyclase 2D
  GUCY2F                Both           Guanylate Cyclase 2F                     Snake eyes?
Recoverin
  RCVRN                 Both           Recoverin


4.3 Results

4.3.1 Loss of Rhodopsin Kinase (GRK1) in Snakes

A total of 35 visual transduction genes (Table 4.1) were targeted for extraction from the de novo eye transcriptomes, as well as from NCBI Genbank, a previously published visual gene hybrid enrichment experiment (Schott et al. 2017), and publicly available draft genomes (Castoe et al. 2011; Bradnam et al. 2013; Castoe et al. 2013; Vonk et al. 2013; Green et al. 2014; Georges et al. 2015; Liu et al. 2015; Song et al. 2015). New sequences were extracted or sequenced from 21 species for a total of 515 new sequences, and these were combined with sequences available on Genbank for 1243 sequences total (Supplementary Files 4.1 and 4.2). Of the 35 genes targeted, 29 were recovered in snakes. Two genes, PDE6A and GNGT1, were absent in all sampled reptiles, but present in sampled mammals, amphibians, and fishes. A third gene, SLC24A1, was absent in all sampled squamates, but is present in other vertebrates. Two visual opsins, RH2 and SWS2, previously identified to have been lost in snakes (Davies et al. 2009; Castoe et al. 2013; Schott et al. 2016b; Simões et al. 2016b), were not recovered from any of the snake transcriptomes or genomes we analyzed, further supporting their ancestral loss in snakes. Additionally, we did not recover GRK1 (rhodopsin kinase) in any snake eye transcriptome or genome, suggesting this gene was also lost ancestrally in snakes. This is particularly notable because the loss of GRK1 has not been reported in any other vertebrate group.

One gene, GUCY2F, was recovered from the snake genomes, but was absent from the snake eye transcriptomes. GUCY2F sequences from the cobra and corn snake genomes were used as references for extraction of GUCY2F from the snake eye transcriptomes, and using these references we were able to recover GUCY2D, but not even a fragment of GUCY2F, suggesting that it was not expressed in the sampled eyes, rather than just being expressed at a low level. However, it remains possible that GUCY2F is only expressed in the eye under specific conditions (e.g., in juveniles) or may be expressed outside of the retina.

4.3.2 Distinct Selection Pressures on Snake Phototransduction Genes

Of the 29 phototransduction genes recovered in snakes and other reptiles, 26 were analyzed with codon-based likelihood models implemented in PAML (Yang 2007). The gamma subunits of transducin and phosphodiesterase (GNGT2, PDE6G, PDE6H) were very short (70, 88, and 86 amino acids, respectively) and so were not analyzed further. The 26 analyzed genes were broken into three groups: 7 rod-specific genes, 11 cone-specific genes, and 8 non-specific genes found in both photoreceptor types. A single species phylogeny was used to maintain an even comparison between all genes (Figs. S4.1, S4.2).
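The gene accounting described above can be checked with a short tally. The following is an illustrative sketch only: the gene symbols and group assignments are transcribed from Tables 4.1 and 4.2 and Section 4.3.1, whereas the variable names are hypothetical and were introduced purely for this example.

```python
# Illustrative tally of the gene counts reported in the text.
# Gene symbols come from Tables 4.1 and 4.2; all variable names are hypothetical.

TARGETED = 35  # phototransduction genes listed in Table 4.1

# Genes not recovered in any snake transcriptome or genome (Section 4.3.1)
not_recovered_in_snakes = {"PDE6A", "GNGT1", "SLC24A1", "RH2", "SWS2", "GRK1"}

# Gamma subunits recovered but excluded from the PAML analyses (too short)
too_short = {"GNGT2", "PDE6G", "PDE6H"}

# Genes analyzed, grouped by photoreceptor specificity (Table 4.2)
analyzed = {
    "rod": ["CNGA1", "CNGB1", "GNAT1", "GNB1", "PDE6B", "RH1", "SAG"],
    "cone": ["ARR3", "CNGA3", "CNGB3", "GNAT2", "GNB3", "GRK7",
             "GUCA1C", "LWS", "PDE6C", "SLC24A2", "SWS1"],
    "both": ["GNB5", "GUCA1A", "GUCA1B", "GUCY2D", "GUCY2F",
             "RCVRN", "RGS9", "RGS9BP"],
}

recovered = TARGETED - len(not_recovered_in_snakes)
n_analyzed = recovered - len(too_short)
group_sizes = {group: len(genes) for group, genes in analyzed.items()}

print("recovered in snakes:", recovered)
print("analyzed with PAML:", n_analyzed)
print("group sizes:", group_sizes)
```

Note that GUCY2F is counted as recovered here because it was found in the snake genomes, even though it was absent from the eye transcriptomes.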

Random sites models were used to determine overall selective constraint acting on each gene in reptiles and snakes (M0 and M3), and to test for positive selection (M2a vs M1a, and M8 vs M8a/M7). Overall constraint was highly variable for the phototransduction genes in reptiles, ranging from 0.005 for GNB1 to 0.250 for GUCA1C, with an average ω of 0.124 (Table S4.1; Supplementary File 4.3); these values span the range expected for functional protein coding genes (Fay and Wu 2003). Positive selection across reptiles was relatively uncommon, with significant evidence from the M8 or M2a models occurring in 9 of the 26 genes (Table 4.2; Supplementary File 4.3). By comparison, snakes had significantly higher ω than reptiles in general (average of 0.231, p < 0.001, paired samples t-test), which ranged from 0.010 for GNB1 to 0.498 for CNGB3. Concordantly, pervasive positive selection was more widespread in snakes, with significant evidence occurring in 15 of the 26 genes (Table 4.2). In most cases the positive selection within snakes seems to account for the positive selection seen in reptiles generally; however, for two genes (CNGB1, GUCA1B) we found significant positive selection in the reptile dataset, but not in the snake-only dataset. This could be a result of positive selection elsewhere in the reptile tree or may be a result of a lack of power to detect the weak signal of positive selection found in these genes with the smaller number of taxa present in the snake-only dataset.

Within snakes, cone-specific genes had significantly higher ω than rod-specific genes (average ω 0.278 vs 0.170, unpaired t-test, two-tailed p = 0.041; Table S4.1). Non-specific genes had an intermediate ω (0.192), and a one-way ANOVA comparing all three groups neared significance (p = 0.053). The elevated ω of cone-specific genes was reflected in positive selection, where 10 of the 11 cone-specific genes were under significant positive selection. The difference between cone- and rod-specific genes was not maintained across reptiles (unpaired t-test, two-tailed p = 0.39), suggesting that the elevated ω of cone-specific genes is particular to snakes. We also compared average ω for genes involved in phototransduction activation to those involved in recovery, but found no significant difference between them (average ω 0.21 vs 0.25, unpaired t-test, two-tailed p = 0.46). Ion channels, which were found to have some of the highest ω values among visual genes in mammals (Invergo et al. 2013), were not found to have significantly higher ω than other phototransduction genes in snakes (average ω 0.28 vs 0.22, unpaired t-test, two-tailed p = 0.37). Overall patterns of ω appear to be largely driven by differences between reptiles and snakes and, within snakes, between cone-specific and other gene types.

Previous analyses of visual gene molecular evolution in snakes have focused solely on the visual opsins (Simões et al. 2015; Schott et al. 2016b; Simões et al. 2016a). Simões et al. (2016a) found significant positive selection in all three opsin genes using random sites models, which differs from our current results that did not recover significant positive selection in SWS1 (Table 4.2; Supplementary File 4.3). Instead, we find that SWS1 is under stronger constraint than either LWS or RH1. RH1, in particular, stands out in our dataset as being the only rod-specific gene with pervasive positive selection in snakes. This appears to be due to the larger sample sizes for this gene, and the other visual opsins, thanks to the sequencing efforts of Simões et al. (2015, 2016a, 2016b). When the RH1 data are restricted to the same taxon sampling as the other genes, evidence for a positively selected class of sites was not found (Supplementary File 4.3). This suggests a more subtle effect that may be restricted to particular taxa. When sampling was restricted for the other opsin genes the results were qualitatively the same, although the strength of positive selection in LWS was even higher, again suggesting taxon-specific differences. These results suggest that additional sampling may be needed to detect more subtle and taxon-specific effects, but that broader-scale patterns are readily captured by our data.


Table 4.2. Summary of the selection analyses performed on the reptile and snake datasets using the random sites models to test for pervasive positive selection and the clade models to test for divergent (and positive) selection.

Reptile Dataset Snake Dataset Type Gene Pos. Sel. Snake Caen Snake + Caen Pos. Sel. Caen Rod CNGA1 no CmD bg+ CmD bg+ CmD bg+ no No CmC fg Rod CNGB1 M8 CmC fg CmC/D fg no No CmD bg CmC fg CmC fg Rod GNAT1 no CmC fg M3/8 No CmD fg+ CmD fg+ Rod GNB1 no no no No no no CmC fg CmC fg Rod PDE6B no CmC/D fg M2/3/8 no CmD fg CmD sn CmC fg CmC fg Rod RH1 no CmC/D fg CmC fg M2/3/8 CmD fg CmD fg+ Rod SAG no no no no no no CmC fg Cone ARR3 no CmC/D fg CmC/D fg M2/3/8 CmC/D fg+ CmD fg CmC fg Cone CNGA3 M8 CmC fg CmC fg M3 CmC/D fg+ CmD fg CmC fg Cone CNGB3 M8 CmC/D fg CmC/D fg M2/8 CmC fg+ CmD fg CmC fg+ Cone GNAT2 no CmC/D fg+ CmC fg+ M2/3/8 CmC/D fg+ CmD fg CmC fg Cone GNB3 M2/8 CmC/D fg CmC/D fg M2/8 CmC/D fg+ CmD fg CmC fg Cone GRK7 M2/8 CmC/D fg CmC/D fg M2/3/8 CmC/D fg+ CmD fg+ CmC fg CmC fg Cone GUCA1C M2/8 no M3 CmC/D fg+ CmD bg CmD bg CmC fg+ Cone LWS no CmC/D fg+ CmC/D fg+ M2/3/8 CmC/D fg+ CmD fg+ CmC fg Cone PDE6C M8 CmC/D fg CmC/D fg M2/3/8 CmC/D fg+ CmD fg Cone SLC24A2 M8 CmC fg CmC/D fg no M2/3/8 CmC/D fg+ CmC fg Cone SWS1 no CmC/D fg CmC fg no CmD fg+ CmD fg CmC fg Both GNB5 no CmC/D fg CmC/D fg no no CmD fg CmC fg CmC fg Both GUCA1A no no M2/3/8 no CmD fg+ CmD fg+ Both GUCA1B M8 CmC fg CmC fg no no CmC/D fg CmC fg Both GUCY2D no CmC/D fg CmC/D fg M3/8 CmC/D fg+ CmD fg CmC fg Both GUCY2F no CmC fg no no CmD bg CmD bg Both RCVRN no CmC fg CmC fg CmC fg no no Both RGS9 no CmC/D fg CmC fg CmC fg no CmC/D fg


CmD fg CmD fg CmC fg Both RGS9BP no CmC/D fg CmD s>c>bg no CmC bg CmD fg

The positive selection columns (pos. sel.) indicate significant results with the M2a, M3 and/or M8 random sites models. The snake, caenophidian (caen), and snake + caen columns indicate significant results with CmC and CmD on the respective partitions, as illustrated in Figure 4.3 and Figure S4.3. The bold and underlined entry in each row indicates the best fit among the snake, caen, and snake/caen partitions of the reptile dataset. Elevated selection in the foreground partition (i.e., the snake, caen, or snake/caen partition) is represented by fg, while elevated selection in the background is represented by bg. The + symbol indicates significant inference of positive selection.

4.3.3 Long-term Shifts in Selection Pressures on Caenophidian Snake Phototransduction Genes

To further explore the selective pressures acting on the snake visual system relative to other reptiles, we analyzed the reptile datasets using Clade Model C (CmC) and Clade Model D (CmD) (Bielawski and Yang 2004). These models allow selective constraint on a proportion of sites to vary between two or more partitions of the phylogeny. Through a comparison to a null model that does not allow different partitions (M2a_rel; Weadick and Chang 2012), these models test for a long-term shift in the intensity of selection (i.e., divergent selective pressures; Bielawski and Yang 2004; Schott et al. 2014; Baker et al. 2016). We used three different partitions in order to test whether caenophidian snakes, where photoreceptor transmutation has occurred, have experienced a shift in selective pressures relative to other reptiles and snakes (Fig. 4.3). We first tested for a difference between snakes and all other reptiles (snake partition), which could be due to a general difference in snakes, perhaps as a result of their evolutionary origins. Next we tested for a difference between caenophidian snakes and other reptiles (caenophidian partition), to examine the potential influence of photoreceptor transmutation. Finally, because the first two partitions overlap, we also used a three-partition model that compared caenophidians, non-caenophidian snakes, and non-snake reptiles (snake + caenophidian partition), allowing us to differentiate between shifts in selective pressures that may be present in all snakes and those specific to caenophidians.
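The comparison of a clade model against its null can be illustrated with a likelihood ratio test. The sketch below is not the analysis pipeline used here; it assumes that the test statistic, twice the difference in log-likelihood, is compared to a chi-square distribution with one degree of freedom per additional ω parameter (one for a two-partition CmC versus M2a_rel, two for a three-partition CmC). The function names and the log-likelihood values are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (closed forms for df 1 and 2)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return the LRT statistic 2*(lnL_alt - lnL_null) and its chi-square p-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for one gene (not values from this study)
lnl_m2a_rel = -5230.40   # null model: a single omega for the divergent site class
lnl_cmc_snake = -5224.10  # CmC with a two-partition (snake vs. other reptiles) tree

stat, p_value = likelihood_ratio_test(lnl_m2a_rel, lnl_cmc_snake, df=1)
print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.4g}")
```

The closed-form tail probabilities avoid any dependency beyond the standard library; a general chi-square implementation would be needed for models differing by more than two parameters.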

We found significant evidence for a shift in selection pressure in snakes relative to other reptiles in 24 of the 26 phototransduction genes (Fig. 4.4; Table 4.2; Fig. S4.4; Supplementary File 4.3). One of the genes that did not have any evidence of divergent selection (GNB1) was under extremely high constraint, but the other (SAG) was under low constraint, and it appears there may have been elevated ω in one or more of the background (non-snake) lineages. Two of the genes also showed a shift in selection in the opposite direction to the other genes, with elevated ω in the background (non-snake) partition (CNGA1, GUCY2F). We also found significant shifts in selection pressures in the same 24 genes when the caenophidian partition was used (Table 4.2). The same two genes (GNB1 and SAG) lacked significant differences, and CNGA1 similarly had elevated rates in the background rather than the foreground. This similarity is not unexpected as the two partitions are not mutually exclusive.

To differentiate between the two-partition models, we conducted a third set of tests with three partitions: non-snakes, non-caenophidian snakes, and caenophidian snakes (Fig. 4.3). The combined test using the three-partition models was significant for 19 of the 26 genes, and in seven cases the three-partition model was the overall best-fitting model as determined by AIC (Fig. 4.4, Table 4.2; Supplementary File 4.3). For several genes the three-partition model failed to converge, likely due to the low sample size for non-caenophidian snakes, and these results were also reported as non-significant. The overall best-fitting partition varied considerably among the genes, with the best fit being the snake partition eight times, the caenophidian partition nine times, and the three-partition model seven times, with no discernible pattern among rod-specific, cone-specific, or non-specific genes. These findings are likely influenced by the fact that only two non-caenophidian snakes were present for the majority of the gene datasets (Supplementary Files 4.1 and 4.2). Only for the three opsin genes was a larger sampling of non-caenophidian snakes possible. For these genes, the snake partition was best-fitting for RH1 and SWS1, while the three-partition model was the best fit for LWS (Table 4.2).
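Selecting the best-fitting partition by AIC can be sketched as follows, using the standard definition AIC = 2k − 2lnL, where k is the number of free parameters. The log-likelihoods and parameter counts below are hypothetical placeholders rather than values from Supplementary File 4.3, and the function names are assumptions made for this example.

```python
def aic(lnl: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2lnL (lower is better)."""
    return 2.0 * n_params - 2.0 * lnl


def best_by_aic(fits):
    """Given {partition: (lnL, n_params)}, return the best partition and delta-AIC values."""
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    deltas = {name: score - scores[best] for name, score in scores.items()}
    return best, deltas


# Hypothetical CmC fits for a single gene under the three partition schemes.
# The three-partition model carries one additional omega parameter.
fits = {
    "snake": (-5224.1, 40),
    "caenophidian": (-5222.8, 40),
    "snake + caen": (-5222.5, 41),
}

best, deltas = best_by_aic(fits)
print("best-fitting partition:", best)
for name, delta in deltas.items():
    print(f"  {name}: delta AIC = {delta:.1f}")
```

In this made-up example the three-partition model has the highest likelihood but is penalized for its extra parameter, which is how a two-partition scheme can remain the best fit.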

Overall, we found support for a shift in selective pressure specific to snakes for 15 of the 26 genes, while we found support for a shift specific to caenophidians for 16 genes. The magnitude of the shift was larger for caenophidian snakes than for non-caenophidian snakes, and, in line with the random sites results, larger for cone-specific genes than for other genes (Fig. 4.4). Together, these findings represent a high number of positive outcomes, but are consistent with the expectation that photoreceptor transmutation would have resulted in substantial changes to the phototransduction machinery.

Figure 4.3. Partitions used to analyze shifts in selective constraint in snakes relative to other reptiles. The snake partition compares snakes to all other reptiles. We additionally compared the branch leading to snakes to all other branches (Fig. S4.3). The caenophidian partition compares caenophidian snakes to all other reptiles, and was additionally tested within only snakes by comparing caenophidian snakes to other snakes (Fig. S4.3). Because the two-partition tests are not mutually exclusive, we tested them simultaneously by comparing caenophidian snakes, non-caenophidian snakes, and non-snake reptiles using a 3-partition test (snake + caen partition). Topology of trees is based on the species tree shown in Figure S4.1.


Figure 4.4. Tests for shifts in selective pressures on phototransduction genes. Tests were performed between snakes and other reptiles (Snake Partition), caenophidian snakes and other reptiles (Caenophidian Partition), and between caenophidians, other snakes, and other reptiles (Snake + Caen Partition) as shown in Figure 4.3. The ω (dN/dS) values of the divergent site class using CmC are shown highlighting the difference between the background (open circle) and foreground (closed circle) partitions for each gene. When only an open circle is shown the difference was not significant and instead the equivalent value from the null model (M2a_rel) is shown. Differences in ω were averaged for rod-, cone-, and non-rod/cone-specific genes demonstrating the relative strength of divergent selection. For the Snake + Caen partition, the hatched area represents the difference in ω between snakes and other reptiles, while the full bar represents the difference between caenophidian and other reptiles. Error bars are standard error.

4.3.4 Positive Selection in Caenophidians Primarily in Cone-specific Phototransduction Genes

To examine the effect that photoreceptor transmutation may have had on phototransduction genes more specifically, we utilized a snake-only dataset to test for shifts in selective pressure, and positive selection, between caenophidian and non-caenophidian snakes directly (Fig. S4.3). Among the rod-specific genes, only RH1 showed significant evidence for different selection pressures between caenophidians and other snakes (Fig. 4.5, Table 4.2; Supplementary File 4.3). The cone-specific genes showed a strong pattern where all genes were found to have significant shifts in selection between caenophidians and other snakes, with positive selection in the caenophidian snakes (Fig. 4.5, Fig. S4.4, Table 4.2). This includes SWS1, which did not have evidence of pervasive positive selection in snakes in general (when using the random sites models), but did show significant positive selection in caenophidian snakes with the CmD model (Table 4.2). Non-rod/cone-specific genes showed a somewhat intermediate pattern, with five of the eight genes having significant evidence for a shift in selective pressures between caenophidians and other snakes. For one non-rod/cone-specific gene (GUCY2D), we also found evidence of positive selection in caenophidians. In two other non-rod/cone-specific genes (GUCY2F, RGS9BP), the elevated ω was found to be in non-caenophidians, rather than in caenophidians as expected. These results support our hypothesis that caenophidians have experienced positive selection in phototransduction genes that may be associated with photoreceptor transmutation, although this is limited primarily to cone-specific genes.


Figure 4.5. Tests for shifts in selective pressures on phototransduction genes between caenophidian and non-caenophidian snakes (Fig. S4.3). The ω (dN/dS) values of the divergent site class using CmC are shown highlighting the difference between the background (open circle) and foreground (closed circle) partitions for each gene. Where only an open circle is shown the differences were not significant and instead the equivalent value from the null model (M2a_rel) is shown. Differences in ω were averaged for rod-, cone-, and non-rod/cone-specific genes demonstrating the relative strength of divergent selection. Error bars are standard error.

4.3.5 No Evidence for a Relaxation of Constraint on the Branch Leading to Snakes

A fossorial origin of snakes, as hypothesized by Walls (1940), predicts that relaxed selection may have occurred along the branch leading to snakes (Fig. S4.3). To test for this, we employed the branch and clade models. We also applied the branch-site model to this branch to test for positive selection.

We found sporadic evidence for elevated ω along the branch leading to snakes using the branch model and CmC in each of the three categories of visual transduction genes; however, consistent support for a relaxation of selection on the branch leading to snakes was not found (Table S4.2; Supplementary File 4.3). Three of the cone-specific genes (ARR3, GNAT2, SLC24A2) showed significant evidence of positive episodic selection with the branch-site model (Table S4.2), indicating the potential for adaptive evolution along the snake branch in these genes. Surprisingly, no evidence for positive selection, or a shift in selection pressure, was found on GRK7, which may have been expected alongside the loss of GRK1. Genes that lacked a shift in selective pressure in snakes and caenophidians (SAG and GNB1) also had no evidence for shifts along the branch leading to snakes.

Evidence for divergent and episodic positive selection on the branch leading to caenophidians was much more prevalent (Table S4.2; Supplementary File 4.3). This is likely due, at least in part, to the long-term shift in selection pressures found in snakes and caenophidian snakes, rather than selection specifically along the branch leading to caenophidians. However, this may also indicate potential changes associated with the re-evolution of double cones inferred to have occurred ancestrally in caenophidians.

4.4 Discussion

We used new whole eye transcriptome data, combined with data derived from recent whole genomes and targeted capture, to produce the largest dataset of reptilian visual transduction genes to date. This dataset was analyzed with a suite of codon-based likelihood models to examine changes in selective pressure on phototransduction genes in snakes that may be associated with snake evolutionary origins and photoreceptor transmutation. Within the set of 26 visual transduction genes analyzed we found strong support for elevated ω in snakes relative to other reptiles in 24 genes. Surprisingly, we found very little evidence for relaxed selection on the branch leading to snakes. However, we did find significant evidence for a long-term shift in selection between caenophidians and other reptiles in 24 of the genes. Within caenophidian snakes, we found the strongest evidence for positive selection in cone-specific genes. We also confirmed the loss of two cone opsins in snakes with transcriptome and genomic data, and further identified the apparent loss of expression of GUCY2F within the snake eye, as well as the unique, snake-specific loss of GRK1.

The loss of GRK1 in snakes is surprising because this gene encodes rhodopsin kinase, which is expressed in rods (and in some species also cones; Osawa and Weiss 2012). Regardless of the precise nature of the evolutionary origins of snakes, a nocturnal or otherwise dim-light ancestry for extant snakes is well supported (Walls 1942; Hsiang et al. 2015; Simões et al. 2015; Lee et al. 2016; Anderson and Wiens 2017). Rather than the loss of GRK1, the loss of GRK7 (cone opsin kinase), for example, would be more consistent with a dim-light lifestyle, and such a loss has occurred in nocturnal rodents, which express GRK1 in both cones and rods (Weiss et al. 2001). Since snakes generally have rods and dim-light vision, it is likely that GRK7 has been co-opted to also be expressed in rods. This suggests that GRK7 was already expressed in rods prior to the loss of GRK1 in snakes and raises the possibility that other squamates may also express GRK7 in both rods and cones, which has interesting functional implications. GRK7 was shown to have a 10–30 fold higher specific activity than GRK1 in fishes, and has been implicated as contributing to the much faster photoresponse recovery times characteristic of cones (Wada et al. 2006; Tachibanaki et al. 2012; but see also Horner et al. 2005). Accordingly, it is logical to expect that loss of GRK1 imposed changes in the selective pressures acting on GRK7 and may have resulted in positive selection. We did not, however, find evidence of episodic positive selection on the branch leading to snakes in GRK7, but did find evidence for positive selection on GRK7 within caenophidian snakes. This pattern may reflect adaptive changes towards higher (cone-like) and lower (rod-like) activity in diurnal and nocturnal lineages, respectively, associated with the evolution of all-cone and all-rod retinas through transmutation in caenophidians. Further work to clarify the functional differences between GRK1 and GRK7, and the function of GRK7 in diurnal and nocturnal caenophidian snakes, would provide valuable insight into the evolution of snake visual systems and allow testing of hypotheses for a transition in GRK7 function suggested by our results.

We found evidence for a long-term shift in selection pressure in snakes versus other reptiles in nearly all visual transduction genes, with significantly elevated ω (dN/dS), and in some cases positive selection, in snakes. Only two rod-specific genes (SAG, GNB1) showed no evidence of different selection between snakes and other reptiles, while one rod-specific gene (CNGA1) and one non-specific gene (GUCY2F) showed elevated selection in non-snakes. Compared to recent studies of phototransduction gene evolution in mammals and raptorial birds (Invergo et al. 2013; Wu et al. 2016), snakes had a much higher incidence of positive selection across genes, and the only evidence of positive selection detected across an entire clade. In mammals, positive selection was detected in only two genes, each on a single branch (Invergo et al. 2013). Furthermore, the elevated ω in cone-specific genes relative to rod-specific genes was not found in mammals (Invergo et al. 2013), nor did we find this pattern in reptiles as a whole. In raptorial birds, positive selection was detected on the branches leading to particular subgroups in several genes; for example, 9 out of 120 visual genes analyzed showed some evidence of positive selection on specific branches within owls (Wu et al. 2016). Notably, two of the genes that were not under positive selection and instead were highly constrained in snakes, SAG and CNGA1, were found to be positively selected on the branch leading to strigiform owls, which Wu et al. (2016) suggested is related to the shift to nocturnality in this group. The difference between snakes and other vertebrates is striking and is indicative of the distinct nature of the snake visual system that may be linked to photoreceptor transmutation, as well as their evolutionary origins.

Snakes are thought to have originated from either fossorial or aquatic lizards, but distinguishing between these hypotheses has been difficult, and contradictory evidence has been presented on both sides (Walls 1942; Caldwell and Lee 1997; Caprette et al. 2004; Lee 2005; Longrich et al. 2012; Hsiang et al. 2015; Simões et al. 2015; Yi and Norell 2015; Lee et al. 2016). Using our data, we tested a prediction of a fossorial origin: that an extended fossorial phase during snake evolution would have resulted in a degradation in the visual system that may be detectable both through a relaxation of selective constraint and through the wholesale loss of visual transduction genes. While we found no consistent evidence for relaxation in selective constraint along the branch leading to snakes, we did find that snakes have lost three visual transduction genes (two opsins, and one kinase). These findings are strikingly similar to patterns observed in nocturnal, burrowing rodents such as mice that have also lost two opsins and one kinase, rather than the more extreme patterns observed in fossorially adapted mammals that have lost 5–16 phototransduction genes depending on the degree of fossorial adaptation (Emerling and Springer 2014). The relatively low degree of gene loss in snakes and lack of evidence for relaxed evolutionary constraint early in their evolutionary history are most consistent with a dim-light activity phase during early snake evolution that may have included nocturnal, burrowing, and/or aquatic habits, but did not entail strong adaptation to fossoriality. These findings are exciting because they provide new genomic insight into long-standing debates on snake origins, and are further consistent with the conclusions of other recent studies based on visual pigment complement (Simões et al. 2015), phylogenetics (Hsiang et al. 2015), and the morphology of the candidate stem-snake Tetrapodophis (Lee et al. 2016). Additionally, these findings also agree, at least in part, with the early views of Rochon-Duvigneaud (1943) and Underwood (1977) that nocturnality played a key role in the evolution of the snake eye (see Simões et al. 2015). Analysis of phototransduction gene evolution in the early diverging and highly fossorial scolecophidian snakes, which were not sampled in the current study, but whose visual opsin genes have been analyzed (Simões et al. 2015), is likely to provide additional insight.

In addition to the dramatic changes to the eye that occurred during the evolutionary origins of snakes, major evolutionary transitions between retina types in caenophidian snakes are thought to have occurred through photoreceptor transmutation. In our analyses, we expected photoreceptor transmutation to have required extensive changes to the underlying molecular components of the visual system that may have imposed distinct selection pressures on caenophidian phototransduction genes. Consistent with this idea, we found 13 visual transduction genes under positive selection in caenophidians. Somewhat surprisingly, the strongest selection was found on the cone-specific genes, while the rod-specific genes showed very little difference in selection between caenophidians and other snakes. This selective pattern may be explained by repeated shifts from diurnality to nocturnality with only a single (or few) shifts in the opposite direction from nocturnality to diurnality. Under these conditions, we would not expect to be able to detect positive selection in rod-specific genes during a nocturnal to diurnal transition since the transition primarily occurred only once. However, with repeated reversions to nocturnality we could expect positive selection in cone-specific genes as they adapted to function under dim-light conditions in the transmuted all-rod retinas of nocturnal species. These adaptations could have acted to enable greater spectral sensitivity and even nocturnal colour vision, as has been demonstrated in nocturnal geckos (Roth and Kelber 2004). With transitions to nocturnality we might not expect much change to rod-specific genes because rods are already thought to be operating near their biophysical limits (Gozem et al. 2012). A recent broad scale analysis of activity pattern evolution in vertebrates is largely consistent with this pattern (Anderson and Wiens 2017); however, a more detailed analysis of snake activity pattern evolution is needed, such as that recently performed for geckos (Gamble et al. 2015). Alternatively, it may be that adaptive changes during cone to rod photoreceptor transmutation involved more changes to transduction proteins, while rod to cone photoreceptor transmutation tends to involve comparatively fewer changes to protein function, but may instead involve changes to protein concentrations and/or retinal pathways. The apparently independent evolution of double cones in caenophidian snakes likely also contributed to shifts in selective pressures on cone-, rather than rod-specific genes. Unfortunately, the function of double cones is largely unknown (Pignatelli et al. 2010), making it difficult to assess the potential contribution of the evolution of double cones to our findings.

Although photoreceptor transmutation was proposed over 80 years ago, the molecular mechanisms underlying it in snakes have only recently begun to be revealed. Schott et al. (2016b) demonstrated that the morphologically ‘all-cone’ retina of the diurnal natricine garter snake Thamnophis proximus in fact contains a photoreceptor class with rod ultrastructural features that expresses rod-specific proteins, such as RH1 and rod transducin, strongly suggesting it is actually a transmuted cone-like rod. We also recently confirmed this in a second species, the colubrine pine snake Pituophis melanoleucus, which is not closely related to garter snakes (Bhattacharyya et al. 2017). In both species, RH1 was evolutionarily highly conserved, functional when expressed in vitro, and possessed cone-like functional properties, such as a blue-shifted absorption spectrum, decreased stability, and a cone-like retinal binding pocket (Schott et al. 2016b; Bhattacharyya et al. 2017). Our inference of positive selection in caenophidian snake RH1 is consistent with these results and may reflect adaptation towards a more cone-like function of RH1 in species that evolved morphologically ‘all-cone’ retinas (which are also likely to contain a class of cone-like rods). Furthermore, these results agree with the only electrophysiological study in diurnal colubrids, which found no evidence of a separate scotopic (dim-light) visual response (Jacobs et al. 1992). We proposed that the transmuted cone-like rods of diurnal colubrids may contribute to an increased range of spectral sensitivity and lay the basis for trichromatic colour vision under mesopic (when both the rods and cones are typically active), and potentially even photopic, conditions (Schott et al. 2016b). Although this has not been investigated in snakes, increasing evidence from mammals suggests that rods can contribute to colour vision (Cao et al. 2008; Joesch and Meister 2016) under both mesopic (McKee et al. 1977; Reitner et al. 1991) and photopic conditions (Oppermann et al. 2016). Additional studies, including behavioural experiments, to further evaluate the functional consequences of rod to cone transmutation in snakes would be ideal for corroborating these hypotheses.

In contrast to the morphologically ‘all-cone’ retinas typical of diurnal colubrids and other caenophidian snakes, some highly nocturnal species have retinas that appear to contain only rods (Walls 1942; Underwood 1970). While the morphological changes to the photoreceptor cells in these ‘all-rod’ retinas, along with the strong positive selection in cone-specific visual transduction genes, are suggestive of transmuted rod-like cones that function under scotopic conditions, the visual capabilities of nocturnal snakes with ‘all-rod’ retinas have not been studied. An analogous process, however, may have occurred in geckos. Geckos have all-cone retinas that in the majority of species, which are nocturnal, resemble all-rod retinas in both their appearance and their function (Walls 1942; Tansley 1964; Underwood 1970; Röll 2000; Zhang et al. 2006). With their rod-like cones, nocturnal geckos are able to discriminate colours at dim-light levels at which humans are colour blind (Roth and Kelber 2004). Nocturnal caenophidian snakes with ‘all-rod’ retinas may have similar nocturnal visual capabilities. However, a key difference between nocturnal geckos and caenophidians is that geckos have lost true rods and RH1, while caenophidian snakes have not. What difference this makes, and how ‘true’ rods interact with rod-like cones, remain open questions for future studies.

4.5 Conclusions

Here we conducted the first analysis of selection in phototransduction genes in reptiles, representing one of the most comprehensive analyses of visual system genes to date. Our results suggest that snake phototransduction genes are under considerably different selective constraints than those of other reptiles and have experienced positive selection to a degree not found in other vertebrate groups. We surmise that these exceptional selective patterns are linked to both the evolutionary origins of snakes, and the evolutionary process of photoreceptor transmutation in caenophidian snakes. The degree of gene loss and divergent selection in snake visual transduction genes supports a dim-light early snake ancestor that was not highly adapted for fossoriality. Indeed, these data provide some of the first evolutionary genomic support for a nocturnal origin of snakes, but unfortunately provide limited insight into the terrestrial/fossorial vs aquatic debate that is ongoing based on controversial fossil data (Caldwell and Lee 1997; Lee 2005; Longrich et al. 2012; Yi and Norell 2015; Lee et al. 2016). Within caenophidian snakes, high levels of positive selection in cone-specific genes likely reflect adaptive evolution towards a more rod-like function that occurred on multiple branches within the caenophidian clade to facilitate the development of all-rod retinas. Our findings further suggest considerable differences in the function of rod and cone visual transduction proteins that warrant further study. For instance, studies have repeatedly found that rod and cone transducin are functionally similar or even equivalent (Deng et al. 2009; Gopalakrishna et al. 2012; Tachibanaki et al. 2012; Mao et al. 2013); however, we have found strong evidence for positive selection in snake rod and cone transducin (GNAT1, GNAT2, GNB3) that suggests adaptation and functional divergence. Differences between rod- and cone-specific copies of phototransduction proteins are currently an area of active research (Kawamura and Tachibanaki 2008; Renninger et al. 2011; Tachibanaki et al. 2012; Mao et al. 2013; Majumder et al. 2015; Orban and Palczewski 2016; Sakurai et al. 2016). Our results are important for understanding how visual systems evolve and adapt in response to gene loss and changes to activity patterns. Further work will need to be done to elucidate the functional consequences of the changes that occurred in visual genes and to expand sampling in order to better understand the evolutionary history of those changes. The evolution of phototransduction in the snake visual system provides an extreme and illustrative example of how powerful selective forces can be in fundamentally reshaping and repurposing genetic components of such a complex system as the vertebrate eye.

4.6 Methods

4.6.1 Animals

Colubrid snakes were obtained from commercial retailers and euthanized under approval of the University of Toronto and University of Texas Arlington Animal Care Committees. Eyes were extracted and either frozen in liquid nitrogen or placed in RNAlater (Ambion) and stored at -80°C.

4.6.2 Transcriptome Sequencing

Whole eyes were homogenized in Trizol (Invitrogen) using a BeadBug (Benchmark Scientific). Total RNA was extracted following a combined Trizol/RNeasy (Qiagen) protocol according to the manufacturer’s instructions. Library construction and sequencing on the Illumina HiSeq pipeline were performed according to standard protocols at The Centre for Applied Genomics, the Hospital for Sick Children (Toronto). Resulting 150 bp paired-end reads were trimmed with Trimmomatic v0.33 (Bolger et al. 2014) using default settings. Trimmed reads were assembled de novo using Trinity (Grabherr et al. 2011) under default settings. Visual transduction gene transcripts were identified and extracted using BLAST (discontinuous megablast, e-value cutoff of 1e-10). Transcript identities (i.e., orthology to annotated genes) were confirmed through phylogenetic analysis.
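The e-value filter applied to the BLAST hits can be illustrated with a minimal sketch. It assumes the hits were written in BLAST+ tabular format (`-outfmt 6`), which the text above does not state; the function name `filter_hits` and the sequence identifiers in the example are hypothetical. Only the 1e-10 cutoff is taken from the methods.

```python
from typing import Iterable

# Column order of BLAST+ tabular output (-outfmt 6).
# Using this format is an assumption; the text does not say how hits were stored.
FIELDS = ("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore")


def filter_hits(lines: Iterable[str], max_evalue: float = 1e-10) -> dict[str, list[str]]:
    """Map each query (reference gene) to the transcripts hit at or below max_evalue."""
    hits: dict[str, list[str]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(FIELDS, line.split("\t")))
        if float(row["evalue"]) <= max_evalue:
            targets = hits.setdefault(row["qseqid"], [])
            if row["sseqid"] not in targets:  # keep each transcript once per query
                targets.append(row["sseqid"])
    return hits


# Hypothetical example rows: two hits for one reference, one for another.
example = [
    "RH1_ref\tTRINITY_DN10_c0_g1_i1\t88.1\t1040\t120\t2\t1\t1040\t55\t1094\t0.0\t1200",
    "RH1_ref\tTRINITY_DN77_c0_g1_i1\t71.0\t90\t26\t0\t300\t389\t10\t99\t2e-04\t48.2",
    "GRK7_ref\tTRINITY_DN42_c1_g1_i2\t84.5\t1600\t240\t4\t1\t1600\t30\t1629\t3e-150\t980",
]
kept = filter_hits(example)
```

A filter of this kind only nominates candidate transcripts; as stated above, orthology was confirmed separately through phylogenetic analysis.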

4.6.3 Visual Transduction Gene Datasets

Genes encoding each of the major, known components of the visual transduction cascade (Lamb 2013) were targeted, comprising a total of 35 genes (Table 4.1). The NCBI GenBank database was searched for these genes and coding regions were extracted using BlastPhyMe (Schott et al. 2016a). All non-avian reptile sequences were retained, with a representative sample of 17 avian sequences selected that span avian diversity (Jarvis et al. 2014) in order to not bias the datasets heavily towards birds. Coding regions from GenBank were used as references to extract those genes from the de novo eye transcriptomes, a previous visual gene hybrid enrichment experiment (Schott et al. 2017), and publicly available unannotated draft genomes (Castoe et al. 2011; Bradnam et al. 2013; Castoe et al. 2013; Vonk et al. 2013; Green et al. 2014; Georges et al. 2015; Liu et al. 2015; Song et al. 2015). Coding regions for each gene dataset were aligned using MUSCLE codon alignment as implemented in MEGA (Edgar 2004; Tamura et al. 2011). Areas of non-homology and poor alignment were removed in order to improve the accuracy of inferences of positive selection (Privman et al. 2012). This often included trimming the ends of the sequences, as well as removing sequence that was non-homologous either due to being from a transcript variant or incorrectly included in the coding sequence due to errors in automated prediction. Maximum likelihood trees were estimated in MEGA using the GTR+G model in order to ensure that all genes were correctly identified, free of contaminants, and properly aligned prior to downstream analyses.
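A simple automated check can complement manual inspection of a trimmed codon alignment before it is passed to codon-based models. The sketch below is not the procedure used here; it is a hypothetical sanity check, assuming the standard genetic code and an alignment held as a name-to-sequence dictionary, that flags unequal lengths, lengths that are not a multiple of three, stop codons, and gaps that split a codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (an assumption)


def check_codon_alignment(seqs: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) > 1:
        problems.append(f"unequal sequence lengths: {sorted(lengths)}")
    for name, seq in seqs.items():
        seq = seq.upper()
        if len(seq) % 3 != 0:
            problems.append(f"{name}: length {len(seq)} is not a multiple of 3")
        # Walk complete codons only; any trailing partial codon was reported above.
        for start in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[start:start + 3]
            position = start // 3 + 1
            if codon in STOP_CODONS:
                problems.append(f"{name}: stop codon {codon} at codon {position}")
            elif "-" in codon and codon != "---":
                problems.append(f"{name}: gap splits codon {position} ({codon})")
    return problems


# Hypothetical toy alignment: 'b' carries a stop codon, 'c' a frame-breaking gap.
toy = {"a": "ATGGCC---AAA", "b": "ATGTAAGCCAAA", "c": "ATGGC-GCCAAA"}
report = check_codon_alignment(toy)
```

Stop codons are reported wherever they occur, including at the end of a sequence, on the assumption that they are removed before analysis.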

To maintain an even comparison among genes, and to avoid potential issues of convergence and homoplasy in individual genes, a single species tree was used for all analyses. The topology was based on Pyron et al. (2013) for the squamate relationships, Jarvis et al. (2014) for the avian relationships, and Chiari et al. (2012) and Crawford et al. (2012) for the higher order relationships with the basal trichotomy required by PAML formed by turtles, archosaurs, and squamates (Figs. S1 and S2). The species tree was trimmed or added to as needed to match the sampling available for each gene. In addition to the full reptile dataset and tree, each gene dataset and tree was trimmed to contain only snakes.

4.6.4 Molecular Evolutionary Analyses

To estimate the strength and form of selection acting on visual transduction genes of reptiles and snakes, each dataset was analyzed using codon-based likelihood models from the codeml program of the PAML 4 software package (Yang 2007). Specifically, the random sites (M0, M1a, M2a, M2a_rel, M3, M7, M8a, and M8), branch (Br), branch-site (BrS), and clade models (CmC, CmD) were used (Bielawski and Yang 2004; Zhang et al. 2005; Yang 2007). All analyses were run with varying starting values to avoid potential local optima. To determine significance, nested model pairs were compared using a likelihood ratio test (LRT) with a χ² distribution, while non-nested models were evaluated using the Akaike Information Criterion (AIC).
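The two comparison statistics can be sketched in a few lines. The example below assumes the usual definitions, LRT statistic 2(lnL_alt − lnL_null) compared against a χ² distribution with degrees of freedom equal to the difference in free parameters, and AIC = 2k − 2lnL. The function names and the log-likelihood values are hypothetical and do not come from the analyses reported here.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from the df = 1 tail and add one term per extra pair of df.
    tail = math.erfc(math.sqrt(h))
    tail += math.exp(-h) * sum(h ** (i - 0.5) / math.gamma(i + 0.5)
                               for i in range(1, (df - 1) // 2 + 1))
    return tail


def lrt(lnl_null: float, lnl_alt: float, df: int) -> tuple[float, float]:
    """Return the LRT statistic and its chi-square p-value for nested models."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


def aic(lnl: float, n_params: int) -> float:
    """Akaike Information Criterion; lower values indicate a better fit."""
    return 2.0 * n_params - 2.0 * lnl


# Hypothetical log-likelihoods for a nested pair differing by two parameters
# (as in an M2a vs M1a comparison).
stat, p_value = lrt(lnl_null=-1000.0, lnl_alt=-997.0, df=2)
```

For non-nested models, the model with the lower `aic` value would be preferred.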

Random sites models were used to determine overall selective patterns and to test for gene-wide positive selection in reptiles and in snakes. The M3 vs M0 comparison tests for variation among sites, while the M2a vs M1a and M8 vs M7/M8a comparisons test for a proportion of positively selected sites. M0, M2a_rel, and M3 are also the null models for the Br, CmC, and CmD models, respectively.

To test for long-term shifts in selection pressures (i.e., divergent selection) in phototransduction genes, we utilized the clade models (CmC and CmD). CmC assumes that some sites evolve conservatively across the phylogeny (two classes of sites where 0 < ω0 < 1 and ω1 = 1), while a class of sites is free to evolve differently among two or more partitions (e.g., ωD1 > 0 and ωD1 ≠ ωD2 > 0). Despite the name, partitions can be any combination of branches and entire clades. CmD is similar, but all three site classes (ω0, ω1, ωD) are unconstrained (meaning they can assume any value). This can be useful when there is little support for a neutral class of sites. A number of different partitions were tested, using both the reptile and snake-only datasets, as shown in Figure 3 and Figure S3.

We also tested for relaxed selection, and episodic positive selection, on the branch leading to snakes, and to caenophidians (Fig. S3). The branch leading to snakes or caenophidians was placed in the foreground partition and tested using the Br, BrS, and CmC models. The Br model is similar to the clade models, but contains only a single class of sites, and thus tests for average differences between partitions. This results in a less sensitive test, but is useful for detection of relaxed selective constraints. The BrS model was designed to test for episodes of positive selection on specific branches (although it can be applied to clade or mixed partitions as well). Unlike the branch and clade models, the BrS model explicitly differentiates between the background and foreground partitions. It has four site classes: 0) 0 < ω0 < 1 for all branches; 1) ω1 = 1 for all branches; 2a) ω2a = ω2b ≥ 1 in the foreground and 0 < ω2a = ω0 < 1 in the background; and 2b) ω2b = ω2a ≥ 1 in the foreground and ω2b = ω1 = 1 in the background. Positive selection is only allowed in the foreground, which results in a powerful test for episodic positive selection, but can result in false positives when positive selection is also present in the background (Schott et al. 2014).
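The four BrS site classes can be summarized compactly. The table below restates the definitions above in the standard notation of branch-site model A (Zhang et al. 2005), writing the shared foreground value of classes 2a and 2b as ω2. The class proportions are not given in the text above and are taken from that standard formulation.

```latex
\[
\begin{array}{llll}
\hline
\text{Site class} & \text{Proportion} & \text{Background } \omega & \text{Foreground } \omega \\
\hline
0  & p_0 & 0 < \omega_0 < 1 & 0 < \omega_0 < 1 \\
1  & p_1 & \omega_1 = 1 & \omega_1 = 1 \\
2a & (1 - p_0 - p_1)\,p_0/(p_0 + p_1) & 0 < \omega_0 < 1 & \omega_2 \ge 1 \\
2b & (1 - p_0 - p_1)\,p_1/(p_0 + p_1) & \omega_1 = 1 & \omega_2 \ge 1 \\
\hline
\end{array}
\]
```

In the usual form of the test, the null model fixes ω2 = 1, so the alternative has one additional free parameter.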

Additional details on these models and their use to test for long-term selective shifts, episodic selection, and positive selection can be found in Schott et al. (2014) and Baker et al. (2016).
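As a rough illustration of how the models named in this section map onto codeml settings, the following sketch builds the text of a control file for a chosen model and a chosen starting value of ω. The `model` and `NSsites` codes are those documented for PAML 4; the file names, the `CodonFreq` setting, the `BrS_null` label, and the helper `make_ctl` are assumptions for illustration and are not taken from the analyses reported here.

```python
# codeml settings that distinguish the models used in this section.
MODEL_SETTINGS = {
    "M0":       {"model": 0, "NSsites": 0},
    "M1a":      {"model": 0, "NSsites": 1},
    "M2a":      {"model": 0, "NSsites": 2},
    "M2a_rel":  {"model": 0, "NSsites": 22},
    "M3":       {"model": 0, "NSsites": 3, "ncatG": 3},
    "M7":       {"model": 0, "NSsites": 7, "ncatG": 10},
    "M8":       {"model": 0, "NSsites": 8, "ncatG": 10},
    "M8a":      {"model": 0, "NSsites": 8, "ncatG": 10, "fix_omega": 1, "omega": 1.0},
    "Br":       {"model": 2, "NSsites": 0},
    "BrS":      {"model": 2, "NSsites": 2},
    "BrS_null": {"model": 2, "NSsites": 2, "fix_omega": 1, "omega": 1.0},
    "CmC":      {"model": 3, "NSsites": 2},
    "CmD":      {"model": 3, "NSsites": 3, "ncatG": 3},
}


def make_ctl(model_name: str, seqfile: str, treefile: str, outfile: str,
             omega_start: float = 0.5) -> str:
    """Return the text of a codeml control file for one model and starting omega."""
    settings = {
        "seqfile": seqfile, "treefile": treefile, "outfile": outfile,
        "noisy": 0, "verbose": 0, "runmode": 0,
        "seqtype": 1,             # codon sequences
        "CodonFreq": 2,           # F3x4; an assumption, not stated in the text
        "clock": 0, "icode": 0,   # no clock, universal genetic code
        "fix_kappa": 0, "kappa": 2.0,
        "fix_omega": 0, "omega": omega_start,
        "cleandata": 0,
    }
    # Model-specific values override the defaults (e.g. omega fixed at 1 for M8a).
    settings.update(MODEL_SETTINGS[model_name])
    return "\n".join(f"{key:>10} = {value}" for key, value in settings.items())


# One control file per starting value, mirroring runs from varying starting values.
# The file names are placeholders.
ctl_files = {start: make_ctl("CmC", "gene.phy", "gene.tre", f"cmc_w{start}.out", start)
             for start in (0.5, 1.0, 2.0)}
```

For the Br, BrS, and clade models, the foreground or clade partitions must also be marked in the tree file, for example with PAML's `#1` branch labels or `$1` clade labels; that step is not shown here.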

4.7 Acknowledgements

This work was supported by a Natural Sciences and Engineering Research Council (NSERC) Discovery grant (to BSWC) and Vision Science Research Program Scholarships (to RKS, AVN). We thank Gianni Castiglione, Frances Hauser, David Gower, the Associate Editor, and three anonymous reviewers for their feedback and suggestions on earlier versions of the manuscript, which helped to improve the final version.

4.8 References

Anderson SR, Wiens JJ 2017. Out of the dark: 350 million years of conservatism and evolution

in diel activity patterns in vertebrates. Evolution.

Baker JL, Dunn KA, Mingrone J, Wood BA, Karpinski BA, Sherwood CC, Wildman DE,

Maynard TM, Bielawski JP 2016. Functional Divergence of the Nuclear Receptor NR2C1

as a Modulator of Pluripotentiality During Hominid Evolution. Genetics 203: 905-922.

Bellairs AD, Underwood G 1951. The origin of snakes. Biol Rev Camb Philos Soc 26: 193-237.

Bhattacharyya N, Darren B, Schott RK, Tropepe V, Chang BSW 2017. Cone-like rhodopsin

expressed in the all cone retina of the colubrid pine snake as a potential adaptation to

diurnality. J Exp Biol.

Bielawski JP, Yang Z 2004. A maximum likelihood method for detecting functional divergence

at individual codon sites, with application to gene family evolution. J Mol Evol 59: 121-

132.

Bolger AM, Lohse M, Usadel B 2014. Trimmomatic: a flexible trimmer for Illumina sequence

data. Bioinformatics 30: 2114-2120.

Bowmaker JK 2008. Evolution of vertebrate visual pigments. Vision Res 48: 2022-2041.

Bradnam KR, Fass JN, Alexandrov A, Baranay P, Bechner M, Birol I, Boisvert S, Chapman JA,

Chapuis G, Chikhi R, Chitsaz H, Chou WC, Corbeil J, Del Fabbro C, Docking TR, Durbin

R, Earl D, Emrich S, Fedotov P, Fonseca NA, Ganapathy G, Gibbs RA, Gnerre S,

Godzaridis E, Goldstein S, Haimel M, Hall G, Haussler D, Hiatt JB, Ho IY, Howard J,

Hunt M, Jackman SD, Jaffe DB, Jarvis ED, Jiang H, Kazakov S, Kersey PJ, Kitzman JO,

Knight JR, Koren S, Lam TW, Lavenier D, Laviolette F, Li Y, Li Z, Liu B, Liu Y, Luo R,

Maccallum I, Macmanes MD, Maillet N, Melnikov S, Naquin D, Ning Z, Otto TD, Paten

B, Paulo OS, Phillippy AM, Pina-Martins F, Place M, Przybylski D, Qin X, Qu C, Ribeiro

FJ, Richards S, Rokhsar DS, Ruby JG, Scalabrin S, Schatz MC, Schwartz DC,

Sergushichev A, Sharpe T, Shaw TI, Shendure J, Shi Y, Simpson JT, Song H, Tsarev F,

Vezzi F, Vicedomini R, Vieira BM, Wang J, Worley KC, Yin S, Yiu SM, Yuan J, Zhang

G, Zhang H, Zhou S, Korf IF 2013. Assemblathon 2: evaluating de novo methods of

genome assembly in three vertebrate species. Gigascience 2: 10.

Caldwell MW, Lee MSY 1997. A snake with legs from the marine Cretaceous of the Middle

East. Nature 386: 705-709.

Cao D, Pokorny J, Smith VC, Zele AJ 2008. Rod contributions to color perception: linear with

rod contrast. Vision Res 48: 2586-2592.

Caprette CL, Lee MSY, Shine R, Mokany A, Downhower JF 2004. The origin of snakes

(Serpentes) as seen through eye anatomy. Biol J Linn Soc 81: 469-482.

Castiglione GM, Hauser FE, Liao BS, Lujan NK, Van Nynatten A, Morrow JM, Schott RK,

Bhattacharyya N, Dungan SZ, Chang BSW 2017. Evolution of nonspectral rhodopsin

function at high altitudes. Proc Natl Acad Sci U S A 114: 7385-7390.

Castoe TA, Bronikowski AM, Brodie ED, Edwards SV, Pfrender ME, Shapiro MD, Pollock DD,

Warren WC 2011. A proposal to sequence the genome of a garter snake (Thamnophis

sirtalis). Standards in Genomic Sciences 4: 257-270.

Castoe TA, de Koning APJ, Hall KT, Card DC, Schield DR, Fujita MK, Ruggiero RP, Degner

JF, Daza JM, Gu WJ, Reyes-Velasco J, Shaney KJ, Castoe JM, Fox SE, Poole AW,

Polanco D, Dobry J, Vandewege MW, Li Q, Schott RK, Kapusta A, Minx P, Feschotte C,

Uetz P, Ray DA, Hoffmann FG, Bogden R, Smith EN, Chang BSW, Vonk FJ, Casewell

NR, Henkel CV, Richardson MK, Mackessy SP, Bronikowski AM, Yandell M, Warren WC,

Secor SM, Pollock DD 2013. The Burmese python genome reveals the molecular basis for

extreme adaptation in snakes. Proc Natl Acad Sci U S A 110: 20645-20650.

Chiari Y, Cahais V, Galtier N, Delsuc F 2012. Phylogenomic analyses support the position of

turtles as the sister group of birds and crocodiles (Archosauria). BMC Biol 10: 65.

Crawford NG, Faircloth BC, McCormack JE, Brumfield RT, Winker K, Glenn TC 2012. More

than 1000 ultraconserved elements provide evidence that turtles are the sister group of

archosaurs. Biol Lett 8: 783-786.

Davies WL, Cowing JA, Bowmaker JK, Carvalho LS, Gower DJ, Hunt DM 2009. Shedding light

on serpent sight: the visual pigments of henophidian snakes. J Neurosci 29: 7519-7525.

Deng WT, Sakurai K, Liu JW, Dinculescu A, Li J, Pang JJ, Min SH, Chiodo VA, Boye SL,

Chang B, Kefalov VJ, Hauswirth WW 2009. Functional interchangeability of rod and cone

transducin alpha-subunits. Proc Natl Acad Sci U S A 106: 17681-17686.

Dungan SZ, Kosyakov A, Chang BS 2016. Spectral Tuning of Killer Whale (Orcinus orca)

Rhodopsin: Evidence for Positive Selection and Functional Adaptation in a Cetacean

Visual Pigment. Mol Biol Evol 33: 323-336.

Edgar RC 2004. MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res 32: 1792-1797.

Emerling CA, Springer MS 2014. Eyes underground: Regression of visual protein networks in

subterranean mammals. Mol Phylogen Evol 78: 260-270.

Fain GL, Hardie R, Laughlin SB 2010. Phototransduction and the Evolution of Photoreceptors.

Curr Biol 20: R114-R124.

Fay JC, Wu C-I 2003. Sequence divergence, functional constraint, and selection in protein

evolution. Annual review of genomics and human genetics 4: 213-235.

Gamble T, Greenbaum E, Jackman TR, Bauer AM 2015. Into the light: diurnality has evolved

multiple times in geckos. Biol J Linn Soc 115: 896-910.

Georges A, Li Q, Lian J, O'Meally D, Deakin J, Wang Z, Zhang P, Fujita M, Patel HR, Holleley

CE, Zhou Y, Zhang X, Matsubara K, Waters P, Graves JA, Sarre SD, Zhang G 2015.

High-coverage sequencing and annotated assembly of the genome of the Australian dragon

lizard Pogona vitticeps. Gigascience 4: 45.

Gopalakrishna KN, Boyd KK, Artemyev NO 2012. Comparative analysis of cone and rod

transducins using chimeric Galpha subunits. Biochemistry 51: 1617-1624.

Gozem S, Schapiro I, Ferre N, Olivucci M 2012. The Molecular Mechanism of Thermal Noise in

Rod Photoreceptors. Science 337: 1225-1228.

Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L,

Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma

F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A 2011. Full-length

transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol

29: 644-652.

Green RE, Braun EL, Armstrong J, Earl D, Nguyen N, Hickey G, Vandewege MW, St John JA,

Capella-Gutierrez S, Castoe TA, Kern C, Fujita MK, Opazo JC, Jurka J, Kojima KK,

Caballero J, Hubley RM, Smit AF, Platt RN, Lavoie CA, Ramakodi MP, Finger JW, Jr.,

Suh A, Isberg SR, Miles L, Chong AY, Jaratlerdsiri W, Gongora J, Moran C, Iriarte A,

McCormack J, Burgess SC, Edwards SV, Lyons E, Williams C, Breen M, Howard JT,

Gresham CR, Peterson DG, Schmitz J, Pollock DD, Haussler D, Triplett EW, Zhang G,

Irie N, Jarvis ED, Brochu CA, Schmidt CJ, McCarthy FM, Faircloth BC, Hoffmann FG,

Glenn TC, Gabaldon T, Paten B, Ray DA 2014. Three crocodilian genomes reveal

ancestral patterns of evolution among archosaurs. Science 346: 1254449.

Greene HW. 1997. Snakes: The Evolution of Mystery in Nature. Berkeley: University of

California Press.

Hauser FE, Ilves KL, Schott RK, Castiglione GM, Lopez-Fernandez H, Chang BSW 2017.

Accelerated evolution and functional divergence of the dim light visual pigment

accompanies cichlid colonization of Central America. Mol Biol Evol.

Horner TJ, Osawa S, Schaller MD, Weiss ER 2005. Phosphorylation of GRK1 and GRK7 by

cAMP-dependent protein kinase attenuates their enzymatic activities. J Biol Chem 280:

28241-28250.

Hsiang AY, Field DJ, Webster TH, Behlke ADB, Davis MB, Racicot RA, Gauthier JA 2015.

The origin of snakes: revealing the ecology, behavior, and evolutionary history of early

snakes using genomics, phenomics, and the fossil record. BMC Evol Biol 15.

Ingram NT, Sampath AP, Fain GL 2016. Why are rods more sensitive than cones? J Physiol 594:

5415-5426.

Invergo BM, Montanucci L, Laayouni H, Bertranpetit J 2013. A system-level, molecular

evolutionary analysis of mammalian phototransduction. BMC Evol Biol 13.

Jacobs GH, Fenwick JA, Crognale MA, Deegan JF 1992. The all-cone retina of the garter snake -

spectral mechanisms and photopigment. J Comp Phys A 170: 701-707.

Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, Ho SYW, Faircloth BC, Nabholz B,

Howard JT, Suh A, Weber CC, da Fonseca RR, Li JW, Zhang F, Li H, Zhou L, Narula N,

Liu L, Ganapathy G, Boussau B, Bayzid MS, Zavidovych V, Subramanian S, Gabaldon T,

Capella-Gutierrez S, Huerta-Cepas J, Rekepalli B, Munch K, Schierup M, Lindow B,

Warren WC, Ray D, Green RE, Bruford MW, Zhan XJ, Dixon A, Li SB, Li N, Huang YH,

Derryberry EP, Bertelsen MF, Sheldon FH, Brumfield RT, Mello CV, Lovell PV, Wirthlin

M, Schneider MPC, Prosdocimi F, Samaniego JA, Velazquez AMV, Alfaro-Nunez A,

Campos PF, Petersen B, Sicheritz-Ponten T, Pas A, Bailey T, Scofield P, Bunce M,

Lambert DM, Zhou Q, Perelman P, Driskell AC, Shapiro B, Xiong ZJ, Zeng YL, Liu SP,

Li ZY, Liu BH, Wu K, Xiao J, Yinqi X, Zheng QM, Zhang Y, Yang HM, Wang J, Smeds

L, Rheindt FE, Braun M, Fjeldsa J, Orlando L, Barker FK, Jonsson KA, Johnson W,

Koepfli KP, O'Brien S, Haussler D, Ryder OA, Rahbek C, Willerslev E, Graves GR, Glenn

TC, McCormack J, Burt D, Ellegren H, Alstrom P, Edwards SV, Stamatakis A, Mindell

DP, Cracraft J, Braun EL, Warnow T, Jun W, Gilbert MTP, Zhang GJ 2014. Whole-

genome analyses resolve early branches in the tree of life of modern birds. Science 346:

1320-1331.

Joesch M, Meister M 2016. A neuronal circuit for colour vision based on rod-cone opponency.

Nature 532: 236-+.

Kawamura S, Tachibanaki S 2008. Rod and cone photoreceptors: Molecular basis of the

difference in their physiology. Comparative Biochemistry and Physiology - A Molecular

and Integrative Physiology 150: 369-377.

Lamb TD 2013. Evolution of phototransduction, vertebrate photoreceptors and retina. Prog Retin

Eye Res 36: 52-119.

Lamb TD. 2010. Phototransduction: adaptation in cones. In: Dartt DA, Besharse JC, Dana R,

editors. Encyclopedia of the Eye. Oxford: Academic Press. p. 354-360.

Lee MSY 2005. Molecular evidence and marine snake origins. Biol Lett 1: 227-230.

Lee MSY, Caldwell MW 2000. Adriosaurus and the affinities of mosasaurs, dolichosaurs, and

snakes. J Paleontol 74: 915-937.

Lee MSY, Palci A, Jones MEH, Caldwell MW, Holmes JD, Reisz RR 2016. Aquatic adaptations

in the four limbs of the snake-like reptile Tetrapodophis from the Lower Cretaceous of

Brazil. Cretaceous Res 66: 194-199.

Liu Y, Zhou Q, Wang Y, Luo L, Yang J, Yang L, Liu M, Li Y, Qian T, Zheng Y, Li M, Li J, Gu

Y, Han Z, Xu M, Wang Y, Zhu C, Yu B, Yang Y, Ding F, Jiang J, Yang H, Gu X 2015.

Gekko japonicus genome reveals evolution of adhesive toe pads and tail regeneration.

Nature Communications 6: 10033.

Longrich NR, Bhullar BA, Gauthier JA 2012. A transitional snake from the Late Cretaceous

period of North America. Nature 488: 205-208.

Majumder A, Pahlberg J, Muradov H, Boyd KK, Sampath AP, Artemyev NO 2015. Exchange of

Cone for Rod Phosphodiesterase 6 Catalytic Subunits in Rod Photoreceptors Mimics in

Part Features of Light Adaptation. J Neurosci 35: 9225-9235.

Mao W, Miyagishima KJ, Yao Y, Soreghan B, Sampath AP, Chen JE 2013. Functional

Comparison of Rod and Cone G alpha(t) on the Regulation of Light Sensitivity. J Biol

Chem 288: 5257-5267.

McKee SP, McCann JJ, Benton JL 1977. Color-vision from rod and long-wave cone interactions:

conditions in which rods contribute to multi-colored images. Vision Res 17: 175-185.

Nopcsa F 1923. Eidolosaurus und Pachyophis. Zwei neue Neocom-Reptilien.

Palaeontographica 65.

Nopcsa F 1908. Zur Kenntnis der fossilen Eidechsen. Beiträge zur Paläontologie und Geologie

Österreich-Ungarns und des Orients 21: 33-62.

Oppermann D, Schramme J, Neumeyer C 2016. Rod-cone based color vision in seals under

photopic conditions. Vision Res 125: 30-40.


Orban T, Palczewski K. 2016. Structure and Function of G-Protein-Coupled Receptor Kinases 1

and 7. In: Gurevich VV, Gurevich EV, Tesmer JJG, editors. G Protein-Coupled Receptor

Kinases. New York, NY: Springer New York. p. 25-43.

Osawa S, Weiss ER 2012. A tale of two kinases in rods and cones. Adv Exp Med Biol 723: 821-

827.

Pignatelli V, Champ C, Marshall J, Vorobyev M 2010. Double cones are used for colour

discrimination in the reef fish, Rhinecanthus aculeatus. Biol Lett 6: 537-539.

Privman E, Penn O, Pupko T 2012. Improving the Performance of Positive Selection Inference

by Filtering Unreliable Alignment Regions. Mol Biol Evol 29: 1-5.

Pyron RA, Burbrink FT, Wiens JJ 2013. A phylogeny and revised classification of Squamata,

including 4161 species of lizards and snakes. BMC Evol Biol 13: 93.

Reitner A, Sharpe LT, Zrenner E 1991. Is color-vision possible with only rods and blue-sensitive

cones. Nature 352: 798-800.

Renninger SL, Gesemann M, Neuhauss SC 2011. Cone arrestin confers cone vision of high

temporal resolution in zebrafish larvae. Eur J Neurosci 33: 658-667.

Rieppel O 1988. A review of the origin of snakes. Evol Biol 22: 37-130.

Röll B 2000. Gecko vision-visual cells, evolution, and ecological constraints. J Neurocytol 29:

471-484.

Roth LS, Kelber A 2004. Nocturnal colour vision in geckos. Proc Biol Sci 271 Suppl 6: S485-

487.

Sakurai K, Vinberg F, Wang T, Chen J, Kefalov VJ 2016. The Na(+)/Ca(2+), K(+) exchanger 2

modulates mammalian cone phototransduction. Sci Rep 6: 32521.

Schott RK, Gow D, Chang BS 2016a. BlastPhyMe: A toolkit for rapid generation and analysis of

protein-coding sequence datasets. bioRxiv.


Schott RK, Muller J, Yang CG, Bhattacharyya N, Chan N, Xu M, Morrow JM, Ghenu AH, Loew

ER, Tropepe V, Chang BS 2016b. Evolutionary transformation of rod photoreceptors in the

all-cone retina of a diurnal garter snake. Proc Natl Acad Sci U S A 113: 356-361.

Schott RK, Panesar B, Card DC, Preston M, Castoe TA, Chang BS 2017. Targeted capture of

complete coding regions across divergent species. Genome Biol Evol.

Schott RK, Refvik SP, Hauser FE, Lopez-Fernandez H, Chang BS 2014. Divergent positive

selection in rhodopsin from lake and riverine cichlid fishes. Mol Biol Evol 31: 1149-1165.

Sillman AJ, Carver JK, Loew ER 1999. The photoreceptors and visual pigments in the retina of a

boid snake, the ball python (Python regius). J Exp Biol 202: 1931-1938.

Sillman AJ, Govardovskii VI, Rohlich P, Southard JA, Loew ER 1997. The photoreceptors and

visual pigments of the garter snake (Thamnophis sirtalis): a microspectrophotometric,

scanning electron microscopic and immunocytochemical study. J Comp Phys A 181: 89-

101.

Sillman AJ, Johnson JL, Loew ER 2001. Retinal photoreceptors and visual pigments in Boa

constrictor imperator. J Exp Zool 290: 359-365.

Simões BF, Sampaio FL, Douglas RH, Kodandaramaiah U, Casewell NR, Harrison RA, Hart

NS, Partridge JC, Hunt DM, Gower DJ 2016a. Visual Pigments, Ocular Filters and the

Evolution of Snake Vision. Mol Biol Evol 33: 2483-2495.

Simões BF, Sampaio FL, Jared C, Antoniazzi MM, Loew ER, Bowmaker JK, Rodriguez A, Hart

NS, Hunt DM, Partridge JC, Gower DJ 2015. Visual system evolution and the nature of the

ancestral snake. J Evol Biol 28: 1309-1320.

Simões BF, Sampaio FL, Loew ER, Sanders KL, Fisher RN, Hart NS, Hunt DM, Partridge JC,

Gower DJ 2016b. Multiple rod-cone and cone-rod photoreceptor transmutations in snakes:

evidence from visual opsin gene expression. Proc Biol Sci 283.


Song B, Cheng S, Sun Y, Zhong X, Jin J, Guan R, Murphy RW, Che J, Zhang Y, Liu X 2015. A

genome draft of the legless anguid lizard, Ophisaurus gracilis. Gigascience 4: 17.

Tachibanaki S, Yonetsu SI, Fukaya S, Koshitani Y, Kawamura S 2012. Low Activation and Fast

Inactivation of Transducin in Carp Cones. J Biol Chem 287: 41186-41194.

Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S 2011. MEGA5: Molecular

Evolutionary Genetics Analysis Using Maximum Likelihood, Evolutionary Distance, and

Maximum Parsimony Methods. Mol Biol Evol 28: 2731-2739.

Tansley K 1964. The gecko retina. Vision Res 4: 33-37.

Torres-Dowdall J, Henning F, Elmer KR, Meyer A 2015. Ecological and Lineage-Specific

Factors Drive the Molecular Evolution of Rhodopsin in Cichlid Fishes. Mol Biol Evol 32:

2876-2882.

Underwood G. 1970. The Eye. In: Gans C, editor. Biology of the Reptilia. New York: Academic

Press. p. 1-97.

Van Nynatten A, Bloom D, Chang BS, Lovejoy NR 2015. Out of the blue: adaptive visual

pigment evolution accompanies Amazon invasion. Biol Lett 11.

Vidal N, Delmas AS, David P, Cruaud C, Coujoux A, Hedges SB 2007. The phylogeny and

classification of caenophidian snakes inferred from seven nuclear protein-coding genes. C

R Biol 330: 182-187.

Vonk FJ, Casewell NR, Henkel CV, Heimberg AM, Jansen HJ, McCleary RJR, Kerkkamp

HME, Vos RA, Guerreiro I, Calvete JJ, Wuster W, Woods AE, Logan JM, Harrison RA,

Castoe TA, de Koning APJ, Pollock DD, Yandell M, Calderon D, Renjifo C, Currier RB,

Salgado D, Pla D, Sanz L, Hyder AS, Ribeiro JMC, Arntzen JW, van den Thillart G,

Boetzer M, Pirovano W, Dirks RP, Spaink HP, Duboule D, McGlinn E, Kini RM,


Richardson MK 2013. The king cobra genome reveals dynamic gene evolution and

adaptation in the snake venom system. Proc Natl Acad Sci U S A 110: 20651-20656.

Wada Y, Sugiyama J, Okano T, Fukada Y 2006. GRK1 and GRK7: unique cellular distribution

and widely different activities of opsin phosphorylation in the zebrafish rods and cones. J

Neurochem 98: 824-837.

Walls GL 1940. Ophthalmological Implications for the Early History of the Snakes. Copeia

1940: 1-8.

Walls GL 1934. The Reptilian Retina: I. A new concept of visual-cell evolution. Am J

Ophthalmol 17: 892-915.

Walls GL. 1942. The vertebrate eye and its adaptive radiation. Bloomfield Hills, MI: Cranbrook

Institute of Science.

Weadick CJ, Chang BSW 2012. An improved likelihood ratio test for detecting site-specific

functional divergence among clades of protein-coding genes. Mol Biol Evol 29: 1297-1300.

Weiss ER, Ducceschi MH, Horner TJ, Li A, Craft CM, Osawa S 2001. Species-specific

differences in expression of G-protein-coupled receptor kinase (GRK) 7 and GRK1 in

mammalian cone photoreceptor cells: implications for cone cell phototransduction. J

Neurosci 21: 9175-9184.

Wensel TG 2008. Signal transducing membrane complexes of photoreceptor outer segments.

Vision Res 48: 2052-2061.

Wu Y, Hadly EA, Teng W, Hao Y, Liang W, Liu Y, Wang H 2016. Retinal transcriptome

sequencing sheds light on the adaptation to nocturnal and diurnal lifestyles in raptors. Sci

Rep 6: 33578.

Yang Z 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24: 1586-

1591.


Yi H, Norell MA 2015. The burrowing origin of modern snakes. Science Advances 1: e1500743.

Zhang J, Nielsen R, Yang Z 2005. Evaluation of an improved branch-site likelihood method for

detecting positive selection at the molecular level. Mol Biol Evol 22: 2472-2479.

Zhang X, Wensel TG, Yuan C 2006. Tokay gecko photoreceptors achieve rod-like physiology

with cone-like proteins. Photochem Photobiol 82: 1452-1460.


4.9 Supplementary Figures

Figure S4.1. Species topology and representative taxon sampling used for the selection analyses on visual transduction genes. Taxon sampling was greatly enhanced by new sequences extracted from draft genomes, a hybrid enrichment experiment, and new transcriptome sequencing. Additional taxa were added when available (Fig. S4.2; Supplementary Files 4.1 and 4.2). Topology based on Pyron et al. (2013), Jarvis et al. (2014), Chiari et al. (2012), and Crawford et al. (2012).


Figure S4.2. Snake species tree topology used for SWS1 to illustrate the expanded taxon sampling available for the visual opsin genes. Complete taxon sampling for each gene is listed in Supplementary Files 4.1 and 4.2.


Figure S4.3. Additional partitioning schemes. The branch leading to snakes was compared to all other branches and caenophidian snakes were compared to other snakes using the snake-only dataset.


Figure S4.4. Tests for shifts in selective pressures on phototransduction genes between reptiles and snakes, and between snakes and caenophidian snakes (Fig. 4.3, Fig. S4.3). The ω (dN/dS) values of the divergent site class using CmD are shown, highlighting the difference between the background (open circle) and foreground (closed circle) partitions for each gene. When only an open circle is shown, the difference was not significant and the equivalent value from the null model (M3) is shown instead. Differences in ω were averaged for rod-, cone-, and non-rod/cone-specific genes, demonstrating the relative strength of divergent selection. Error bars show standard error.


4.10 Supplementary Tables

Table S4.1. Comparison of selective constraint between snakes and other reptiles, and between caenophidian snakes and other snakes, for each of the analyzed genes. The M0 model gives an average ω estimate across the gene, while the CmC analyses partition the dataset into two groups and estimate divergent selection at a subset of sites (ωd).

Type   Gene     M0 (ω)           M0 (ω)           M0 (ω)           CmC (ωd)         CmC (ωd)         CmC (ωd)         CmC (ωd)
                Reptile          Reptile          Snake            Reptile Dataset  Reptile Dataset  Snake Dataset    Snake Dataset
                (No Snakes)      (Incl. Snakes)                    Reptile          Snake            Snake            Caenophidian
Rod    CNGA1    0.123            0.123            0.108            0.285            0.285            0.717            0.717
Rod    CNGB1    0.180            0.192            0.257            0.270            0.362            0.325            0.325
Rod    GNAT1    0.029            0.043            0.143            0.084            0.290            4.478            4.478
Rod    GNB1     0.004            0.005            0.010            0.074            0.074            0.004            0.004
Rod    PDE6B    0.090            0.106            0.200            0.219            0.543            3.416            3.416
Rod    RH1      0.078            0.115            0.221            0.240            0.639            0.000            1.008
Rod    SAG      0.210            0.214            0.250            0.267            0.267            1.155            1.155
Rod    AVG      0.102            0.114            0.170            0.206            0.351            1.442            1.586
Cone   ARR3     0.086            0.104            0.201            0.179            0.317            0.250            1.882
Cone   CNGA3    0.098            0.110            0.186            0.247            0.530            0.271            1.302
Cone   CNGB3    0.205            0.237            0.498            0.256            0.649            0.229            4.134
Cone   GNAT2    0.033            0.051            0.182            0.128            0.802            0.000            2.310
Cone   GNB3     0.057            0.081            0.237            0.199            0.968            0.403            2.856
Cone   GRK7     0.202            0.233            0.477            0.246            0.608            0.776            5.210
Cone   GUCA1C   0.223            0.250            0.442            0.175            0.685            0.000            3.012
Cone   LWS      0.075            0.128            0.363            0.325            1.455            0.765            3.162
Cone   PDE6C    0.125            0.144            0.252            0.178            0.463            0.941            3.028
Cone   SLC24A2  0.202            0.219            0.338            0.210            0.453            0.513            2.821
Cone   SWS1     0.030            0.048            0.101            0.106            0.299            0.294            0.294
Cone   AVG      0.121            0.158            0.316            0.206            0.704            0.527            2.912
Both   GNB5     0.021            0.028            0.049            0.160            0.352            0.131            0.131
Both   GUCA1A   0.058            0.070            0.134            0.181            0.427            5.213            5.213
Both   GUCA1B   0.095            0.126            0.318            0.155            0.682            0.187            1.158
Both   GUCY2D   0.078            0.107            0.185            0.205            0.375            0.452            4.194
Both   GUCY2F   0.187            0.189            0.205            0.326            0.394            0.486            0.486
Both   RCVRN    0.044            0.075            0.223            0.136            0.397            4.192            4.192
Both   RGS9     0.133            0.157            0.340            0.213            0.561            0.000            1.106
Both   RGS9BP   0.042            0.057            0.083            0.006            0.040            0.071            0.022
Both   AVG      0.082            0.101            0.192            0.173            0.404            1.342            2.063
TOTAL  AVG      0.104            0.124            0.231            0.195            0.497            0.972            2.216


Table S4.2. Summary of the selection analyses performed on the branch leading to snakes and on the branch leading to caenophidian snakes using the branch, branch-site, and CmC models. Significantly elevated selection on the specified (foreground) branch is represented by fg, while elevated selection in the background is represented by bg. The + symbol indicates positive selection.

Type   Gene     Snake    Snake        Snake  Caenophidian  Caenophidian  Caenophidian
                Branch   Branch-site  CmC    Branch        Branch-site   CmC
Rod    CNGA1    fg       no           no     no            no            bg
Rod    CNGB1    no       no           no     no            no            no
Rod    GNAT1    no       no           bg     fg            no            fg
Rod    GNB1     no       no           no     no            no            no
Rod    PDE6B    fg       no           fg     fg            fg+           fg
Rod    RH1      no       no           bg     fg            fg+           no
Rod    SAG      bg       no           bg     no            no            no
Cone   ARR3     no       fg+          no     fg            no            fg
Cone   CNGA3    no       no           no     fg            no            fg+
Cone   CNGB3    fg       no           no     fg            no            fg
Cone   GNAT2    no       fg+          no     fg            fg+           no
Cone   GNB3     no       no           fg     fg            no            fg+
Cone   GRK7     no       no           no     fg            fg+           fg
Cone   LWS      no       no           bg     no            no            no
Cone   PDE6C    no no no fg+ fg *
Cone   SWS1     no       no           no     fg            fg+           fg
Cone   GUCA1C   no       no           no     fg+           no            fg+
Cone   SLC24A2  no       fg+          no     no            no            no
Both   GNB5     fg       no           fg     no            no            no
Both   GUCY2D   no       no           no     fg            no            fg
Both   GUCY2F   bg       no           no     no            no            no
Both   RCVRN    no       no           no     fg            no            fg
Both   RGS9     no no fg fg+ fg+ *
Both   RGS9BP   fg       no           fg     fg            no            fg
Both   GUCA1A   no       no           no     fg            no            no
Both   GUCA1B   no       no           no     fg            no            fg+

* Only five entries are given for this gene; they are listed in their original order because their column assignment is uncertain.


4.11 Supplementary Files

Supplementary files are available at: https://cp.sync.com/dl/37c015d30#jnzrj74e-wkqenpbs-z8re6q3k-axkrf5c5

Supplementary File 4.1. List of taxa sampled with accession numbers for each gene (except the opsins, which are in Supplementary File 4.2).

Supplementary File 4.2. List of taxa sampled with accession numbers for the opsin genes.

Supplementary File 4.3. PAML results tables for each gene. Files marked with ‘br’ contain results for the analyses done using the branch leading to the clade indicated as the foreground, while those analyses unmarked contain results for the analyses done on the clades indicated.


Chapter 5 Gene loss and divergent selection in gecko visual transduction genes

5.1 Abstract

The rod and cone photoreceptors of the vertebrate retina together enable vision from starlit nights to the brightest sunlight. In rare cases, however, simplex retinas have evolved that contain only rods or cones, presumably reducing the range of visual sensitivity. One of the best examples of this occurs in geckos, where the all-rod retina found in nocturnal species is thought to be derived from the all-cone retina of diurnal lizards through a process known as photoreceptor transmutation. Recently, we found that photoreceptor transmutation in snakes was accompanied by significant divergent and positive selection in phototransduction genes, but minimal gene loss.

In order to better understand the molecular evolutionary changes that underlie the striking morphological changes associated with photoreceptor transmutation, we sequenced whole eye transcriptomes from the nocturnal Tokay gecko and the diurnal Carolina anole lizard, and combined these with recent whole genome and targeted capture sequencing data. We test two hypotheses: (1) that nocturnal geckos express only cone phototransduction genes in their rod-like cone photoreceptors and (2) that gecko phototransduction genes are under divergent selection relative to other reptiles, but similar to that present in snakes, which also have undergone considerable photoreceptor transmutation. Surprisingly, we find that geckos still express most, but not all, rod phototransduction genes, and at levels similar to those found in the anole. The loss of a functional rhodopsin, which we identified as a pseudogene in the gecko genome, suggests that true rods have been lost and implies that rod phototransduction genes are expressed in cones, possibly along with cone transduction genes. Such co-expression may contribute to the rod-like physiology of nocturnal gecko cones and also could extend the range of visual sensitivity in order to compensate for the loss of true rods. In addition, we found strong support for the hypothesis that photoreceptor transmutation has imposed divergent selective constraints on gecko phototransduction genes that are similar to those previously identified in snakes. This suggests that in addition to adaptation through gene loss, co-option of rod genes, and changes in expression levels, there may be adaptation in phototransduction proteins. Together these results demonstrate that adaptation in complex systems can occur through multiple mechanisms simultaneously.

5.2 Introduction

Vertebrate retinas typically contain two types of photoreceptors that differ in their morphology and physiology. Rod photoreceptors have large cylindrical outer segments with enclosed discs, and are highly sensitive, but have slower reaction speeds and recovery times; whereas cone photoreceptors have smaller, tapered outer segments with open discs, and are less sensitive, but have faster response and recovery times (Lamb 2013). These properties enable rods and cones to function in dim light and bright light, respectively. In order to maintain vision under varying light conditions, most vertebrate retinas contain both rods and cones (duplex retina). Simplex retinas, those that contain only cones or only rods, are generally rare, but occur frequently in squamates (Walls 1942; Underwood 1970). Diurnal lizards and diurnal caenophidian snakes typically have morphologically all-cone retinas while some nocturnal lizards and nocturnal caenophidian snakes can have morphologically all-rod retinas, with some species having photoreceptors with morphologies intermediate between rods and cones (Walls 1942; Underwood 1970). The preponderance of simplex retinas and the implied transitions between all-cone and all-rod retinas led Walls (1934, 1942) to formulate his photoreceptor transmutation hypothesis.

The best-studied example of photoreceptor transmutation occurs in geckos, one of the most diverse groups of squamate lizards. Most geckos are nocturnal and have retinas that contain only photoreceptors that resemble rods in the size and shape of their outer segments. Based on comparative retinal and photoreceptor morphology, Walls (1942) proposed that the all-‘rod’ retinas of nocturnal geckos were derived from the all-cone retinas of ancestral diurnal lizards (Fig. 5.1). Furthermore, he hypothesized that extant diurnal geckos reverted to diurnality, and consequently their all-‘rod’ retinas were transmuted back to all-cone retinas (Walls 1942).

Support for this hypothesis has come from several avenues of research. Röll (2000) found that nocturnal gecko 'rods' were actually cones at all levels of their ultrastructure, having such cone features as open outer segment membranes. Molecular studies of the gecko visual system have revealed that geckos lack the rod photoreceptor pigment rhodopsin (RH1), instead expressing cone pigments (LWS, RH2, SWS1) in their 'rods' (Kojima et al. 1992). Furthermore, the 'rods' were found to use cone phototransduction machinery, not the phototransduction machinery that is specific to normal rod photoreceptors (Zhang et al. 2006). The photoreceptors of nocturnal geckos, however, function more similarly to rods than they do to cones (Kleinschmidt and Dowling 1975; Zhang et al. 2006). Together these studies provide strong support for the hypothesis that nocturnal gecko ‘rods’ were ‘transmuted’ from diurnal lizard cones and as such can be more accurately referred to as rod-like cones. Röll (2001) also found support for the tertiary diurnality of some geckos through an analysis of lens crystallins. An extensive phylogenetic study of temporal activity patterns in geckos confirmed the nocturnal ancestry of geckos followed by multiple independent transitions to diurnality (Gamble et al. 2015). While each of these studies supports Walls' (1942) transmutation theory, very little is known about the extent and nature of the impact of transmutation on the evolution and function of the visual system.

Recently, we have shown that caenophidian snakes (the group of snakes in which photoreceptor transmutation has occurred) have undergone divergent and positive selection relative to other snakes and reptiles (Chapter 4). We proposed that this may reflect functional adaptation as a result of the multiple transitions to all-rod-like and all-cone-like retinas in this group. However, we did not detect any gene loss associated with transmutation, such as has been reported with the loss of rods, and rod transduction genes, in geckos (Zhang et al. 2006). To further examine the effect of photoreceptor transmutation, we sequenced the whole eye transcriptome of the nocturnal species Gekko gecko and combined these data with recent whole genome sequencing of the closely related Gekko japonicus (Liu et al. 2015), and hybrid enrichment data from Sphaerodactylus notatus and Phelsuma madagascariensis grandis, two species that transitioned to diurnality independently (Gamble et al. 2015). We used these data to examine patterns of gene loss and selective constraint associated with photoreceptor transmutation in geckos. Specifically, we tested the hypotheses that (1) geckos express only cone phototransduction genes in their all-cone retinas and (2) gecko phototransduction genes are under divergent selective constraint relative to other reptiles, but similar to that present in snakes, which also have undergone considerable photoreceptor transmutation.


Figure 5.1. Schematic view of photoreceptor transmutation in geckos. Different rod and cone photoreceptor types are depicted based on their gross morphology and the visual pigments contained therein, identified based on the wavelength of maximal absorption. The ancestral tetrapod most likely had large single cones and double cones that contained LWS (red); small single cones that contained RH2 (green), SWS2 (blue), and SWS1 (purple); and rods that contained RH1 (white) (Bowmaker 2008). At some point a diurnal lizard ancestor of geckos lost rods and RH1. This ancestral lineage transitioned to a nocturnal lifestyle, which was accompanied by photoreceptor transmutation. All of the cone photoreceptors were modified to resemble rods. Small single photoreceptors were lost, as was SWS2. Instead RH2 and SWS1 are found in the accessory members of double cones. Several gecko lineages have independently re-evolved diurnality and this was accompanied by a return to an all-cone retina. Schematic is based on Walls (1942); Pedler and Tilly (1964); Tansley (1964); Underwood (1970); Kleinschmidt and Dowling (1975); Kojima et al. (1992); Loew et al. (1996); Röll (2000, 2001); Zhang et al. (2006).


5.3 Results

5.3.1 Geckos still possess and express several rod phototransduction genes

A total of 35 visual transduction genes (Table 5.1) were targeted for extraction from the de novo eye transcriptome and genome, as well as from NCBI GenBank, and a previously published visual gene hybrid enrichment experiment (Schott et al. 2016b). Of the 35 genes, 28 were recovered in geckos. As we noted previously, two genes, PDE6A and GNGT1, were found to be absent in all reptiles, but present in mammals, amphibians, and fishes. A third gene, SLC24A1, was found to be absent in all squamates, but is present in archelosaurs (turtles, crocodiles, and birds) and other vertebrate groups. Two visual opsins, RH1 and SWS2, previously identified to have been lost in geckos (Zhang et al. 2006), were not recovered in the eye transcriptome.

Interestingly, however, we detected a probable RH1 pseudogene in the Gekko japonicus genome and from the Gekko gecko and Phelsuma (but not Sphaerodactylus) targeted capture data, based on the presence of internal stop codons. This raises the possibility that RH1 was lost early in gecko evolutionary history rather than in a diurnal lizard ancestor and suggests that perhaps not all groups of geckos have lost RH1.
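The criterion used here to flag a probable pseudogene, the presence of internal stop codons, can be illustrated with a minimal sketch. It assumes an ungapped, in-frame coding sequence and the standard genetic code; the function name and the two toy sequences are hypothetical and are not taken from the gecko data.

```python
# Minimal sketch: flag in-frame stop codons that occur before the final codon.
# Assumes an ungapped coding sequence that starts in frame (standard genetic code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stop_codons(cds):
    """Return 0-based codon indices of stop codons, excluding the last codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Hypothetical toy sequences, for illustration only.
intact = "ATGAATGGCACAGAAGGCTAA"      # terminal stop codon only
disrupted = "ATGAATTGAACAGAAGGCTAA"   # premature TGA at the third codon

print(internal_stop_codons(intact))
print(internal_stop_codons(disrupted))
```

A sequence returning a non-empty list would be a candidate pseudogene under this criterion, although frameshifts and sequencing errors would need to be ruled out separately.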

Previous studies have suggested that geckos possess only cone transduction copies of phototransduction genes and have lost all rod transduction copies (Zhang et al. 2006). We used previous whole genome sequencing and targeted capture, and new eye transcriptome sequencing, to test this. Surprisingly, we found rod transduction copies of several phototransduction genes: rod transducin (GNAT1, GNB1), rhodopsin kinase (GRK1), rod arrestin (SAG), and the beta subunit of the rod cyclic nucleotide gated channel (CNGB1) were found in the Gekko japonicus, G. gecko, Phelsuma and Sphaerodactylus genomes. Rod phosphodiesterase (PDE6B, PDE6G) and the alpha subunit of the rod cyclic nucleotide gated channel (CNGA1) were not found in any of the genomes. Along with the loss of the two visual opsins noted above, geckos appear to have lost a total of five phototransduction proteins.

While we identified the presence of rod phototransduction genes in gecko genomes, we also needed to determine if they were actually expressed in the eye. When the relative expression levels of the phototransduction genes were calculated, we found that the rod transduction genes were expressed, although generally at much lower levels than their cone transduction counterparts (Fig. 5.2). The exception to this is rhodopsin kinase (GRK1), which was expressed at a higher level than GRK7. When compared to the expression of rod transduction genes in the diurnal lizard Anolis, which does express the rod visual pigment RH1, the expression levels were overall similar (Fig. 5.3). The most striking differences were in GRK1, which was much more highly expressed in Gekko, and in SAG, which was much more highly expressed in Anolis. Interestingly, CNGA1 was not expressed in Gekko, while CNGB1 was not expressed in Anolis. These two genes encode the alpha and beta subunits of the rod cyclic nucleotide gated channel, which typically assembles as a heterotetramer of three alpha subunits and one beta subunit. Overall, the similarity in expression levels between Gekko and Anolis suggests that the Gekko rod phototransduction proteins are being expressed in the retina and functioning in a similar manner to that in Anolis.
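The normalization behind the relative expression values (TPM expressed relative to the total for the phototransduction genes) can be sketched as follows. This is only an illustration: the function name and all TPM values are hypothetical, and it assumes each gene's TPM is simply divided by the summed TPM of the gene set.

```python
# Minimal sketch: express each gene's TPM as a fraction of the total TPM
# of a chosen gene set (here, phototransduction genes only).
def relative_expression(tpm, genes):
    """Return TPM of each gene in `genes` divided by the summed TPM of `genes`."""
    total = sum(tpm[g] for g in genes)
    return {g: tpm[g] / total for g in genes}


# Hypothetical TPM values, for illustration only (not measured data).
tpm = {"GRK1": 30.0, "GRK7": 10.0, "GNAT1": 5.0, "GNAT2": 55.0, "ACTB": 900.0}
phototransduction = ["GRK1", "GRK7", "GNAT1", "GNAT2"]

rel = relative_expression(tpm, phototransduction)
for gene, value in rel.items():
    print(f"{gene}: {value:.3f}")
```

Restricting the denominator to the phototransduction genes makes values comparable between species even when other transcripts dominate total expression.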


Figure 5.2. Relative expression levels (TPM) of phototransduction genes in Gekko gecko eye. Values are relative to total expression of phototransduction genes.

Figure 5.3. Comparison of relative expression levels (TPM) of rod phototransduction genes between the diurnal anole (Anolis) and nocturnal gecko (Gekko). Values are relative to total expression of phototransduction genes.


Table 5.1. Major components of the vertebrate visual phototransduction cascade and their presence or absence in geckos and other reptile groups.

Protein                                    Gene Symbol  Photoreceptor  Gene Name                           Lost In
Opsin                                      RH1          Rod            Rhodopsin (RHO)                     Geckos
                                           LWS          Cone           Long-wave Sensitive Cone Opsin
                                           RH2          Cone           Middle-wave Sensitive Cone Opsin    Snakes
                                           SWS1         Cone           Short-wave Sensitive Cone Opsin 1
                                           SWS2         Cone           Short-wave Sensitive Cone Opsin 2   Geckos, Snakes
Transducin                                 GNAT1        Rod            G Protein α-subunit 1
                                           GNB1         Rod            G Protein β-subunit 1
                                           GNGT1        Rod            G Protein γ-subunit 1               Reptiles
                                           GNAT2        Cone           G Protein α-subunit 2
                                           GNB3         Cone           G Protein β-subunit 3
                                           GNGT2        Cone           G Protein γ-subunit 2
Phosphodiesterase                          PDE6A        Rod            Phosphodiesterase α-subunit 6A      Reptiles
                                           PDE6B        Rod            Phosphodiesterase β-subunit 6B      Geckos
                                           PDE6G        Rod            Phosphodiesterase γ-subunit 6G      Geckos
                                           PDE6C        Cone           Phosphodiesterase β-subunit 6C
                                           PDE6H        Cone           Phosphodiesterase γ-subunit 6H
Cyclic Nucleotide Gated Channel            CNGA1        Rod            CNG α-subunit 1                     Geckos
                                           CNGB1        Rod            CNG β-subunit 1                     Anolis
                                           CNGA3        Cone           CNG α-subunit 3
                                           CNGB3        Cone           CNG β-subunit 3
Na+/Ca2+-K+ Exchanger                      SLC24A1      Rod            Solute Carrier Family 24 Member 1   Squamates
                                           SLC24A2      Cone           Solute Carrier Family 24 Member 2
Arrestin                                   SAG          Rod            Rod Arrestin (S-Antigen)
                                           ARR3         Cone           Cone Arrestin (X-arrestin)
G Protein-Coupled Receptor Kinase          GRK1         Rod            Rhodopsin Kinase                    Snakes
                                           GRK7         Cone           Cone Opsin Kinase
Regulator of G-Protein Signalling Complex  RGS9         Both           Regulator of G-Protein Signaling 9
                                           RGS9BP       Both           RGS9 Binding Protein
                                           GNB5         Both           G Protein β-subunit 5
Guanylate Cyclase Activating Protein       GUCA1A       Both           Guanylate Cyclase Activator 1A
                                           GUCA1B       Both           Guanylate Cyclase Activator 1B
                                           GUCA1C       Cone           Guanylate Cyclase Activator 1C
Guanylate Cyclase                          GUCY2D       Both           Guanylate Cyclase 2D
                                           GUCY2F       Both           Guanylate Cyclase 2F
Recoverin                                  RCVRN        Both           Recoverin

5.3.2 Divergent, Elevated Selection in Gecko Visual Transduction Genes

To test whether gecko phototransduction genes are under divergent selective pressures relative to other reptiles (Fig. 5.4), we analyzed selection patterns using the branch (Br), branch-site (BrS), and clade (CmC, CmD) models implemented in codeml from the PAML software package. We analyzed both the branch leading to geckos and the gecko clade, which included a small sample of both nocturnal and diurnal geckos. Of the 28 genes recovered in geckos, 25 were analyzed. The gamma subunits of transducin and phosphodiesterase (GNGT2, PDE6G, PDE6H) were very short and so were not analyzed further. A single species phylogeny was used to maintain an even comparison between all genes (Fig. S5.1).
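Nested codon models of this kind are conventionally compared with a likelihood ratio test (LRT), for example CmC against its null model M2a_rel. The sketch below illustrates that calculation under stated assumptions: the log-likelihood values are hypothetical, the test statistic is assumed to follow a chi-square distribution, and only one or two degrees of freedom are handled so that the example needs nothing beyond the standard library.

```python
import math


def lrt(lnl_null, lnl_alt, df=1):
    """Likelihood ratio test for nested models.

    Returns (statistic, p-value), assuming the statistic is chi-square
    distributed with `df` degrees of freedom (closed forms for df 1 and 2).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        return stat, 1.0
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))   # chi-square survival, 1 df
    elif df == 2:
        p = math.exp(-stat / 2.0)              # chi-square survival, 2 df
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p


# Hypothetical log-likelihoods for a null model and a two-partition clade model.
stat, p = lrt(lnl_null=-1000.0, lnl_alt=-998.0, df=1)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

A clade model with two partitions adds one ω parameter relative to its null, which is why one degree of freedom is used in the example; schemes with more partitions would need correspondingly more.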

The hypothesis of divergent selection between geckos and other reptiles was supported for nearly every gene (Table 5.2, Fig. 5.5). Only six of the 25 genes (CNGB1, GNB1, SWS1, GNB5, GUCY2F, GUCA1B) showed no evidence of divergent selection in geckos. Unlike the analyses previously done in snakes, we found little evidence for positive selection in geckos. This is unsurprising given the small number of gecko species currently available for analysis. Additional sampling of geckos will likely be very important for obtaining a better understanding of the molecular evolution of their visual system.

In addition to the analyses performed on the gecko clade, we also analyzed the branch leading to the geckos. This allowed us to identify potential episodic bursts of selection along this branch, as well as divergent selection that may not be reflected across the whole clade. Overall, the results for the branch leading to geckos tend to match those of the clade results, suggesting that divergent selection has occurred throughout the clade. Seven genes showed evidence of episodic positive selection on the branch leading to geckos with the branch-site model (CNGB1, ARR3, CNGB3, LWS, PDE6C, GUCY2D, GUCA1B). For two genes (CNGB1 and GUCA1B) we found evidence for episodic positive selection on the branch leading to geckos, but not for divergent selection between geckos and other reptiles. This suggests the divergent, positive selection in these genes may have been limited to this branch, prior to the diversification of geckos. In the other genes, the episodic positive selection along the gecko branch appears to have been followed by divergent selection during gecko diversification.

Figure 5.4. Clade partitions used to test the hypothesis that geckos and snakes experienced similar divergent selective pressures on phototransduction genes as a result of photoreceptor transmutation. The three partitioning schemes were compared to determine the best-fitting model for each gene. A best fit for the geckos + snakes partition indicates that selective pressures are similar in both groups, which supports the hypothesis.


Figure 5.5. Analysis of divergent selection on visual transduction genes between reptiles and geckos, and reptiles and geckos + snakes. The ω (dN/dS) values of the divergent site class using CmC are shown highlighting the difference between the background (open circle) and foreground (closed circle) partitions for each gene. When only a single, open circle is shown the difference was not significant and instead the equivalent value from the null model (M2a_rel) is shown. Note that GRK1 and RH2 were lost in snakes and thus are not shown for geckos + snakes.


Table 5.2. Summary of the selection analyses. Analyses were performed on the gecko branch (Br) and clade (Cl), and the snake clade, and on the combined gecko + snake clade (Fig. 5.4). Significant results are indicated for the branch (Br) and branch-site (BrS) models (gecko Br, only), and the clade models (CmC and CmD, all partitions). The bolded entry on each row indicates the best fit among the gecko, snake, and gecko + snake partitions. The ‘+’ symbol indicates positive selection and bg indicates selection was elevated in the background rather than the foreground partition.

Type Gene Gecko Br Gecko Cl Snake Cl Gecko+Snake Cl
Rod CNGB1 BrS+ no CmC CmD CmC/D
Rod GNAT1 Br CmC CmC CmD CmC CmD CmC CmD
Rod GNB1 no no no no
Rod GRK1 Br, CmC CmC, CmD n/a n/a
Rod SAG CmC CmC CmD no CmC/D
Cone ARR3 BrS+ CmC CmC/D CmC CmD+
Cone CNGA3 no CmC CmD+ CmC/D CmC CmD
Cone CNGB3 Br BrS+ CmC CmC CmD CmC CmD
Cone GNAT2 Br CmC CmC CmD CmC CmD CmC CmD
Cone GNB3 Br CmC CmC/D CmC/D CmC CmD
Cone GRK7 CmC CmC/D CmC/D CmC CmD
Cone LWS Br BrS+ CmC CmC/D CmC CmD+ CmC CmD+
Cone PDE6C BrS+ CmC CmC/D CmC/D CmC CmD
Cone SWS1 no no CmC CmD CmC/D
Cone GUCA1C CmC CmC CmD CmC CmD CmC CmD
Cone RH2 Br CmC CmC CmD n/a n/a
Cone SLC24A2 Br CmC CmC/D CmC/D CmC CmD
Both GNB5 no no CmC CmD CmC CmD
Both GUCY2D Br BrS+ CmC CmC/D CmC/D CmC CmD
Both GUCY2F no no CmC/D CmC CmD bg
Both RCVRN no CmC/D CmC CmD CmC/D
Both RGS9 Br CmC CmC CmD CmC/D CmC CmD
Both RGS9BP Br CmC CmC CmD CmC/D CmC CmD
Both GUCA1A Br CmC CmC CmC CmD
Both GUCA1B BrS+ CmC no CmC CmC CmD

Abbreviations: Br, branch model; BrS, branch-site model; bg, elevated ω in background; +, positive selection.


5.3.3 Geckos and snakes have experienced similar divergent selective pressures that are associated with photoreceptor transmutation

We previously demonstrated that snakes have experienced divergent selective pressures that are associated with photoreceptor transmutation (Chapter 4). We hypothesized that photoreceptor transmutation may have imposed similar selective pressures on both snakes and geckos (Fig. 5.4). We tested this using clade model analyses where both snakes and geckos were placed into the same partition. These models were then compared to the models that contained a separate partition for only geckos or for only snakes in order to determine which model was the best fit for each phototransduction gene (Fig. 5.4).

In most cases (15 out of 22; Table 5.2) the gecko plus snake partition was the best-fitting of the three partitions tested. The exceptions are primarily limited to cases where the gene was not under divergent selection in snakes alone (SAG) or in geckos alone (CNGB1, SWS1, GNB5). In only two genes (LWS, RCVRN) was the snake partition a better fit than the snake plus gecko partition despite there being evidence for divergent selection in geckos alone. LWS in particular stands out as there was evidence in this gene for episodic positive selection on the branch leading to geckos. In GUCA1B we found somewhat of the opposite pattern: there was evidence of episodic positive selection on the branch leading to geckos, but no evidence of divergent selection on the whole clade; however, the addition of snakes to the foreground partition resulted in a better-fitting model than snakes alone. These results highlight the importance of accounting for variation in other taxa outside the focal group that may obscure inferences of divergent or positive selection. Overall, the pattern of selection strongly supports the hypothesis that snakes and geckos are under similar divergent selective pressures that are associated with photoreceptor transmutation.


5.4 Discussion

We sequenced the whole eye transcriptome of Gekko gecko and combined these data with previous whole genome sequencing of Gekko japonicus and targeted capture sequencing of Sphaerodactylus, Phelsuma, and G. gecko in order to better understand the molecular changes associated with photoreceptor transmutation in geckos. Specifically, we tested two hypotheses: 1) that as a result of the loss of rod photoreceptors geckos have lost all rod phototransduction genes and 2) that geckos and snakes have undergone similar divergent selective pressures on phototransduction genes as a result of photoreceptor transmutation. We found that geckos still express most rod phototransduction genes in the eye, but not rhodopsin, which we identified as a pseudogene in the genomes of G. japonicus, G. gecko, and Phelsuma. The loss of rhodopsin provides further support that true rods were lost, presumably prior to the diversification of geckos. With only cones in the gecko retina, the maintenance of expression of rod phototransduction genes implies that these genes are expressed in cones, possibly along with cone phototransduction genes. We also found strong support for our second hypothesis: when we tested for divergent selection in snakes and geckos together, this was a better fit than either snakes or geckos alone for the majority of genes.

Previous studies of the gecko visual system have shown that geckos possess only cone photoreceptors that resemble rods in nocturnal species and cones in diurnal species, with considerable amounts of variation (Walls 1942; Tansley 1964; Underwood 1970; Röll 2000). In nocturnal species, the rod-like cones were found to function similarly to true rods (Meneghini and Hamasaki 1967; Kleinschmidt and Dowling 1975; Zhang et al. 2006). This was thought to have been achieved using only cone visual pigments and cone phototransduction proteins (Kojima et al. 1992; Zhang et al. 2006). We confirmed the absence of rhodopsin, the rod visual pigment, in geckos, but found that several rod transduction genes were present in the genomes of both nocturnal and diurnal geckos, as well as being expressed in the eye of the nocturnal Tokay gecko (Gekko gecko). The inability of Zhang et al. (2006) to sequence any rod genes does not necessarily indicate that they were actually absent (not expressed), as they were also unable to sequence the genes encoding either rhodopsin kinase (GRK1) or cone opsin kinase (GRK7), the encoded protein for at least one of which they detected in the photoreceptor outer segments. This highlights a strength of the RNA-Seq and whole genome sequencing data that we were able to utilize.

The loss of rhodopsin, but maintenance of other rod transduction genes, is unusual, especially in a group that is ancestrally nocturnal (Röll 2001; Gamble et al. 2015). Presumably rhodopsin was lost in the hypothesized ancestral diurnal lizard that possessed an all-cone retina, a hypothesis that has some support from photoreceptor morphology (Walls 1942; Pedler and Tilly 1964; Tansley 1964; Underwood 1970; Röll 2000); however, we have yet to identify the loss of rhodopsin in any other lizards besides geckos, although sampling is still very limited (Schott et al. 2017; Chapter 4). An alternate possibility that could be implied by the presence of an RH1 pseudogene is that RH1 was not lost in all geckos. Additional sampling of both geckos and their close relatives is needed to address these possibilities.

With the loss of rods, the maintenance of rod transduction genes suggests that they are instead expressed in cone photoreceptors in geckos and perhaps other squamates as well. Since cone transduction genes are also expressed, rod and cone copies may be expressed in the same photoreceptors, but it is also possible that some photoreceptor types express only rod genes and some only cone genes. Geckos possess four different morphologically and spectroscopically distinguishable types of photoreceptors (Fig. 5.1), but additional cryptic subtypes could be present that possess different combinations of rod and cone copies of phototransduction proteins.

Co-expression of rod and cone copies within cone photoreceptors is known in a few cases. Specifically, both rhodopsin kinase (GRK1) and cone opsin kinase (GRK7) are expressed in the cones of some, but not all, vertebrates (Zhao et al. 1999; Weiss et al. 2001; Wada et al. 2006; Osawa and Weiss 2012). Rod (SAG) and cone (ARR3) arrestin are also co-expressed in cones of mice (Nikonov et al. 2008) and in blue cones of humans (Nork et al. 1993; Sakuma et al. 1996), and there is some evidence for co-expression in specific cone types in other vertebrates as well (see Nork et al. 1993 and references therein). While other cases of rod and cone copies of phototransduction genes being co-expressed have not been reported, the specificity of rod and cone genes has only been tested in a small number of species, and, as far as we are aware, has not been examined at all in squamates.

We found that geckos possess both rhodopsin kinase (GRK1) and cone opsin kinase (GRK7) and that in Gekko gecko the expression of GRK1 was higher than that of GRK7. This is opposite to the pattern found for Anolis, where GRK7 expression is higher, as would be expected for an animal with a cone-dominated retina. However, in the nocturnal G. gecko, where the cones are rod-like, both in terms of morphology and physiology, the expression of GRK1, at a higher level than GRK7, in the cone photoreceptors could contribute to their more rod-like function. GRK7 has a much higher activity than GRK1 (Tachibanaki et al. 2005; Wada et al. 2006) and when GRK7 was ectopically expressed in zebrafish rods it was found to lower the rod sensitivity (Vogalis et al. 2011). While the effect of expression of GRK1 in cones has not been tested, it stands to reason that it could contribute to a more rod-like physiology.

The expression of both types of arrestin in mouse cones is unusual, but co-expression in a single subtype of cone (e.g., blue cones) may be more common (Nork et al. 1993). In mouse cones, SAG expression is higher than ARR3 (Nikonov et al. 2008), but because rod arrestin self-associates forming dimers and tetramers, and only the monomer is active, the effective concentration of cone arrestin was estimated to be double that of rod arrestin (Gurevich and Gurevich 2010). The low expression of rod arrestin (SAG) relative to cone arrestin (ARR3) in the G. gecko eye suggests that SAG may only be expressed in green (RH2) or UV (SWS1) cones alongside ARR3. Expression of SAG in green cones has been reported for a turtle (red-eared slider) and chicken (van Veen et al. 1986) and thus this cone type may be a good candidate for arrestin co-expression in geckos. Rod and cone arrestin show very different activities when expressed in vitro, where amphibian and mammalian rod arrestins bound with a high affinity and cone arrestins bound with a very low affinity, forming only short-lived complexes (Smith et al. 2000; Sutton et al. 2005). When co-expressed in mice, rod and cone arrestin were found to both function similarly to inactivate cones, but are thought to have distinct modulatory roles (Nikonov et al. 2008; Deming et al. 2015). In zebrafish larvae, cone arrestin (ARR3a) was found to be important for high temporal resolution, but the function of the rod arrestins, which were not expressed in cones, was not evaluated (Renninger et al. 2011). It is unclear whether co-expression of rod and cone arrestin in gecko cones could contribute to a more rod-like physiology in nocturnal species. A hint perhaps comes from carp rods, which contain two rod arrestins, one of which binds strongly (similar to amphibian and mammal rod arrestins), while a second copy binds weakly and transiently (similar to cone arrestins) (Tomizuka et al. 2015). The effect these two arrestins have on carp vision, however, remains to be evaluated, as does the potential impact that differences in arrestin function could have for geckos.

Unlike opsin kinases and arrestins, rod and cone transducins and cyclic nucleotide gated channels (CNG) have not been reported to be co-expressed in cone cells. However, geckos have maintained both rod transducin and rod CNG and express them in the eye and so presumably in the cone photoreceptors. Most studies have found that rod and cone transducin are functionally very similar in terms of activation and inactivation (Deng et al. 2009; Gopalakrishna et al. 2012; Tachibanaki et al. 2012; Mao et al. 2013; but see Chen et al. 2010). Rather, the concentration of transducin may be important for rates of activation and inactivation (Sokolov et al. 2002; Mao et al. 2013). Since rod transducin translocates to the inner segment under normal daylight conditions, but cone transducin does not (Lobanova et al. 2010), co-expression of rod and cone transducin in the same cell could increase the range of light levels under which the photoreceptor can operate, a property that would presumably be highly advantageous in an organism that possesses only rod-like cones, as in nocturnal geckos, rather than having both rods for dim-light vision and cones for bright-light vision.

The apparent loss of the rod cyclic nucleotide gated channel α-subunit (CNGA1), but not the β-subunit (CNGB1) in geckos is highly unusual. While CNGA1 can form functional homomeric channels on its own in vitro, CNGB1 cannot (Kaupp et al. 1989; Chen et al. 1993). Native rod CNG is formed from three CNGA1 subunits and one CNGB1 subunit (Weitz et al. 2002; Zheng et al. 2002; Zhong et al. 2002). The addition of the single CNGB1 subunit results in several functional specializations including increased Ca2+ permeation, modulation by Ca2+-calmodulin, more rapid kinetics, a 10-fold increase in the current activated by cAMP, and sensitivity to L-cis diltiazem (Chen et al. 1993; Hsu and Molday 1993; Chen et al. 1994; Hsu and Molday 1994; Korschen et al. 1995; Zheng et al. 2002). CNGB1 is also necessary for proper targeting and mice lacking CNGB1 experience retinal degeneration (Huttl et al. 2005). It is possible that CNGB1 is expressed along with CNGA3, the cone α-subunit, as it was found that co-expression of different combinations of α- and β-subunits all produced functional channels in vitro (Finn et al. 1998), although the specific combination of CNGA3 with CNGB1 has not been reported. Typically rod channels are less permeable, carry less Ca2+ current, are less sensitive to cGMP, and have a narrower range of modulation (Kaupp and Seifert 2002). Each of these features is modulated by the β-subunit (Chen et al. 1993; Gerstner et al. 2000; Zheng et al. 2002; Peng et al. 2003), suggesting that a CNGA3-CNGB1 heteromer could produce a more rod-like channel. It should be noted that splice variants of CNGB1 are also expressed in olfactory neurons (Sautter et al. 1998) and in testes (Wiesner et al. 1998); however, this still does not explain continued expression in the eye. Interestingly, the reverse situation, loss of CNGB1, but not CNGA1, appears to have occurred in Anolis, but not any other squamate examined thus far (Schott et al. 2017; Chapter 4). In this case expression of CNGA1 is maintained in the eye. This could be explained through co-expression with CNGB3, which might result in a more cone-like channel and could be expressed in the rhodopsin-bearing cones of Anolis. Such scenarios in geckos and anoles are speculative, but certainly hint at interesting functional evolution that warrants further study.

The morphologically rod-like cones of nocturnal geckos also have a rod-like physiology that includes high sensitivity and slow response times (Zhang et al. 2006). Zhang et al. (2006) proposed that one of the key factors responsible for this was the reduced GTPase accelerating protein (GAP, formed by RGS9-GNB5-RGS9BP) activity and concentration. In bovine and chipmunk rods GAP concentration is much lower than in cones (Cowan et al. 1998; Zhang et al. 2003), and Zhang et al. (2006) found that the concentration of RGS9 relative to visual pigment was actually slightly lower in Gekko than in bovine rods. The expression level we found for RGS9, while low relative to the other transduction genes, was similar to that in Anolis. Whether this represents a disconnect between mRNA and protein expression levels or a much lower concentration of RGS9 in Anolis cones compared to mammalian cones is unclear. Interestingly, we found recoverin (RCVRN) expression to be relatively much higher in Gekko than in Anolis. RCVRN is a Ca2+-binding protein that increases visual sensitivity and prolongs phototransduction (Gray-Keller et al. 1993; Erickson et al. 1998; Makino et al. 2004; Sampath et al. 2005) and thus a high RCVRN concentration is likely to contribute to a more rod-like physiology of gecko cones.


Beyond differences in expression levels of rod and cone transduction genes, our results suggest the possibility of adaptation in protein function. We found significant evidence for divergent selective pressures between geckos and other reptiles for the majority of phototransduction genes. This pattern was very similar to what we found previously for snakes, which also experienced substantial photoreceptor transmutation (Chapter 4). In fact, including both geckos and snakes in the same partition, in comparison to other reptiles, was a better fit than a partition containing either geckos or snakes alone for the majority of genes (15 out of 22; Table 5.2). These results suggest that geckos and snakes were under similarly elevated selective pressures as a result of photoreceptor transmutation.

In caenophidian snakes we observed considerably more divergent and positive selection in cone phototransduction genes than rod or nonspecific genes. We proposed that this was a result of repeated independent transitions from diurnality to nocturnality and reflected adaptation of the cone genes towards a more rod-like function with the evolution of all-rod retinas. While we were unable to evaluate positive selection within geckos due to the small sample size, we did find more divergent selection in cone genes than rod genes. In geckos the activity pattern transitions are reversed with several independent evolutions of diurnality from a nocturnal ancestral condition (Röll 2001; Gamble et al. 2015); however, in nocturnal geckos the cone genes are acting in rod-like cones and thus might be expected to be adapted to a more rod-like function. The observed pattern would be explained by selective pressures for more cone-like function in the cone genes imposed by the evolutionary transitions to diurnality and morphologically and physiologically all-cone retinas.


5.5 Conclusions

Photoreceptor transmutation was originally proposed over 70 years ago by Gordon Walls (1934), but it is only recently that we have begun to study the molecular underpinnings of the morphological changes that occur as part of the evolutionary process of photoreceptor transmutation. In nocturnal geckos the rod-like photoreceptors that were derived from cones not only have strikingly different gross morphologies, but also distinctly rod-like physiology. Here we have provided evidence that this rod-like physiology has evolved through multiple mechanisms including the co-option of expression of rod genes in cone photoreceptors, changes in expression levels, and protein adaptation driven by divergent selective pressures. Despite the loss of rhodopsin and any photoreceptors with rod ultrastructure, we found that geckos still express rod phototransduction genes in the eye, and presumably in cone photoreceptor cells. Further work will be needed to localize the expression of these proteins and to determine the functional consequences of the potential co-expression of rod and cone proteins in the same cell. We further found that the relative expression of recoverin in the nocturnal gecko was much higher than in a diurnal anole, which could contribute substantially to the increased sensitivity and slower reaction times of the nocturnal gecko rod-like cone. Finally, we found that geckos have experienced significant divergent selective pressures on phototransduction genes, similar to those found in snakes, that are associated with photoreceptor transmutation. This suggests that gecko phototransduction genes, as well as those in snakes, may be functionally adapted towards more rod-like and cone-like activity in nocturnal and diurnal species, respectively. Overall, our results demonstrate how drastic changes can evolve in a complex system through multiple concomitant mechanisms.


5.6 Methods

5.6.1 Animals

A Gekko gecko and an Anolis carolinensis specimen were obtained from commercial retailers and euthanized under approval of the University of Toronto Animal Care Committee. Eyes were extracted and either frozen in liquid nitrogen or placed in RNAlater (Ambion) and stored at -80°C.

5.6.2 Transcriptome Sequencing

Whole eyes were homogenized in Trizol (Invitrogen) using a BeadBug (Benchmark Scientific). Total RNA was extracted following a combined Trizol/RNeasy (Qiagen) protocol according to the manufacturer’s instructions. Library construction and sequencing on the Illumina HiSeq pipeline were performed according to standard protocols at The Centre for Applied Genomics, the Hospital for Sick Children (Toronto). Resulting 150 bp paired-end reads were trimmed with Trimmomatic v0.33 (Bolger et al. 2014) using default settings. Trimmed reads were assembled de novo using Trinity (Grabherr et al. 2011) under default settings. Visual transduction gene transcripts were identified and extracted using BLAST. Transcript identities were confirmed through phylogenetic analysis.

5.6.3 Visual Transduction Gene Datasets

Genes encoding each of the major, known components of the visual transduction cascade (Lamb 2013) were extracted from the Gekko gecko eye transcriptome, the G. japonicus genome, and the targeted capture data from Schott et al. (2017) and combined with the datasets from Chapter 4 following the same methodology. Because GRK1 and RH2 are absent in snakes, complete datasets for these genes were created following the methodology outlined in Chapter 4.

5.6.4 Expression Analyses

The complete set of phototransduction genes from Gekko and Anolis was used as a reference to estimate relative expression levels of phototransduction genes. Trimmed reads were aligned to the reference using BWA-MEM (Li 2013) and read counts were calculated using Samtools idxstats (Li et al. 2009) implemented in a custom pipeline. Read counts were used to calculate transcripts per million (TPM; Conesa et al. 2016).
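The conversion from read counts to TPM can be illustrated with a minimal sketch. This is not the custom pipeline used here, which is not reproduced in this chapter. It assumes the standard four-column, tab-separated output of Samtools idxstats (reference name, sequence length, mapped reads, unmapped reads) and uses full reference length rather than an effective length. The function names and the example counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Dict, Tuple


def parse_idxstats(text: str) -> Dict[str, Tuple[int, int]]:
    """Parse samtools idxstats text into {reference: (length, mapped_reads)}."""
    records = {}
    for line in text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # Final idxstats row: reads with no reference assignment.
            continue
        records[name] = (int(length), int(mapped))
    return records


def tpm(records: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Transcripts per million from per-reference lengths and read counts."""
    # Step 1: reads per kilobase of reference sequence.
    rpk = {name: mapped / (length / 1000.0)
           for name, (length, mapped) in records.items()}
    # Step 2: scale so that values sum to one million across the reference set.
    scale = sum(rpk.values()) / 1e6
    return {name: value / scale for name, value in rpk.items()}


# Hypothetical idxstats output (NOT data from this study).
EXAMPLE_IDXSTATS = (
    "GRK1\t1700\t3400\t0\n"
    "GRK7\t1600\t800\t0\n"
    "RCVRN\t600\t1500\t0\n"
    "*\t0\t0\t120\n"
)

example_tpm = tpm(parse_idxstats(EXAMPLE_IDXSTATS))
for gene, value in sorted(example_tpm.items()):
    print(f"{gene}\t{value:.1f}")
```

Because the reference contains only phototransduction genes, TPM values computed this way sum to one million across that gene set and are therefore best read as relative expression among those genes rather than across the whole transcriptome.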

5.6.5 Molecular Evolutionary Analyses

To estimate the strength and form of selection acting on visual transduction genes in geckos, each dataset was analyzed with the codeml program from the PAML 4 software package (Yang 2007) using the branch (Br), branch-site (BrS), and clade models (CmC, CmD). Random sites models M0, M2a_rel, and M3 were also used as these are the null models for the Br, CmC, and CmD models, respectively. All analyses were run with varying starting values to avoid potential local optima. To determine significance, nested model pairs were compared using a likelihood ratio test (LRT) with a χ2 distribution, while non-nested models were evaluated using the Akaike Information Criterion (AIC).
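As an illustration of the likelihood ratio test described above, the following sketch computes the test statistic, 2(lnL_alternative - lnL_null), and its χ2 p-value for a nested model pair. It is a minimal example, not the code used for the analyses. The log-likelihood values are hypothetical, and the single degree of freedom assumes that the alternative model adds one free parameter to its null (as when one extra ω is estimated for a single foreground partition). The χ2 tail probability is implemented with the closed-form expressions for integer degrees of freedom so that only the Python standard library is needed.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
        term = 1.0
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df = 2k+1: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{k} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (test statistic, p-value) for a nested model comparison."""
    # Small negative values can arise from optimization noise; clamp to zero.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for a null model and its alternative
# (NOT values from this study); df = 1 is an assumption for illustration.
lrt_stat, lrt_p = likelihood_ratio_test(lnl_null=-5230.40, lnl_alt=-5225.10, df=1)
print(f"2*deltaLnL = {lrt_stat:.2f}, p = {lrt_p:.4g}")
```

In practice the degrees of freedom should be set to the difference in the number of free parameters reported by codeml for the two models being compared.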

To test the hypothesis that geckos experienced divergent selection from other reptiles they were placed into a separate partition and analyzed with CmC and CmD. These models both test for divergent selection at a subset of sites between two or more partitions, but differ in the requirement in CmC for a neutral class of sites (ω = 1). Due to the previously identified presence of divergent and positive selection in snakes, we also used the clade models to test for divergent selection in the snake clade and with both the snake and gecko clades placed in the foreground. The likelihoods of the gecko, snake, and gecko+snake partitions were each compared to determine the best-fitting partition.
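Because the three partition schemes are not nested within one another, their fits can be ranked with AIC, as noted above. The sketch below shows one way this comparison could be carried out. The log-likelihoods and parameter counts are hypothetical, and the function names are introduced only for illustration. When the competing models have the same number of parameters, as assumed in the example, ranking by AIC is equivalent to ranking by likelihood.

```python
from typing import Dict, Tuple


def aic(lnl: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2lnL (lower is better)."""
    return 2.0 * n_params - 2.0 * lnl


def best_partition(fits: Dict[str, Tuple[float, int]]):
    """Return the best-fitting partition name and delta-AIC for each candidate.

    `fits` maps a partition label to (log-likelihood, number of parameters).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    deltas = {name: score - scores[best] for name, score in scores.items()}
    return best, deltas


# Hypothetical CmC fits for one gene under the three foreground partitions
# (NOT values from this study).
example_fits = {
    "gecko": (-5226.8, 45),
    "snake": (-5224.9, 45),
    "gecko+snake": (-5221.3, 45),
}

best_name, delta_aic = best_partition(example_fits)
print("best-fitting partition:", best_name)
for name, delta in sorted(delta_aic.items(), key=lambda item: item[1]):
    print(f"{name}\tdeltaAIC = {delta:.1f}")
```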

In addition to divergent selection between clades, we also tested for divergent selection, and episodic positive selection, on the branch leading to geckos. The branch leading to geckos was placed in the foreground partition and tested using the Br, BrS, and CmC models. The Br model tests for overall differences between the two partitions, while the BrS model explicitly tests for positive selection.

5.7 References

Chen CK, Woodruff ML, Chen FS, Shim H, Cilluffo MC, Fain GL 2010. Replacing the rod with the cone transducin alpha subunit decreases sensitivity and accelerates response decay. J Physiol 588: 3231-3241.

Chen TY, Illing M, Molday LL, Hsu YT, Yau KW, Molday RS 1994. Subunit 2 (or beta) of retinal rod cGMP-gated cation channel is a component of the 240-kDa channel-associated protein and mediates Ca(2+)-calmodulin modulation. Proc Natl Acad Sci U S A 91: 11757-11761.

Chen TY, Peng YW, Dhallan RS, Ahamed B, Reed RR, Yau KW 1993. A new subunit of the cyclic nucleotide-gated cation channel in retinal rods. Nature 362: 764-767.

Chiari Y, Cahais V, Galtier N, Delsuc F 2012. Phylogenomic analyses support the position of turtles as the sister group of birds and crocodiles (Archosauria). BMC Biol 10: 65.


Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, Mortazavi A 2016. A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13.

Cowan CW, Fariss RN, Sokal I, Palczewski K, Wensel TG 1998. High expression levels in cones of RGS9, the predominant GTPase accelerating protein of rods. Proc Natl Acad Sci U S A 95: 5351-5356.

Crawford NG, Faircloth BC, McCormack JE, Brumfield RT, Winker K, Glenn TC 2012. More than 1000 ultraconserved elements provide evidence that turtles are the sister group of archosaurs. Biol Lett 8: 783-786.

Deming JD, Pak JS, Shin JA, Brown BM, Kim MK, Aung MH, Lee EJ, Pardue MT, Craft CM 2015. Arrestin 1 and Cone Arrestin 4 Have Unique Roles in Visual Function in an All-Cone Mouse Retina. Invest Ophthalmol Visual Sci 56: 7618-7628.

Deng WT, Sakurai K, Liu JW, Dinculescu A, Li J, Pang JJ, Min SH, Chiodo VA, Boye SL, Chang B, Kefalov VJ, Hauswirth WW 2009. Functional interchangeability of rod and cone transducin alpha-subunits. Proc Natl Acad Sci U S A 106: 17681-17686.

Erickson MA, Lagnado L, Zozulya S, Neubert TA, Stryer L, Baylor DA 1998. The effect of recombinant recoverin on the photoresponse of truncated rod photoreceptors. Proc Natl Acad Sci U S A 95: 6474-6479.

Finn JT, Krautwurst D, Schroeder JE, Chen TY, Reed RR, Yau KW 1998. Functional co-assembly among subunits of cyclic-nucleotide-activated, nonselective cation channels, and across species from nematode to human. Biophys J 74: 1333-1345.

Gamble T, Greenbaum E, Jackman TR, Bauer AM 2015. Into the light: diurnality has evolved multiple times in geckos. Biol J Linn Soc 115: 896-910.


Gerstner A, Zong X, Hofmann F, Biel M 2000. Molecular cloning and functional characterization of a new modulatory cyclic nucleotide-gated channel subunit from mouse retina. J Neurosci 20: 1324-1332.

Gopalakrishna KN, Boyd KK, Artemyev NO 2012. Comparative analysis of cone and rod transducins using chimeric Galpha subunits. Biochemistry 51: 1617-1624.

Gray-Keller MP, Polans AS, Palczewski K, Detwiler PB 1993. The effect of recoverin-like calcium-binding proteins on the photoresponse of retinal rods. Neuron 10: 523-531.

Gurevich VV, Gurevich EV. 2010. Phototransduction: inactivation in cones. In: Dartt DA, editor. Encyclopedia of the Eye. Oxford: Academic Press. p. 370-374.

Hsu YT, Molday RS 1993. Modulation of the cGMP-gated channel of rod photoreceptor cells by calmodulin. Nature 361: 76-79.

Hsu YT, Molday RS 1994. Interaction of calmodulin with the cyclic GMP-gated channel of rod photoreceptor cells: modulation of activity, affinity purification, and localization. J Biol Chem 269: 29765-29770.

Huttl S, Michalakis S, Seeliger M, Luo DG, Acar N, Geiger H, Hudl K, Mader R, Haverkamp S, Moser M, Pfeifer A, Gerstner A, Yau KW, Biel M 2005. Impaired channel targeting and retinal degeneration in mice lacking the cyclic nucleotide-gated channel subunit CNGB1. J Neurosci 25: 130-138.

Jarvis ED, Mirarab S, Aberer AJ, Li B, Houde P, Li C, Ho SYW, Faircloth BC, Nabholz B, Howard JT, Suh A, Weber CC, da Fonseca RR, Li JW, Zhang F, Li H, Zhou L, Narula N, Liu L, Ganapathy G, Boussau B, Bayzid MS, Zavidovych V, Subramanian S, Gabaldon T, Capella-Gutierrez S, Huerta-Cepas J, Rekepalli B, Munch K, Schierup M, Lindow B, Warren WC, Ray D, Green RE, Bruford MW, Zhan XJ, Dixon A, Li SB, Li N, Huang YH, Derryberry EP, Bertelsen MF, Sheldon FH, Brumfield RT, Mello CV, Lovell PV, Wirthlin M, Schneider MPC, Prosdocimi F, Samaniego JA, Velazquez AMV, Alfaro-Nunez A, Campos PF, Petersen B, Sicheritz-Ponten T, Pas A, Bailey T, Scofield P, Bunce M, Lambert DM, Zhou Q, Perelman P, Driskell AC, Shapiro B, Xiong ZJ, Zeng YL, Liu SP, Li ZY, Liu BH, Wu K, Xiao J, Yinqi X, Zheng QM, Zhang Y, Yang HM, Wang J, Smeds L, Rheindt FE, Braun M, Fjeldsa J, Orlando L, Barker FK, Jonsson KA, Johnson W, Koepfli KP, O'Brien S, Haussler D, Ryder OA, Rahbek C, Willerslev E, Graves GR, Glenn TC, McCormack J, Burt D, Ellegren H, Alstrom P, Edwards SV, Stamatakis A, Mindell DP, Cracraft J, Braun EL, Warnow T, Jun W, Gilbert MTP, Zhang GJ 2014. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 346: 1320-1331.

Kaupp UB, Niidome T, Tanabe T, Terada S, Bonigk W, Stuhmer W, Cook NJ, Kangawa K, Matsuo H, Hirose T, Miyata T, Numa S 1989. Primary structure and functional expression from complementary-DNA of the rod photoreceptor cyclic GMP-gated channel. Nature 342: 762-766.

Kaupp UB, Seifert R 2002. Cyclic nucleotide-gated ion channels. Physiol Rev 82: 769-824.

Kleinschmidt J, Dowling JE 1975. Intracellular-recordings from gecko photoreceptors during light and dark-adaptation. J Gen Physiol 66: 617-648.

Kojima D, Okano T, Fukada Y, Shichida Y, Yoshizawa T, Ebrey TG 1992. Cone visual pigments are present in gecko rod cells. Proc Natl Acad Sci U S A 89: 6841-6845.

Korschen HG, Illing M, Seifert R, Sesti F, Williams A, Gotzes S, Colville C, Muller F, Dose A, Godde M, Molday L, Kaupp UB, Molday RS 1995. A 240 kDa protein represents the complete β subunit of the cyclic nucleotide-gated channel from rod photoreceptor. Neuron 15: 627-636.


Lamb TD 2013. Evolution of phototransduction, vertebrate photoreceptors and retina. Prog Retin Eye Res 36: 52-119.

Li H 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 1303.3997v1.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25: 2078-2079.

Liu Y, Zhou Q, Wang Y, Luo L, Yang J, Yang L, Liu M, Li Y, Qian T, Zheng Y, Li M, Li J, Gu Y, Han Z, Xu M, Wang Y, Zhu C, Yu B, Yang Y, Ding F, Jiang J, Yang H, Gu X 2015. Gekko japonicus genome reveals evolution of adhesive toe pads and tail regeneration. Nat Commun 6: 10033.

Lobanova ES, Herrmann R, Finkelstein S, Reidel B, Skiba NP, Deng WT, Jo R, Weiss ER, Hauswirth WW, Arshavsky VY 2010. Mechanistic basis for the failure of cone transducin to translocate: why cones are never blinded by light. J Neurosci 30: 6815-6824.

Loew ER, Govardovskii VI, Rohlich P, Szel A 1996. Microspectrophotometric and immunocytochemical identification of ultraviolet photoreceptors in geckos. Vis Neurosci 13: 247-256.

Makino CL, Dodd RL, Chen J, Burns ME, Roca A, Simon MI, Baylor DA 2004. Recoverin regulates light-dependent phosphodiesterase activity in retinal rods. J Gen Physiol 123: 729-741.

Mao W, Miyagishima KJ, Yao Y, Soreghan B, Sampath AP, Chen JE 2013. Functional Comparison of Rod and Cone G alpha(t) on the Regulation of Light Sensitivity. J Biol Chem 288: 5257-5267.


Meneghini KA, Hamasaki DI 1967. The electroretinogram of the iguana and Tokay gecko. Vision Res 7: 243-251.

Nikonov SS, Brown BM, Davis JA, Zuniga FI, Bragin A, Pugh EN, Craft CM 2008. Mouse cones require an arrestin for normal inactivation of phototransduction. Neuron 59: 462-474.

Nork TM, Mangini NJ, Millecchia LL 1993. Rods and cones contain antigenically distinctive S-antigens. Invest Ophthalmol Visual Sci 34: 2918-2925.

Osawa S, Weiss ER 2012. A tale of two kinases in rods and cones. Adv Exp Med Biol 723: 821-827.

Pedler C, Tilly R 1964. The nature of the gecko visual cell: a light and electron microscopic study. Vision Res 4: 499-510.

Peng C, Rich ED, Thor CA, Varnum MD 2003. Functionally important calmodulin-binding sites in both NH2- and COOH-terminal regions of the cone photoreceptor cyclic nucleotide-gated channel CNGB3 subunit. J Biol Chem 278: 24617-24623.

Pyron RA, Burbrink FT, Wiens JJ 2013. A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evol Biol 13: 93.

Renninger SL, Gesemann M, Neuhauss SC 2011. Cone arrestin confers cone vision of high temporal resolution in zebrafish larvae. Eur J Neurosci 33: 658-667.

Röll B 2000. Gecko vision-visual cells, evolution, and ecological constraints. J Neurocytol 29: 471-484.

Röll B 2001. Multiple origin of diurnality in geckos: evidence from eye lens crystallins. Naturwissenschaften 88: 293-296.

Sakuma H, Inana G, Murakami A, Higashide T, McLaren MJ 1996. Immunolocalization of X-arrestin in human cone photoreceptors. FEBS Lett 382: 105-110.


Sampath AP, Strissel KJ, Elias R, Arshavsky VY, McGinnis JF, Chen J, Kawamura S, Rieke F, Hurley JB 2005. Recoverin improves rod-mediated vision by enhancing signal transmission in the mouse retina. Neuron 46: 413-420.

Sautter A, Zong XG, Hofmann F, Biel M 1998. An isoform of the rod photoreceptor cyclic nucleotide-gated channel beta subunit expressed in olfactory neurons. Proc Natl Acad Sci U S A 95: 4696-4701.

Schott RK, Panesar B, Card DC, Preston M, Castoe TA, Chang BS 2017. Targeted capture of complete coding regions across divergent species. Genome Biol Evol.

Smith WC, Gurevich EV, Dugger DR, Vishnivetskiy SA, Shelamer CL, McDowell JH, Gurevich VV 2000. Cloning and functional characterization of salamander rod and cone arrestins. Invest Ophthalmol Visual Sci 41: 2445-2455.

Sokolov M, Lyubarsky AL, Strissel KJ, Savchenko AB, Govardovskii VI, Pugh EN, Arshavsky VY 2002. Massive light-driven translocation of transducin between the two major compartments of rod cells: A novel mechanism of light adaptation. Neuron 34: 95-106.

Sutton RB, Vishnivetskiy SA, Robert J, Hanson SM, Raman D, Knox BE, Kono M, Navarro J, Gurevich VV 2005. Crystal structure of cone arrestin at 2.3 angstrom: Evolution of receptor specificity. J Mol Biol 354: 1069-1080.

Tachibanaki S, Arinobu D, Shimauchi-Matsukawa Y, Tsushima S, Kawamura S 2005. Highly effective phosphorylation by G protein-coupled receptor kinase 7 of light-activated visual pigment in cones. Proc Natl Acad Sci U S A 102: 9329-9334.

Tachibanaki S, Yonetsu SI, Fukaya S, Koshitani Y, Kawamura S 2012. Low Activation and Fast Inactivation of Transducin in Carp Cones. J Biol Chem 287: 41186-41194.

Tansley K 1964. The gecko retina. Vision Res 4: 33-37.

Tomizuka J, Tachibanaki S, Kawamura S 2015. Phosphorylation-independent Suppression of

Light-activated Visual Pigment by Arrestin in Carp Rods and Cones. J Biol Chem 290:

9399-9411.

Underwood G. 1970. The Eye. In: Gans C, editor. Biology of the Reptilia. New York: Academic

Press. p. 1-97.

van Veen T, Vigh-Teichmann I, Vigh B, Hartwig HG 1986. Light and electron microscopy of S-

antigen- and opsin-immunoreactive photoreceptors in the retina of turtle, chicken, and

hedgehog. Exp Biol 45: 1-14.

Vogalis F, Shiraki T, Kojima D, Wada Y, Nishiwaki Y, Jarvinen JLP, Sugiyama J, Kawakami K,

Masai I, Kawamura S, Fukada Y, Lamb TD 2011. Ectopic expression of cone-specific G-

protein-coupled receptor kinase GRK7 in zebrafish rods leads to lower photosensitivity and

altered responses. Journal of Physiology-London 589: 2321-2348.

Wada Y, Sugiyama J, Okano T, Fukada Y 2006. GRK1 and GRK7: unique cellular distribution

and widely different activities of opsin phosphorylation in the zebrafish rods and cones. J

Neurochem 98: 824-837.

Walls GL 1934. The Reptilian Retina: I. A new concept of visual-cell evolution. Am J

Ophthalmol 17: 892-915.

Walls GL. 1942. The vertebrate eye and its adaptive radiation. Bloomfield Hills, MI: Cranbrook

Institute of Science.

Weiss ER, Ducceschi MH, Horner TJ, Li A, Craft CM, Osawa S 2001. Species-specific

differences in expression of G-protein-coupled receptor kinase (GRK) 7 and GRK1 in

mammalian cone photoreceptor cells: implications for cone cell phototransduction. J

Neurosci 21: 9175-9184.

Weitz D, Ficek N, Kremmer E, Bauer PJ, Kaupp UB 2002. Subunit stoichiometry of the CNG

channel of rod photoreceptors. Neuron 36: 881-889.

Wiesner B, Weiner J, Middendorff R, Hagen V, Kaupp UB, Weyand I 1998. Cyclic nucleotide-

gated channels on the flagellum control Ca2+ entry into sperm. J Cell Biol 142: 473-484.

Zhang X, Wensel TG, Kraft TW 2003. GTPase regulators and photoresponses in cones of the

eastern chipmunk. J Neurosci 23: 1287-1297.

Zhang X, Wensel TG, Yuan C 2006. Tokay gecko photoreceptors achieve rod-like physiology

with cone-like proteins. Photochem Photobiol 82: 1452-1460.

Zhao XY, Yokoyama K, Whitten ME, Huang J, Gelb MH, Palczewski K 1999. A novel form of

rhodopsin kinase from chicken retina and pineal gland. FEBS Lett 454: 115-121.

Zheng J, Trudeau MC, Zagotta WN 2002. Rod cyclic nucleotide-gated channels have a

stoichiometry of three CNGA1 subunits and one CNGB1 subunit. Neuron 36: 891-896.

Zhong HM, Molday LL, Molday RS, Yau KW 2002. The heteromeric cyclic nucleotide gated

channel adopts a 3A : 1B stoichiometry. Nature 420: 193-198.

5.8 Supplementary Figure

Figure S5.1. Species topology and representative taxon sampling used for the selection analyses on visual transduction genes. Topology based on Chiari et al. (2012); Crawford et al. (2012);

Pyron et al. (2013); Jarvis et al. (2014).

Chapter 6 Conclusions

6.1 Summary and Conclusions

In the preceding four studies I have advanced the goal of expanding our understanding of the evolution and molecular basis of photoreceptor transmutation. I started by asking the question: how did the all-cone retina found in some diurnal snakes evolve? Walls (1942) believed that these simplex retinas evolved through the loss of rods and it was only later, in evolutionary transitions between simplex retinas, that photoreceptor transmutation took place. He did not appear to consider the alternate possibility that the rods were transmuted into cones. I believe at least part of the reason for this is the lack of understanding at the time of the molecular components of photoreceptors and the difficulty in extracting and identifying visual pigments.

This led Walls (1942) to mistakenly conclude that rhodopsin was lost and re-evolved in several instances. As Simões et al. (2016) note, this also influenced Underwood’s (1967, 1970) conclusions that snakes with all-rod retinas actually lacked true rods. The view that rods were lost was perhaps reinforced by the finding of Jacobs et al. (1992) that garter snakes lacked a separate rod (scotopic) visual response. Sillman et al. (1997, 1999, 2000) found that diurnal garter snakes with all-cone retinas and nocturnal henophidian snakes with duplex retinas (Python and Boa) have the same number of visual pigments, which provided the first clue that the all-cone retina may not be derived through the loss of the rods; however, the differences in λmax between Thamnophis and Python/Boa precluded assuming that the visual pigments were the same. Davies et al. (2009) identified the visual pigments in henophidians with duplex retinas, but those in snakes with all-cone retinas, such as the garter snake, were still unknown.

In the first study of my thesis I tested two competing hypotheses about the evolutionary fate of the rods in a diurnal garter snake: 1) that the rods, and their corresponding molecular machinery, were lost or 2) that the rods were evolutionarily modified to resemble, and function, as cones. To test these hypotheses I utilized multiple experiments performed by both myself and others. First we confirmed by SEM that Thamnophis proximus has no photoreceptors that could be identified morphologically as rods. We next used MSP to identify the presence of three distinct visual pigments and sequenced the corresponding opsin genes which revealed the expression of RH1, the rod opsin gene. This gene was under strong selective constraint suggesting a conserved function relative to other snakes and vertebrates. When expressed in vitro, T. proximus RH1 produced a functional visual pigment with a λmax that matched the MSP estimates for one of the visual pigments present in small single cones. Immunofluorescent staining demonstrated that RH1 was present in the outer segments of a subset of ‘cone’ photoreceptor cells in T. proximus retina. Another rod-specific component of the phototransduction cascade, rod transducin (GNAT1), was found to co-localize in the same subset of photoreceptors. Finally, while the general morphology of the photoreceptors was indicative of an all-cone retina, close examination of the ultrastructure of individual cells using TEM revealed that a subset of ‘cones’ in fact had rod features, including outer segment discs that were completely enclosed by plasma membrane. Taken together the findings of each of these experiments strongly support the hypothesis that the all-cone retina of diurnal colubrids and other snakes is derived not from the loss of the rods, but rather their transmutation into cones.

While we provided strong evidence for the mechanism behind the evolution of the all-cone retina, the driving force is perhaps less clear. Vertebrates require both rods and cones in order to maintain visual perception through the range of natural illumination. Rods can respond to single photons, but saturate in bright light, whereas cones never saturate, but have insufficient sensitivity to function in very dim light. For this reason nearly all vertebrates have both rods and cones, presumably because even highly diurnal animals may encounter, or be active in, dim-light environments, and vice versa. In fact, only diurnal squamates are thought to have lost rods and thus have all-cone retinas (with the possible exception of the stellate sturgeon) and only geckos are known to have lost RH1 (Bowmaker 2008). Thus, the change to cone-like rods in diurnal snakes, and the corresponding reduction in dim-light visual capabilities, as shown by Jacobs et al. (1992), might be seen as detrimental. However, the loss of the SWS2 and RH2 cone opsins in snakes, which results in low sensitivity to a large portion of the visual spectrum, may explain this. The loss of SWS2 and RH2 would severely limit the amount of visible light snakes would be sensitive to and reduce the potential for colour vision. In primarily nocturnal snakes this may not be an issue, but in highly diurnal snakes, such as Thamnophis, there may be a significant advantage to increasing the range of spectral sensitivity through utilization of RH1 that could outweigh the reduction in scotopic sensitivity. This would also help to explain the highly blue-shifted absorption spectra of T. proximus RH1. Based on studies in mammals it is possible that RH1 may also be able to contribute to colour vision (McKee et al. 1977; Reitner et al. 1991; Cao et al. 2008), which could provide an additional adaptive advantage. Overall this study illustrates how sensory evolution can be shaped not only by environmental constraints, but also by historical contingency in forming new cell types with convergent functionality.

With an idea of the basis for the evolution of the all-cone retina in snakes, my next goal was to further understand the molecular consequences of photoreceptor transmutation in snakes. However, the molecular evolution of the visual system in snakes, and squamates in general, is severely understudied and far fewer genomic resources are available for squamates than for other groups such as mammals or birds. As a result, I first needed to produce the necessary data. Expanding studies of photoreceptor transmutation, and visual evolution more generally, beyond visual pigments would require sequencing of large numbers of genes from a large number of species. Despite continued advances in sequencing technologies, this remains a nontrivial task. Sequencing entire genomes is generally too time-consuming, and too costly, on comparative taxonomic scales, and produces much more data than necessary for most evolutionary questions. PCR, on the other hand, still excels at sequencing small numbers of genes, but quickly becomes cost ineffective when large numbers of genes are required, while primer design and optimization become time inefficient across divergent species (Mamanova et al. 2010; Shen et al. 2013). As a result, there is a need for methods that can efficiently sequence a large set of genes of interest from a large number of species. Transcriptome sequencing (RNA-Seq; Wang et al. 2009) is an increasingly popular option (and one I also utilize in my later studies), but has several downsides including a reliance on fresh tissue samples and variation in transcript expression levels. I wanted an approach that did not suffer from these issues and that could be used instead of, or in addition to, RNA-Seq.

The approach I selected to accomplish this was targeted capture (hybrid enrichment).

While this suite of methods is well established for genome resequencing projects, cross-species capture strategies are still being developed and generally focus on the capture of conserved regions, rather than complete coding regions from specific genes of interest. The data produced by existing methods is thus useful for phylogenetic studies, but the wealth of comparative data that could be used for evolutionary and functional studies is lost. As a result I needed to design and implement a targeted capture method that enables recovery of complete coding regions across broad taxonomic scales. Specifically, I selected 166 genes for targeted enrichment and sequencing, composed primarily of visual genes, from 16 squamate reptiles that spanned the major lineages, encompassing approximately 200 myr of divergence. To accomplish this goal capture probes were designed from multiple reference species and extensively tiled in order to facilitate cross-species capture. I developed, with the assistance of two undergraduate students that I mentored, novel bioinformatics pipelines that were able to recover nearly all of the targeted genes with high completeness. Recovery of more divergent sequences was lower, but this was primarily due to the difficulty of cross-species guided assembly rather than a failure of the hybrid enrichment. Increased probe diversity and tiling for a subset of genes had a large positive effect on both recovery and completeness. The resulting data produced an accurate species tree, but importantly this same data can also be applied to studies of molecular evolution and function, greatly expanding the data available to me for the study of the evolution of photoreceptor transmutation in squamates. When compared with RNA-Seq the targeted enrichment method performed well both in terms of the quality of the data and the cost, making it an ideal alternative when the drawbacks of RNA-Seq become important. This method demonstrates the utility of cross-species approaches for the capture of full-length coding sequences, and has substantially improved the ability to conduct large-scale comparative studies of molecular evolution and function.

Most previous studies of molecular evolution of the visual system, and all of those in squamates, have focused solely on opsin genes. With my third study I wanted to utilize the data produced in the second study, as well as new whole eye transcriptome data I produced for this study and the growing genomic resources in reptiles, to expand beyond this, with a focus on how the visual system may have changed with photoreceptor transmutation. I also hoped to utilize this data to provide new insight into the long-standing debate on the origin of snakes, between fossorial, aquatic, or a more general dim-light ancestry. To this end I tested two predictions: 1) that snakes lost phototransduction genes and experienced a relaxation of selective pressures, due to their potential fossorial origins, and 2) that caenophidian snakes, in which transmutation appears to have been widespread, have experienced positive selection as they adapted to simplex retinas.

Using phototransduction gene coding sequences from new whole eye transcriptomes, the targeted enrichment experiment in the previous study, and available genomic resources I compared selective pressures between snakes and other reptiles and within snakes using random sites, branch, branch-site, and clade models. The results suggest that phototransduction genes are under considerable positive and divergent selection in snakes to a degree not found in other vertebrate groups. These exceptional selective patterns are likely linked both to the unique evolutionary origins of snakes and to photoreceptor transmutation in caenophidians. I identified the surprising loss of rhodopsin kinase in snakes, despite a low degree of gene loss overall, and a lack of relaxed selection early during snake evolution. These findings support a dim-light early snake ancestor that was not highly adapted for fossoriality. These data provide some of the first clear evolutionary genomic corroboration for discerning among many possible hypotheses for the evolutionary origins of snakes based on controversial fossil data (Caldwell and Lee 1997; Lee 2005; Longrich et al. 2012; Yi and Norell 2015; Lee et al. 2016). Our results also indicate that caenophidian snakes experienced significant positive selection, particularly in cone-specific genes, that likely reflects adaptive evolution towards a more rod-like function that occurred on multiple branches within the caenophidian clade to facilitate the development of all-rod retinas. These results reveal potential molecular adaptations as a result of photoreceptor transmutation, and also highlight unappreciated functional differences between rod- and cone-specific phototransduction proteins. This intriguing example of snake visual system evolution illustrates how the underlying molecular components of a complex system can be reshaped in response to changing selection pressures.

In the final study I switched from a focus on photoreceptor transmutation in snakes to that in geckos. Geckos represent the most well studied case of photoreceptor transmutation. Most geckos are nocturnal and have retinas that contain only rod-like photoreceptors. Walls (1942) proposed that the all-rod retinas of nocturnal geckos were derived from the all-cone retinas of ancestral diurnal lizards. He further believed that diurnal geckos were in fact tertiarily diurnal, having reverted to diurnality, and consequently their all-rod retinas were transmuted back to all-cone retinas (Walls 1942). Since Walls’ (1942) work, and Underwood’s (1970) following that, several studies have supported this view, demonstrating that nocturnal gecko ‘rods’ have cone features suggesting they were actually cones (Tansley 1964; Röll 2000), that the ‘rods’ function intermediately between cones and true rods (Kleinschmidt and Dowling 1975; Zhang et al. 2006), and that gecko photoreceptors utilize only cone phototransduction proteins (Kojima et al. 1992; Zhang et al. 2006). While each of these studies supports Walls' (1942) transmutation theory, very little is known about the extent and nature of the impact of transmutation on the evolution and function of the visual system. My goal with this chapter was to further explore photoreceptor transmutation utilizing the data produced in the previous two studies, as well as new whole eye transcriptome sequencing from Gekko gecko and Anolis carolinensis that I performed for this study and the recently released Gekko japonicus genome.

Using this data I tested two hypotheses: (1) that geckos express only cone phototransduction genes in their all-cone retinas and (2) that gecko phototransduction genes are under elevated selective constraint relative to other reptiles, but similar to that present in snakes, which also have undergone considerable photoreceptor transmutation. Using both genomic and whole eye transcriptome data I did not find support for the first hypothesis. Surprisingly, I found that geckos still express most, but not all, rod phototransduction genes, and at levels similar to those found in the diurnal anole, which has an all-cone retina, but retains expression of rhodopsin. The loss of a functional rhodopsin in geckos, which I identified as a pseudogene in the gecko genome, suggests that true rods have been lost and implies that the rod transduction genes are expressed in cones, possibly along with cone transduction genes. Such co-expression may contribute to the rod-like physiology of gecko cones and also could extend the range of visual sensitivity in order to compensate for the loss of true rods. The explanation for the loss of rhodopsin in an ancestrally nocturnal group (Röll 2001; Gamble et al. 2015) remains a mystery.

In contrast to the first hypothesis, I found strong support for the hypothesis that photoreceptor transmutation has imposed divergent selective constraints on gecko phototransduction genes that are similar to those I identified in snakes in the previous study.

When geckos and snakes were included in the same partition this resulted in a better fit for the clade model for nearly all genes. In snakes I suggested that the divergent and positive selection may indicate functional evolution of the phototransduction proteins. This seems likely to have occurred in geckos as well and suggests that in addition to adaptation through gene loss, co-option of rod genes, and changes in expression levels, there may be adaptation in phototransduction protein function. Together these results demonstrate that adaptation in complex systems can occur through multiple mechanisms simultaneously.
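One common way to compare clade-model partitions that are not nested within one another is an information criterion such as AIC. The sketch below is illustrative only: the use of AIC is an assumption here, not a statement of how the comparisons in Chapters 4 and 5 were made, and the partition labels, log-likelihoods, and parameter counts are hypothetical placeholders rather than results from this thesis.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_partitions(fits):
    """Rank candidate partitions by AIC.

    fits maps a partition label to (lnL, number of free parameters).
    Returns (label, AIC, delta AIC) tuples, best-fitting partition first.
    """
    scored = sorted((aic(lnl, k), label) for label, (lnl, k) in fits.items())
    best = scored[0][0]
    return [(label, score, score - best) for score, label in scored]


# Hypothetical values for a single gene (placeholders, not thesis results).
example_fits = {
    "snakes only": (-10250.0, 45),
    "geckos only": (-10255.0, 45),
    "snakes + geckos": (-10240.0, 45),
}

for label, score, delta in rank_partitions(example_fits):
    print(f"{label}: AIC = {score:.1f}, delta AIC = {delta:.1f}")
```

Under this kind of comparison, a combined snake and gecko partition would be preferred for a gene whenever it yields the lowest AIC among the candidate partitions.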

Overall with this thesis I have advanced the understanding of photoreceptor transmutation, specifically in snakes and geckos. I have produced the first molecular evidence for photoreceptor transmutation in snakes, provided a new methodology for sequencing genes that are relevant for molecular evolutionary studies, and revised the view of photoreceptor transmutation in geckos. Additionally, I have provided the first evidence for molecular changes that are associated with photoreceptor transmutation. Despite this there are still many aspects of photoreceptor transmutation, and squamate visual evolution more generally, that are poorly understood. Below I suggest some avenues for future research to address this.

6.2 Future Directions

In Chapter 2 I proposed that transmutation of rods into cones may enhance the range of spectral sensitivity, increase spectral discrimination, and even contribute to trichromatic colour vision. This is a highly plausible hypothesis and explains the spectral shifts seen in the visual pigments, but requires substantial additional testing. The spectral sensitivity of garter snakes and other snakes with all-cone retinas needs to be thoroughly examined. Based on ERG recordings, Jacobs et al. (1992) concluded that Thamnophis had only a single visual pigment and no rod (scotopic) visual response. Revisiting this with the molecular information we now have and modern electrophysiological techniques is likely to reveal interesting new insights. It is not currently known whether snakes have colour vision. Colour vision has recently been demonstrated behaviourally in nocturnal geckos, which have transmuted, rod-like all-cone retinas (Roth and Kelber 2004). A similar experiment is needed to demonstrate colour vision in snakes, but training snakes to respond to colour is likely to prove difficult, if it is even possible. Without behavioural verification of colour vision it may not be possible to support the last part of the trichromacy hypothesis.

There are two other immediate areas for future research to follow up on the first study.

The first is an examination of the all-rod retinas of colubrids and other caenophidian snakes.

Underwood (1967, 1970) believed that nocturnal colubrids with all-rod retinas lacked true

(primary) rods, similar to the situation in geckos. While research in geckos has supported the absence of true rods, similar studies have not been performed to the same extent in nocturnal snakes. The presence of rhodopsin and other rod transduction proteins in all snakes studied thus far strongly suggests that some of the photoreceptors in snakes with morphologically all-rod retinas are true rods, while the others are transmuted cones (see also Simões et al. 2016).

Electron microscopy and immunohistochemistry experiments are needed to confirm this and analyze the diversity of photoreceptors in detail. The functional properties of the transmuted rod-like cones are also unknown. Do they function like rods, similar to the rod-like cones of geckos? Electrophysiological studies may be able to uncover their function and provide more insight into the functional implications of photoreceptor transmutation, including the function of photoreceptors with morphology that is intermediate between rods and cones.

The final immediate follow-up is to look at the evolutionary origins of other simplex retinas, the first step of which would be to look at the origin of the all-cone retina of squamate lizards. According to Walls (1942) round-pupiled diurnal lizards all have all-cone retinas and he believed this evolved through loss of the rods. While rods appear to have been lost in geckos, all other lizards examined thus far (see Chapters 3–5) have rhodopsin and other rod phototransduction proteins, suggesting that the rods have not been lost, but instead may have been transmuted into cones. Thus the all-cone retina of lizards likely evolved through the same mechanism as the all-cone retina in snakes. A repeat of the experiments done in Chapter 2, but on Anolis and other lizards with all-cone retinas, should be sufficient to demonstrate this.

Rhodopsin expression has been shown with the 4D2 antibody in Anolis (McDevitt et al. 1993),

Chamaeleo (Bennis et al. 2005), and Tiliqua (New et al. 2012) photoreceptors; however

McDevitt et al. (1993) also found 4D2 reactivity in Gekko gecko where I have found strong evidence that rhodopsin has been pseudogenized and is not expressed (Chapter 5). I believe that the 4D2 antibody is actually not specific for rhodopsin (RH1), but also binds RH2, at least in squamates. RH2 is a cone opsin with high sequence similarity to RH1 that has been lost in mammals, snakes, and amphibians, but is present in lizards including geckos and Anolis. The

4D2 antibody binds to amino acids 2–39 of bovine rhodopsin (Hicks and Molday 1986).

Alignment of this region shows that it is highly conserved in both RH1 and RH2, but is divergent in the other opsins (LWS, SWS1, and SWS2), suggesting that the 4D2 antibody likely binds both

RH1 and RH2 (Fig. 6.1). Anolis RH2 only differs from bovine rhodopsin in this region by four additional sites, all of which are at the end of the epitope. This might explain the unusual staining reported by McDevitt et al. (1993) where they found concentrated 4D2 staining in the fovea.

Rods (and therefore rhodopsin) are usually absent from the fovea and found concentrated in the periphery (especially in diurnal species). If 4D2 also stains RH2, this would suggest a concentration of RH2 in the fovea, which might be expected in diurnal species. The specificity of

4D2 could be further evaluated by in vitro expression of RH1 and RH2 followed by western blotting. Dual labelling with 4D2 and rod transducin (K20) would also be useful in identifying potential rods in ‘all-cone’ lizard retinas.
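The epitope comparison described above can be sketched as a count of differing sites within residues 2–39 of the reference sequence. This is a minimal illustration, assuming the two sequences have already been aligned to equal length with no gaps before the end of the epitope, so that string positions match reference numbering. The function name is hypothetical and the short strings in the usage example are toy placeholders, not real opsin sequences.

```python
def epitope_differences(reference, query, start=2, end=39):
    """List 1-based positions within [start, end] where query differs.

    Both sequences must be pre-aligned and of equal length, and the
    window must lie inside them; positions use reference numbering.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if start < 1 or end > len(reference) or start > end:
        raise ValueError("epitope window falls outside the alignment")
    return [
        pos
        for pos in range(start, end + 1)
        if reference[pos - 1] != query[pos - 1]
    ]


# Toy placeholder strings (not real opsin sequences), window 2-5.
toy_reference = "MNGTEGP"
toy_query = "MNGSEGA"
print(epitope_differences(toy_reference, toy_query, start=2, end=5))
```

Applied to an alignment of bovine rhodopsin against each squamate opsin, a short list of differing positions for RH1 and RH2, and a long list for LWS, SWS1, and SWS2, would be consistent with 4D2 binding both RH1 and RH2.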

Two other groups have good morphological evidence for photoreceptor transmutation that warrants further study: the night lizards (Xantusiidae) and tuatara (Sphenodon), both of which have partially transmuted their photoreceptors to rods according to Walls (1942). Night lizards are a nocturnal group of lizards that have photoreceptors with enlarged outer segments that Walls (1942) viewed as intermediate between rods and cones. The function and molecular components of the photoreceptors are unknown. It is possible that, like geckos, they have lost true rods and this could be the driving pressure behind the evolution of the all-rod retina. The tuatara is the last remaining species of the sister group to squamates (Rhynchocephalia) and is endemic to New Zealand. Tuatara have, again according to Walls (1942), converted most, but not all, of their photoreceptors to rods. In tuatara there appears to be a reversal of the typical condition where there are multiple types of cones and a single type of rod. Instead tuatara have single rods and double rods and a very rare small single cone. The functional and molecular properties of these photoreceptors are unknown. Study of the tuatara visual system will likely prove very interesting, but difficult due to the endangered status of the animal.

There are several potential improvements that could be made to the targeted enrichment methodology that I developed in the second study. The most obvious would be further incorporating transcriptome sequencing into the process. In the study, I used transcriptome (and genomic) data to supplement and create new references that greatly increased the recovery and completeness of the targeted coding regions. Incorporating transcriptome sequencing from the start would allow both the probes and the reference sequences to benefit from this data (see for example Bi et al. 2012; Portik et al. 2016). I found that increased probe representation also substantially increased recovery and completeness, and thus this is likely to be beneficial. Such an approach, however, does introduce some of the limitations of transcriptome sequencing, namely tissue- and timing-specific expression. As a result, this would either limit the probes that could be designed, or genomic data would still be needed, limiting probe diversity for some genes. Depending on the genes of interest this limitation may not apply and transcriptome sequencing combined with targeted capture could result in the best of both methods.

In the targeted enrichment experiment I increased probe diversity and tiling for a small set of genes (the opsins) and this resulted in substantially increased recovery and completeness relative to the average for other genes. Unfortunately the experimental design did not allow these two aspects to be decoupled. While I am confident that increased probe diversity had a positive effect, it is unclear what effect, if any, increasing the tiling had. The level of tiling we used for the majority of genes (10x), which was still quite high compared to many targeted capture methods, may not have been necessary. Tiling can account for a large amount of available probe space and thus optimizing it will be important to achieving increased cost (and time) efficiency.

Additional experiments that vary the tiling could help determine the ideal level, which is likely to vary with divergence and other factors. Another aspect that could be modified in the experimental protocol is the hybridization conditions. The manufacturer’s hybridization protocol was used, but modifications to this, including adjustment of temperature and the addition of a second round of hybridization, could improve divergent capture (see Li et al. 2013).
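To make the trade-off between tiling level and probe space concrete, the sketch below generates tiled probes for a single target and shows how the probe count scales with tiling. It is an illustration only: the 120 bp probe length and the definition of tiling as probe length divided by the step between probe starts are assumptions, not details of the probe design used in Chapter 3.

```python
def tile_probes(target, probe_len=120, tiling=10):
    """Return (start, probe) pairs tiled across a target sequence.

    Tiling is treated as probe_len / step, so 10x tiling of 120 bp
    probes places a new probe every 12 bp (both values are assumed).
    """
    if probe_len < 1 or tiling < 1:
        raise ValueError("probe_len and tiling must be positive")
    if len(target) <= probe_len:
        return [(0, target)]  # target too short to tile: one probe
    step = max(1, probe_len // tiling)
    last = len(target) - probe_len
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)  # make sure the 3' end is covered
    return [(s, target[s:s + probe_len]) for s in starts]


# Toy 240 bp target: compare probe counts at two tiling levels.
toy_target = "ACGT" * 60
for level in (2, 10):
    n_probes = len(tile_probes(toy_target, tiling=level))
    print(f"{level}x tiling: {n_probes} probes")
```

Because the probe count grows roughly in proportion to the tiling level, any reduction in tiling that does not hurt recovery frees probe space for additional genes or reference species.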

Finally, in order to improve recovery of divergent genes it may be beneficial to develop a de novo, rather than reference-guided, method of assembly. Since the hybrid enrichment appears to have been highly robust to sequence divergence, a de novo assembly pipeline that removes or reduces the reliance on cross-species assembly could improve capture when such references are unavailable or impractical to produce. A de novo approach would involve first assembling the reads into contigs with a de novo genome or transcriptome assembler. These contigs should be whole or partial exons and are likely to include flanking intron and other noncoding sequence.

To obtain complete coding sequences these contigs would need to be aligned to a reference (e.g., using BLAST), trimmed, and stitched together. A reference is still needed, but alignment of contigs should be much less sensitive to sequence divergence. Instead the reliance will be on the de novo assemblers to properly assemble the reads into complete exons. Our preliminary results suggest that such an approach is not as good as using additional reference sequences, but may offer improvement when only highly divergent references are available.
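The trim-and-stitch step described above can be sketched as follows. This is a simplified illustration, not the pipeline used in this thesis: it assumes each contig has already been aligned to a reference coding sequence (for example with BLAST), that the alignments are gapless, and that coordinates are 0-based with exclusive ends. The `Hit` record and function names are hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One gapless contig-to-reference alignment (hypothetical record)."""
    ref_start: int     # 0-based start on the reference coding sequence
    ref_end: int       # exclusive end on the reference coding sequence
    contig: str        # full contig, possibly with flanking intron
    contig_start: int  # where the aligned region begins in the contig


def stitch_coding_sequence(hits: List[Hit], ref_len: int, gap: str = "N") -> str:
    """Trim each contig to its aligned region and place it on a scaffold."""
    scaffold = [gap] * ref_len
    # Place shorter alignments first so longer ones win where they overlap.
    for hit in sorted(hits, key=lambda h: h.ref_end - h.ref_start):
        span = hit.ref_end - hit.ref_start
        if not (0 <= hit.ref_start < hit.ref_end <= ref_len):
            raise ValueError("hit lies outside the reference")
        aligned = hit.contig[hit.contig_start:hit.contig_start + span]
        if len(aligned) != span:
            raise ValueError("contig is shorter than its aligned span")
        scaffold[hit.ref_start:hit.ref_end] = aligned  # drops flanking sequence
    return "".join(scaffold)


def completeness(sequence: str, gap: str = "N") -> float:
    """Fraction of scaffold positions that were recovered."""
    return 1 - sequence.count(gap) / len(sequence)


# Toy example: lowercase flanks stand in for intron sequence.
toy_hits = [Hit(0, 5, "ttATGGCgg", 2), Hit(8, 12, "aaTTAAcc", 2)]
stitched = stitch_coding_sequence(toy_hits, ref_len=12)
print(stitched, completeness(stitched))
```

A real implementation would also need to handle gapped alignments, reverse-complemented contigs, and frame-preserving exon boundaries, which this sketch leaves out.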

The findings presented in Chapter 4 suggest considerable differences in the function of rod and cone visual transduction proteins that warrant further study. For example, studies have repeatedly found that rod and cone transducin are functionally similar or even equivalent (Deng et al. 2009; Gopalakrishna et al. 2012; Tachibanaki et al. 2012; Mao et al. 2013); however, I found strong evidence for positive selection in snake rod and cone transducin (GNAT1, GNAT2, GNB3) that suggests adaptation and functional divergence. Many other phototransduction genes were also found to be under positive selection and the potential functional effects of this need to be further explored. In general, differences between rod- and cone-specific copies of phototransduction proteins are not well understood and are an area of active research (Kawamura and Tachibanaki 2008; Renninger et al. 2011; Tachibanaki et al. 2012; Mao et al. 2013; Majumder et al. 2015; Orban and Palczewski 2016; Sakurai et al. 2016). How photoreceptor transmutation further affects, and has modified, the function of these proteins could provide substantial insight into the evolution and function of the visual system. Studying the functional difference in phototransduction proteins is typically limited to model systems (such as mice and zebrafish) and thus may prove difficult to explore in snakes; however, the repeated natural experiment provided by photoreceptor transmutation provides a unique opportunity to study the functional evolution of these proteins.

In order to further explore the evolution of phototransduction genes in snakes, as well as the origins of snakes and the effect of photoreceptor transmutation, additional taxon sampling of phototransduction genes is needed. The blindsnakes (Scolecophidia) were almost completely absent from the sampling in Chapter 4. This group, which most recent phylogenetic studies have found not to be monophyletic (see Pyron et al. 2013), is highly specialized for a fossorial lifestyle. Simões et al. (2015) found that, of the visual opsins, blindsnakes possess only RH1. A complete analysis of phototransduction proteins in these snakes will likely be very informative with regard to adaptation to fossoriality and may provide additional insight into the origins of snakes. Sampling of other non-caenophidian snakes was also quite low, with most genes represented only by Python and Boa. This limited the possible analyses that could be performed on snakes. Additional sampling would enable a more detailed study of the evolution of these genes.

The finding in Chapter 5 that rod phototransduction genes are expressed in the eyes of the nocturnal Tokay gecko represents a significant departure from the previous view of photoreceptor transmutation in geckos. The exact location of this expression, and the resulting proteins, has not yet been confirmed. Localization of mRNA expression to the photoreceptors with in situ hybridization or localization of the proteins with immunohistochemistry is a necessary next step. It may also be useful to isolate the retina, and even individual photoreceptor cells, to perform qPCR. Further isolating where rod and cone phototransduction proteins are co-expressed and in which photoreceptor types would substantially increase our understanding of the mechanisms behind photoreceptor transmutation.

Similar to Chapter 4, the small sample of geckos in Chapter 5 limited the possible analyses that could be performed. Whereas I was able to analyze snakes separately from other reptiles, this was not possible with the gecko sampling I had available. Additional sampling of phototransduction genes from geckos is necessary in order to test for positive selection in geckos specifically and to compare selection patterns between diurnal and nocturnal geckos. Sampling of other groups with photoreceptor transmutation, such as night lizards and tuatara, could uncover whether these groups also experienced divergent selection pressure as a result of photoreceptor transmutation similar to geckos and snakes.

Finally, it would be useful to analyze the function of gecko visual pigments through in vitro expression. In order to achieve the high sensitivity needed for dim-light vision, such as the dim-light colour vision nocturnal geckos possess, it is necessary to reduce noise (Field et al. 2005; Pahlberg and Sampath 2011). This is partially achieved through the extremely low thermal activation rate of rhodopsin, which sets the minimum threshold for vision (Baylor et al. 1980; Aho et al. 1988; Holcman and Korenbrot 2005; Gozem et al. 2012; Angueyra and Rieke 2013; Yanagawa et al. 2015). Cone visual pigments have higher thermal activation rates and are thus noisier, preventing reliable vision in dim light (Rieke and Baylor 2000; Sakurai et al. 2007; Fu et al. 2008; Mooney et al. 2015). As I have confirmed in Chapter 5, nocturnal geckos have only cone visual pigments, suggesting that these pigments have likely evolved to be more thermally stable than typical cone pigments in order to achieve dim-light sensitivity. Recently, Kojima et al. (2017) found that the SWS2-based visual pigment expressed in the green rods of frogs achieved RH1-like thermal stability through a single amino acid mutation (Thr47). None of the gecko pigments have this residue, suggesting that increased thermal stability evolved through an alternate mechanism. Other mechanisms to reduce dark noise beyond the visual pigment level may also have evolved and warrant further study, but without increased thermal stability of the cone visual pigments it seems unlikely that nocturnal colour vision would be possible.

Figure 6.1. Multiple sequence alignment of bovine rhodopsin (Bos RH1) with squamate visual opsins, highlighting the area (black box) that the 4D2 antibody binds. This area is highly conserved in RH1s and RH2s suggesting the antibody is likely to bind both opsins.

6.3 References

Aho AC, Donner K, Hyden C, Larsen LO, Reuter T 1988. Low retinal noise in animals with low body temperature allows high visual sensitivity. Nature 334: 348-350.

Angueyra JM, Rieke F 2013. Origin and effect of phototransduction noise in primate cone photoreceptors. Nat Neurosci 16: 1692-1700.

Baylor DA, Matthews G, Yau KW 1980. Two components of electrical dark noise in toad retinal rod outer segments. J Physiol 309: 591-621.

Bennis M, Molday RS, Versaux-Botteri C, Reperant J, Jeanny JC, McDevitt DS 2005. Rhodopsin-like immunoreactivity in the 'all cone' retina of the chameleon (Chameleo chameleo). Exp Eye Res 80: 623-627.

Bi K, Vanderpool D, Singhal S, Linderoth T, Moritz C, Good JM 2012. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales. BMC Genomics 13: 403.

Bowmaker JK 2008. Evolution of vertebrate visual pigments. Vision Res 48: 2022-2041.

Caldwell MW, Lee MSY 1997. A snake with legs from the marine Cretaceous of the Middle East. Nature 386: 705-709.

Cao D, Pokorny J, Smith VC, Zele AJ 2008. Rod contributions to color perception: linear with rod contrast. Vision Res 48: 2586-2592.

Davies WL, Cowing JA, Bowmaker JK, Carvalho LS, Gower DJ, Hunt DM 2009. Shedding light on serpent sight: the visual pigments of henophidian snakes. J Neurosci 29: 7519-7525.

Deng WT, Sakurai K, Liu JW, Dinculescu A, Li J, Pang JJ, Min SH, Chiodo VA, Boye SL, Chang B, Kefalov VJ, Hauswirth WW 2009. Functional interchangeability of rod and cone transducin alpha-subunits. Proc Natl Acad Sci U S A 106: 17681-17686.

Field GD, Sampath AP, Rieke F 2005. Retinal processing near absolute threshold: from behavior to mechanism. Annu Rev Physiol 67: 491-514.

Fu Y, Kefalov V, Luo DG, Xue T, Yau KW 2008. Quantal noise from human red cone pigment. Nat Neurosci 11: 565-571.

Gamble T, Greenbaum E, Jackman TR, Bauer AM 2015. Into the light: diurnality has evolved multiple times in geckos. Biol J Linn Soc 115: 896-910.

Gopalakrishna KN, Boyd KK, Artemyev NO 2012. Comparative analysis of cone and rod transducins using chimeric Galpha subunits. Biochemistry 51: 1617-1624.

Gozem S, Schapiro I, Ferre N, Olivucci M 2012. The Molecular Mechanism of Thermal Noise in Rod Photoreceptors. Science 337: 1225-1228.

Hicks D, Molday RS 1986. Differential immunogold dextran labeling of bovine and frog rod and cone cells using monoclonal-antibodies against bovine rhodopsin. Exp Eye Res 42: 55-71.

Holcman D, Korenbrot JI 2005. The limit of photoreceptor sensitivity: molecular mechanisms of dark noise in retinal cones. J Gen Physiol 125: 641-660.

Jacobs GH, Fenwick JA, Crognale MA, Deegan JF 1992. The all-cone retina of the garter snake - spectral mechanisms and photopigment. J Comp Phys A 170: 701-707.

Kawamura S, Tachibanaki S 2008. Rod and cone photoreceptors: Molecular basis of the difference in their physiology. Comparative Biochemistry and Physiology - A Molecular and Integrative Physiology 150: 369-377.

Kleinschmidt J, Dowling JE 1975. Intracellular-recordings from gecko photoreceptors during light and dark-adaptation. J Gen Physiol 66: 617-648.

Kojima D, Okano T, Fukada Y, Shichida Y, Yoshizawa T, Ebrey TG 1992. Cone visual pigments are present in gecko rod cells. Proc Natl Acad Sci U S A 89: 6841-6845.

Kojima K, Matsutani Y, Yamashita T, Yanagawa M, Imamoto Y, Yamano Y, Wada A, Hisatomi O, Nishikawa K, Sakurai K, Shichida Y 2017. Adaptation of cone pigments found in green rods for scotopic vision through a single amino acid mutation. Proc Natl Acad Sci U S A 114: 5437-5442.

Lee MSY 2005. Molecular evidence and marine snake origins. Biol Lett 1: 227-230.

Lee MSY, Palci A, Jones MEH, Caldwell MW, Holmes JD, Reisz RR 2016. Aquatic adaptations in the four limbs of the snake-like reptile Tetrapodophis from the Lower Cretaceous of Brazil. Cretaceous Res 66: 194-199.

Li CH, Hofreiter M, Straube N, Corrigan S, Naylor GJP 2013. Capturing protein-coding genes across highly divergent species. BioTechniques 54: 321-+.

Longrich NR, Bhullar BA, Gauthier JA 2012. A transitional snake from the Late Cretaceous period of North America. Nature 488: 205-208.

Majumder A, Pahlberg J, Muradov H, Boyd KK, Sampath AP, Artemyev NO 2015. Exchange of Cone for Rod Phosphodiesterase 6 Catalytic Subunits in Rod Photoreceptors Mimics in Part Features of Light Adaptation. J Neurosci 35: 9225-9235.

Mamanova L, Coffey AJ, Scott CE, Kozarewa I, Turner EH, Kumar A, Howard E, Shendure J, Turner DJ 2010. Target-enrichment strategies for next-generation sequencing. Nat Methods 7: 111-118.

Mao W, Miyagishima KJ, Yao Y, Soreghan B, Sampath AP, Chen JE 2013. Functional Comparison of Rod and Cone G alpha(t) on the Regulation of Light Sensitivity. J Biol Chem 288: 5257-5267.

McDevitt DS, Brahma SK, Jeanny JC, Hicks D 1993. Presence and foveal enrichment of rod opsin in the all-cone retina of the american chameleon. Anat Rec 237: 299-307.

McKee SP, McCann JJ, Benton JL 1977. Color-vision from rod and long-wave cone interactions: conditions in which rods contribute to multi-colored images. Vision Res 17: 175-185.

Mooney V, Sekharan S, Liu J, Guo Y, Batista VS, Yan ECY 2015. Kinetics of Thermal Activation of an Ultraviolet Cone Pigment. J Am Chem Soc 137: 307-313.

New ST, Hemmi JM, Kerr GD, Bull CM 2012. Ocular anatomy and retinal photoreceptors in a skink, the sleepy lizard (Tiliqua rugosa). Anat Rec 295: 1727-1735.

Orban T, Palczewski K. 2016. Structure and Function of G-Protein-Coupled Receptor Kinases 1 and 7. In: Gurevich VV, Gurevich EV, Tesmer JJG, editors. G Protein-Coupled Receptor Kinases. New York, NY: Springer New York. p. 25-43.

Pahlberg J, Sampath AP 2011. Visual threshold is set by linear and nonlinear mechanisms in the retina that mitigate noise: how neural circuits in the retina improve the signal-to-noise ratio of the single-photon response. Bioessays 33: 438-447.

Portik DM, Smith LL, Bi K 2016. An evaluation of transcriptome-based exon capture for frog phylogenomics across multiple scales of divergence (Class: Amphibia, Order: Anura). Molecular Ecology Resources 16: 1069-1083.

Pyron RA, Burbrink FT, Wiens JJ 2013. A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evol Biol 13: 93.

Reitner A, Sharpe LT, Zrenner E 1991. Is color-vision possible with only rods and blue-sensitive cones. Nature 352: 798-800.

Renninger SL, Gesemann M, Neuhauss SC 2011. Cone arrestin confers cone vision of high temporal resolution in zebrafish larvae. Eur J Neurosci 33: 658-667.

Rieke F, Baylor DA 2000. Origin and functional impact of dark noise in retinal cones. Neuron 26: 181-186.

Röll B 2000. Gecko vision-visual cells, evolution, and ecological constraints. J Neurocytol 29: 471-484.

Röll B 2001. Multiple origin of diurnality in geckos: evidence from eye lens crystallins. Naturwissenschaften 88: 293-296.

Roth LS, Kelber A 2004. Nocturnal colour vision in geckos. Proc Biol Sci 271 Suppl 6: S485-487.

Sakurai K, Onishi A, Imai H, Chisaka O, Ueda Y, Usukura J, Nakatani K, Shichida Y 2007. Physiological properties of rod photoreceptor cells in green-sensitive cone pigment knock-in mice. J Gen Physiol 130: 21-40.

Sakurai K, Vinberg F, Wang T, Chen J, Kefalov VJ 2016. The Na(+)/Ca(2+), K(+) exchanger 2 modulates mammalian cone phototransduction. Sci Rep 6: 32521.

Shen XX, Liang D, Feng YJ, Chen MY, Zhang P 2013. A Versatile and Highly Efficient Toolkit Including 102 Nuclear Markers for Vertebrate Phylogenomics, Tested by Resolving the Higher Level Relationships of the Caudata. Mol Biol Evol 30: 2235-2248.

Sillman AJ, Carver JK, Loew ER 1999. The photoreceptors and visual pigments in the retina of a boid snake, the ball python (Python regius). J Exp Biol 202: 1931-1938.

Sillman AJ, Govardovskii VI, Rohlich P, Southard JA, Loew ER 1997. The photoreceptors and visual pigments of the garter snake (Thamnophis sirtalis): a microspectrophotometric, scanning electron microscopic and immunocytochemical study. J Comp Phys A 181: 89-101.

Sillman AJ, Johnson JL, Loew ER 2001. Retinal photoreceptors and visual pigments in Boa constrictor imperator. J Exp Zool 290: 359-365.

Simões BF, Sampaio FL, Jared C, Antoniazzi MM, Loew ER, Bowmaker JK, Rodriguez A, Hart NS, Hunt DM, Partridge JC, Gower DJ 2015. Visual system evolution and the nature of the ancestral snake. J Evol Biol 28: 1309-1320.

Simões BF, Sampaio FL, Loew ER, Sanders KL, Fisher RN, Hart NS, Hunt DM, Partridge JC, Gower DJ 2016. Multiple rod-cone and cone-rod photoreceptor transmutations in snakes: evidence from visual opsin gene expression. Proc Biol Sci 283.

Tachibanaki S, Yonetsu SI, Fukaya S, Koshitani Y, Kawamura S 2012. Low Activation and Fast Inactivation of Transducin in Carp Cones. J Biol Chem 287: 41186-41194.

Tansley K 1964. The gecko retina. Vision Res 4: 33-37.

Underwood G 1967. A comprehensive approach to the classification of higher snakes. Herpetologica 23: 161-168.

Underwood G. 1970. The Eye. In: Gans C, editor. Biology of the Reptilia. New York: Academic Press. p. 1-97.

Walls GL. 1942. The vertebrate eye and its adaptive radiation. Bloomfield Hills, MI: Cranbrook Institute of Science.

Wang Z, Gerstein M, Snyder M 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63.

Yanagawa M, Kojima K, Yamashita T, Imamoto Y, Matsuyama T, Nakanishi K, Yamano Y, Wada A, Sako Y, Shichida Y 2015. Origin of the low thermal isomerization rate of rhodopsin chromophore. Sci Rep 5: 11081.

Yi H, Norell MA 2015. The burrowing origin of modern snakes. Science advances 1: e1500743.

Zhang X, Wensel TG, Yuan C 2006. Tokay gecko photoreceptors achieve rod-like physiology with cone-like proteins. Photochem Photobiol 82: 1452-1460.