Phylogeny and Molecular Evolution of the Voltage-Gated Sodium Channel Gene scn4aa in the Electric Fish Genus Gymnotus

by

Dawn Dong-yi Xiao

A thesis submitted in conformity with the requirements for the degree of Master of Science

Cell and Systems Biology
University of Toronto

© Copyright by Dawn Dong-yi Xiao 2014

Abstract

Analyses of the evolution and function of voltage-gated sodium channel proteins (Navs) have largely been limited to mutations from individual people with diagnosed neuromuscular disease.

This project investigates the carboxyl-terminus of the Nav paralog (locus scn4aa 3’) that is preferentially expressed in electric organs of Neotropical weakly-electric fishes (Order Gymnotiformes). As a model system, I used the genus Gymnotus, a diverse clade of fishes that produce species-specific electric organ discharges (EODs). I clarified evolutionary relationships among Gymnotus species using mitochondrial (cytochrome b and 16S ribosomal RNA) and nuclear (rag2 and scn4aa) gene sequences (3739 nucleotide positions from 28 Gymnotus species). I analyzed the molecular evolution of scn4aa 3’, and detected evidence for positive selection at eight amino acid sites in seven Gymnotus lineages. These eight amino acid sites are located in motifs that may be important for modulation of EOD frequencies.

Acknowledgments

This project would not have been possible were it not for my supervisor, Dr. Nathan Lovejoy, who provided me with the opportunity to work on this project and gave me the freedom to take initiative. I am thankful for the support of my supervisory committee members, Dr. Asher Cutter and Dr. Mark Fitzpatrick. I am also indebted to Ian Buglass, for sharing a positive outlook and encouragement.

I am grateful for the role that several people played in enhancing the content of my thesis, and the role my supervisor played in facilitating these opportunities. Thanks to Hermina Ghenu, for taking me through my first RNA extraction and cDNA amplification. I might still have your “1 free PCR” coupon among my lab notes somewhere! Thanks to Dr. Belinda Chang, for introducing me to the world of molecular evolution. Thanks to Mu-Quing Huang, not only for providing those gene sequences that I obtained from lab records, but more importantly, for providing additional perspectives on data formatting during the time we worked together.

Special thanks to Dr. Ari Chow, who not only shared tips on primer design, but inspired me to cultivate perseverance and uphold scientific integrity. Special thanks to Dr. Shelley Brunt, who not only provided me prompt advice on high throughput PCR techniques, but helped instill critical thinking skills in myself and countless other students. I also wish to thank my family, friends, and colleagues for their continued support, encouragement, and especially for sharing advice from their graduate school experiences.

This project was funded by grants awarded to me from Sigma Xi, The Scientific Research Society (Grant-in-Aid of Research, in spring 2009) and the Society of Systematic Biologists (Graduate Student Research Award, in summer 2009). Thank you for taking a chance on me! Funding was also provided through an NSERC discovery grant to Dr. Nathan Lovejoy, and various grants & TA-ships from the University of Toronto.

Table of Contents

Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices

Chapter 1: Introduction
1.1 Overview
1.2 Clades of Electrogenic Fish
1.3 Phylogeny, Biogeography, and Morphology of Gymnotiformes
1.4 Phylogeny, Biogeography, and Morphology of Gymnotus
1.5 Evolutionary Adaptations of Electric Organ Discharges in Neotropical Knifefishes
1.6 Anatomy and Neuronal Control of Electric Organs
1.7 Cellular Features of Electrocytes and Molecular Basis of Membrane Excitability
1.8 Genetic Evolution and Protein Expression of Voltage-Gated Sodium Channels
1.9 Molecular Features and Mechanisms of Voltage-Gated Sodium Channels
1.10 Significance and Objectives

Chapter 2: Materials and Methods
2.1 Taxon Sampling
2.2 Locus Selection
2.3 Primer Design
2.3.1 Amplification Primers for scn4aa 3’
2.3.2 Sequencing Primers
2.4 DNA and RNA Extraction
2.5 Nucleotide Amplification and Sequencing
2.6 Nucleotide Sequence Verification and Alignment
2.7 Phylogenetic Reconstruction
2.8 Molecular Evolution Analyses

Chapter 3: Results
3.1 Differences Between DNA and cDNA Sequences for the scn4aa 3’
3.2 Nucleotide Sequence Data
3.3 Phylogenetic Reconstruction
3.4 Patterns of Gymnotus scn4aa C-terminus Nucleotide Sequence Variation
3.5 Positively Selected Sites on the Gymnotus Nav1.4a C-terminus Amino Acid Alignment

Chapter 4: Discussion
4.1 Evolutionary Relationships Among Gymnotus
4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction
4.3 Natural Selection at the Nav1.4a C-terminus Among Gymnotus Lineages
4.4 Natural Selection at Specific Sites of the Nav1.4a C-terminus Among Gymnotus
4.5 Summary and Future Directions

References

List of Tables

Table 1. Primer Sequences
Table 2. cDNA Sequences Used for scn4aa 3’ Primer Design
Table 3. Specimens and Nucleotide Sequences Used for Gymnotus Analysis
Table 4. Models of Evolution Analyzed for the Gymnotus Nav1.4a C-terminus
Table 5. Patterns of Gymnotus Nav1.4a C-terminus Nucleotide Sequence Variation
Table 6. Nav1.4a C-terminus ω ratios for Gymnotus from the branch-site model A
Table 7. Amino Acid Alignment of Nav1.4a C-terminus Showing Positively Selected Sites Relative to Motifs of Functional Significance
Table 8. Amino Acid Identities of Positively Selected Sites on the Nav1.4a C-terminus for Various Gymnotus Species

List of Figures

Figure 1. Evolutionary Relationships Among Electrogenic Fishes and their Voltage-Gated Sodium Channel Paralogs
Figure 2. Published Phylogenies of Gymnotiformes
Figure 3. Published Phylogenies of Gymnotus
Figure 4. Examples of Electric Organ Discharges from Gymnotiformes
Figure 5. Schematic of Voltage-Gated Sodium Channel Motifs
Figure 6. Molecular Phylogeny of Gymnotus Based on Various Alignments Using Maximum Parsimony
Figure 7. Molecular Phylogeny of Gymnotus Based on Various Alignments Using Bayesian Inference
Figure 8. Molecular Phylogeny of Gymnotus Based on the Total Evidence Alignment
Figure 9. Molecular Phylogeny of Gymnotus and Positively Selected Lineages

List of Appendices

Appendix A.0: Phylogeny and Molecular Evolution of the Voltage-Gated Sodium Channel Gene scn4aa in the Electric Fish Order Gymnotiformes
A.0.1 Abstract

Appendix A.1: Introduction
A.1.1 Significance and Objectives

Appendix A.2: Materials and Methods
A.2.1 Taxon Sampling
A.2.2 Locus and Primer Selection
Appendix A Table 1. Primer Sequences
A.2.3 DNA Extraction, Nucleotide Amplification, and Sequencing
A.2.4 Nucleotide Sequence Verification and Alignment
A.2.5 Phylogenetic Reconstruction
Appendix A Table 2. Specimens and Nucleotide Sequences Used for Gymnotiformes Analysis

Appendix A.3: Results
A.3.1 Nucleotide Sequence Data
A.3.2 Phylogenetic Reconstruction
Appendix A Figure 1. Molecular Phylogeny of Gymnotiformes Based on the Cytb Nucleotide Alignment Using Maximum Parsimony
Appendix A Figure 2. Molecular Phylogeny of Gymnotiformes Based on the Rag2 Nucleotide Alignment Using Maximum Parsimony
Appendix A Figure 3. Molecular Phylogeny of Gymnotiformes Based on the scn4aa 3’ Nucleotide Alignment Using Maximum Parsimony
Appendix A Figure 4. Molecular Phylogeny of Gymnotiformes Based on the Total Evidence Alignment
A.3.3 Variation in the Nav1.4a C-terminus

Appendix A.4: Discussion
A.4.1 Gymnotiform Phylogeny
A.4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction
A.4.3 Variation at the Nav1.4a C-terminus
A.4.4 Summary and Future Directions

Appendix A.5: References

Chapter 1 Introduction

1.1 Overview

Fishes are among the most diverse of vertebrates. Among the 60,000 described species of vertebrates, half are fishes (Froese and Pauly 2012). Many clades of fishes are able to detect electric fields in the water (electroreception). Some of them are also able to produce electric fields (electrogenesis; Moller 1995; Maddison and Schulz 2007; Alves-Gomes 2001).

Electroreception can be used by aquatic organisms to detect electrical fields that are produced as a byproduct of the muscle movement of predators and prey (Bedore and Kajiura 2013). It may also be used in conjunction with the production of weak electric discharges (< 10 V) for electrolocation and communication with conspecifics. Strong electric discharges (up to 600 V) may be used to stun prey (Crampton and Albert 2006). In many electrogenic clades, patterns of electric discharges are species-specific, and vary based on variations in the environment, anatomy, and molecular features.

Electrogenic fishes, especially Electrophorus electricus, are a classical model system for studying the highly conserved mechanisms of membrane excitability (Gotter et al. 1998; Keesey 2005; Albert et al. 2008). The electrogenic cells (electrocytes) in these fishes do not need to serve additional functions such as contraction or secretion, unlike other electrically excitable cells (myocytes, neuroendocrine cells, etc). However, they do share key features with other electrically excitable cells.

Voltage-gated sodium channels (Navs) are one of the main proteins supporting action potentials. When they were discovered, the first homolog to be sequenced was from E. electricus (Agnew 1984; Catterall 1984; Noda et al. 1984). Navs are the targets of various naturally occurring toxins and synthetic drugs (Catterall et al. 2005). Navs are associated with mutations that cause several human skeletal, cardiac, and neuronal diseases, which severely impact the quality of life (Lehmann-Horn and Jurkat-Rott 1999). An estimated 1 in 3500 people worldwide will be affected by a neuromuscular disorder at some point in their lives (Emery 1991).

Conserved genes involved in electrogenicity, such as the Navs, may contribute towards reconstruction of phylogenetic relationships among electrogenic fishes. These fishes are a natural source of variations in electric signals. Analyses of the evolutionary history of these molecules in electrogenic fishes may contribute towards further understanding of molecular mechanisms for membrane excitability (Zakon et al. 2006; Arnegard et al. 2010). These analyses may also contribute towards understanding the evolutionary history of electrogenicity in these fishes.

1.2 Clades of Electrogenic Fish

Some clades of electrogenic fishes inhabit saltwater (Figure 1). The weakly electrogenic skates (genus Raja) and strongly electrogenic electric rays (order Torpediniformes) belong to the class of cartilaginous fishes (class Chondrichthyes). The strongly electrogenic stargazers (family Uranoscopidae) belong to the class of ray-finned fishes (class Actinopterygii).

Several clades of electrogenic fishes inhabit freshwater (Figure 1), all of which belong to the class of ray-finned fishes (class Actinopterygii). The weakly electrogenic African knifefishes (Gymnarchus niloticus) and weakly electrogenic African elephantfishes (family Mormyridae) belong to the order of bony-tongued fishes (order Osteoglossiformes). The weakly electrogenic African catfishes (genera Auchenoglanis, Clarias, and Synodontis) and strongly electrogenic African catfishes (genus Malapterurus) belong to the superorder of Ostariophysi fishes. The weakly electrogenic Neotropical American knifefishes (order Gymnotiformes), and strongly electrogenic South American electric eel (E. electricus in order Gymnotiformes) also belong to the superorder of Ostariophysi fishes.

Figure 1. Evolutionary Relationships Among Electrogenic Fishes and their Voltage-Gated Sodium Channel Genes. Phylogenetic topology of Gnathostoma (jawed vertebrates), illustrating evolutionary relationships among electrogenic fishes and other key clades (Moller 1995; Maddison and Schulz 2007). Branch length is not to scale. Electrogenic fishes are coloured as follows: saltwater (green); freshwater (blue). Voltage-gated sodium channel genes associated with major clades (Gnathostoma, Teleostei, and Tetrapoda) are listed in boxes (Goldin et al. 2000; Widmark et al. 2011).

1.3 Phylogeny, Biogeography, and Morphology of Gymnotiformes

Among the clades of electrogenic fish, Gymnotiformes was selected as the focus of this project, for the following reasons: 1) the order Gymnotiformes is one of the most diverse clades of electrogenic fish, with approximately 200 described species (Froese and Pauly 2012); 2) phylogenetic analyses of Gymnotiformes are relatively well developed (Alves-Gomes 1999); 3) Gymnotiformes are phylogenetically close to a model species for which the genome has been sequenced (the zebrafish Danio rerio); 4) a classic model species for electrogenic properties is a gymnotiform (the electric eel E. electricus; Keesey 2005; Albert et al. 2008); and 5) there is active research on variations in electric field pattern among gymnotiform species, and mechanisms of their electric signal production (Crampton et al. 2011).

Within the superorder Ostariophysi, the most basal extant order is the monophyletic, saltwater-inhabiting Gonorhynchiformes (Figure 1). The other extant orders are monophyletic, and represent over 90% of the earth's freshwater fishes (Saitoh et al. 2003). These include the Characiformes (includes tetras and piranhas), Cypriniformes (such as D. rerio), Gymnotiformes (electrogenic fishes, such as E. electricus), and Siluriformes (catfishes). Based on morphological data, Siluriformes is the sister order of Gymnotiformes (Fink and Fink 1981). However, based on nucleotide data, Siluriformes is the sister order of Gymnotiformes and Characiformes (Saitoh et al. 2003).

Within the order Gymnotiformes, there are two main pairs of sister families (Figure 2; Froese and Pauly 2012): Electrophoridae (1 described species – E. electricus) and Gymnotidae (38 described species); and Hypopomidae (25 described species) and Rhamphichthyidae (16 described species). Other families include Apteronotidae (85 described species) and Sternopygidae (30 described species). It is not clear which family is the most basal (Figure 2).

Gymnotiforms are adapted to the lowland freshwaters of the Neotropics (Central and South America), with wide geographical distributions (Crampton and Albert 2006). They occur in various zones of the water column (benthic to epipelagic). Their ecological habitats (forest streams, floodplains, deep fast-flowing rivers) are often turbid (Lissmann 1958).

[Figure 2 panels, left to right: based on morphological data (Ellis 1913); based on morphological data (Triques 1993; Gayet et al. 1994); based on morphological data (Mago-Leccia 1994); based on strict consensus from mitochondrial nucleotide data (Alves-Gomes 1995); based on morphological data (Albert 2001)]

Figure 2. Published Phylogenies for Gymnotiformes. Phylogenetic topologies for Gymnotiformes based on morphological and nucleotide data from published sources. The families are coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).

Gymnotiformes generally have subcutaneous eyes (often with poor sight), short bodies, and lengthy tapering tails (Albert and Lundberg 1995). They have no pelvic, dorsal, or adipose fins. However, they do have lengthy anal fins that undulate for locomotion, while their tail stays rigid to facilitate their electroreceptive and electrogenic capabilities.

1.4 Phylogeny, Biogeography, and Morphology of Gymnotus

Among the families of Gymnotiformes, Gymnotidae was selected as the focus of this project, for the following reasons: 1) among the families with myogenic electric organs in adulthood, Gymnotidae is the most diverse; 2) phylogenetic studies of Gymnotidae based on morphology and nucleotide sequences exist for comparison (Albert et al. 2005; Lovejoy et al. 2010); and 3) there is active research on variations in electric field pattern among gymnotid species, and mechanisms of their electric signal production (Crampton et al. 2011).

In some classifications, the family Gymnotidae only contains the genus Gymnotus. In other classifications, the family Gymnotidae also includes the sole species from the sister family Electrophoridae (E. electricus). For clarity, this project will focus on the genus Gymnotus within family Gymnotidae.

Within the genus Gymnotus, the Gymnotus carapo group is a diverse monophyletic clade (Figure 3). Phylogenetic topology among the remaining Gymnotus species based on nucleotide data differs from that based on morphological data.

Gymnotus fishes are the most geographically widespread of gymnotiforms (Albert et al. 2005). Their range extends from as far north as southeastern Chiapas, Mexico (18° N), to as far south as Rio Salado in the Pampas plains of Argentina (36° S). They occur in all the major river systems of the Neotropics except for the estuarine Maracaibo Basin.

Gymnotus fishes are sometimes referred to as banded knifefishes, since many of the species have obliquely arranged dark and light coloured bands along their body (Albert et al. 2005). Gymnotus fishes have superior mouths with protruding lower jaws, and gapes that are large for gymnotiforms (Albert and Lundberg 1995; Albert et al. 2005).

[Figure 3 panels, left to right: based on morphological data (Albert et al. 2005); based on strict consensus of mitochondrial and nuclear nucleotide data (after Figure 4 from Lovejoy et al. 2010)]

Figure 3. Published Phylogenies for Gymnotus. Phylogenetic topologies for Gymnotus based on morphological and nucleotide data were obtained from published sources. The clades are coloured as follows: G. carapo group (green); G. pantherinus group (violet); G. cylindricus group (red); G2 group (light blue); and G1 group (dark blue).

1.5 Evolutionary Adaptations of Electric Organ Discharges in Neotropical Knifefishes

Gymnotiformes produce electric discharges using myogenic electric organs (EOs) derived from hypaxial muscles (Albert and Lundberg 1995; Zakon and Unguez 1999; Crampton and Albert 2006). In most genera (within the families Electrophoridae, Hypopomidae, Rhamphichthyidae, and Sternopygidae), there are species with additional accessory electric organs. In one family (Apteronotidae), myogenic electric organs are replaced by neurogenic ones (derived from motor neurons) during the first two months of life.

There are interspecific variations in electric organ discharge (EOD) frequencies and waveform complexities (Crampton and Albert 2006; Figure 4). Species-specific EODs may be produced in short pulses with frequencies up to 150 Hz (as short as ~ 7 ms between pulses in families Electrophoridae, Gymnotidae, Rhamphichthyidae, and Hypopomidae) or continuous waves with frequencies up to 2500 Hz (families Apteronotidae and Sternopygidae). In Gymnotus, EODs are typically produced in pulses lasting 1-3 ms each, with frequencies of 15-70 Hz (equivalent to ~ 14-67 ms between pulses). The number of phases within each pulse is also species-specific, with 3-4 being the most common (Crampton and Albert 2006). Other examples of EOD variations include the low frequency (~ 10 Hz) monophasic pulses of E. electricus, low frequency (~ 10-100 Hz) multiphasic pulses among Brachyhypopomus, low frequency (~ 30-150 Hz) monophasic waves among Sternopygus, and high frequency (900-1100 Hz) multiphasic waves among Apteronotus.
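The pulse rates and inter-pulse intervals quoted above are related by a simple reciprocal. The following minimal Python sketch reproduces the quoted intervals; the function name is illustrative only and is not part of any analysis in this thesis.

```python
def inter_pulse_interval_ms(frequency_hz: float) -> float:
    """Return the time between successive pulses, in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # 1 second = 1000 ms, so the period in ms is 1000 / frequency
    return 1000.0 / frequency_hz


# Pulse rates quoted in the text (Hz)
for f in (150, 70, 15):
    print(f"{f} Hz -> {inter_pulse_interval_ms(f):.1f} ms between pulses")
```

Run as written, this gives roughly 6.7 ms at 150 Hz, 14.3 ms at 70 Hz, and 66.7 ms at 15 Hz, in line with the approximate values of ~ 7 ms and ~ 14-67 ms given above.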

Gymnotiforms are electroreceptive using two types of morphologically distinct electroreceptors (Bullock 1982; Alves-Gomes 2001): ampullary electroreceptors for low-frequency direct current (DC) signals (0.1-50 Hz); and tuberous electroreceptors for high-frequency alternating current (AC) signals (50-2000 Hz). They are also electrogenic using a variety of species-specific EOD patterns.

Figure 4. Examples of Electric Organ Discharges from Gymnotiformes. Traces of electric organ discharges scaled to the same peak-to-peak amplitude and plotted head-positive-up on the same time scale; dotted line represents 0 voltage baseline (from Arnegard et al. 2010).

Gymnotiform ampullary electroreceptors are used for passive electrolocation. The electroreceptors are likely tuned to the inadvertent electric signals from their prey's movements (Collin and Whitehead 2004). Gymnotiforms are nocturnally active predators of aquatic invertebrates (Winemiller and Adite 1997; Crampton and Albert 2006). Some gymnotiforms also feed on terrestrial arthropods, shrimp, and small fish. Gymnotus fishes are also aggressive predators of fishes and other aquatic (Albert et al. 2005).

Gymnotiform tuberous electroreceptors are used for active electrolocation. The electroreceptors are tuned to the self-generated EODs, and interpret disturbances of the electric field to navigate their habitat (Hopkins 1988).

Abiotic environmental conditions seem to constrain and correlate with certain EOD characteristics (Stoddard 2002). Capacitive elements such as dense vegetation attenuate lower frequencies, and may favour higher frequency EODs (von der Emde 1990). Energy constraints limit the anatomy of the electric organs in terms of the number of columns of electrocytes and the number of electrocytes per column, for optimal impedance-matching: species that inhabit waters with higher conductivity tend to have EOs with more columns with fewer electrocytes each (Hopkins 1999). Energy constraints in low oxygen habitats may have favoured pulse-type EODs and other adaptations, such as aerial respiration among Gymnotus fishes (Crampton 1998; Crampton and Albert 2006). Temperature fluctuations may also have favoured pulse-type EODs, while fast-flowing water may have favoured wave-type EODs (Stoddard 2002).

Biotic evolutionary pressures also correlate with certain EOD characteristics. Gymnotiforms coexist in polyphyletic species assemblages with piscivorous siluriforms and other gymnotiforms (Crampton and Albert 2006). Gymnotiforms' predators include siluriforms and Potamotrygonidae (river stingray family within order Rajiformes) with ampullary electroreceptors, as well as some other gymnotiforms such as E. electricus (Szabo et al. 1972; Szamier and Bennett 1980; Lovejoy 1996; Stoddard 1999; Alves-Gomes 2001; Stoddard 2002). Siluriforms generally do not have tuberous electroreceptors, with the possible exception of the family Cetopsidae (Alves-Gomes 2001). Predation avoidance may have favoured higher frequency, lower magnitude, and more complex EOD waveforms. It may have favoured lack of DC content among wave-type EODs, and existence of occasional silence among pulse-type EODs (Stoddard 1999; Alves-Gomes 2001; Stoddard 2002). It may also have favoured androgen-induced handicaps in males through sexual selection (Hopkins 1988; Hopkins et al. 1990; Stoddard 1999; Stoddard 2003; Stoddard 2006; Zahavi 2003).

1.6 Anatomy and Neuronal Control of Electric Organs

Electric organs are typically located immediately above and along the anal fin musculature. Within electric organs, tubes of connective tissue are arranged one above the other in the dorsal-ventral plane. Within these tubes, electrocytes are arranged midway within stacked compartments divided by connective tissue septa (Bennett and Grundfest 1959). Variation in the number, size, and shape of electrocytes is associated with variations in the electric organ discharge (EOD) amplitude (Caputi 1999).

A lattice hierarchy of neurons innervates electrocytes (Lorenzo et al. 1993; Caputi 1999). The EOD frequency is synchronized by pacemaker cells in the medulla, providing input to a group of 70-90 relay neurons at the ventral surface of the medulla. Relay neuron processes extend along the bulbospinal tract to provide input to electromotor neurons, which provide input to electrocytes. In Gymnotidae, axons of the relay neurons vary in length and conduction velocity. Relay neurons with slower fibres primarily project onto rostral electromotor neurons, while those with faster fibres primarily project onto caudal ones.

In monophasic fish, electromotor neurons only innervate the rostral or caudal face. In multiphasic fish, there are two morphologically distinct types of electromotor neurons. Small (25-40 μm) round neurons with fine dendrites lacking spines innervate the rostral face of rostral electrocytes. Large (45-60 μm) oval neurons with thick dendrites up to 200 μm long innervate the caudal face of caudal electrocytes. Both small and large electromotor neurons innervate electrocytes in the mid-section of the electric organ, on the electrocytes' rostral and caudal faces, respectively. The earlier portions of the EOD waveform are produced by smaller neurons recruiting a small number of electrocytes, according to Henneman's size principle (Henneman 1957). Variation in innervation patterns of electrocytes is associated with variations in the EOD amplitude and waveform.

1.7 Cellular Features of Electrocytes and Molecular Basis of Membrane Excitability

Electrocytes are multinucleated cells with similar cellular and molecular features to myocytes in striated muscle (Machado et al. 1976; Machado et al. 1980; Yablonka-Reuveni 2011). In Electrophorus electricus, both the innervated plasma membrane and the opposing non-innervated plasma membrane are undulated, the latter more so. This provides an increased surface area on which the abundant macromolecules associated with electroexcitability are anchored. The majority of organelles and glycogen granules are located near these undulating plasma membranes to support and provide energy for electric organ discharge (EOD) production (Gotter et al. 1998; Machado et al. 1976; Williamson et al. 1967). Binding sites for calcium (a ubiquitous signalling molecule) are also located near these undulating plasma membranes (de Arujo Jorge et al. 1979). A loose filamentous network consisting mostly of microtubules, actin, and desmin (which is characteristic of myocytes) maintains the cell morphology and macromolecule localization (Benchimol et al. 1978; Gotter et al. 1998; Mermelstein et al. 2000).

The cells' resting potential is mainly due to K+ ions (Lester 1978), though contribution from Cl- has not been ruled out (Ferrari and Zakon 1993). The potential of approximately -85 mV across each face (Keynes and Martins-Ferreira 1953) is within 10-15 mV of that in neurons and myocytes (Hopkins 2006), and is similarly maintained by an abundance of Na+/K+ ATPase ion pumps moving 3 Na+ out for every 2 K+ in (Morth et al. 2011). These ion pumps are also concentrated at the undulating plasma membranes, especially the non-innervated membrane (Solmó et al. 1977; Ariyasu et al. 1987).

The cells' action potentials are mainly due to cholinergic synapses and voltage-gated ion channels at the innervated plasma membranes (Gotter et al. 1998). These action potentials occur in a series of events similar to that in neurons and myocytes (Hodgkin et al. 1952; Gotter et al. 1998; Ruff 2003). When acetylcholine from the innervating motor neurons binds to the nicotinic acetylcholine receptor ion channels in the innervated plasma membrane, the ion channels change conformation, allowing Na+ and K+ to flow down their electrochemical gradients (Heidmann and Changeux 1978). If the cells' membrane potential depolarizes (becomes more neutral or positive) by approximately 10-15 mV (Hodgkin et al. 1952; Keynes and Martins-Ferreira 1953), an action potential will be triggered.

The voltage-gated ion channels that contribute to an action potential are mainly sodium channels (Navs) that facilitate Na+ influx, though potassium channels that facilitate K+ efflux are also thought to contribute to repolarization of the cells (Nakamura et al. 1965; Ferrari and Zakon 1993), and calcium channels have been hypothesized to facilitate influx of Ca2+ (Gotter et al. 1998). The Navs are concentrated at the innervated undulating plasma membranes (Ellisman and Levinson 1982; Fritz and Brockes 1983).

When an action potential is triggered, Navs undergo fast activation (typically < 1 ms), and allow Na+ to flow down its chemical gradient into the cells (Hodgkin et al. 1952; Ulbricht 2005).

When the peak potential is almost reached, Navs undergo fast inactivation (typically < 1 ms), and prevent more Na+ from flowing in. The peak potential of the innervated plasma membrane of E. electricus electrocytes is approximately 65 mV (Keynes and Martins-Ferreira 1953), compared with 45 mV and 30 mV in the squid giant axon (Hodgkin et al. 1952) and skeletal muscle (Hopkins 2006), respectively. After the peak potential has been reached, the cells typically enter a refractory period, during which the cells' potentials repolarize and Navs recover back to their resting states (Hodgkin et al. 1952).
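As a rough illustration of how the potentials quoted in this section fit together, the sketch below computes the approximate firing threshold and spike amplitude of an E. electricus electrocyte from the resting potential (about -85 mV), the depolarization needed to trigger an action potential (about 10-15 mV), and the peak potential (about 65 mV). It assumes that all three values are measured against the same reference; the constant and function names are illustrative only.

```python
# Approximate values quoted in the text for E. electricus electrocytes (mV)
RESTING_MV = -85.0
PEAK_MV = 65.0
TRIGGER_DEPOLARIZATION_MV = (10.0, 15.0)  # depolarization that triggers a spike


def threshold_range_mv(resting_mv, depolarization_mv):
    """Membrane potentials at which an action potential would be triggered."""
    low, high = depolarization_mv
    return (resting_mv + low, resting_mv + high)


def spike_amplitude_mv(resting_mv, peak_mv):
    """Total excursion of the membrane potential from rest to peak."""
    return peak_mv - resting_mv


print(threshold_range_mv(RESTING_MV, TRIGGER_DEPOLARIZATION_MV))
print(spike_amplitude_mv(RESTING_MV, PEAK_MV))
```

Under these assumptions the threshold falls between about -75 mV and -70 mV, and the excursion from rest to peak is about 150 mV.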

Recovery typically proceeds on the order of milliseconds (Ulbricht 2005). If there is prolonged or repeated depolarization, slow inactivation may occur, where recovery proceeds on the order of seconds to minutes (Ulbricht 2005). If during recovery from inactivation (when the cell is almost repolarized), there is a brief resurgent current of Na+ flowing in, then recovery may proceed faster (Rose 2007; Cannon and Bean 2010).

Action potentials are highly conserved across cell types and taxa, although their specific characteristics vary. Variation in the amplitude and time-course of synchronously triggered action potentials is associated with variations in the EOD waveform, amplitude, and frequency (Bennett 1961; Mills and Zakon 1987).

1.8 Genetic Evolution and Protein Expression of Voltage-Gated Sodium Channels

Voltage-gated ion channels are one of the largest superfamilies of signal transduction proteins, and among the most common drug targets. They are encoded by homologous genes, and are structurally conserved (Yu et al. 2005; Charalambous and Wallace 2011). Functional elements of this superfamily are ion conductance, pore gating, and regulation. Members of this superfamily include voltage-gated potassium, calcium, and sodium channels.

Among jawed vertebrates (infraphylum Gnathostoma), voltage-gated sodium channels (Navs) are encoded by a family of paralogous genes belonging to four monophyletic lineages (Lopreato et al. 2001; Figure 1). After the divergence of terrestrial vertebrates (superclass Tetrapoda) and most living ray-finned fishes (infraclass Teleostei), tandem duplications in Tetrapoda increased the number of paralogs in two of the four lineages to a total of ten, while whole genome duplication in Teleostei doubled the number of paralogs to eight (Lopreato et al. 2001; Goldin 2002; Novak et al. 2006; Widmark et al. 2011; Figure 1). The protein structure and functional elements are conserved among paralogs, especially within each of the four lineages of Navs. However, they are even more conserved among orthologs across species (Catterall et al. 2005).

Navs in myogenic tissue are generally encoded by a single gene (scn4a) in Tetrapoda, while there are two paralogs (scn4aa and scn4ab) in Actinopterygii (Goldin et al. 2000; Goldin 2002; Novak et al. 2006; Widmark et al. 2011). Gene duplication has allowed for the evolution of gene-specific expression patterns and electrical characteristics (Lynch et al. 2001; Goldin et al. 2002; Novak et al. 2006; Widmark et al. 2011). Non-electrogenic Actinopterygii express both scn4aa and scn4ab in myocytes. In electrogenic fish with myogenic electric organs, scn4aa expression is absent from myocytes and instead occurs preferentially in electrocytes (Noda et al. 1984; Agnew et al. 1978; Zakon et al. 2006; Arnegard et al. 2010). Nav paralogs (α subunits) are often associated with auxiliary β subunits, which are involved in channel localization and functional modulation. However, α subunits are sufficient for functional expression (Catterall et al. 2005).

1.9 Molecular Features and Mechanisms of Voltage-Gated Sodium Channels

Voltage-gated sodium channels (Navs) consist of approximately 1000 amino acids, with a molecular weight of ~230 kDa prior to post-translational modification (Noda et al. 1984; Cohen and Levitt 1993). The channels are structured into four homologous domains DI-IV, each with six transmembrane segments S1-6, and oriented with the amino-terminus (N-terminus) and carboxyl-terminus (C-terminus) on the intracellular side (Noda et al. 1984; Gordon et al. 1987; Gordon et al. 1988; Catterall et al. 2005; Figure 5). The extracellular loops and transmembrane segments are highly conserved among the Nav family, with >50% amino acid sequence similarity (Catterall et al. 2005). The voltage-sensing domain (VSD) includes transmembrane segments S1-4 from each of DI-IV. The pore module (PM) includes transmembrane segments S5-6 from each of DI-IV, and forms an extracellular funnel, selectivity filter, central cavity, and activation gate (Payandeh et al. 2011; Zhang et al. 2012). The C-terminus consists of almost 300 amino acids (Noda et al. 1984), and includes several motifs: a flexible linker joining DIVS6; an EF-hand motif; an IQ motif; and a PY motif (Cormier et al. 2002; Chagot et al. 2009).

The N-terminus may include conserved amino acid sequences for membrane localization (Catterall et al. 2005; Eijkelkamp et al. 2012). The extracellular loops include specific amino acids for modulation of surface charge by glycosylation, which increases the total molecular mass by 13 kDa to 60 kDa (Levinson et al. 1986; Schmidt and Catterall 1987; Cohen and Levitt 1993; Liu et al. 2012). The N-terminus, intracellular inter-domain linkers (especially DI-II), and C-terminus include paralog-specific phosphorylation sites for functional modulation (Emerick et al. 1993; Cantrell and Catterall 2001; Scheuer 2010). The DII-III linker (Fache et al. 2004) and the paralog-specific PY motif at the C-terminus (Fotia et al. 2004; Rougier et al. 2005) are associated with protein internalization, which modulates the current magnitude.

Figure 5. Schematic of Voltage-Gated Sodium Channel Motifs. a) Schematic of the whole voltage-gated sodium channel (Gotter et al. 1998). Domains I-IV are identified, with a rectangle representing each transmembrane segment. Phosphorylation sites of the Electrophorus electricus Nav1.4a are identified by P (Emerick et al. 1993). b) Schematic of motifs that change conformation during fast inactivation (Potet et al. 2009). Domains III and IV are identified, with a big grey cylinder representing each transmembrane segment. Helices of the fast inactivation occlusion particle (DIII-IV linker) are identified by the small green and violet cylinders. Helices of the carboxyl-terminus (C-terminus) are identified by the small blue and grey cylinders. The EF-hand and IQ motifs bind each other. Calmodulin binds the IQ motif and a helix of the DIII-IV linker (violet cylinder).

Fast activation is triggered by changes in plasma membrane voltage being relayed by regularly spaced positively charged amino acids on the S4 segments of DI-III, to change conformations of the S3-4 and S4-5 linkers as well as the VSD (Payandeh et al. 2011; Zhang et al. 2012; Payandeh et al. 2012; Ahern 2013). Changes in VSD conformation are relayed to the PM by the S4-5 linkers, opening the activation gate, allowing sodium ions to pass the highly conserved amino acids of the selectivity filter (Favre et al. 1996).

Slow inactivation is likely conferred by conformational changes near the selectivity filter at the S5-6 linker and S6 segments lining the pore (Ulbricht 2005; Payandeh et al. 2012).

Fast inactivation is triggered by changes in voltage being relayed by regularly spaced positively charged amino acids on the S4 segment of DIV (Ahern 2013). Changes in conformation result in occlusion of the activation gate by the DIII-IV linker.

Fast inactivation is modulated by Ca2+ binding on the C-terminus being relayed to the activation gate by calmodulin (Wingo et al. 2004; Young and Caldwell 2005; Sarhan et al. 2012). Calmodulin is a highly conserved calcium-sensing protein that has been found in many eukaryotic cells, including electrocytes (Baba et al. 1984; Munjaal et al. 1986). It has two lobes, each with a Ca2+-binding EF-hand motif consisting of two pairs of α helices (Chin and Means 2000). The C-terminus EF-hand motif is structurally analogous to one lobe of calmodulin, but with a lower affinity for Ca2+ (Miloushev et al. 2009). The IQ motif is found in many Ca2+-dependent calmodulin binding proteins (Bahler and Rhoads 2001). In the absence of Ca2+, the EF-hand binds loosely to the IQ motif, which binds tightly to the C-lobe of calmodulin, leaving the N-lobe free. Since the N-lobe of calmodulin does not bind the DIII-IV linker, the linker is free to occlude the activation gate (Wingo et al. 2004; Shah et al. 2006; Chagot et al. 2009; Chagot et al. 2011; Sarhan et al. 2012). With increased levels of Ca2+, the EF-hand binds tightly to the IQ motif, which binds loosely to either the N-lobe or C-lobe of calmodulin. When the C-lobe of calmodulin binds to the DIII-IV linker, the linker is less likely to occlude the activation gate.

Resurgent current is associated with an alternate time course of fast inactivation. It may result from a parallel process competing with the typical fast activation mechanism at the activation gate (Cruz et al. 2011). It may result from specific amino acids on the S4 segment of DIV (Jarecki et al. 2010). It may also result from specific amino acids on the C-terminus EF-hand: a drug that binds to the C-terminus EF-hand has been shown to prolong occlusion of the activation gate and decrease resurgent current (Hebert et al. 1994; Theiss et al. 2007; Bello et al. 2012).

1.10 Significance and Objectives

Many advances have been made in recent years to identify the roles of various motifs in voltage-gated sodium channel (Nav) function and modulation (Chagot et al. 2009; Miloushev et al. 2009; Payandeh et al. 2011; Sarhan et al. 2012; Zhang et al. 2012). However, analyses of the roles of specific amino acid sites have largely been limited to sites that are known to be mutated in people with diagnosed neuromuscular disease (Lehmann-Horn and Jurkat-Rott 1999). In this project, I will use the gymnotiform genus Gymnotus as a model system to investigate the evolution and function of amino acid sites of Nav1.4a.

Fishes of the genus Gymnotus produce species-specific electric organ discharges (EODs) for electrolocation (foraging, navigation) and communication (Crampton and Albert 2006). EODs are the summation of action potentials produced at the electric organ(s) (EO) by electrogenic cells (Bennett 1961; Mills and Zakon 1987). Navs at the plasma membranes of those cells have a key role in supporting action potentials (Agnew 1984; Catterall 1984; Noda et al. 1984). Upon neuronally triggered changes in voltage, Navs activate to allow specific ions to discharge through their pores, across the membranes. Those same changes in voltage also trigger Navs to inactivate, to allow the membrane voltage gradient to recover, in preparation for the next discharge.

Navs are encoded by a family of paralogous genes that translate to highly conserved amino acid sequences and motifs (Catterall et al. 2005). Gene duplication among Teleostei and preferential expression in various tissues (Lopreato et al. 2001; Lynch et al. 2001; Goldin 2002; Novak et al. 2006; Widmark et al. 2011) have been predicted to allow paralogs to evolve independently without compromising functions of Navs in other tissues. Analyses of nucleotide sequences encoding various motifs of the EO paralog (scn4aa) from limited sampling of gymnotiform fishes identified positive, neutral, and purifying selection on the protein (Nav1.4a) among certain lineages (Zakon et al. 2006; Arnegard et al. 2010). However, positively selected amino acid sites were not identified.

One component of the scn4aa gene that has not been previously analyzed for patterns of selection among gymnotiforms includes the nucleotides encoding the protein’s carboxyl-terminus (scn4aa 3’). This portion includes key motifs that are involved in regulation of protein internalization, fast inactivation, and possibly also resurgent current. Modulation of these Nav1.4a activities affects the amplitude and frequency of action potentials at the EO, which may in turn affect those components of the EODs. Variations in EOD amplitude may be associated with variations in multiple anatomical, cellular, and molecular characteristics (Gotter 1998; Caputi 1999). However, variations in EOD frequency among gymnotiforms with myogenic electric organs are likely limited to those associated with variations in Nav1.4a function.

Since species-specific characteristics of EODs among gymnotiforms (especially variation in frequency) are the result of adaptations to abiotic and biotic selective pressures in their varied habitats (Stoddard 2002), I predict that amino acid sites of the Nav1.4a C-terminus that contribute to variation in (but do not abolish) protein function will show evidence of positive selection in Gymnotus fishes. I also predict that the Nav1.4a C-terminus will only show evidence of positive selection in some lineages of Gymnotus, as has been observed for other portions of Nav1.4a sequences from a limited sample of gymnotiform fishes (Zakon et al. 2006; Arnegard et al. 2010). To assess patterns of selection on the Nav1.4a C-terminus among Gymnotus, I will analyze the corresponding nucleotide sequences based on the phylogenetic relationships among these fishes (Yang 2007).

Since existing phylogenetic relationships among Gymnotus based on morphology and nucleotide sequences are not entirely consistent with each other, I will conduct phylogenetic analyses with additional taxa and molecular characters to contribute towards resolving remaining inconsistencies (Wiens 1998). The additional characters that I will use are the Gymnotus scn4aa nucleotide sequences that encode the protein's C-terminus. Since other portions of scn4aa had been used for successful reconstruction of phylogenies from limited sampling of gymnotiform fishes (Zakon et al. 2006; Arnegard et al. 2010), I predict that this portion of the gene will also contribute towards clarification of phylogenetic relationships among a large sample of Gymnotus species.

The objectives of this project can be summarized as follows:

1) To clarify evolutionary relationships among known and newly discovered species of Gymnotus fishes using orthologous genetic loci, including the scn4aa 3’;

2) To determine the utility of the scn4aa 3’ locus for reconstruction of phylogenetic relationships; and

3) To assess patterns of selection on the Nav1.4a C-terminus, thereby contributing towards understanding the evolutionary history of Gymnotus fishes, and molecular mechanisms of the protein.

Chapter 2 Materials and Methods

2.1 Taxon Sampling

Efforts were made to comprehensively sample Gymnotus species from all three clades described in Albert et al. (2005). Outgroup species were sampled from other gymnotiform families: Electrophoridae; Hypopomidae; and Sternopygidae. More than one individual was sampled per species whenever possible, as a control for variation within species.

Tissues for DNA extraction were stored in either 95-100% ethanol or salt-saturated buffer (20% DMSO, 0.25 M EDTA pH 8, saturated with NaCl). Tissues for RNA extraction were stored in RNALater. Tissue samples were from the collections of Nathan Lovejoy, William Crampton, James Albert, and Javier Maldonado.

2.2 Locus Selection

The loci selected were: mitochondrial genes cytochrome b (cytb) and 16S ribosome (16S); and nuclear genes recombination activating gene 2 (rag2) and the portion of the voltage-gated sodium channel gene scn4aa encoding the Nav1.4a protein’s carboxyl-terminus (this region is herein referred to as scn4aa 3’).

Cytb and 16S sequences have been successfully used for phylogenetic classification of fish (Lovejoy and Collette 2001; Lavoué and Sullivan 2004). These are housekeeping genes, which have key conserved functions for the maintenance of every cell among various cell types and across taxa (Warrington et al. 2000). This conservation decreases the chances of inaccurate phylogeny due to variations in patterns of natural selection among clades (Kullberg et al. 1996). Mitochondrial genes do not have introns, which simplifies the sequence alignment process.

Rag2 sequences have also been successfully used for phylogenetic classification of fish (Lovejoy and Collette 2001; Lavoué and Sullivan 2004). Rag2 is essential to the inducible immune response in jawed vertebrates (Rast and Litman 1998). In fish, it is a conserved single copy gene (Willett et al. 1997) that seems not to have introns (Hansen and Kaattari 1996; Willett et al. 1997). Phylogenetic reconstruction based on single copy genes decreases the chances of inaccurate phylogeny due to mistaken orthology (Li et al. 2007).

Scn4aa encodes the voltage-gated sodium channel protein Nav1.4a, and is part of a sodium channel gene family that has been conserved among vertebrates (Goldin et al. 2000; Novak et al. 2006; Widmark et al. 2011). Nucleotide sequences of scn4aa, scn4ab, and other members of the scn gene family were available from GenBank for comparison, to avoid mistaken orthology. The Nav1.4a protein's carboxyl-terminus is approximately 300 amino acids long.

Mitochondrial genes tend to evolve rapidly compared to nuclear ones (Brown 1979). A combination of nucleotide sequences from these different sources could complement each other when resolving phylogenetic relationships.

2.3 Primer Design

Table 1 lists the primer sequences used for DNA amplification and sequencing. Amplification primers for cytb, 16S, and rag2 have been previously published. Amplification primers for scn4aa 3’ were designed as part of this project (see below). Sequencing primers for all loci selected were designed as necessary.

For both amplification and sequencing primers, annealing characteristics such as melting temperature, % GC content, and secondary structures were analyzed using NetPrimer (http://www.premierbiosoft.com/netprimer/index.html).
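As a rough illustration of the annealing characteristics mentioned above, the following sketch computes % GC content and an approximate melting temperature for the two scn4aa 3’ amplification primers listed in Table 1. It is not a reproduction of the NetPrimer calculations: the melting temperature uses the simple Wallace rule (2°C per A/T, 4°C per G/C), which is an assumption made here for illustration and is only a coarse estimate for short oligonucleotides. The function names are hypothetical.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in a primer (unambiguous bases only)."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule.

    Assumption: Tm = 2*(A+T) + 4*(G+C). This is a coarse estimate and is
    not the thermodynamic model used by primer analysis software.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# scn4aa 3' amplification primers, as listed in Table 1
primers = {
    "(6)1F": "TCCTCCTGACTGTGACCCTG",
    "(6)1R": "CATTTTTACACTTCATCACTCTCCAC",
}

for name, seq in primers.items():
    print(name, len(seq), round(gc_percent(seq), 1), wallace_tm(seq))
```

Secondary structure checks (hairpins, dimers) are not covered by this sketch.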

2.3.1 Amplification Primers for scn4aa 3’

To predict intron/exon boundaries for the portion of the scn4aa gene encoding the carboxyl-terminus (scn4aa 3’), published fish scn4aa, scn4ab, and other scn cDNA sequences were obtained from GenBank (Table 2), and aligned using ClustalX version 1.83 (Thompson et al. 1997). The scn4aa 3’ locus was predicted to be contained within one exon by comparing the fish scn alignment with annotated genome sequences of Danio rerio (GenBank Accession #s DQ221253 and NW_001510719).

Table 1. Primer Sequences. Primers used for nucleotide sequence amplification and sequencing are identified by their target loci, name, annealing direction, sequence, and source.

| Target Locus | Primer Name | Amplification/Sequencing Direction¹ | Nucleotide Sequence (listed as 5' → 3') | Source of Sequence |
|---|---|---|---|---|
| scn4aa 3’ | (6)1F | 5' → 3' | TCCTCCTGACTGTGACCCTG | This study |
| scn4aa 3’ | (6)1R | 3' ← 5' | CATTTTTACACTTCATCACTCTCCAC | This study |
| cytochrome b | GLU-L-CARP (AKA CytbF) | 5' → 3' | TGACTTGAAGAACCACCGTTG | Palumbi et al. 1991 |
| cytochrome b | GLUDG-L | 5' → 3' | CGAAGCTTGACTTGAARAACCAYCGTTG | Palumbi et al. 1991 |
| cytochrome b | HA-danio (AKA CytbR) | 3' ← 5' | CTCCGATCTTCGGATTACAAG | Mayden et al. 2007 |
| cytochrome b | (C)Seq1F | | CAATGAGTCTGAGGAGGNTT | This study |
| cytochrome b | (C)Seq3F | | CAATGAGTTTGAGGGGGNTT | This study |
| cytochrome b | (C)Seq5F | | CAATGAGTCTGAGGGGGNTT | This study |
| cytochrome b | (C)Seq8F | | CAATGAGTTTGAGGCGGNTT | This study |
| recombination activating gene 2 | Rag2GyF | 5' → 3' | ACAGGCRTCTTTGGKRTTCG | Lovejoy et al. 2010 |
| recombination activating gene 2 | Rag2GyR | 3' ← 5' | TCATCCTCCTCATCTTCCTC | Lovejoy et al. 2010 |
| recombination activating gene 2 | (R)Seq1F | | AGAACCACAGAGAACTGGAACAC | This study |
| recombination activating gene 2 | (R)Seq1R | | CTCTACACGCAGCCTGAACA | This study |
| recombination activating gene 2 | (R)Seq2R | | TGCATTCGCTTYTGGGA | This study |
| 16S mitochondrial ribosomal subunit | 16sar-L | 5' → 3' | CGCCTGTTTATCAAAAACAT | Palumbi et al. 1991 |
| 16S mitochondrial ribosomal subunit | 16sbr-H | 3' ← 5' | CCGGTCTGAACTCAGATCACGT | Palumbi et al. 1991 |

¹ Amplification/sequencing direction is only identified for primers used for both amplification and sequencing, since sequencing-only primers may have been used to sequence nucleotides in different directions.

Table 2. cDNA Sequences Used for scn4aa 3’ Primer Design. Voltage-gated sodium channel nucleotide sequences of fish were downloaded from GenBank for the design of primers specific to the carboxyl-terminus. For clarity, genes and proteins were all named using the protein naming convention from Novak et al. (2006).

| Superorder | Order | Species |
|---|---|---|
| Acanthopterygii | Tetraodontiformes | Takifugu pardalis; Tetraodon nigroviridis |
| Ostariophysi | Cypriniformes | Danio rerio |
| Ostariophysi | Gymnotiformes | Apteronotus leptorhynchus; Brachyhypopomus pinnicaudatus; Electrophorus electricus; Sternopygus macrurus |
| Ostariophysi | Siluriformes | Ictalurus punctatus |
| Osteoglossomorpha | Osteoglossiformes | Chitala chitala; Gnathonemus petersii; Osteoglossum bicirrhosum |

| Member of the Nav Gene Family | GenBank Accession #s |
|---|---|
| Nav1.1La | BC044197, AF378142, AY204535, DQ275140, BC133130, BC150220, DQ149503, NM_200132 |
| Nav1.1Lb | NW_001513569, AF378141, AY204534, DQ275139, DQ149504, NM_001044895 |
| Nav1.4¹ | AB030482 |
| Nav1.4a | DQ221251, DQ351532, DQ351533, DQ351534, M22252, AF378144, AY204537, DQ336344, DQ275142, DQ336343, DQ149506, NW_001510719, NM_001039825 |
| Nav1.4b | DQ221252, DQ221254, AF378139, AY204532, DQ275137, DQ149505, NM_001045065 |
| Nav1.5La | DQ149507, AF378140, AY204533, DQ275138, NW_001512993, NM_001044922 |
| Nav1.5Lb | NW_001512737, AY183895, DQ149508, NM_001045123 |
| Nav1.6a | NW_001512571, DQ286578, DQ385608, NM_131628 |
| Nav1.6b | NW_001513595, AF378143, AY204536, DQ275141, NM_001045183 |

¹ GenBank identifies this as a sequence from skeletal muscle, but it is unclear whether it is Nav1.4a or Nav1.4b.

Primer sequences were designed to amplify the scn4aa 3’ exon, but not scn4ab or any other scn sequence. To accomplish this, potential primer sequences were compared by BLAST (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) against scn4aa and non-scn4aa portions of the fish scn alignment. Only sequences specific to scn4aa, and not to any other scn gene, were selected. The resulting primer sequences were experimentally tested to verify amplification of scn4aa 3’. The absence of introns was confirmed by comparing corresponding scn4aa DNA and cDNA sequences from gymnotiforms (see below).

2.3.2 Sequencing Primers

Sequencing primers were designed for amplicons that did not produce clear nucleotide sequences when the amplification primers were used for sequencing. Existing nucleotide sequences for those loci were aligned using Sequencher™ (Gene Codes Corporation, Ann Arbor, MI). Primers were designed to anneal with conserved regions within the loci to obtain the remaining nucleotide sequences.

2.4 DNA and RNA Extraction

To obtain genomic DNA for amplifying scn4aa sequences, excised muscle tissue was processed using the DNeasy Blood and Tissue Spin-Column Kit (Qiagen).

RNA was obtained from electric organ tissue of a Gymnotus tigre specimen so that scn4aa cDNA encoding the protein's carboxyl-terminus could be synthesized. To obtain RNA, fresh tissue was homogenized (ground with a mortar and pestle at -80ºC, and vortexed with at least 1 mL Trizol /100 mg tissue). Nucleic acids, amino acids, and lipids were separated by adding a denser chloroform organic layer (200 µL /1 mL Trizol), and further homogenizing and breaking up large pieces of DNA (vortex for 15 s). The solution was placed at room temperature to allow contents to settle into their phases (2-3 min), and centrifuged to obtain clear phase separation (12000 g for 15 min at 4ºC). Nucleic acids were precipitated by adding isopropanol (500 µL /1 mL Trizol) to the aqueous phase, and incubating at room temperature (10 min). The nucleic acids were pelleted (centrifugation at 12000 g for 10 min at 4ºC, and removal of supernatant), and washed (80% ethanol /1 mL Trizol, centrifugation at 7500 g for 5 min at 4ºC, and removal of supernatant ethanol) to increase purity of the sample. To prevent nucleic acid degradation, samples were heated to denature nucleases (70ºC for 2-3 min), and resuspended in diethyl pyrocarbonate (DEPC) treated water (81 µL; any nucleases including RNase were inactivated in DEPC water). DNA was selectively degraded by adding DNase I (8 µL of 10X DNase I buffer, 2 µL of DNase I enzyme), mixing (vortex, quick spin), and incubating to activate the enzyme (42ºC for 25 min). RNA was purified using the RNA Cleanup protocol in the RNeasy Mini Kit (Qiagen).
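The reagent volumes in the extraction above scale with the amount of Trizol, which in turn scales with tissue mass. The following minimal sketch works through that arithmetic using only the ratios stated in the text (at least 1 mL Trizol per 100 mg tissue, 200 µL chloroform and 500 µL isopropanol per 1 mL Trizol). The function name and dictionary keys are hypothetical, and the ethanol wash volume is omitted because the text does not state it.

```python
def trizol_volumes(tissue_mg: float) -> dict:
    """Scale extraction reagent volumes to tissue mass.

    Ratios are taken from the protocol text; the Trizol volume returned is
    the stated minimum (1 mL per 100 mg tissue).
    """
    trizol_ml = tissue_mg / 100.0
    return {
        "trizol_mL_min": trizol_ml,
        "chloroform_uL": 200.0 * trizol_ml,   # 200 uL per 1 mL Trizol
        "isopropanol_uL": 500.0 * trizol_ml,  # 500 uL per 1 mL Trizol
    }


# Example: a hypothetical 250 mg tissue sample
print(trizol_volumes(250))
```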

2.5 Nucleotide Amplification and Sequencing

Nucleotide sequences from previous studies were obtained from GenBank. This includes most of the cytb, 16S, and rag2 data. All of the scn4aa sequences encoding the protein’s carboxyl-terminus were experimentally obtained as part of this study. See Table 3 for the source of each sequence.

Polymerase Chain Reaction (PCR) was used to amplify nucleic acids from target loci (1x +(NH4)2SO4 PCR Buffer (Fermentas), 0.8 mM dNTPs, 0.2 µM of each primer, 0.02 U/µL Taq DNA Polymerase (Fermentas), 0.5-4 mM MgCl2). Both standard thermal cycling profiles (denaturation at 95°C for 2.5 min; 32 cycles of denaturation at 95°C for 30 s, annealing at 53-54°C for 1 min, extension at 72°C for 1 min 30 s; and extension at 72°C for 5 min) and touchdown protocols (Don et al. 1991) were used. Concentrations of MgCl2 and annealing temperatures were optimized for each primer pair.
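The standard thermal cycling profile described above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only an illustrative sketch: the dictionary layout and function name are hypothetical, the annealing temperature is set to 53°C as one value from the stated 53-54°C range, and ramp times between steps are ignored.

```python
# (temperature in deg C, hold time in minutes) for each step
STANDARD_PROFILE = {
    "initial_denaturation": (95, 2.5),
    "cycles": 32,
    "cycle_steps": [
        (95, 0.5),  # denaturation, 30 s
        (53, 1.0),  # annealing (assumed 53; the text gives 53-54)
        (72, 1.5),  # extension, 1 min 30 s
    ],
    "final_extension": (72, 5.0),
}


def total_hold_minutes(profile: dict) -> float:
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(minutes for _, minutes in profile["cycle_steps"])
    return (
        profile["initial_denaturation"][1]
        + profile["cycles"] * per_cycle
        + profile["final_extension"][1]
    )


print(total_hold_minutes(STANDARD_PROFILE))  # 2.5 + 32 * 3.0 + 5.0 minutes
```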

Amplified products were assessed by gel electrophoresis (1% agarose in 1x TAE buffer; 50x TAE: 242 g Tris base, 57.1 mL glacial acetic acid, 100 mL 0.5 M EDTA pH 8.0, H2O to 1 L) and staining with SYBR Safe (Invitrogen). PCR products showing one distinct amplicon were purified using the QIAquick PCR Purification Kit (Qiagen). They were sequenced by dye-terminator cycle sequencing and capillary electrophoresis (3730xl DNA Analyzer with KB Basecaller software, Applied Biosystems) at the Centre for Applied Genomics (TCAG, The Hospital for Sick Children, Toronto, Canada).

Table 3. Specimens and Nucleotide Sequences Used for Gymnotus Analyses. Specimens used for analysis are identified by their scientific names, tissue sample numbers, museum catalogue numbers, collection localities, and applicable GenBank Accession numbers. Drainage basins are classified according to Albert et al. (2005): MA - Middle America, NW - Northwestern South America, PS - Pacific Slope, GO - Guyanas-Orinoco, WA - Western Amazon, EA - East Amazon, NE - Northeast Brazil, PA - Paraguay-Paraná basin of Argentina, SE - Southeast Brazil. Sequences obtained by the author for this project are identified with “**”. Sequences obtained from lab records are identified with “*” or their GenBank Accession Number, if applicable.

| Genus Species | Tissue sample | Museum catalog | Collection locality | scn4aa 3' | cytochrome b | recombination activating gene 2 | 16S ribosome |
|---|---|---|---|---|---|---|---|
| Family Gymnotidae | | | | | | | |
| Gymnotus arapaima | 2002 | MZUSP 75179 | Lago Mamirauá, Tefé, Amazonas, Brazil | ** | GQ862595 | GQ862543 | GQ862647 |
| Gymnotus arapaima | 2003 | MZUSP 103219 | Lago Mamirauá, Tefé, Amazonas, Brazil | ** | GQ862596 | GQ862544 | GQ862648 |
| Gymnotus carapo | 2004 | MZUSP 76066 | Lago Secretaria, Brazil | ** | GQ862599 | GQ862547 | GQ862651 |
| Gymnotus carapo | 2006 | UF 131129 | Rio Amazonas, Peru | ** | GQ862601 | GQ862549 | GQ862653 |
| Gymnotus carapo | 2007 | UF 131129 | Rio Amazonas, Peru | ** | GQ862602 | GQ862550 | GQ862654 |
| Gymnotus carapo | 2030 | MZUSP 76066 | Lago Secretaria, Brazil | ** | GQ862600 | GQ862548 | GQ862652 |
| Gymnotus carapo | 2040 | UF 174335 | Rio Guaratico, Venezuela | ** | GQ862597 | GQ862545 | GQ862649 |
| Gymnotus carapo | 2041 | UF 174335 | Rio Guaratico, Venezuela | ** | GQ862598 | GQ862546 | GQ862650 |
| Gymnotus cataniapo | 2062 | UF 174330 | Rio Atabapo, Venezuela | ** | GQ862603 | GQ862552 | GQ862656 |
| Gymnotus cataniapo | 2063 | UF 174332 | Rio Cataniapo, Venezuela | ** | GQ862604 | GQ862579 | GQ862683 |
| Gymnotus coatesi | 2042 | MCP 34471 | Lago Tefé, Brazil | ** | GQ862605 | GQ862553 | GQ862657 |
| Gymnotus coatesi | 2043 | MCP 34472 | Rio Tefé, Brazil | ** | GQ862605 | GQ862554 | GQ862658 |
| Gymnotus coropinae | 2010 | MZUSP 75188 | Lago Tefé, Brazil | ** | GQ862611 | GQ862559 | GQ862663 |
| Gymnotus coropinae | 2025 | MZUSP 60611 | Lago Tefé, Brazil | ** | GQ862612 | GQ862560 | GQ862664 |
| Gymnotus coropinae | 2035 | ANSP 179126 | Sauriwau River, Guyana | * | GQ862607 | GQ862555 | GQ862659 |
| Gymnotus coropinae | 2036 | AUM 35848 | Sauriwau River, Guyana | ** | GQ862608 | GQ862556 | GQ862660 |
| Gymnotus coropinae | 2037 | ANSP 179127 | Mazaruni River, Guyana | ** | GQ862609 | GQ862557 | GQ862661 |
| Gymnotus coropinae | 2038 | ANSP 179127 | Mazaruni River, Guyana | ** | GQ862610 | GQ862558 | GQ862662 |
| Gymnotus curupira | 2009 | MZUSP 75148 | Lago Tefé, Brazil | ** | GQ862613 | GQ862561 | GQ862665 |
| Gymnotus curupira | 2021 | MZUSP 75146 | Lago Tefé, Brazil | ** | GQ862614 | GQ862562 | GQ862666 |
| Gymnotus cylindricus | 2092 | ROM 84772 | Rio Tortuguero, Costa Rica | ** | GQ862615 | GQ862563 | GQ862667 |
| Gymnotus cylindricus | 2093 | ROM 84772 | Rio Tortuguero, Costa Rica | ** | GQ862616 | GQ862564 | GQ862668 |
| Gymnotus cylindricus | 2094 | ROM 84772 | Rio Tortuguero, Costa Rica | ** | GQ862617 | GQ862565 | GQ862669 |
| Gymnotus javari | 2020 | UF 122824 | Iquitos, Brazil | ** | GQ862618 | GQ862566 | GQ862670 |
| Gymnotus jonasi | 2016 | MZUSP 103220 | Rio Solimões, Tefé, Amazonas, Brazil | ** | GQ862619 | GQ862567 | GQ862671 |
| Gymnotus jonasi | 2471 | UF 131410 | Rio Ucayali, Pacaya Samiria Reserve, Peru | ** | GQ862620 | GQ862568 | GQ862672 |
| Gymnotus mamiraua | 2012 | MZUSP 103221 | Rio Solimões, Tefé, Amazonas, Brazil | ** | GQ862621 | GQ862569 | GQ862673 |
| Gymnotus mamiraua | 2013 | MCP 29805 | Rio Solimões, Tefé, Amazonas, Brazil | ** | GQ862622 | GQ862570 | GQ862674 |
| Gymnotus obscurus | 2017 | MZUSP 75155 | Lago Mamirauá, Tefé, Amazonas, Brazil | | | | |
| Gymnotus obscurus | 2018 | MZUSP 75157 | Lago Mamirauá, Tefé, Amazonas, Brazil | | | | |
| Gymnotus omarorum | 7092 | AMNH 239656 | Laguna del Cisne, Uruguay | ** | ** | ** | ** |
| Gymnotus omarorum | 7093 | AMNH 239656 | Laguna del Cisne, Uruguay | ** | ** | ** | ** |
| Gymnotus pantanal | 7076 | (not catalogued) | Rio Parana, Corrientes, Chaco Region, Argentina | ** | ** | * | * |
| Gymnotus pantherinus | 2039 | (no voucher) | Rio Perequê-Açu, Brazil | ** | GQ862625 | GQ862573 | GQ862677 |
| Gymnotus pantherinus | 2945 | MZUSP 87564 | Rio Vermelho, São Paulo, Brazil | ** | * | * | * |
| Gymnotus stenoleucus | 2060 | UF 174329 | Rio Atabapo, Venezuela | ** | GQ862628 | GQ862576 | GQ862680 |
| Gymnotus stenoleucus | 2061 | UF 174331 | Rio Cataniapo, Venezuela | ** | GQ862629 | GQ862577 | GQ862681 |
| Gymnotus stenoleucus | 2064 | UF 174329 | Rio Atabapo, Venezuela | ** | GQ862630 | GQ862578 | GQ862682 |
| Gymnotus sylvius | 7240 | MZUSP 100267 | Rio Ribeira de Iguape-Rio Juqueia-Rio São Lourenço, Miracatú, São Paulo, Brazil | ** | ** | ** | ** |
| Gymnotus tigre | 7090 | (not catalogued) | (aquarium specimen) | ** | ** | ** | ** |
| Gymnotus tigre | 7090_804pe3_1 (aliquot from 7090) | (not catalogued) | (aquarium specimen) | ** | | | |
| Gymnotus tigre | 7349 | (not catalogued) | (aquarium specimen) | ** | ** | ** | ** |
| Gymnotus ucamara | 1927 | UF 126184 | Rio Ucayali, Peru | ** | * | * | * |
| Gymnotus ucamara | 1950 | UF 126184 | Rio Ucayali, Peru | ** | * | * | * |
| Gymnotus varzea | 2014 | MZUSP 75163 | Rio Solimões, Tefé, Amazonas, Brazil | ** | * | * | * |
| Gymnotus varzea | 2015 | MZUSP 75164 | Rio Solimões, Tefé, Amazonas, Brazil | ** | * | * | * |
| Gymnotus n. sp. | 2956 | (not catalogued) | Rio São João, Rio de Janeiro, Brazil | ** | ** | ** | ** |
| Gymnotus n. sp. | 2957 | (not catalogued) | Rio São João, Rio de Janeiro, Brazil | ** | ** | ** | ** |
| Gymnotus aff. anguillaris | 2091 | AUM 36616 | Rio Aponwao, Guyana | ** | GQ862594 | GQ862542 | GQ862646 |
| Gymnotus n. sp. chaviro | 7357 | (unknown) | (unknown) | ** | ** | ** | ** |
| Gymnotus n. sp. chaviro | 7358 | (unknown) | (unknown) | ** | ** | ** | ** |
| Gymnotus n. sp. fritzi | 7109 | (not catalogued) | Tefé, Amazonas, Brazil | ** | ** | ** | ** |
| Gymnotus n. sp. itapua | 2559 | MZUSP 85947 | Southern Brazil | ** | * | ** | * |
| Gymnotus n. sp. itapua | 7071 | (not catalogued) | Rio Parana, Corrientes, Chaco Region, Argentina | ** | * | * | * |
| Gymnotus n. sp. itapua | 7072 | (not catalogued) | Rio Parana, Corrientes, Chaco Region, Argentina | ** | * | * | * |
| Gymnotus n. sp. RS1 | 2558 | MZUSP 85943 | Southern Brazil | ** | * | ** | * |
| Gymnotus n. sp. RS1 | 7088 | MNRJ 31520 | Lagoa dos Tropeiros, Piumhi, Minas Gerais Region, Brazil | ** | ** | ** | ** |
| Gymnotus cf. tigre | 2019 | UF 122823 | Rio Amazonas, Peru | ** | GQ862631 | GQ862579 | GQ862683 |
| Gymnotus cf. tigre | 2024 | UF 122821 | Rio Amazonas, Peru | ** | GQ862632 | GQ862580 | GQ862684 |
| Gymnotus sp. xingu | 7305 | MNRJ 33642 | Xingú-Tapajós, Brazil | ** | ** | ** | ** |
| Family Electrophoridae | | | | | | | |
| Electrophorus electricus | | | | M22252 | | | |
| Electrophorus electricus | 2026 | MZUSP 103218 | Lago Secretaria, Tefé, Amazonas, Brazil | | | | |
| Electrophorus electricus | 2619 | UF 116585 | Rio Nanay, Peru | ** | GQ862592 | GQ862540 | GQ862644 |
| Family Hypopomidae | | | | | | | |
| Brachyhypopomus diazi | 305 | UF 174334 | Rio Las Marias, Venezuela | ** | GQ862589 | GQ862537 | GQ862641 |
| Brachyhypopomus diazi | 2408 | UF 174334 | Rio Alpargatón, Venezuela | ** | GQ862590 | GQ862538 | GQ862642 |
| Brachyhypopomus n. sp. PAL | 2432 | UF 148572 | Rio Palenque, Ecuador | ** | GQ862591 | GQ862539 | GQ862643 |
| Hypopomus artedi | 2232 | ANSP 179505 | Rio Mazaruni, Guyana | ** | GQ862637 | GQ862585 | GQ862689 |
| Family Sternopygidae | | | | | | | |
| Sternopygus astrabes | 2203 | (unknown) | Lago Tefé, Igarapé Repartimento, Brazil | | | | |
| Sternopygus macrurus | 2639 | UF 117121 | Rio Nanay, Peru | ** | GQ862639 | GQ862587 | GQ862691 |

2.6 Nucleotide Sequence Verification and Alignment

All sequences experimentally obtained for this study were visually inspected for misreads, and edited using Sequencher™ (Gene Codes Corporation, Ann Arbor, MI). Ambiguous base calls were treated as possibly any nucleotide.

For scn4aa sequences, amplification and sequencing of the exon encoding the protein’s carboxyl-terminus (scn4aa 3’) from the desired member of the gene family was verified as follows. Each sequence was blasted as a translated nucleotide against the translated nucleotide database in GenBank’s Nucleotide Collection (tblastx: http://www.ncbi.nlm.nih.gov/blast/Blast.cgi). All sequences were found to have higher alignment scores with scn4aa sequences than with scn4ab, any other scn, or any other nucleotide sequence. To verify that the expected exon had been amplified, each scn4aa sequence was blasted (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) against the Electrophorus electricus scn4aa mRNA sequence (Accession # M22252).

Directions and applicable codon positions of the nucleotide sequences were determined by comparison with published Danio rerio (rag2 Accession # NM_131385, cytb and 16S Accession # NC_002333), E. electricus (scn4aa 3’ Accession # M22252), and Pygocentrus nattereri (16S Accession # U33591) sequences. Nucleotides from protein coding loci (cytb, rag2, and scn4aa 3’) were aligned based on their amino acid alignments using a combination of software (Mesquite; ClustalX1.83; RevTrans http://www.cbs.dtu.dk/services/RevTrans/14). The 16S nucleotide sequences were aligned under various gap cost settings in ClustalX 1.83 (Thompson et al. 1997). Gap opening / gap extension values used were: 15/6.66; 7/5; 10/5; 20/5; and 10/10. 16S nucleotide positions which did not align consistently under all those settings were removed from the analysis.
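Aligning protein-coding nucleotides based on their amino acid alignment amounts to threading each sequence's codons onto its gapped protein sequence. The sketch below illustrates that general idea in a few lines; it is not the RevTrans implementation. The function name is hypothetical, it assumes the coding sequence is in frame and has exactly one codon per residue, and it does not check that the codons actually translate to the given residues.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an ungapped coding sequence onto a gapped protein alignment row.

    Each residue consumes the next codon; each alignment gap ('-')
    becomes a three-nucleotide gap ('---').
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    residues = aligned_protein.replace("-", "")
    if len(cds) % 3 != 0 or len(codons) != len(residues):
        raise ValueError("CDS must contain exactly one codon per residue")
    codon_iter = iter(codons)
    return "".join(
        "---" if aa == "-" else next(codon_iter) for aa in aligned_protein
    )


# Hypothetical toy example: a protein row 'M-K' with coding sequence ATG AAA
print(thread_codons("M-K", "ATGAAA"))
```

Threading codons this way keeps gaps in multiples of three, so codon positions remain consistent across the alignment.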

2.7 Phylogenetic Reconstruction

Phylogenetic reconstruction was conducted using the total evidence nucleotide alignment, and compared with separate analyses of the following alignments: mitochondrial (cytb and 16S); rag2; and scn4aa 3’. The cDNA sequences (Electrophorus electricus Accession # M22252; and Gymnotus tigre sequence from tissue # 7090_804pe3_1) were not used for phylogenetic reconstruction.

Parsimony-based phylogenetic reconstruction was implemented in PAUP* (Swofford 2002) using the stepwise heuristic search algorithm with the following parameters for 2000 search replicates: tree bisection-reconnection (TBR) branch swapping; and holding 10 variants at each step. Bootstrapping was also conducted for 2000 search replicates with the same parameters (Müller 2005).

Bayesian phylogenetic reconstruction was implemented in MrBayes 3.1.2 (Huelsenbeck and Ronquist 2001), using the model of molecular evolution that best fit the data as determined using MrModeltest 2.3 (Nylander 2004). It was the same model for the total evidence and individual locus alignments – general time-reversible model, with a proportion of nucleotide sites that are invariant, and the variation in nucleotide substitution rates across the variant nucleotide sites estimated from a gamma distribution (GTR + I + G; Brinkman and Leipe 2001). The total evidence and mitochondrial alignments were partitioned into the four and two loci, respectively. The total evidence alignment was analyzed with temp = 0.2. The mitochondrial, rag2, and scn4aa 3’ alignments were analyzed with temp = 0.2; and nperts = 2. Each of these four analyses had 25% burnin, after running up to 5.5 million generations with four chains each until the average standard deviation of split frequencies was 0.01 or less. All other parameters were program defaults.

2.8 Molecular Evolution Analyses

Molecular evolution analyses were conducted to determine patterns and test hypotheses of nucleotide sequence variation at the Gymnotus scn4aa 3’. The outgroup species were not used for molecular evolution analyses, so that patterns of amino acid evolution among genus Gymnotus could be examined in isolation from outgroup taxa.

For protein coding nucleotides, every three nucleotides are considered as one codon, which encodes the amino acid identity at one amino acid site. There are multiple possible nucleotide combinations in one codon that encode the same amino acid. Thus, some nucleotide mutations within a codon would change the identity of the amino acid (non-synonymous mutations, or dN), while other mutations would not (synonymous mutations, or dS). For neutrally evolving amino acid sites, the ratio of non-synonymous to synonymous nucleotide mutations (dN/dS, or ω) is expected to be 1. Amino acid sites evolving under purifying and positive selection are expected to have ω < 1 and ω > 1, respectively. In other words, amino acid sites evolving under purifying selection retain very few nucleotide mutations that change the identity of the amino acid, while amino acid sites evolving under positive selection retain many of those mutations.
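The distinction between synonymous and non-synonymous changes can be made concrete with a short sketch that classifies the difference between two codons under the standard genetic code. It only labels individual changes; it does not estimate ω, which in this study was estimated by maximum likelihood in codeml. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_change(codon_a, codon_b):
    """Label a codon difference as identical, synonymous, or non-synonymous."""
    a, b = codon_a.upper(), codon_b.upper()
    if a == b:
        return "identical"
    if CODON_TABLE[a] == CODON_TABLE[b]:
        return "synonymous"
    return "non-synonymous"


print(classify_change("CTT", "CTC"))  # both codons encode leucine
print(classify_change("ATT", "ATG"))  # isoleucine versus methionine
```

The first pair differs only at the third codon position and leaves the amino acid unchanged, whereas the second pair changes the encoded amino acid.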

For the molecular evolution analyses conducted, the parameters of the null models prevent the hypotheses from being true, while those of the alternative models allow the hypotheses to be true (Table 4). To determine whether the null hypotheses could be rejected, the likelihoods (lnL values) of nested null and alternative models of evolution were compared using the Likelihood Ratio Test (OpenOffice Spreadsheet version 3.3.0).
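As an illustration of the Likelihood Ratio Test, the sketch below computes the test statistic 2(lnL_alternative − lnL_null) and its p-value from a chi-square distribution whose degrees of freedom equal the difference in the number of parameters. The study performed this calculation in a spreadsheet; the Python functions here are a hypothetical equivalent using only the standard library, and the example values are the M3 and M0 likelihoods reported in Table 5.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        total = sum(y**i / math.factorial(i) for i in range(df // 2))
        return math.exp(-y) * total
    # Odd df: complementary error function plus a finite sum.
    k = (df - 1) // 2
    total = sum(y ** (j + 0.5) / math.gamma(j + 1.5) for j in range(k))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def likelihood_ratio_test(lnl_alt, lnl_null, df):
    """Return the LRT statistic and its chi-square p-value."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    return statistic, chi2_sf(statistic, df)


# M3 (discrete) vs M0 (one ratio), 4 degrees of freedom (values from Table 5).
stat, p_value = likelihood_ratio_test(-1984.459630, -2000.427564, 4)
print(f"2*delta(lnL) = {stat:.6f}, p = {p_value:.2e}")
```

A null model is rejected when the p-value falls below the chosen significance threshold (0.05 in this study).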

Maximum likelihood models of codon substitution were implemented using the codeml program of PAML to test various hypotheses (version 4.5; Yang 2007). Given a nucleotide alignment and phylogenetic tree, the program provides likelihoods of various models of evolution and sites of possible positive selection. Ambiguous sites and gaps in the nucleotide alignment were treated as the consensus identities (same nucleotide identity as in other nucleotide sequences) and non-consensus identities (any nucleotide identity), respectively. The phylogenetic tree for molecular evolution analyses was a strict consensus between 2 topologies: the 50% majority consensus of those from parsimony analysis of the total evidence nucleotide alignment; and the 50% majority consensus of those from Bayesian analysis of the total evidence nucleotide alignment. The phylogenetic tree and nucleotide alignment were pruned to remove duplicate individuals of the same species whose scn4aa 3’ nucleotide sequences were identical. The individual that remained was the one with the smaller number of nucleotide ambiguities at the scn4aa 3’ locus. Ties were broken by the number of nucleotide ambiguities among all loci.
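The rule for choosing which duplicate individual to retain can be sketched as a two-level sort key: fewest ambiguities at scn4aa 3’ first, then fewest ambiguities across all loci. This is an illustration under stated assumptions, not the script used in this study; the record layout, the function names, the tissue labels and the toy sequences are hypothetical, and ambiguities are assumed to be recorded as IUPAC ambiguity codes.

```python
# IUPAC codes that stand for more than one possible nucleotide.
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def count_ambiguities(sequence):
    """Count ambiguous base calls in a nucleotide sequence."""
    return sum(base in AMBIGUITY_CODES for base in sequence.upper())


def pick_representative(individuals):
    """Choose one individual per group of duplicates.

    Each individual is a dict with keys 'tissue', 'scn4aa' (sequence) and
    'other_loci' (list of sequences). Primary criterion: fewest ambiguities
    at scn4aa. Tie-breaker: fewest ambiguities summed over all loci.
    """
    def sort_key(individual):
        at_scn4aa = count_ambiguities(individual["scn4aa"])
        at_all_loci = at_scn4aa + sum(
            count_ambiguities(seq) for seq in individual["other_loci"]
        )
        return (at_scn4aa, at_all_loci)

    return min(individuals, key=sort_key)


# Hypothetical duplicates that tie at scn4aa and differ at another locus.
duplicates = [
    {"tissue": "A", "scn4aa": "ACGTN", "other_loci": ["ACNNT"]},
    {"tissue": "B", "scn4aa": "ACGTN", "other_loci": ["ACGNT"]},
]
print(pick_representative(duplicates)["tissue"])
```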


Table 4. Models of Evolution Analyzed for the Gymnotus Nav1.4a C-terminus

Various null and alternative models of evolution were used to test hypotheses of codon evolution (Yang 2007). Models are categorized by specific hypothesis tested, and their fixed and free parameters are identified.

Hypothesis tested | Alternative model: name | Alternative model: parameters (fixed parameters are underlined) | Alternative model: # of free parameters | Null model: name | Null model: parameters (fixed parameters are underlined) | Null model: # of free parameters
Variation in ω among lineages | M0f Free ratio | ω1-x, where x = # of lineages | # of lineages minus 1 | M0 One ratio | ω | 1
Variation in ω among lineages | M2aII-f Branch-site model A | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 > 1 | 2 | M0 One ratio | ω | 1
Variation in ω among lineages | M0f Free ratio | ω1-x, where x = # of lineages | # of lineages minus 1 | M2aII-f Branch-site model A | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 > 1 | 2
Variation in ω among sites | M3 Discrete | p1-3; ω1; ω2; ω3 | 5 | M0 One ratio | ω | 1
Positive selection (ω > 1) at some sites in all lineages | M2a Positive selection | p1-3; ω1 < 1; ω2 ~ 1, ω3 > 1 | 4 | M1a Nearly neutral | p1-2; ω1 < 1; ω2 ~ 1 | 2
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | p1-(x+1); q1-x ≤ 1; ωx+1 > 1; x = 10 categories in a beta distribution | 4 | M7 Beta | p1-x; q1-x ≤ 1; x = 10 categories in a beta distribution | 2
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | p1-(x+1); q1-x ≤ 1; ωx+1 > 1; x = 10 categories in a beta distribution | 4 | M8a Beta & (ω=1) | p1-(x+1); ω1-x ≤ 1; ωx+1 ~ 1; x = 10 categories in a beta distribution | 3
Positive selection at some sites in some lineages | M2aII-f Branch-site model A | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 > 1 | 2 | M2aII Branch-site model A, where ω2 = 1 | background: p1-2; ω1 < 1; ω2 ~ 1; foreground: p1-2; ω1 < 1; ω2 ~ 1, ω3 ~ 1 | 2


The hypothesis “variation in ω among lineages” was tested using alternative/null model pairs for various groups of lineages: each lineage having a different dN/dS ratio (M0f) vs all lineages having similar dN/dS ratios (M0); one dN/dS ratio for lineages identified as having strong positive selection (ω > 100) from the M0f analysis and another dN/dS ratio for the other lineages (M2aII-f) vs all lineages having similar dN/dS ratios (M0); and each lineage having a different dN/dS ratio (M0f) vs one dN/dS ratio for lineages identified as having strong positive selection (ω > 100) from the M0f analysis and another dN/dS ratio for the other lineages (M2aII-f). The alternative hypothesis for variation in ω among all lineages (M0f) was analyzed with three technical replicates, due to the large number of free parameters.

The hypotheses “variation in ω among some sites” (M3 vs M0) and “positive selection (ω > 1) among some sites in all lineages” (M2a vs M1a; M8 vs M7; and M8 vs M8a) were tested assuming all lineages had similar dN/dS ratios.

The hypothesis “positive selection (ω > 1) at some sites in some lineages” was tested with one dN/dS ratio for lineages that were identified as having strong positive selection (ω > 100) from the M0f analysis (M2aII-f vs M2aII).
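The selection of foreground lineages from the free-ratio (M0f) replicates can be sketched as follows. The sketch assumes that a lineage is flagged when the median of its ω estimates across the technical replicates exceeds 100; the text specifies the ω > 100 threshold and the three replicates, but using the median as the decision rule is an assumption. The branch labels and ω values are hypothetical and are not results from this study.

```python
from statistics import median


def foreground_lineages(omega_by_branch, threshold=100.0):
    """Return branches whose median omega across replicates exceeds the threshold."""
    return sorted(
        branch
        for branch, estimates in omega_by_branch.items()
        if median(estimates) > threshold
    )


# Hypothetical omega estimates from three technical replicates per branch.
replicates = {
    "branch_1": [999.0, 999.0, 512.3],
    "branch_2": [0.12, 0.10, 0.15],
    "branch_3": [0.0001, 250.0, 0.3],
}
print(foreground_lineages(replicates))
```

Using the median makes the flag robust to a single replicate that converges on an extreme estimate, as for the third hypothetical branch.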

Positively selected sites on the voltage-gated sodium channel protein carboxyl-terminus (Nav1.4a C-terminus) amino acid alignment were identified from statistically significant models of evolution that resulted in at least 1 site class having ω > 1. Posterior probabilities of the positively selected sites were calculated using both naïve empirical Bayes (NEB) and Bayes empirical Bayes (BEB) approaches. The positively selected sites and posterior probabilities were identified relative to other Nav1.4a C-terminus amino acid alignments for comparison. These other amino acid alignments were translated from nucleotides that had been aligned in the same way as for the Gymnotus and outgroup dataset.


Chapter 3 Results

3.1 Differences Between DNA and cDNA Sequences for the scn4aa 3’

The primers amplified the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’), and there was no evidence of introns. Scn4aa DNA/cDNA sequence pairs were compared for both Electrophorus electricus (DNA sequence obtained for this study from tissue #s 2026 and 2619; cDNA sequence from GenBank Accession # M22252) and Gymnotus tigre (DNA and cDNA sequences obtained for this study from tissue #s 7090 and 7090_804pe3_1, respectively). There were no alignment gaps for either DNA/cDNA sequence pair. All nucleotides were identical for each DNA/cDNA pair, with the exception of a few ambiguous base calls from the experimental process.

3.2 Nucleotide Sequence Data

Nucleotide sequences were obtained from 59 Gymnotus individuals: 45 of which represent 19 recognized species, and 14 of which represent up to 9 undescribed species. Sequences were also obtained from 9 outgroup individuals, which represent 6 species from other gymnotiform families (Electrophoridae, Hypopomidae, and Sternopygidae). Table 3 identifies the specimens used for analysis by their scientific names, tissue sample numbers, museum catalogue numbers, and collection localities.

A total of 272 nucleotide sequences were obtained for phylogenetic analyses (excluding cDNA from tissue # 7090_804pe3_1 and Accession # M22252). For each of cytochrome b (cytb), 16S ribosome (16S), and recombination activating gene 2 (rag2), 23 sequences were collected for this study, and 44 were obtained from GenBank. For the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’), all 68 sequences were collected for this study.


The total evidence nucleotide alignment consisted of 3739 nucleotide positions, 1258 of which were parsimony informative, and another 173 were variable but parsimony uninformative. The alignment consisted of nucleotide positions from the following loci: 1139 from cytb; 555 from 16S; 1250 from rag2; and 795 from scn4aa 3’. Nucleotides from the housekeeping mitochondrial loci (cytb + 16S) included 752 variable positions, of which 686 were parsimony informative. Nucleotides from rag2 included 332 variable positions, of which 263 were parsimony informative. Nucleotides from scn4aa 3’ included 347 variable positions, of which 309 were parsimony informative.

Among the nucleotide sequences obtained, only 4.90% of nucleotides were ambiguous (proportion of ambiguous sites among nucleotides: 1506/76313 cytb nucleotides; 1513/36515 16S nucleotides; 7694/83750 rag2 nucleotides; and 1615/54855 scn4aa 3’ nucleotides). The ambiguous sites have chromatograms that do not clearly show a single nucleotide identity. Although it is possible some are polymorphic sites, it was assumed that they were due to experimental error for the purposes of phylogenetic analyses.
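The overall proportion of ambiguous nucleotides follows directly from the per-locus counts given above. The short sketch below reproduces that arithmetic and also reports the per-locus percentages; the variable names are hypothetical.

```python
# (ambiguous nucleotides, total nucleotides) per locus, as reported in the text.
ambiguity_counts = {
    "cytb": (1506, 76313),
    "16S": (1513, 36515),
    "rag2": (7694, 83750),
    "scn4aa 3'": (1615, 54855),
}

total_ambiguous = sum(amb for amb, _ in ambiguity_counts.values())
total_nucleotides = sum(total for _, total in ambiguity_counts.values())
overall_percent = 100.0 * total_ambiguous / total_nucleotides

for locus, (amb, total) in ambiguity_counts.items():
    print(f"{locus}: {100.0 * amb / total:.2f}% ambiguous")
print(f"overall: {overall_percent:.2f}% ambiguous")
```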

This dataset represents the most complete sampling of Gymnotus nucleotide sequence data. Compared to the most recent molecular phylogenetic reconstruction of Gymnotus (Lovejoy et al. 2010), this dataset includes 10 additional Gymnotus species (2 additional recognized species, and up to 8 additional undescribed species) as well as an additional locus.

3.3 Phylogenetic Reconstruction

Molecular phylogenetic analyses were conducted using nucleotide alignments of various loci (cytb, 16S, rag2, and scn4aa 3’) and the total evidence alignment, using both maximum parsimony (MP) and Bayesian inference (BI) algorithms. The 50% majority-rule consensus topologies are shown in Figures 6-8. The strict consensus topology for Gymnotus from the total evidence nucleotide alignments using MP and BI algorithms is shown in Figure 9.

The MP consensus topologies were produced from the most parsimonious trees based on analyses of various loci: housekeeping mitochondrial (3148 trees), rag2 (3931 trees), and scn4aa 3’ (1622 trees); and the total evidence nucleotide alignment (742 trees). The BI consensus topologies all resulted from analyses where the standard deviation of split frequencies was ≤ 0.01. The potential scale reduction factors (psrfs) of the topologies from various loci were within 0.05 of the convergence diagnostic value of 1.00, and the burnin cutoff fell after the log probability had plateaued. Although the psrf for the total evidence nucleotide alignment was 3.586, the burnin cutoff fell after the log probability had plateaued.

The genus Gymnotus was resolved as a monophyletic group based on phylogenetic reconstruction of each locus using an MP algorithm (Figure 6), the scn4aa 3’ locus using a BI algorithm (Figure 7), and the total evidence nucleotide alignment using either algorithm (Figure 8).

The outgroup consisted of gymnotiform species belonging to families outside of the family Gymnotidae (Electrophoridae, Hypopomidae, and Sternopygidae). The closest outgroup family to Gymnotus was identified as Electrophoridae (E. electricus) based on the housekeeping mitochondrial and total evidence nucleotide alignments using an MP algorithm (Figures 6 and 8). However, the closest outgroup was identified as Sternopygidae (Sternopygus astrabes and Sternopygus macrurus) based on the scn4aa 3’ and total evidence alignments using a BI algorithm (Figures 7 and 8). Sternopygidae was also identified as the closest outgroup based on the rag2 and scn4aa 3’ loci using an MP algorithm, although bootstrap values were either lower or there was no corresponding node from the bootstrap phylogeny (Figure 6).

Three major monophyletic clades were consistently resolved within Gymnotus (Figures 6-8; clade names as per Lovejoy et al. 2010): Gymnotus carapo group; G2 group; and G1 group. The G1 group was identified as the sister clade of a group composed of the other two major clades based on the housekeeping mitochondrial and rag2 loci using an MP algorithm (Figure 6), as well as from the total evidence nucleotide alignment using either algorithm (Figure 8). Although the G2 group was identified as the sister clade of the other two major clades based on the scn4aa 3’ alignment using either algorithm, it was not well supported.

Within the G. carapo group, there were five lineages for which phylogenetic topology varied among all three loci, whether an MP or BI algorithm was used for analysis (Figures 6 and 7): Gymnotus n. sp. (tissue # 2956); G. carapo (tissue #s 2040 and 2041); Gymnotus omarorum (tissue #s 7092 and 7093); Gymnotus obscurus (tissue #s 2017 and 2018); and the Gymnotus pantanal and Gymnotus sp. xingu lineage (tissue #s 7076 and 7035, respectively).


[Figure 6 panels: based on the nucleotide alignment from cytochrome b & 16S ribosome; based on the nucleotide alignment from recombination activating gene 2; based on the nucleotide alignment from the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus.]

Figure 6. Molecular Phylogeny for Gymnotus Based on Various Alignments Using Maximum Parsimony

Phylogenetic reconstruction was conducted based on the nucleotide alignments of various loci using maximum parsimony. The 50% majority-rule consensus topologies are shown. Numbers above the branches indicate bootstrap values. The clades are coloured as follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).


[Figure 7 panels: based on the nucleotide alignment from cytochrome b & 16S ribosome; based on the nucleotide alignment from recombination activating gene 2; based on the nucleotide alignment from the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus.]

Figure 7. Molecular Phylogeny for Gymnotus Based on Various Alignments Using Bayesian Inference

Phylogenetic reconstruction was conducted based on the nucleotide alignments of various loci using Bayesian inference. The 50% majority-rule consensus topologies are shown. Numbers above the branches indicate posterior probabilities. The clades are coloured as follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).


[Figure 8 panels: based on maximum parsimony; based on Bayesian inference.]

Figure 8. Molecular Phylogeny for Gymnotus Based on the Total Evidence Alignment

Phylogenetic reconstruction was conducted using the total evidence nucleotide alignment from Gymnotus, consisting of nucleotide sequences from cytochrome b, 16S ribosome, recombination activating gene 2, and the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus. The 50% majority-rule consensus topologies are shown. Numbers above the branches indicate bootstrap values and posterior probabilities, respectively. The clades are coloured as follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).


There were three Gymnotidae lineages in addition to the 3 major monophyletic clades: the Gymnotus cylindricus lineage, the Gymnotus tigre lineage, and the Gymnotus pantherinus lineage. Gymnotus cylindricus was identified as the sister lineage to the G. carapo group based on the scn4aa 3’ alignment using an MP algorithm, the housekeeping mitochondrial and rag2 nucleotide alignments using a BI algorithm, and the total evidence nucleotide alignment using either algorithm (Figures 6-8). G. tigre was identified as basal to the G. carapo group + G. cylindricus lineage based on those same nucleotide alignment + algorithm combinations, as well as the scn4aa 3’ alignment using a BI algorithm (Figure 7). G. pantherinus was identified as basal to the G2 clade based on the housekeeping mitochondrial alignment using an MP algorithm, and the scn4aa C-terminus and total evidence alignments using either algorithm.

The topology based on the total evidence nucleotide alignment using a BI algorithm seemed to be slightly better resolved than using an MP algorithm (Figure 8). Some G. carapo lineages (tissue #s 2040 and 2041) were topologically variable within the G. carapo group, and better supported using a BI algorithm. One Gymnotus varzea lineage (tissue # 2014) was identified as being derived from the other (tissue # 2015) based on the total evidence nucleotide alignment using a BI algorithm and not any of the other phylogenetic analyses, but this relationship was not well supported.

3.4 Patterns of Gymnotus scn4aa C-terminus Nucleotide Sequence Variation

There were 43 variable sites on the Gymnotus Nav1.4a C-terminus amino acid alignment. Various hypotheses for patterns of nucleotide sequence variation were tested on the scn4aa 3’ nucleotide alignment using codon-based analyses (Table 4).

Variation in the ratio of non-synonymous to synonymous substitutions (dN/dS = ω) among lineages was supported for certain lineages (Table 5). Estimating a separate ω for each Gymnotus lineage (M0f) was not a significantly better fit for the data than the null model of one ω for all lineages (M0). However, the separate estimations of ω for each Gymnotus lineage (M0f) consistently identified seven lineages as having very high ω values (ω > 100; three technical replicates). The seven lineages and their median ω values are identified on Figure 9.


The seven lineages were confirmed as positively selected, since estimating ω values for those seven lineages separately from the other lineages (M2aII-f) was a significantly better fit for the data than the null model (same ω for all lineages, M0). In addition, estimating a separate ω for each Gymnotus lineage (M0f) was not a significantly better fit for the data than the null model (an ω for the seven lineages with very different ω values and another ω for the rest of the lineages, M2aII-f).

Variation in ω among amino acid sites was supported (Table 5). Estimating more than one ω for all amino acid sites (M3) was a significantly better fit for the data than the null model (one ω for all amino acid sites, M0). The ω values were estimated for three site classes using the M3 model: 0.00 (70.9% of sites), 0.90 (0.001% of sites), and 0.93 (29.1% of sites).

Positive selection (ω > 1) at some amino acid sites across all lineages was not supported (Table 5). None of the three alternative models of positive selection were a significantly better fit for the data than their null models: M2a vs M1a; M8 vs M7; and M8 vs M8a. In addition, none of the estimated ω values were > 1. The ω values were estimated for three site classes using the M2a model: 0.00 (71.7% of sites), and 1.0 (28.3% of sites) for the other two classes. The ω values were estimated for 11 site classes using the M8 model: 0.00 (8.89% of sites) for the first seven classes, 0.00016 (8.89% of sites), 0.84 (8.89% of sites), and 1.0 (19.99% of sites) for the last two classes.

Positive selection (ω > 1) at some amino acid sites in the seven lineages with very different ω values was supported (Table 5). Estimating ω values for those seven lineages separately from the other lineages (M2aII-f) was a significantly better fit for the data than the null model (limiting some of the ω values to 1, M2aII). The ω values of the seven lineages were estimated for two site classes using the M2aII-f model (Table 6): 999 for both site classes (14.2% of sites for both of the site classes combined). The ω values for the other site classes were fixed: 0.000 (65.0% of sites), and 1.00 (20.7% of sites).


Table 5. Results of PAML Analyses of Gymnotus Nav1.4a C-terminus Codon Evolution

Hypotheses regarding patterns of codon evolution were tested using models of molecular evolution implemented in the codeml program of PAML version 4.5 (Yang 2007). See Table 4 for a summary of hypotheses tested. To determine whether the alternative models of evolution were significantly better at describing the data than the null models (p-value < 0.05), the likelihood values were compared using the likelihood ratio test.

Hypothesis tested | Alternative model: name | Alternative model: lnL (L = likelihood value) | Alternative model: # of parameters | Null model: name | Null model: lnL (L = likelihood value) | Null model: # of parameters | Likelihood ratio test value | Degrees of freedom | p-value
Variation in ω among lineages | M0f Free ratio¹ | -1975.238063 | 139 | M0 One ratio | -2000.427564 | 71 | 50.37900200 | 68 | 0.95
Variation in ω among lineages | M2aII-f Branch-site model A | -1976.626775 | 74 | M0 One ratio | -2000.427564 | 71 | 47.60157800 | 3 | 2.6 x 10^-10
Variation in ω among lineages | M0f Free ratio¹ | -1975.238063 | 139 | M2aII-f Branch-site model A | -1976.626775 | 74 | 2.777424000 | 65 | 1.0
Variation in ω among all sites | M3 Discrete | -1984.459630 | 75 | M0 One ratio | -2000.427564 | 71 | 31.93586800 | 4 | 2.0 x 10^-6
Positive selection (ω > 1) at some sites in all lineages | M2a Positive selection | -1984.50491 | 74 | M1a Nearly neutral | -1984.50491 | 72 | 0.0000000000 | 2 | 1.0
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | -1984.472282 | 74 | M7 Beta | -1984.488361 | 72 | 0.0321580000 | 2 | 0.98
Positive selection (ω > 1) at some sites in all lineages | M8 Beta & ω | -1984.472282 | 74 | M8a Beta & (ω=1) | -1984.504914 | 73 | 0.0652640000 | 1 | 0.80
Positive selection (ω > 1) at some sites in some lineages | M2aII-f Branch-site model A | -1976.626775 | 74 | M2aII Branch-site model A, where ω2 = 1 | -1982.112158 | 73 | 10.97076600 | 1 | 9.3 x 10^-4

¹ Analysis results from the first technical replicate are listed.


Figure 9. Molecular Phylogeny for Gymnotus and Positively Selected Lineages

A strict consensus topology was determined from the 50% majority-rule consensus topologies from maximum parsimony and Bayesian inference based reconstruction of the total evidence nucleotide alignment. The median ratio of non-synonymous to synonymous substitutions (dN/dS = ω) from 3 technical replicates of the alternative hypothesis for variation in ω among all lineages (M0f) is shown above each branch. Estimates of ω were not obtained for the 2 most basal branches, because the analytical methods require a basal polytomy for the phylogenetic topology. Branches with ω > 100 are coloured grey. The clades are coloured as follows: G. carapo group (green); G2 group (light blue); and G1 group (dark blue).


Table 6. Nav1.4a C-terminus ω ratios for Gymnotus from the branch-site model A

The alternative model of codon evolution branch-site model A (M2aII-f) was tested using the codeml program of PAML version 4.5 (Yang 2007) using seven defined foreground lineages. See Table 5 for comparative results with the null model of evolution branch-site model A, where ω2 = 1 (M2aII). Ratios of non-synonymous to synonymous substitutions (dN/dS = ω) are listed for each site class, with those from site classes of fixed ω values highlighted in grey.

Site class | 0 | 1 | 2a | 2b
% of sites | 65.0 | 20.7 | 10.8 | 3.4
ω values for the 7 lineages | 0.000 | 1.00 | 999 | 999
ω values for the other lineages | 0.000 | 1.00 | 0.000 | 1.00


3.5 Positively Selected Sites on the Gymnotus Nav1.4a C-terminus Amino Acid Alignment

Sites of possible positive selection in the seven lineages with very high ω values were identified on the voltage-gated sodium channel protein Nav1.4a carboxyl-terminus (Nav1.4a C-terminus) amino acid alignment using the M2aII-f model of evolution (Table 7). Posterior probabilities were calculated using both naïve empirical Bayes (NEB) and Bayes empirical Bayes (BEB) approaches. The NEB implementation for M2aII-f resulted in eight sites identified as positively selected, with posterior probabilities ≥ 95%. The BEB implementation resulted in all sites being identified as positively selected, with posterior probabilities ≥ 79.0%. This included sites with no amino acid variation. Eight sites had posterior probabilities ≥ 95%, and they were at the same locations as those identified using NEB. Amino acid identities at the positively selected sites vary among the seven lineages and other Gymnotus species sampled (Table 8).

Locations of the positively selected sites were identified relative to amino acid sequences from the Nav1.4a C-terminus of other Gymnotus and gymnotiform fishes, homologs from an ostariophysian model species for which the genome has been sequenced (Nav1.4a and Nav1.4b of Danio rerio), and homologs for which there is more research on protein function (Nav1.4, Nav1.5, and Nav1.2 of Homo sapiens).


Table 7. Amino Acid Alignment for the Nav1.4a C-terminus Showing Positively Selected Sites Relative to Motifs of Functional Significance

The voltage-gated sodium channel carboxyl-terminus (Nav1.4a C-terminus) amino acid sites evolving under positive selection in the 7 Gymnotus lineages with positively selected Nav1.4a C-terminus sequences are identified relative to motifs of functional significance. The Danio rerio Nav1.4a C-terminus amino acid sequence was used as the reference sequence during the alignment process. The Nav1.4a consensus sequence from other gymnotiforms and other paralogs of the Nav from commonly used model species are included here for comparison. The significance of motifs highlighted in grey or identified by *, correspond to the legend on the left of that row. The PY motif is identified on the Homo sapiens and rat Nav1.2 and Nav1.5 sequences by a green background (Cormier et al. 2002; Rougier et al. 2005). Phosphorylation sites on the Electrophorus electricus Nav1.4a are identified by a red background (Emerick et al. 1993). Phosphorylation sites on the rat Nav1.2 are identified by a magenta background (Berendt et al. 2010). Models of evolution used to determine sites evolving under positive selection were implemented in the codeml program of PAML version 4.5 (Yang 2007). Positively selected sites were calculated using naïve empirical Bayes (NEB) and Bayes empirical Bayes (BEB) approaches, and identified on the Nav1.4a C-terminus amino acid alignment by “+”. Those sites are coloured by posterior probabilities as follows: 100 % (dark blue); > 99 % (blue); and > 95 % (light blue).

Helices I-IV of the EF-hand, and helix V a I II III IV
Ca2+ binding b & interaction sites with CaM (*) c
IQ & its interaction sites with the EF-hand (*) d * *** ** * * * * * * ** **
6 H. sapiens Nav1.2 ...SV.T...AE..S....E..Y.V.....P...... EFAK.S..A...DP..L.....KVQ..A..L.M.S..R..CL...F.F.KR...ES
5 H. sapiens Nav1.5 ...SV.T...TE..S...... Y.I.....PE.....EYSV.S..A...S...... QIS..N..L.M.S..R..CM...F.F.KR...ES
4 H. sapiens Nav1.4 ....V.T...SE..G....E..Y...... P...... A.SR.S....T...... KI...TL.L.M.P.....CL...F.L.K.....S
3 D. rerio Nav1.4b ....V.T...S...... E..Y...... PT.S.....NR.SE.C.T.KD....P...T....T....M.T.....CL.L...L.G....GS
2 D. rerio Nav1.4a ENFNNAQEESGDPLCEDDFDMFDETWEKFDVDATQFIDYDRLFDFVDALQEPLRIAKPNRLKLISMDIPIVNGDKIHSQDILLAVTREVLGDT
1 gymnotiforms Nav1.4a ....L...... L...... D...... KP.....AKTNLSVSAE....CL.L..G..Q...... GN H SEA M I DRY FEG FYLE DRVPR IES L MHVPN HQ VN MY TE PFV V I K SV Y T L N H IHS Y K S N M Q R I SY Q S Q L LNT V M V Q Q T
Gymnotus Nav1.4a ...GV.....S...... C...... L.....G...L..NQV....AALE..M..PKPN.HR.AKMDLNV.M....PYL...... TQ...... D L Q N S CM K V S I I V T
Site positions of the C-terminus amino acids
0 0 0 0 0 0 0 0 0 0...
0 1 2 3 4 5 6 7 8 9...
123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
Model of Evolution Approach for calculating P Sites on the Nav1.4a C-terminus
M2aII-f NEB + + + +
BEB + + + +


Helices I-IV of the EF-hand, and helix V a V
Ca2+ binding b & interaction sites with CaM (*) c ********* **** * * *
IQ & its interaction sites with the EF-hand (*) d IQ-motif
6 H. sapiens Nav1.2 G....LRIQM.ER.MAS..SKVSY..IT...K..Q..VS.III..A..RY..KQKVKKVSSIYKKDKGKECD-QGT-.IK.DTLID.L.EN-S
5 H. sapiens Nav1.5 G....L.IQM.E..MAA..SKISY..IT...K..H..VS.MVI..AF.R.....SLKH.S.LFRQQAGSGL-SEEDA..R.....YV.SENFS
4 H. sapiens Nav1.4 G....L.QTM.E..MAA..SKVSY..IT...K..H..VC.IKI..A..R...Q.SMKQ.SYMYRHSHDGS---GDDA..K...L.NT.SKM.G
3 D. rerio Nav1.4b DQ..G..ATM.E..MAN..SK.SY..ITS..K..Q..V..STI..A..S.I...CVKQ.SYMYRD.TGSK-KPTG.A..KV.M..EN.RS..G
2 D. rerio Nav1.4a IEMDAMKESIEAKFIMNNPTSASFEPIITTLRRKEEERAAIAVQRIYRRHLLKRAIRYACFMRQSKRKVRNPNDNEPPETEGLIARKMNTLYG
1 Gymnotiformes Nav1.4a ...A...Q..Q.....D..IFE...... H...II.KA..Q...... L...A..H...-.KHE.-...A..-.....H...... A E GL VM IKKLHSNHLF VV M WR AK SKM MF Y FM VVHH SLLQCC-Q-RNM-D-DDIADDDS VEQ SA FR P S R N VLLSSTRITT DN LV R Q V SRIE E QRGHEGGMLPE T I S T T S S QTT SLV LQ M S TM D VSKKNTVVVSG V V V T P VW V V R G TTQQ K V Y R V M S N Q
Gymnotus Nav1.4a ...E...K...... LLD..GP.FC..V...... A..KVI..A...Y.....MEH.S.LSR..D.KL-EEQDDAVLE.....Q..SVLYD T R KT ST Q T V Q VQ L ER MEMQ M K V G V QT V S
Site positions of the C-terminus amino acids
1 1 1 1 1 1 1 1 1. . .
0 1 2 3 4 5 6 7 8. . .
456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456
Model of Evolution Approach for calculating P Sites on the Nav1.4a C-terminus
M2aII-f NEB + + + +
BEB + + + +


Helices I-IV of the EF-hand, and helix V a
Ca2+ binding b & interaction sites with CaM (*) c
IQ & its interaction sites with the EF-hand (*) d
6 H. sapiens Nav1.2 -T..KT-DMTP------STTS..SYDS..K--PEK---EKFEKDKSEKEDKGKDI------....------.-R..KK
5 H. sapiens Nav1.5 RPLGPPSSSS-----ISST.F..SYDS..R--.--T-SD-NL-..-Q-.R-G---SDY.HSEDLADFPPSPDRD----.-R....
4 H. sapiens Nav1.4 HENGNSSSPSP.EKGEAGDAG.TMGLMPIS--P.D-TAW.PAPPPGQT.RPGVKES------....------L.V-----
3 D. rerio Nav1.4b DQAVED-DHPVG----CSF..HG.TQFGAKRPPVKVQSDVVLHSA--.F-PVP.SST-A.--D-....--.------L.-R....
2 D. rerio Nav1.4a SNPELAMALELETRPMRPNSQPPKPSQVTQTRASVTFPRPQGQ--LIPVELTSEVILRSAPTTH----SFNSSENATT-IKESIV
1 Gymnotiformes Nav1.4a ...... QA...LA..RM.-DFK..----A..D..---..I.....D.N....H.....I....--...----...R.... FGS PS PMDE IEALPDHP TS SIRE-SNQIPESHS-TLA-PVP I V AV TNEIRLHS H---FSEAIVD TI GK T TQ G PKSTVTK LSIP L Y Q PD S Q K I V MV QNCFH GEL V V M Q Q V V TGTQ VGM N V S R Y Y
Gymnotus Nav1.4a I.A...... QAK.ILAQTRMPS-LK-----.P..Y------PN...I.V.N....H...MVR....-Q...FSRAL.VR.... P R M R V G S K T
Site positions of the C-terminus amino acids
1 2 2 2 2 2 2 2..... 2
9 0 1 2 3 4 5 6..... 7
7890123456789012345678901234567890123456789012345678901234567890123456789012345678901
Model of Evolution Approach for calculating P Sites on the Nav1.4a C-terminus
M2aII-f NEB
BEB

a, d From Chagot et al. 2009. c From Chagot and Chazin 2011. b From Miloushev et al. 2009.

1 Except Gymnotidae and Apteronotidae sequences. 2 From Accession # DQ149506. 3 From Accession # DQ149505. 4 From Accession # BC172375. 5 From Chagot et al. 2009; and Accession # BC172375. 6 From Miloushev et al. 2009; and Accession # NG_008143.


Table 8. Amino Acid Identities of Positively Selected Sites on the Nav1.4a C-terminus for Various Gymnotus Species

Models of evolution testing for positive selection were implemented in the codeml program of PAML version 4.5 (Yang 2007). Amino acid identities for each Gymnotus species were determined from the translated nucleotide alignment of the Gymnotus Nav1.4a carboxyl-terminus. Properties of amino acids were taken from the CRC Handbook of Chemistry and Physics (91st Edition) and Kyte and Doolittle (1982). The larger the hydropathy number, the more hydrophobic the amino acid.

Amino acid site # of the Nav1.4a C-terminus (see Table 7) | Amino acid identity | Properties that differ among amino acid identities at the same site | Gymnotus species/lineages with the specified amino acid identity (tissue #s are identified in brackets if applicable)
--- | --- | --- | ---
20 | Leucine (L) | Non-polar; bigger | (consensus identity among Gymnotus)
20 | Cysteine (C) | Polar; smaller | All members of a (positively selected) lineage, consisting of: G. cataniapo, G. n. sp. FRITZI, G. aff. anguillaris, and G. pantherinus.
55 | Isoleucine (I) | Hydropathy 4.5 | (consensus identity among Gymnotus)
55 | Methionine (M) | Hydropathy 1.9 | All members of a (positively selected) lineage, consisting of: G. carapo (2004, 2006, and 2007), G. ucamara, and G. arapaima.
69 | Threonine (T) | Hydropathy -0.7; smaller | (consensus identity among Gymnotus)
69 | Asparagine (N) | Hydropathy -3.5; bigger | All members of a (positively selected) lineage, consisting of: G. xingu, and G. pantanal.
69 | Serine (S) | Hydropathy -0.8; bigger | All members of the lineage, consisting of: G. chaviro, and G. varzea.
85 | Valine (V) | Smaller | (consensus identity among Gymnotus)
85 | Isoleucine (I) | Bigger | All members of a (positively selected) lineage, consisting of: G. xingu, and G. pantanal.
94 | Isoleucine (I) | Neutral | (consensus identity among Gymnotus)
94 | Threonine (T) | Polar | The (positively selected) lineage: G. curupira.
113 | Glycine (G) | Smaller | (consensus identity among Gymnotus)
113 | Serine (S) | Bigger; in Electrophorus electricus, this site has been determined to be a serine phosphorylation site (Emerick et al. 1993) | All members of a (positively selected) lineage, consisting of: G. cataniapo, G. n. sp. FRITZI, G. aff. anguillaris, and G. pantherinus.
134 | Lysine (K) | Basic; bigger | (consensus identity among Gymnotus)
134 | Glutamine (Q) | Polar; smaller | All members of a (positively selected) lineage, consisting of: G. coropinae (2025, 2036, and 2037).
154 | Phenylalanine (F) | Hydropathy 2.8; bigger | (consensus identity among Gymnotus)
154 | Valine (V) | Hydropathy 4.2; smaller | The (positively selected) lineage, consisting of: G. curupira. All members of a (positively selected) lineage, consisting of: G. xingu, and G. pantanal. All other members of the same monophyletic lineage, including: G. cf. tigre, G. obscurus, G. chaviro, and G. varzea.
154 | Leucine (L) | Hydropathy 3.8; smaller | The (positively selected) lineage: G. coropinae 2025. The lineage: G. jonasi.
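The hydropathy comparisons in Table 8 can be restated as a signed shift between the consensus residue and each variant. The following is a minimal sketch, not part of the original analysis: the function name `hydropathy_shift` and the list `SUBSTITUTIONS` are illustrative, and the dictionary holds only the Kyte and Doolittle (1982) values quoted in the table.

```python
# Kyte-Doolittle hydropathy values, restricted to the residues quoted in Table 8.
# Larger values indicate more hydrophobic residues.
HYDROPATHY = {
    "I": 4.5, "M": 1.9, "T": -0.7, "N": -3.5,
    "S": -0.8, "F": 2.8, "V": 4.2, "L": 3.8,
}


def hydropathy_shift(consensus, variant):
    """Return variant hydropathy minus consensus hydropathy (one decimal place)."""
    return round(HYDROPATHY[variant] - HYDROPATHY[consensus], 1)


# (site, consensus residue, variant residue) for the Table 8 rows that report hydropathy.
SUBSTITUTIONS = [
    (55, "I", "M"),
    (69, "T", "N"),
    (69, "T", "S"),
    (154, "F", "V"),
    (154, "F", "L"),
]

if __name__ == "__main__":
    for site, consensus, variant in SUBSTITUTIONS:
        shift = hydropathy_shift(consensus, variant)
        direction = "more" if shift > 0 else "less"
        print(f"site {site}: {consensus}->{variant} shift {shift:+.1f} ({direction} hydrophobic)")
```

A negative shift means the variant is less hydrophobic than the consensus residue. Under these values, the site 55 and site 69 variants are less hydrophobic, while both site 154 variants are more hydrophobic.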


Chapter 4 Discussion

4.1 Evolutionary Relationships Among Gymnotus

There is currently a comprehensive phylogeny of genus Gymnotus based on morphology (Albert et al. 2004), as well as one based on both morphology and nucleotide sequences (Lovejoy et al. 2010). Some of the proposed phylogenetic relationships from morphological and nucleotide data are consistent with each other, while some are unclear (Figure 3). This project used additional taxa and nucleotide sequences to provide further evidence towards clarifying phylogenetic relationships among Gymnotus.

The genus Gymnotus and the Gymnotus carapo group were both well supported as monophyletic, consistent with both existing phylogenies (Figures 3 and 6-8). Within the G. carapo group, the G. carapo complex (Albert et al. 2004; a subset of the G. carapo group that includes G. carapo, Gymnotus arapaima, and Gymnotus choco) was resolved as monophyletic, consistent with both existing phylogenies. However, it was weakly supported unless Gymnotus mamiraua and some of the new taxa were included. The most basal G. carapo variant was well supported as the same one identified in the existing nucleotide-based phylogeny (Lovejoy et al. 2010).

Within the G. carapo complex + G. mamiraua clade, the placements of the new taxa Gymnotus omarorum and Gymnotus n. sp. were not well resolved. Within the G. carapo group, the placements of Gymnotus obscurus and some of the other new taxa (Gymnotus pantanal and Gymnotus sp. xingu) were not well resolved either.

The G1 and G2 groups were both well supported as monophyletic, but not as a single monophyletic clade that includes Gymnotus pantherinus. This is consistent with the existing nucleotide-based phylogeny (Lovejoy et al. 2010). As expected, the G1 group was resolved and well supported as the most basal Gymnotus clade when reconstructed with the same housekeeping mitochondrial and nuclear loci as the existing nucleotide-based phylogeny. The G. pantherinus taxon was well supported as basal to the G2 group, which had not been clear from existing phylogenies.


The Gymnotus cylindricus taxon was well supported as the sister to the G. carapo group, which confirms a suggestion from Lovejoy et al. (2010). The G. tigre taxon was well supported as basal to the G. carapo + G. cylindricus clade. The placement of Gymnotus tigre may seem inconsistent with the existing nucleotide-based phylogeny (Lovejoy et al. 2010). However, this was simply a case of specimen re-identification. The G. tigre specimens used for this project were adult fish, whose morphological features are more easily identified (James Albert and Nathan Lovejoy, personal communication). The placement of the juvenile Gymnotus cf. tigre specimens remained consistent with the existing nucleotide-based phylogeny, and these specimens likely represent a species other than G. tigre (Lovejoy et al. 2010).

4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction

The portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl terminus (the scn4aa 3’ locus) was one of several loci used to reconstruct the Gymnotus phylogeny. This locus is approximately 800 nucleotides long (Noda et al. 1984), and nucleotide sequences were obtained from 28 Gymnotus species. Analyses of these sequences showed that the scn4aa 3’ locus contributes towards a meaningful and accurate phylogenetic topology, with a reasonable amount of resolution.

The aligned nucleotides were from an orthologous locus, which contributed towards meaningful reconstruction of the phylogeny among Gymnotus species (Fitch 2000). The scn4aa gene is one of two paralogs expressed in actinopterygian myogenic tissue, and one of eight paralogs encoded in the actinopterygian genome (Novak et al. 2006; Widmark et al. 2011). The scn4aa 3’ amplification primers were designed to be specific for the orthologous sequences, and they amplified only those sequences rather than sequences from other paralogs.

The nucleotide alignment had a large proportion of parsimony-informative characters, and the proportion of ambiguous characters was low. This contributed towards accurate reconstruction of the phylogeny among Gymnotus species (Wiens 1998; Hall 2011). Absence of alignment gaps for Electrophorus electricus and Gymnotus tigre scn4aa 3’ DNA/cDNA sequence pairs confirms the absence of introns at this locus in gymnotiforms (Widmark et al. 2011). This increases the chance of an accurate alignment, since introns tend to be more variable in length (Hughes and Yeager 1997). Also, amino acids are more conserved than nucleotides, and alignments of those sequences can be used to mitigate mis-alignment of exon indels among species (Wernersson and Pedersen 2003). The scn4aa 3’ sequences contained the highest proportion of parsimony-informative characters per total characters among the loci in the dataset. In addition, the scn4aa 3’ sequences only contained 2.94% ambiguous characters, compared with 4.90% from the whole dataset.
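The two alignment statistics discussed above can be sketched in a few lines. This is a minimal illustration, not the software used for the thesis. It assumes the usual definition of a parsimony-informative site (at least two character states, each present in at least two sequences) and treats any symbol other than A, C, G, T, or a gap as ambiguous. The function names and the toy alignment are hypothetical.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def is_parsimony_informative(column):
    """True if at least two unambiguous states each occur in at least two sequences."""
    counts = Counter(base for base in column.upper() if base in UNAMBIGUOUS)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def alignment_summary(sequences):
    """Return (proportion of informative sites, proportion of ambiguous characters)."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    columns = ["".join(seq[i] for seq in sequences) for i in range(length)]
    informative = sum(is_parsimony_informative(col) for col in columns)
    # Assumption: gaps ('-') are not counted as ambiguous characters.
    ambiguous = sum(
        1 for seq in sequences for ch in seq.upper()
        if ch not in UNAMBIGUOUS and ch != "-"
    )
    return informative / length, ambiguous / (length * len(sequences))


if __name__ == "__main__":
    toy_alignment = ["ACGTA", "ACGTA", "ATGCA", "ATNCA"]  # hypothetical data
    informative, ambiguous = alignment_summary(toy_alignment)
    print(f"parsimony-informative: {informative:.2%}, ambiguous: {ambiguous:.2%}")
```

In the toy alignment, columns 2 and 4 are informative and one character (N) is ambiguous.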

The nucleotide characters of scn4aa 3’ seemed to be reasonably variable, which contributed towards resolution of the phylogeny among Gymnotus species (Brown et al. 1979). Voltage-gated sodium channels are highly conserved in nucleotide sequence and function across species (Goldin 2002). However, scn4aa in Actinopterygii had been predicted to vary in nucleotide sequence (Novak et al. 2006). This variability was confirmed within the actinopterygian order Gymnotiformes (Zakon et al. 2006; Arnegard et al. 2010), and within the genus Gymnotus (in this project). When scn4aa 3’ sequences are included in the phylogenetic reconstruction, the proposed evolutionary relationships among Gymnotus agree with the existing morphology-based and nucleotide-based phylogenies wherever those two phylogenies agree with each other. Inclusion of the scn4aa 3’ sequences increased the phylogenetic resolution, since some evolutionary relationships are proposed where they had previously been unresolved (e.g., clarifying the placements of G. pantherinus and G. cylindricus).

When characters at a locus vary at similar rates among lineages, the resulting phylogeny may be used as a primary means for estimation of species divergence timing (Schwartz 2007). However, characters are unlikely to vary at similar rates among lineages if they were subjected to selective pressures that resulted in divergence of those species. The voltage-gated sodium channel protein Nav1.4a in gymnotiform fishes may be an example of the latter case, since the protein has an important role in characteristics that may be under selective pressure among some lineages.

4.3 Natural Selection at the Nav1.4a C-terminus Among Gymnotus lineages

Zakon et al. (2006) and Arnegard et al. (2010) presented analyses of patterns of selection at the voltage-gated sodium channel protein Nav1.4a among gymnotiforms and non-electric fish. These authors focused on motifs at and between the homologous domains of the protein. Purifying selection was detected among lineages of non-electric fish, and neutral (or relaxed) selection was detected among basal lineages of gymnotiforms. Positive selection was also detected among gymnotiform lineages, but the analysis only included four species representing four gymnotiform families (Zakon et al. 2006). In contrast, the project described here focused on motifs of the Nav1.4a carboxyl-terminus (C-terminus) that may be involved in varying amplitudes and frequencies of electric organ discharges (EODs). While only one of the gymnotiform families was represented, the species sample was seven times larger. Variation in selection among lineages of Gymnotus was detected, including statistically significant positive selection in seven lineages.

For most Gymnotus lineages, the amino acids of the Nav1.4a C-terminus seem to be evolving under purifying selection (Figure 9). This is consistent with purifying selection being identified for other motifs of the Nav1.4a in the Gymnotus cylindricus taxon (Arnegard et al. 2010). Purifying selection on the Nav1.4a suggests that the EODs of most Gymnotus species are generally adapted to their habitats, with little benefit to novel variation. The order that includes the genus Gymnotus (order Gymnotiformes) diverged from other ostariophysan orders approximately 100 million years ago (Alves-Gomes 1999), and the genus Gymnotus diverged from other gymnotiform families approximately 56.6 million years ago (Lovejoy et al. 2010). Since then, Gymnotus species have adapted to a large variety of ecological habitats (Lissman 1958), among various distinct hydrogeographic regions (Albert et al. 2005). Observations of these fishes indicate that their species-specific EOD characteristics are already fairly constrained by their abiotic environment and biotic evolutionary pressures (Stoddard 1999; Alves-Gomes 2001; Stoddard 2002).

For two Gymnotus lineages, the amino acids seem to be evolving under neutral selection (Figure 9). This is consistent with neutral selection being identified for other motifs of the Nav1.4a among basal lineages of gymnotiforms (Arnegard et al. 2010). Neutral selection on the Nav1.4a indicates that the EODs of those lineages are less constrained by abiotic and/or biotic pressures. The habitat of one of the lineages evolving under neutral selection (the Gymnotus cylindricus lineage) is geographically isolated relative to other Gymnotus species (Lovejoy et al. 2010), and is devoid of most electroreceptive predators with ampullary electroreceptors, including siluriforms (catfishes) and Potamotrygonidae (river stingrays), as well as the electric eel Electrophorus electricus (Szabo et al. 1972; Szamier and Bennett 1980; Lovejoy 1996; Stoddard 1999; Alves-Gomes 2001; Stoddard 2002). The G. cylindricus lineage may be less constrained by biotic pressures, since previous analyses have suggested predation as an important evolutionary pressure for increased EOD complexity (Stoddard 1999).

For seven Gymnotus lineages, the amino acids are evolving under positive selection (Table 5; Figure 9). This is the first time positively selected gymnotiform lineages have been detected using a large sample of species. Positive selection on the Nav1.4a indicates that the EODs of those lineages are likely under novel environmental constraints and/or biotic evolutionary pressures. From the limited collection locality information in this project, a few examples can be identified, where positively selected lineages are geographically isolated relative to closely related lineages (Figure 9; Table 3; Albert et al. 2005). 1) The positively selected lineage from which Gymnotus arapaima, Gymnotus ucamara, and some Gymnotus carapo species are derived, only includes species from the highly diverse Western Amazon region. However, the G. carapo lineage under purifying selection (tissue # 2040) is from the Guyanas-Orinoco basin. 2) The positively selected Gymnotus coropinae lineage is from the highly diverse Western Amazon region. However, the G. coropinae lineages under purifying selection (tissue #s 2036 and 2037) are from the Guyanas-Orinoco basin. 3) The positively selected lineage from which Gymnotus sp. xingu and Gymnotus pantanal are derived, only includes species from the Paraguay-Paraná basin of Argentina. However, this lineage's sister lineage and basal lineages mostly include species from the highly diverse Western Amazon region. Specific environmental constraints and biotic evolutionary pressures may be identified in future comparisons of EODs between positively selected lineages and closely related lineages that are not under positive selection.

4.4 Natural Selection at Specific Sites of the Nav1.4a C-terminus Among Gymnotus

The existing analyses of patterns of selection at the voltage-gated sodium channel protein Nav1.4a among gymnotiforms and non-electric fish focused on motifs associated with protein internalization (DII-III linker), the voltage-sensing component of fast activation (DIIS2-4, DIIS4-5 linker, DIIIS2-4, DIIIS4-5 linker, and DIVS1-2), the pore module (DIIS5-6 and DIIIS5-6), and the fast inactivation occlusion particle (DIII-IV linker) (Zakon et al. 2006; Arnegard et al. 2010). Statistically significant evidence of positive selection at specific sites among those motifs was not identified. This project focused on motifs of the Nav1.4a carboxyl-terminus (C-terminus) that are involved in regulation of protein internalization, fast inactivation, and possibly also resurgent current. Statistically significant evidence of amino acid sites under purifying, neutral (or relaxed), and positive selection was identified among these motifs.

When all the Gymnotus species were included in the analysis, there was statistically significant evidence for variation in the level of selection (between purifying and neutral selection) among amino acid sites of the Nav1.4a C-terminus (Table 5). This is consistent with purifying and neutral selection being identified for amino acid sites of other Nav1.4a motifs (Zakon et al. 2006; Arnegard et al. 2010). Most amino acid sites of the Nav1.4a C-terminus are evolving under purifying selection (70.9% of sites), which is consistent with the Nav1.4a protein structure and functional elements being highly conserved among orthologs across species (Catterall et al. 2005). The finding that some amino acid sites are evolving under neutral selection (29.1% of sites) is consistent with the Nav1.4a protein being the paralog that is preferentially expressed in the electric organ, since it is unlikely that evolution of this paralog would adversely affect other organs (Lopreato et al. 2001; Goldin 2002; Novak et al. 2006; Widmark et al. 2011).

When all the Gymnotus species were included in the analysis, there was no statistically significant positive selection found among amino acid sites of the Nav1.4a C-terminus (Table 5). This is consistent with lack of such evidence for amino acid sites of other Nav1.4a motifs (Zakon et al. 2006; Arnegard et al. 2010). Since most lineages of Gymnotus fishes are not evolving under positive selection, it is not surprising that there were no positively selected amino acid sites detected across all lineages of Gymnotus fishes.
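Statements of statistical significance for nested codon models of this kind are conventionally based on a likelihood ratio test. The sketch below illustrates that general technique and is not the thesis's own calculation. The log-likelihood values are hypothetical, and the degrees of freedom (2, as is common when a model adds one site class with two free parameters) are an assumption. The function names are illustrative.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf(stat, df):
    """Upper-tail chi-square probability, using closed forms for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only implements df = 1 and df = 2")


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a null model and a model allowing positive selection.
    lnl_null, lnl_alt = -2500.0, -2494.0
    stat = lrt_statistic(lnl_null, lnl_alt)
    p_value = chi2_sf(stat, df=2)
    print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.4f}")
```

With these made-up inputs the alternative model would be preferred at the 0.05 level. A non-significant result corresponds to the "no statistically significant positive selection" outcome described above.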

In the seven Gymnotus lineages that are evolving under positive selection, there was statistically significant positive selection at specific amino acid sites of the Nav1.4a C-terminus (Table 5). This novel finding may be a result of greatly increased taxonomic representation, compared with analyses of other Nav1.4a motifs among gymnotiforms (Zakon et al. 2006; Arnegard et al. 2010). Most amino acid sites of the Nav1.4a C-terminus among the seven positively selected lineages are under purifying selection (65.0% of sites), while a smaller proportion are under neutral selection (20.7% of sites), and an even smaller proportion are under positive selection (14.2% of sites; Table 6). Statistically significant positively selected sites of the Nav1.4a C-terminus identified using both naïve empirical Bayes (NEB) and the more sensitive Bayes empirical Bayes (BEB) approaches were identical (Table 7).
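The comparison of NEB and BEB site lists can be sketched as a threshold on per-site posterior probabilities followed by a set comparison. This is only an illustration: the 0.95 cut-off is a commonly used convention assumed here, not a value stated in this section, and the site labels and posterior probabilities are invented toy data, not results from the thesis.

```python
def positively_selected_sites(posteriors, threshold=0.95):
    """Return the sorted sites whose posterior probability of positive selection exceeds the threshold."""
    return sorted(site for site, p in posteriors.items() if p > threshold)


def same_site_calls(neb, beb, threshold=0.95):
    """True if the NEB and BEB approaches flag exactly the same sites."""
    return set(positively_selected_sites(neb, threshold)) == set(
        positively_selected_sites(beb, threshold)
    )


if __name__ == "__main__":
    # Hypothetical posterior probabilities keyed by toy site labels 1-3.
    neb_posteriors = {1: 0.99, 2: 0.97, 3: 0.40}
    beb_posteriors = {1: 0.98, 2: 0.96, 3: 0.62}
    print(positively_selected_sites(neb_posteriors))
    print(positively_selected_sites(beb_posteriors))
    print(same_site_calls(neb_posteriors, beb_posteriors))
```

Agreement between the two lists, as reported in Table 7, suggests that the site calls do not depend on which of the two approaches is used.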

As predicted, positively selected amino acid sites of the Nav1.4a C-terminus that likely contribute to altered (but not abolished) protein function were identified in Gymnotus fishes. Amino acid variations associated with the neutrally and positively selected sites are unlikely to abolish Nav1.4a protein function, since EODs are essential for Gymnotus fishes' survival, and field collection of electrogenic fish for tissue samples relies on detection of the fishes' EODs. Amino acid variations at the eight positively selected sites likely result in altered protein function that affects the EOD frequency. The positively selected sites are at motifs involved in fast inactivation, resurgent current, and phosphorylation (Table 7). The typical time course for Nav activation and subsequent fast inactivation (~ 1 ms for each step; Hodgkin et al. 1952; Ulbricht 2005) coincides with the time course for one Gymnotus EOD pulse (1-3 ms; Crampton and Albert 2006). The typical time course for Nav recovery back to its resting state (on the order of milliseconds; Ulbricht 2005) coincides with the range in Gymnotus EOD frequencies (~ 14-67 ms between pulses; Crampton and Albert 2006). There was no evidence for selective pressures on amplitudes of EODs, because no positively selected sites were found at the PY motif (Table 7). However, this does not preclude the possibility of natural selection on other characteristics that vary with EOD amplitude, such as anatomical and cellular characteristics.
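The inter-pulse intervals quoted above can be converted to pulse rates with simple arithmetic (rate in Hz = 1000 / interval in ms). The sketch below is illustrative only, and the function name is hypothetical. It assumes the interval is measured from the start of one pulse to the start of the next.

```python
def pulse_rate_hz(interval_ms):
    """Convert an inter-pulse interval in milliseconds to a pulse rate in hertz."""
    if interval_ms <= 0:
        raise ValueError("interval must be positive")
    return 1000.0 / interval_ms


if __name__ == "__main__":
    # Interval range quoted in the text (Crampton and Albert 2006): ~14-67 ms.
    for interval in (14, 67):
        print(f"{interval} ms between pulses -> {pulse_rate_hz(interval):.1f} Hz")
```

Under this conversion, the quoted interval range corresponds to roughly 15-71 pulses per second.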

At the eight positively selected sites, the amino acid identities of Gymnotus species in positively selected lineages (as well as a few other lineages) are different from the identities in the majority of Gymnotus species (Table 8). The Gymnotus species in positively selected lineages had a different amino acid identity at as few as one of the eight positively selected sites. However, even single mutations can have significant effects on physiological characteristics of the tissue if they are at key amino acid sites of the Navs (Lehmann-Horn and Jurkat-Rott 1999). Since amino acid variants are present at very few sites for each positively selected lineage, this provides a unique opportunity for future assessments of specific functions of those sites.


Predictions can be made from comparisons of EOD frequencies between species with different amino acid identities at a particular site. These predictions can then be verified by site-directed mutagenesis and patch clamp recordings. The presence of amino acid variations at very few sites for each positively selected lineage also provides a unique opportunity to contribute towards future assessments of selective pressures in various habitats of Central and South America. If the EOD frequencies of the positively selected lineages are higher than the EOD frequencies of comparative lineages with the consensus amino acid identity, then the habitat of species in the positively selected lineages can be predicted to have higher predation pressure from predators that are sensitive to lower EOD frequencies (e.g. more predatory fishes with ampullary electroreceptors).

4.5 Summary and Future Directions

Evolutionary relationships among Gymnotus were clarified using additional taxa and nucleotide sequences. The resultant topologies were generally consistent with previously proposed phylogenetic relationships. This project was the first to use the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’) for phylogenetic reconstruction. The locus contributed towards a meaningful and accurate species-level phylogenetic topology, with a reasonable amount of resolution. This project was the first to find evidence of purifying, neutral (relaxed), and positive selection on the scn4aa 3’ among specific lineages of the genus Gymnotus of the order Gymnotiformes. This finding is generally consistent with those from previous analyses of other motifs of this scn paralog, in which a small sample of gymnotiform species was used (Zakon et al. 2006; Arnegard et al. 2010). This project was also the first to find evidence of positive selection at specific sites on the scn4aa gene, in addition to purifying and neutral selection at specific sites. The seven positively selected lineages were likely under selective pressure to alter their EOD frequencies, since the amino acid sites under positive selection are part of motifs associated with voltage-gated sodium channel protein Nav1.4a fast inactivation and possibly resurgent current. The eight positively selected sites on the scn4aa 3’ among Gymnotus species in the seven positively selected lineages represent amino acids that likely contribute to altered protein function.


Future analyses of Gymnotus EOD frequencies among lineages experiencing neutral and positive selection may contribute to the identification of selective pressures in particular habitats of the Neotropics (Central and South America). Comparisons of EOD frequencies between species with different amino acid identities at particular positively selected sites can provide predictions of protein function that may be verified by site-directed mutagenesis and patch clamp recordings.

The methods for determining positive selection from this project may be used in similar projects focused on other clades of electric fish. Among the genus Gymnotus, there were eight positively selected Nav1.4a C-terminus amino acid sites out of the 43 sites variable in amino acid identity (Table 7). Among taxa within the order Gymnotiformes, there may be more positively selected sites identified, since the number of sites variable in amino acid identity is more than four times larger (at least 177 sites). The methods from this project may also be applied to other motifs of the Nav1.4a involved in fast inactivation (DIVS4, DIII-IV linker, S5-6, intracellular linkers). Future analyses of Nav1.4a may identify additional amino acid sites and identities that contribute to knowledge of protein function.


References

Agnew, W. S. (1984). Voltage-regulated sodium channel molecules. Annu Rev Physiol. 46, 517-30.

Agnew, W. S., Levinson, S. R., Brabson, J. S. and Raftery, M. A. (1978). Purification of the tetrodotoxin-binding component associated with the voltage-sensitive sodium channel from Electrophorus electricus electroplax membranes. Proc Natl Acad Sci. 75(6), 2606-2610.

Ahern, C. A. (2013). What activates inactivation? J Gen Physiol. 142(2), 97.

Albert, J. S. and Lundberg, J. G. (1995). Gymnotiformes. The Neotropical electric eels and knifefishes. Version 01 January 1995 (under construction). .

Albert, J. S. (2001). Species diversity and phylogenetic systematics of American knifefishes (Gymnotiformes, Teleostei). Misc Publ Mus Zool. University of Michigan. 190, 1-129.

Albert, J. S., Crampton, W. G. R., Thorsen, D. H. and Lovejoy, N. R. (2005). Phylogenetic systematics and historical biogeography of the Neotropical electric fish Gymnotus (Teleostei: Gymnotidae). Syst Biodiv. 2(4), 375-417.

Alves-Gomes, J. A., Orti, G., Haygood, M., Heiligenberg, W. and Meyer, A. (1995). Phylogenetic analysis of the South American electric fishes (Order Gymnotiformes) and the evolution of their electrogenic system: a synthesis based on morphology, electrophysiology, and mitochondrial sequence data. Mol Biol Evol. 12(2), 298-318.

Alves-Gomes, J. (1999). Systematic biology of Gymnotiform and Mormyriform electric fishes: phylogenetic relationships, molecular clocks, and rates of evolution in the mitochondrial rRNA genes. J Exp Biol. 202, 1167-1183.

Alves-Gomes, J. A. (2001). The evolution of electroreception and bioelectrogenesis in teleost fish: a phylogenetic perspective. J Fish Biol. 58, 1489-1511.


Albert, J. S., Zakon, H. H., Stoddard, P. K., Unguez, G. A., Holmberg-Albert, S. K. S. and Sussman, M. R. (2008). The case for sequencing the genome of the electric eel Electrophorus electricus. J Fish Biol. 72: 331–354.

Ariyasu, R. G., Deerinck, T. J., Levinson, S. R. and Ellisman, M. H. (1987). Distribution of (Na+ + K+)ATPase and sodium channels in skeletal muscle and electroplax. Journal of Neurocytology. 16, 511-522.

Arnegard, M. E., Zwickl, D. J., Lu, Y. and Zakon, H. H. (2010). Old gene duplication facilitates origin and diversification of an innovative communication system – twice. Proc Natl Acad Sci. 107(51), 22172-22177.

Baba, M. L., Goodman, M., Berger-Cohn, J., Demaille, J. G. and Matsuda, G. (1984). The early adaptive evolution of calmodulin. Mol Biol Evol. 1(6), 442-455.

Bedore, C. N. and Kajiura, S. M. (2013). Bioelectric fields of marine organisms: voltage and frequency contributions to detectability by electroreceptive predators. Physiol Biochem Zool. 86(3), 298–311.

Bahler, M. and Rhoads, A. (2002). Calmodulin signaling via the IQ motif. FEBS Lett. 513, 107-113.

Bello, O. S., Gonzalez, J., Capani, F. and Barreto, G. E. (2012). In silico docking reveals possible riluzole binding sites on Nav1.6 sodium channel: implications for amyotrophic lateral sclerosis therapy. J Theor Biol. 315, 53-63.

Benchimol, M., Machado, R. D. and de Souza, W. (1978). Staining of microtubules of the electrocyte of Electrophorus electricus L. by alcian blue and lanthanum. Experientia. 35(5), 670-671.

Bennett, M. V. L. and Grundfest, H. (1959). Electrophysiology of electric organ in Gymnotus carapo. J Gen Physiol. 42(5), 1067-1103.

Bennett, M. V. L. (1961). Modes of operation of electric organs. Ann N Y Acad Sci. 94, 458-509.


Berendt, F. J., Park, K. S. and Trimmer, J. S. (2010). Multisite phosphorylation of voltage-gated sodium channel alpha subunits from rat brain. J Proteome Res. 9(4), 1976-1984.

Brinkman, F. S. L. and Leipe, D. D. (2001). Chapter 14: Phylogenetic analysis (In: Baxevanis, A. D. and Ouellette, B. F. F. Eds.), Bioinformatics: A practical guide to the analysis of genes and proteins, Second Edition. John Wiley & Sons Inc. (Electronic), pp. 323-358. ISBN 0-471-22392-1.

Brown, W. M., George, M. and Wilson, A. C. (1979). Rapid evolution of mitochondrial DNA. Proc Natl Acad Sci. 76(4), 1967-1971.

Bullock, T. H. (1982). Electroreception. Annu Rev Neurosci. 5, 121–170.

Cannon, S. C. and Bean, B. P. (2010). Sodium channels gone wild: resurgent current from neuronal and muscle channelopathies. J Clin Invest. 120(1), 80-83.

Catterall, W. A. (1984). The molecular basis of neuronal excitability. Science. 223(4637), 653-661.

Cantrell, A. R. and Catterall, W. A. (2001). Neuromodulation of Na+ channels: an unexpected form of cellular plasticity. Nat Rev Neurosci. 2, 397-407.

Catterall, W. A., Goldin, A. and Waxman, S. G. (2005). International Union of Pharmacology. XLVII. Nomenclature and structure-function relationships of voltage-gated sodium channels. Pharmacol Rev. 57(4), 397-409.

Caputi, A. A. (1999). The electric organ discharge of pulse Gymnotiforms: the transformation of simple impulse into a complex spatio-temporal electromotor pattern. J Exp Biol. 202, 1229-1241.

Chagot, B., Potet, F., Balser, J. R. and Chazin, W. J. (2009). Solution NMR structure of the C-terminal EF-hand domain of human cardiac sodium channel Nav1.5. J Biol Chem. 284(10), 6436-6445.


Chagot, B. and Chazin, W. J. (2011). Solution NMR structure of apo-calmodulin in complex with the IQ motif of human cardiac sodium channel Nav1.5. J Mol Biol. 406(1), 106-119.

Charalambous, K. and Wallace, B. A. (2011). NaChBac: the long lost sodium channel ancestor. Biochemistry. 50(32), 6742-6752.

Chin, D. and Means, A. R. (2000). Calmodulin: a prototypical calcium sensor. Trends Cell Biol. 10(8), 322-328.

Cohen, S. A. and Levitt, L. K. (1993). Partial characterization of the rH1 sodium channel protein from rat heart using subtype-specific antibodies. Circ Res. 73, 735-742.

Collin, S. P. and Whitehead, D. The functional roles of passive electroreception in non-electric fishes. Animal Biology. 54(1), 1-25.

Cormier, J. W., Rivolta, I., Tateyama, M., Yang, A.-S. and Kass, R. S. (2002). Secondary structure of the human cardiac Na+ channel C terminus. J Biol Chem. 277(11), 9233-9241.

Crampton, W. G. R. (1998). Effects of anoxia on the distribution, respiratory strategies and electric signal diversity of Gymnotiform fishes. J Fish Biol. 53(A), 307-330.

Crampton, W. G. R. and Albert, J. S. (2006). Evolution of electric signal diversity in Gymnotiform fishes (In: Ladich, F., Collin, S. P., Moller, P. and Kapoor, B. G. Eds.), Communication in fishes. Science Publishers, Enfield, New Hampshire, pp. 657-731.

Crampton, W. G. R., Lovejoy, N. R. and Waddell, J. C. (2011). Reproductive character displacement and signal ontogeny in a sympatric assemblage of electric fish. Evolution. 65(6), 1650-1666.

Cruz, J. S., Silva, D. F., Ribeiro, L. A., Araújo, I. G. A., Magalhães, N., Medeiros, A., Freitas, C., Araujo, I. C. and Oliveira, F. A. (2011). Resurgent Na+ current: A new avenue to neuronal excitability control. Life Sci. 89, 564-569.

de Araujo Jorge, T. C., de Souza, W. and Machado, R. D. (1979). Ultrastructural localization of calcium-binding sites in the electrocyte of the Electrophorus electricus (L.). J Cell Sci. 38, 97-104.

Don, R. H., Cox, P. T., Wainwright, B. J., Baker, K. and Mattick, J. S. (1991). 'Touchdown' PCR to circumvent spurious priming during gene amplification. Nucl Acids Res. 19(14), 4008.

Eijkelkamp, N., Linley, J. E., Baker, M. D., Minett, M. S., Cregg, R., Werdehausen, R., Rugiero, F. and Wood, J. N. (2012). Neurological perspectives on voltage-gated sodium channels. Brain. 135(9), 2585-2612.

Ellis, M. M. (1913). The gymnotid eels of tropical America. Mem Carneg Mus. 6(3), 109-195.

Ellisman, M. H. and Levinson, S. R. (1982). Immunocytochemical localization of sodium channel distributions in the excitable membranes of Electrophorus electricus. Proc Natl Acad Sci. 79, 6707-6711.

Emerick, M. C., Shenkel, S. and Agnew, W. S. (1993). Regulation of the eel electroplax Na channel and phosphorylation of residues on amino- and carboxyl-terminal domains by cAMP-dependent protein kinase. Biochemistry. 32(36), 9435-9444.

Emery, A. E. H. (1991). Population frequencies of inherited neuromuscular disease – a world survey. Neuromuscular Disorders. 1(1), 19-29.

Favre, I., Moczydlowski, E. and Schild, L. (1996). On the structural basis for ionic selectivity among Na+, K+, and Ca2+ in the voltage-gated sodium channel. Biophys J. 71, 3110-3125.

Ferrari, M. B. and Zakon, H. H. (1993). Conductances contributing to the action potential of Sternopygus electrocytes. J Comp Physiol A. 173, 281-292.

Fink, S. V. and Fink, W. L. (1981). Interrelationships of the ostariophysan fishes (Teleostei). Zool J Linn Soc. 72, 297-353.

Fitch, W. M. (2000). Homology: a personal view on some of the problems. Trends Genet. 16(5), 227-231.

Fotia, A. B., Ekberg, J., Adams, D. J., Cook, D. I., Poronnik, P. and Kumar, S. (2004). Regulation of neuronal voltage-gated sodium channels by the ubiquitin-protein ligases nedd4 and nedd4-2. J Biol Chem. 279(28), 28930-28935.

Fritz, L. C. and Brockes, J. P. (1983). Immunochemical properties and cytochemical localization of the voltage-sensitive sodium channel from the electroplax of the eel (Electrophorus electricus). J Neurosci. 3(11), 2300-2309.

Froese, R. and Pauly, D. Editors. (2012). FishBase.

Gayet, M., Meunier, F. J. and Kirschbaum, F. (1994). Gymnotiforme fossile de Bolivie et ses relations phylogénétiques au sein des formes actuelles. Cybium. 18(3), 273-306.

Goldin, A. L. (2002). Evolution of voltage-gated Na+ channels. J Exp Biol. 205, 575-584.

Goldin, A. L., Barchi, R. L., Caldwell, J. H., Hofmann, F., Howe, J. R., Hunter, J. C., Kallen, R. G., Mandel, G., Meisler, M. H., Netter, Y. B., Noda, M., Tamkun, M. M., Waxman, S. G., Wood, J. N. and Catterall, W. A. (2000). Nomenclature of voltage-gated sodium channels. Neuron. 28, 365-368.

Gordon, R. D., Fieles, W. E., Schotland, D. L., Hogue-Angeletti, R. and Barchi, R. L. (1987). Topographical localization of the C-terminal region of the voltage-dependent sodium channel from Electrophorus electricus using antibodies raised against a synthetic peptide. Proc Natl Acad Sci. 84, 308-312.

Gordon, R. D., Li, Y., Fieles, W. E., Schotland, D. L. and Barchi, R. L. (1988). Topographical localization of a segment of the eel voltage-dependent sodium channel primary sequence (aa 927-938) that discriminates between modes of tertiary structure.

Gotter, A. L., Kaetzel, M. A. and Dedman, J. R. (1998). Electrophorus electricus as a Model System for the Study of Membrane Excitability. Comp. Biochem. Physiol. 119A (1), 225-241.

Hall, B. G. (2011). Phylogenetic trees made easy: a how-to manual, 4th edition. Sinauer Associates, Inc., Sunderland, MA.

Hansen, J. D. and Kaattari, S. L. (1996). The recombination activating gene 2 (Rag2) of the rainbow trout Oncorhynchus mykiss. Immunogenetics. 44, 203-211.

Haynes, W. and Lide, D. (2010). CRC handbook of chemistry and physics (91st Edition): a ready-reference book of chemical and physical data. Boca Raton, Fla. London: CRC Taylor & Francis distributor.

Hebert, T., Drapeau, P., Pradier, L. and Dunn, R. J. (1994). Block of the rat brain IIA sodium channel alpha subunit by the neuroprotective drug riluzole. Mol Pharmacol. 45(5), 1055-60.

Heidmann, T. and Changeux, J.-P. (1978). Structural and functional properties of the acetylcholine receptor protein in its purified and membrane-bound states. Ann Rev Biochem. 47, 317-57.

Hennemann, E. (1957). Relation between size of neurons and their susceptibility to discharge. Science. 126, 1345-1346.

Hodgkin, A. L., Huxley, A. F. and Katz, B. (1952). Measurement of current-voltage relations in the membrane of the giant axon of Loligo. J Physiol. 116, 424-448.

Hopkins, C. D. (1988). Neuroethology of electric communication. Ann Rev Neurosci. 11, 497-535.

Hopkins, C. D. (1999). Design features for electric communication. J Exp Biol. 202, 1217-1228.

Hopkins, C.D., Comfort, N. C., Bastian, J. and Bass, A. H. (1990). Functional analysis of sexual dimorphism in an electric fish, Hypopomus pinnicaudatus, order Gymnotiformes. Brain Behav Evol. 35(6), 350-367.

Hopkins, P. M. (2006). Skeletal muscle physiology. Contin Educ Anaesth Crit Care Pain. 6 (1), 1-6.

Huelsenbeck, J.P. and Ronquist, F. (2001). MrBayes: bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755.

Hughes, A. L. and Yeager, M. (1997). Comparative evolutionary rates of introns and exons in murine rodents. J Mol Evol. 45(2), 125-130.

Jarecki, B. W., Piekarz, A. D., Jackson II, J. O. and Cummins, T. R. (2010). Human voltage-gated sodium channel mutations that cause inherited neuronal and muscle channelopathies increase resurgent sodium currents. J Clin Invest. 120, 369-378.

Kaetzel, M. A. and Dedman, J. R. (1987). Identification of a 55-kDa high-affinity calmodulin-binding protein from Electrophorus electricus. J Biol Chem. 262(4), 1818-1822.

Keesey, J. (2005). How electric fish became sources of acetylcholine receptor. J Hist Neurosci. 14 (2), 149-164.

Keynes, R. D. and Martins-Ferreira, H. (1953). Membrane Potentials in the Electroplates of the Electric Eel. J. Physiol. 119, 315-351.

Kullberg, M., Nilsson, M., Arnason, U., Harley, E. H. and Janke, A. (2006). Housekeeping genes for phylogenetic analysis of eutherian relationships. Mol Biol Evol. 23(8), 1493-1503.

Kyte, J. and Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J Mol Biol. 157(1), 105-32.

Lavoué, S. and Sullivan, J. P. (2004). Simultaneous analysis of five molecular markers provides a well-supported phylogenetic hypothesis for the living bony-tongue fishes (Osteoglossomorpha: Teleostei). Mol Phylogenet Evol. 33(1), 171-185.

Lehmann-Horn, F. and Jurkat-Rott, K. (1999). Voltage-gated ion channels and hereditary disease. Physiol Rev. 79, 1317-1372.

Lester, H. (1978). Analysis of sodium and potassium redistribution during sustained permeability increases at the innervated face of Electrophorus electroplaques. J Gen Physiol. 72, 847-862.

Levinson, S. R., Duch, D. S., Urban, B. W. and Recio-Pinto, E. (1986). The sodium channel from Electrophorus electricus. Ann N Y Acad Sci. 479(1), 162-178.

Li, C., Ortí, G., Zhang, G. and Lu, G. (2007). A practical approach to phylogenomics: the phylogeny of ray-finned fish (Actinopterygii) as a case study. BMC Evol Biol. 7(44), 1-11.

Lissman, H. W. (1958). On the function and evolution of electric organs in fish. J Exp Biol. 35, 156-191.

Liu, Z., Tao, J., Ye, P. and Ji, Y. (2012). Mining the virgin land of neurotoxicology: a novel paradigm of neurotoxic peptides action on glycosylated voltage-gated sodium channels. J Toxicol. 2012(843787).

Lopreato, G. F., Lu, Y., Southwell, A., Atkinson, A. S., Hillis, D. M., Wilcox, T. P. and Zakon, H. H. (2001). Evolution and divergence of sodium channel genes in vertebrates. Proc Natl Acad Sci. 98(13), 7588-7592.

Lorenzo, D., Sierra, F., Silva, A. and Macadar, O. (1990). Spinal mechanisms of electric organ discharge synchronization in Gymnotus carapo. J Comp Physiol A. 167, 447-452.

Lorenzo, D., Sierra, F., Silva, A. and Macadar, O. (1993). Spatial distribution of the medullary command signal within the electric organ of Gymnotus carapo. J Comp Physiol A. 173, 221-226.

Lovejoy, N. R. (1996). Systematics of myliobatoid elasmobranchs: with emphasis on the phylogeny and historical biogeography of neotropical freshwater stingrays (Potamotrygonidae: Rajiformes). Zool J Linn Soc. 117, 207-257.

Lovejoy, N. R. and Collette, B. (2001). Phylogenetic relationships of new world needlefishes (Teleostei: Belonidae) and the biogeography of transitions between marine and freshwater habitats. Copeia. 2, 324-338.

Lovejoy, N. R., Lester, K., Crampton, W. G. R., Marques, F. P. L. and Albert, J. S. (2010). Phylogeny, biogeography, and electric signal evolution of Neotropical knifefishes of the genus Gymnotus (Osteichthyes: Gymnotidae). Mol Phylogenet Evol. 54, 278-290.

Lynch, M., O'Hely, M., Walsh, B. and Force, A. (2001). The probability of preservation of a newly arisen gene duplicate. Genetics. 159, 1789-1804.

Machado, R. D., de Souza, W., Cotta-Pereira, G. C. and de Oliveira Castro, G. (1976). On the fine structure of the electrocyte of Electrophorus electricus L. Cell Tiss Res. 174, 355-366.

Machado, R. D., de Souza, W., Benchimol, M., Attias, M. and Porter, K. R. (1980). Observations on the innervated face of the electrocyte of the main organ of the electric eel (Electrophorus electricus L.). Cell Tissue Res. 213, 69-80.

Maddison, D. R. and Schulz, K.-S. (eds.) (2007). The tree of life web project.

Mago-Leccia, F. (1994). Electric fishes of the continental waters of America: classification and catalogue of the electric fishes of the order Gymnotiformes (Teleostei: Ostariophysi) with descriptions of new genera and species. Volume 29: Biblioteca de la Academia de Ciencias Físicas, Matemáticas y Naturales. Fundacion para el Desarrollo de las Ciencias Fisicas, Matematicas y Naturales (FUDECI), Clemente, Caracas, Venezuela.

Mayden, R. L., Tang, K. L., Conway, K. W., Freyhof, J., Chamberlain, S., Haskins, M., Schneider, L., Sudkamp, M., Wood, R. M., Agnew, M., Bufalino, A., Sulaiman, Z., Miya, M., Saitoh, K. and He, S. P. (2007). Phylogenetic relationships of Danio within the order Cypriniformes: a framework for comparative and evolutionary studies of a model species. J Exp Zool B Mol Dev Evol. 308B, 642-654.

Mermelstein, C. D. S., Costa, M. L. and Neto, V. M. (2000). The cytoskeleton of the electric tissue of Electrophorus electricus L. An Acad Bras Ci. 72(3).

Mills, A. and Zakon, H. H. (1987). Coordination of EOD frequency and pulse duration in a weakly electric wave fish: the influence of androgens. J Comp Physiol A. 161, 417-430.

Miloushev, V. Z., Levine, J. A., Arbing, M. A., Hunt, J. F., Pitt, G. S. and Palmer III, A. G. (2009). Solution structure of the Nav1.2 C-terminal EF-hand domain. J Biol Chem. 284(10), 6446-6454.

Moller, P. (1995). Electric fishes: history and behavior. Chapman & Hall, pp. 583.

Morth, J. P., Pedersen, B. P., Buch-Pedersen, M. J., Andersen, J. P., Vilsen, B., Palmgren, M. G. and Nissen, P. (2011). A structural overview of the plasma membrane Na+, K+-ATPase and H+-ATPase ion pumps. Nat Rev Mol Cell Biol. 12(1), 60-70.

Müller, K. F. (2005). The efficiency of different search strategies in estimating parsimony jackknife, bootstrap, and Bremer support. BMC Evol Biol. 5(58).

Munjaal, R. P., Connor, C. G., Turner, R. and Dedman, J. (1986). Eel electric organ: hyperexpressing calmodulin system. Mol Cell Biol. 6(3), 950-954.

Nakamura, Y., Nakajima, S. and Grundfest, H. (1965). Analysis of spike electrogenesis and depolarizing K inactivation in electroplaques of Electrophorus electricus, L. J Gen Physiol. 49(2), 321-49.

Noda, M., Shimizu, S., Tanabe, T., Takai, T., Kayano, T., Ikeda, T., Takahashi, H., Nakayama, H., Kanaoka, Y., Minamino, N., Kangawa, K., Matsuo, H., Raftery, M. A., Hirose, T., Inayama, S., Hayashida, H., Miyata, T. and Numa, S. (1984). Primary structure of Electrophorus electricus sodium channel deduced from cDNA sequence. Nature. 312, 121-127.

Novak, A. E., Jost, M. C., Lu, Y., Taylor, A. D., Zakon, H. H. and Ribera, A. B. (2006). Gene duplications and evolution of vertebrate voltage-gated sodium channels. J Mol Evol. 63, 208-221.

Nylander, J.A.A. (2004). MrModeltest. Technical report. Evolutionary Biology Centre, Uppsala University, Uppsala.

Palumbi, S., Martin, A., Romano, S., McMillan, W.O., Stice, L. and Grabowski, G. (1991). The simple fool’s guide to PCR, version 2.0. Honolulu: Department of Zoology and Kewalo Marine Laboratory, University of Hawaii.

Payandeh, J., El-Din, T. M. G., Scheuer, T., Zheng, N. and Catterall, W. A. (2012). Crystal structure of a voltage-gated sodium channel in two potentially inactivated states. Nature. 486, 135-140.

Payandeh, J., Scheuer, T., Zheng, N. and Catterall, W. A. (2011). The crystal structure of a voltage-gated sodium channel. Nature. 475, 353-359.

Potet, F., Chagot, B., Anghelescu, M., Viswanathan, P. C., Stepanovic, S. Z., Kupershmidt, S., Chazin, W. J. and Balser, J. R. (2009). Functional interactions between distinct sodium channel cytoplasmic domains through the action of calmodulin. J Biol Chem. 284(13), 8846-8854.

Rast, J. P. and Litman, G. W. (1998). Towards understanding the evolutionary origins and early diversification of rearranging antigen receptors. Immunol Rev. 166, 79-86.

Rose, P. K. (2007). Persistence has its own reward: repetitive firing of action potentials in neurons. J Physiol. 580 (2), 357.

Rougier, J.-S., van Bemmelen, M. X., Bruce, C., Jespersen, T., Gavillet, B., Apothéloz, F., Cordonier, S., Staub, O., Rotin, D. and Abriel, H. (2005). Molecular determinants of voltage-gated sodium channel regulation by the Nedd4/Nedd4-like proteins. Am J Physiol Cell Physiol. 288(3), C692-701.

Ruff, R. L. (2003). Neurophysiology of the neuromuscular junction: overview. Ann N Y Acad Sci. 998, 1-10.

Saitoh, K., Miya, M., Inoue, J. G., Ishiguro, N. B. and Nishida, M. (2003). Mitochondrial genomics of Ostariophysan fishes: perspectives on phylogeny and biogeography. J Mol Evol. 56, 464-472.

Sarhan, M. F., Tung, C.-C., Petegem, F. V. and Ahern, C. A. (2012). Crystallographic basis for calcium regulation of sodium channels. Proc Natl Acad Sci. 109(9), 3558-3563.

Scheuer, T. (2010). Regulation of sodium channel activity by phosphorylation. Semin Cell Dev Biol. 22, 160-165.

Schmidt, J. W. and Catterall, W. A. (1987). Palmitylation, sulfation, and glycosylation of the α subunit of the sodium channel. J Biol Chem. 262(28), 13713-13723.

Schwartz, J. H. (2007). Do molecular clocks run at all? A critique of molecular systematics. Biol Theory. 1(4), 357-371.

Shah, V. N., Wingo, T. L., Weiss, K. L., Williams, C. K., Balser, J. R. and Chazin, W. J. (2006). Calcium-dependent regulation of the voltage-gated sodium channel hH1: intrinsic and extrinsic sensors use a common molecular switch. Proc Natl Acad Sci. 103(10), 3592-3597.

Solmó, C., de Souza, W., Machado, R. D. and Hassón-Voloch, A. (1977). Biochemical and cytochemical localization of ATPases on the membranes of the electrocyte of Electrophorus electricus. Cell Tiss Res. 185, 115-128.

Stoddard, P. K. (1999). Predation enhances complexity in the evolution of electric fish signals. Nature. 400, 254-256.

Stoddard, P. K. (2002). Electric signals: predation, sex, and environmental constraints. Advances in the Study of Behaviour. 31, 201-242.

Stoddard, P. K. (2006). Plasticity of the electric organ discharge waveform: contexts, mechanisms, and implications for electrocommunication. In: Communication in Fishes. ch. 22, pp 623-646. F. Ladich, S. P. Collin, P. Moller, B. G. Kapoor, eds. Science Publishers, Inc., Enfield, NH, USA.

Swofford, D.L. (2002). PAUP* 4:40: Phylogenetic analysis using parsimony *and other methods. Sinauer Associates, Sunderland, MA.

Szabo, T., Kalmijn, A. J., Enger, P. S. and Bullock, T. H. (1972). Microampullary organs and a submandibular sense organ in the fresh water ray, Potamotrygon. J Comp Physiol. 79(1), 15-27.

Szamier, R.B. and Bennett, M.V.L. (1980). Ampullary electroreceptors in the fresh water ray, Potamotrygon. J Comp Physiol. 138(3), 225-230.

Theiss, R. D., Kuo, J. J. and Heckman, C. J. (2007). Persistent inward currents in rat ventral horn neurones. J Physiol. 580(2), 507-522.

Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F. and Higgins, D. G. (1997). The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876-4882.

Triques, M. L. (1993). Filogenia dos gêneros de Gymnotiformes (Actinopterygii, Ostariophysi), com base em caracteres esqueléticos. Comun Mus Ciênc. PUCRS, série zool. 6(8), 85-130.

Ulbricht, W. (2005). Sodium channel inactivation: molecular determinants and modulation. Physiol Rev. 85, 1271-1301.

von der Emde, G. (1990). Discrimination of objects through electrolocation in the weakly electric fish, Gnathonemus petersii. J Comp Physiol A. 167(3), 413-421.

Warrington, J. A., Nair, A., Mahadevappa, M. and Tsyganskaya, M. (2000). Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol Genomics. 2, 143-147.

Wernersson, R. and Pedersen, A. G. (2003). RevTrans: Multiple alignment of coding DNA from aligned amino acid sequences. Nucleic Acids Res. 31(13), 3537-3539.

Widmark, J., Sundström, G., Daza, D. O. and Larhammar, D. (2011). Differential evolution of voltage-gated sodium channels in tetrapods and teleost fishes. Mol Biol Evol. 28(1), 859-871.

Wiens, J. J. (1998). Does adding characters with missing data increase or decrease phylogenetic accuracy? Syst Biol. 47(4), 625-640.

Willett, C. E., Cherry, J. J. and Steiner, L. A. (1997). Characterization and expression of the recombination activating genes (Rag1 and Rag2) of zebrafish. Immunogenetics. 45, 394-404.

Williamson, J. R., Cheung, W. Y., Coles, H. S. and Herczeg, B. E. (1967). Glycolytic control mechanisms IV. kinetics of glycolytic intermediate changes during electric discharge and recovery in the main organ of Electrophorus electricus. J Biol Chem. 242, 5112-5118.

Winemiller, K. O. and Adite, A. (1997). Convergent evolution of weakly electric fishes from floodplain habitats in Africa and South America. Environmental Biology of Fishes. 49, 175-186.

Wingo, T. L., Shah, V. N., Anderson, M. E., Lybrand, T. P., Chazin, W. J. and Balser, J. R. (2004). An EF-hand in the sodium channel couples intracellular calcium to cardiac excitability. Nat Struct Mol Biol. 11(3), 219-225.

Yablonka-Reuveni, Z. (2011). The skeletal muscle satellite cell: still young and fascinating at 50. J Histochem Cytochem. 59(12), 1041-1059.

Yang, Z. (2007). PAML 4: a program package for phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24, 1586-1591.

Young, K. A. and Caldwell, J. H. (2005). Modulation of skeletal and cardiac voltage-gated sodium channels by calmodulin. J Physiol. 565(2), 349-370.

Yu, F. H., Yarov-Yarovoy, V., Gutman, G. A. and Catterall, W. A. (2005). Overview of molecular relationships in the voltage-gated ion channel superfamily. Pharmacol Rev. 57, 387-395.

Zahavi, A. (2003). Indirect selection and individual selection in sociobiology: my personal views on theories of social behaviour. Anim Behav. 65, 859-863.

Zakon, H. H., Lu, Y., Zwickl, D. J. and Hillis, D. M. (2006). Sodium channel genes and the evolution of diversity in communication signals of electric fishes: convergent molecular evolution. Proc Natl Acad Sci. 103, 3675-3680.

Zakon, H. H. and Unguez, G. A. (1999). Development and regeneration of the electric organ. J Exp Biol. 202, 1427-1434.

Zhang, X., Ren, W., DeCaen, P., Yan, C., Tao, X., Tang, L., Wang, J., Hasegawa, K., Kumasaka, T., He, J., Wang, J., Clapham, D. E. and Yan, N. (2012). Crystal structure of an orthologue of the NaChBac voltage-gated sodium channel. Nature. 486, 130-135.

Appendix A.0 Abstract

A.0.1 Phylogeny and Molecular Evolution of the Voltage-Gated Sodium Channel Gene scn4aa in the Electric Fish Order Gymnotiformes

Many advances have been made in recent years in identifying the roles of various motifs in voltage-gated sodium channel protein (Nav) function and modulation. However, analyses of the roles of specific amino acid sites have largely been limited to mutations from individual people with diagnosed neuromuscular disease. In this project, I used the order Gymnotiformes as a model system to investigate the evolution and function of amino acid sites on the Nav that is specifically adapted to the production of electric fields. Gymnotiformes is a diverse clade of ray-finned fishes (class Actinopterygii) that are adapted to the lowland freshwaters of Central and South America, with wide geographical distributions. They produce species-specific electric organ discharges (EODs) from electric organs (EOs) for electrolocation (foraging, navigation) and communication.

To clarify evolutionary relationships among gymnotiform species, I reconstructed the phylogeny using an alignment of 3570 nucleotide positions from 57 gymnotiform species. This alignment included loci that were used for previous phylogenies of a gymnotiform genus (cytb and rag2), as well as nucleotides encoding the carboxyl-terminus of the Nav paralog that is preferentially expressed in EOs (scn4aa 3’). Unfortunately, nucleotide sequences from one of the six gymnotiform families could not be obtained, and further analytical techniques to obtain them were outside the scope of this project. The maximum parsimony phylogenetic reconstruction algorithm provided a reasonably well-supported phylogeny, while the Bayesian inference algorithm did not. Nevertheless, the results indicate that the scn4aa 3’ locus contributed towards a meaningful phylogenetic topology that provides a reasonable amount of resolution.

Further analysis of phylogenetic topology is needed prior to analyses of patterns of selection, due to inconsistencies among existing phylogenetic topologies of gymnotiforms and the unexpected inclusion of a siluriform species (Cetopsis coecutiens) within the order Gymnotiformes (from the cytb phylogeny). Since the number of sites variable in amino acid identity among the order Gymnotiformes is more than four times larger than that among the genus Gymnotus (177 vs 43 sites), future analyses of scn4aa 3’ may identify additional amino acid sites that contribute to knowledge of protein function.

Appendix A.1 Introduction

A.1.1 Significance and Objectives

Many advances have been made in recent years in identifying the roles of various motifs in voltage-gated sodium channel protein (Nav) function and modulation (Chagot et al. 2009; Miloushev et al. 2009; Payandeh et al. 2011; Sarhan et al. 2012; Zhang et al. 2012). However, analyses of the roles of specific amino acid sites have largely been limited to the sites that are known to be mutated in people with diagnosed neuromuscular disease (Lehmann-Horn and Jurkat-Rott 1999). In this project, I will use the order Gymnotiformes as a model system to investigate the evolution and function of amino acid sites on the Nav.

Fishes of the order Gymnotiformes produce species-specific electric organ discharges (EODs) for electrolocation (foraging, navigation) and communication (Crampton and Albert 2006). EODs are the summation of action potentials produced at the electric organ(s) (EO) by electrogenic cells (Bennett 1961; Mills and Zakon 1987). Navs at the plasma membranes of those cells have a key role in supporting action potentials (Agnew 1984; Catterall 1984; Noda et al. 1984). Upon neuronally triggered changes in voltage, Navs activate to allow specific ions to flow through their pores, across the membranes. Those same changes in voltage also trigger Navs to inactivate, which allows the membrane voltage gradient to recover in preparation for the next discharge.

Navs are encoded by a family of paralogous genes that translate to highly conserved amino acid sequences and motifs (Catterall et al. 2005). Gene duplication among teleosts and preferential expression of paralogs in various tissues (Lopreato et al. 2001; Lynch et al. 2001; Goldin 2002; Novak et al. 2006; Widmark et al. 2011) have been predicted to allow paralogs to evolve independently without compromising functions of Navs in other tissues. Analyses of nucleotide sequences encoding the carboxyl-terminus of the EO paralog (scn4aa 3’) from a genus of gymnotiform fishes (genus Gymnotus) identified positive, neutral, and purifying selection on the protein (Nav1.4a) among certain lineages, as well as positively selected amino acid sites (Chapters 1-4).

The carboxyl-terminus (C-terminus) of Navs includes key motifs that are involved in regulation of protein internalization, fast inactivation, and possibly also resurgent current. Modulation of these Nav1.4a activities affects the amplitude and frequency of action potentials at the EO, which may in turn affect those components of the EODs. Variations in EOD amplitude may be associated with variations in multiple anatomical, cellular, and molecular characteristics (Gotter et al. 1998; Caputi 1999). However, variations in EOD frequency among gymnotiforms with myogenic electric organs are likely limited to those associated with variations in Nav1.4a function.

Since species-specific characteristics of EODs among gymnotiforms (especially variation in frequency) are the result of adaptations to abiotic and biotic selective pressures in their varied habitats (Stoddard 2002), I predict that amino acid sites of the Nav1.4a 3’ that contribute to variation in (but do not abolish) protein function will show evidence of positive selection in gymnotiform fishes. I also predict that the Nav1.4a 3’ will only show evidence of positive selection in some lineages of gymnotiforms, as has been observed for other portions of Nav1.4a sequences from a limited sample of gymnotiform fishes (Zakon et al. 2006; Arnegard et al. 2010). To assess patterns of variation on the Nav1.4a 3’ among gymnotiforms, I will analyze the corresponding nucleotide sequences.

Existing phylogenies of gymnotiforms consistently resolve six families (Electrophoridae, Gymnotidae, Hypopomidae, Rhamphichthyidae, Apteronotidae, and Sternopygidae). However, the relationships recovered among the families are inconsistent. I will use additional taxa and molecular characters to contribute towards resolving these inconsistencies (Wiens 1998). The additional characters that I will use are the gymnotiform scn4aa nucleotide sequences that encode the protein's C-terminus. Since this portion of scn4aa has been used for successful clarification of the phylogeny of a genus of gymnotiform fishes (Chapters 1-4), I predict that this portion of the gene will also contribute towards clarification of phylogenetic relationships among gymnotiform species.

The objectives of this project can be summarized as follows:

1) To clarify evolutionary relationships among known and newly discovered species of gymnotiforms using orthologous genetic loci, including the scn4aa C-terminus;

2) To determine the utility of the scn4aa 3’ locus for reconstruction of phylogenetic relationships; and

3) To assess patterns of variation at the Nav1.4a C-terminus, thereby contributing towards understanding the evolutionary history of Gymnotiformes, and molecular mechanisms of the protein.

Appendix A.2 Materials and Methods

A.2.1 Taxon Sampling

Efforts were made to comprehensively sample gymnotiform species from as many genera as possible among all six families described in published phylogenies (Figure 2). Outgroup species were sampled from multiple other ostariophysan orders, since there is no consensus on the closest order to Gymnotiformes (Fink and Fink 1981; Saitoh et al. 2003). These other orders were Characiformes (Calcagnotto et al. 2005), Cypriniformes (Mayden et al. 2009), Siluriformes (Sullivan et al. 2006), and Gonorynchiformes. Outgroup species were chosen so that: nucleotide sequences were available from GenBank for as many loci as possible; specimens were more likely to be easily available (i.e. through the hobby aquarium trade); and taxa were phylogenetically diverse. More than one individual was sampled per species whenever possible, as a control for variation within species.

Tissues for DNA extraction were stored in either 95-100% ethanol or salt saturated buffer (20% DMSO, 0.25 M EDTA pH 8, saturated with NaCl). Tissue samples were from the collections of Nathan Lovejoy, William Crampton, James Albert, and Javier Maldonaldo.

A.2.2 Locus and Primer Selection

The loci selected were: the mitochondrial gene cytochrome b (cytb); and the nuclear genes recombination activating gene 2 (rag2) and the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’). They were selected for reasons similar to those given for the Gymnotus phylogeny in Chapter 2.

Appendix A Table 1 lists the primer sequences used for DNA amplification and sequencing. Amplification primers for cytb and rag2 have been previously published. Amplification primers for scn4aa 3’ were designed as per Chapter 2. Sequencing primers for all loci selected were designed as necessary, also as per Chapter 2.

Appendix A Table 1. Primer Sequences

Primers used for polymerase chain reaction and sequencing are identified by their target loci, name, annealing direction, sequence, and source.

| Target Locus | Name | Amplification/Sequencing Direction 1 | Sequence (listed as 5' → 3') | Source of Sequence |
|---|---|---|---|---|
| scn4aa 3’ | (6)1F | 5' → 3' | TCCTCCTGACTGTGACCCTG | Chapter 2 Table 1 |
| scn4aa 3’ | (6)2F | 5' → 3' | GGGCTTCTCCTSCCAACTC | This study |
| scn4aa 3’ | (6)3F | 5' → 3' | GCTTCTCCTSCCAACTCTAAACA | This study |
| scn4aa 3’ | (6)1R | 3' ← 5' | CATTTTTACACTTCATCACTCTCCAC | Chapter 2 Table 1 |
| scn4aa 3’ | (6)2R | 3' ← 5' | TCATTCCTAGACACCARCAAACAT | This study |
| scn4aa 3’ | (6)3R | 3' ← 5' | CATCATTCCTAGACACCAGCAAACAT | This study |
| scn4aa 3’ | (6)Seq1F | | TTGTAATGGGAGACAANATCC | This study |
| scn4aa 3’ | (6)Seq2F | | GTCACTCARGAGGTCCT | This study |
| scn4aa 3’ | (6)Seq1R | | GGCCGCATASWCCTCCTCCTT | This study |
| scn4aa 3’ | (6)Seq2R | | TGAGGAGGTRYTGGCGGTA | This study |
| scn4aa 3’ | (4)2R | | TTCCTGCAGTGCATCAACAAAG | This study |
| scn4aa 3’ | (4)3R | | TGGGAATACGCATGGGTTC | This study |
| cytochrome b | GLU-L-CARP (AKA CytbF) | 5' → 3' | TGACTTGAAGAACCACCGTTG | Palumbi et al. 1991 |
| cytochrome b | GLUDG-L | 5' → 3' | CGAAGCTTGACTTGAARAACCAYCGTTG | Palumbi et al. 1991 |
| cytochrome b | L14841 | 5' → 3' | AAAAAGCTTCCATCCAACATCTCAGCATGATGAAA | Kocher et al. 1989 |
| cytochrome b | HA-danio (AKA CytbR) | 3' ← 5' | CTCCGATCTTCGGATTACAAG | Mayden et al. 2007 |
| cytochrome b | CytbH15915 | 3' ← 5' | AACTGCAGTCATCTCCGGTTTACAAGA | Irwing et al. 1991 |
| cytochrome b | (C)Seq1F | | CAATGAGTCTGAGGAGGNTT | Chapter 2 Table 1 |
| cytochrome b | (C)Seq2F | | CAATGAGTATGAGGAGGNTT | This study |
| cytochrome b | (C)Seq3F | | CAATGAGTTTGAGGGGGNTT | Chapter 2 Table 1 |
| cytochrome b | (C)Seq4F | | CAATGAGTGTGGGGGGGNTT | This study |
| cytochrome b | (C)Seq5F | | CAATGAGTCTGAGGGGGNTT | Chapter 2 Table 1 |
| cytochrome b | (C)Seq6F | | CAATGAGTATGAGGGGGNTT | This study |
| cytochrome b | (C)Seq7F | | CAATGAGTATGAGGGGGNTT | This study |
| cytochrome b | (C)Seq8F | | CAATGAGTTTGAGGCGGNTT | Chapter 2 Table 1 |
| cytochrome b | (C)Seq9F | | CAATGAGTCTGAGGCGGNTT | This study |
| cytochrome b | (C)Seq10F | | CAATGAGTTTGAGGTGGNTT | This study |
| cytochrome b | (7)7R | | TCTAGTTCCTCTGGCTCCTC | This study |
| recombination activating gene 2 | Rag2GyF | 5' → 3' | ACAGGCRTCTTTGGKRTTCG | Lovejoy et al. 2010 |
| recombination activating gene 2 | Rag2-F1 | (this was only used for amplification) | TTTGGRCARAAGGGCTGGCC | Lovejoy and Collette 2001 |
| recombination activating gene 2 | MHRag2-F1 (AKA Rag2MHF1) | 5' → 3' | | Hardman 2003 |
| recombination activating gene 2 | Rag2GyR | 3' ← 5' | TCATCCTCCTCATCTTCCTC | Lovejoy et al. 2010 |
| recombination activating gene 2 | Rag2-R6 | (this was only used for amplification) | TGRTCCARGCAGAAGTACTTG | Lovejoy and Collette 2001 |
| recombination activating gene 2 | MHRag2-R1 (AKA Rag2MHR1) | 3' ← 5' | | Hardman 2003 |
| recombination activating gene 2 | (R)Seq1F | | AGAACCACAGAGAACTGGAACAC | Chapter 2 Table 1 |
| recombination activating gene 2 | (R)Seq1R | | CTCTACACGCAGCCTGAACA | Chapter 2 Table 1 |
| recombination activating gene 2 | (R)Seq2R | | TGCATTCGCTTYTGGGA | Chapter 2 Table 1 |
| 16S mitochondrial ribosomal subunit | 16sar-L | 5' → 3' | CGCCTGTTTATCAAAAACAT | Palumbi et al. 1991 |
| 16S mitochondrial ribosomal subunit | 16sbr-H | 3' ← 5' | CCGGTCTGAACTCAGATCACGT | Palumbi et al. 1991 |

1 Amplification/sequencing direction is only identified for primers used for both amplification and sequencing, since sequencing-only primers may have been used to sequence nucleotides in different directions.
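Several primers in Appendix A Table 1 contain IUPAC ambiguity codes (for example S, R, K, Y, W, and N), meaning that each such primer stands for a small pool of oligonucleotides. The following is a minimal sketch of how that degeneracy could be tallied and how a reverse complement could be taken without losing the ambiguity codes. The function names are hypothetical and were not part of the original workflow; only the primer sequences are taken from the table.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code, so reverse complements remain valid IUPAC strings.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def reverse_complement(primer):
    """Reverse complement that preserves IUPAC ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer.upper()))


# Primer sequences copied from Appendix A Table 1.
primers = {
    "(6)2F": "GGGCTTCTCCTSCCAACTC",
    "(6)Seq1R": "GGCCGCATASWCCTCCTCCTT",
    "Rag2GyF": "ACAGGCRTCTTTGGKRTTCG",
}

for name, seq in primers.items():
    print(name, degeneracy(seq), reverse_complement(seq))
```

Under these definitions the three example primers represent pools of 2, 4, and 8 oligonucleotides respectively. Primers containing N, such as the (C)Seq series, have a fourfold degeneracy at that position.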

A.2.3 DNA Extraction, Nucleotide Amplification, and Sequencing

To obtain DNA, excised muscle tissue was processed using the DNeasy Blood and Tissue Spin-Column Kit (Qiagen). Nucleotide sequences from previous studies were obtained from GenBank; these include most of the cytb and rag2 data. All of the scn4aa 3’ sequences were obtained experimentally as part of this study. See Appendix A Table 2 for the source of each sequence. Nucleotide amplification and sequencing methods were the same as those in Chapter 2.

A.2.4 Nucleotide Sequence Verification and Alignment

All sequences experimentally obtained for this study were visually inspected for misreads, and edited using Sequencher™ (Gene Codes Corporation, Ann Arbor, MI). Ambiguous base calls were treated as potentially any nucleotide. For scn4aa sequences, amplification and sequencing of the exon encoding the protein’s carboxyl-terminus (scn4aa 3’) from the desired member of the gene family was verified as per Chapter 2.
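The handling of ambiguous base calls can be illustrated with a short sketch. It assumes the standard IUPAC nucleotide ambiguity codes (the same codes used for degenerate positions such as R, Y, S, W, K, and N in the primer sequences) and assumes that "possibly any nucleotide" corresponds to recoding a base call as N. The function names are hypothetical and are not part of Sequencher or of the workflow used in this study.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases each can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def compatible(a: str, b: str) -> bool:
    """Return True if two (possibly ambiguous) base calls could be the same nucleotide."""
    return bool(set(IUPAC[a.upper()]) & set(IUPAC[b.upper()]))


def mask_ambiguous(seq: str) -> str:
    """Recode every base call that is not A, C, G, or T as N (assumed convention).

    Alignment gaps ('-') are left untouched.
    """
    return "".join(c if c in "ACGT-" else "N" for c in seq.upper())


if __name__ == "__main__":
    print(mask_ambiguous("ACRTY-"))
    print(compatible("R", "A"), compatible("R", "C"))
```

Under this convention an N is compatible with every nucleotide, so it contributes no evidence for or against any grouping at that site.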

Directions and applicable codon positions of the nucleotide sequences were determined by comparison with published Danio rerio (rag2 Accession # NM_131385, cytb Accession # NC_002333) and Electrophorus electricus (scn4aa 3’ Accession # M22252) sequences. Nucleotides from the protein coding loci (cytb, rag2, and scn4aa 3’) were aligned based on their amino acid alignments, as per Chapter 2.
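Aligning protein-coding nucleotides according to their amino acid alignment can be sketched as follows. This is a minimal illustration, assuming each coding sequence is in frame and exactly three times the length of its translated sequence; the function name `thread_codons` is hypothetical and does not refer to the software used in Chapter 2.

```python
def thread_codons(protein_aln: str, cds: str, gap: str = "-") -> str:
    """Build an aligned nucleotide row from an aligned protein row and its unaligned CDS.

    Each residue in the protein row is replaced by its codon, and each gap
    in the protein row becomes a three-nucleotide gap.
    """
    n_residues = sum(1 for aa in protein_aln if aa != gap)
    if len(cds) != 3 * n_residues:
        raise ValueError("coding sequence length must be 3x the number of residues")
    out = []
    pos = 0  # index of the next unused codon start in the CDS
    for aa in protein_aln:
        if aa == gap:
            out.append(gap * 3)
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)


if __name__ == "__main__":
    # Toy example: one gap in the protein alignment becomes one codon-sized gap.
    print(thread_codons("M-K", "ATGAAA"))
```

Because gaps are only ever inserted in multiples of three, codon positions stay in register across all sequences in the alignment.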

A.2.5 Phylogenetic Reconstruction

Phylogenetic reconstruction was conducted using the total evidence alignment. The resulting phylogeny was compared with separate analyses of the following alignments: cytb; rag2; and scn4aa 3’.

Appendix A Table 2. Specimens and Nucleotide Sequences Used for Gymnotiformes Analyses

Specimens used for analysis are identified by their scientific names, tissue sample numbers, museum catalogue numbers, collection localities, and applicable GenBank Accession numbers. Sequences obtained by the author for this project are identified with “**”. Sequences obtained from lab records are identified with “*” or their GenBank Accession Number, if applicable.

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Order Gymnotiformes: Family Apteronotidae Adontosternarchus sachsi 2877 (unknown) (unknown) ** ** Adontosternarchus sachsi 2888 (unknown) (unknown) ** ** Apteronotus albifrons 7301 (unknown) Brazil ** ** Apteronotus albifrons 2615 (unknown) (aquarium specimen) ** ** Apteronotus bonapartii 2914 (unknown) (unknown) ** ** Apteronotus bonapartii 2616 (unknown) (aquarium specimen) ** ** Apteronotus (Ubidia) magdalenensis 4008 (unknown) Colombia ** ** Apteronotus (Ubidia) magdalenensis 4009 (unknown) Colombia ** ** Compsaraia n. sp. B 1991 (unknown) Peru ** ** Magosternarchus raptor 2838 (unknown) (unknown) ** ** Magosternarchus raptor 2910 (unknown) (unknown) ** ** Orthosternarchus tamandua 2447 (unknown) Peru ** ** Orthosternarchus tamandua 2625 (unknown) (aquarium specimen) ** ** Parapteronotus hasemani 2626 (unknown) (aquarium specimen) ** ** Parapteronotus hasemani 2627 (unknown) (aquarium specimen) ** ** Platyurosternarchus macrostomus 7302 (unknown) Brazil ** ** Platyurosternarchus macrostomus 2629 (unknown) (aquarium specimen) ** ** Porotergus gimbeli 2889 (unknown) (unknown) ** ** Porotergus gimbeli 2902 (unknown) (unknown) ** ** Sternarchella schotti 2860 (unknown) (unknown) ** ** Sternarchella schotti 2876 (unknown) (unknown) ** **

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Sternarchogiton natteneri 2863 (unknown) (unknown) ** ** Sternarchogiton natteneri 2864 (unknown) (unknown) ** ** Sternarchorhampus muelleri 2103 (unknown) (unknown) ** ** Sternarchorhampus muelleri 2102 (unknown) (unknown) ** ** Sternarchorhynchus roseni 2920 (unknown) (unknown) ** ** Sternarchorhynchus oxyrhynchus 7303 (unknown) Brazil ** ** Sternarchorhynchus oxyrhynchus 7304 (unknown) (unknown) ** ** Order Gymnotiformes: Family Electrophoridae Electrophorus electricus 2026 MZUSP 103218 Lago Secretaria, Tefé, Brazil ** GQ862593 GQ862541 Electrophorus electricus 2619 UF 116585 Rio Nanay, Peru ** GQ862592 GQ862540 Order Gymnotiformes: Family Gymnotidae MZUSP 75179 Lago Mamirauá, Tefé, ** GQ862595 GQ862543 Gymnotus arapaima 2002 Amazonas, Brazil MZUSP 103219 Lago Mamirauá, Tefé, ** GQ862596 GQ862544 Gymnotus arapaima 2003 Amazonas, Brazil Gymnotus cataniapo 2062 UF 174330 Rio Atabapo, Venezuela ** GQ862603 GQ862552 Gymnotus cataniapo 2063 UF 174332 Rio Cataniapo, Venezuela ** GQ862604 GQ862579 Gymnotus cylindricus 2092 ROM 84772 Rio Tortuguero, Costa Rica ** GQ862615 GQ862563 Gymnotus cylindricus 2093 ROM 84772 Rio Tortuguero, Costa Rica ** GQ862616 GQ862564 MZUSP 103220 Rio Solimões, Tefé, ** GQ862619 GQ862567 Gymnotus jonasi 2016 Amazonas, Brazil UF 131410 Rio Ucayali, Pacaya Samiria ** GQ862620 GQ862568 Gymnotus jonasi 2471 Reserve, Peru MZUSP 103221 Rio Solimões, Tefé, ** GQ862621 GQ862569 Gymnotus mamiraua 2012 Amazonas, Brazil MCP 29805 Rio Solimões, Tefé, ** GQ862622 GQ862570 Gymnotus mamiraua 2013 Amazonas, Brazil MZUSP 75155 Lago Mamirauá, Tefé, ** GQ862623 GQ862571 Gymnotus obscurus 2017 Amazonas, Brazil

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 MZUSP 75157 Lago Mamirauá, Tefé, ** GQ862624 GQ862572 Gymnotus obscurus 2018 Amazonas, Brazil Gymnotus pantherinus 2039 (no voucher) Rio Perequê-Açu, Brazil ** GQ862625 GQ862573 MZUSP 87564 Rio Vermelho, Sao Paulo, ** * * Gymnotus pantherinus 2945 Brazil Gymnotus tigre 7090 (not catalogued) (aquarium specimen) ** ** ** Gymnotus tigre 7349 (not catalogued) (aquarium specimen) ** ** ** MZUSP 75163 Rio Solimões, Tefé, ** * * Gymnotus varzea 2014 Amazonas, Brazil MZUSP 75164 Rio Solimões, Tefé, ** * * Gymnotus varzea 2015 Amazonas, Brazil Gymnotus n. sp. fritzi 7109 (not catalogued) Tefé, Amazonas, Brazil ** ** ** Gymnotus cf. tigre 2024 UF 122821 Rio Amazonas, Peru ** GQ862632 GQ862580 Order Gymnotiformes: Family Hypopomidae Brachyhypopomus beebei 2510 (unknown) Peru ** ** ** Brachyhypopomus beebei 2524 (unknown) Peru ** ** Brachyhypopomus brevirostris 2617 UF 116556 Rio Nanay, Peru GQ862588 GQ862536 Brachyhypopomus brevirostris 7019 (unknown) Suriname ** ** Brachyhypopomus diazi 305 UF 174334 Rio Los Marias, Venezuela ** GQ862589 GQ862537 Brachyhypopomus diazi 2408 UF 174334 Rio Alpargatón, Venezuela ** GQ862590 GQ862538 Brachyhypopomus occidentalis 2948 (unknown) Rio Atrato, Choco, Colombia ** ** ** Brachyhypopomus occidentalis 2949 (unknown) Rio Atrato, Choco, Colombia ** ** ** Brachyhypopomus occidentalis 7156 (unknown) Panama ** ** ** Brachyhypopomus occidentalis 7162 (unknown) Panama ** ** ** Brachyhypopomus n. sp. PAL 2432 UF 148572 Rio Palenque, Ecuador ** GQ862591 GQ862539 Brachyhypopomus n. sp. PAL 2433 (unknown) Rio Palenque, Ecuador ** ** ** Brachyhypopomus pinnicaudatus 2121 (unknown) Tefé, Brazil ** ** ** Brachyhypopomus pinnicaudatus 2122 (unknown) Tefé, Brazil ** ** **

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Hypopomus artedi 2232 ANSP 179505 Rio Mazaruni, Guyana ** GQ862637 GQ862585 Hypopomus artedi 2233 AUM 35574 Rio Mazaruni, Guyana ** ** ** Hypopygus lepturus 2438 (unknown) Rio Nanay, Peru ** ** Hypopygus lepturus 2439 (unknown) Rio Nanay, Peru ** ** Microsternarchus bilineatus 2396 (unknown) Rio Atabapo, Venezuela ** ** (unknown) Rio Atabapo, Santa Barbara, ** ** ** Racenisia fimbripinna 2339 Venezuela (unknown) Rio Atabapo, Santa Barbara, ** ** ** Racenisia fimbripinna 2340 Venezuela Steatogenys duidae 2146 (unknown) Tefé, Brazil ** ** ** Steatogenys duidae 2147 (unknown) Tefé, Brazil ** ** ** (unknown) Rio Atabapo, Santa Barbara, ** ** Stegostenopos cryptogenes 2322 Venezuela Order Gymnotiformes: Family Rhamphichthyidae Gymnorhamphichthys rondoni 2153 (unknown) Brazil ** ** Gymnorhamphichthys rondoni 2154 (unknown) Brazil ** ** Rhamphyichthys "saddled" 7282 (unknown) (unknown) ** ** Rhamphyichthys "saddled" 7283 (unknown) (unknown) ** ** Rhamphyichthys "clear" 7284 (unknown) (unknown) ** ** ** Rhamphyichthys "clear" 7285 (unknown) (unknown) ** ** ** Rhamphyichthys hypostomus 7309 (unknown) Brazil ** ** Rhamphyichthys hypostomus 7310 (unknown) Brazil ** ** Rhamphyichthys lineatus 2630 (unknown) (aquarium specimen) ** ** ** Rhamphyichthys lineatus 2158 (unknown) Brazil ** ** ** Rhamphyichthys sp. 7286 (unknown) (unknown) ** Rhamphyichthys sp. 7287 (unknown) (unknown) ** ** Order Gymnotiformes: Family Sternopygidae Archoalemus blax 7307 (unknown) Brazil ** ** **

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Archoalemus blax 7308 (unknown) (unknown) ** ** ** Distocyclus conirostris 7306 (unknown) (unknown) ** ** Distocyclus conirostris 2911 (unknown) (unknown) ** ** Eigenmannia humboldtii 2811 (unknown) Colombia ** ** ** Eigenmannia humboldtii 2822 (unknown) Colombia ** ** ** Eigenmannia limbata 1938 UF 126255 Peru ** ** ** Eigenmannia limbata 1939 UF 126255 Peru ** ** ** Eigenmannia virescens 2817 (unknown) Colombia ** ** ** Eigenmannia virescens 2818 (unknown) Colombia ** ** Eigenmannia virescens 2309 (unknown) Venezuela ** ** ** Eigenmannia virescens 2310 (unknown) Venezuela ** ** ** Rhabdolichops caviceps 2883 (unknown) (unknown) ** ** ** Rhabdolichops caviceps 2887 (unknown) (unknown) ** ** ** Rhabdolichops eastwardi 2105 (unknown) (unknown) ** ** Rhabdolichops eastwardi 2104 (unknown) (unknown) ** ** Sternopygus aequilabiatus 2819 (unknown) Colombia ** * ** Sternopygus aequilabiatus 2820 (unknown) Colombia ** * ** (unknown) Lago Tefé, Igarapé ** * ** Sternopygus astrabes 2203 Repartimento, Brazil Sternopygus astrabes 2204 (unknown) Brazil ** * ** Sternopygus dariensis 7223 (unknown) West of the Andes ** ** ** Sternopygus dariensis 7224 (unknown) West of the Andes ** ** ** Sternopygus macrurus 2507 UF 131396 Peru ** * ** Sternopygus macrurus 2639 UF 117121 Rio Nanay, Peru ** GQ862639 GQ862587 Order Characiformes: Family Alestes baremoze AMNH 226451 AY791360 AY804029 Alestopetersius hilgendorfi AMNH 233438 AY791432 AY804114 Arnoldichthys spilopterus AMNH 233399 AY791364 AY804032

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Bathyaethiops breuseghemi AMNH 233422 AY791430 AY804113 Brycinus carolinae 1 RUSI 065136 AY791359 AY804028 Brycinus carolinae 2 AMNH 233628 AY791373 AY804045 Brycinus nurse AMNH 233415 AY804034 Brycinus schoutedeni AY791377 AY804050 Bryconaethiops microstoma AMNH 233390 AY791371 AY804041 Hydrocynus vitattus 1 AMNH 233623 AY791404 AY804083 Hydrocynus vitattus 2 RUSI 061489 AY791410 AY804091 Ladigesia roloffi AMNH 233394 AY791417 AY804097 Micralestes occidentalis RUSI 065135 AY791358 AY804027 Phenacogrammus interruptus 1 AMNH 233442 AY791421 AY804102 Phenacogrammus interruptus 2 AMNH 233444 AY791434 AY804116 Order Characiformes: Family Charicidae Chalceus macrolepidotus AMNH 233404 AY791385 AY804060 Exodon paradoxus AMNH 233426 AY791397 AY804072 Salminus maxillosus AY791438 AY804124 Order Characiformes: Family Crenuchidae Characidium fasciatum AMNH 233251 AY791380 AY804055 Characidium vidali MNRJ 12838 AY791388 AY804064 Order Characiformes: Family Ctenolucidae Ctenolucius hujeta AMNH 233412 AY791384 AY804059 Order Characiformes: Family Distochodus notospilus AMNH 231537 AY791395 AY804069 Distochodus sexfasciatus AMNH 233393 AY791396 AY804071 Hemigrammocharax multifasciatus RUSI 63497 AY791407 AY804085 Neolebias ansorgii AY791423 AY804106 Neolebias trilineatus AMNH 233439 AY791425 AY804108 Order Characiformes: Family Hepsetidae

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Hepsetus odoe AMNH 231495 AY791408 AY804086 Order Characiformes: Family Prochilodontidae Prochilodus nigricans AMNH 233305 AY791437 AY804120 Order Characiformes: Family Serrasalmidae Colossoma macropomum AY791386 AY804061 Piaractus brachypomus MZUSP 85849 AY791429 AY804112 Pygocentrus nattereri AY791436 AY804119 Order Cypriniformes: Family Catosomidae Myxocyprinus asiaticus AP006764 DQ367043 Order Cypriniformes: Family NM_001 NC_002333 NM_131385 Danio rerio 039825 Barbus barbus AB238965 DQ366990 Carassius auratus DQ366941 Order Cypriniformes: Family Gobioniae Gobio gobio AB239596 DQ367015 Order Cypriniformes: Family Leuciscinae Cyprinella lutrensis AB070206 DQ367019 Phoxinus phoxinus EF094550 DQ367022 Order Cypriniformes: Family Tincinae Tinca tinca AB218686 DQ367029 Order Cypriniformes: Family Xenocyprinae argentea AP009059 DQ367024 Order Siluriformes: Family Akysidae Acrochordonichthys rugosus INHS 93578 EU490899 DQ492332 Order Siluriformes: Family Anchariidae Gogo arcuatus UMMZ 238042 FJ013160 DQ492415 Order Siluriformes: Family Ariidae

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Bagre marinus CU 906192 AJ581355 DQ492411 Order Siluriformes: Family Aspredinidae Mycromyzon akamai ANSP 182777 EU490892 DQ492424 Order Siluriformes: Family Auchenipteridae Ageneiosus ucayalensis INHS 52920 EU490898 DQ492351 Order Siluriformes: Family Bagridae Bagrus docmak CU 90408 EU490906 EU490906 Hemibagrus wyckioides INHS 93682 EU490911 DQ492349 Heterobagrus bocourti INHS 93586 EU490912 DQ492350 Private Collection EU490918 DQ492347 Olyra longicaudatus H.H. Ng Rita rita Private Collection EU490921 DQ492405 H.H. Ng Order Siluriformes: Family Cetopsidae Cetopsis coecutiens INHS 52923 DQ486759 DQ492419 Order Siluriformes: Family Clariidae Clarias gabonensis CU 803712 AY995129 DQ492406 Order Siluriformes: Family Cranoglanididae Cranoglanis bouderius ASIZB 1383452 AF475155 DQ492401 Order Siluriformes: Family Doradidae Acanthodoras cataphractus ANSP 179854 EU490895 DQ492354 Order Siluriformes: Family Horabagridae Horabagrus brachysoma INHS 935851; EU490913 DQ492409 INHS 9359052 Order Siluriformes: Family Ictaluridae Ictalurus punctatus INHS 939041; AY184254 DQ492398 ANSP 1803682 Order Siluriformes: Family Pimelodidae

Genus Species Tissue Museum catalog Collection Locality Nucleotide sequences sample number scn4aa 3’ cytochrome recombination number b activating gene 2 Phractocephalus hemioliopterus ANSP 179452 DQ486763 DQ492364 Pimelodus ornatus uncat., Coll. M. EF564741 DQ492363 Azpelicueta P2791; INHS 491022 Order Siluriformes: Family Plotosidae Plotosus lineatus ANSP 182776 EU490919 DQ492418 Order Siluriformes: Family Schilbidae Ailia coila Private Collection EU490901 DQ492340 H.H. Ng Schilbe intermedius CU 882512 AJ245673 DQ492395 Order Siluriformes: Family Siluridae Kryptopterus minor TNHC 293491; AY458895 DQ492373 ANSP 1827782 Order Siluriformes: Family Sisoridae Bagarius yarrelli INHS 93673 EU490904 DQ492334 Order Siluriformes: Family Trichomycteridae Trichomyceterus guianense INHS 49567 DQ486760 DQ492319 Order Gonorhynchiformes: Family Chanidae Chanos chanos AB054133

1 Museum catalog number associated with cytb sample only.
2 Museum catalog number associated with rag2 sample only.

Parsimony-based phylogenetic reconstruction was implemented in PAUP* (Swofford 2002) using the stepwise heuristic search algorithm with the following parameters for 2000 search replicates: tree bisection and reconnection (TBR) branch swapping, and holding 10 variants at each step. Bootstrapping was also conducted for 2000 search replicates with the same parameters (Müller 2005).
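The resampling step that underlies bootstrapping can be sketched in a few lines. This is an illustration of the general technique only (sampling alignment columns with replacement), not PAUP*'s implementation; the function name and the dictionary representation of the alignment are assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one bootstrap pseudoreplicate of an alignment.

    Columns are sampled with replacement until the pseudoreplicate has the
    same number of sites as the original alignment. Every taxon receives the
    same sampled columns, so each column stays intact.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}


if __name__ == "__main__":
    toy = {"taxon1": "ACGT", "taxon2": "AGGT", "taxon3": "ACGA"}
    rng = random.Random(2014)  # fixed seed so the example is repeatable
    for _ in range(3):
        print(bootstrap_alignment(toy, rng))
```

In a full analysis, a tree search would be run on each pseudoreplicate, and the bootstrap value of a clade would be the percentage of pseudoreplicate trees that contain it.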

Bayesian phylogenetic reconstruction was implemented in MrBayes 3.1.2 (Huelsenbeck and Ronquist 2001), using the model of molecular evolution that best fit the data as determined using MrModeltest 2.3 (Nylander 2004). The same model was selected for the total evidence and individual locus alignments: the general time-reversible model with a proportion of invariant nucleotide sites and gamma-distributed variation in substitution rates across the variable sites (GTR + I + G; Brinkman and Leipe 2001). The total evidence alignment was partitioned into the three loci and analyzed with temp = 0.2. The cytb, rag2, and scn4aa 3’ alignments were analyzed with temp = 0.05, 0.1, and 0.2. Each of these four alignments was analyzed with nperts = default, 2, and 4. The analyses were run for up to 10 million generations with four chains each, with the aim of reaching an average standard deviation of split frequencies of 0.01 or less. All other parameters were program defaults.
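The convergence diagnostic named above, the average standard deviation of split frequencies, can be sketched as follows. The sketch makes several assumptions: each sampled tree is represented as a set of splits (here, frozensets of taxon names), the sample standard deviation is taken across independent runs, and only splits reaching a minimum frequency of 0.1 in at least one run are included. It is not MrBayes code, and the function names are hypothetical.

```python
from collections import Counter
from statistics import stdev


def split_frequencies(trees):
    """Frequency of each split among the sampled trees of one run."""
    counts = Counter(split for tree in trees for split in tree)
    return {split: n / len(trees) for split, n in counts.items()}


def asdsf(runs, min_freq=0.1):
    """Average standard deviation of split frequencies across two or more runs.

    A split absent from a run is given frequency 0 in that run. The minimum
    frequency cutoff is an assumed convention for this sketch.
    """
    freqs = [split_frequencies(trees) for trees in runs]
    splits = {s for f in freqs for s, v in f.items() if v >= min_freq}
    if not splits:
        return 0.0
    deviations = [stdev([f.get(s, 0.0) for f in freqs]) for s in splits]
    return sum(deviations) / len(deviations)


if __name__ == "__main__":
    ab = frozenset({"A", "B"})
    cd = frozenset({"C", "D"})
    run1 = [{ab, cd}, {ab, cd}, {ab}, {ab, cd}]
    run2 = [{ab, cd}, {ab}, {ab}, {ab, cd}]
    print(asdsf([run1, run2]))
```

Values near zero indicate that independent runs are sampling the same splits at similar frequencies, which is the sense in which the 0.01 target measures convergence.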

Appendix A.3 Results

A.3.1 Nucleotide Sequence Data

Nucleotide sequences were obtained from 110 gymnotiform individuals representing all 6 families. Of these, 99 represent 49 recognized species, and 11 represent up to 8 additional undescribed species. The numbers of species sampled per family are as follows: Apteronotidae (15); Electrophoridae (1); Gymnotidae (11); Hypopomidae (13); Rhamphichthyidae (7); and Sternopygidae (10). Sequences were also obtained from 65 outgroup individuals, which represent 62 species from other ostariophysian orders. The numbers of species sampled per order are as follows: Characiformes (29); Cypriniformes (8); Siluriformes (24); and Gonorhynchiformes (1). Appendix A Table 2 identifies the specimens used for analysis by their scientific names, tissue sample numbers, museum catalogue numbers, and collection localities.

A total of 409 nucleotide sequences were obtained for phylogenetic analyses. For cytochrome b (cytb), recombination activating gene 2 (rag2), and the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’), 85, 86, and 66 sequences, respectively, were obtained experimentally for this study, and 85, 86, and 1, respectively, were obtained from GenBank.
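The sequence tally can be checked with a few lines of arithmetic. The counts below are taken directly from the paragraph above; the variable names are illustrative only.

```python
# Sequence counts per locus, split by source, as reported in the text.
sequence_counts = {
    "cytb": {"this study": 85, "GenBank": 85},
    "rag2": {"this study": 86, "GenBank": 86},
    "scn4aa 3'": {"this study": 66, "GenBank": 1},
}

# Total per locus, then the grand total across loci.
per_locus = {locus: sum(sources.values()) for locus, sources in sequence_counts.items()}
total = sum(per_locus.values())

if __name__ == "__main__":
    print(per_locus)
    print(total)  # 409
```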

The total evidence nucleotide alignment consisted of 3570 nucleotide positions, 1473 of which were parsimony informative, and another 291 of which were variable but parsimony uninformative. The alignment consisted of nucleotide positions from the following loci: 1146 from cytb; 1611 from rag2; and 813 from scn4aa 3’. The alignment of the mitochondrial housekeeping locus cytb included 730 variable positions, of which 671 were parsimony informative. The rag2 alignment included 870 variable positions, of which 744 were parsimony informative. The scn4aa 3’ alignment included 564 variable positions, of which 458 were parsimony informative.
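The site categories used above can be made concrete with a small sketch. It follows the usual definition, in which a site is parsimony informative when at least two character states each occur in at least two taxa, and it assumes that gaps and ambiguous base calls are ignored when classifying a site. The function names are hypothetical, and this is not the routine PAUP* uses to report these counts.

```python
from collections import Counter


def classify_site(column: str) -> str:
    """Classify one alignment column as constant, variable but uninformative, or informative."""
    counts = Counter(c for c in column.upper() if c in "ACGT")
    if len(counts) <= 1:
        return "constant"
    # Number of states that are shared by at least two taxa.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "variable_uninformative"


def site_summary(alignment: list[str]) -> Counter:
    """Count the sites of each category in an alignment given as equal-length rows."""
    return Counter(classify_site("".join(col)) for col in zip(*alignment))


if __name__ == "__main__":
    toy = ["AAA", "AAC", "ACC", "ACA"]
    print(site_summary(toy))
```

A site where only one taxon differs from the rest is variable but requires the same number of changes on every tree, which is why it is counted separately from the informative sites.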

Scn4aa 3’ sequences were not obtained for Apteronotidae and outgroup taxa, despite the use of various primers (amplification and sequencing) and polymerase chain reaction parameters (Appendix A Table 2). Among the nucleotide sequences obtained, 18.60% of nucleotides were ambiguous (proportion of ambiguous sites among nucleotides: 9743/194820 cytb nucleotides; 86260/277092 rag2 nucleotides; and 2343/56910 scn4aa 3’ nucleotides). The ambiguous sites have chromatograms that do not clearly show a single nucleotide identity. Although some may be polymorphic sites, they were assumed to be due to experimental error for the purposes of phylogenetic analyses.
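The overall percentage of ambiguous nucleotides follows from pooling the per-locus counts reported above. A short sketch of the arithmetic, with illustrative variable and function names:

```python
# (ambiguous nucleotides, total nucleotides) per locus, as reported in the text.
ambiguous_counts = {
    "cytb": (9743, 194820),
    "rag2": (86260, 277092),
    "scn4aa 3'": (2343, 56910),
}


def percent_ambiguous(counts: dict[str, tuple[int, int]]) -> float:
    """Pooled percentage of ambiguous nucleotides across all loci."""
    total_ambiguous = sum(a for a, _ in counts.values())
    total_nucleotides = sum(n for _, n in counts.values())
    return 100 * total_ambiguous / total_nucleotides


if __name__ == "__main__":
    print(f"{percent_ambiguous(ambiguous_counts):.2f}%")  # 18.60%
```

The pooled value is 98346 ambiguous nucleotides out of 528822 in total, so it is weighted by the number of nucleotides at each locus rather than being a simple mean of the three per-locus percentages.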

This dataset represents the most complete sampling of gymnotiform nucleotide sequence data to date. Compared to the most recent molecular phylogenetic reconstruction of Gymnotiformes (Alves-Gomes et al. 1995), this dataset includes 9 additional gymnotiform genera as well as an additional locus.

A.3.2 Phylogenetic Reconstruction

Molecular phylogenetic analyses were conducted using nucleotide alignments of individual loci (cytb, rag2, and scn4aa 3’) and the total evidence alignment, using both maximum parsimony (MP) and Bayesian inference (BI) algorithms. The 50% majority-rule consensus topologies from the MP analyses are shown in Appendix A Figures 1-4. The MP consensus topologies were produced from the most parsimonious trees found for each alignment: cytb (4 trees); rag2 (46 trees); scn4aa 3’ (2822 trees); and the total evidence nucleotide alignment (7 trees). The BI analyses did not converge to the target average standard deviation of split frequencies (≤ 0.01), even after more than two months of analysis per nucleotide alignment using a 2.50 GHz quad-core computer with 4 GB of random access memory.
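How a 50% majority-rule consensus summarizes a set of equally parsimonious trees can be sketched as follows. The sketch assumes each tree is represented as a set of clades (frozensets of taxon names) and keeps only clades found in more than half of the trees. It illustrates the rule itself rather than PAUP*'s implementation, and the function name is hypothetical.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return the clades present in more than `threshold` of the trees, with their frequencies."""
    counts = Counter(clade for tree in trees for clade in tree)
    n_trees = len(trees)
    return {
        clade: n / n_trees
        for clade, n in counts.items()
        if n / n_trees > threshold
    }


if __name__ == "__main__":
    ab = frozenset({"A", "B"})
    bc = frozenset({"B", "C"})
    abc = frozenset({"A", "B", "C"})
    trees = [{ab, abc}, {ab, abc}, {bc, abc}]
    for clade, freq in majority_rule_clades(trees).items():
        print(sorted(clade), round(freq, 2))
```

Clades retained in this way are guaranteed to be mutually compatible, because two conflicting clades cannot both occur in more than half of the trees.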

The order Gymnotiformes was resolved as a monophyletic group based on phylogenetic reconstruction of the rag2 locus (Appendix A Figure 2). It would also have been resolved as a monophyletic group based on the cytb locus and total evidence nucleotide alignment were it not for the inclusion of a siluriform species (Cetopsis coecutiens) in the group (Appendix A Figures 1 and 4). The most distant outgroup to order Gymnotiformes was identified as Cypriniformes based on the cytb and total evidence nucleotide alignments (Appendix A Figures 1 and 4). The closest outgroup was identified as order Characiformes based on the total evidence nucleotide alignment (Appendix A Figure 4); however, this was not well supported.

Appendix A Figure 1. Molecular Phylogeny for Gymnotiformes Based on the cytb Nucleotide Alignment Using Maximum Parsimony

Phylogenetic reconstruction was conducted based on the nucleotide alignment of cytochrome b (cytb) using maximum parsimony. Individuals for which nucleotide sequences had not been obtained for that locus were pruned from the 50% majority-rule consensus topologies. Numbers above the branches indicate bootstrap values. The families are coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).

Appendix A Figure 2. Molecular Phylogeny for Gymnotiformes Based on the rag2 Nucleotide Alignment Using Maximum Parsimony

Phylogenetic reconstruction was conducted based on the nucleotide alignment of recombination activating gene 2 (rag2) using maximum parsimony. Individuals for which nucleotide sequences had not been obtained for that locus were pruned from the 50% majority-rule consensus topologies. Numbers above the branches indicate bootstrap values. The families are coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).

Appendix A Figure 3. Molecular Phylogeny for Gymnotiformes Based on the scn4aa 3’ Nucleotide Alignment Using Maximum Parsimony

Phylogenetic reconstruction was conducted based on the nucleotide alignment of the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus (scn4aa 3’) using maximum parsimony. Individuals for which nucleotide sequences had not been obtained for that locus were pruned from the 50% majority-rule consensus topologies. Numbers above the branches indicate bootstrap values. The families are coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).

Appendix A Figure 4. Molecular Phylogeny of Gymnotiformes Based on the Total Evidence Alignment

Phylogenetic reconstruction was conducted based on the total evidence nucleotide alignment for Gymnotiformes, consisting of nucleotide sequences from cytochrome b, recombination activating gene 2, and the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus. The 50% majority-rule consensus topology is shown. Numbers above the branches indicate bootstrap values. The families are coloured as follows: Apteronotidae (light blue); Electrophoridae (dark blue); Gymnotidae (violet); Hypopomidae (red); Rhamphichthyidae (yellow); Sternopygidae (green).

The families Apteronotidae and Sternopygidae were resolved as monophyletic groups based on the cytb, rag2, and total evidence nucleotide alignments (Appendix A Figures 1-2 and 4). The family Gymnotidae was resolved as a monophyletic group based on the rag2, scn4aa 3’, and total evidence nucleotide alignments (Appendix A Figures 2-4). It would also have been resolved as a monophyletic group based on the cytb locus were it not for the inclusion of a siluriform species (C. coecutiens) in the group (Appendix A Figure 1). The families Electrophoridae (consisting of only 1 species) and Rhamphichthyidae were consistently resolved as monophyletic groups, as was a clade in which Rhamphichthyidae is sister to a specific sub-group of the family Hypopomidae that includes Steatogenys duidae (Appendix A Figures 1-4). The families Rhamphichthyidae and Hypopomidae together were resolved as a monophyletic group, with the former nested within the latter, based on the cytb, rag2, and total evidence nucleotide alignments.

Within the family Gymnotidae, three major monophyletic clades were consistently resolved (Appendix A Figures 1-2 and 4; clade names as per Lovejoy et al. 2010): Gymnotus carapo group; G2 group; and G1 group. Within the family Sternopygidae, the genus Sternopygus was consistently resolved as a monophyletic group with the possible exception of Sternopygus aequilabiatus (Appendix A Figures 1-4).

The family Electrophoridae was identified as sister clade of a group composed of all the other gymnotiform families based on the rag2 locus (Appendix A Figure 2). However, based on the total evidence nucleotide alignment, the family Sternopygidae was sister clade of the other gymnotiform families; the family Electrophoridae was sister clade of the family Gymnotidae; and the family Apteronotidae was sister clade of the Electrophoridae + Gymnotidae group (Appendix A Figure 4).

A.3.3 Variation in the Nav1.4a C-terminus

There were 177 variable sites in the gymnotiform voltage-gated sodium channel protein Nav1.4a carboxyl-terminus (C-terminus) amino acid alignment compared with 43 in the Gymnotus alignment (Table 7). The eight sites that were identified as positively selected from the Gymnotus alignment were also variable in amino acid identity in the gymnotiform alignment.

Appendix A.4 Discussion

A.4.1 Gymnotiform Phylogeny

There are several phylogenies of order Gymnotiformes (Chapter 1 Figure 2). Some of the proposed phylogenetic relationships are consistent with each other, while others are unclear. This project used additional taxa and nucleotide sequences to provide further evidence towards clarifying proposed phylogenetic relationships among gymnotiforms. However, it is still not clear which family is the most basal.

Phylogenies were reconstructed using maximum parsimony (MP) and Bayesian inference (BI). For both MP and BI analyses, the length of time required generally depends on the number of taxa, the number of non-ambiguous characters, and the number of search replicates or generations specified. How well the consensus phylogenies fit the data is estimated by bootstrap values and posterior probabilities, respectively. For MP analyses, individual search replicates represent independent calculations. For BI analyses, each generation builds on the previous one, so consensus phylogenies are estimated from later generations that are similar to each other. Unfortunately, BI analyses based on individual loci and the total evidence alignment did not produce later generations that met the target amount of similarity with each other (average standard deviation of split frequencies ≤ 0.01). The possibility of meeting this target through additional generations and computing power is low, given experience with other datasets (Hall 2011). Meeting the target through a decrease in non-ambiguous characters is possible (Wiens 1998), as is meeting it through an increase in taxa. Taxon sampling was deliberately diverse and inclusive (approximately 2 and 10 times the number of ingroup and outgroup species compared with Chapters 1-4, respectively). However, none of the individual locus analyses met the target. Other phylogenetic reconstruction algorithms, such as minimum evolution and maximum likelihood, could also be used (Alves-Gomes 1999).

The outgroup consisted of ostariophysian species belonging to orders outside of order Gymnotiformes (Characiformes, Cypriniformes, Gonorhynchiformes, and Siluriformes). The order Gymnotiformes was resolved as monophyletic when reconstructed with the recombination activating gene 2 (rag2) locus, consistent with all the existing phylogenies (Appendix A Figure 2). However, when reconstructed with the cytochrome b (cytb) locus, a siluriform species of family Cetopsidae (Cetopsis coecutiens) was well supported as part of the gymnotiform clade (Appendix A Figure 1). The possibility of technical error during and downstream from GenBank sequence download is low, since the C. coecutiens cytb nucleotide sequence in the alignment is identical to that listed in GenBank. Technical error during initial collection of sequence data remains possible. According to listings in GenBank, the C. coecutiens rag2 and cytb sequences are associated with the same museum catalog number, but 2 different publications (Hardman and Lundberg 2006; Sullivan et al. 2006). Although the placement of the C. coecutiens specimen was consistent between the 2 publications, the topologies may not be comparable due to key differences in species sampled. Other explanations are also possible, given the intriguing observation that family Cetopsidae may be the only non-gymnotiform fish within superorder Ostariophysi with tuberous electroreceptors (Alves-Gomes 2001). The most distant outgroup to Gymnotiformes was identified as Cypriniformes, consistent with most Ostariophysi phylogenies (Saitoh et al. 2003). The closest outgroup to Gymnotiformes was identified as Characiformes, but this was not well supported.

Within the order Gymnotiformes, families Electrophoridae, Gymnotidae, Rhamphichthyidae, Apteronotidae, and Sternopygidae were resolved as monophyletic, as expected, with the exception of non-gymnotiform taxon C. coecutiens (Appendix A Figures 1-2 and 4). There was little evidence to support families Electrophoridae and Gymnotidae as a monophyletic clade of sister families, consistent with the existing nucleotide phylogeny and inconsistent with morphological phylogenies. The families Hypopomidae and Rhamphichthyidae were resolved as a monophyletic clade, consistent with existing phylogenies. However, the topology within this clade differs from existing phylogenies. It contains 2 sister clades, one of which includes genera from both families (Gymnorhamphichthys, Rhamphyichthys, Microsternarchus, Steatogenys, and Stegostenopos), while the other includes only genera from family Hypopomidae (Brachyhypopomus, Hypopomus, and Racenisia). It was not clear to which of those sister clades genus Hypopygus is more closely related.

Within family Gymnotidae, 3 major monophyletic clades were resolved, consistent with Chapters 1-4 and the existing nucleotide phylogeny (Lovejoy et al. 2010; Appendix A Figures 1-2 and 4). Within family Sternopygidae, the genus Sternopygus was resolved as monophyletic, consistent with existing phylogenies (Appendix A Figures 1-4).

Within Gymnotiformes, various families have been proposed as the most basal. Data from this project support either Electrophoridae or Sternopygidae as the most basal clade (Appendix A Figures 2 and 4, respectively).

A.4.2 Utility of the scn4aa 3’ for Phylogenetic Reconstruction

The scn4aa 3’ locus (the portion of the voltage-gated sodium channel gene scn4aa that encodes the protein’s carboxyl-terminus) was one of several loci used to reconstruct the Gymnotus phylogeny. This locus is approximately 800 nucleotides long (Noda et al. 1984), and nucleotide sequences were obtained from 57 gymnotiform species. Analyses of these sequences showed that the scn4aa 3’ locus contributes towards a meaningful and accurate phylogenetic topology with a reasonable amount of resolution. However, better use of phylogenetic reconstruction algorithms is needed, as well as additional nucleotide sequences from certain clades of gymnotiforms and outgroups.

The aligned nucleotides were from an orthologous locus, which contributed towards meaningful reconstruction of the phylogeny among gymnotiform species (Fitch 2000). The scn4aa gene is one of two paralogs expressed in actinopterygian myogenic tissue, and one of eight paralogs encoded in the actinopterygian genome (Novak et al. 2006; Widmark et al. 2011). The scn4aa 3’ amplification primers were designed to be specific for those orthologous sequences, and they amplified only those sequences rather than sequences from other paralogs.

The nucleotide character alignment had a large proportion of parsimony-informative characters, and the proportion of ambiguous characters was low. This contributed towards accurate reconstruction of the phylogeny among gymnotiform species (Wiens 1998; Hall 2011). Previous analyses had confirmed the absence of introns at this locus in gymnotiforms (Chapters 1-4; Widmark et al. 2011). This improves the accuracy of the alignment, since introns tend to be more variable in length (Hughes and Yeager 1997). Also, amino acids are more conserved than nucleotides, and alignments of amino acid sequences can be used to mitigate misalignment of exon indels among species (Wernersson and Pedersen 2003). The scn4aa C-terminus sequences contained the highest proportion of parsimony-informative characters per total characters among the loci in the dataset. In addition, the scn4aa 3’ sequences contained only 4.11% ambiguous characters, compared with 18.60% for the whole dataset. However, the scn4aa 3’ primers were not specific enough to amplify only scn4aa sequences from the family with neurogenic electric organs (Apteronotidae) and from outgroup ostariophysans. To obtain these nucleotide identities, specific internal sequencing primers could perhaps be designed for use with the polymerase chain reaction (PCR) products, or specific PCR products could be purified by gel filtration and sequenced using existing primers.
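
As a rough illustration of the two alignment statistics discussed above, the following Python sketch computes the proportion of parsimony-informative sites and the proportion of ambiguous characters for a toy alignment. It is not the procedure used in this project. The function names, the toy sequences, and the choice to count every character other than A, C, G, or T (including gaps) as ambiguous are assumptions made for illustration.

```python
from collections import Counter

# Assumption: only these four states count as unambiguous nucleotides.
UNAMBIGUOUS = set("ACGT")


def is_parsimony_informative(column):
    """True if at least two unambiguous states each occur in at least two taxa."""
    counts = Counter(base for base in column.upper() if base in UNAMBIGUOUS)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def alignment_summary(alignment):
    """Return (fraction of parsimony-informative sites, fraction of ambiguous characters)."""
    seqs = [s.upper() for s in alignment.values()]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    # Build each alignment column by taking position i from every sequence.
    columns = ("".join(s[i] for s in seqs) for i in range(length))
    informative = sum(is_parsimony_informative(col) for col in columns)
    # Gaps, N, ?, and IUPAC ambiguity codes are all counted as ambiguous here.
    ambiguous = sum(1 for s in seqs for base in s if base not in UNAMBIGUOUS)
    return informative / length, ambiguous / (length * len(seqs))


# Hypothetical toy alignment (four taxa, six aligned positions).
toy_alignment = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTAN",
    "taxon_C": "ATGCAC",
    "taxon_D": "ATGC-C",
}
pi_fraction, ambiguous_fraction = alignment_summary(toy_alignment)
print(f"parsimony-informative sites: {pi_fraction:.2%}")
print(f"ambiguous characters: {ambiguous_fraction:.2%}")
```

In the toy alignment, only the second and fourth columns carry two states that each appear in two taxa, so two of six sites are parsimony-informative. Whether gaps should be counted as ambiguous is a design choice, and a different convention would change the second statistic.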

The nucleotide characters of scn4aa 3’ seemed to be reasonably variable, which contributed towards resolution of the phylogeny among gymnotiform species (Brown et al. 1979). Voltage-gated sodium channels are highly conserved in nucleotide sequence and function across species (Goldin 2002). However, scn4aa in Actinopterygii had been predicted to vary in nucleotide sequence (Novak et al. 2006). This variability was previously confirmed within the actinopterygian order Gymnotiformes with a small sample of species (Zakon et al. 2006; Arnegard et al. 2010), and was confirmed again in this project with a more comprehensive sample of species. When scn4aa 3’ sequences are included in phylogenetic reconstruction, the proposed evolutionary relationships among gymnotiforms are generally consistent with existing published phylogenies.

When characters at a locus vary at similar rates among lineages, the resulting phylogeny may be used as a primary means for estimation of species divergence timing (Schwartz 2007). However, characters are unlikely to vary at similar rates among lineages, if they were subjected to selective pressures that resulted in divergence of those species. The voltage-gated sodium channel protein Nav1.4a in Gymnotiformes may be an example of the latter case, since the protein has an important role in characteristics that may be under selective pressure among some lineages.

A.4.3 Variation at the Nav1.4a C-terminus

Existing analyses of patterns of selection on the voltage-gated sodium channel protein Nav1.4a among gymnotiforms and non-electric fish focused on motifs at and between the homologous domains of the protein (Zakon et al. 2006; Arnegard et al. 2010). Purifying selection was detected among lineages of non-electric fish, and neutral (or relaxed) selection was detected among basal lineages of gymnotiform fishes. Positive selection was also detected among gymnotiform lineages, but the analysis only included four species representing four gymnotiform families (Zakon et al. 2006). In contrast, the project described here focused on motifs of the Nav1.4a carboxyl-terminus (C-terminus) that may be involved in varying amplitudes and frequencies of electric organ discharges (EODs). All six gymnotiform families were represented, and the species sample was more than 14 times larger. Further analyses of phylogenetic topology are needed prior to analyses of patterns of selection, due to inconsistencies among existing phylogenetic topologies of gymnotiforms and the unexpected inclusion of a siluriform species (Cetopsis coecutiens) within the order Gymnotiformes.

Eight of the 43 Nav1.4a C-terminus sites that vary in amino acid identity among genus Gymnotus were positively selected (Table 7 in Chapter 3). It is possible that other gymnotiforms are also positively selected at some of those amino acid sites, since some of them are even more variable in amino acid identity when other gymnotiform families are included. It is also possible that more positively selected sites will be identified among order Gymnotiformes, since the number of sites variable in amino acid identity is more than four times larger (at least 177 sites).
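
The site counts compared above can be illustrated with a short Python sketch that lists the alignment columns containing more than one amino acid identity. The function name, the toy protein alignment, and the decision to ignore gaps and unknown residues are hypothetical choices for illustration, and this is not the analysis pipeline used in this project.

```python
def variable_sites(protein_alignment, ignore="-X?"):
    """Return 0-based indices of columns with more than one amino acid identity.

    Assumption: gap and unknown-residue symbols listed in `ignore` do not
    count as distinct amino acid identities.
    """
    seqs = [s.upper() for s in protein_alignment.values()]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    variable = []
    for i in range(length):
        states = {s[i] for s in seqs if s[i] not in ignore}
        if len(states) > 1:
            variable.append(i)
    return variable


# Hypothetical toy protein alignment (four species, five aligned residues).
toy_proteins = {
    "species_1": "MKLVE",
    "species_2": "MKIVE",
    "species_3": "MRLV-",
    "species_4": "MKLVD",
}
sites = variable_sites(toy_proteins)
print(sites)  # [1, 2, 4]

# Fold difference between the two counts reported in the text.
fold = 177 / 43
print(round(fold, 2))  # 4.12, i.e. more than four times larger
```

Applied to alignments of genus Gymnotus alone and of the whole order, a count of this kind would correspond to the 43 and 177 (or more) variable sites reported above.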

A.4.4 Summary and Future Directions

Some aspects of evolutionary relationships among Gymnotiformes were clarified using additional taxa and nucleotide sequences. However, it is still not clear which family is the most basal. In addition, the novel possibility of a species from family Cetopsidae (order Siluriformes) being evolutionarily closer to order Gymnotiformes needs to be further verified.

The scn4aa 3’ locus contributed towards a meaningful phylogenetic topology that provides a reasonable amount of resolution. Future phylogenetic reconstruction analyses that make better use of existing algorithms would provide better accuracy and confidence in the topology. In addition, future analyses would likely benefit from inclusion of scn4aa 3’ sequences of family Apteronotidae among Gymnotiformes and of the majority of gymnotiform outgroups.

Patterns of selection may be assessed once evolutionary relationships among gymnotiform fishes are further clarified (Yang 2007). Previous analyses of other motifs of the scn4aa paralog among a small sample of gymnotiform species found evidence of purifying, neutral (relaxed), and positive selection among specific lineages (Zakon et al. 2006; Arnegard et al. 2010). In addition, previous analyses of scn4aa 3’ among genus Gymnotus found evidence of purifying, neutral (relaxed), and positive selection among specific lineages and at specific amino acid sites (Chapters 1-4). Since the number of sites variable in amino acid identity among order Gymnotiformes is more than four times larger than that among genus Gymnotus (177 vs 43 sites), future analyses of scn4aa 3’ may identify additional amino acid sites that contribute to knowledge of protein function.

Appendix A.5 References

Agnew, W. S. (1984). Voltage-regulated sodium channel molecules. Annu Rev Physiol. 46, 517-530.

Alves-Gomes, J. A. (2001). The evolution of electroreception and bioelectrogenesis in teleost fish: a phylogenetic perspective. J Fish Biol. 58, 1489-1511.

Alves-Gomes, J. A., Ortí, G., Haygood, M., Heiligenberg, W. and Meyer, A. (1995). Phylogenetic analysis of the South American electric fishes (Order Gymnotiformes) and the evolution of their electrogenic system: a synthesis based on morphology, electrophysiology, and mitochondrial sequence data. Mol Biol Evol. 12(2), 298-318.

Arnegard, M. E., Zwickl, D. J., Lu, Y. and Zakon, H. H. (2010). Old gene duplication facilitates origin and diversification of an innovative communication system – twice. Proc Natl Acad Sci. 107(51), 22172-22177.

Bennett, M. V. L. (1961). Modes of operation of electric organs. Ann N Y Acad Sci. 94, 458-509.

Bergsten, J. (2005). A review of long-branch attraction. Cladistics. 21(2), 163-193.

Brinkman, F. S. L. and Leipe, D. D. (2001). Chapter 14: Phylogenetic analysis (In: Baxevanis, A. D. and Ouellette, B. F. F. Eds.), Bioinformatics: A practical guide to the analysis of genes and proteins, Second Edition. John Wiley & Sons Inc. (Electronic), pp. 323-358. ISBN 0-471-22392-1.

Brown, W. M., George, M. and Wilson, A. C. (1979). Rapid evolution of animal mitochondrial DNA. Proc Natl Acad Sci. 76(4), 1967-1971.

Calcagnotto, D., Schaefer, S. A. and DeSalle, R. (2005). Relationships among characiform fishes inferred from analysis of nuclear and mitochondrial gene sequences. Mol Phylogenet Evol. 36, 135-153.

Caputi, A. A. (1999). The electric organ discharge of pulse Gymnotiforms: The transformation of simple impulse into a complex spatio-temporal electromotor pattern. J Exp Biol. 202, 1229-1241.

Catterall, W. A. (1984). The molecular basis of neuronal excitability. Science. 223(4637), 653-661.

Catterall, W. A., Goldin, A. and Waxman, S. G. (2005). International Union of Pharmacology. XLVII. Nomenclature and structure-function relationships of voltage-gated sodium channels. Pharmacol Rev. 57(4), 397-409.

Crampton, W. G. R. and Albert, J. S. (2006). Evolution of electric signal diversity in Gymnotiform fishes (In: Ladich, F., Collin, S. P., Moller, P. and Kapoor, B. G. Eds.), Communication in Fishes. Science Publishers, Enfield, New Hampshire, pp. 657-731.

Fink, S. V. and Fink, W. L. (1981). Interrelationships of the Ostariophysan fishes (Teleostei). Zool J Linn Soc. 72, 297-353.

Fitch, W. M. (2000). Homology: a personal view on some of the problems. Trends Genet. 16(5), 227-231.

Goldin, A. L. (2002). Evolution of voltage-gated Na+ channels. J Exp Biol. 205, 575-584.

Gotter, A. L., Kaetzel, M. A. and Dedman, J. R. (1998). Electrophorus electricus as a model system for the study of membrane excitability. Comp Biochem Physiol. 119A(1), 225-241.

Hall, B. G. (2011). Phylogenetic trees made easy: a how-to manual, 4th Edition. Sinauer Associates, Inc., Sunderland, MA.

Hardman, M. and Lundberg, J. G. (2006). Molecular phylogeny and a chronology of diversification for "phractocephaline" catfishes (Siluriformes: Pimelodidae) based on mitochondrial DNA and nuclear recombination activating gene 2 sequences. Mol Phylogenet Evol. 40(2), 410-418.

Hardman, M. and Page, L. M. (2003). Phylogenetic relationships among bullhead catfishes of the genus Ameiurus (Siluriformes: Ictaluridae). Copeia. 2003(1), 20-33.

Huelsenbeck, J. P. and Ronquist, F. (2001). MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics. 17, 754-755.

Hughes, A. L. and Yeager, M. (1997). Comparative evolutionary rates of introns and exons in murine rodents. J Mol Evol. 45(2), 125-130.

Irwin, D. M., Kocher, T. D. and Wilson, A. C. (1991). Evolution of the cytochrome b gene of mammals. J Mol Evol. 32, 128-144.

Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Paabo, S., Villablanca, F. X. and Wilson, A. C. (1989). Dynamics of mitochondrial DNA evolution in mammals: amplification and sequencing with conserved primers. Proc Natl Acad Sci USA. 86, 6196-6200.

Lopreato, G. F., Lu, Y., Southwell, A., Atkinson, A. S., Hillis, D. M., Wilcox, T. P. and Zakon, H. H. (2001). Evolution and divergence of sodium channel genes in vertebrates. Proc Natl Acad Sci. 98(13), 7588-7592.

Lovejoy, N. R. and Collette, B. (2001). Phylogenetic relationships of new world needlefishes (Teleostei: Belonidae) and the biogeography of transitions between marine and freshwater habitats. Copeia. 2, 324-338.

Lovejoy, N. R., Lester, K., Crampton, W. G. R., Marques, F. P. L. and Albert, J. S. (2010). Phylogeny, biogeography, and electric signal evolution of Neotropical knifefishes of the genus Gymnotus (Osteichthyes: Gymnotidae). Mol Phylogenet Evol. 54, 278-290.

Lynch, M., O'Hely, M., Walsh, B. and Force, A. (2001). The probability of preservation of a newly arisen gene duplicate. Genetics. 159, 1789-1804.

Mayden, R. L., Chen, W.-J., Bart, H. L., Doosey, M. H., Simons, A. M., Tang, K. L., Wood, R. M., Agnew, M. K., Yang, L., Hirt, M. V., Clements, M. D., Saitoh, K., Sado, T., Miya, M. and Nishida, M. (2009). Reconstructing the phylogenetic relationships of the earth's most diverse clade of freshwater fishes – order Cypriniformes (Actinopterygii: Ostariophysi): A case study using multiple nuclear loci and the mitochondrial genome. Mol Phylogenet Evol. 51, 500-514.

Mayden, R. L., Tang, K. L., Conway, K. W., Freyhof, J., Chamberlain, S., Haskins, M., Schneider, L., Sudkamp, M., Wood, R. M., Agnew, M., Bufalino, A., Sulaiman, Z., Miya, M., Saitoh, K. and He, S. P. (2007). Phylogenetic relationships of Danio within the order Cypriniformes: a framework for comparative and evolutionary studies of a model species. J Exp Zool B Mol Dev Evol. 308B, 642-654.

Mills, A. and Zakon, H. H. (1987). Coordination of EOD frequency and pulse duration in a weakly electric wave fish: the influence of androgens. J Comp Physiol A. 161, 417-430.

Müller, K. F. (2005). The efficiency of different search strategies in estimating parsimony jackknife, bootstrap, and Bremer support. BMC Evol Biol. 5(58).

Noda, M., Shimizu, S., Tanabe, T., Takai, T., Kayano, T., Ikeda, T., Takahashi, H., Nakayama, H., Kanaoka, Y., Minamino, N., Kangawa, K., Matsuo, H., Raftery, M. A., Hirose, T., Inayama, S., Hayashida, H., Miyata, T. and Numa, S. (1984). Primary structure of Electrophorus electricus sodium channel deduced from cDNA sequence. Nature. 312, 121-127.

Novak, A. E., Jost, M. C., Lu, Y., Taylor, A. D., Zakon, H. H. and Ribera, A. B. (2006). Gene duplications and evolution of vertebrate voltage-gated sodium channels. J Mol Evol. 63, 208-221.

Nylander, J. A. A. (2004). MrModeltest. Technical report. Evolutionary Biology Centre, Uppsala University, Uppsala.

Palumbi, S., Martin, A., Romano, S., McMillan, W. O., Stice, L. and Grabowski, G. (1991). The simple fool’s guide to PCR, version 2.0. Honolulu: Department of Zoology and Kewalo Marine Laboratory, University of Hawaii.

Saitoh, K., Miya, M., Inoue, J., Ishiguro, N. B. and Nishida, M. (2003). Mitochondrial genomics of Ostariophysan fishes: perspectives on phylogeny and biogeography. J Mol Evol. 56, 464-472.

Schwartz, J. H. (2007). Do molecular clocks run at all? A critique of molecular systematics. Biol Theory. 1(4), 357-371.

Stoddard, P. K. (2002). Electric signals: predation, sex, and environmental constraints. Advances in the Study of Behaviour. 31, 201-242.

Sullivan, J. P., Lundberg, J. G. and Hardman, M. (2006). A phylogenetic analysis of the major groups of catfishes (Teleostei: Siluriformes) using rag1 and rag2 nuclear gene sequences. Mol Phylogenet Evol. 41(3), 636-662.

Swofford, D. L. (2002). PAUP* 4:40: Phylogenetic analysis using parsimony *and other methods. Sinauer Associates, Sunderland, MA.

Widmark, J., Sundström, G., Daza, D. O. and Larhammar, D. (2011). Differential evolution of voltage-gated sodium channels in tetrapods and teleost fishes. Mol Biol Evol. 28(1), 859-871.

Wiens, J. J. (1998). Does adding characters with missing data increase or decrease phylogenetic accuracy? Syst Biol. 47(4), 625-640.

Zakon, H. H., Lu, Y., Zwickl, D. J. and Hillis, D. M. (2006). Sodium channel genes and the evolution of diversity in communication signals of electric fishes: convergent molecular evolution. Proc Natl Acad Sci. 103, 3675-3680.
