<<

University of New England DUNE: DigitalUNE

All Theses And Dissertations Theses and Dissertations

11-1-2014 Species Identification And Phylogeny Of Phycinae Hakes And Related Gadoid Laura Ann Whitefleet-Smith University of New England

Follow this and additional works at: http://dune.une.edu/theses Part of the Aquaculture and Fisheries Commons, and the Marine Biology Commons

© 2014 Laura Whitefleet-Smith

Preferred Citation Whitefleet-Smith, Laura Ann, "Species Identification And Phylogeny Of Phycinae Hakes And Related Gadoid Fishes" (2014). All Theses And Dissertations. 19. http://dune.une.edu/theses/19

This Thesis is brought to you for free and open access by the Theses and Dissertations at DUNE: DigitalUNE. It has been accepted for inclusion in All Theses And Dissertations by an authorized administrator of DUNE: DigitalUNE. For more information, please contact [email protected]. SPECIES IDENTIFICATION AND PHYLOGENY OF PHYCINAE HAKES AND RELATED GADOID

FISHES

BY

Laura Ann Whitefleet-Smith B.S. Coastal Carolina University, 2012

THESIS

Submitted to the University of New England in Partial Fulfillment of the Requirements for the Degree of

Master of Science

in

Marine Sciences

November, 2014

This thesis has been examined and approved.

Thesis Director, Dr. Anna Bass, Research Associate of Biology

-Dr. Charles Tijÿtfrg Associate pdan, College of Arts and Sciences

Dr. Steven Travis Associate Professor of Biology

/? ftey 2.Q/"/ Date ACKNOWLEDGEMENT

I would like to thank the many people and organizations who have helped me throughout my graduate research at UNE, without whom, this thesis would not have been possible. First and foremost, I thank Dr. Anna Bass for turning me into a molecular ecologist and for her invaluable guidance, support and encouragement. Special thanks are extended to my committee members Dr. Charles Tilburg and Dr. Steven Travis for their edits, advice and support throughout this project. I sincerely wish to thank the teachers, fellows and administrators of the SPartACUS (NSF GK-12) community for everything they taught me over the last two years and for their friendship and support. I could not have made it this far without the love of my family and friends and I cannot thank you enough. Thank you to Marian Reagan, Tim Arienti and Shaun Gill for all of your help. I thank the following people for their assistance in securing biological samples utilized during this study: Dr. Joseph Kunkel, Laura Sirak, Ryan Knotek, Connor

Capizzano, and Leah Bymers from UNE, Pearse Webster and Jordan Neely from the

South Carolina Department of Natural Resources, Jakub Kircun and the Ecosystems

Surveys Branch of the NOAA Fisheries Northeast Fisheries Science Center and the officers and crew of the NOAA research vessel Henry B. Bigelow. Last but not least, I would like to thank Maine Sea Grant and the University of New England Graduate

Program for funding my research and the SPartACUS NSF DGE GK-12 program (grant

#8041361) for funding me.

iii

TABLE OF CONTENTS

TITLE PAGE ……………………………………………………………………………………………………………………. i THESIS APPROVAL ………………………………………………………………………………………………………… ii ACKNOWLEDGEMENTS ……………………………………………………………………………………………….. iii TABLE OF CONTENTS …………………………………………………………………………………………………… iv LIST OF TABLES …………………………………………………………………………………………….……………… vi LIST OF FIGURES …………………………………………………………………………………………………………. vii ABSTRACT ………………………………………………………………………………………………………………….. viii

CHAPTER PAGE

GENERAL INTRODUCTION ……………………………………………………………………………………………. 1 I. CHAPTER 1: SPECIES IDENTIFICATION OF GADIOD FISHES OF THE GULF OF MAINE USING RESTRICTION FRAGMENT LENGTH POLYMORPHISM …………………………………………. 5

Abstract ……………………………………………………………………………………………………………. 5

1. Introduction ……………………………………………………………………………………………….. 6 2. Materials and Methods ………………………………………………………………………………. 9 2.1 Sample Collection …………………………………………………………………………………. 9 2.2 DNA Isolation, PCR-RFLP and sequencing ………………………….….…..………….. 9

3. Results and Discussion ..……………………………………………………………………………. 12 3.1 Predicted RFLP Patterns …………………………………………………………….………… 12 3.2 Identification of Market Samples ………………………………………………….…….. 13 3.3 Unexpected RFLP patterns …………………………………………………………………… 15 4. Conclusion …………………………………………………………………………………………….….. 17

iv

II. CHAPTER 2: PHYLOGENY AND TRAIT EVOLUTION OF MORPHOLOGICAL AND LIFE HISTORY CHARACTERISTICS OF THE GADOIDEI ……………………………………………….. 24 Abstract …………………………………………………………………………………………………………. 24

1. Introduction ……………………………………………………………………………………………... 25 2. Materials and Methods …………………………………………………………………………….. 28 2.1 Samples, DNA extraction, PCR and sequence alignment .……………………. 28 2.2 Tests for substitution saturation ……………………………………………………….… 30 2.3 Phylogenetic Analysis …………………………………………………………………………. 31 2.4 Character Mapping Analysis …………………………………………………………….…. 34 2.5 Substitution Pattern Homogeneity Analysis ………………………………………... 35

3. Results …………………….…………………………………………………………………………….…. 37 3.1 Sequencing Results ……..……………………………………………………………………… 37 3.2 Substitution Saturation ……………..……………………………………………………….. 37 3.3 Phylogenetic Analysis ……………………………………………………………………….... 39 3.4 Character Mapping ……………………………………………………………………………… 40 3.5 Substitution Pattern Homogeneity Analysis ……………………………….…….…. 41 4. Discussion …………………………………………………………………………………………………. 42 4.1 Phylogeny ………………………………………………………………………………………..…. 42 4.2 Character Mapping ………………………………………………………………………..…… 44 4.3 Conclusions and Future Directions ……………………………………………………... 46 REFERENCES ………….…………………………………………………………………………….………………..…… 62 APPENDIX I: GenBank sequences used in Chapter 1 …………………...………………………….…. 71

APPENDIX II: Character Mapping of Physiological Tolerances ……………………………………. 74

v

LIST OF TABLES

TABLE PAGE

1.1 Summary of reference samples …………………………………………………………………..…. 19

1.2 Summary of market samples ……………………………………………………………………….…. 20

1.3 Primers used in RaGIA assay ………………………………………………………………………..…. 21

1.4 Expected fragment lengths for RaGIA assay ………………………………………………….… 22

2.1 Molecular sequence data ……………………………………………………………………………….. 48

2.2 Primers used for PCR amplification ………………………………………………………………... 49

2.3 Substitution models selected for ML and NJ analysis ………………………………….….. 50

2.4 Traits used for character mapping ………………………………………………………………….. 51

2.5 Substitution saturation testing results ………………………………………………………….… 52

2.6 Cytochrome b substitution homogeneity matrix …………………………………………….. 53

2.7 RAG1 substitution homogeneity matrix ……………………………………………………...... 54

vi

LIST OF FIGURES

FIGURE PAGE

1.1 Expected fragment patterns ……………………………………………………………………….... 23

2.1 Substitution saturation scatterplots ……………………………………………………………... 55

2.2 Consensus tree of the concatenated data ……………………………………………………… 56

2.3 Consensus tree using RAG1 sequence data ……………………………………………………. 57

2.4 Consensus tree using cyt b sequence data ……………………………………………………… 58

2.5 Meristics mapped onto the concatenated consensus tree …………………….…….…. 59

2.6 Egg oil globule presence mapped onto the concatenated consensus tree …..…. 60

2.7 Geographic distribution of species ……………………….………………………………………… 61

vii

ABSTRACT

SPECIES IDENTIFICATION, PHYLOGENETIC ANALYSIS AND CHARACTER TRAIT EVOLUTION OF PHYCINAE HAKES AND RELATED GADOIDS

By

Laura Whitefleet-Smith

University of New England, November, 2014

The term hake refers to a number of species belonging to multiple families of in the suborder Gadoidei and includes two main groups: Phycinae hakes (family ) and spp. hakes (family ). The use of the common name hake for this diverse group of fish prompts questions such as: how are these species related and how can they be differentiated? Chapter one details the development of the Rapid

Gadoid Identification Assay (RaGIA) for molecular identification of 11 gadoid fishes

(including six hakes) using Polymerase Chain Reaction Restriction Fragment Length

Polymorphism (PCR-RFLP). RaGIA was used for species identification of fillets of hake, pollock and haddock sold in southern Maine markets. Testing found that market labelling was accurate; however, there were inconsistences in the labels provided by the fish distributors (from whom the markets obtained their fillets). Chapter two explores the development of a phylogeny, based on a mitochondrial DNA gene and a nuclear encoded gene, which includes members of the families Gadidae and Merlucciidae. The resulting phylogeny was used for morphological character mapping and investigation of trait evolution in this group. Consistent with previous studies, the analysis resolved the families Gadidae, as well as several subfamilies, and Merlucciidae with strong support.

viii

The putative Lotinae subfamily clade was not resolved in this analysis and suggests that further study is needed to investigate the monophyly of this group. The three dorsal fins and two anal fins morphological states as well as the life history characteristic of the absence of an egg oil globule were all found to be characteristic of the Gadinae, the most derived clade of the Gadoidei.

ix

GENERAL INTRODUCTION

Molecular techniques play an important role in fish ecology research and fisheries management. Molecular methods are useful at multiple organizational levels including communities, populations and individuals. These methods are also applicable across multiple taxonomic levels and can be used to explore relationships between families, genera, species and populations within a single species.

Population ecology and genetics are essential in fisheries management (Hare et al. 2011). Population genetics studies have the ability to estimate population size (N), effective population size (Ne), and the relationship between the two. Genetic techniques to estimate Ne have been used to assess the status of protected and endangered species, stock assessment of commercial species, connectivity between populations, management of protected areas and identification of “at risk” areas and/or populations

(Hare et al. 2011). Traditional means of population estimates include mark-and- recapture studies, however, it is often impractical to catch (and recapture) the number of individuals needed to compute an accurate estimate (Luikart et al. 2010). Genetic approaches are much more practical, less invasive and less expensive than mark-and- recapture for protected species or those that are difficult to safely capture.

Molecular analyses can be used as an additional tool for determining the most appropriate scale for management. Wilson et al. (2013) highlights a common problem in fisheries management: overlooking the scale at which biological processes are occurring. Currently, many fisheries (including the Gulf of Maine) are managed as one large-scale unit, which is problematic as most systems are comprised of smaller subunits

1 within a complex and dynamic ecosystem. For most species in the Gulf of Maine, the larger unit managed is actually a metapopulation comprised of a group of smaller local populations (Wilson et al. 2013; Ames and Lichter 2013). Wilson et al. (2013) suggests that current fisheries management strategies unintentionally facilitate the systematic extirpation of local populations. It is difficult and time consuming to recover from this type of loss. Population genetics can help us understand finer scale interactions as well as define these local populations in order to improve management.

The incorporation of phylogenetic information is important in community ecology. Having a well-supported phylogeny is critical for testing evolutionary hypotheses of species distribution and community structure. An ecological community is made of an assemblage of species that all play functional roles in their ecosystem.

Adding or removing species from the assemblage has the potential to change the functional diversity of a community (Halpern and Floeter, 2008). One ecological question of interest is whether species in a community have assembled in a random or non-random manner. If a community has assembled non-randomly, the relative influences of phylogeny and environmental adaptation are of interest. While many studies focus on the influence of phylogeny and adaptive traits independently, Cadotte et al. (2013) argue that using a combination of trait based distances and phylogenetic distances between species gives greater power for understanding how communities form and what drives species to assemble into a community. Additionally, phylogeographic data can be useful in describing the processes of diversification,

2 dispersal and evolution (Marsket et al. 2014). It is apparent through these examples that phylogenetics is an essential tool in the study of community ecology and evolution.

In addition to community population dynamics, molecular data are also useful for inferring phylogenetic relationships between species. These relationships can then be used as a context for interpreting trends in biological traits. Producing reliable phylogenies is crucial in the study of trait evolution, including life history characteristics, physiological mechanisms and morphological features. For instance, phylogenetic analyses have been used to study the evolution of phototransduction (an essential physiological mechanism in vision) (Plachetzki et al. 2010).

Molecular techniques are extremely useful for species identification purposes.

Molecular techniques can be informative in diet analysis, allowing for higher resolution in prey identification (compared to traditional methods) when analyzing both stomach contents (Paquin et al. 2014) and fecal samples (Bowser et al. 2013). Additionally, a number of DNA-dependent methods have been developed for seafood authentication and checking for mislabeling (Bossier 1999; Teletchea et al. 2009; Griffiths et al. 2014).

These methods may be used to verify and manage regulated species. For example,

Japan imports and regulates Gadoid fishes belonging to the genera Gadus (some species formerly classified as Theragra) and Merluccius; therefore there is a need for rapid and accurate species identification (Akasaki et al. 2006). Akasaki et al. (2006) addressed this need by using Polymerase Chain Reaction Restriction Fragment Length Polymorphism

(PCR-RFLP) as a method of quick identification of fish products imported to Japan.

3

The families Gadidae and Merlucciidae (suborder Gadoidei) include a number of commercially important species such as the , haddocks, pollocks, and hakes. I have used molecular techniques for species identification and phylogenetic analysis of the

Gadidae and Merlucciidae. The first chapter focuses on the development of a molecular assay for species identification of 11 Gadoids; this assay was then used to test for mislabeling of fish fillets in southern Maine Markets. The second chapter focuses on phylogenetic reconstruction of the Gadidae and Merlucciidae and investigates evolutionary trends in morphological and life history traits among these fish.

4

CHAPTER 1

SPECIES IDENTIFICATION OF GADOID FISHES OF THE GULF OF MAINE USING

RESTRICTION FRAGMENT LENGTH POLYMORPHISM

Abstract

Among the gadoid fishes (e.g. cods, haddock, pollocks, hakes) the hakes in particular have a high potential for mislabeling in a market setting as this common name describes a number of different species belonging to multiple families of fish. In this study, a Rapid Gadoid Identification Assay (RaGIA) was developed to differentiate six species of hake and five additional Gulf of Maine gadoids. The assay used Polymerase

Chain Reaction Restriction Fragment Length Polymorphism (PCR-RFLP) and consisted of a single-tube double-digestion of a 978 bp fragment (39 bp tRNA-Glu: 939bp cytochrome b) using the enzymes ApoI and TaqαI. Development and validation of RaGIA were conducted using reference tissue samples (n=118). RaGIA was used to assess the validity and specificity of labelling for hake, pollock and haddock fillets in southern Maine markets. Assay accuracy (percentage of samples matching the predicted pattern) was

100% for all species with the exception of Merluccius bilinearis (94.7%),

5 regia (92.3%), and Gadus morhua (77.8%). Market hake were identified as 93.5%

Urophycis tenuis and 6.5% Urophycis chuss. All market pollock were identified as

Pollachius virens and all haddock were Melanogrammus aeglefinus. All fillets tested were sold by their respective markets under acceptable names. However, distributors’ labels provided the incorrect scientific name for 24 hake (U. tenuis mislabeled as M. bilinearis) and five pollock (P. virens mislabeled as Pollachius pollachius) samples.

1. Introduction

Mislabeling of seafood products is a growing issue worldwide (Di Pinto et al.

2013). Authenticating the species of commercially sold fish is important for consumer health and prevention of fraudulent substitution and sale of fish of lesser value.

Traditionally, fish are identified by morphological features, but fish fillets and processed seafood products cannot be reliably identified by these means. Protein analysis is one alternative identification technique; however, proteins can be denatured during the treatment of seafood products. Consequently, protein analysis is unreliable for heat- treated (e.g. canned or smoked, boiled) and processed products (Bossier 1999). DNA- dependent techniques that are robust to the effects of degradation are often preferred.

Molecular techniques that can identify market substitutions can also be used to highlight areas in fisheries management that require closer attention (Di Pinto et al.

2013). Verification of the species harvested is important for accurate population size estimates. Garcia-Vazquez et al. (2012) found that in Spanish markets 12% of products labeled either Merluccius bilinearis or “North American hakes” were actually Merluccius albidus and 73% of products labeled Merluccius capensis were actually Merluccius

6 paradoxus. The likelihood that landings for M. albidus and M. paradoxus were underestimated is a concern. If the incidence of misidentification observed was a reflection of misrepresentation in all catches, Garcia-Vazquez et al. (2012) estimated that reported landings could have been off by thousands of tons of fish. In another example, Di Pinto et al. (2013) analyzed “salted cod fillets” and “battered cod chunks” with DNA barcoding, finding 31% of the fillets and 100% of the chunks to be mislabeled.

With the high prevalence of mislabeling in the marketplace, it is necessary to develop techniques that can test the accuracy of fish labeling and provide a means of quality assurance.

Gadoid fishes are a commercially and ecologically important group worldwide that includes Atlantic and Pacific cod, haddock, pollock, and hake. The hakes could be particularly susceptible to mislabeling as the term hake is used to describe a number of different gadiform fishes belonging to the genera Merluccius, Urophycis and .

Phycinae hakes (genera Urophycis and Phycis) belong to the family Gadidae along with the cods and pollocks while Merluccius hakes belong to the family Merlucciidae

(Teletchea et al. 2006; Roa-Varón and Ortí 2009). The use of the name hake may cause these species to be grouped together in a commercial setting under the assumption that all hakes are related. In fact, reported landings of hakes in the United States (Urophycis tenuis and U. chuss; Merluccius bilinearis and M. albidus) have historically been confounded by mixed hake species catches and individual species landings for mixed catches have been determined using model estimates (NEFSC 2011; 2013). These factors may increase the possibility of mislabeling or ambiguous labeling of hake fillets.

7

Polymerase Chain Reaction—Restriction Fragment Length Polymorphism (PCR-

RFLP) has been used to successfully identify a number of gadoid fishes as well as to examine phylogenetic relationships between different species (Akasaki et al. 2006;

Aranishi et al. 2005; Calo-Mata et al. 2003; Dooley et al. 2005; Quinteiro et al. 2001;

Wolf et al. 2000). While PCR-RFLP was the most commonly used method for fish species identification from 1997 to 2007 (Teletchea, 2009), a number of different molecular techniques can be used for species identification purposes. Real time PCR has been used as a simple and efficient technique for determining the presence/absence of European hake, (Sanchez et al. 2009). However, real time PCR is not an efficient means of differentiating more than two species. DNA barcoding has become an increasingly popular technique in the field of fish forensics (McCusker et al. 2013; Di

Pinto et al. 2013; Handy et al. 2011), but is typically more expensive to execute than

PCR-RFLP. Ram et al. (1996) used both traditional sequencing alongside PCR-RFLP to determine the contents of canned tuna samples. While both techniques were reliable for differentiating species present in canned samples, PCR-RFLP was less expensive and therefore, the more practical of the approaches.

We developed an efficient, reliable and relatively inexpensive Rapid Gadoid

Identification Assay (RaGIA) using PCR-RFLP, to identify six hake species and five additional gadoid species using a 978 base pair (bp) fragment of mitochondrial DNA spanning two loci (39 bp of the 3’ end of the tRNA-Glu and 939 bp of the 5’ end of the cytochrome b). Eleven species were targeted because they satisfied at least one of the following criteria: 1) species is present in markets in Southern Maine, USA 2) species is

8 likely to be ambiguously identified or 3) species is a non-commercial gadoid likely to be caught with and potentially substituted in place of a commercial species. This technique was then used to survey markets southern Maine, USA for mislabeling of hake, pollock and haddock fillets.

2. Materials and Methods

2.1 Sample Collection

Reference tissue samples (n=118) were taken from whole fish that were collected and identified by researchers in the northwest Atlantic Ocean (Table 1.1).

Samples consisted of an approximately 1cm2 fin clip preserved in either 95% ethanol or a salt-saturated dimethyl sulfoxide (DMSO) buffer (Seutin et al., 1991). Market samples

(n=178) were obtained from grocery stores and fish markets in the greater Portland,

Maine, area (Table 1.2). Market fillet samples (n= 152) consisted of approximately 1cm2 pieces of muscle tissue taken from individual fillets of hake, pollock or haddock and were preserved in either 95% ethanol or frozen at -20°C. Fillet samples were collected in two ways 1) individually purchased (n=13) from 7 different stores and 2) direct collaboration with the seafood departments of two supermarkets in Biddeford, ME

(n=139). For the latter, the distributors’ product labels were available, for the former the distributor (and their labeling accuracy) is unknown. Market hake carcasses (n=26) or “rack” (the remains left following filleting fish) were also provided for analysis by a fish market in Portland, Maine, USA and were preserved in modified saline ethanol

(MSE) (Miller and Scholin, 2000).

2.2 DNA Isolation, PCR-RFLP and sequencing

9

A modified version of the “Rapid Isolation of Mammalian DNA” protocol from

Sambrook and Russell (2000) was used for DNA isolation of all samples. Specifically, a small piece of tissue (approximately 0.125cm2) was mechanically minced with scissors, placed in 600 µL fish extraction buffer (1mM Tris-HCl, 1mM EDTA, 0.5% (v/v) SDS and 20

μg/mL RNase A) with 3μL proteinase k (20 mg/ml) and digested overnight at 55°C.

Following this, 200 μL potassium acetate (3M potassium acetate, 5M glacial acetic acid) was added and the sample was placed on ice for 5 minutes followed by centrifugation.

The supernatant was decanted into a clean tube and DNA was precipitated with 600 μL cold isopropanol, centrifuged, and the supernatant discarded. DNA was then washed with 600 μL 70% ethanol, centrifuged, and the supernatant pipetted off. Samples were left to air dry for 3 hours, re-suspended in 100 μL 1X TE buffer and stored at -20°C.

Initially, primers from Aranishi et al. (2005) and Teletchea et al. (2006) were used to amplify a 950 bp fragment of the cytochrome b (cyt b) gene for targeted species that did not have GenBank representatives, Urophycis chuss and (Table 1.3).

Amplicons were cleaned using an ExoSAP protocol based on Bell (2008). Specifically, 3.9

μL dH2O, 0.3 μL Shrimp Alkaline Phosphatase (SAP) (1 U/ μL) (USB, Cleveland, OH) and

1.8 μL Exonuclease I (20,000 U/ml) (New England Biolabs Ipswich, MA, USA) were added to 14 μL of PCR product. The amplicon mixture was then subjected to the following temperature regimen: incubation at 37°C for 20 minutes and denaturation at 85°C for

15 minutes. Cleaned PCR products were shipped to Macrogen, Inc. (Rockville, MD, USA) for sequencing. Both forward and reverse sequences were produced for each individual.

Sequences for all other species of interest were downloaded from GenBank (Appendix).

10

Additionally, a representative reference sample for each species was sequenced

(Macrogen, Inc., Rockville, MD, USA) to ensure consistency and validity of sequences obtained from GenBank.

Sequencher (v.5, Gene Codes Corporation, Ann Arbor, MI, USA) was used to analyze all sequences (both newly generated and GenBank sequences), to generate expected fragment patterns for each species, and to design new primers. Three primers

(one forward primer: L14322, and two reverse primers: cyb939PHR and cyb939GAR) were used to amplify a 978 bp fragment of mitochondrial DNA spanning two loci (39 bp tRNA-Glu: 939 bp cytochrome b). It was necessary to design two reverse primers, each of which targeted a subgroup of the species of interest (cyb939PHR and cyb939GAR: see

Table 1.3) in order to amplify across the diverse group of target species.

A thermal cycler was used to conduct PCRs as follows: initial denaturation at

95°C for 1 minute, followed by 35 cycles of 95°C for 30 seconds, 60°C for 30 seconds,

72°C for 1 minute and a final extension of 72°C for 5 minutes. Reactions consisted of 5µL

5X buffer, 0.5 µL dNTPs (2.5mM each), 3 µL MgCl2 (25mM), 1 µL of each primer (L14322, cyb939PHR and cyb939GAR, 10µM), 0.125 µL BSA (20mg/ml), 0.125 µL GoTaq G2 Flexi

Taq Polymerase (5 Units/µL) (Promega, Madison, WI, USA), 0.5 to 2 µL DNA template and nuclease free water to a total reaction volume of 25 µL. Products were visualized using electrophoresis on a 1% agarose gel with 4 µL (5mg/mL) ethidium bromide.

PCR products were diluted to a concentration range of approximately 3-8 ng/µL to achieve an optimal DNA-to-enzyme ratio of 1-2 Units of enzyme per ng/µL of product.

PCR products were digested with 7.5 Units each of the restriction enzymes ApoI and

11

TaqαI in a single tube, two-step double digestion as follows: 2.28 µL NEB Buffer 3.1 (New

England Biolabs Ipswich, MA, USA) and 0.075 µL (100,00U/ml) TaqαI restriction enzyme was added to each sample consisting of 20 µL of diluted PCR product, incubated at 65°C for 30 minutes, denatured at 80°C for 20 minutes, and then held at 50°C for 1 minute.

Following this, 0.3 µL NEB Buffer 3.1 and 0.75 µL (10,000U/ml) ApoI restriction enzyme were added to each tube and all tubes were re-spun. Samples were then incubated at

50°C for 30 minutes and denatured at 80°C for 20 minutes. Products from restriction enzyme digestion were visualized using electrophoresis on a 1.5% agarose gel with 4 µL

(5mg/mL) ethidium bromide.

The accuracy of RaGIA in correct identification was assessed for each species by analyzing the reference samples (Table 1.1). For each sample, the observed fragment pattern was compared to the expected pattern for that species. The result for each individual was denoted as an “expected” pattern if the two matched or “unexpected” if the two did not match. The repeatability of the assay was quantified by calculating the percentage of samples for each species that produced the expected fragment patterns.

3. Results and Discussion

3.1 Predicted RFLP Patterns

Two restriction enzymes, ApoI and TaqαI, used in a double-digestion were predicted to differentiate 11 gadoid species of interest. The expected fragment lengths for each species can be found in Table 1.4. Restriction mapping in Sequencher predicted a single unique fragment pattern for all species except Enchelyopus cimbrius and Brosme brosme: each of which produced two different patterns (denoted as Types 1 and 2)

12

(Table 1.4, Fig. 1.1). In both species the two pattern types were distinct from all other species. Only Type 2 of E. cimbrius was observed in our reference samples. However, both haplotypes of B. brosme were observed in our reference samples and occurred at a frequency of 25% Type 1 and 75% Type 2.

The expected restriction fragment patterns for each species were validated using reference samples that were previously identified using morphological characteristics.

The validity of this test (percentage of reference samples that produced the expected pattern) was 100% for Urophycis tenuis, U. chuss, Phycis chesteri, Merluccius albidus,

Enchelyopus cimbrius, Brosme brosme, Pollachius virens and Melanogrammus aeglefinus. Validity was 92.3% for U. regia, 94.7% for Merluccius bilinearis, and 77.8% for Gadus morhua.

Sequencing of representative reference samples produced the expected restriction sites for all species compared to the original predictions based on GenBank data. These results support our initial assumption that the sequences available through

GenBank would be representative of the individuals caught and/or sold in the Gulf of

Maine.

3.2 Identification of Market Samples

RaGIA was able to successfully identify 173 out of 178 market samples. DNA was not successfully amplified from two market samples (one putative pollock and one putative haddock); therefore, these samples were excluded from the subsequent analysis. With the exception of three rack samples, all remaining market samples were identified. The 26 hake rack samples consisted of 21 U. tenuis, two U. chuss, and three

13 samples that produced a fragment pattern not matching any of the predicted patterns.

The 69 hake fillet samples consisted of 65 U. tenuis and four U. chuss. Pooling across all stores and samples types (rack and fillets) the overall species composition of identifiable market hake was 93.5% U. tenuis and 6.5% U. chuss. All 52 market pollock were identified as Pollachius virens and all 29 haddock were Melanogrammus aeglefinus.

All fillets tested in this study were sold under acceptable market names, however, 24 samples of hake and five samples of pollock had inaccuracies in the distributor’s labeling. Distributor C2 provided a label that identified 24 market hake samples as both “hake fillets” and “Latin name: Merluccius bilinearis.” This label is problematic for two reasons: 1) According to the FDA 2014 Seafood List, the only acceptable market name for Merluccius bilinearis is whiting and 2) All 24 samples associated with this label were identified as Urophycis tenuis and not M. bilinearis.

Therefore this label is both misleading in identifying M. bilinearis as hake (instead of whiting) and inaccurate since the fillets were actually U. tenuis.

Distributor C3 labeled five pollock as Pollachius pollachius (European pollock); however, these samples were identified using RaGIA as P. virens. The predicted fragment pattern for P. pollachius is similar yet distinguishable from P. virens with fragments of 642, 241 and 95 bp. To confirm that these samples were P. virens, two of the samples in question were sequenced and fragment patterns were predicted using

Sequencher. Both sequences contained the expected restriction sites and did not contain any unexpected sites. As an additional check, these sequences were also identified using blastn (GenBank) and were found to have 99% and 100% matching

14 identity with published cytochrome b sequences for P. virens (Accession #s EU492146 and EU492302, respectively). While both P. pollachius and P. virens may acceptably be sold under the market name pollock, the mislabeling of the scientific name on the distributor’s label is a concern. Many consumers are interested in the source of their fish and may have a preference for locally caught (i.e. Gulf of Maine) fish. While P. virens is found locally in the Gulf of Maine, P. pollachius is found in the Eastern Atlantic and could not be considered locally caught.

3.3 Unexpected RFLP patterns

Four reference samples (representing less than 3.5% of the total reference samples) produced fragment patterns that did not match any of the predicted patterns: one U. regia (Ure001), one M. bilinearis (Mbi001), and two G. morhua (Gmo001 and

Gmo010). Each of these samples was sequenced to verify identity and determine the reason for the differing results. In the following discussion the position of restriction sites are reported relative to the first base on the 5’ end (in the tRNA-Glu) of the amplicon used in this study.

Sample Ure001 (Urophycis regia) produced a pattern with bands at approximately 800 bp and 500 bp in addition to the three expected bands of 451, 312 and 170 bp. The sequence produced by Ure001 did not have any unique mutations when compared to all other U. regia individuals that were sequenced during this study.

Additionally, restriction mapping of this sequence showed all expected restriction sites and no unexpected sites. There are a number of possible explanations for why the expected pattern was not produced including sample contamination, amplification of a

15 non-target sequence or poor quality of the DNA. Low-quality DNA is the most likely cause considering that this sample had weak amplification in general and the sequence produced was relatively low quality but did not show any indications of non-target amplification. The degradation likely resulted in incomplete cleavage of the amplicon.

Sample Mbi001 (Merluccius bilinearis) had an additional, unexpected restriction site at position 652. This mutation caused the expected 763 bp band to cleave into two smaller fragments and produced a banding pattern of: 437, 326, 121 and 94 bp.

However, this banding pattern is also unique from all other species and likely represents a second haplotype for Merluccius bilinearis.

Sample Gmo001 (Gadus morhua) had a mutation at position 690 resulting in the loss of the ApoI cutsite at position 689, and yielding a banding pattern with the following fragment lengths: 448, 265, 132, 83, 50 bp. This sample was collected from a farm-raised fish and when compared with several other G. morhua sequences three unique mutations (not shared with any of the other individuals) were detected.

Gmo010 had all expected cutsites at 132, 215, 450 and 689. We were not able to sequence through the last cutsite at 928, however, the pattern produced by Gmo010 would support the hypothesis that this cutsite is missing (resulting in a 289 bp fragment that would normally be split into a 239 and a 50 bp fragment).

Three putative hake rack samples produced the same unexpected fragment pattern and cyt b fragments from all three samples were sequenced. The edited sequences were run through blastn (GenBank) and all three samples had a 99% to 100% match in identity to cytochrome b sequences for Lophius americanus (monkfish)

16

(Accession #s HE608203, HE608205, HE608206). Our rack sequences were aligned with and compared to a published cyt b sequence for L. americanus from GenBank (Accession

# HE608212). Both the rack sequences and the L. americanus sequence downloaded from GenBank produced the same predicted restriction fragment pattern with a restriction site at 652 bp (expected fragment sizes of 652 and 326 bp) that matched the

RaGIA pattern produced by the three rack samples. In total, this evidence suggests that these three samples were L. americanus. It is unlikely that whole fish of L. americanus could be mistaken for a hake species due to their vastly different morphology. However, monkfish fillets are a white meat similar to many of the groundfish examined in this study and would be in the same size range as a hake. Therefore it would not be completely unreasonable for the fillets to be mislabeled.

4. Conclusion

We have developed a rapid, low-cost molecular assay for differentiating commercial gadoid fishes, particularly the hakes, pollock and haddock. Expected patterns were found in 79 out of 81 hake reference samples. There was over 90% accuracy for all hake species, suggesting that this is an effective method of identification for this group. RaGIA was also highly successful at identifying Pollachius virens and

Melanogrammus aeglefinus, which is evident from the analysis of both reference samples and market samples. Analysis of reference samples suggests that the RaGIA method would also be effective for identifying B. brosme and Enchelyopus cimbrius, however, an increased sample size and market sample testing of these species is required. The low accuracy at identifying G. morhua reference samples suggests that

17

RaGIA was detecting population level variation and the methods developed here should be applied with caution in G. morhua.

RaGIA may be applied in future studies to examine temporal changes in the species composition of hake in the marketplace. This type of analysis could be especially useful as a possible indicator of changes in the spatial distribution of hake species in the

Gulf of Maine over time. Four species of hake that occur in the Gulf of Maine have exhibited distributional changes over the past 45 years that may be related to increasing water temperatures (Nye et al. 2009). Urophycis chuss and M. bilinearis stocks have shown an apparent northward shift while U. tenuis stocks have shown range contraction and U. regia stocks have shown range expansion (Nye et al. 2009). While our study suggests that U. tenuis are the dominant species of hake appearing in Southern Maine markets, there may be seasonal, annual or long-term variation in this species composition. RaGIA has the ability to detect this kind of variation and may be useful as a tool to detect important ecological changes.

18

CHAPTER 1 TABLES

Table 1.1 Species summary of the 118 reference samples including sampling location and current taxonomic classification following Teletchea et al. (2006) and Roa-Varón and Ortí (2009). N is the number of samples per species.

Subfamily Species N Common Name Sampling Location Gulf of Maine (wild Gadinae Gadus morhua 9 Atlantic Cod caught) and Portsmouth, NH (farm-raised)

Gadinae Melanogrammus 6 Haddock Gulf of Maine aealefinus Gadinae Pollachius virens 9 Pollock Gulf of Maine Gaidropsarinae Enchelyopus cimbrius 5 Fourbeard Gulf of Maine Rockling Lotinae Brosme brosme 8 Cusk Gulf of Maine Phycinae Phycis chesteri 9 Longfin Hake Continental shelf off of DE Phycinae Urophycis chuss 20 Gulf of Maine and Georges Banks Phycinae Urophycis regia 13 Spotted Hake Gulf of Maine; Coast of DE to VA Phycinae Urophycis tenuis 10 Coast of NJ; Gulf of Maine Merlucciinae Merluccius albidus 10 Offshore hake Cape Hatteras to the coast of MD Merlucciinae Merluccius bilinearis 19 Silver hake/whiting Gulf of Maine and Georges Banks

19

Table 1.2 Identification information for all market samples by store and distributor. Store and distributor identities have been withheld by request. Asterisks (*) indicate samples purchased individually from the market and for which distributor information is unknown. Ɨ Symbol indicates a store that processes its own fish and does not have a distributor.

Number of Marketed As Samples Store ID Distributor ID Scientific Name Hake Rack 26 A N/A* Not available Hake 30 B B1 Not available Hake 8 C CI Not available Hake 24 C C2 Merluccius bilinearis Hake 7 A; C, D, E unknown* TOTAL HAKE 95 Haddock 8 C CI Not available Icelandic haddock 5 c C3 Melanogrammus oeglefinus Icelandic haddock 9 c C4 Melanogrommus aeglefinus Haddock 3 c C4 Melanogrammus aeglefinus Haddock 5 F, G, H unknown* TOTAL HADDOCK 30 Pollock 30 B B1 Not available Pollock 11 C C2 Pollachius Pollock 6 C C3 Pollachius pollachius Pollock 5 C C5 Not available Pollock 1 D unknown* TOTAL POLLOCK 53 OVERALL TOTAL 178

20

Table 1.3 Primers used for PCR amplification. Primer direction is either denoted by the last letter of the primer name (F= forward, R= reverse) or in parentheses following the primer name. Asterisk indicates weak amplification of some target species with this primer.

TR 571R

•XiGTSJG-

21

Table 1.4 Expected fragment lengths produced for each species using a virtual digestion in Sequencher with the restriction enzymes ApoI and TaqαI. Fragments less than 90 bp long are highlighted in gray since they are not easily visualized on a gel and cannot be seen in our reference sample pictures. Boxes surround fragments that are within 20 bp of the same length and typically appear as one band on a gel (see Fig. 1.1).

G. morhua M. aeglefinus P. virens E. cimbrius B. brosme Typel Type2 Typel Type2 Fragment 265 411 642 413 251 411 411 Lengths 239 395 276 198 198 265 265 209 132 60 119 162 112 215 132 40 106 119 103 47 83 91 106 47 40 50 51 91 40 51

P. chesteri U. chuss U. reaia_ U. tenuis_M. aIbidus M. bilinearis

Fragment 352 713 451 474 312 763 Lengths 289 170 312 239 289 121 164 50 170 170 162 94 122 45 45 50 121 51_ 45_ 94

22

t 1 1 W D UUU1

CHAPTER 1 FIGURES

M 1 2 3 4 5 6 M 7 8 9 10 11 12 M 1031 t bp t

\ 500 1 bp

mini 1 mini

100

bp tin

1 1 '

Figure 1.1 Expected fragment patterns from reference samples of each species. RaGIA I

products were run on a 1.5% agarose gel with ethidiumII bromide. Lanes 1 to 12 are as follows: M. albidus, M. bilinearis, U. tenuis, U. chuss, U. regia, P. chesteri, B. brosme

(Type 1), B. brosme (Type 2), E. cimbrius (Type 2), G. morhualit , P. virens, M. aeglefinus. M

r lanes are a low range DNA ladder (80 bp, every 100 bp from 100 to 900 bp, and 1031bp). I

I I

i I

i I

i i i

23

CHAPTER 2

PHYLOGENY AND TRAIT EVOLUTION OF MORPHOLOGICAL AND LIFE HISTORY

CHARACTERISTICS OF THE GADOIDEI

Abstract

Despite their global economic importance, the phylogeny of the Gadoid fishes

(cods, haddocks, hakes, whiting, etc.) remains uncertain and there is little known about trait evolution in this group. In this study a phylogeny was inferred with molecular sequence data from cytochrome b and RAG1 genes using Maximum Likelihood,

Neighbor Joining and Bayesian reconstruction methods. A consensus tree produced from multiple inference methods was used for character mapping of two meristics

(number of dorsal and anal fins) and two categorical traits (geographic distribution and presence of an egg oil globule). Phylogenetic analysis resolved the Gadidae and

Merlucciidae family clades as well as three subfamilies of Gadidae: Gadinae,

Gaidropsarinae and Phycinae; however, the Lotinae subfamily clade was not resolved.

24

Possession of three dorsal fins, two anal fins and the lack of an egg oil globule were characteristic of the most derived group (Gadinae).

1. Introduction

Gadiform fishes are among the most economically and ecologically important fish groups worldwide. The suborder Gadoidei in particular includes a number of commercially important species in the families Gadidae and Merlucciidae such as the cods, pollocks, hakes, haddocks, whiting and cusk. According to the FAO 2012 yearbook statistics, the group “cods, hakes and haddocks” are among the top five fisheries resources (in terms of metric tons of capture). Gadoids are a diverse group that primarily includes marine fishes as well as the freshwater , Lota lota. Gadoids are generally considered benthopelagic and are geographically widespread: inhabiting the coasts, continental shelves and slopes of the Arctic, Atlantic and Pacific Oceans as well as the Gulf of Mexico and the Mediterranean Sea. The Gadidae and Merlucciidae fishes are diverse in their life history characteristics, as well. As adults, species in these families range in maximum size from 30 to 200 cm. Eggs and larvae of these fishes vary in size, shape, buoyancy and adhesive properties; additionally, their eggs may or may not contain oil globules and pigments (Markle and Frost 1985; Fahay 1983). The general body shape of this group is relatively conserved with the most prominent morphological differences occurring in the number and shape of fins (Cohen et al. 1990). While all

Gadid and Merlucciid fishes have a distinct caudal fin, some of the other members of the Gadoids, such as the Trachyrincidae, have a tail continuous with their body. Due to their diversity in morphological, geographical, and life history traits, the Gadoid fishes

25 are of interest for examining trait evolution. This study will use a phylogenetic approach to investigate evolutionary relationships and trait evolution among the Gadoids.

A fully resolved and highly supported phylogeny of gadiform fishes remains elusive and a number of different phylogenetic hypotheses have been proposed over the last 70 years (Roa-Varón and Ortí 2009; Teletchea et al. 2006; Endo 2002). One of the greatest sources of discrepancy among the hypotheses of gadiform phylogeny lies in the hierarchical division of genera into families and subfamilies (Cohen 1990; Nelson

2006). The earliest effort to create a comprehensive phylogeny of the was completed by Svetovidov (1948), whose analysis was based on morphological features including bone structure and characteristics of vertical fins. Svetovidov (1948) identified

Gadidae as one family with three subfamilies: Gadinae, Lotinae and Merlucciinae. Based on otolith morphology, Nolf and Steurbout (1989) also recognized the family Gadidae with the subfamilies Gadinae, Lotinae and Merlucciinae, however, they added a new subfamily: Steindachneriinae.

With the exception of Nolf and Steurbout (1989), most publications after

Svetovidov (1948) place the genus Merluccius in its own family (Merlucciidae) within the suborder Gadoidei that also contains the family Gadidae (Markle 1982; Cohen et al.

1990; Howes 1989; 1991; Endo 2002; Teletchea et al. 2006; Roa-Varón and Ortí 2009).

While Merlucciidae is currently recognized by the Integrated Taxonomic Information

System (ITIS) (September 2014: http://www.itis.gov) as an independent family, the status (subfamily vs. family) of some gadid subdivisions remains uncertain. Markle

(1982) and Cohen et al. (1990) both recognized Gadidae as a family with three

26 subfamilies: Gadinae, Lotinae and Phycinae. However, Cohen (1984), Markle (1989), and

Howes (1989; 1991) all classified Gadidae, Lotidae, and Merlucciidae as separate families.

Recent studies that have used both traditional morphological analyses (Endo

2002; Teletchea et al. 2006) and molecular analyses (Teletchea et al. 2006; Roa-Varón and Ortí 2009) agree on the retention of the Merlucciidae and Gadidae with the latter containing four subfamilies: Gadinae, Lotinae, Gaidropsarinae and Phycinae. Roa-Varón and Ortí (2009) also include the families Ranicipitidae, Bregmacerotidae, Eulichthyidae,

Melanonidae, Moridae, and Trachyrincidae in the suborder Gadoidei and for the remainder of this discussion I will follow their classification scheme.

While numerous studies referenced above have investigated the systematics of the Gadiformes (and various clades therein), there is little information regarding trait evolution within the group. A goal in evolutionary biology is to compare traits among species and to infer a trait’s evolutionary history. Phylogenies play an important role in comparative analysis of traits among species by providing a context. Felsenstein (1985b) illustrated that a comparative method utilizing a phylogenetic hypothesis of species relationships can help to avoid the downfall of traditional statistics that treat all species as independent units: an assumption we know to be false. This multidisciplinary approach of incorporating phylogenetic information in comparative analysis has received much attention and is a central concept in the field of evolutionary physiology

(see Feder et al. 2000; Webb et al. 2002; Garland et al. 2005; Kraft et al. 2007).

27

A powerful method to examine trait evolution over time and to infer ancestral states is mapping characters onto a phylogeny generated independently of the traits of interest. Character mapping has been used with fish species to examine the evolution of a variety of important biological functions including: endothermy in tunas and billfishes

(Block et al. 1993), dietary trends in association with feeding biomechanics in Labrid fishes (Westneat 1995), and developmental modes of posterior lateral lines in ancestral and derived fish lineages (Pichon and Ghysen 2004). The diversity among Gadoid species provokes interest in the evolutionary history of their fin characteristics, geographic distribution and life history characteristics. The main objectives of this paper are to 1) infer a phylogenetic tree that includes the families Gadidae and Merlucciidae 2) examine character evolution in this group in a phylogenetic context.

2. Methods

2.1 Samples, DNA extraction, PCR, sequence alignment

Twenty-two gadoid species (representing 13 genera) were included in the analysis (Table 2.1). These species represent the families Merlucciidae and Gadidae and include representatives from all four subfamilies of Gadidae (Gadinae, Phycinae, Lotinae and Gaidropsarinae). Trachyrincus murrayi (family Trachyrincidae, formerly

Macrouridae) is a Gadoid that belongs to a separate clade from both the Gadidae and

Merlucciidae and was included as the outgroup. This analysis used one nuclear gene, recombination activating gene 1 (RAG1), and one mitochondrial gene, cytochrome b (cyt b). These genes were selected based on the availability of sequence data for the species of interest with the intention that the more conserved gene (RAG1) would help to

28 resolve deeper nodes (subfamily and family clades) of the phylogenetic tree while cyt b would help resolve inter-species relationships. Molecular sequence data was obtained primarily from GenBank records; however, this study produced novel sequences (that were not previously available) for Urophycis earllii (both cyt b and RAG1), Urophycis floridana (cyt b), and Brosme brosme (RAG1) (See Table 2.1).

For novel sequences, DNA was extracted from fin clip samples preserved in either 95% ethanol or a salt-saturated DMSO buffer (Seutin et al., 1991). DNA isolations were completed using a modified version of the “Rapid Isolation of Mammalian DNA” protocol from Sambrook and Russell (2000) (see Chapter 1 Methods). Cytochrome b fragments for Urophycis earllii and Urophycis floridana were amplified using the primers

L14332 (Teletchea 2006) and cytb939PHR following the Polymerase Chain Reaction

(PCR) protocol found in the Chapter 1 Methods. For Urophycis earllii, a 690 bp fragment of RAG1 was amplified using the primers RAG56F and RAG746R. For Brosme brosme, a

600 bp fragment of RAG1 was amplified using the primers RAG127F and RAG727R.

Primers used for PCR amplification are described in Table 2.2. PCR was conducted under the following cycling conditions: initial denaturation for 1 minute at 95°C, followed by

30 cycles of: 95°C for 30 seconds, 56°C or 61°C for 30 seconds (annealing temperatures for U. earllii and B. brosme, respectively), 72°C for 45 seconds, with a final extension at

72°C for 5 minutes. PCR products were visualized using ethidium bromide on a 1% agarose gel. PCR products were cleaned using a modified ExoSAP protocol from Bell

(2008) as described in the Chapter 1 Methods. Cleaned products were sent to Macrogen

29

Inc. (Rockville, MD, USA) for sequencing and both forward and reverse sequences were generated.

Raw sequences were initially assembled and edited using Sequencher (v.5, Gene

Codes Corporation, Ann Arbor, MI, USA). Both RAG1 and cyt b sequences (both newly generated sequences and those obtained from GenBank) were aligned and the reading frame was verified using the OPAL package (Wheeler and Kececioglu 2007) in Mesquite v. 2.75 (build 566) (Maddison and Maddison, 2011). In addition, alignments were checked manually for accuracy. There were no gaps in the Cyt b alignment and all regions with gaps in the RAG1 alignment were deleted for subsequent phylogenetic analyses.

2.2 Tests for Substitution Saturation

Because of the reported negative analytical effects of substitution saturation

(Philippe and Forterre 1999; Xia et al. 2003; Xia and Lemey, 2009), substitution saturation for both genes was initially assessed with a graphical analysis using scatterplots of the number of transitions and transversions per site against a corrected genetic distance in DAMBE5 (Xia, 2013) following Salemi (2009). Because the scatterplots display transitions and transversions separately, a K80-corrected genetic distance (Kimura, 1980) was used to allow for a transition-transversion bias. If little saturation has occurred, a linear trend would be observed; however, if saturation has been reached, the number of transitions and transversions is expected to plateau.

Saturation was also assessed in DAMBE using a statistical test by Xia et al. (2003) following Xia and Lemey (2009). Xia’s test calculates an index of substitution saturation

30

(Iss) which is compared to the critical index of substitution saturation (Iss.c). When Iss is significantly greater than Iss.c, substitution saturation is severe and these sequences are considered “useless” for phylogenetic analysis. Because the topology of a tree influences the calculation of the Iss.c value, this method tests two different topology assumptions: symmetrical and asymmetrical. A highly asymmetrical tree has a stepwise topology while a symmetrical topology is characterized by clades that are symmetrical about the root of the tree (see Xia and Lemey 2009). Although a highly asymmetrical tree is unlikely, this topology produces lower Iss.c estimates, indicating that it is more susceptible to the negative impacts of substitution saturation. Substitution saturation was evaluated for both genes using all positions and third positions only. In addition, I tested for saturation (using all positions) at the individual clade level (i.e., clades comprising Merluccius, Phycinae, Lotinae & Gaidropsarinae grouped, and Gadinae) using the same approach.

2.3 Phylogenetic Analysis

Phylogenetic trees were inferred separately for cyt b and RAG1 using three different methods: Maximum Likelihood (ML) (Felsenstein, 1981), Neighbor-Joining (NJ)

(Saitou and Nei, 1987), and Bayesian analysis (Yang and Rannala, 1997). Maximum

Likelihood and Neighbor-Joining analyses were conducted using MEGA6 (Tamura et al.

2013) and Bayesian analyses were conducted using MrBayes v3.2.2 (Huelsenbeck and

Ronquist, 2001; Ronquist and Huelsenbeck, 2003). The evolutionary models used in ML and NJ analyses were determined using a model selection test in MEGA6, which ranked models (from best to worst fit) based on the lowest Bayesian Information Criterion (BIC)

31 scores (Table 2.3). Due to limitations in the types of models that can be applied in a

Neighbor-Joining analysis, the model with the lowest BIC score that could be applied for

NJ analysis in MEGA was selected. Support for nodes in ML and NJ trees were tested using 1000 bootstrap pseudo-replicates (Felsenstein 1985a). Felsenstein (1985a) considered a bootstrap value ≥ 95 to be statistically significant evidence for the existence of a clade. However, Hillis and Bulls (1993) argued that bootstrap values ≥ 50 often underestimate the probability that a clade is real, and suggested that bootstrap values over 70 provide strong support for a clade. In the current analysis, a relatively conservative interpretation of bootstrap support values is applied; for trees inferred using Maximum Likelihood and Neighbor Joining methods, I considered bootstrap support to be weak if less than 70, moderate if between 70 and 89 and strong if greater than or equal to 90.

Bayesian analyses were run separately for each gene and the individual datasets

(cyt b and RAG1) were not partitioned: such that for each gene a single model of evolution was used for all sites. For both genes, a mixed model +Gamma +Invariant Sites was specified, such that: all sub-models of the General Time Reversible Model (GTR) were treated as equally likely, the variability in evolutionary rates across sites was assumed to follow a gamma distribution and the model allowed for a proportion of sites to be invariable. Because the Markov chain used in Bayesian analysis begins searching the tree space at a random point, it is thought that the trees sampled at the beginning of an analysis have a low likelihood and therefore a percentage of the sampled trees are typically discarded as burn-in (Ronquist et al. 2009). For cyt b, a codon analysis was run

32 for 5,000,000 generations with a burn-in of 50%. For RAG1, a nucleotide analysis was run for 10,000,000 generations with a burn-in of 25%. For all Bayesian analyses, no prior probabilities were set (default settings were used) to allow MrBayes to choose the

“best” model.

Bayesian analysis was also conducted on concatenated cyt b and RAG1 sequence data using MrBayes. Three different analyses were run with the concatenated data: one codon-based analysis (codon) and two nucleotide analyses: one with the third position of cyt b excluded (3rdExluded) and one with the third position partitioned separately

(3rdSep) such that third positions were allowed to have a separate model of evolution from first and second positions. For the codon tree, two partitions were used: cytochrome b and RAG1. For the 3rdExcluded tree, five partitions were used, one for each position of each gene included in the analysis. For the 3rdSep tree, four partitions were used: 1st and 2nd positions of cyt b, 3rd positions of cyt b, 1st and 2nd positions of

RAG1 and 3rd positions of RAG1. For all concatenated data analyses, partitions were allowed to evolve at different rates and the following variables were unlinked such that each partition’s parameters were estimated separately: state frequencies, substitution rates in the GTR model, gamma parameter shape and the proportion of invariant sites.

The two concatenated nucleotide analyses were run for 10,000,000 generations and the codon model was run for 5,300,000 generations. All three concatenated analyses used a burn-in of 50%. A high burn-in (50%) was used for cyt b and concatenated analyses because codon analysis and concatenated data are more computationally complex and time intensive than nucleotide analysis with a single gene. A higher burn-in for these

33 datasets was used to ensure that the data used for inferring the phylogenetic trees had reached stationarity following the methods of Douady et al. (2003). All Bayesian analyses were run with three heated chains and one cold chain and the chain was sampled every 1000 generations following Ronquist et al. (2009). Support for nodes of trees inferred using Bayesian analysis was assessed with posterior probabilities (PP). For

Bayesian analyses, I considered posterior probability support of nodes to be weak if less than 0.85, moderate if between 0.85 and 0.95 and strong if greater than or equal to

0.96. The PP values corresponding to moderate and strong support are more conservative than bootstrap support values because PP values have been shown to be consistently higher than bootstrap values (using the same dataset and evolutionary models) and often overestimate the probability that a clade is real (Erixon et al. 2003;

Simmons et al. 2004). In this analysis, clades for which the species level relationships cannot be determined, due to very weak support (bootstrap <50; PP < 0.50), are defined as soft polytomies. A soft polytomy refers to a clade for which there is insufficient data to resolve the nodes or cladogenic events that define the species level topology.

Three consensus trees were made from the analyses described above: one for cyt b data, one for RAG1 data and one for the concatenated data. For each gene individually, consensus trees were created using one tree from each inference method

(ML, NJ and Bayesian). This consensus tree generation was conducted in Mesquite

(Maddison and Maddison, 2011) and a majority rule condition was applied.

2.4 Character Mapping Analysis

34

A number of characters were mapped to the consensus of the concatenated trees (as this tree is assumed to be the most representative topology) (Table 2.4). The characters examined included two meristics (number of dorsal fins and number of anal fins) and two categorical characters (geographic distribution and egg oil globule presence). Character state data was collected from the literature (Collette and

MacPhee, 2002; Cohen et al. 1990; Markle 1982; Markle and Frost 1985; Fahay 1983;

Dunn and Matarese 1987) and character state values (or categories) can be found in

Table 2.4.

All character mapping was conducted in Mesquite (v. 2.75, build 566) using the

Trace Character History function and Parsimony Analysis (Maddison and Maddison,

2011). The default settings for the parsimony model associated with each character were used; meristic characters were assumed to be under a linear meristic model, categorical characters were assumed to be unordered (Maddison 1991).

2.5 Substitution Pattern Homogeneity Analysis

Gaut et al. (2011) identifies three types of variation in substitution rates: variation among sites, genes and lineages. Substitution pattern heterogeneity substantially impacts the inference of phylogenetic trees (Kolaczkowski and Thornton

2004). Variation in substitution rates across sites was addressed by adding a gamma parameter and specifying a proportion of invariant sites in the evolutionary models used for inference. Variation in rates among genes was addressed by initially inferring trees for each gene separately and partitioning the concatenated data in Bayesian analyses such that different models of evolution were applied to each gene.

35

The variation in substitution patterns across lineages is a more complicated issue to address. There are a number of factors that affect substitution rates including life history characteristics and physiological traits, many of which have complex effects; these factors may additionally interact with each other to produce new effects (Martin and Palumbi 1993; Lanfear et al. 2014). Two of the most important species-specific factors that affect substitution rates are generation time and metabolic rate (Gaut 1989;

Marin & Palumbi 1993). In theory, if the number of substitutions that occur per generation is effectively constant, then species with shorter generation times will evolve at faster rates (i.e. will have more substitutions per year). Additionally, higher metabolic rates produce more free oxygen radicals, causing oxidative damage to DNA and resulting in higher substitution rates (Martin and Palumbi 1993). There is much debate over which method(s) of phylogenetic inference are the most robust to violations of the assumption of substitution pattern homogeneity (Grievink et al. 2010; Lockhart et al.

2006; Kolaczkowski and Thornton 2004). The intent of this study is not to argue for or against the use of particular methods of phylogenetic analysis in the context of their ability to resolve an accurate tree when substitution pattern heterogeneity is present.

However, the extent of substitution pattern heterogeneity in the genes will be determined to account for possible sources of error in phylogenetic hypotheses.

Substitution pattern homogeneity was tested in MEGA6 using the Disparity Index

(Kumar and Gadagkar, 2001) and a Monte Carlo test with 1000 replicates. All codon positions were included in the analysis and any sites with less than 80% coverage were not included in analysis (i.e. if a nucleotide site was represented in only 79% or less of

36 the total number of species, then this site was excluded). This analysis tested sequences in pairs against the null hypothesis that the two sequences have the same pattern of evolutionary substitution. The results were used to examine similarities and differences in substitution patterns within and among lineages.

3. Results

3.1 Sequencing Results

For the majority of sequences obtained from GenBank, I was able to obtain the entire cytochrome b sequence (1140 bp) and a partial RAG1 sequence (786 bp). The exceptions for cytochrome b data are Merluccius albidus (928 bp fragment), and

Urophycis regia and Urophycis chuss (1114 bp fragments). A single sample each of

Urophycis earllii and U. floridana were sequenced, which produced 912 bp and 755 bp fragments of cytochrome b, respectively. A 636 bp fragment of RAG1 was generated for

U. earllii. I was able to sequence a 570 bp fragment of RAG1 from three samples of

Brosme brosme and only one haplotype was observed among these sequences.

3.2 Substitution Saturation

Scatterplots of transitions and transversions against K80 distance show linear trends for both cyt b and RAG1 when all codon positions are included (Figures 2.1A &

2.1B). However, when only third positions are analyzed, both transitions and transversions approach an asymptote in the RAG1 plot (Fig. 2.1D) and make a substantial curve ending in a plateau in the cyt b plot (Fig. 2.1C). This suggests that when all positions are considered, complete saturation has not occurred in either gene; however, third positions of RAG1 are approaching saturation and third positions of cyt b

37 are saturated. The results from substitution saturation statistical testing are summarized in Table 2.5. Testing was conducted for two different assumptions of topology: a symmetrical tree and a highly asymmetrical tree. In the cyt b alignment with all positions and all clades included, little saturation was indicated (regardless of the topology of the tree). When only third positions in all clades are considered, there is substantial saturation if the tree is symmetrical, and in a completely asymmetrical tree the sequences are sufficiently saturated such that these sites are not informative for phylogenetic analysis. Looking at each clade individually, the Phycinae clade shows significant saturation when all positions are included regardless of the topology. All other clades (Merluccius, Gadinae, and Gaidropsarinae and Lotinae combined) show little saturation regardless of topology. Due to significant evidence of substitution saturation at third positions in cyt b, only first and second codon positions were included in the Maximum Likelihood nucleotide analysis using cyt b sequence data.

For RAG1, when all positions and all clades are included, little saturation is indicated. When only third positions in all clades are considered, there is little saturation with a symmetrical tree; however, if the tree is highly asymmetrical there is significant saturation. Analysis of each clade individually suggests little saturation regardless of the tree topology. Both graphical and statistical testing of saturation indicated that substitution saturation at third positions in the RAG1 gene was not as extensive as in the cyt b gene (less of a curve in the scatterplot and saturation is only significant if the tree is highly asymmetrical). Therefore, all codon positions were included in analysis for

RAG1.

38

3.3 Phylogenetic Analysis

Of the consensus trees, the concatenated data (Fig. 2.2) provided the strongest supported tree (as indicated by bootstrap and posterior probability values) followed by the RAG1 gene tree (Fig. 2.3). The cyt b gene tree (Fig. 2.4) had the weakest overall support. All three consensus trees resolved the following clades with high support using

Maximum Likelihood and Bayesian analysis and moderate to high support in Neighbor

Joining analysis: Merlucciidae, Gadidae, Phycinae, Gaidropsarinae and Gadinae. This analysis did not resolve the Lotinae clade. Gadus morhua, Gadus macrocephalus and

Gadus chalcogramma formed a clade with high support; however, the sister taxa relationships at this level are uncertain. The concatenated data suggests that G. morhua and G. chalcogrammus are most closely related, the RAG1 data suggest that G. chalcogrammus and G. macrocephalus are most closely related and the cyt b data suggests that G. morhua and G. macrocephalus are most closely related and none of the three different hypotheses are well supported by all inference methods used.

Observed differences in tree topology across phylogenetic inference methods typically occurred in terminal nodes representing species. Merluccius merluccius appears most basal within the Merlucciidae clade from the concatenated (Fig. 2.2) and

RAG1 trees (Fig. 2.3), however, Merluccius bilinearis is basal in the cyt b tree (Fig. 2.4).

The different analyses are not congruent in reference to relationships between species of Phycinae. The concatenated data suggests that Phycis chesteri is the most basal species and sister taxon to the remaining Phycinae species (Fig. 2.2). RAG1 data alone were not able to resolve the species level relationships for Phycinae and formed a large

39 polytomy (Fig. 2.3). The cyt b tree contradicts the topology from the concatenated data and suggests that Urophycis earllii is the most basal species within the Phycinae.

Additionally, the genus Urophycis is not a monophyletic group in the cyt b analysis as

Phycis chesteri aligns with Urophycis tenuis and appears to represent the most recent divergence within the Phycinae (Fig. 2.4). Finally, in the cyt b tree, molva and

Brosme brosme are sister taxa, and Melanogrammus aeglefinus and Pollachius virens are sister taxa (Fig. 2.4), while in the RAG1 and concatenated trees they form a step- wise (asymmetrical) topology (Figures 2.2 and 2.3).

3.4 Character Mapping

Two meristics (anal and dorsal fins) and two categorical variables were mapped onto the concatenated consensus tree. For meristics: the two anal fin state is only found within the Gadinae clade, with all others displaying the one anal fin state (Fig. 2.5A). For dorsal fins, three fins are characteristic of the Gadinae clade, all others have two fins with the exception of the Gaidropsarinae clade and Brosme brosme which have just one

(Fig. 2.5B).

For categorical data, the lack of an egg oil globule appears to be characteristic of the Gadinae clade, as all other species for which data were available had at least one oil globule. Egg oil globule presence data were not available for proximus and this analysis was not able to infer which state would be more likely because both states

(present/absent) were equally parsimonious (split black and white color of the branch,

Fig. 2.6). For geographic distribution, multiple states (geographic locations) were allowed and therefore a number of species show branches with two colors (Fig. 2.7).

40

Trachyrincus murrayi, Enchelyopus cimbrius, Molva molva, Brosme brosme, Pollachius virens, Melanogrammus aeglefinus and Gadus morhua are all distributed in both the

Arctic and Atlantic. Merluccius merluccius can be found in both the Atlantic and the

Mediterranean and Urophycis floridana occupies both the Atlantic and the Gulf of

Mexico. Merluccius gayi, Microgadus proximus, Gadus macrocephalus and Gadus chalcogrammus are the Pacific fishes. All other species can be found in the Atlantic with the exception of Lota lota, a circumarctic species.

3.5 Substitution Pattern Homogeneity Analysis

While the Lotinae clade proposed by Roa-Varón and Ortí (2009) was not supported in the current analysis, the following results will use their classification scheme with the term Lotinae referring to Lota lota, Brosme brosme, and Molva molva as this is a traditionally recognized group of species. Substitution pattern homogeneity was tested for each species pair using both cyt b and RAG1 sequence alignments and resulting matrices can be found in Tables 2.6 and 2.7. For cyt b data, the substitution pattern for T. murrayi was found to be significantly different (P < 0.05) from all species in the Merluccius, Lotinae and Phycinae clades as well as Melanogrammus aeglefinus and ensis. Merluccius species were found to have a homogeneous substitution pattern that was significantly different from all other species with the exception of Urophycis regia. Neither the Gaidropsarinae nor the Phycinae clades had homogenous substitution patterns. The Lotinae species had a homogenous substitution pattern that was not significantly different from that of Gaidropsarus ensis. The Gadinae

41 had a homogeneous substitution pattern with the exception of M. aeglefinus which had a substitution pattern that was significantly different from all species except E. cimbrius.

The RAG1 sequence data exhibited a different trend from cyt b in substitution patterns across lineages. In this analysis, there appeared to be a homogenous substitution pattern within the Gadinae clade, Brosme brosme and Lota lota that was significantly different from all remaining species (Table 2.7). The remaining species had a generally homogenous substitution pattern with only 17 species pairs significantly different from each other (Table 2.7).

4. Discussion

4.1 Phylogeny

Overall, the phylogeny inferred in this study agrees with previous findings from

Endo (2002), Teletchea et al. (2007) and Roa-Varón and Ortí (2009) in recovering multiple subfamilies within the Gadidae. While all three of these previous studies divide

Gadidae into four subfamilies (Phycinae, Gaidropsarinae, Lotinae and Gadinae) the relative support for each of these subfamilies and the topology of the Gadidae clade is still in question. For example, the support from molecular analyses for the monophyly of the Lotinae clade is poor; Roa-Varón and Ortí (2009) did not resolve this clade and although Teletchea et al. (2007) did resolve this clade, it was with weak support in the

Maximum Parsimony and Maximum Likelihood analyses (bootstrap values of 70 and 65, respectively) and moderate support in Bayesian analysis (0.90 PP). Roa-Varón and Ortí

(2009) suggested that adding in the species Brosme brosme to the analysis may help resolve this clade, however, as this study shows, even with this species included the

42

Lotinae clade is not recovered. A study on the phylogeographic history of Lota lota examined relationships between the genera in the Lotinae subfamily, but was unable to resolve relationships using cyt b (Van Houdt et al. 2003). However, our analysis shows strong support for the grouping of the Lotinae and Gadinae together, which is a clade that Endo (2002) refers to as “Group II” and is supported by Roa-Varón and Ortí (2009) and morphological (but not molecular) analysis by Teletchea et al. (2007).

Endo (2002) also resolved a “Group I” clade that included the subfamilies

Phycinae and Gaidropsarinae. While the “Group I” clade was supported in both molecular and morphological analysis of Teletchea et al. (2007), this clade was not recovered in the current study or in Roa-Varón and Ortí (2009). While strong support for the monophyly of the Phycinae clade was detected, the species relationships within the clade were not strongly supported; however, the overall topology detected in the concatenated consensus tree (Fig. 2.2) does agree with Roa-Varón and Ortí (2009).

Using genes other than cytochrome b and RAG1 may provide more phylogenetic information and result in well supported nodes among and within some of these subfamilies. While the cytochrome b sequence data resolved some of the nodes within the phylogeny, the loss of the third positions due to significant substitution saturation resulted in the loss of some power for resolving other nodes. Typically, third positions have higher rates of substitution than first and second positions, and therefore may contain important phylogenetic information that resolves lower level phylogenetic relationships, e.g. between species (Hilu et al. 2014). Unfortunately, this was not the case in my analysis. Two possibilities remain: 1) inclusion of more genes that are not

43 saturated at third positions could provide better resolution for the Phycinae and putative Lotinae clades or 2) seemingly simultaneous divergences have occurred and the order of these cladogenic events cannot be reconstructed.

Heterogeneity in substitution patterns among species was found in both cyt b and RAG1 and there were distinctly different patterns of substitution across Gadoid lineages between cyt b and RAG1. Because a consensus has not yet been reached on the best way to address the issue of substitution pattern homogeneity, this discussion will highlight possible influences of the observed heterogeneity on the observed results.

There were no apparent trends in substitution patterns that corresponded to trends observed in traits across the phylogeny. For example, the Phycinae clade was overall the most consistent group in character mapping, however, substitution pattern heterogeneity was detected among the Phycinae species in both cyt b and RAG1. A distinct change in substitution pattern occurs in RAG1 and splits the putative Lotinae clade: B. brosme and L. lota being homogeneous with the Gadinae but statistically different from the substitution pattern of M. molva. It is possible that this abrupt change is a factor resulting in the Lotinae clade not being resolved in molecular studies using RAG1 data (Roa-Varón and Ortí 2009 and this study).

4.2 Character Mapping

Inferring a phylogenetic tree is the first step in evolutionary analysis. While having confidence in our systematic classification schemes is important, it is often more interesting to use these trees to answer evolutionary and ecological questions. While unanswered questions about the systematics of the Gadoids remain, the concatenated

44 consensus tree produced in this study sufficiently reflects our current understanding of the relationships of these species and has enough resolution to begin investigating the evolution of traits within the Gadoids.

The absence of an oil globule in eggs is considered a trait characteristic of the

Gadinae clade (Svetovidov 1948; Markle 1982; Markle and Frost 1985; Fahay 1983), however, presence or absence of an egg oil globule for Microgadus proximus has not been reported. Therefore, while the analysis conducted in this study does not show support for either the presence or absence of an egg oil globule, the inclusion of M. proximus in the Gadinae clade would suggest that this species lacks an egg oil globule.

The ancestor of this group of fishes can be inferred to have two dorsal fins, one anal fin and at least one egg oil globule. The characteristics of having three dorsal fins, two anal fins and the absence of an egg oil globule are found only in the Gadinae clade and these states can be considered the more derived condition. While the number of anal fins and presence/absence of an egg oil globule both appear to have evolved once within the

Gadinae clade, the evolution of the number of dorsal fins appears to be more complicated. Svetovidov (1948) assumed that one dorsal fin and one anal fin was the most ancestral state and therefore proposed B. brosme as the most basal species among the gadoids. My analysis suggests that the ancestral state is two dorsal fins, and the one dorsal fin state evolved twice (once in the Gaidropsarinae and once for Brosme brosme in the Lotinae), and that three dorsal fins evolved with the common ancestor of the

Gadinae.

45

Depending on the taxa analyzed, the current patterns of geographic distribution may or may not be indicated by the phylogenetic relationships of the taxa included here. Campo et al. (2007) identifies two geographically distinct clades among the

Merluccius species: an American clade including the species distributed along the

Eastern Pacific and Western Atlantic Oceans and a Euro-African clade including all species distributed along the Eastern Atlantic. The American clade includes the species:

M. bilinearis, M. productus, M. angustimanus, M. gayi, M. hubbsi, M. australis M. albidus and the Euro-African clade includes the species: M. senegalensis, M. merluccuis,

M. capensis, M. paradoxus and M. polli. Although many of these species were not included in the current analysis, the relative relationships between the Merluccius species included in this study agree with the topology presented by Campo et al. (2007).

Among the Gadinae species, there are three Pacific fishes (Microgadus proximus, Gadus macrocephalus and G. chalcogrammus) and they do not group together, suggesting several independent invasions of the Pacific Ocean by the Gadinae: a finding corroborated by Carr et al. (1999). All three species invasions of the Pacific are hypothesized to have occurred via the Bering Strait approximately 3 million years before present (Carr et al. 1999; Møller et al. 2002).

4.3 Conclusions and Future Directions

While progress has been made in resolving the phylogenetic relationships of the

Gadoids, several questions concerning the Gadidae remain to be answered. For instance, is Lotinae truly a monophyletic clade? Additionally, while the monophyly of the Phycinae clade is well supported, the relationships among the species in this group

46 remain uncertain and a complete phylogeny including all recognized extant species has yet to be created. The inclusion of additional genes in phylogenetic analysis could be helpful in resolving some of these unanswered questions and will likely require the generation of novel sequence data for a number of these species. With our increasing understanding of Gadoid phylogeny, there is great potential for further investigation of trait evolution of life history, morphological and physiological characteristics in the

Gadoid fishes.

47

Chapter 2 Tables

Table 2.1 Molecular sequence data used in this study including GenBank accession numbers for cyt b and RAG1 data. Scientific names used in this study follow the nomenclature of the 2013 American Fisheries Society common and Scientific Names of Fishes.

Species Cyt b Accession # RAG1 Accession # Trachyrincus murrayi NC_008224 FJ215297 Merluccius bilinearis DQ174059 FJ215267 Merluccius gayi DQ174061 FJ215269 Merluccius albidus KM032254 AY308787 Merluccius merluccuis EF438547 JN230904 Brosme brosme EU492337 B.brosme_003_004_005 Lota lota DQ174052 JX190851 Molva molva EF427585 FJ215275 Microgadus proximus DQ174067 FJ215274 Melanogrammus aeglefinus EU492143 AJ566336 Gadus morhua DQ174045 FJ215242 Gadus macrocephalus DQ174044 FJ215241 Pollachius virens DQ174078 FJ215289 Gadus chalcogrammus AB078151 FJ215294 Enchelyopus cimbrius DQ174040 FJ215237 Gaidropsarus ensis DQ174048 FJ215244 Phycis chesteri DQ174074 FJ215287 Urophycis tenuis DQ174085 FJ215302

Urophycis regia KM032269, Ure005, 006, 009 FJ215301

KM032262, KM032263, KM0 Urophycis chuss 32264, KM032265 FJ215299 Urophycis earllii Uea001 Uea001 Urophycis floridana Ufl001 FJ215300

48

Table 2.2 Primers used for PCR amplification. The direction of the primer is indicated either in parentheses following the primer name or by the final letter in the primer name: forward= F, reverse = R.

Primer Name Gene Sequence Citation RAG56F RAG1 5'-TCAAAGAGTCCTGTGACG-3' This study RAG746R RAG1 5'-CCGATTTCATCCTGGAAG-3' This study RAG127F RAG1 5'-GCGGTGCGTTTCTCKTTCAC-3' This study RAG727R RAG1 5'-TCTTGTAGAACTCGGTGGCG-3' This study L14332 (F) cyt b 5'-TGAYITGAARAACCAYCGTTG-3' Teletchea et al. 2006 CYB939PHR cyt b 5'-TCGYTGYTTKGAGGTGTG-3' Chapter 1 of this study

49

Table 2.3 Substitution Models selected for Maximum Likelihood and Neighbor Joining analyses. Model abbreviations: HKY= Hasegawa-Kishino-Yano Model; JTT= Jones-Taylor-Thornton Model. Gamma Proportion of Substitution Shape Invariant Gene Analysis Type Model (citation) Parameter Sites Cyt b ML Nucleotide HKY (Hasegawa et al. 1985) 0.4454 59.69% Cyt b NJ Amino Acid JTT (Jones et al. 1992) 0.1873 N/A RAG1 ML Nucleotide Tamura 3-Parameter (Tamura, 1992) N/A 61.70% RAG1 NJ Nucleotide Tajima-Nei (Tajima and Nei, 1984) 0.5760 N/A

50

Table 2.4 Traits analyzed in this study and their associated states (categorical and meristic). *Multiple states are allowed for this character.

# of States or Characteristic Type Bins State Values (categorical/meristic) or State Range (continuous) Dorsal Fins Meristic 3 one; two; three fins Anal Fins Meristic 2 one; two fins Geographic Distribution Categorical 6* circumarctic; arctic; Atlantic; Mediterranean: Gulf of Mexico; Pacific Egg Oil Globule Categorical 2 present; absent

51

Table 2.5 Substitution saturation testing results. Results that show evidence of substantial saturation (Iss.c > Iss but the difference is not significant) are in bold and those that show significantly higher Iss.c compared to Iss values (indicating non-informative sequences as defined by Xia et al. 2003) are bolded and underlined. symmetrical tree assymetrical tree Gene Positions included Clade Iss Iss.c P-value Iss.c P-value RAG1 all positions all clades 0.4309 0.7538 0.0000 0.5064 0.0000

RAG1 3rd positions all clades 0.5051 0.6389 0.0000 0.4004 0.0005 RAG1 all positions Merluccius 0.0742 0.8115 0.0000 0.7803 0.0000

RAG1 all positions Phycinae 0.5477 0.7824 0.0000 0.7105 0.0000

RAG1 all positions Gaidropsarinae and Lotinae 0.4447 0.7921 0.0000 0.7399 0.0000

RAG1 all positions Gadinae 0.1873 0.7824 0.0000 0.7105 0.0000

Cyt b all positions all clades 0.4136 0.7734 0.0000 0.5431 0.0000

Cyt b 3rd positions all clades 0.6661 0.6854 0.3235 0.4356 0.0000

Cyt b all positions Merluccius 0.4132 0.8249 0.0000 0.7930 0.0000 Cyt b all positions Phycinae 0.8787 0.7996 0.0000 0.7294 0.0000 Cyt b all positions Gaidropsarinae and Lotinae 0.4961 0.7928 0.0000 0.7056 0.0000 Cyt b all positions Gadinae 0.3932 0.7996 0.0000 0.7294 0.0000

52

Table 2.6 Cytochrome b substitution pattern homogeneity matrix. Below the diagonal are the P-values testing the null hypothesis that a species pair evolved under the same pattern of substitution. Above the diagonal are the Disparity Index (DI) values for each pair. Significant P-values (< 0.05) are highlighted in yellow. Sequence pairs from phylogenetic clades are boxed: from top left to bottom right the boxes represent: Merluccius, Phycinae, Gaidropsarinae, Lotinae and Gadinae.

T.mur M.mer M.bil M.alb M.gay P.che U.ten U.ear U.chu U.flo U.reg E.cim G.ens M.mol B.bro L.lot M.pro P.vir M.aeg G.mac G.cha G.mor Trachyrincus murrayi .444 4.397 5.860 5.727 1.575 1.193 0.921 1.959 2.573 3.319 0.000 0.571 0.383 0.442 0.605 0.324 0.253 0.467 0.049 0.036 0.000 Merluccius merluccius 0 0.000 0.000 0.014 0.823 0.763 1.218 0.404 0.342 0.000 5.687 1.565 2.149 1.878 1.589 2.436 2.936 7.524 3.355 4.296 3.449 Merluccuis bilinearis 0.000 1.000 1 004 0.030 0.918 0.780 1.341 0.458 0.597 0.011 5.586 1.464 2.038 1.772 1.525 2.435 2.964 7.557 3.345 4.288 3.412 Merluccius albidus 0.000 1.000 0. 0.000 1.322 1.284 1.997 0.915 0.325 0.212 6.695 2.446 3.075 2.966 2.628 3.447 3.932 8.897 4.444 5.552 4.542 Merluccuis gayi 0.000 0.338 0.256 1.0001 1.641 1.435 1.932 1.058 0.760 0.311 7.176 2.340 2.912 2.713 2.293 3.558 4.201 9.369 4.633 5.723 4.697 Phycis chesteri 0.000 0.004 0.003 0.000 0 0.027 0.000 0.000 0.000 0.308 2.295 0.386 0.818 0.489 0.465 0.262 0.444 3.063 0.688 1.153 0.808 Urophycis tenuis 0.001 0.003 0.003 0.000 0.000 0. 0.000 0.032 0.272 0.357 2.035 0.074 0.284 0.160 0.021 0.295 0.533 3.091 0.675 1.173 0.709 Urophycis earllii 0.003 0.000 0.000 0.000 0.000 1.000 1.0001 0.143 0.461 0.614 1.390 0.121 0.378 0.307 0.272 0.009 0.103 2.129 0.339 0.600 0.323 Urophycis chuss 0.000 0.032 0.024 0.001 0.000 1.000 0.275 0.1 0.007 0.094 2.772 0.375 0.835 0.516 0.473 0.533 0.808 3.856 1.064 1.634 1.171 Urophycis floridana 0.000 0.050 0.008 0.061 0.003 1.000 0.031 0.007 0. 0.054 3.393 0.936 1.323 1.542 1.218 1.111 1.240 3.478 1.430 2.160 1.702 Urophycis regia 0.000 1.000 0.378 0.103 0.047 0.022 0.009 0.001 0.121 02W\ .421 1.030 1.553 1.254 1.038 1.488 1.882 5.812 2.248 3.024 2.357 Enchelyopus cimbrius 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.095| 994 0.862 1.330 0.613 0.452 0.206 0.224 0.070 0.126 Gaidropsarus ensis 0.019 0.000 0.000 0.000 0.000 0.056 0.270 0.201 0.054 0.002 0.002| 0.000 000 0.000 0.000 0.123 0.321 2.345 0.344 0.714 0.290 Molva molva 0.039 0.000 0.000 0.000 0.000 0.003 0.065 0.041 0.002 0.000 0.000 0.001 1.000 0.000 o.oool 0.437 0.608 2.287 0.544 0.877 0.412 Brosme brosme 0.024 0.000 0.000 0.000 0.000 0.023 0.153 0.057 0.018 0.000 0.000 0.002 1.000TOoo 0.0211 0.155 0.302 2.101 0.291 0.621 0.230 Lota lota 0.009 0.000 0.000 0.000 0.000 0.028 0.354 0.065 0.021 0.000 0.001 0.000 1.000 1.000 0.324 .393 0.589 2.570 0.574 0.981 0.481 Microgadus proximus 0.064 0.000 0.000 0.000 0.000 0.081 0.071 0.373 0.015 0.000 0.000 0.008 0.190 0.014 0.104 0 0.000 1.239 0.000 0.132 0.000 Pollachius virens 0.085 0.000 0.000 0.000 0.000 0.035 0.018 0.210 0.002 0.001 0.000 0.029 0.066 0.004 0.040 0.005 1 .912 0.000 0.016 0.000 Melanogrammus aeglefinus 0.030 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.101 0.000 0.000 0.000 0.000 0.000 0. 0.694 0.326 0.682 Gadus macrocephalus 0.303 0.000 0.000 0.000 0.000 0.008 0.009 0.055 0.002 0.000 0.000 0.102 0.046 0.005 0.045 0.006 1.000 1.000 0. .000 0.000 Gadus chalcogrammus 0.313 0.000 0.000 0.000 0.000 0.000 0.001 0.011 0.000 0.000 0.000 0.249 0.007 0.001 0.006 0.000 0.103 0.329 0.014 1. 0.016 Gadus morhua 1.000 0.000 0.000 0.000 0.000 0.005 0.007 0.064 0.001 0.000 0.000 0.188 0.072 0.017 0.067 0.013 1.000 1.000 0.001 1.000

53

Table 2.7 RAG1 substitution pattern homogeneity matrix. Below the diagonal are the P-values testing the null hypothesis that a species pair evolved under the same pattern of substitution. Above the diagonal are the Disparity Index (DI) values for each pair. Significant P-values are highlighted in yellow. Sequence pairs from phylogenetic clades are boxed: from top left to bottom right the boxes represent: Merluccius, Phycinae, Gaidropsarinae, Lotinae and Gadinae.

T.mur M.mer M.bil M.alb M.gay P.che U.ten U.ear U.chu U.flo U.reg E.cim G.ens M.mol B.bro L.lot M.pro P.vir M.aeg G.mac G.cha G.mor Trachyrincus murrayi 0.014 0.294 0.123 0.188 0.047 0.049 0.000 0.096 0.201 0.305 0.056 0.286 0.578 1.140 2.904 1.810 1.962 1.691 2.364 2.273 2.374 Merluccuis merluccius 0.327 0.079 0.000 0.023 0.000 0.000 0.000 0.000 0.000 0.013 0.000 0.013 0.256 0.732 1.974 1.136 1.253 1.193 1.549 1.504 1.571 Merluccius bilinearis 0.006 0.018 0.028 0.006 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.347 1.132 0.497 0.585 0.581 0.808 0.771 0.829 Merluccius albidus 0.072 1.000 0.015 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.096 0.599 1.573 0.812 0.915 0.813 1.175 1.128 1.184 Merluccius gayi 0.023 0.172 0.184 0.358 000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.040 0.494 1.383 0.666 0.764 0.700 1.009 0.964 1.019 Phycis chesteri 0.238 1.000 1.000 1.000 1.000 0.000 0.000 0.012 0.041 0.070 0.000 0.000 0.209 0.669 1.793 1.006 1.117 0.863 1.418 1.368 1.461 Urophycis tenuis 0.241 1.000 1.000 1.000 1.000 1. 0.000 0.000 0.017 0.042 0.000 0.000 0.170 0.618 1.694 0.934 1.039 0.757 1.319 1.276 1.352 Urophycis earlii 1.000 1.000 1.000 1.000 1.000 1.000 7ooo| ).021 0.021 0.037 0.000 0.000 0.149 0.721 1.256 0.813 1.162 0.817 1.135 1.130 1.232 Urophycis chuss 0.150 1.000 1.000 1.000 1.000 0.186 1.000 0. 0.000 0.017 0.000 0.000 0.076 0.572 1.505 0.760 0.860 0.629 1.131 1.080 1.153 Urophycis floridana 0,045 1.000 1.000 1.000 1.000 0.025 0.133 0.078 1.0001 0.000 0.000 0.000 0.013 0.570 1.258 0.583 0.672 0.469 0.917 0.872 0.939 Urophycis regia 0.016 0.346 1.000 1.000 1.000 0.015 0.049 0.023 0.135 7oOO| 000 0.000 0.003 0.476 1.079 0.472 0.551 0.396 0.770 0.739 0.799 Enchelyopus cimbrius 0.221 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1 0.165 0.563 1.742 0.948 1.063 0.713 1.363 1.303 1.400 Gaidropsarus ensis 0.018 0.350 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1 0.000 0.382 1.114 0.456 0.548 0.393 0.773 0.720 0.785 Molva molva 0.000 0.017 1.000 0.119 0.247 0.027 0.041 0.070 0.143 0.335 0.382 0.039 1.000 0.245 0.325 0.194 0.530 0.462 0.526 Brosme brosme 0.000 0.000 0.009 0.001 0.001 0.000 0.000 0.000 0.000 0.000 0.001 0.001 0.008 0.000 0.042 0.024 0.029 0.000 0.061 Lota lota 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.062 0.023 0.097 0.000 0.000 0.000 Microgadus proximus 0.000 0.000 0.005 0.001 0.001 0.000 0.000 0.000 0.001 0.002 0.005 0.000 0.005 0.020 1.000 0.170 0.000 0.000 0.000 0.000 Pollachius virens 0.000 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.002 0.000 0.001 0.006 0.223 1.000 000 0.000 0.000 0.003 Melanogrammus aegtefinus 0.000 0.000 0.002 0.000 0.001 0.000 0.001 0.000 0.001 0.004 0.010 0.001 0.008 0.039 0.293 1.000 1.000 0.047 0.012 0.031 Gadus macrocephalus 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.274 1.000 1.000 1.000 0 003 0.001 Gadus chalcogrammus 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.000 0.001 .000 1.000 1.000 1.000 0.231 0.324 0.003 Gadus morhua 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.174 1.000 0.345 0.087 0.346 0.247

54

CHAPTER 2 FIGURES

and C A Transversions Transversions Transitions

Transistionsand

T T and and B D ransversions ransversions

Transitions Transitions

Figure 2.1 Scatterplots showing the number of transitions (X symbol) and transversions (Δ symbol) per site with K80 distance. A: cyt b alignment with all positions included; B: RAG1 with all positions included. C: cyt b with 3rd positions only; D: RAG1 with 3d positions only.

55

—/0.71/0.53 — Gadus morhua 1/1/1 — Gadus chalcogrammus 1/1/0.97 — Gadus macrocephalus 1/1/1 Gadinae — Melanogrammus aeglefinus 1/1/1 — Pollachius virens 0.96/1/0.99 — Microgadus proximus 0.95/0.92/0.88 — Lota lota 1/1/1 — Brosme brosme Lotinae 0.99/0.95/0.99 Molva molva } — Enchelyopus cimbrius 1/1/1 — Gaidropsarinae — Gaidropsarus ensis Urophycis regia > 1/1/1 1/1/1— 0.98/0.97/0.94 — Urophycis floridana 0.79/—/0.78 — Urophycis chuss Phycinae 0.76/—/0.56 — Urophycis earllii 1/1/1 1/1/1 — Urophycis tenuis

— Phycis chesteri

0.99/0.98/0.59 — Merluccius gayi 0.99/1/0.98 Merluccius albidus — Merluccius 1/1/1 — Merluccius bilinearis — Merluccius merluccius — Trachyrincus murrayi

Figure 2.2 Consensus tree of the concatenated data. Node labels show posterior probability (pp) values corresponding to Bayesian analysis for the codon tree, 3rdsep tree (for which 3rd positions of cyt b were partitioned separately) and 3rdexcluded tree (for which 3rd positions of cyt b were excluded from analysis). (--) in place of a pp value indicates that for a particular analysis, this node did not agree with the consensus tree.

56

Gadus chalcogrammus 51/69/— — 97/87/— — Gadus macrocephalus

85/88/0.99 Gadus morhua 98/93/0.99 Gadinae Melanogrammus aeglefinus 100/100/1.0 Pollachius virens 77/78/0.98 Microgadus proximus 83/86/0.98 Lota lota 65/52/1.0 Brosme brosme Lotinae 87/84/0.99 Molva molva }

100/100/1.0 Enchelyopus cimbrius Gaidropsarinae cGaidropsarus ensis }

96/95/1.0 ~/38/0I.62I Urophycis earllii — Urophycis tenuis 81/72/1.0' 42/46/- Urophycis floridana Phycinae Urophycis regia 100/100/1.0 Urophycis chuss

Phycis chesteri

90/77/0 Merluccius albidus 100/99/1.0 Meriuccius gayi Merluccius 100/100/1.0 Merluccius bilinearis

Merluccius merluccius Trachyrincus murrayi

Figure 2.3 Consensus gene tree using RAG1 sequence data. Node labels show bootstrap values for Maximum Likelihood and Neighbor Joining analyses, followed by the Bayesian posterior probability. (--) in place of a bootstrap or pp value indicates that for a particular analysis, this node did not agree with the consensus tree.

57

Gadus morhua —/31/0.96 99/82/1 Gadus macrocephalus 39/50/— Gadus chalcogrammus

94/85/1 Microgadus proximus Gadinae

42/96/0.99 Melanogrammus aeglefinus 46/39/0.96 cPollachius virens

60/69/1 91/45/0.87 Brosme brosme cMolva molva Lotinae 41/92/- } Lota lota

98/97/1 Enchelyopus cimbrius Gaidropsarinae cGaidropsarus ensis }

79/89/1 44/59/0.93 Phycis chesteri 40/35/— Urophycis tenuis

53/68/ Urophycis chuss Phycinae 93/68/1 Urophycis regia 99/78/1 Urophycis floridana

Urophycis earllii

76/-/0.92 Merluccius gayi —/—/0.89 Merluccius albidus Merluccius 100/99/1 Merluccius merluccius

Merluccius bilinearis Trachyrincus murrayi

Figure 2.4 Consensus gene tree using cyt b sequence data. Node labels show bootstrap values for Maximum Likelihood and Neighbor Joining analyses, followed by the Bayesian posterior probability. (--) in place of a bootstrap or pp value indicates that for a particular analysis, this node did not agree with the consensus tree.

58

A

I I 1 anal fin ÿ 2 anal fins

B

I I 1dorsal fin I I 2 dorsal fins H 3 dorsal fins

Figure 2.5 Meristics mapped onto the concatenated consensus tree. A: anal fins (1 or 2); B: dorsal fins (1, 2, or 3).

59

Egg Oil Globule(s)

I I present ÿ absent

Figure 2.6 Egg oil globule presence (white branches) vs. absence (black branches) mapped onto the concatenated consensus tree. The color of boxes above branches indicate the state of that species, a missing box indicates a species for which data was unavailable. Branches with both black and white stripes indicate that both sates (present and absent) were equally parsimonious and a single state could not be inferred for that branch.

60

$V <$ÿ ÿ g / 0ÿ f jf / / / /;/,, / ///// / / / /* / # « JP r J" ÿ <$

Geographic Distribution I I circumarctic Arctic H Atlantic ÿ Mediterranean |Gulf of Mexico ÿ Pacific

Figure 2.7 Geographic distribution of the species included in this phylogeny. Multiple character states were allowed in this analysis; therefore several species show two colors.

61

References

Akasaki, T., Yanagimoto, T., Yamakami, K., Tomonaga, H., Sato, S. 2006. Species identification and PCR-RFLP analysis of cytochrome b gene in cod fish (Order Gadiformes) products. J Food Sci 71 (3): C190-C195. doi: 10.1111/j.1365- 2621.2006.tb15616.x

Ames, E.P. and Lichter, J. 2013. Gadids and Alewives: structure within complexity in the Gulf of Maine. Fisheries Research 141:70-78.

Aranishi, F., Okimoto, T., Izumi, S. 2005. Identification of gadoid species (Pisces, Gadidae) by PCR-RFLP analysis. J Appl Genet 46(1): 69-73.

Bakke, I., Johansen, S.D. 2005. Molecular Phylogenetics of Gadidae and Related Gadiformes Based on Mitochondrial DNA Sequences. Marine Biotechnology 7, 61–69. doi:10.1007/s10126-004-3131-0

Bell, J.R. 2008. A simple way to treat PCR Products prior to sequencing using ExoSAP-IT. BioTechniques 44(6): 834. doi: 10.2144/000112890

Berezin, I.V., Kershengol’ts, B.M., Ugarova, N.N. 1975. Microenvironment of enzymes as 1 of the factors determining enzyme stability. Stabilization of soluble and immobilized horseradish peroxidase. Dokl. Akad. Nauk SSSR 223: 1256–1259.

Block, B.A., Finnerty, J.R., Stewart, A.F., Kidd, J. 1993. Evolution of endothermy in fish: mapping physiological traits on a molecular phylogeny. Science 260: 210–214.

Borza, T., Stone, C., Gamperl, A.K., Bowman, S. 2009. Atlantic cod (Gadus morhua) hemoglobin genes: multiplicity and polymorphism. BMC Genetics 10: 51. doi:10.1186/1471-2156-10-51

Bossier, P. 1999. Authentication of seafood products by DNA patterns. J Food Sci 64(2):189-193. doi: 10.1111/j.1365-2621.1999.tb15862.x

Bowser, A.K., Diamond, A.W., Addison, J.A. 2013. From Puffins to Plankton: A DNA- Based Analysis of a Seabird Food Chain in the Northern Gulf of Maine. PLoS ONE 8, e83152. doi:10.1371/journal.pone.0083152

Cadotte, M., Albert, C.H., Walker, S.C. 2013. The ecology of differences: assessing community assembly with trait and evolutionary distances. Ecology Letters 16: 1234– 1244. doi:10.1111/ele.12161

62

Calo-Mata, P, Sotelo, CG, Pérez-Martín, RI, Rehbein, H, Hold, GL, Russell, VJ, Pryde, S, Quinteiro, J, Rey-Méndez, M, Rosa, C, Santos, AT. 2003. Identification of gadoid fish species using DNA-based techniques. Eur Food Res Technol 217(3): 259–264. doi: 10.1007/s00217-003-0735-y

Campo, D., Machado-Schiaffino, G., Perez, J., Garcia-Vazquez, E. 2007. Phylogeny of the genus Merluccius based on mitochondrial and nuclear genes. Gene 406: 171–179. doi:10.1016/j.gene.2007.09.008

Carr, S.M., Kivlichan, D.S., Pepin, P., Crutcher, D.C., 1999. Molecular systematics of gadid fishes: implications for the biogeographic origins of Pacific species. Canadian Journal of Zoology 77: 19–26. doi:10.1139/cjz-77-1-19

Cohen, D.M., Inada, T., Iwanoto, T., Scilabba N. 1990. Food and agricultural organization of the United Nations (FAO) species catalogue. Vol 10. Gadiform fishes of the world (order Gadiformes): an annotated and illustrated catalogue of cods, hakes, grenadiers and other gadiform fishes known to date. FAO Fish. Sinop. No. 125.

Collette, B.B., Klein-MacPhee, G., eds. Bigelow and Schroeder's Fishes of the Gulf of Maine. 3rd ed. Smithsonian Institution Press, Washington, DC, 2002. 748 pp.

Coulson, M.W., Marshall, H.D., Pepin, P., Carr, S.M. 2006. Mitochondrial genomics of gadine fishes: implications for and biogeographic origins from whole-genome data sets. Genome 49: 1115–1130. doi:10.1139/g06-083

Delport, W., Poon, A.F.Y., Frost, S.D.W., Kosakovsky Pond, S.L. 2010. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 26: 2455– 2457. doi:10.1093/bioinformatics/btq429

Di Pinto, A, Di Pinto, P, Terio, V, Bozzo, G, Bonerba, E, Ceci, E, Tantillo, G. 2013. DNA barcoding for detecting market substitution in salted cod fillets and battered cod chunks. Food Chem 141(3): 1757–1762. doi:10.1016/j.foodchem.2013.05.093

Dooley, JJ, Sage, HD, Clarke, M-AL, Brown, HM, Garrett, SD. 2005. Fish species identification using PCR−RFLP analysis and lab-on-a-chip capillary electrophoresis: application to detect White fish species in food products and an interlaboratory study. J Agric Food Chem 53(9): 3348–3357. doi: 10.1021/jf047917s

Douady, C.J., Delsuc, F., Boucher, Y., Doolittle, W.F., Douzery, E.J.P. 2003. Comparison of Bayesian and Maximum Likelihood Bootstrap Measures of Phylogenetic Reliability. Molecular Biology and Evolution 20: 248–254. doi:10.1093/molbev/msg042

Dunn, J.R., Matarese, A.C. 1987. A review of the early life history of Northeast Pacific gadoid fishes. Fisheries Research 5: 163–184. doi:10.1016/0165-7836(87)90038-5

63

Encyclopedia of Life (EOL). Available from http://www.eol.org. Accessed 11 July 2014.

Endo, H. Phylogeny of the order Gadiformes (Teleostei, Paracanthopterygii). Mem. Grad. Sch. Fish. Sci. Hokkaido Univ. 49: 75–149.

Erixon, P., Svennblad, B., Britton, T., Oxelman, B. 2003. Reliability of bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Systematic Biology 52: 665– 673. doi:10.1080/10635150390235485

Fahay, M. P. 1983. Guide to the early stages of marine fishes occurring in the western north Atlantic Ocean, Cape Hatteras to the southern Scotian Shelf. J. Northw. Atl. Fish. Sci 4: 168-189.

Feder, M.E. 2010. Physiology and Global Climate Change. Annual Review of Physiology 72: 123–125. doi:10.1146/annurev-physiol-091809-100229

Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17: 368-376.

Felsenstein, J. 1985a. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39(4): 783-791.

Felsenstein, J. 1985b. Phylogenies and the comparative method. The American Naturalist 125(1): 1-15

Froese, R. and D. Pauly. Editors. 2014. FishBase. World Wide Web electronic publication. www.fishbase.org, version (07/2014).

Garcia-Vazquez, E, Machado-Schiaffino, G, Campo, D, Juanes, F. 2012. Species misidentification in mixed hake fisheries may lead to overexploitation and population bottlenecks. Fish Res 114: 52–55. doi: 10.1016/j.fishres.2011.05.012

Garland, T., Bennett, A., Rezende, E.L. 2005. Phylogenetic approaches in comparative physiology. Journal of Experimental Biology 208: 3015–3035. doi:10.1242/jeb.01745

Griffiths, A.M., Sotelo, C.G., Mendes, R., Pérez-Martín, R.I., Schröder, U., Shorten, M., Silva, H.A., Verrez-Bagnis, V., Mariani, S. 2014. Current methods for seafood authenticity testing in Europe: Is there a need for harmonisation? Food Control 45: 95–100. doi:10.1016/j.foodcont.2014.04.020

Gaut, B.S. 1998. Molecular Clocks and Nucleotide Substitution Rates in Higher Plants. In: Hecht, M.K., Macintyre, R.J., Clegg, M.T., editors. Evolutionary Biology 30: 93-120.

64

Gaut, B., Yang, L., Takuno, S., Eguiarte, L.E. 2011. The Patterns and Causes of Variation in Plant Nucleotide Substitution Rates. Annual Review of Ecology, Evolution, and Systematics 42: 245–266. doi:10.1146/annurev-ecolsys-102710-145119

Handy, SM, Deeds, JR, Ivanova, NV, Hebert, PDN, Hanner, RH, Ormos, A, Weigt, LE, Moore, MM, Yancy, HF. 2011. A single-laboratory validated method for the generation of DNA barcodes for the identification of fish for regulatory compliance. J AOAC Int 94(1):201-210.

Halpern, B., Floeter, S. 2008. Functional diversity responses to changing species richness in reef fish communities. Marine Ecology Progress Series 364: 147–156. doi:10.3354/meps07553

Hare, M.P., Nunney, L., Schwartz, M.K., Ruzzante, D.E., Burford, M., Waples, R.S., Ruegg, K., Palstra, F. 2011. Understanding and estimating effective population size for practical application in marine species management. Conservation biology 25(3): 438-449.

Hasegawa M., Kishino H., and Yano T. 1985. Dating the human-ape split by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 22:160-174.

Hillis, D.M., Bull, J.J. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Systematic biology 42(2): 182-192.

Hilu, K.W., Black, C.M., Oza, D. 2014. Impact of Gene Molecular Evolution on Phylogenetic Reconstruction: A Case Study in the Rosids (Superorder Rosanae, Angiosperms). PLoS ONE 9 (6) e99725: 1-12. doi:10.1371/journal.pone.0099725

Howes, G.J., 1991. Biogeography of gadoid fishes. Journal of biogeography. 18: 595-622.

Howes, G.J. 1989. Phylogenetic relationships of macrourid and gadoid fishes based on cranial myology and arthrology. In: Cohen, D.M. (Ed.), Papers on the Systematics of Gadiform Fishes. Natural History Museum of Los Angeles County, Science Series No. 32, pp. 113–128

Huelsenbeck, J.P., Ronquist, F. 2001. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17: 754–755.

Jones, D.T., Taylor, W.R., Thornton, J.M. 1992. The rapid generation of mutation data matrices from protein sequences. Bioinformatics 8: 275–282. doi:10.1093/bioinformatics/8.3.275

Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16: 111-120.

65

Kolaczkowski, B., Thornton, J.W. 2004. Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous. Nature 431: 980–984. doi:10.1038/nature02917

Kraft, N.J.B., Cornwell, W.K., Webb, C.O., Ackerly, D.D. 2007. Trait evolution, community assembly, and the phylogenetic structure of ecological communities. The American Naturalist 170(2): 271-283

Kumar, S., Gadagkar, S.R. 2001. Disparity Index: A Simple Statistic to Measure and Test the Homogeneity of Substitution Patterns Between Molecular Sequences. Genetics 158: 1321–1327.

Lanfear, R., Kokko, H., Eyre-Walker, A. 2014. Population size and the rate of evolution. Trends in Ecology & Evolution 29: 33–41. doi:10.1016/j.tree.2013.09.009

Lockhart, P., Novis, P., Milligan, B.G., Riden, J., Rambaut, A., Larkum, T. 2005. Heterotachy and Tree Building: A Case Study with Plastids and Eubacteria. Molecular Biology and Evolution 23: 40–45. doi:10.1093/molbev/msj005

Luikart, G., Ryman, N., Tallmon, D.A., Schwartz, M.K., Allendorf, F.W. 2010. Estimation of census and effective population sizes: the increasing usefulness of DNA-based approaches. Conservation genetics 11: 355-373.

Maddison, W.P. 1991. Squared-change parsimony reconstructions of ancestral states for continuous-valued characters on a phylogenetic tree. Syst. Zool. 40: 304-314.

Maddison, W.P. and D.R. Maddison. 2011. Mesquite: a modular system for evolutionary analysis. Version 2.75 http://mesquiteproject.org

Markle, D.F. 1982. Identification of larval and juvenile Canadian Atlantic gadoids with comments on the systematics of gadid subfamilies. Canadian Journal of Zoology 60: 3420–3438. doi:10.1139/z82-432

Markle, D.F., Frost, L.-A. 1985. Comparative morphology, seasonality, and a key to planktonic fish eggs from the Nova Scotian shelf. Canadian Journal of Zoology 63: 246– 257. doi:10.1139/z85-038

Marske, K.A., Rahbek, C., Nogués-Bravo, D. 2013. Phylogeography: spanning the ecology-evolution continuum. Ecography 36: 1169–1181. doi:10.1111/j.1600- 0587.2013.00244.x

66

Martin, A.P., Palumbi, S.R. 1993. Body size, metabolic rate, generation time, and the molecular clock. Proceedings of the National Academy of Sciences 90: 4087–4091. doi:10.1073/pnas.90.9.4087

McCusker, MR, Denti, D, Guelpen, LV, Kenchington, E, Bentzen, P. 2013. Barcoding Atlantic Canada’s commonly encountered marine fishes. Mol Ecol Resour 13: 177-188. doi: 10.1111/1755-0998.12043

Mecklenburg, C.W., Møller, P.R., Steinke, D. 2011. Biodiversity of arctic marine fishes: taxonomy and zoogeography. Marine Biodiversity 41: 109–140. doi:10.1007/s12526- 010-0070-z

Miller, P.E., Scholin, C.A. 2000. On detection of Pseudo-nitzschia (Bacillariophyceae) species using whole cell hybridization: sample fixation and stability. J Phycol 36(1): 238- 250. doi: 10.1046/j.1529-8817.2000.99041.x

Møller P.R., Jordan A.D., Gravlund P., Steffensen, J.F. 2002. Phylogenetic position of the cryopelagic cod-fish genus Arctogadus Drjagin, 1932 based on partial mitochondrial cytochrome b sequences. Polar Biol 25:342–349

Mueter, F.J., Litzow, M.A. 2008. Sea ice retreat alters the biogeography of the bering sea continental shelf. Ecological Applications 18: 309–320. doi:10.1890/07-0564.1

Nelson, J.S. 2006. Fishes of the World. John Wiley and Sons, Hoboken, NJ.

Nolf, D., Steurbaut, E. 1989. Evidence from otoliths for establishing relationships within gadiforms. Pp. 89-111 In: Cohen. D.M. (ed), Papers on the systematics of gadiform fishes, Sci. Ser. No. 32. Nat. Hist. Mus. Los Angeles County, Los Angeles.

Northeast Fisheries Science Center (NEFSC). 2013. 56th Northeast Regional Stock Assessment Workshop (56th SAW) Assessment Summary Report. US Dept. Commer, Northeast Fish Sci Cent Ref Doc. 13-04; 42 p.

Northeast Fisheries Science Center (NEFSC). 2011. 51st Northeast Regional Stock Assessment Workshop (51st SAW) Assessment Report. US Dept Commer, Northeast Fish Sci Cent Ref Doc. 11-01; 70 p.

Nye, JA, Link, JS, Hare, JA, Overholtz, WJ. 2009. Changing spatial distribution of fish stocks in relation to climate and population size on the Northeast United States continental shelf. Mar Ecol Prog Ser 393: 111-129. doi: 10.3354/meps08220

Nye, J.A., Joyce, T.M., Kwon, Y.-O., Link, J.S. 2011. Silver hake tracks changes in Northwest Atlantic circulation. Nature Communications 2: 412. doi:10.1038/ncomms1420

67

OBIS (2014). Data from the Ocean Biogeographic Information System. Intergovernmental Oceanographic Commission of UNESCO. Web. http://www.iobis.org (consulted on 11 Jul 2014).

Paquin, M.M., Buckley, T.W., Hibpshman, R.E., Canino, M.F. 2014. DNA-based identification methods of prey fish from stomach contents of 12 species of eastern North Pacific groundfish. Deep Sea Research Part I: Oceanographic Research Papers 85: 110–117. doi:10.1016/j.dsr.2013.12.002

Perry, A.L. 2005. Climate Change and Distribution Shifts in Marine Fishes. Science 308: 1912–1915. doi:10.1126/science.1111322

Philippe, H., Forterre, P. 1999. The rooting of the universal tree of life is not reliable. J. Mol. Evol. 49: 509–523.

Pichon, F., Ghysen, A. 2004. Evolution of posterior lateral line development in fish and amphibians. Evol. Dev. 6: 187–193. doi:10.1111/j.1525-142X.2004.04024.x

Plachetzki, D.C., Fong, C.R., Oakley, T.H. 2010. The evolution of phototransduction from an ancestral cyclic nucleotide gated pathway. Proceedings of the Royal Society B: Biological Sciences 277: 1963–1969. doi:10.1098/rspb.2009.1797

Quinteiro, J, Vidal, R, Izquierdo, M, Sotelo, CG, Chapela, MJ, Pérez-Martín, RI, Rehbein, H, Hold, GL, Russell, VJ, Pryde, SE, Rosa, C, Santos, AT, Rey-Méndez, M. 2001. Identification of hake species (Merluccius genus) using sequencing and PCR−RFLP analysis of mitochondrial DNA control region sequences. J Agric Food Chem 49: 5108– 5114. doi: 10.1021/jf010421f

Ram, JL, Ram, ML, Baidoun, FF. 1996. Authentication of canned tuna and bonito by sequence and restriction site analysis of polymerase chain reaction products of mitochondrial DNA. J Agric Food Chem 44(8): 2460-2467. doi: 10.1021/jf950822t

Roa-Varón, A, Ortí, G. 2009. Phylogenetic relationships among families of Gadiformes (Teleostei, Paracanthopterygii) based on nuclear and mitochondrial data. Mol Phylogenet Evol 52(3): 688–704. doi: 10.1016/j.ympev.2009.03.020

Ronquist, F., Huelsenbeck, J.P. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574. doi:10.1093/bioinformatics/btg180

Ronquist, F., van der Mark, P., Huelsenbeck, J.P. 2009. Bayesian phylogenetic analysis using MrBayes. Pp. 210-266 in Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme, eds. The Phylogenetic Handbook: A Practical Approach to DNA and Protein Phylogeny. 2nd edition Cambridge University Press.

68

Saitou, N., Nei, M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4: 406–425.

Salemi, M. 2009. Genetic distances and nucleotide substitution models: practice. Pp. 137-140 in Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme, eds. The Phylogenetic Handbook: A Practical Approach to DNA and Protein Phylogeny. 2nd edition Cambridge University Press.

Sambrook, J, Russell, DW. 2000. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

Sánchez, A, Quinteiro, J, Rey-Mendez, M, Perez-Martín, RI, Sotelo, CG. 2009. Identification of European hake species (Merluccius merluccius) using Real-Time PCR. J Agric Food Chem 57(9): 3397–3403. doi: 10.1021/jf8036165

Schultz L.P., Welander, D.A. 1935. A review of the cods of the Northeastern Pacific with comparative notes on related species. Copeia 1935: 127–139

Seutin, G., White, B.N., and Boag, P.T. 1991. Preservation of avian blood and tissue samples for DNA analyses. Can J Zool 69(1):82-90. doi: 10.1139/z91-013

Shavit Grievink, L., Penny, D., Hendy, M.D., Holland, B.R. 2010. Phylogenetic Tree Reconstruction Accuracy and Model Fit when Proportions of Variable Sites Change across the Tree. Systematic Biology 59: 288–297. doi:10.1093/sysbio/syq003

Simmons, M.P., Pickett, K., M., Miya, M. 2004. How meaningful are bayesian support values? Molecular Biology and Evolution 21: 188–199. doi:10.1093/molbev/msh014

Svetovidov, A.M., 1948. Gadiformes. In Fauna of USSR, Vo IX No. 4. Fishes. Zoological. Institute of the academy of Sciences of the USSR (Translated by the Israel Program for Scientific Translations for the National Science Foundation and Smithsonian Institution as OTS 63–11071).

Tajima, F., Nei, M. 1984. Estimation of evolutionary distance between nucleotide sequences. Molecular Biology and Evolution 1: 269–285.

Tamura, K. 1992. Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G+C-content biases. Molecular Biology and Evolution 9: 678–687.

Tamura, K., Kumar, S. 2002. Evolutionary Distance Estimation Under Heterogeneous Substitution Pattern Among Lineages. Molecular Biology and Evolution 19: 1727–1736.

69

Tamura, K., Stecher, G., Peterson, D., Filipski, A., Kumar, S. 2013. MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0. Molecular Biology and Evolution 30: 2725– 2729. doi:10.1093/molbev/mst197

Teletchea, F. 2009. Molecular identification methods of fish species: reassessment and possible applications. Rev Fish Biol Fisher 19(3): 265–293. doi: 10.1007/s11160-009- 9107-4

Teletchea, F, Laudet, V, Hänni, C. 2006. Phylogeny of the Gadidae (sensu Svetovidov, 1948) based on their morphology and two mitochondrial genes. Mol Phylogenet Evol 38(1): 189–199. doi: 10.1016/j.ympev.2005.09.001

Tréguer, P.J., De La Rocha, C.L. 2013. The World Ocean Silica Cycle. Annual Review of Marine Science 5: 477–501. doi:10.1146/annurev-marine-121211-172346

Van Houdt, J.K., Hellemans, B., Volckaert, F.A.M. 2003. Phylogenetic relationships among Palearctic and Nearctic burbot (Lota lota): Pleistocene extinctions and recolonization. Molecular Phylogenetics and Evolution 29: 599–612. doi:10.1016/S1055- 7903(03)00133-7

Verde, C., Giordano, D., di Prisco, G., Andersen, Ø. 2012. The haemoglobins of polar fish: evolutionary and physiological significance of multiplicity in Arctic fish. Biodiversity 13: 228–233. doi:10.1080/14888386.2012.700345

Verde, C., Parisi, E., Di Prisco, G. 2004. Comparison of Arctic and Antarctic teleost haemoglobins: primary structure, function and phytogeny. Antarctic Science 16: 59–69. doi:10.1017/S0954102004001828

Von der Heyden, S., Matthee, C.A. 2008. Towards resolving familial relationships within the Gadiformes, and the resurrection of the Lyconidae. Molecular Phylogenetics and Evolution 48: 764–769. doi:10.1016/j.ympev.2008.01.012

Webb, C.O., Ackerly, D.D., McPeek, M.A., Donoghue, M.J. 2002. Phylogenies and community ecology. Annual Review of Ecology and Systematics 33: 475–505. doi:10.1146/annurev.ecolsys.33.010802.150448

Westneat, M.W. 1995. Feeding, Function, and Phylogeny: Analysis of Historical Biomechanics in Labrid Fishes Using Comparative Methods. Systematic Biology 44: 361– 383. doi:10.1093/sysbio/44.3.361

Wheeler, T.J., Kececioglu, J.D. 2007. Multiple alignment by aligning alignments. Bioinformatics 23, i559–i568. doi:10.1093/bioinformatics/btm226

70

Wilson, J., Hayden, A., Kersula, M. 2013. The governance of diverse, multi-scale fisheries in which there is a lot to learn. Fisheries research 141: 24-30.

Wolf, C, Burgener, M, Hübner, P, Lüthy, J. 2000. PCR-RFLP analysis of mitochondrial DNA: differentiation of fish Species. LWT - Food Science and Technology 33(2): 144–150. doi: 10.1006/fstl.2000.0630

Xia, X. 2013. DAMBE5: A comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 30(7): 1720-1728.

Xia, X., Z. Xie, M. Salemi, L. Chen, Y. Wang. 2003. An index of substitution saturation and its application. Molecular Phylogenetics and Evolution 26: 1-7.

Xia, X. and Lemey, P. 2009. Assessing substitution saturation with DAMBE. Pp. 615-630 in Philippe Lemey, Marco Salemi and Anne-Mieke Vandamme, eds. The Phylogenetic Handbook: A Practical Approach to DNA and Protein Phylogeny. 2nd edition Cambridge University Press.

Yang Z., Rannala B. 1997. Bayesian phylogenetic inference using DNA sequences: a Markov Chain Monte Carlo Method. Mol. Biol. Evol. 14 (7): 717–24.

71

Appendix I: GenBank sequences used in Chapter 1

Species Accession # Publication Brosme brosme DQ174037 Teletchea et al. 2006 Brosme brosme DQ174038 Teletchea et al. 2006 Brosme brosme EU492337 direct submission Brosme brosme KM032248 This study Enchelyopus cimbrius DQ174040 Teletchea et al. 2006 Enchelyopus cimbrius DQ174041 Teletchea et al. 2006 Enchelyopus cimbrius EU224005 direct submission Enchelyopus cimbrius EU224006 direct submission Enchelyopus cimbrius KM032249 This study Gadus macrocephalus AB078152 direct submission Gadus macrocephalus DQ174044 Teletchea et al. 2006 Gadus morhua DQ174045 Teletchea et al. 2006 Gadus morhua DQ174046 Teletchea et al. 2006 Gadus morhua EU492140 direct submission Gadus morhua EU492141 direct submission Gadus morhua KM032250 This study Gadus morhua KM032251 This study Gadus morhua KM032252 This study Lophius americanus HE608212 direct submission Melanogrammus aeglefinus EF427577 direct submission Melanogrammus aeglefinus EU224013 direct submission Melanogrammus aeglefinus EU224014 direct submission Melanogrammus aeglefinus EU492143 direct submission Melanogrammus aeglefinus KM032253 This study Merluccius albidus KM032254 This study Merluccius albidus KM032255 This study Merluccius albidus KM032256 This study Merluccius albidus KM032257 This study Merluccius bilinearis DQ174059 Teletchea et al. 2006 Merluccius bilinearis DQ174060 Teletchea et al. 2006 Merluccius bilinearis KM032258 This study Merluccius bilinearis KM032259 This study DQ174072 Teletchea et al. 2006 Phycis blennoides DQ174073 Teletchea et al. 2006 Phycis chesteri DQ174074 Teletchea et al. 2006 Phycis chesteri DQ174075 Teletchea et al. 2006 Phycis chesteri KM032260 This study Pollachius pollachius EU224029 direct submission Pollachius pollachius EU224028 direct submission Pollachius pollachius DQ174076 Teletchea et al. 2006 Pollachius pollachius EU492146 direct submission

72

Pollachius pollachius EU492302 direct submission Pollachius virens DQ174077 Teletchea et al. 2006 Pollachius virens DQ174078 Teletchea et al. 2006 Pollachius virens EU492147 direct submission Pollachius virens EU492301 direct submission Pollachius virens EU492146 direct submission Pollachius virens EU492302 direct submission Pollachius virens KM032261 This study Theragra chalcogramma AB078151 direct submission Theragra chalcogramma DQ174079 Teletchea et al. 2006 Theragra chalcogramma DQ174080 Teletchea et al. 2006 Urophycis chuss KM032262 This study Urophycis chuss KM032263 This study Urophycis chuss KM032264 This study Urophycis chuss KM032265 This study Urophycis regia KM032266 This study Urophycis regia KM032267 This study Urophycis regia KM032268 This study Urophycis regia KM032269 This study Urophycis tenuis DQ174085 Teletchea et al. 2006 Urophycis tenuis DQ174086 Teletchea et al. 2006 Urophycis tenuis KM032270 This study

73

Appendix II: Character Mapping of Physiological Tolerances

In addition to the traits examined in the main text of this thesis, character mapping was also conducted using abiotic parameters to investigate the evolution of physiological preferences and tolerances of the Gadoidei. It is important to note that this analysis is subjective in nature due to the fact that all of the parameters investigated were continuous characters. In order to map them onto a phylogeny each parameter was divided into discrete bins; the number and size of bins chosen to categorize each parameter may influence the interpretation of these results. Therefore, without further analyses to support my findings, the following results should be interpreted cautiously and be treated as a preliminary investigation. However, I find the prospect of using abiotic data (associated with species occurrences) for examining the evolution of physiological tolerances to be intriguing and potentially useful in predicting species responses to changes in climatic conditions. The results described here could be indicative of physiological trends and are worthy of further investigation.

METHODS

Nineteen continuous water chemistry parameters associated with species occurrences were examined in this study. The water chemistry parameters used in this analysis included six statistics describing temperature, three statistics describing depth, minimum and maximum salinity tolerances, minimum dissolved oxygen concentration

(ml/L) and saturation (%), and minimum concentrations of three nutrients: nitrate, phosphate and silicate (ml/L) (Table A1).

74

Physiological tolerance data were collected from electronic data repositories including FishBase (Froese and Pauly, 2014), and the Encyclopedia of Life (EOL, 2014).

Depth and water chemistry parameters were obtained for each species from the Ocean

Biogeographic Information System (OBIS, 2014). When multiple data sources provided information on depth or temperature associations, the data were considered collectively in determining minimum and maximum values. The entire OBIS record available for each species was downloaded and checked for anomalous occurrences at the maximum and minimum values of each variable of interest (depth, temperature, salinity, dissolved oxygen, nitrate, phosphate and silicate). Summary statistics for each variable were calculated in R (v. 2.14.2). These summary statistics were used to define the minimum and maximum preferred depth and temperature values as the first and third quartile values obtained from the distribution of data points for each parameter, respectively. Preferred spans were calculated by subtracting the minimum (first quartile value) from the maximum (third quartile value). Therefore, the preferred depth and temperature spans for each species are defined as the range over which the center 50% of species occurrences are distributed when sorted by increasing value of a given parameter.

Character mapping was conducted with Mesquite v. 2.75 (build 566) (Maddison and Maddison, 2011) using the default settings for the parsimony model; continuous characters were under the squared change assumption (Maddison 1991). In order to map continuous characters onto the concatenated consensus tree, values were binned into a number of discrete ranges and color coded by bin. The number of bins used to

75 categorize a continuous character was chosen to optimize the visual representation of that parameter on the phylogeny (Table A1).

RESULTS

In general, depth tolerances and preferences did not appear to be related to phylogenetic clades. Minimum depth requirements were less than 10 meters for all species except Enchelyopus cimbrius (20 m) and Merluccius gayi (50 m). For maximum depths, Gaidropsarus ensis was the deepest species at 2351 m, followed by Trachyrincus murrayi (1978 m) and Phycis chesteri (1750.5 m). All other species had maximum depths of less than 1500 m. As minimum depths were typically very shallow, depth range generally followed the same trend as maximum depth. Gaidropsarus ensis had the deepest minimum depth preference at 410.5m, followed by T. murrayi (295.5 m), M. gayi (226.9 m) and P. chesteri (202 m), while all other species had minimum preferred depths of 115 m or less. Gaidropsarus ensis had the deepest maximum preferred depth at 1038 m followed by T. murrayi (863.5 m) and P. chesteri (427 m), while all other species had maximum preferred depths of less than 350 m. Gaidropsarus ensis and T. murrayi had preferred depth spans of over 550 m, while all other species had a preferred span of less than 250 m.

Several different statistics were used to investigate temperature tolerance and preference in these fishes. For all species included in the analysis, temperature ranges were between 2.6 and 26.8°C, while preferred temperature spans ranged from only 0 to

7.82°C. Minimum temperature was variable for most clades in this phylogeny, although the derived Gadids (genera Pollachius, Melanogrammus and Gadus) were all associated

76 with very cold waters (less than 1°C). The Phycinae clade had a higher maximum temperature threshold (18.2°C to 26°C) than the Gadinae clade and the Lotinae, which had relatively cool maximum temperatures (11.1°C to 18°C) and both the Merluccius and Gaidropsarinae clades were variable (Fig. A1). The following species appeared to prefer warmer waters: U. earllii, U. regia, U. floridana and M. gayi with minimum temperature preferences greater than 9°C and maximum preferences greater than

13°C.

Minimum dissolved oxygen requirements were fairly variable for all clades except for Phycinae, for which all species had values between 3 and 4.5 ml/L (Fig. A2).

Lotinae appeared to have slightly higher oxygen demands (3 to 8.2 ml/L) and Gadinae typically a bit lower (0.3 to 4.5 ml/L). Similar trends were observed with minimum dissolved oxygen saturation values: Phycinae at 42-62%, Lotinae at 42-99% and Gadinae at 4-42%. For minimum salinity, the Phycinae and Merluccius clades were the only groups for which all included species had values above 29 PSU. All species had maximum salinity values over 32 PSU, with the exception of Lota lota at 6.901 PSU.

There does not appear to be a trend in minimum nitrate values as all species had minimum values less than 5 µmol/L except for Merluccius gayi which had a value of

27.46 µmol/L. The majority of species had low minimum phosphate (< 0.45 µmol/L) and silicate (< 2µmol/L) values except M. gayi, G. ensis, M. proximus, G. macrocephalus, and

G. chalcogramma. Additionally T. murrayi had a minimum phosphate value of 0.47

µmol/L and Lota lota had a silicate value of 12.05 µmol/L.

77

DISCUSSION

Several of the environmental characteristics appear to be related to geographic and spatial distribution and do not reflect a phylogenetic constraint on physiological tolerances and/or preferences. This trend is most apparent with the minimum phosphate and silicate values. The group of species with elevated values included all of the Pacific species and species that prefer deeper waters. Deeper waters are typically nutrient rich compared to surface waters as nutrients are consumed at the surface by photosynthetic organisms and remineralized at depth. The Pacific Ocean typically has higher nutrient concentrations compared to the Atlantic Ocean due to the circulation of nutrient rich deep water from the Atlantic Ocean into the Pacific Ocean. Merluccius gayi in particular had much higher minimum nutrient values than all other species, likely because M. gayi is mainly found off the western coast of South America where a prominent upwelling zone is located. Lota lota also had high minimum silicate values, which is not surprising since this is the only freshwater taxon within this group and a major source of silica to the ocean is riverine transport of dissolved silica derived from the weathering of continental crust (Tréguer et al. 2013).

For temperature and dissolved oxygen concentrations, there were several trends observed, particularly in regards to the Phycinae clade. Several caveats should be noted: 1) All Phycinae species included in this analysis can be found in the Atlantic

Ocean and observed trends may reflect this distribution and 2) The reconstructed phylogeny does not contain all Phycinae species. To infer a phylogenetic constraint on physiological tolerances, I used the results of the character mapping analysis to make

78 predictions about the physiological tolerances of several clades. The following predictions were tested using OBIS data for oxygen and temperature tolerances for other species in these clades that were not included in this phylogeny.

Oxygen

1. Species in the Phycinae clade have a minimum oxygen requirement in the range of 3 to 4.5 mL/L. 2. Species in the Gadus genus have a minimum oxygen requirement that is less than 3 mL/L. 3. Species in the Pollachius genus have a minimum oxygen requirement in the range of 3 to 4.5 mL/L. 4. Species in the Microgadus genus have a minimum oxygen requirement in the range of 1.5 to 3 mL/L. 5. Species in the Merluccius Euro-African clade (sensu Campo et al. 2007) have a minimum oxygen requirement in the range of 3 to 4.5 mL/L. 6. Species in the Merluccius American clade (sensu Campo et al. 2007) have a minimum oxygen requirement less than 1.5 mL/L.

Temperature

1. Species in the Phycinae clade have a maximum temperature tolerance between 18 and 26°C 2. Species in the Lotinae and Gadinae clades have a maximum temperature tolerance between 11 and 18°C.

Testing of these predictions are summarized using parallel boxplots in Figures A3

(oxygen data) and A4 (temperature data) and the species included in this analysis along with number of occurrences used to create the boxplots are listed in Table A2. Three

Merluccius species, M. senegalensis and M. angustimanus and M. polli were not included in this analysis because there were less than three data points available for each of these species. All results related to dissolved oxygen failed to match predictions; however, the resulting minimum oxygen range for the Phycinae clade was only 0.5 ml/L greater than the predicted range.

79

All Phycinae species had a minimum dissolved oxygen tolerance between 2.683 ml/L (Urophycis cirrata) and 4.601 ml/L (Urophycis mystacea). Compared to the original prediction of 3.0 to 4.5 ml/L, the observed range is relatively close, and from the parallel boxplots (Fig. A3) it is clear that the Phycinae group had a tighter distribution than any other clade. The other clades generally had larger tolerance ranges that did not appear to be related to phylogenetic relationships. For instance, among the Gadus species, G. morhua, G. chalcogrammus, and G. macrocephalus all had minimum oxygen concentrations of less than 2.0 ml/L while G. ogac had a minimum oxygen concentration of 3.564 ml/L. Pollachius virens had a minimum oxygen concentration of 3.118 ml/L and

P. pollachius had a concentration of 5.262 ml/L. Microgadus proximus had a minimum oxygen concentration of 2.379 ml/L while was at 6.494 ml/L.

Preferred oxygen spans of Atlantic Gadinae species were all between 6 and 8 ml/L while the Pacific Gadinae species (G. chalcogrammus, G. macrocephalus, and M. proximus) had much larger preferred oxygen spans albeit at lower oxygen concentrations (between ~ 3 and 7 ml/L). This suggests that the Pacific species may have adapted to lower oxygen environments compared to Atlantic Species. Additionally, among the Merluccius species, M. capensis, M. paradoxus, M. productus, M. gayi and M. albidus all have minimum oxygen concentrations of less than 2.0 ml/L while M. merluccuis, M. bilinearis, M. hubbsi, and M. australis have minimum oxygen concentrations ranging from 3.207 ml/L to 4.098 ml/L. These findings do not coincide with the phylogenetic relationships between these species as described by Campo et al.

(2007).

80

The Gadinae and Lotinae species were found to have maximum temperature thresholds that were less than 18°C. Most Gadinae and Lotinae species had maximum temperatures greater than 11°C as predicted; however, Gadus ogac and Microgadus tomcod had maximum temperatures of 6.586 and 9.635°C, respectively. As for the

Phycinae clade, the Urophycis species agreed with the expected maximum temperature of 18 to 26°C, however Phycis blennoides and P. phycis had maximum temperatures of

15.283 and 15.777°C, respectively: several degrees cooler than expected. It appears that in general, Phycinae species have larger and warmer preferred temperature spans (size of the box in the boxplot) than Gadinae and Lotinae species.

Many fish, including a number of Gadoid fishes, have exhibited distributional changes in response to climate change (Perry et al. 2005; Mueter and Litzow 2008; Nye et al. 2009; Nye et al. 2011). However, phylogenetic relationships among these fishes are not typically considered as a factor in these analyses. This study shows that there is moderate predictive power in mapping maximum temperature tolerances onto this phylogeny, suggesting that it would be worth further analysis to determine where these species groups are most likely to be and where they may potentially move under different scenarios of climate change. Some of the species that have exhibited the greatest movement are those with larger preferred temperature spans (Urophycis regia,

U. chuss, and M. bilinearis) (Nye et al. 2009). Gadus morhua is also among the species with relatively large preferred temperature spans yet the range of this species is contracting. Obviously, temperature is not the only limiting factor in the movement of

81

G. morhua and prey availability, harvest pressure, or other factors likely interact significantly with temperature constraints.

Another interesting observation from the boxplot analysis was that Gadus macrocephalus and G. ogac have distinctly different dissolved oxygen and temperature preferences; G. ogac is generally found in areas with higher oxygen concentrations and colder temperatures than G. macrocephalus. Both similarities in morphological characteristics as well as genetic analyses have provided evidence that G. macrocephalus and G. ogac are two geographically distinct populations of the same species (Schultz 1935; Carr et al. 1999; Møller et al. 2002; Coulson et al. 2006;

Mecklenburg et al. 2011). Further analyses are needed to determine the relative importance of various environmental and evolutionary processes that resulted in their apparent physiological differentiation.

The research described here lays a foundation for future studies to answer more specific questions surrounding the evolution of physiological traits and mechanisms in

Gadoid fishes. For instance, there have been numerous studies focusing on the haemoglobin of Gadus morhua (Verde et al. 2004; Borza et al. 2009; Verde et al. 2012).

If we assume that the minimum oxygen requirements of a species are related to their haemoglobin, then this analysis raises a number of physiological questions to be addressed in future research: How has haemoglobin evolved in the Gadoidei? Does the

Phycinae clade share a particular trait in their haemoglobin that causes this clade to have lower variation in their dissolved oxygen tolerances than other Gadoids?

82

It is important to note that the physiological characteristics used in this study were based on occurrence records (coordinate, depth, and date of capture) used to retrieve model estimates (via the World Ocean Atlas: http://www.nodc.noaa.gov/OC5/woa13/) of environmental parameters at the location of capture and are neither in situ measurements nor do they reflect empirically derived tolerances. A logical next step would be to test Gadoid species’ physiological tolerances to temperature and oxygen in a laboratory setting. This would provide an opportunity to determine if the observed trends in physiological preference and tolerance described in this study could be used for mapping current suitable habitat zones and identifying locations that species will most likely move to under the impacts of climate change.

83

Table A1. Environmental variables analyzed in this study and the ranges of values associated with each characteristic. See text for definition of preferred values.

Characteristic Type # of Bins State Range Maximum Temperature Continuous 6 10.1C to 13.28C Preferred Minimum Temperature Continuous 5 0.792 to 23.41C Preferred Maximum Temperature Continuous 8 3.3C to 23.7C Maximum Temperature Range Continuous 6 2.6 to 26.8C Minimum Temperature Continuous 5 (-2C to 14.2C) Preferred Temperature Span Continuous 8 0 to 7.8 Minimum DO concentration Continuous 6 0.3 to 8.2 ml/L Minimum DO Saturation Continuous 5 4.15 to 99.17% Minimum Salinity Continuous 5 6.1 to 34.8 ppt Maximum Salinity Continuous 5 6.9 to 38.8ppt Minimum Depth Continuous 5 0m to 50m Maximum Depth Continuous 5 221m to 2351m Depth Range Continuous 5 221m to 2351m Preferred Minimum Depth Continuous 6 0 to 410.5 m Preferred Maximum Depth Continuous 5 5.6 to 1038m Preferred Depth Span Continuous 5 5.6 to 627.5m Minimum Nitrate Continuous 6 0.29 to 27.5ml/L Minimum Phosphate concentration Continuous 6 0.05 to 2.07 µmol/L Minimum Silicate Continuous 5 0.61 to 20.53 µmol/L

84

Table A2 Species included in prediction analysis. N is the number of species occurrences with abiotic data that were used in creating boxplots.

Species N Brosme brosme 6378 Enchelyopus cimbrius 19326 Gadus chalcogrammus 1099 Gaidropsarus ensis 784 Gadus macrocephalus 769 Gadus morhua 312811 Gadus ogac 1229 Lota lota 10 Melanogrammus aeglefinus 382276 Merluccuis albidus 1568 Merluccuis australis 4711 Merluccuis bilinearis 32367 Merluccuis capensis 2906 Molva dypterygia 323 Merluccuis gayi 3 Merluccius hubbsi 9242 Molva macrophtalma 492 Merluccius merluccius 55965 Molva molva 8207 Merluccuis paradoxus 3091 Merluccuis productus 3091 Microgadus proximus 70 Microgadus tomcod 16 Phycis blennoides 2567 Phycis chesteri 6384 34 Pollachius pollachius 2416 Pollachius virens 59966 Trachyrincus murrayi 646 Urophycis brasiliensis 348 Urophycis chuss 14023 Urophycis cirrata 68 Urophycis earllii 103 Urophycis floridana 122 Urophycis mystacea 112 Urophycis regia 4776 Urophycis tenuis 19600

85

>/,/ ÿ % / / k •/ - / # p+/J" ÿ #/ - # / / # / f / / / / // // . / / / /" /• / / / # / / / / f J- / / f / / / / « / / / # / y ÿ »>0 # # # # # „ / # # oX° A / # # #

Maximum Temperature (°C) I I 10.1 to 12.75 ÿ 12.75 to 15.4 I I 15.4 to 18.05 I I 18.05 to 20.7 20.7 to 23.35 23.35 to 26.0

Figure A1 Maximum temperature (°C) mapped onto the concatenated consensus tree. Cool colors (blues) indicate colder temperature while warm colors (reds and oranges) indicate high temperatures.

86

A? o* y # / / ÿ ÿ 0V « / / Jr if o£# /s */ & »® J* / & # «# / / ,? £}' f;# -<ÿ f / y C?> ÿ j y /// <ÿ"r / / /j/ ,/ y y ,y j / „/ / /./ y y y # #ÿ ÿ ÿ ° V ILM

Minimum Dissolved Oxygen (ml/L) |~1 0.292 to 1.5 I I 1.5 to 3.0 ÿ 3.0 to 4.5 I I 4.5 to 6.0 F~l 6.0 to 7.5 ÿ 7.5 to 8.178

Figure A2 Minimum dissolved oxygen concentration (ml/L). Cool colors (blues) indicate low values while warm colors (reds and oranges) indicate high values.

87

------ÿ ' ' ' 1 ------I IE-H H-flH Hh I »~1H »H "I I h|— _l H Q-l H H I —

n-H Gaidropsarinae cm ~u H Gadinae Lotinae Phycinae >-CH Merluccius ÿKy I h D »-n H 8 h-CD-— ÿ-Q "Cm

I jj] (ml/L) ZZF— I d-H I -ED— h 6 eÿ4> h []-H Xs> i

— £>>• |H CD Ch <9 H

Dissolved — 4? i Q— *>r 0 — <$> CD c% 4-n

/>»> ,__£ÿ o, _ % xZX '% %2* Q/v. H— I— h-- X- -H I //yÿ >%/x- s y _i h— V<>. v4* h— ~o % % X V, v% %, x, (ZD— X>&- xX- XV** X°x> dh ,:i X/ /-. % h-— XXX x x X xx%

— st V\ OBIS). Boxes enclose the center 50% of the data, with the bottom edge of the box indicating the 1 quartile value, the upper edge of o X xx rd the box indicating the 3(I) quartile value, the center line indicating the median and the bars extending to the minimum and maximum \X%, _ _

ÿ ÿ values. Species are ordered by an approximated topology and phylogenetic clades (as defined by Roa-Varón and Ortí 2009 and \X _ _ ÿ \\\.

v\ referred to in the text) are highlighted by alternating shaded areas. x\ \v Xÿ' ÿ

I 88

-----

------• • ' ------1 ' ' ' 1 ----- m ' a H ------< i-nhH H H — H —

Gaidropsarinae "1 H H Gadinae Lotinae Phycinae Merluccius c<% H H (°C) -I

C/A 25 < tl — c> -H i- '<* 10 i _gjg| — h_| i ÿ_H h >c4 <» £Q h GV A © ° •"%,

CD Ssl. *te, Sp °f>-. c -"C/ [J] H-{E \? °« _ 4h~ h- %/ *« _ _ {]] On °o "SN: ÿcy i— <£<>ÿ cÿy

x?a X> <5. I %_ - — ÿo Q} — Qj." — '

V/t> °\ X, W- •/j/ % £ A

. Figure A4 Parallel boxplots show the distribution of"O temperature associated with occurrences for each species (collected (J) A A st A- O VA

from OBIS). Boxes enclose the center 50% of the*vÿ» data, with the bottom edge of the box indicating the 1 quartile value, the upper %x\ ÿ A- r edge of the box indicating the 3rd quartile value, the center line indicating the median and the bars extending to the minimum and

A v» A.

ÿ /

ÿ maximum values. Species are ordered by an approximated topology and phylogenetic clades (as defined by Roa-Varón and Ortí 2009 ÿ ,

I and referred to in the text) are highlighted by alternating shaded areas. \ ÿ \\\ j

ÿ \\

ÿ 89