THE TYPE I ANTIFREEZE PROTEIN GENE FAMILY IN

by

Kyra Keiko Nabeta

A thesis submitted to the Department of Biochemistry

in conformity with the requirements for

the degree of Master of Science

Queen’s University

Kingston, Ontario, Canada

(January, 2009)

Copyright © Kyra Keiko Nabeta, 2009

Abstract

Antifreeze proteins (AFPs) protect marine teleosts from freezing in icy seawater by binding to nascent ice crystals and preventing their growth. It has been suggested that the gene dosage for AFPs in fish reflects the degree of exposure to harsh winter climates. The starry flounder, Platichthys stellatus, has been chosen to examine this relationship because it inhabits a range of the Pacific coast from California to the Arctic. This fish is presumed to produce type I AFP, which is an alanine-rich, amphipathic alpha-helix.

Genomic DNA from four starry flounder was Southern blotted and probed with a cDNA of a winter flounder liver AFP. The hybridization signal was consistent with a gene family of approximately 40 copies. Blots of DNA from other starry flounder indicate that California fish have far fewer gene copies whereas Alaska fish have far more. This analysis is complicated by the fact that there are three different type I AFP isoforms. The first is expressed in the liver and secreted into circulation, the second is a larger hyperactive dimer also thought to be expressed in the liver, and the third is expressed in peripheral tissues. To evaluate the contribution of these latter two isoforms to the overall gene signal on Southern blots, hybridization probes for the three isoforms were isolated from starry flounder DNA by genomic cloning. Two clones revealed linkage of genes for different isoforms, and this was confirmed by genomic Southern blotting, where hybridization patterns indicated that the majority of genes were present in tandem repeats.

The sequence and diversity of all three isoforms were sampled in the starry flounder genome by PCR. All coding sequences derived for the skin and liver isoforms were consistent with the proposed structure-function relationships for this AFP, where the flat hydrophobic side of the helix is conserved for ice binding. There was greater sequence diversity in the skin and hyperactive isoforms than in the liver isoform, suggesting that the latter evolved recently from one of the other two. The genomic PCR primers are currently being used to sample isoform diversity in related right-eyed flounders to test this hypothesis.

Acknowledgements

I'd like to thank my supervisor Dr. Peter Davies for the opportunity to gain insight into the true nature of research.

Special thanks to Dr. Laurie Graham for teaching me how to write in Science and for her patience and support and guidance through all that sequence.

I would also like to thank Sherry Gauthier and the rest of the Davies lab for their technical assistance and moral support.

Thanks to my family and friends for their proofreading skills, encouragement and general positivity. Here's to new future directions!

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Abbreviations
Chapter 1 Introduction
1.1 Flounder type I AFPs
1.1.1 Liver isoform
1.1.2 Skin isoform
1.1.3 Hyperactive isoform
1.2 Type I AFP in the Pleuronectidae family
1.2.1 Yellowtail flounder
1.2.2 American plaice
1.2.3 Other Pleuronectidae
1.3 Selective pressure: Shaping evolution
1.4 Goals and Objectives
Chapter 2 Materials and Methods
2.1 Isolation of genomic DNA
2.2 Southern blotting
2.3 Library preparation and amplification
2.4 Library screening
2.5 Phage isolation
2.6 DNA sequencing
2.7 PCR analysis
2.8 Bioinformatic analysis
Chapter 3 Results
3.1 Preface
3.2 Genomic DNA samples were of variable quality
3.3 Genomic Southern blot showed strong liver AFP gene signal in starry flounder
3.4 Genomic Southern blot shows strong hyperactive AFP gene signal
3.5 Genes for starry flounder liver and skin AFPs are closely linked
3.6 Starry flounder liver and skin AFPs are homologous to their winter flounder counterparts
3.6.1 Liver isoform
3.6.2 Skin isoform
3.6.3 Regulatory elements
3.7 The starry flounder has multiple variants of the liver and skin AFPs
3.7.1 Liver variants
3.7.2 Skin variants
3.8 The American plaice has multiple variants of the skin AFPs
3.9 The starry flounder has multiple variants of the hyperactive AFPs
3.10 Genes for starry flounder hyperactive and skin AFPs are closely linked
3.11 Regulatory elements are conserved in stfs-AFP8
3.12 Starry flounder hyperactive AFP is homologous to its winter flounder counterpart
3.13 Starry flounder-specific probes do not alter Southern blot banding patterns
Chapter 4 Discussion
4.1 Conclusions
References
Appendix A DNA alignment of winter and starry flounder liver AFPs
Appendix B DNA alignment of winter and starry flounder skin AFPs
Appendix C DNA alignment of the 3' regions from all three type I AFP isoforms of starry flounder and winter flounder and the liver isoform of yellowtail flounder
Appendix D DNA alignment of the upstream region and exon 1 in the liver and hyperactive AFPs of winter, starry and yellowtail flounders
Appendix E DNA alignment of four unique starry flounder liver AFP gene sequences
Appendix F DNA alignment of fourteen unique genes encoding starry flounder skin AFPs
Appendix G DNA alignment of the American plaice skin AFPs
Appendix H DNA alignment of three unique gene sequences encoding starry flounder hyperactive AFPs
Appendix I DNA alignment of winter and starry flounder hyperactive AFPs
Appendix J DNA alignment of winter and starry flounder skin AFPs
Appendix K Genomic Southern blot probed with hyperactive AFP cDNAs from the winter flounder and starry flounder

List of Figures

Figure 1. Comparison of the three type I AFP isoforms
Figure 2. Morphological phylogeny of right-eyed flounders
Figure 3. Genomic Southern blot of selected members of the Pleuronectidae family probed with the winter flounder liver AFP gene sequence
Figure 4. Locations from which starry flounder samples were obtained
Figure 5. Genomic Southern blot of starry flounders from different latitudes probed with the winter flounder liver AFP gene sequence
Figure 6. Schematic diagram showing the relative positions of the PCR primers on the liver, skin and hyperactive AFP isoforms
Figure 7. Quality assessment of genomic DNA from an Alaskan starry flounder
Figure 8. Genomic Southern blots of winter flounder and QCI starry flounder probed with various AFP gene sequences
Figure 9. Schematic diagram of starry flounder genomic DNA insert #1
Figure 10. Schematic diagram of the AFP gene organization in genomic DNA insert #1
Figure 11. Protein alignment of liver AFP variants from starry flounder and winter flounder
Figure 12. Helical wheel diagrams of liver AFPs from starry flounder and winter flounder
Figure 13. Protein alignment of skin AFP variants from starry flounder, winter flounder and American plaice
Figure 14. Characterization by PCR of the AFP isoforms present in phage stocks isolated from the primary library screen
Figure 15. Helical wheel diagrams of skin AFPs from starry flounder, winter flounder and American plaice
Figure 16. Protein alignment of hyperactive AFP variants from starry flounder and winter flounder
Figure 17. Schematic diagram of starry flounder genomic DNA insert #2
Figure 18. Schematic diagram of the AFP gene organization in genomic DNA insert #2

List of Tables

Table 1. Names and sequences of primers used in PCR experiments
Table 2. Relative codon usage in starry flounder and winter flounder AFP genes
Table 3. Type I AFP variants isolated from the starry flounder genome

List of Abbreviations

A260 - absorbance at 260 nm
AFP - antifreeze protein
AFGP - antifreeze glycoprotein
AK - Alaska
aps - American plaice skin
bp - base pair
BC - British Columbia
C - carboxyl
CA - California
cDNA - complementary deoxyribonucleic acid
CsCl - cesium chloride
DMSO - dimethyl sulfoxide
dNTP - deoxyribonucleotide triphosphate
DPPIV - dipeptidyl aminopeptidase IV
HPLC - high performance liquid chromatography
kb - kilobases
KCl - potassium chloride
kDa - kiloDalton
MgCl2 - magnesium chloride
MHC - major histocompatibility complex
mRNA - messenger RNA
mya - million years ago
N - amino
(NH4)2SO4 - ammonium sulfate
ORF - open reading frame
PCR - polymerase chain reaction
pfu - plaque forming units
QCI - Queen Charlotte Islands
rRNA - ribosomal RNA
stfh - starry flounder hyperactive
stfl - starry flounder liver
stfs - starry flounder skin
TFIID - transcription factor IID
TH - thermal hysteresis
Tm - melting temperature
UTR - untranslated region
wfh - winter flounder hyperactive
wfl - winter flounder liver
wfs - winter flounder skin

Note regarding temperature units: Absolute temperature is measured in degrees Celsius (°C), while a change in temperature, such as thermal hysteresis, is measured in Celsius degrees (C°).

Chapter 1 Introduction

Over evolutionary time, fish have radiated to occupy almost every aquatic habitat. Teleosts, or bony ray-finned fishes, have been particularly successful, comprising more than 95% of all extant fish species and almost half of all living vertebrates [1]. However, marine teleosts face a major problem in icy water because the solute concentration of their body fluids is lower than that of seawater [2]. This means that they have the potential to freeze at temperatures approximately 1 C° above the freezing point of seawater (-1.9 °C). Fish are able to live in a supercooled state for limited periods, but internal or external contact with ice crystals and other nucleators during these times is lethal, as it seeds rapid ice growth into the body. Ice crystals are prevalent in surface waters but can be driven to greater depths during storms, putting supercooled fish at risk of death by freezing [3]. Despite this danger, fish are nevertheless found in shallow waters at high latitudes.

Plasma freezing temperatures below the then-accepted value (-0.7 °C) for most teleosts were observed as early as 1957 [2]. However, it was not until much later that agents responsible for this freezing point depression were isolated [4] and termed antifreeze (glyco)proteins, or AF(G)Ps. Several classes of AFPs have since been described [5], but they are all thought to work by the same adsorption-inhibition mechanism [6]. In this model, AF(G)Ps irreversibly bind nascent ice crystals and effectively stop crystal growth at concentrations much lower than those that could affect the bulk properties of the solution. Because the levels required to exert an effect are so small, AFPs and AFGPs are considered to be non-colligative in their mode of action, even though antifreeze activity is concentration-dependent. Freezing point depression occurs via the Kelvin effect [7]. When AF(G)Ps adsorb onto ice crystals, they restrict the addition of water molecules to gaps between the bound proteins, resulting in local surface curvatures. Because it is energetically more difficult for water to join a curved ice surface, the freezing point decreases without significantly affecting the melting point. Once the water cools to a temperature that can overcome the local energy barrier, the AF(G)Ps are overgrown and uncontrolled ice crystal growth occurs. The difference between the melting and freezing temperatures is termed the thermal hysteresis (TH) gap and its value, in C°, is used as a measure of antifreeze activity.
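The definition of the TH gap given above can be illustrated with a short calculation. The following Python sketch is only an illustration: the function name and the two temperature readings are hypothetical and are not measurements from this work.

```python
def thermal_hysteresis(melting_point_c: float, freezing_point_c: float) -> float:
    """Return the TH gap in Celsius degrees (melting point minus freezing point)."""
    if freezing_point_c > melting_point_c:
        # In the presence of AF(G)Ps the freezing point lies at or below the melting point.
        raise ValueError("freezing point must not exceed melting point")
    return melting_point_c - freezing_point_c


# Hypothetical readings (in degrees Celsius) for a sample containing antifreeze.
melting_point = -0.8
freezing_point = -1.5

th_gap = thermal_hysteresis(melting_point, freezing_point)
print(f"TH gap: {th_gap:.1f} Celsius degrees")
```

A sample with no antifreeze activity would give a gap of zero, since its melting and freezing points coincide.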

After the initial characterization of AFGPs in Antarctic species, other polar fishes were sampled to determine the distribution of these proteins, and four more unrelated types of AFPs were discovered: type I in the winter flounder (Pseudopleuronectes americanus) [8], type II in the sea raven (Hemitripterus americanus) [9], type III in the ocean pout (Macrozoarces americanus) [10] and type IV in the longhorn sculpin (Myoxocephalus octodecemspinosus) [11].

Briefly, the AFGPs consist of tandem repeats of an Ala-Ala-Thr tripeptide with a disaccharide moiety attached to each threonyl hydroxyl group. Of the AFPs, type I denotes an Ala-rich amphipathic alpha-helix. Type II AFPs are globular proteins with mixed secondary structure. Type III AFPs are also globular but contain short beta-strands, resulting in a flat face along one side of the protein. Type IV AFPs are the least well-characterized and are predicted to have a helix-bundle structure. The type I AFPs, specifically those of the right-eyed flounders, are central to this thesis and are described in detail below.


1.1 Flounder type I AFPs

There are three different sub-classes of the type I AFPs, the skin, liver and hyperactive isoforms (Figure 1), all of which have been characterized in a single species of right-eyed flounder. Within each isoform type, there are several variants with the same overall structure and regulation, but with slight differences in sequence. Although the three isoforms differ in some respects, such as their size, tissue distribution and mode of induction, they also share many features. For example, all are alpha-helical near 0 °C, but with a slight variation in the helix, such that each amino acid corresponds to a 98.2° turn rather than the typical 100° rotation. All isoforms are also Ala-rich and effective at stopping crystal growth in a concentration-dependent manner. Each isoform will be reviewed in the order of its discovery and features will be summarized as they pertain to the winter flounder, the species in which this type of AFP is best characterized.
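One consequence of the 98.2° turn per residue can be checked with simple arithmetic: over the 11-amino acid repeat found in these proteins, the helix completes almost exactly three full turns, so equivalent positions in successive repeats fall on nearly the same face of the helix. The Python sketch below is an illustrative calculation only; the function name is hypothetical, and the link drawn here between the helix pitch and the repeat length is offered as an interpretation rather than a result of this work.

```python
def angular_offset_after(n_residues: int, degrees_per_residue: float) -> float:
    """Angle (degrees) by which residue i + n_residues is rotated around the
    helix axis relative to residue i, folded into the range (-180, 180]."""
    offset = (n_residues * degrees_per_residue) % 360.0
    if offset > 180.0:
        offset -= 360.0
    return offset


REPEAT_LENGTH = 11  # residues per repeat in type I AFPs

for label, turn in (("type I AFP helix", 98.2), ("typical alpha-helix", 100.0)):
    per_repeat = angular_offset_after(REPEAT_LENGTH, turn)
    print(f"{label}: {per_repeat:.1f} degrees of drift per 11-residue repeat")
```

With 98.2° per residue the drift is about 0.2° per repeat, whereas a 100° rotation would shift each successive repeat by 20° around the helix axis.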

1.1.1 Liver isoform

Initial observations of freezing point depression in the serum or plasma of various fishes led to the discovery and characterization of type I AFPs in winter flounder blood [8, 12, 13]. This isoform is produced in the liver [14, 15] as a preproprotein [13]. The pre-sequence is a 23-amino acid signal peptide that is cleaved after directing co-translational secretion of the protein into the bloodstream [16]. The pro-sequence is an activation peptide of variable length in which every other residue is Ala or Pro [17], and it is cleaved stepwise within 24 h of entering circulation [18]. It has been suggested that dipeptidyl aminopeptidase IV (DPPIV) is the enzyme responsible for activation [19], but this has never been confirmed. The mature protein is 37 amino acids long, after post-translational cleavage of the C-terminal Gly and the retention of its amino group as an amide on the penultimate Arg residue [13]. It was predicted by circular dichroism and viscosity studies to be an alpha-helical rod-shaped molecule [12], and these predictions were confirmed when the structure was solved by X-ray crystallography [20]. Stability of this long helix is supported by internal salt bridges [20, 21] and elaborate capping structures composed of hydrogen bonding networks at both termini [22].

Figure 1. Comparison of the three type I AFP isoforms.

Three-dimensional structures of the liver, skin and hyperactive type I AFPs (wfl-AFP6, wfs-F2 and wfh-AFP1, respectively) were constructed by F.-H. Lin in space-filling mode using PyMOL. Only the structure for wfl-AFP6 has been solved to date, and wfh-AFP1 is modelled here in its soluble homodimeric form. Size ranges for the different variants within each isoform are noted next to each image, and the size difference between the hyperactive isoform and the other two is clearly evident.

Figure labels: Liver, 3-5 kDa; Skin, 3-5 kDa; Hyperactive, 32 kDa.

The mature peptide consists of approximately 60% Ala, with Thr and Asx being the next most common residues [23], and because codon usage for Ala is heavily biased towards GCC, genes are GC-rich and highly repetitive [13]. Analysis of the mature peptide identified an 11-amino acid repeat motif, Thr-X2-Asx-X7, where X is usually Ala [24], and models regarding the relevance of these repeats to antifreeze function were put forward. The first mechanistic model proposed that the Thr/Asx residues, which are regularly spaced 4.5 Å apart, form hydrogen bonds with similarly-spaced oxygen atoms in the primary prism plane of the ice lattice [6, 24]. However, this model was revised [25] after ice etching studies established that AFPs bound a specific pyramidal plane of ice [26] and again after mutagenesis studies determined that the key ice-binding residues were on the conserved hydrophobic face [27]. Such adsorption also explained the characteristic hexagonal bipyramidal ice crystals [28] that form in AFP solutions and grow rapidly, or burst, along the c-axis in activity assays when the limits of the TH gap are breached [29]. As similar AFPs were discovered in other flounders, alignments across species brought about modification of the accepted repeat motif to TaaXAXXAAXX, where lowercase a is almost always Ala, X is usually Ala and uppercase T/A are conserved Thr and Ala residues [30].
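As an illustration of how the Thr-X2-Asx-X7 repeat and the Ala content could be located in a peptide sequence, the Python sketch below scans a one-letter sequence with a regular expression. It is a minimal sketch under stated assumptions: the function names are hypothetical, and the toy peptide is an invented 37-residue string built from the motif for demonstration, not a real AFP sequence from this or any other study.

```python
import re

# Thr, any two residues, Asp or Asn (Asx), then any seven residues: 11 in total.
# The lookahead lets overlapping candidate repeats be reported as well.
REPEAT_MOTIF = re.compile(r"(?=(T.{2}[DN].{7}))")


def find_repeats(peptide: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched 11-mer) for each motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in REPEAT_MOTIF.finditer(peptide)]


def alanine_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are Ala."""
    return peptide.count("A") / len(peptide)


# Hypothetical toy peptide (NOT a real AFP): three idealized repeats plus flanking residues.
toy_peptide = "D" + "TAADAAAAAAA" * 3 + "TAR"

for start, repeat in find_repeats(toy_peptide):
    print(start, repeat)
print(f"Ala content: {alanine_fraction(toy_peptide):.0%}")
```

In the toy peptide the motif starts recur every 11 residues, which is the spacing that places the Thr residues along one side of the helix.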

Genomic Southern blotting and restriction analysis established that this isoform is encoded by a gene family consisting of 30 – 40 members, most of which are arranged in direct tandem repeats [31, 32]. The AFPs expressed are quite similar, aside from minor amino acid substitutions and slight variations in the number of 11-amino acid repeats. Using a nomenclature system proposed by Low et al. [33], the two main variants of the liver isoform in the winter flounder are wfl-AFP6 (winter flounder AFP produced in the liver, renamed from HPLC-6) and wfl-AFP8 (renamed from HPLC-8), which respectively make up close to 60% and 40% of total plasma AFP content in winter [13]. They both contain three 11-amino acid repeats and have a molecular mass of 3.3 kDa. As well, both behave similarly on ion exchange and gel filtration columns, but can be finely separated by reversed-phase high performance liquid chromatography (HPLC) due to slight differences in amino acid composition and location [13]. Another well-characterized variant of this isoform is wfl-AFP9 (renamed from AFP9), which contains an additional full repeat motif and has a molecular mass of 4.3 kDa [17, 34, 35]. Functionally, it is more active than the other two components because of its larger ice-binding face, but is not considered to be a major variant due to its lower circulating concentration [35, 36]. There is also DNA evidence of two five-repeat liver AFPs, but no corresponding protein has been isolated to date [17, 21].

Putative cis-regulatory sequences have been identified for the three well-characterized variants mentioned above via sequencing and bioinformatic analysis [31, 34]. There is a CAAT box, a cis-acting promoter-proximal sequence common to many eukaryotic genes, beginning 84 nucleotides upstream of the conserved transcriptional start site (position -84). There is also a TATA box, a core promoter sequence that usually lies between the CAAT box and the transcriptional start site [37], at position -32 [31, 38]. In the 3'-flanking DNA, there are polyadenylation sites, or sequences that indicate the end of a gene to transcriptional proteins, located at positions +1429, +1769 and +2087, as numbered in [38]. The first is used most commonly, but the latter two are also used in particular environmental conditions [38, 39]. Each gene consists of two exons separated by an intron approximately 500 bp in length. Exon 1 encodes the majority of the signal peptide, while exon 2 encodes the last four residues of the signal peptide and the rest of the proprotein [31]. Within the intron lies an enhancer, designated Element B, which is bound by liver-enriched transcriptional activators, such as CCAAT/enhancer binding protein α and a novel protein designated the antifreeze enhancer protein. Element B is conserved in the genes for wfl-AFP6 and wfl-AFP8 but not in the wfl-AFP9 gene [40].

Laboratory studies on wild-caught fish found that the key environmental cues for AFP gene expression are a combination of changes in photoperiod, temperature and hormone levels. Decreasing day lengths in autumn spur AFP production in anticipation of winter ice, via the reduction of growth hormone secreted from the pituitary [41-43]. Cold-specific stability of the AFP mRNA allows circulating concentrations of the liver isoform to reach 10 – 15 mg/mL at the height of winter [44, 45]. Come spring, higher water temperatures coupled with a rise in circulating concentrations of growth hormone are responsible for the disappearance of AFPs from the bloodstream via negative feedback at the transcriptional level [41, 42] and accelerated degradation of the protein, resulting in a several hundred-fold difference from winter levels [46].

1.1.2 Skin isoform

Alanine-rich peptides that caused non-colligative freezing point depression were isolated from the skin of the shorthorn sculpin (Myoxocephalus scorpius) [47]. However, the significance of this discovery was overlooked. Although later studies on the skin of the cunner (Tautogolabrus adspersus) confirmed the presence of TH activity in this tissue as well as its usefulness as a physical barrier to ice propagation [48, 49], it was a study in 1992 that revealed the production of AFP mRNAs in many non-liver tissues in the winter flounder [50]. Sequencing of cDNAs made from pooled extracts of skin, dorsal fin and scales revealed nine Ala-rich peptides that were more similar to each other than to the already-characterized liver AFPs [51], and these were classified as skin-type AFPs.

Around the same time, two clones with high sequence identity and similar gene organization to the known liver AFP genes were isolated from a winter flounder genomic library [31]. However, due to the presence of in-frame stop codons in the putative pro-sequence and the lack of a classical TATA box in the presumed promoter region, these sequences, wfs-F2 and wfs-11-3 (winter flounder skin AFP, renamed from F2 and 11-3, respectively), were classified as pseudogenes [52]. Later alignments with the skin isoform cDNAs confirmed that they were genomic sequences of skin AFPs and further analysis uncovered a putative TFIID binding motif upstream of the transcriptional start site [51].

Many parallels can be drawn between the liver- and skin-type AFPs, although the latter are less well-characterized. Like the liver AFPs, the skin-type AFPs consist of approximately 60% Ala, which are preferentially coded by GCC. They are thought to be encoded by a gene family of 30 – 40 members [51], though this remains to be confirmed, as the skin probe used in these experiments contained a 91 bp portion of the 3' untranslated region (UTR) that is conserved among all isoforms (36.8% of the total probe length). For comparison, the liver probe used contained a 24 bp portion of the 3' UTR, which was 7.5% of the total length [16, 51]. This, coupled with the low stringency of washing [50], likely resulted in some degree of cross-hybridization of the skin probe to the liver genes, but not vice versa. In any case, the 11-amino acid repeat motif is conserved among the skin variants, but they contain only two full repeats, in contrast to the 3 – 4 full repeats in the liver isoforms. The nature of crystal shaping induced by the winter flounder skin AFPs and their ice hemisphere etch patterns have not been reported. The skin AFPs have not been recombinantly expressed to date and no crystal structure is available, but circular dichroism and modeling suggest that they are also entirely alpha-helical [53]. Many of the residues involved in ice-binding in the liver isoform are also conserved, indicating a similar mechanism of action and possibly a similar binding plane [27, 51].
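The probe comparison above can be made concrete by back-calculating the total probe lengths implied by the quoted UTR fractions. The Python sketch below does only that arithmetic; the function name is hypothetical, and the resulting lengths are derived here from the stated percentages rather than taken from the cited studies.

```python
def implied_probe_length(utr_bp: float, utr_percent: float) -> float:
    """Total probe length (bp) implied by a UTR segment of utr_bp that makes up
    utr_percent of the probe."""
    return utr_bp / (utr_percent / 100.0)


# Values quoted in the text: conserved 3' UTR segment and its share of each probe.
skin_probe_bp = implied_probe_length(91, 36.8)
liver_probe_bp = implied_probe_length(24, 7.5)

print(f"Skin probe:  about {skin_probe_bp:.0f} bp, of which 91 bp is conserved 3' UTR")
print(f"Liver probe: about {liver_probe_bp:.0f} bp, of which 24 bp is conserved 3' UTR")
```

On these figures more than a third of the skin probe could hybridize to any isoform, whereas under a tenth of the liver probe could, which is consistent with the asymmetry in cross-hybridization described above.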

Despite the many similarities between the skin and liver isoforms, significant differences exist as well. Although the skin AFP genes also consist of two exons separated by an intron, the first exon contains only untranslated sequence [51]. These proteins are produced with no signal peptide or pro-sequence, and relative to the liver isoform, they possess both a unique N-terminal sequence and a variable C-terminus [51]. Mass spectrometry indicates that the N-terminal sequence, which is common among the flounder skin AFP variants, is acetylated [51], but the role of this moiety is unknown, as it appears to have no effect on antifreeze activity [54]. Studies investigating the significance of the different N and C termini between the two isoforms revealed that those of the skin isoforms negatively affect the thermal stability and helical content of the proteins, and contribute to the 50% decrease in relative antifreeze activity [51, 55].

As with the liver isoforms, regulation of skin AFP expression has not been completely characterized. The lack of a signal peptide indicates an intracellular role, but identification of AFP in the interstitial space of gill tissue [56] suggests that secretion may occur by a non-classical pathway [57, 58]. A region similar to Element B, the liver AFP enhancer, has been found in the intron of the skin variants; however, it contains a dinucleotide insertion that allows transactivation activity but abolishes liver-specificity [59]. This enhancer has been named Element S and likely contributes to the broad expression of the skin AFPs, which is especially high in the skin, scales, fins and gills [50]. Transcripts in the gills have been localized to the pavement cells and the surrounding interstitial space, and appear to be temporally associated with the thickening of the epidermis during metamorphosis of juvenile flounder [56, 60]. This occurs in early summer in the northern range of the winter flounder [61]. Unlike the liver isoform, the skin isoforms undergo only a 5- to 10-fold difference between winter and summer expression levels, and this does not appear to be affected by growth hormone [46].

1.1.3 Hyperactive isoform

Given the gene dosage of the liver isoform in the winter flounder genome, mid-winter circulating AFP concentrations can reach 10 – 15 mg/mL [44]. This translates to 0.7 C° of TH activity. Other blood solutes colligatively add another 0.8 C° for a total freezing point depression of 1.5 C° [62]. Although icy seawater can reach -1.9 °C, winter flounder thrive. To resolve this discrepancy, winter flounder plasma was re-examined, and more than 2 C° of TH activity was detected [63]. The plasma also produced spindle-shaped crystals rather than the hexagonal bipyramidal crystals typically associated with type I AFPs. Careful purification uncovered a 16.7-kDa protein that produced 1.1 C° of TH activity at a concentration of 0.1 mg/mL, a value far beyond that of other known fish AFPs. Dubbed the hyperactive AFP, it is present at only 0.2 mg/mL in the blood and is irreversibly inactivated under the conditions traditionally used to isolate type I AFPs, i.e., room temperature and low pH [63]. This is likely why it had escaped detection for over 30 years [8].
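The discrepancy described above is a matter of simple addition, which the following Python sketch works through. It assumes that the TH and colligative contributions add and that the depression is measured from 0 °C; the function names are hypothetical, and the value of 2.0 C° is used only as a lower bound for the "more than 2 C°" reported for re-examined plasma.

```python
SEAWATER_FREEZING_POINT_C = -1.9  # degrees Celsius, as quoted in the text


def plasma_freezing_point(th_c_deg: float, colligative_c_deg: float) -> float:
    """Freezing point (degrees Celsius) assuming both depressions add from 0."""
    return -(th_c_deg + colligative_c_deg)


def is_protected(freezing_point_c: float) -> bool:
    """True if body fluids stay liquid at the freezing point of seawater."""
    return freezing_point_c <= SEAWATER_FREEZING_POINT_C


# Liver isoform alone: 0.7 from TH plus 0.8 from other blood solutes.
liver_only = plasma_freezing_point(0.7, 0.8)
# Re-examined plasma: at least 2.0 from TH (assumed lower bound) plus 0.8.
with_hyperactive = plasma_freezing_point(2.0, 0.8)

for label, fp in (("liver isoform only", liver_only), ("re-examined plasma", with_hyperactive)):
    print(f"{label}: freezes at {fp:.1f} degrees C, protected: {is_protected(fp)}")
```

Under these assumptions the liver isoform alone leaves a shortfall of about 0.4 C° relative to icy seawater, whereas the re-examined plasma would cover it.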

This isoform is similar to the liver and skin isoforms only in its high Ala content and alpha-helicity [64]. Alanine again makes up approximately 60% of the mature protein, which is entirely alpha-helical, in spite of the much larger size and an axial ratio of 18:1 [64]; a to-scale size comparison of the three isoforms is shown in Figure 1. The structure of this isoform has yet to be solved, but suitable residues are available at both termini for cap structure formation [65]. In addition, it exists in solution as a homodimer, which may provide additional structural support [64]. It has been suggested that the two straight helices form an anti-parallel dimer, such that both polypeptide chains can present their ice-binding faces to the ice [65]. This seems plausible, as it has been shown that AFPs with a larger ice-binding face are more active [35], but there are likely other contributing factors, given the differences in crystal shaping and potency. Activity is irreversibly lost at 18 – 20 °C, even though much of the secondary structure is retained, suggesting that activity may depend on quaternary structure [64]. The ice-binding plane has not been determined, but the spindle-shaped crystals burst along their a-axes unlike the hexagonal bipyramidal crystals shaped by the liver AFPs, which burst along the c-axis, implying different or multiple binding planes [64].

Previously, a DNA sequence containing a long Ala-rich open reading frame was isolated and named 5a (referred to here as wfh-5a). Neither the corresponding protein nor its mRNA was ever detected, and it was classified as a pseudogene [52]. Since the discovery and characterization of the hyperactive isoform, comparisons with the predicted wfh-5a gene product showed that the N-terminal sequence, amino acid composition and size of the two proteins are quite similar, but different enough that the detected hyperactive isoform was termed “5a-like” [64]. It appears that the wfh-5a gene is a variant of those that express the hyperactive AFP but may be silent.

Analysis of the hyperactive AFP cDNA revealed that it is produced with a signal sequence but no propeptide. Again, codon usage for Ala is heavily biased towards GCC, as in the other type I isoforms [65]. DNA alignments of the hyperactive isoform with the liver and skin isoforms show > 90% sequence identity throughout the 5' UTR, signal peptide and 3' UTR, but alignment over the mature peptide is problematic due to the repetitiveness of the sequence and their vastly different lengths [65]. There appears to be no post-translational modification to the protein after secretion, and the 11-amino acid repeat motif is less well-defined than in the liver and skin isoforms. Rather than the continuous hydrophobic ice-binding surface conserved in the smaller isoforms, four disparate regions that resemble the smaller isoforms with respect to their ice-binding sites were identified upon close scrutiny of the hyperactive sequence [65]. The physiological source, tissue distribution, and regulation of this isoform are still under investigation.

1.2 Type I AFP in the Pleuronectidae family

The winter flounder belongs to the Pleuronectidae family of right-eyed flounders, or pleuronectid fish. The evolutionary history of these fish is unclear. In an attempt to clarify the interrelationships within the family, Sakamoto examined 77 species using 78 internal and external morphological traits, and a phylogeny was established [66]. Eighteen species were selected from this tree and assessed for the presence of AFP genes on a genomic Southern blot (Figure 2). Arranged in order of relatedness, the genomes were probed with a representative AFP cDNA, the main winter flounder liver variant, wfl-AFP6 (S. Gauthier, personal communication). Strong hybridization signals were observed in several fishes, centered on the winter flounder (Figure 3), and some were investigated further, specifically those of the yellowtail flounder (Limanda ferruginea) and the American plaice (Hippoglossoides platessoides).


Figure 2. Morphological phylogeny of right-eyed flounders

Interrelationships between 77 species of Pleuronectidae were determined by numerical phenetics and the application of Gower's general similarity coefficient, based on 78 internal and external morphological characters. For the sake of simplicity, only 54 species are shown here.

Letter/number combinations denote various species, as listed below. Adapted from [66] with assistance from J. O'Donnell.

A1 - Arrowtooth flounder
A2 - Kamchatka flounder
A3 - Greenland halibut
A4 - Atlantic halibut
A5 - Pacific halibut
A6 -
A7 - Petrale sole
A8 - Shotted halibut
A9 - Spotted halibut
B1 - Barfin flounder
B2 -
B3 - American plaice
B4 - Flathead sole
B5 - Flathead flounder
B6 - Bering flounder
B7 - Pointhead flounder
B8 - Scale-head plaice
B9 - Rizuken flounder
C1 -
C2 - English sole
C3 - Rock sole
C4 - Dusky sole
C5 - Dab
C6 - Yellowfin sole
C7 - Sakhalin sole
C8 - Yellowtail flounder
C9 - Sand flounder
D1 - Longhead dab
D2 - Littlemouth flounder
D3 - Marbled flounder
D4 - Cresthead flounder
D5 - Winter flounder
D6 - Alaska plaice
D7 - European plaice
D8 - obscurus
D9 - Arctic flounder
E1 - Far Eastern smooth flounder
E2 - Flounder
E3 - Starry flounder
E4 - Diamond
E5 - Curlfin sole
E6 - Ridge-eyed flounder
E7 - Hornyhead turbot
E8 - C-O sole
E9 - Spotted turbot
F1 - Ocellated turbot
F2 - Witch flounder
F3 - Blackfin flounder
F4 - Rex sole
F5 - Stone flounder
F6 - Lemon sole
F7 - Slime flounder
F8 - Dover sole
F9 - Willowy flounder


Figure 3. Genomic Southern blot of selected members of the Pleuronectidae family probed with the winter flounder liver AFP gene sequence

DNA from a variety of species was digested with SacI and probed with a winter flounder cDNA of the liver isoform (pKEN C17) under highly stringent conditions: 0.1x SSC, 1% SDS, 60 °C, 30 minutes (S. Gauthier, personal communication). Size markers (kb) on the left are from a HindIII digest of lambda DNA. The letter/number combination above each lane denotes the species from which the DNA was isolated (see below), and samples are arranged on the blot according to Sakamoto's morphological phylogeny, shown in Figure 2 [66].

A3 - Greenland halibut
A4 - Atlantic halibut
C2 - English sole
C3 - Rock sole
B3 - American plaice
C5 - Dab
C6 - Yellowfin sole
C8 - Yellowtail flounder
D5 - Winter flounder
D6 - Alaska plaice
D7 - European plaice
E2 - Flounder
E3 - Starry flounder
E5 - Curlfin sole
E7 - Hornyhead turbot
E8 - C-O sole
F2 - Witch flounder
F6 - Lemon sole

Lane labels as printed on the blot: A3 A4 C2 C3 B3 C5 C6 C8 D5 D6 D7 E2 E3 E6 E7 E8 F2 F6. Size markers (kb): 23.1, 9.4, 6.6, 4.4, 2.3, 2.0, 0.5.

1.2.1 Yellowtail flounder

Freezing point depression was observed early on in the serum of the yellowtail flounder [67], a close relative of the winter flounder [66, 68, 69]. The protein responsible was found to be seasonally produced, Ala-rich and homologous to the winter flounder liver AFPs [21, 62]. It was also predicted to be an amphipathic alpha helix, but with four 11-amino acid repeats rather than three [21]. Protein sequence alignments indicate that the ice-binding residues are conserved between the two species. While a larger ice-binding surface has been correlated with increased efficacy [35], TH activity of the purified yellowtail AFP is 20% less than that of the winter flounder liver AFPs on a mass basis, though they are similar on a molar basis [21]. The yellowtail also has a lower circulating AFP concentration (4 mg/mL versus 10 mg/mL) and a lower AFP gene dosage (~10 copies versus 30) compared to the winter flounder [62].
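The relationship between the mass-based and molar-based comparisons can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that molecular mass scales with the number of 11-amino acid repeats (three in the winter flounder liver AFP, four in the yellowtail AFP) and that the two proteins are equally active per mole; neither assumption is taken from the cited measurements.

```python
# Rough consistency check: equal activity per mole implies lower activity
# per unit mass for the longer protein.
# Assumption (not from the cited data): molecular mass is proportional to
# the number of 11-amino acid repeats.
WINTER_FLOUNDER_REPEATS = 3
YELLOWTAIL_REPEATS = 4


def relative_mass_activity(repeats_reference: int, repeats_other: int) -> float:
    """Activity per unit mass of the other protein relative to the reference,
    assuming both proteins have the same activity per mole."""
    if repeats_reference <= 0 or repeats_other <= 0:
        raise ValueError("repeat counts must be positive")
    return repeats_reference / repeats_other


ratio = relative_mass_activity(WINTER_FLOUNDER_REPEATS, YELLOWTAIL_REPEATS)
reduction_percent = (1 - ratio) * 100
print(f"Predicted reduction on a mass basis: {reduction_percent:.0f}%")
```

Under these assumptions the predicted reduction is about 25%, which is of the same order as the reported 20% difference on a mass basis.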

Southern blotting and restriction analysis of genomic clones indicate that the yellowtail AFP genes are not tandemly arrayed, suggesting that the amplification of the winter flounder genes occurred after the two species diverged [62]. Both fish live in the same geographical range along the Atlantic coast of North America, but the winter flounder winters inshore at depths shallower than 25 m [61] while the yellowtail lives farther offshore in winter, at depths of 60 – 100 m [70]. To explain the reduced antifreeze activity, it was suggested that deeper waters are warmer and more resistant to the fluctuations in air temperature that cause ice crystal formation at the surface, and that the differences in activity and gene dosage are due to decreased selective pressure for AFP [62]. The yellowtail flounder has not been assessed for the presence of skin AFPs to date. However, after the discovery of the hyperactive AFP in winter flounder, meticulous re-evaluation of yellowtail serum found evidence of a 16.2 kDa Ala-rich protein that had an N-terminal sequence similar to the hyperactive winter flounder AFP, produced spindle-shaped ice crystals, and was by itself fully capable of protecting the fish from freezing [71]. Thus, this fish also produces both the small and large plasma AFPs.

1.2.2 American plaice

The American plaice lives in the north Atlantic at average depths of 50 – 150 m [72], similar to the depth range of the yellowtail flounder. It was considered to be relatively distinct from the yellowtail and winter flounders morphologically [66, 69], but a more recent molecular phylogeny using ribosomal mtDNA markers suggested that the three species were closely related [68]. Various assessments of gene dosage using a cDNA probe for the winter flounder liver AFP have produced disparate results. Genomic Southern blots have shown a similar degree of hybridization in American plaice and yellowtail flounder DNA [62], while another showed relatively few AFP signals in the American plaice genome (Figure 3). The Southern blot in Figure 3 is considered to be more reliable because several exposures were done during a series of increasingly stringent washes (P. Davies, personal communication). One interpretation of this latter blot is that the strong signals in species closely related to the winter flounder (lanes C8 – E3) correspond to the liver isoforms, which underwent amplification in their common ancestor. Supporting this idea, small AFPs have never been isolated from American plaice plasma. However, as with gene dosage assessment, results from antifreeze activity assays of American plaice plasma have varied. In one study, mid-winter analysis for TH showed approximately 1.1 C° of activity, which is on par with that of the yellowtail flounder [73]. Another TH assay found 2 C° of freezing point depression at 0.4 mg/mL (P. Davies, personal communication), but the significance of this was not realized until the discovery of the hyperactive thermolabile AFP in winter flounder [63]. Subsequent re-analysis of American plaice plasma under carefully controlled conditions replicated a TH activity of 2 C° [71].

Full gene and protein sequences are not available for the American plaice hyperactive AFP, but it was deemed to be homologous to the corresponding winter flounder AFP based on similarities in N-terminal sequence, size, ice crystal shaping, activity, elution time, amino acid content and secondary structure [71]. It is slightly more thermolabile, irreversibly losing activity at 9 °C rather than 20 °C [71], and although its quaternary structure has not been determined, it is also assumed to form a homodimer in solution (P. Davies, personal communication). The fact that the American plaice produces the hyperactive AFP without a smaller liver isoform is noteworthy, and has implications for determining the evolutionary path of the type I AFPs.

1.2.3 Other Pleuronectidae

The AFPs of other right-eyed flounders have not been as fully characterized as those of the yellowtail flounder and American plaice. A genomic Southern blot of smooth flounder ( putnami ) DNA produced a strong signal, indicating a large gene family [62], but the signal was not characterized further. The Southern blot arranged by phylogenetic proximity also showed evidence of multiple AFP genes in the genomes of the flounder (Platichthys flesus), European plaice (Pleuronectes platessa), and starry flounder (Platichthys stellatus) (Figure 3). These genes are thought to be homologous to those of the winter flounder due to the strong hybridization signal observed with a winter flounder probe, but no other studies have been published on these species. The same blot also showed relatively few hybridization signals for the Alaska plaice (Pleuronectes quadrituberculatus), but only limited information is available on the expressed AFPs; the sequence of one variant from this fish has been published without substantiation [74].


The available data, summarized above, indicate that the flounder AFPs fit within the accepted definition of a type I AFP and that differential amplification of AFP genes has occurred among pleuronectids. However, a comprehensive evaluation is not possible at this time because adequate sequence and structural data are not available for all fish. Similarly, the phylogenetic distribution of these proteins among Pleuronectidae does not indicate a clear ancestor gene or species and thus, the evolution of these proteins also requires further study.

1.3 Selective pressure: Shaping evolution

“Survival of the fittest” is one of the best-known tenets of Darwin’s theory of natural selection, whereby fitness is determined by the environment. The ability to accommodate environmental pressures, such as low nutrient availability, non-ideal temperature or toxicant exposure, determines reproductive success, and successful individuals are more likely to pass beneficial genes on to future generations. For , the simplest method of avoiding these stressors is to migrate to more amenable surroundings, but if the stress occurs over a large temporal or spatial scale, migration may not be an option.

Another key facet of Darwin’s theory is that there exists natural variation between individuals that allows some to better cope with stress. For example, Mouches et al. showed that mosquitoes chronically exposed to organophosphate insecticides produced more of a particular esterase, and closer examination revealed an esterase gene dosage 250 times higher in resistant strains than in sensitive strains [75]. The mosquitoes that can metabolize and thus detoxify insecticides will survive longer and have more offspring. One way genetic amplification can occur is via gene duplication. Replicates can arise during meiotic recombination due to homologous but unequal crossing over, and repetitive regions provide substrate for further duplications and deletions [76]. Such adaptation was also observed in mouse cell lines by Alt et al., where stepwise methotrexate treatment induced amplification of dihydrofolate reductase genes [77]. Scott et al. postulated that an acute selective pressure, namely exposure to ice, induced the tandem amplification of the liver AFP genes in the winter flounder genome [78]. This claim is best evaluated in the context of flounder evolution.

The earliest fossil evidence of teleost fish is dated to the middle of the Triassic period 235 million years ago (mya) [79], though molecular data place teleost origin earlier, in the Paleozoic era [80]. In either case, teleost morphology had attained approximately modern form by the end of the Eocene epoch, 40 mya [81]. Otolith fossils place true Pleuronectiformes in the early Eocene, 53 – 57 mya, while the oldest full skeleton is dated to 45 mya [82]. Recently, a fossil representing an evolutionary intermediate between flatfish and their symmetrical ancestors has also been dated to the early Eocene, 50 mya [83, 84]. Thus, the majority of flounder evolution occurred in warm climates where polar temperatures were at least 6 °C warmer than present and peaked at 24 °C in the Arctic Ocean at the Paleocene-Eocene thermal maximum, 55 mya [85, 86].

Sedimentary evidence indicates that the earth began transitioning to a bipolar icehouse, or an environment in which ice sheets are present at both poles, in the middle Eocene, 45 mya [87], thus presenting modern fishes with their first major cold challenge. Based on the observations that not all close relatives of the winter flounder possess equivalent AFP gene dosage [62], and that eukaryotes are under little pressure to purge non-functional DNA, the tandem array of AFP genes in the winter flounder is thought to be a recent evolutionary novelty. Among the various pleuronectid species, both AFP gene amplification and serum AFP activity appear to correlate with habitat depth and the degree of exposure to ice, implying that the genes were amplified according to need [62, 73]. Similarly, differential amplification of AFP genes has been observed not only between species, but also in geographically distinct populations of the same species [88, 89]. As part of a search for other, better examples of intraspecific variation in gene dosage, and in order to test the theory that AFP gene dosage amplified according to need, we have examined the starry flounder.

The starry flounder was chosen because it is a close relative of the winter flounder [66, 68] and because it showed a strong signal on a preliminary Southern blot probed with winter flounder liver AFP (Figure 3). It spawns in depths of less than 30 m in December and early January, and is common at low salinities [90], thus exposing itself to both icy conditions and higher freezing temperatures. The starry flounder is non-migratory and its habitat encompasses a continuous stretch of coast on both sides of the north Pacific Ocean between 33° and 73° N latitude (Figure 4) [90, 91]. We hypothesize that if AFPs evolved in response to sudden chronic ice exposure, more northerly flounder populations should have a higher AFP gene dosage, because they would be exposed to glaciers descending from the Arctic more severely and over a longer time. When the genomic DNA of three starry flounder from different latitudes (Alaska 61° N, British Columbia 49° N and California 37° N) was probed with the cDNA from a winter flounder liver AFP, a graded signal was observed in which intensity correlated positively with latitude (Figure 5). However, because these signals were not further characterized and the blot contained DNA from only three fish, limited conclusions could be drawn at the population or species levels.


Figure 4. Locations from which starry flounder samples were obtained

The black line along the coastline denotes the geographical range of the starry flounder [90, 91] with the sampling sites (below) indicated by a code within a circle.

(1) Bering Strait (AK, liver)
(2a) Port Moller, Aleutian Peninsula (AK, muscle)
(2b) Bering Sea, Aleutian Peninsula (AK, muscle)
(3a) Belkofski Bay, Deer Island (AK, muscle)
(3b) Sitkalidak Strait, Kodiak Island (AK, muscle)
(4) Hecate Strait, Queen Charlotte Islands (BC, liver)
(5) English Bay, Vancouver (BC, whole fish)
(6) San Francisco Bay, San Francisco (CA, liver)


Figure 5. Genomic Southern blot of starry flounders from different latitudes probed with the winter flounder liver AFP gene sequence

Genomic DNA was isolated from individual starry flounders from San Francisco, California (CA), Vancouver, British Columbia (BC) and the Bering Strait, off the northwestern coast of Alaska (AK), sites 6, 5 and 1, respectively, in Figure 4. DNA was digested with EcoRI (E) or SacI (S) and probed with the wfl-AFP6 cDNA pKEN C17 [19]. Size markers are not available for this blot; unpublished data courtesy of Dr. Gary K. Scott.

Lane labels as printed on the blot: CA (E, S), BC (E, S), AK (E, S).

1.4 Goals and Objectives

Type I AFPs are key proteins involved in protecting pleuronectid fish from ice crystal growth into the body. Evidence suggests that they are produced by several closely-related species of flatfish, but characterization of these proteins in species other than the winter flounder is limited or lacking, especially at the genomic level. In addition, it is currently accepted that the differential amplification of AFP genes is a function of selective pressure, but this has not been demonstrated definitively within a single population of AFP-producing fish. With the aim of characterizing the AFP genes in the starry flounder, the following strategies were applied:

1. To replicate and extend the hybridization signals observed in preliminary Southern blots and to correlate signal intensity with latitude, high molecular weight DNA was isolated from starry flounder tissues sampled at various locations and latitudes (Figure 4) and Southern blotted.

2. To characterize the starry flounder AFP gene family and its organization, the aforementioned Southern blots and a genomic lambda library of starry flounder DNA were probed with the cDNAs of the different type I AFP isoforms. The banding patterns generated on the Southern blots and the sequencing and restriction analysis of positive phage were used to infer the organization of the type I AFP gene family.

3. To investigate evolutionary relationships among type I AFP isoforms, PCR was used to sample the sequence variation of each AFP isoform in the starry flounder genome, and the starry flounder AFP genes will be compared to those of other pleuronectids.


Chapter 2 Materials and Methods

2.1 Isolation of genomic DNA

Starry flounder tissue was collected by others at various sites on the north and east Pacific coasts (Figure 4). Tissue was stored at -80 °C and cooled in liquid nitrogen prior to DNA extraction. DNA was prepared from 1.5 – 3 g tissue according to the method of Blin and Stafford [92], as adapted by Scott et al. [32] with other minor modifications. Briefly, tissue and frozen DNA extraction buffer containing 100 µg/mL proteinase K were ground together to a fine powder under liquid nitrogen with a pestle and mortar, then incubated at 63 °C overnight with additional proteinase K. This crude digest was deproteinized by stepwise extractions with equal volumes of buffered phenol, 1:1 v/v phenol:chloroform, and chloroform, prior to at least 12 h of dialysis against TE. Following a 3 h incubation at 37 °C with DNase-free RNase A prepared according to Sambrook [93], a second proteinase K treatment was performed (100 µg/mL with 0.5% SDS) at 50 °C overnight. Phenol-chloroform extractions and dialysis were repeated as described above. DNA quality was assessed by agarose gel electrophoresis and quantity was measured by either A260 or densitometric comparison with a sample of known concentration following agarose gel electrophoresis on a Bio-Rad Gel Doc 2000 imaging system coupled to Quantity One software (version 4.1.0). Preparations that were too dilute were concentrated by sec-butanol extraction.
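Quantification by A260 rests on a standard conversion factor, which can be sketched as follows. The example uses the widely used approximation that an A260 of 1.0 corresponds to roughly 50 µg/mL of double-stranded DNA at a 1 cm path length; the reading and dilution factor shown are hypothetical and are not values from this work.

```python
# Standard approximation: A260 of 1.0 ~ 50 ug/mL double-stranded DNA
# (1 cm path length).
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate the dsDNA concentration of the undiluted stock from an A260 reading."""
    if a260 < 0:
        raise ValueError("absorbance cannot be negative")
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


# Hypothetical reading: a 1-in-20 dilution of the stock gives A260 = 0.25.
example_concentration = dsdna_concentration_ug_per_ml(0.25, dilution_factor=20)
print(f"Estimated stock concentration: {example_concentration:.0f} ug/mL")
```

Absorbance alone does not distinguish intact from degraded DNA, which is one reason the gel-based quality check described above remains necessary.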


2.2 Southern blotting

Genomic DNA (30 µg) was digested to completion with each of the following enzymes: EcoRI, SacI or BamHI (New England Biolabs). Digestions contained 10 times the amount of enzyme required to cut 1 µg DNA in 1 h and the appropriate buffer, both as specified by the manufacturer, and 100 µg/mL bovine serum albumin. Reactions were incubated at 37 °C without shaking for 48 h. Small aliquots were taken from each reaction at 20 h and 32 h, prior to boosting the digestions with the same amount of enzyme. Electrophoresis of these aliquots against the starting and final products was used to assess the time course of the digestion. Digested DNA was recovered into a smaller volume by ethanol precipitation and resuspension of the pellet in 60 µL 1x TE. DNA samples (10 µg/lane) were electrophoresed on a 0.8% agarose gel at 20 V or less and Southern blotted onto nylon (Zeta-probe GT, Bio-Rad Laboratories) according to the method of Sambrook [93]. Blots were pre-hybridized and washed as described by Davies et al. [31]. Hybridization probes were labelled with α-32P dCTP (Perkin Elmer) using the Random Primers DNA Labelling System (Invitrogen). The 287 bp cDNA for the liver isoform contained the signal peptide, the pro-sequence and mature peptide in their entirety, along with small portions of the 5' UTR (conserved among liver and hyperactive isoforms) and the 3' UTR (conserved among all isoforms). The probe consisted of nucleotides 27 – 313 of component A, a previously published winter flounder cDNA, as numbered by Pickett et al. [16]. The probe for the winter flounder hyperactive isoform comprised nucleotides 103 – 705, as numbered according to Graham et al. [65], and coded for a portion of the mature peptide only. The starry flounder-specific probes were PCR products isolated from genomic DNA. The hyperactive probe was 261 bp long and encoded a portion of the mature peptide. The starry flounder skin probe was 275 bp long and encoded the entire mature peptide as well as the last 24 bp of the intron. All final washes were done in 0.1x SSC with 0.5 – 1% SDS at 64 °C for 30 min. Blots were exposed to Kodak XAR-5 film at -80 °C for periods ranging from 16 – 28 h.
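The gel loading volume implied by the digestion and resuspension figures above can be worked out directly. This is a minimal sketch that assumes complete recovery of the 30 µg of digested DNA after ethanol precipitation and ignores the small aliquots removed during the time course; both are simplifying assumptions.

```python
# Quantities stated in the text.
DIGESTED_DNA_UG = 30.0
RESUSPENSION_VOLUME_UL = 60.0
DNA_PER_LANE_UG = 10.0

# Assumes 100% recovery after ethanol precipitation (a simplification).
concentration_ug_per_ul = DIGESTED_DNA_UG / RESUSPENSION_VOLUME_UL
volume_per_lane_ul = DNA_PER_LANE_UG / concentration_ug_per_ul
lanes_per_digest = int(DIGESTED_DNA_UG // DNA_PER_LANE_UG)

print(f"Concentration: {concentration_ug_per_ul} ug/uL")
print(f"Load per lane: {volume_per_lane_ul} uL")
print(f"Lanes available per digest: {lanes_per_digest}")
```

On these assumptions each digest yields enough DNA for three 20 µL lanes; real recoveries would be somewhat lower.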

2.3 Library preparation and amplification

A genomic library in lambda phage was prepared from frozen starry flounder (Queen Charlotte Islands, QCI) liver tissue by Bio S&T (Montreal, QC). Briefly, partially digested Sau3AI fragments (20 kb average length) were ligated into the BamHI site of a Lambda DASH® II vector (Stratagene). The DNA was packaged as a primary library containing 4.5 x 10^6 plaque forming units (pfu). Twenty percent of the library was amplified according to the manufacturer’s instructions over twenty plates and each stock was stored separately in SM buffer over 0.3% chloroform at 4 °C. All amplifications and screens were done in E. coli XL1-Blue MRA (P2 lysogen) cells. Frozen permanent stocks were made in 7% DMSO and stored at -80 °C; the amplified portions were frozen in two 2 mL aliquots per plate and the remainder of the primary library was frozen in seven 50 µL aliquots.
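The size of the primary library can be put in context with the standard Clarke and Carbon estimate of how many clones are needed to include a given sequence with probability P, N = ln(1 - P) / ln(1 - f), where f is the insert length as a fraction of the genome. The sketch below uses the library figures given above; the genome size is a hypothetical placeholder, since no starry flounder genome size is stated here.

```python
import math

# Figures stated in the text.
PRIMARY_LIBRARY_PFU = 4.5e6
AVERAGE_INSERT_BP = 20_000
AMPLIFIED_FRACTION = 0.20

# Hypothetical placeholder, NOT a measured starry flounder genome size.
ASSUMED_GENOME_SIZE_BP = 7.0e8


def clones_for_probability(p: float, insert_bp: float, genome_bp: float) -> int:
    """Clarke-Carbon estimate: clones needed so that any given sequence
    is present in the library with probability p."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - p) / math.log(1 - fraction))


total_cloned_bp = PRIMARY_LIBRARY_PFU * AVERAGE_INSERT_BP
amplified_pfu = PRIMARY_LIBRARY_PFU * AMPLIFIED_FRACTION
genome_equivalents = total_cloned_bp / ASSUMED_GENOME_SIZE_BP
clones_needed_99 = clones_for_probability(0.99, AVERAGE_INSERT_BP, ASSUMED_GENOME_SIZE_BP)

print(f"Total cloned DNA: {total_cloned_bp:.1e} bp")
print(f"Amplified portion: {amplified_pfu:.1e} pfu")
print(f"Genome equivalents (assumed genome size): {genome_equivalents:.0f}")
print(f"Clones needed for 99% representation: {clones_needed_99}")
```

With any plausible genome size the primary library exceeds the number of clones required by a wide margin; only the exact multiple depends on the assumed value.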

2.4 Library screening

Approximately 5 x 10^4 pfu were plated onto 150 mm NZY plates during the primary screen; plaques picked from the primary screen were stored in 1 mL SM buffer with 4% v/v chloroform without further analysis. All other screens were performed on 100 mm plates. Plaque lifts were carried out in duplicate onto nylon Colony/Plaque Screen Hybridization Transfer Membranes (Perkin Elmer) and fixed according to the manufacturer’s recommendations. Membranes were hybridized to the same winter flounder probes as used on the Southern blots. All washes were done at 60 – 63 °C with 1% SDS present in solution, and the final washes varied as follows. Filters probed with DNA for the liver isoform were washed with 1x SSC for 30 min (primary screen), 4x SSC for 15 min (secondary screen), 4x SSC for 30 min (tertiary screen), and 1x SSC for 20 min (quaternary screen). Filters probed with DNA for the hyperactive isoform were washed with 0.2x SSC for 60 min (primary screen), 1x SSC for 45 min (secondary screen), 4x SSC for 30 min (tertiary screen), and 1x SSC for 20 min (quaternary screen).

2.5 Phage isolation

Plaque-purified phage were amplified by the plate lysate method [93], in which 10^7, 10^8 and 10^9 pfu were incubated with separate 500 mL E. coli cultures. Cultures that appeared to be sufficiently lysed, as determined by their optical properties and the presence of visible cell debris, were spun at 6175 x g at 4 °C for 10 min to remove debris and unlysed cells. Following addition of PEG 8000 (50 g) to precipitate the phage, aggregates were pelleted by centrifugation at the same relative centrifugal force and temperature for 30 min. The pellet was resuspended in 10 mL of SM buffer, and the PEG was extracted using an equal volume of chloroform. Following centrifugation at 3000 x g for 15 min at 4 °C, CsCl was added to the aqueous phase to a final concentration of 0.3 g/mL.

Phage were isolated from a CsCl step gradient made up of four different densities (1.1, 1.4, 1.5, 1.7 g/mL) layered to form distinct strata. The phage were centrifuged at 76220 x g for 3 h at 16 °C, then extracted via needle puncture of the centrifuge tube just below the phage band, located between the 1.5 and 1.4 g/mL layers. The band was slowly drawn into the syringe using a wide-bore needle (18-gauge), so as to minimize shearing forces. A second purification was performed in a CsCl equilibrium gradient (1.5 g/mL) in order to separate intact phage from empty phage heads and contaminating nucleic acids. The gradient was established at 171500 x g for 18+ h at 16 °C and the phage band was harvested as described above. DNA was extracted by treatment with formamide at 22 °C for 2 h and ethanol precipitation as described by Sambrook [93]. Pellets were collected by centrifugation, resuspended in 20 µL TE and stored at 4 °C.

2.6 DNA sequencing

Two phage DNAs isolated as described above were sequenced. The first phage DNA (insert #1) was sequenced by shotgun cloning followed by the 454 method and the second (insert #2) was sequenced by the 454 method (Genome Quebec) followed by PCR amplification and re-sequencing of selected portions. For shotgun sequencing, high molecular weight genomic DNA was randomly sheared and fragments approximately 2 kb in length were size-selected, blunt-ended with Klenow, and ligated into SmaI-blunted pUC19 vector. Approximately 72 transformants were randomly selected for plasmid DNA isolation followed by sequencing in a thermocycler with fluorescently-tagged dideoxynucleotides. Each insert was primed with the M13 forward or reverse primers; 48 were sequenced in both directions while 24 were done using only the forward primer. The resulting ~600 bp sequence reads were assembled using DNAMAN from the Lynnon Corporation, version 4.15.
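The scale of the shotgun data set follows from the clone counts given above. The sketch below tallies the reads and raw bases; the fold-coverage figure additionally assumes a 20 kb target (the library's average insert length, used here only as a stand-in because the length of insert #1 is not stated at this point) and ignores reads derived from the lambda vector arms.

```python
# Counts stated in the text.
CLONES_BOTH_DIRECTIONS = 48
CLONES_FORWARD_ONLY = 24
APPROX_READ_LENGTH_BP = 600

# Assumption: the library's average insert length stands in for the
# unknown length of insert #1; vector-derived reads are ignored.
ASSUMED_TARGET_LENGTH_BP = 20_000

total_reads = 2 * CLONES_BOTH_DIRECTIONS + CLONES_FORWARD_ONLY
total_bases = total_reads * APPROX_READ_LENGTH_BP
approx_fold_coverage = total_bases / ASSUMED_TARGET_LENGTH_BP

print(f"Reads: {total_reads}")
print(f"Raw bases: {total_bases}")
print(f"Approximate fold coverage (assumed target): {approx_fold_coverage:.1f}x")
```

A coverage of only a few fold leaves gaps in a random shotgun assembly, which is consistent with the shotgun reads being supplemented by 454 sequencing.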

In the 454 method, genomic DNA was randomly sheared via hydrostatic pressure into 300 – 800 bp fragments. These were then blunt-ended and ligated to two different double-stranded adaptor sequences for selection purposes. The double-stranded fragments were melted, and single-stranded fragments containing both adaptors were selected and fixed onto proprietary “DNA Capture Beads” in an optimized molar ratio, such that each bead bound no more than one fragment. Each fragment was amplified on its respective bead via oil-emulsion PCR and denatured, leaving single-stranded templates attached to the beads. Sequencing reactions took place in a PicoTitrePlate device™, where each bead was incubated with sulfurylase, luciferase and all reagents required for polymerization, except the dNTPs. The plate was sequentially bathed in buffer solutions containing one of dCTP, dGTP, dATP or dTTP, which were incorporated when the appropriate template nucleotide was available. The addition of a single dNTP generated a light signal that was detected by a charge-coupled device camera in the sequencer; the number of nucleotides added at a given time to the nascent chain was directly proportional to the strength of the light signal. Reads were short, roughly 100 – 300 bases long, but the 400000 parallel reactions sequenced per plate resulted in both high coverage and accuracy.
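The proportionality between light signal and the number of nucleotides incorporated per flow can be illustrated with a toy model. This is a simplified sketch, not the instrument's base-calling software: the cyclic flow order simply follows the order in which the dNTPs are listed above and is an assumption, signals are idealized as exact homopolymer lengths, and the function names are hypothetical.

```python
# Assumed cyclic flow order for illustration only (C, G, A, T).
ASSUMED_FLOW_ORDER = "CGAT"


def simulate_flowgram(nascent_strand: str, flow_order: str = ASSUMED_FLOW_ORDER):
    """Return (nucleotide, signal) pairs for synthesizing `nascent_strand`.

    The idealized signal for each flow equals the number of identical
    nucleotides incorporated in that flow (0 if none).
    """
    if set(nascent_strand) - set(flow_order):
        raise ValueError("sequence contains bases outside the flow order")
    signals = []
    position = 0
    flow = 0
    while position < len(nascent_strand):
        base = flow_order[flow % len(flow_order)]
        run_length = 0
        # Incorporate every consecutive matching base in a single flow.
        while position < len(nascent_strand) and nascent_strand[position] == base:
            run_length += 1
            position += 1
        signals.append((base, run_length))
        flow += 1
    return signals


def decode_flowgram(signals) -> str:
    """Convert (nucleotide, signal) pairs back to sequence by rounding
    each signal to the nearest whole number of bases."""
    return "".join(base * round(signal) for base, signal in signals)


example_flowgram = simulate_flowgram("GGATTTC")
print(example_flowgram)
print(decode_flowgram(example_flowgram))
```

Because each signal must be rounded to an integer run length, long homopolymers are the least reliable part of a read in this scheme, and the Ala-rich AFP coding sequences contain many repetitive stretches.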

TA clones of PCR products were sequenced using M13R or T7 primers at the Robarts Research Institute (London, ON) by cycle sequencing with fluorescently-tagged dideoxynucleotides.

2.7 PCR analysis

For the isolation of novel AFP variants from starry flounder genomic DNA and library phage plaques, primers were designed based on sequence alignments between AFP isoforms and were synthesized by Cortec DNA Service Laboratories, Inc. (Kingston, ON). Primer sequences and designations are listed in Table 1, and their placement relative to one another within the genes is shown in Figure 6. Standard PCR reaction mixtures contained 1.5 mM magnesium chloride (MgCl2), 1x Qiagen buffer, 1x Q solution, 0.2 mM dNTPs, and 0.025 U/µL Taq. Standard thermocycling programs started with 3 min at 98 °C and a hot start at 80 °C. Reactions were cycled 25 times through 1 min at 95 °C, 1 min at 55 °C, and 2 min at 72 °C, and finished with 10 min at 72 °C. MgCl2 concentration, buffers and salts, annealing temperature and number of cycles were optimized for each primer pair using 0.33 µg starry flounder genomic DNA as the template. With these optimal conditions, PCR was performed on phage plaques picked from the primary library screen using 4 µL of the buffer stock as a template. PCR products of interest were cloned using the TOPO TA Cloning Kit (pCR® 2.1-TOPO® Vector, Invitrogen) and sequenced. Upon sequence alignment, differences were confirmed as necessary by sequencing the opposite strand.

Table 1. Names and sequences of primers used in PCR experiments

All sequences are written 5' to 3' and their relative positions are shown in Figure 6; numbers used for clarity in Figure 6 are noted in brackets next to the primer name. The length of each primer is listed (bp), as is its GC content.

NAME              SEQUENCE                   LENGTH  %GC
5'up              GTCCAGAGAGGGGAAAGAATACA    23      48
3'up              ACGCCTCGACTGAATCCTTTTGT    23      48
5'int             GGAAGGAAGGATATCTGCATTAT    23      39
3'int             TAATAATACCATTAATTTCTGCAG   24      23
stfsk             AGACACTACTGCGGGAAACATAC    23      48
allsk             GGCCTAAACCTGAAAAAATCTGAGC  25      44
3'univ            ACATGATCCCACATCAAGACGAC    23      48
m5' (1)           ATAATACCATTAATCTCTGCAGC    23      35
m5'mid (2)        CGCATCCATAGCAACCATCAA      21      48
m5'mid stf (2a)   AGCAACCATCAAAGCCAATGC      21      48
m5'stf (2b)       CAGCAATAGCAGCCGAGGAA       20      55
m3'stf (3b)       TTTGTCAAAGATGGCCGCCT       20      50
m3'midstf (3a2)   TTGTTTTGGCTGCGGCTGCG       20      60
m3'midstf (3a1)   CGATGGTTGTTTTGGCTGCG       20      55
m3'mid (3)        TTGTCGATGGTTCTTTTGGCT      21      43
m3' (4)           ACGACCACGATCCTTATGGG       20      55
Contig1#1         AAACAGGGTAGAGAACAAGAAC     22      41
Right arm         ATACGACTCACTATAGGGCGAAG    23      48
maxisfstop        TTAAGGATCGTGGTCGTCTTG      21      48
con2end2#1        GAGTCTCTTCATGTGATACTCT     22      41

Figure 6. Schematic diagram showing the relative positions of the PCR primers on the liver, skin and hyperactive AFP isoforms

Boxed portions denote exonic sequences, shaded areas represent translated sequence and primers are indicated by arrowheads. (A) Starry flounder liver isoform 1 (stfl-AFP1). (B) Starry flounder skin isoform 1 (stfs-AFP1). The allsk primer was designed to a region conserved among all starry and winter flounder skin isoforms, whereas the stfsk primer was designed to a region unique to stfs-AFP1. (C) Winter flounder hyperactive AFP gene (wfh-AFP1). The size of the intron is unknown and is indicated by the broken line. Primers 2 and 3 were designed in GC-poor regions of the winter flounder sequences whereas primers 1 and 4 were similar to the 3'int and 3'univ primers, respectively, but shifted a few nucleotides into sequence unique to the hyperactive sequences. Primers 2a, 2b, 3a1, 3a2 and 3b were designed to exactly match the starry flounder hyperactive AFP sequence. The following three primers were designed to well-conserved regions found in all three isoforms and are marked on all three diagrams: 5'int near the 5' end of the intron, 3'int at the 3' end of the intron and 3'univ in the proximal 3' UTR. The primer sequences are listed in Table 1.
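The length and GC content reported for each primer in Table 1 can be recomputed directly from its sequence. The sketch below does this for three primers copied from Table 1; the function name is hypothetical, and the percentage is rounded to the nearest whole number on the assumption that the table was rounded the same way.

```python
# Three primer sequences copied from Table 1.
PRIMERS = {
    "5'up": "GTCCAGAGAGGGGAAAGAATACA",
    "m3'midstf (3a2)": "TTGTTTTGGCTGCGGCTGCG",
    "m3' (4)": "ACGACCACGATCCTTATGGG",
}


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    gc_count = sum(base in "GC" for base in sequence)
    return 100 * gc_count / len(sequence)


for name, sequence in PRIMERS.items():
    print(f"{name}: {len(sequence)} bp, {round(gc_percent(sequence))}% GC")
```

The same function can be applied to the remaining primers to cross-check the table.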

For closure of the gaps remaining in phage insert #2 following assembly of the 454-generated sequence reads, additional primers were designed near the ends of available sequence (Table 1). PCR amplifications were performed with the same standard reaction conditions listed above without optimization, and various products were cloned and sequenced.
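The minimum run time of the standard thermocycling program described in this section can be tallied as a quick check. This sketch counts only the programmed hold times; the hot-start hold at 80 °C and the ramping between temperatures are not specified and are therefore excluded, so the actual run would be longer.

```python
# Programmed hold times from the standard thermocycling program (minutes).
INITIAL_DENATURATION_MIN = 3      # 98 C
CYCLES = 25
CYCLE_STEPS_MIN = (1, 1, 2)       # 95 C, 55 C, 72 C
FINAL_EXTENSION_MIN = 10          # 72 C


def program_minutes(initial: int, cycles: int, steps, final: int) -> int:
    """Total programmed hold time, excluding hot-start hold and ramping."""
    if cycles < 0:
        raise ValueError("cycle count cannot be negative")
    return initial + cycles * sum(steps) + final


total_minutes = program_minutes(
    INITIAL_DENATURATION_MIN, CYCLES, CYCLE_STEPS_MIN, FINAL_EXTENSION_MIN
)
print(f"Programmed hold time: {total_minutes} min")
```

The cycling steps account for 100 of these minutes, so changes to the number of cycles during optimization dominate the overall run time.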

2.8 Bioinformatic analysis

Nucleotide BLAST searches were performed in all nucleotide sequence databases, including GenBank, RefSeq Nucleotides, EMBL, DDBJ, and PDB sequences (excluding HTGS0,1,2, EST, GSS, STS, PAT, WGS). Searches were performed using default parameters and scoring matrix, except for an expect value threshold of 1 and unmasking regions of low complexity. Open reading frames located by the NCBI ORF Finder within the phage insert were used to query all non-redundant GenBank CDS translations, as well as the PDB, SwissProt, PIR and PRF databases via BLASTP under default parameters. Signal peptide prediction was performed using both neural networks and hidden Markov models on the SignalP 3.0 server [94]. Promoter analysis was performed using the Neural Network Promoter Prediction tool, version 2.2, courtesy of the Berkeley Drosophila Genome Project [95].


Chapter 3 Results

3.1 Preface

Genomic DNA was prepared from starry flounder tissue samples by Kyra Nabeta, Gary Scott, Pliny Hayes and Laurie Graham. Kyra Nabeta prepared the Southern blot and probed it with the winter flounder liver and hyperactive isoforms, and Sherry Gauthier applied the starry flounder hyperactive and skin isoform probes. The phage library was prepared by Bio S&T, and inserts were sequenced by Genome Quebec, both in Montreal, QC. The first phage insert was purified and its sequence was assembled by Kyra Nabeta; the second insert was purified by Sherry Gauthier, who resolved the gaps in the sequence with Laurie Graham. All sequence analysis, as well as all PCR and TA cloning experiments, were performed by Kyra Nabeta, with the exception of those involving the American plaice, which were done by Sherry Gauthier. The manuscript was written by Kyra Nabeta with assistance from Peter Davies and Laurie Graham.

3.2 Genomic DNA samples were of variable quality

To determine whether gene dosage of the starry flounder type I AFPs is correlated with increasing latitude, we first attempted to supplement the results from a preliminary three-location Southern blot that showed evidence of a positive association (Figure 5). Genomic DNA was prepared from a variety of tissues collected from the south Bering Sea (Port Moller, location 2A), the Pacific coast of the eastern Aleutian Islands (Sitkalidak Strait, location 3B), the Queen Charlotte Islands (Hecate Strait, location 4), and Vancouver (English Bay, location 5), as indicated in Figure 4. High molecular weight DNA of good quality was extracted from the Queen Charlotte Islands (QCI) and English Bay samples. However, the DNA from tissues collected from Port Moller and the Sitkalidak Strait was either of poor quality, as indicated by the presence of smearing (Figure 7), or of insufficient quantity. Consequently, it was not possible to significantly extend the biogeographical study correlating latitude with gene dosage at this time.

3.3 Genomic Southern blot showed strong liver AFP gene signal in starry flounder

The genomic DNAs extracted from four QCI starry flounders (location 4) were of good quality, comparable to that of the control winter flounder DNA in Figure 7. They were digested with Eco RI and Sac I and blotted alongside a control DNA (digested with Bam HI and Sac I) from a single winter flounder. The blot was probed with the winter flounder liver AFP cDNA pKEN C17 [19], which was selected as a representative of liver type I flounder AFP genes. The starry flounder banding pattern differed from that of the winter flounder (Figure 8A). All four starry flounder Sac I digests revealed one strong band at 4.7 kb, whereas the winter flounder Sac I digest contained three intense bands in a narrow size range (2.7 – 3.2 kb), as previously observed [31, 32, 51, 62, 78]. Both species also showed a few faint bands of various sizes. The hybridization bands generated by Eco RI in starry flounder DNA were significantly larger than the 7.8 – 8.0 kb Bam HI bands observed in the winter flounder [31, 32]. The total band intensity appeared similar between the two species, suggesting that the starry flounder AFPs are also encoded by a similarly-sized multigene family of 30 – 40 members.


Figure 7. Quality assessment of genomic DNA from an Alaskan starry flounder

The quality of genomic DNA extracted from starry flounder tissues was assessed by gel electrophoresis. In this representative gel, high-quality winter flounder genomic DNA was loaded in lanes 1 to 3 (0.08, 0.16 and 0.42 µg) for comparison. Lanes 4 to 7 contain 1, 2, 5 and 10 µL of a genomic DNA preparation from a starry flounder collected at location 3B, which was typical of the Alaska samples. No molecular weight markers were loaded, as a visual comparison with the winter flounder DNA was sufficient to establish sample quality.


Figure 8. Genomic Southern blots of winter flounder and QCI starry flounder probed with various AFP gene sequences

Genomic DNA (10 µg) from individual fish was digested with Sac I or Bam HI for winter flounder, or with Sac I or Eco RI for starry flounder. The four starry flounders (1, 4, 6, 9) were obtained from location 4 (Figure 4). Digested DNA was electrophoresed on a 0.8% agarose gel, blotted onto nylon and probed with the winter flounder liver isoform (A), the winter flounder hyperactive isoform (B), or the starry flounder skin isoform (C). The positions of DNA size markers (kb) are indicated on the left with corresponding markers on the right.

[Figure 8 image: panels A, B and C, each with winter flounder lanes (Sac I, Bam HI) and starry flounder lanes (Sac I and Eco RI digests of fish 1, 4, 6 and 9); size markers at 10, 8, 6, 5, 4, 3.5, 3, 2 and 0.75 kb]

The starry flounder banding patterns for the Eco RI and Sac I digests were qualitatively and quantitatively similar to those previously observed in starry flounder specimens from the Vancouver area of BC, Canada (G. Scott, unpublished data) (Figure 5). In this blot, the signal observed in the BC sample was very different from that of the California sample, which was missing the main 4.7 kb Sac I band, and from that of the Alaska sample, where the Sac I band was 2- to 3-fold stronger and located in a significantly larger fragment. In the Alaska sample the hybridization signal in the Eco RI digest was also 2- to 3-fold stronger, and was located in a completely different set of bands that were smaller than those in the BC sample.

3.4 Genomic Southern blot shows strong hyperactive AFP gene signal

Given the considerable sequence differences between the liver and hyperactive isoforms in the coding region [65], the latter was not expected to cross-hybridize to the liver or skin genes, despite their similar GC-richness. The blot from Figure 8A was stripped and re-probed with the winter flounder hyperactive probe, which hybridized to different fragments in the winter flounder lanes than observed with the liver isoform (Figure 8B vs. Figure 8A). The intense bands on this blot were smaller in the Sac I digest, 2.0 and 2.6 kb compared to 2.7 – 3.2 kb with the liver isoform, while the principal signals in the Bam HI digest were located in much larger DNA fragments, >>10 kb in size compared to 7.8 kb. This is the first time this newly-discovered isoform has been used to probe Southern blots.

The starry flounder banding pattern observed with the hyperactive probe was quite similar among all four fish and to the pattern observed with the liver AFP probe (Figure 8B). The main difference was in the relative intensity of the bands. As signal strength is a function of many factors, patterns of relative intensity within lanes can be compared between blots, but no conclusions can be drawn regarding the absolute intensity of individual bands without a single-copy control. With the hyperactive probe, the Sac I band centred at 3.5 kb became much more intense relative to the 4.7 kb band. The hybridization pattern in the Sac I and Eco RI digests suggests the presence of several copies of the hyperactive AFP gene in the starry flounder genome.

Upon comparison with the winter flounder, the most noticeable difference between Figure 8A and Figure 8B was that the starry flounder bands merely displayed different relative intensities whereas the winter flounder banding pattern changed completely. The absence of residual signal from the liver probe in the winter flounder lanes (e.g., the intense band at 3.2 kb is no longer present) implied that the blot was properly stripped; however, the blot was not exposed after stripping to confirm complete removal of the liver probe. The fact that both the liver and hyperactive probes hybridized strongly to the same fragments of starry flounder DNA suggests tight linkage of the two isoforms.

3.5 Genes for starry flounder liver and skin AFPs are closely linked

Based on the genomic Southern blots, it was clear that the genomic DNA extracted from QCI fish was of high quality and that it contained multiple AFP gene copies. However, the identity of these genes was based only on hybridization signals, and confirmation at the DNA sequence level was essential. Therefore, a genomic lambda library was constructed to characterize the genes at the sequence level. It provided 131-fold coverage of the genome, assuming a genome size of 0.7 pg, which was based on values obtained for other closely related flounder species [96]. A primary screen using both the hyperactive and liver probes was performed on two different amplified phage stocks. Given that 5 x 10^4 plaques were cultured per plate, a single copy gene was expected to produce a positive signal once or twice per plate.

Twelve positive plaques in total were identified on the two plates probed with a cDNA of the liver isoform and ten of these were selected for further screening. Twenty-one signals in total were detected on the two plates probed with a cDNA of the hyperactive isoform and fifteen of these were selected. Two plaques from each primary screen were re-screened three more times to achieve plaque purity. In the end, one plaque identified with each of the liver and hyperactive probes was purified to homogeneity by four rounds of screening in total.
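The expectation of one or two positives per plate for a single-copy gene can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes the commonly used conversion of roughly 978 Mb of DNA per picogram, and it takes the 18707 bp length of insert #1 as a stand-in for the average insert size, which is not stated in this section. The function names are hypothetical.

```python
# Rough expectation for a phage library screen (illustrative sketch only).

BP_PER_PG = 0.978e9  # assumed conversion: about 978 Mb of double-stranded DNA per pg


def expected_positives_per_plate(genome_pg, insert_bp, plaques_per_plate, gene_copies=1):
    """Expected number of plaques per plate that carry a given gene."""
    genome_bp = genome_pg * BP_PER_PG
    # Chance that one random insert covers one particular locus (edge effects ignored).
    p_hit = insert_bp / genome_bp
    return plaques_per_plate * p_hit * gene_copies


def fold_coverage(n_clones, insert_bp, genome_pg):
    """Total cloned DNA divided by the assumed genome size."""
    return n_clones * insert_bp / (genome_pg * BP_PER_PG)


# Values from the text: 0.7 pg genome, 5 x 10^4 plaques per plate.
# The 18707 bp insert size is an assumption borrowed from insert #1.
hits = expected_positives_per_plate(genome_pg=0.7, insert_bp=18707, plaques_per_plate=5e4)
print(f"expected positives per plate for a single-copy gene: {hits:.2f}")
```

Under these assumptions the result falls between one and two, in line with the expectation stated above. Setting `gene_copies` above 1 scales the expectation linearly, although tandemly repeated genes that share an insert would give fewer positive plaques than this simple model predicts.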

The clone identified using the hyperactive cDNA probe was sequenced in its entirety by Genome Quebec. Initially, randomly-sheared fragments were shotgun cloned, and seventy-two clones were selected randomly and sequenced. Three gaps remained following an assembly using DNAMAN. The clone was sequenced a second time by Genome Quebec as a test to optimize their newly acquired 454 system, and a complete assembly was achieved, including an additional 3.2 kb of sequence that spanned the three gaps. The sequences of the regions analyzed by both methods were identical.

Figure 9. Schematic diagram of starry flounder genomic DNA insert #1

The 18707 bp insert is denoted by the solid line and lambda arms by the dotted lines. The two predicted exons of each of the two starry flounder AFP genes are indicated using thicker grey boxes. Lines with diamonds indicate Eco RI cut sites, and the round dumbbell indicates the sole Sac I cut site. The three underlined sections indicate regions outside the AFP genes with similarity to sequences from the GenBank database: (A) a 275 bp segment, beginning at base 9211, that is 92% identical to a portion of the 3' region of wfl-AFP6; (B) a 148 bp segment, beginning at base 10060, with 83% identity to a non-coding sequence in the 3' flank of a pleurocidin-like gene from winter flounder; (C) a 196 bp segment, beginning at base 12104, with 78% identity to a sequence 5' of the wfl-AFP6 gene.

[Figure 9 image: map of the insert with the skin and liver genes, matches A, B and C, labelled coordinates 1, 12420, 12931, 14419 and 18707, the left and right lambda arms (λL, λR), and a 1 kb scale bar]

Insert #1 was 18707 bp in length, and it contained two type I AFP genes oriented in the same direction and separated by 1 kb of intergenic sequence (Figure 9). BLAST searches revealed that one gene encoded a liver isoform and the other a skin isoform, based on their similarity to known sequences from other right-eyed flounders. These genes were both located within a 4 kb region at one end of the insert. The rest of the insert was used to query GenBank nucleotide databases using BLASTN, and was found to lack other genes. However, there were some small stretches that showed identity to non-coding regions of flounder type I AFPs and pleurocidin, an antimicrobial peptide [97]. For example, a 275 bp region aligned with 92% identity and an E-value of 3 x 10^-105 to a sequence in the 3' flank of wfl-AFP6 (match A in Figure 9). An E-value, or expect value, describes the number of matches with an equal or better score that would be expected to occur by random chance in a database of a given size. Thus, a low value suggests that two sequences are significantly similar. Another stretch of 196 bp aligned with 78% identity and an E-value of 2 x 10^-31 to a sequence in the 5' flank of the same gene (match C in Figure 9). A 148 bp section aligned with 83% identity (E-value of 2 x 10^-32) to a non-coding sequence in the 3' flank of a pleurocidin-like gene from winter flounder (match B in Figure 9). In the insert, all three of these matches occur within 5 kb upstream of the AFP genes. The rest of the matches detected aligned with highly repetitive dinucleotide repeats, such as (GT)n, and were not considered further; such simple sequence repeats are common in eukaryotic genomes and are formed by slippage during replication or recombination [37]. Using the NCBI ORFfinder¹ to scan the 14 kb upstream of the AFPs, 74 open reading frames (ORFs) of 102 – 384 nucleotides were identified. To demonstrate the likelihood of this happening by chance, five 14 kb sequences were randomly generated in DNAMAN and were found to contain an average of 71 ORFs with maximum sizes of 363 – 582 nucleotides. These potential ORFs were subjected to protein:protein BLAST searches to identify potential proteins, but no significant matches were found.
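The random-sequence control described above can be approximated with a short script. This is a minimal sketch, not the procedure used with DNAMAN or the NCBI ORF Finder. It assumes that an ORF runs from the first ATG in a frame to the next in-frame stop codon, that all six frames are scanned, and that the random sequences have equal base frequencies. The function names are hypothetical, and the counts it prints are not expected to reproduce the values reported in the text.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_len):
    """Lengths (nt, stop codon included) of ATG-to-stop ORFs in the three forward frames."""
    lengths = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG since the previous stop in this frame
            elif start is not None and codon in STOPS:
                length = i + 3 - start
                if length >= min_len:
                    lengths.append(length)
                start = None
    return lengths


def count_orfs(seq, min_len=102):
    """ORF lengths found on both strands (six reading frames in total)."""
    seq = seq.upper()
    return orfs_in_strand(seq, min_len) + orfs_in_strand(reverse_complement(seq), min_len)


def random_dna(n, rng):
    """Random sequence with equal base frequencies (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


rng = random.Random(0)
counts = [len(count_orfs(random_dna(14_000, rng))) for _ in range(5)]
print("ORFs of at least 102 nt in five random 14 kb sequences:", counts)
```

The 102-nucleotide minimum is taken from the smallest ORF reported above. A real control would ideally match the base composition of the insert, since GC content changes the frequency of stop codons.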

Previous restriction analysis of lambda clones showed that many of the type I AFP genes of the winter flounder are arranged in direct tandem repeats [32]. The starry flounder lambda insert was searched for cut sites corresponding to the restriction enzymes used in the Southern blots to determine whether the two genes are part of a tandem array. Two Eco RI restriction sites were found within a 1 kb stretch upstream of the genes but no other sites were present in the rest of the insert (Figure 9). There was only one Sac I site, positioned within the intron of the skin isoform. Because the genes were located at the very end of the insert and no other Eco RI or Sac I sites were present, it was not possible to determine whether the fragments produced in this region of the genome correspond to the band sizes observed on the current Southern blots (Figure 8).

¹ http://www.ncbi.nlm.nih.gov/projects/gorf/
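A search for restriction sites of the kind described above can be sketched as follows. The recognition sequences for Eco RI (GAATTC), Sac I (GAGCTC) and Bam HI (GGATCC) are standard and palindromic, so only one strand needs to be scanned. The function names and the toy sequence are hypothetical, and positions refer to the first base of the recognition site rather than the exact cleavage point.

```python
import re

# Standard recognition sequences; all three are palindromic.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "SacI": "GAGCTC", "BamHI": "GGATCC"}


def find_sites(seq, site):
    """1-based start positions of every occurrence of a recognition site."""
    # The lookahead lets overlapping occurrences be reported as well.
    return [m.start() + 1 for m in re.finditer(f"(?={site})", seq.upper())]


def fragment_lengths(seq_len, cut_positions):
    """Approximate fragment sizes for a linear molecule cut at the given positions."""
    bounds = [0] + sorted(cut_positions) + [seq_len]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy sequence for illustration only, not taken from the insert.
toy = "AAGAATTCAAGAGCTCAAGAATTC"
for enzyme, site in RECOGNITION_SITES.items():
    print(enzyme, find_sites(toy, site))
```

Fragment sizes predicted this way are only meaningful for sites that lie within the cloned region. A fragment bounded by the end of the insert, as is the case for the AFP genes here, cannot be sized from the insert sequence alone.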

3.6 Starry flounder liver and skin AFPs are homologous to their winter flounder counterparts

One of the main objectives of this project was to sample the variety of AFP genes in the starry flounder genome and to assess their relationship to those of the winter flounder. The two AFP sequences obtained from insert #1 were aligned with winter flounder AFP sequences, and it became immediately apparent that one was an orthologue of the liver isoform whereas the other was a skin isoform orthologue. Alignment of the starry flounder AFP genes with the corresponding winter flounder genes (i.e., stfl-AFP1 with wfl-AFP6, Appendix A, and stfs-AFP1 with wfs-11-3, Appendix B) showed that they were identical in terms of intron/exon structure, in that both isoforms consisted of two exons separated by a single intron (Figure 10).

3.6.1 Liver isoform

Figure 10. Schematic diagram of the AFP gene organization in genomic DNA insert #1

The portion of the insert from Figure 9 containing the two AFP genes is shown with intronic and intergenic sequences represented by black lines. Exons are indicated by boxes in which the translated portions are shaded. Promoters are marked by grey vertical arrows and a putative enhancer sequence by an asterisk. The Sac I cut site in the intron of the skin AFP is marked with a dumbbell. Arrows indicate the position and directionality of the primers designed for PCR amplification. Vertical lines denote single nucleotide mismatches and exact sequences are listed in Table 1.

[Figure 10 image: skin and liver genes with primer positions (5'int, stfsk, allsk, 3'int, 3'univ); labelled lengths of 1.2 kb, 1.0 kb and 0.5 kb; legend: promoter, untranslated, translated, enhancer (*); 500 bp scale bar]

In stfl-AFP1, the first exon was 91% identical to that of wfl-AFP6 and encoded a portion of the signal peptide (Appendix A). The rest of the signal peptide, the pro-sequence and the mature protein were encoded by the second exon, which was 92% identical to the corresponding exon of wfl-AFP6 (81% including gaps). When each portion was taken individually, there was 93% identity in the signal peptide, 75% identity (52% including gaps) in the pro-sequence, and 80% in the mature sequence. The deduced proteins showed 87% identity in the signal peptides, 81% in the pro-sequence (53% including gaps) and 89% in the mature protein. The introns of these genes were 513 bp and 497 bp long for the starry and winter flounders, respectively, and showed 68% identity (64% including gaps).
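The paired identity values quoted above (with and without gaps) depend on how gapped alignment columns are counted. The sketch below shows one common convention, under the assumption that "including gaps" means gapped columns are counted as mismatches in the denominator. The function name and the toy alignment are hypothetical, and the values it produces are not those of the thesis alignments.

```python
def percent_identity(a, b, gap="-"):
    """Return (identity over ungapped columns, identity over all columns) in percent."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(x, y) for x, y in zip(a, b) if not (x == gap and y == gap)]
    ungapped = [(x, y) for x, y in columns if gap not in (x, y)]
    if not ungapped:
        raise ValueError("alignment has no ungapped columns")
    matches = sum(x == y for x, y in ungapped)
    return 100 * matches / len(ungapped), 100 * matches / len(columns)


# Toy alignment for illustration only.
without_gaps, with_gaps = percent_identity("ACGTACGT--AC", "ACGAAC--TTAC")
print(f"{without_gaps:.1f}% identity, {with_gaps:.1f}% including gaps")
```

Large differences between the two values, as seen for the pro-sequence and the introns, indicate that insertions and deletions rather than point substitutions account for most of the divergence.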

Figure 11. Protein alignment of liver AFP variants from starry flounder and winter flounder

An alignment of the three starry flounder liver (stfl) AFP variants, along with three representative winter flounder liver (wfl) sequences, is shown with polymorphisms highlighted in grey. Stfl-AFP1 was obtained from lambda insert #1, whereas stfl-AFP2 and 3 were obtained by PCR and are incomplete at the N terminus. The signal peptide, pro-sequence and a C-terminal Gly residue (not shown) are removed during post-translational processing to generate the mature peptide. The number of residues within each portion of the encoded sequence is noted at the end of each row, and the total number of residues in the preproprotein indicated in brackets. Placement of the intron is indicated by an arrow and the ice-binding residues, which are completely conserved, are denoted by asterisks. GenBank accession numbers for the three winter flounder sequences are CAA37754 (wfl-AFP9), AAA49469 (wfl-AFP6), AAA49468 (wfl-AFP8).

Signal peptide
stfl-AFP1  MALSLFTVGQFIFLFWTIR ISEA  23
stfl-AFP2                      INEA  4
stfl-AFP3                      INEA  4
wfl-AFP9   MALSLFTVGQLIFLFWTMR ITEA  23
wfl-AFP6   MALSLFTVGQLIFLFWTMR ITEA  23
wfl-AFP8   MALSLFTVGQLIFLFWTMR ITEA  23

Prosequence
stfl-AFP1  NPDPAAKAAAVADPAAAAVAPAADAFSAAA  30 (53)
stfl-AFP2  NPDPAAKAAAVADPAAAAVAPAADAFSAAA  30 (34)
stfl-AFP3  NPDPAAKAAAVADPAAAAVAPAADAFSAAA  30 (34)
wfl-AFP9   NPDPAAKAV------PAAAAP  15 (38)
wfl-AFP6   RPDPAAKAA----PAAAAA-----PAAAAP  21 (44)
wfl-AFP8   RPDPAAKAA----PAAAAV-----PAAAAP  21 (44)

Mature peptide  * * * * * * * * * * * * * *
stfl-AFP1  DTASDAAAAAAATAAAAKAAAE------KTARDAAAAAAAT----AR  37 (90)
stfl-AFP2  DTASDAAAAAAATAAAAKAAAE------KTARDAAAAAAAT----AR  37 (71)
stfl-AFP3  DTASDAAAAAAATAAAAKAVAE------KTARDAAAAAAAT----AR  37 (71)
wfl-AFP9   DTASDAAAAAAATAATAAAAAAATAVTAAKAAALTAANAAAAAAATAAAAAR  52 (90)
wfl-AFP6   DTASDAAAAAALTAANAKAAAE------LTAANAAAAAAAT----AR  37 (81)
wfl-AFP8   DTASDAAAAAALTAANAAAAAK------LTADNAAAAAAAT----AR  37 (81)

Figure 12. Helical wheel diagrams of liver AFPs from starry flounder and winter flounder

The coloured areas indicate the frequency of naturally-occurring amino acid substitutions and the star indicates the 11 potential positions for residues around the helix. The black arcs indicate the residues that make up the ice-binding face. Residues represented include the entire mature peptide from all variants of the starry flounder (A) and winter flounder (B) liver AFPs as indicated in Figure 11, except for the terminal Gly, which is cleaved during post-translational modification, and the terminal Arg, which is thought to curl away from the ice-binding surface (L. Graham, personal communication).

[Figure 12 image: helical wheels for (A) starry flounder and (B) winter flounder; colour legend: Val, Glu, Lys, Arg, Thr, Asp, Ala, Asn, Ser, Leu]

The predicted product of stfl-AFP1 was quite similar to the three-repeat liver isoforms from winter flounder. The 23-residue secretory signal peptide predicted by SignalP 3.0 [94] was equal in length to those in the winter flounder AFPs, but the starry flounder sequence contained three substitutions: L11F, M18I and T21S (Figure 11). Assuming the N terminus of the mature peptide is identical to that of the winter flounder, the starry flounder pro-sequence would be 30 residues long. The pro-sequence would contain some substitutions as well as insertions of 9 or 15 residues relative to wfl-AFP6 or wfl-AFP9, respectively. However, if this putative pro-sequence is removed by DPPIV, as is thought to occur in the winter flounder liver AFPs, the twelfth and thirteenth dipeptides would have Asp and Phe residues in the penultimate position rather than the preferred Ala or Pro. Current data do not support Asp or Phe as cleavage substrates for DPPIV [98, 99]. If the N terminus is identical to that in the winter flounder, the mature starry flounder liver AFP would be 37 amino acids long, after cleavage of the final Gly residue during C-terminal amidation, and would contain three full 11-amino acid repeats. Overall, stfl-AFP1 contained a total of 23 nucleotide substitutions relative to wfl-AFP6, eleven of which were silent (Appendix A). The amphiphilic character of the helix was determined by helical wheel diagrams, which are oriented such that the reader is looking down the length of the helix. The frequency with which amino acids appear at various positions around the helix is represented by the proportional colouring in boxes that correspond to each of the 11 potential positions, indicated by the internal star (Figure 12). The hydrophobic ice-binding face was completely conserved at the amino acid level (Figure 12), although the DNA encoding it contained four silent substitutions. Away from the binding face, seven nucleotide substitutions were silent, and the remaining twelve resulted in five substitutions at the amino acid level, indicating that more plasticity is tolerated here.
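The assignment of residues to the 11 positions of the helical wheel can be illustrated with a short script. This is a simplified sketch that assumes a strict 11-residue periodicity, so that residues i and i + 11 occupy the same position on the wheel; it does not reproduce the drawing or colouring of Figure 12. The function name is hypothetical, and the sequence is the mature stfl-AFP1 peptide from Figure 11 with alignment gaps removed.

```python
from collections import defaultdict

REPEAT = 11  # residues per repeat in type I AFPs, as described in the text


def wheel_positions(peptide, repeat=REPEAT):
    """Group residues by their position (0 to repeat - 1) within the repeat."""
    positions = defaultdict(list)
    for index, residue in enumerate(peptide):
        positions[index % repeat].append(residue)
    return dict(positions)


# Mature stfl-AFP1 sequence from Figure 11, alignment gaps removed.
STFL_AFP1 = "DTASDAAAAAAATAAAAKAAAE------KTARDAAAAAAAT----AR".replace("-", "")

for pos, residues in sorted(wheel_positions(STFL_AFP1).items()):
    print(pos, "".join(residues))
```

With this grouping the four Thr residues of the mature peptide fall at the same position, which reflects the regular spacing that places them on one side of the helix.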

3.6.2 Skin isoform

The similarity between the skin AFP genes of the two species was even higher than between the liver AFP genes. The first exon of stfs-AFP1, which contained only untranslated sequence, showed 96% identity to wfs-11-3 and had no gaps, whereas the second exon, which contained the coding sequence, showed 97% identity (94% including gaps) (Appendix B). Taken alone, the portion of DNA encoding only the expressed peptide showed 97% identity (93% including gaps) between the two fish. The deduced protein was 97% identical to wfs-11-3 (92% including gaps). The stfs-AFP1 intron contained one significant insertion of 439 bp in length, and was 1212 bp in total, compared to the 794 bp winter flounder intron. The introns were 94% identical (59% including gaps). The starry flounder genes showed a similar codon bias to the winter flounder AFP sequences, heavily favouring GCC (~70%) for Ala (Table 2).
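Relative alanine codon usage of the kind summarized in Table 2 can be tallied from an in-frame coding sequence as sketched below. The function name and the toy sequence are hypothetical, and the sketch assumes that the input starts at the first base of a codon.

```python
from collections import Counter

ALA_CODONS = ("GCC", "GCT", "GCA", "GCG")


def ala_codon_usage(cds):
    """Percentage use of each alanine codon in an in-frame coding sequence."""
    cds = cds.upper()
    # Split into complete codons; any trailing partial codon is ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in ALA_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in ALA_CODONS}
    return {c: 100 * counts[c] / total for c in ALA_CODONS}


# Toy coding sequence for illustration only: Met, Ala x5, stop.
toy_cds = "ATGGCCGCCGCTGCAGCCTAA"
usage = ala_codon_usage(toy_cds)
print(usage)
```

Because the percentages are taken over alanine codons only, the four values for each gene sum to 100, as they do in each row of Table 2.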

The predicted starry flounder skin AFP was very similar to the winter flounder skin AFPs, in that there was no signal peptide or pro-sequence. There was one conservative substitution near the N terminus (K6R) and a two-residue deletion near the C terminus (Figure 13). There were also two silent single nucleotide substitutions (Appendix B). The peptide contained two and a half 11-amino acid repeats, and as with the liver AFPs, the residues involved in the putative ice-binding face were conserved, reiterating the significance of these residues.


Table 2. Relative codon usage in starry flounder and winter flounder AFP genes

                                          % Usage
ISOFORM      VARIANT    SPECIES           GCC    GCT    GCA    GCG
Liver        wfl-AFP6   Winter flounder   71.1   10.5   18.4   0.0
             stfl-AFP1  Starry flounder   67.4   14.0   18.6   0.0
Skin         wfs-11-3   Winter flounder   63.0   3.7    33.3   0.0
             stfs-AFP1  Starry flounder   72.0   4.0    24.0   0.0
Hyperactive  wfh-AFP1   Winter flounder   63.9   13.9   19.7   2.5
             stfh-AFP2  Starry flounder   64.7   9.8    23.3   2.3

Figure 13. Protein alignment of skin AFP variants from starry flounder, winter flounder and American plaice

The starry flounder skin (stfs) and American plaice skin (aps) AFP variants are shown, along with the 10 known winter flounder AFP skin variants. The number of residues is noted at the end of each row. Intraspecific polymorphisms are highlighted in grey, and ice-binding residues [27] are denoted with asterisks. The winter flounder sequences are from Gong et al. [51] and although the variant numbers have been retained, the acronym has been changed from sAFP to wfs-AFP (winter flounder skin). GenBank accession numbers for wfs-11-3 and wfs-F2 are M63478 and M63479, respectively.


Starry flounder
* * * * * * * *
stfs-AFP1  MDAPARAAAA TAAAAKAAAEA TAAAAAKAAAA TK--AAR  37
stfs-AFP2  MDAPARAAAA TAAAAKAAAEA TKAAAAKAAAA TK--AAR  37
stfs-AFP3  MDAPARAAAA TAAAAKAAAEA TAAAAAKAAAD TK--AAAAAAAAL  43
stfs-AFP4  MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAA TK--AGR  37
stfs-AFP5  MDAPARAAAA TAAAARATAEA TEAAAAKAAAA TK--AAR  37
stfs-AFP6  MDAPARAAAA TAAAAKAATEA TKAAAAKAAAA TK--AAR  37
stfs-AFP7  MGAPARAAAA TAAAAKAAAEA TKAAAAKAAAA TK--AAR  37
stfs-AFP8  MDAPAAAAAA TAAAAKAAAEA TAAAAAKAAAA TKAAAAR  39

Winter flounder
* * * * * * * * * * * *
wfs-F2     MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAA TKAGAAR  39
wfs-11-3   MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAA TKAAAAR  39
wfs-AFP1   MDAPARAAAA TAAAAKAAAEA TKAAAAKAAAA TKA-AAH  38
wfs-AFP2   MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAA TKAGAAR  39
wfs-AFP3   MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAD TKAKAAR  39
wfs-AFP4   MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAA TKAGAAH  39
wfs-AFP5   MDAPAKAAAA TAAAAKAAAEA TKAAAAKAAAA TKA-AAH  38
wfs-AFP6   MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAA- TKAGAAR  38
wfs-AFP7   MDAPAAAAAA TAAAAKAAAEA TAAAAAKAAAA TKAAAAR  39
wfs-AFP8   MDAPAAAAAA TAAAAKAAAEA TAAAAAAAAAA TAEAAAKAAAA TKAAAAAAAAR  54

American plaice
* * * * * * * * * *
aps-AFP1   MDPAKAAAA TAAKAKADAEK TAAAAAKAAAD TAAAAAKAAKA AAH  45
aps-AFP2   MDPAKAAAA TAAKAKADAEK TAAAAAKAAAD TAA---KAAKA AAH  42
aps-AFP3   MDPAKAAAA TAAKAKADAEK TAAAAAKAAAD TAA------KA AAH  39
aps-AFP4   MDPAKAAAA TAAKAKADAEK TAAAAAKAAAD TAAAAA---KA AAP  42

Interspecies comparison
* * * * * * * * * *
stfs-AFP1  MDAPARAAAA TAAAAKAAAEA TAAAAAKAAAA TK------AAR  37
wfs-F2     MDAPAKAAAA TAAAAKAAAEA TAAAAAKAAAA TKAG----AAR  39
aps-AFP2   MD-PAKAAAA TAAKAKADAEK TAAAAAKAAAD TAAKAAKAAAH  42


3.6.3 Regulatory elements

Analysis of the starry flounder AFP genes indicated that most of the known regulatory elements were also conserved. There are two promoter elements that have been identified for flounder AFPs. The first is the CAAT box, a cis-acting promoter-proximal sequence common to many eukaryotic genes [37]. It is perfectly conserved in the winter flounder liver isoforms, and was easily identified in stfl-AFP1 by alignment (Appendix A). This sequence has not been identified in the skin AFP genes [51]. The second promoter element is the TATA box, a core promoter sequence that usually lies between the CAAT box and the transcriptional start site [37]. There is a 1 bp substitution of this sequence in stfl-AFP1, changing the canonical TATAAAA to a moderately less efficient TACAAAA [100]. Gong et al. performed primer extension studies on skin AFP transcripts from winter flounder and established the location of two transcriptional start sites [51], but core promoter sequences for these genes have not been precisely defined. The same group tentatively identified a TFIID binding site 30 bases upstream of the transcriptional start site, which is perfectly conserved in stfs-AFP1 (Appendix B). To check for other potential promoter sites, 2.4 kb of sequence upstream of the stfs-AFP1 start codon was analysed using the Neural Network Promoter Prediction tool, version 2.2 [95]. The sites identified by Gong et al. scored highest (0.96 out of 1.0). Two other transcriptional start sites were also predicted, 112 bp downstream and 153 bp upstream of the correct sites, scoring 0.92 and 0.90, respectively.

A third regulatory element is an enhancer, Element B, which was identified in the intron of wfl-AFP6. It is bound by the liver-specific transcription factor CCAAT/enhancer binding protein α and a novel protein designated the antifreeze enhancer protein, and appears to be responsible for the integration of hormonal regulation and seasonal AFP expression [40]. It is present in the intron of wfl-AFP8 with a trinucleotide TAT insertion at position 16 of the usually 30-nucleotide element, but the entire sequence has been deleted from wfl-AFP9 (data not shown). This enhancer also appears to be wholly absent in stfl-AFP1 (Appendix A). In the skin isoforms, a ubiquitous enhancer, termed Element S, was identified in the intron of wfs-F2, and is thought to play a role in the broad expression of the skin-type AFPs [59]. A dinucleotide TA insertion in Element S abolishes liver specificity, but it is otherwise identical to Element B [59]. We identified an Element S-like sequence in stfs-AFP1, but there are two single nucleotide substitutions at positions 4 and 21 of the 32-base element (Appendix B); the effects of these changes are uncertain.

The fourth conserved regulatory element is the polyadenylation signal. There is high sequence identity in the 3' UTR between corresponding genes (100% between the liver genes; 98% between skin genes), and all polyadenylation signals are conserved. Two other alternative polyadenylation signals have been identified in the 3' flank of wfl-AFP6 [38, 39], and both are found in stfl-AFP1; the second contains a 1 bp substitution, but the third is exactly conserved (Appendix A).

3.7 The starry flounder has multiple variants of the liver and skin AFPs

Alignment of all of the liver and skin AFP genes from the starry and winter flounders indicated that the AFPs were homologous. The 3' UTR was highly conserved among all variants of all isoforms (Appendix C), as were the 5' UTR and signal peptide sequences among the liver isoforms in both fish (Appendix D). Thus, PCR was employed to sample the variation of AFP genes in the starry flounder genome and lambda library. PCR primers were designed to conserved regions identified from alignments of the starry flounder AFP genes with all available winter flounder AFP sequences, and are summarized in Table 1. The 5'int and 3'int primers are located at the 5' and 3' ends of the intron, respectively (Figure 6, Figure 10). They were chosen because their sequences were largely conserved across the liver, skin and hyperactive isoforms in both species, except for a few nucleotide substitutions. The 3'univ primer was designed to a portion of the 3' UTR conserved among all three AFP isoforms. When the 3'int and 3'univ primers were paired, the expected sizes of the fragments amplified from skin, liver, and hyperactive genes would be 185 bp, 275 bp and ~610 bp, respectively, as the lengths of the coding sequences are quite different. When the 5'int and 3'univ primers were used, a large deletion in the starry flounder liver gene was expected to allow for differentiation between the PCR products expected from the skin (1312 bp) and liver (703 bp) isoforms; sequence was not available for this portion of wfh-5a. Two 5' primers specific for the skin isoforms were also designed: allsk is conserved across winter flounder and starry flounder skin genes, whereas the stfsk sequence is only seen in the starry flounder. Band sizes expected from stfsk and allsk when paired with the 3'univ primer were 784 bp and 734 bp, respectively.
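The expected product sizes for the 3'int and 3'univ primer pair lend themselves to a simple lookup when sorting bands by isoform. The sketch below is illustrative only. The 10% tolerance is an arbitrary assumption standing in for the resolution of an agarose gel, and the function name is hypothetical.

```python
# Expected 3'int + 3'univ product sizes (bp) as given in the text.
EXPECTED_BP = {"skin": 185, "liver": 275, "hyperactive": 610}


def classify_amplicon(size_bp, tolerance=0.10):
    """Return the isoform whose expected size is within tolerance, else None."""
    for isoform, expected in EXPECTED_BP.items():
        if abs(size_bp - expected) <= tolerance * expected:
            return isoform
    return None  # anomalous size; such products would need sequencing to identify


for band in (190, 270, 600, 400):
    print(band, classify_amplicon(band))
```

A band that matches none of the expected sizes is returned as `None`, which corresponds to the anomalous products that were cloned and sequenced alongside the expected ones.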

Initial PCR reactions using the primers described above included some failed positive controls. Therefore, conditions were optimized for each primer pair using starry flounder genomic DNA as the template. Three different buffers were assessed (Qiagen KCl+(NH4)2SO4, Fermentas KCl, Fermentas (NH4)2SO4) as well as four different magnesium concentrations (1.5, 2.0, 2.5 and 3.0 mM MgCl2) and the effects of Q solution (Qiagen), a PCR additive designed to help amplify GC-rich templates. Calculated melting temperatures for all primers (Tm = 2(A+T) + 4(G+C)) were similar and reaction annealing temperature was set to ~54 °C to minimize the effects of single mismatches between the primers and potential priming sites. Optimized conditions were applied to the 25 phage stocks from the primary library screen; representative PCR results are shown in Figure 14. PCR products of both expected and anomalous sizes from the genomic optimization and phage screening experiments were cloned into plasmid vectors and sequenced. The template DNA for all these experiments, regardless of whether it was from the phage library or directly from genomic DNA, came from the same starry flounder liver (QCI #1).
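The melting temperature formula quoted above, Tm = 2(A+T) + 4(G+C), is straightforward to apply to any primer. The sketch below implements that formula as written. The function name is hypothetical and the example sequence is made up for illustration; it is not one of the primers in Table 1.

```python
def wallace_tm(primer):
    """Melting temperature in degrees Celsius from Tm = 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer contains characters other than A, C, G and T")
    return 2 * at + 4 * gc


# Hypothetical GC-rich 18-mer, not a primer from Table 1.
example = "GCTGCCAAGGCTGCTGCC"
print(example, wallace_tm(example), "degrees C")
```

This rule of thumb is generally applied to short oligonucleotides and takes no account of salt concentration or additives, which is one reason annealing temperatures are usually confirmed empirically, as was done here.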

Altogether, three liver variants were isolated, along with eight skin and two hyperactive variants (Table 3). For the purposes of this thesis, a variant is defined as an AFP that can be clearly categorized as one of the three isoforms, but contains one or more unique substitutions. Thus, "three liver variants" denotes "three type I AFPs of the liver isoform with minor changes in the amino acid sequence". Where necessary, clones were sequenced on both strands to ensure changes were not sequencing artifacts. Since the primers were designed to highly conserved regions and since both genomic DNA and primary phage stocks were sampled, we expect that we have obtained a fairly good representation of the sequence variety in this gene family.

As previously mentioned, two lambda clones had been plaque purified, one from screening with the liver probe and one with the hyperactive probe, but only the latter was sequenced. To determine whether the two inserts were highly similar and/or overlapping, two additional primers (5'up and 3'up, Table 1) were designed to a unique 1050 bp region of the lambda insert 1 kb upstream of the AFP genes. PCR with this primer pair as well as the primers described above produced bands of the expected sizes in the second unsequenced insert, and it was not considered further.

3.7.1 Liver variants

Variability was low among the liver variants. Fourteen clones were obtained from various sources and produced four unique DNA sequences that were greater than 99% identical.


Table 3. Type I AFP variants isolated from the starry flounder genome

All type I AFP amino acid sequences inferred from genomic cloning and PCR experiments are shown, with polymorphisms highlighted in grey. The sources listed for each variant denote unique DNA sequences obtained from either genomic clones (g#, see Appendices E, F, and H for full sequences) or primary phage stocks (code provided corresponds to Figure 14). Unknown sequence is marked with dashes (-).

ISOFORM      SOURCE                      SEQUENCE

Liver                                    1       10        20        30        40        50        60        70
Stfl-AFP1    g1, g2, H1, H7, L1, L2, L4  ISEANPDPAAKAAAVADPAAAAVAPAADAFSAAADTASDAAAAAAATAAAAKAAAEKTARDAAAAAAATARG
Stfl-AFP2    g3, L7, L8, H9, H13, H14    INEANPDPAAKAAAVADPAAAAVAPAADAFSAAADTASDAAAAAAATAAAAKAAAEKTARDAAAAAAATARG
Stfl-AFP3    L5                          INEANPDPAAKAAAVADPAAAAVAPAADAFSAAADTASDAAAAAAATAAAAKAVAEKTARDAAAAAAATARG

Skin                                     1       10        20        30        40
Stfs-AFP1    g4, H1, H7                  MDAPARAAAATAAAAKAAAEATAAAAAKAAAATKAAR
Stfs-AFP2    g5, g6, g7, L5, L7          MDAPARAAAATAAAAKAAAEATKAAAAKAAAATKAAR
Stfs-AFP3    g8                          MDAPARAAAATAAAAKAAAEATAAAAAKAAADTKAAAAAAAAL
Stfs-AFP4    g9                          MDAPAKAAAATAAAAKAAAEATAAAAAKAAAATKAGR
Stfs-AFP5    H5                          MDAPARAAAATAAAARATAEATEAAAAKAAAATKAAR
Stfs-AFP6    g10                         MDAPARAAAATAAAAKAATEATKAAAAKAAAATKAAR
Stfs-AFP7    H5                          MGAPARAAAATAAAAKAAAEATKAAAAKAAAATKAAR
Stfs-AFP8    H11                         MDAPAAAAAATAAAAKAAAEATAAAAAKAAAATKAAAAR

Hyperactive                              1       10        20        30        40        50        60        70
Stfh-AFP1    g11                                                                             ------ANAAAAAATAAA
                                         AAIAAEEAATAAATAAAAAAATAATAQAAIFDKAAAAASTTATTAATAAATIATTAAAAA------
Stfh-AFP2    g12, g13, H11               ITEAIDPAAQAAAAAAAAAAVVTAADAAAAAANAAANAAAVAAATAADVATASIATIKANAAAAAATAAA
                                         AAIAAEEAATAAATAAAAAAATAATAQAAIFDKAAAAASTTATTAATAAAATATTAAAAAAATETIDKAA
                                         AAAAAAAATAVATAAAAAATAAATAAATLGAAAAKAAATAVAAAAAAAIAAAAAAAAPP

Figure 14. Characterization by PCR of the AFP isoforms present in phage stocks isolated from the primary library screen

Both genomic DNA (g) and 25 phage plaques picked during the primary library screen were subjected to PCR, using either the 5'int or 3'int primer with the 3'univ primer (Figure 10), or in a nested reaction using the 2a and 3b primers followed by the 2b and 3b primers (Figure 6). The gel (top panel) shows the products obtained using the 5'int and 3'univ primers. The table beneath the gel indicates the presence (+) or absence (-) of bands of the expected size corresponding to liver (L), skin (S) and hyperactive (H) isoforms for the three primer combinations. N indicates that a plaque was not assessed and circles indicate that the product was subcloned and sequenced (Appendices E, F, and H, Table 3). Plaques are denoted H1 to H15 or L1 to L10 and were obtained using the Hyperactive AFP and Liver AFP probes from winter flounder, respectively. The negative (-) and positive (+) controls used no template DNA and the primary plaque from which insert #1 was isolated, respectively. The asterisk denotes a phage that produced no bands under these PCR conditions but produced several with the allsk and 3'univ primers (data not shown). PCR products were electrophoresed on a 1.5% agarose gel and size markers (M, in kb) are noted on the right. Expected sizes for the skin and liver isoforms are 1312 bp and 703 bp, respectively, based on the sequence of insert #1. The leftmost lane contains fragments amplified from starry flounder genomic DNA, several of which (g2, g3, g6, g8 and g9) were subcloned and sequenced (Table 3). Reactions were cycled under standard conditions, except for the magnesium concentration (2 mM) and 10x buffer (KCl only, Fermentas).

[Figure 14: agarose gel image with lanes for genomic DNA (g), plaques H1 to H15 and L1 to L10, the negative (-) and positive (+) controls and the asterisked phage; size markers (M) at 1.5, 1.2, 1.0 and 0.5 kb; labelled genomic bands g6, g9, g8 and g2,3; and a presence/absence table for the 5'int, 3'int and nested primer combinations.]

Three of the fourteen sequences were isolated from genomic DNA and eleven from different primary phage stocks. All were isolated using 3'univ as the 3' primer; 5'int was used to amplify four clones whereas 3'int was used for the other ten. There were a total of five nucleotide substitutions among all sequences (Appendix E). Two of these were each located in seven sequences, suggesting that they are real polymorphisms rather than sequencing artefacts; one was silent, while the other caused an S21N substitution in the signal peptide. This S21N substitution will likely have no effect on protein function or cleavage, as it is neither a drastic change in terms of size and hydrophobic character, nor in a critical position. The other three substitutions were observed in only one clone each: two occurred at different locations in the intron while the last resulted in an A73V substitution in the mature protein (Figure 11). These may be the result of mispairing during plasmid replication or base misincorporation during PCR, as Taq polymerase does not have 3' → 5' exonuclease activity. Because the A73V substitution in the coding region of variant 3 could be the result of such a mistake, it may be artifactual. Thus, it appears that there are at least two variants of the liver isoform. A helical wheel diagram combining the mature peptides of all three variants indicates that the ice-binding face of the helix is conserved (Figure 12A), thus supporting the argument that these proteins are functional.

3.7.2 Skin variants

A total of fourteen unique clones containing skin isoforms were obtained by PCR using assorted 5' primers coupled to the 3'univ primer. Seven sequences were isolated with 5'int, four with allsk and three with 3'int (Figure 6B). Of these, seven were isolated from genomic DNA whereas seven were found in the primary phage stocks. There was much more variation observed among the skin variants than among the liver AFPs, but identity remained high over the aligned regions (Appendix F). Of the eleven clones containing intronic sequence, four had large deletions of 502 – 550 bp in the middle of the intron, and therefore did not contain sequence to which the stfsk and allsk primers could bind. Two of these four sequences were also missing a poly(G) tract located approximately 100 – 120 bp from the end of the intron. All eleven sequences contained an Element S-like sequence that was identical aside from a T → A substitution at position 4. All but one sequence contained a C → G substitution at position 21 of the 32-base element (Appendix B, Appendix F).

Analysis of the translated sequences showed that there are eight skin AFP variants; the total length of the proteins was generally 37 residues, unless otherwise noted. Variant 1 was isolated from lambda insert #1 (Figure 9), and is described above relative to the winter flounder skin AFP. Variant 2 was identical to variant 1, except for an A23K substitution (Figure 13). The C-terminal sequence of variant 3 was much different from that of the other starry flounder skin AFPs, as it contained A32D and R37A substitutions, along with six extra residues at the C terminus (AAAAAL, 43 residues in total) not present in any other starry flounder skin AFP variant. C-terminal extensions may arise due to a mutation in the stop codon, leading to the translation of the 3' UTR. However, none of the possible reading frames following the stop codon could generate this sequence. Moreover, since the 3' UTR, including the complementary sequence for the 3'univ primer, was conserved and aligned well with all other starry flounder skin variants (page 5 of Appendix F), these extra C-terminal residues appear to be the result of an insertion. Variant 4 contained two substitutions, R6K and A36G. Variant 5 contained K16R, A18T and A23E substitutions. Variants 6 and 7 were identical to variant 2 but contained A19T and D2G substitutions, respectively. Finally, variant 8 was 39 residues long and included a two-residue Ala-Ala insertion towards the C-terminal end, along with an R6A substitution. Helical wheel diagrams indicated the frequency with which the various substitutions occurred around the helix (Figure 12); all variants shown in Figure 13 were included. The Thr residue at position 2 is from variant 5, and was observed in only one 11-amino acid repeat motif out of eight three-repeat variants.
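The substitution notation used above (for example A23K: Ala at position 23 replaced by Lys) can be reproduced by comparing each variant with variant 1 position by position. The following is a minimal sketch: the function name `list_substitutions` is hypothetical, the sequences are copied from Table 3, and only variants of the same 37-residue length are compared because no alignment step is included.

```python
from typing import List


def list_substitutions(reference: str, variant: str) -> List[str]:
    """Return substitutions such as 'A23K' between two equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences differ in length; align them first")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Sequences copied from Table 3 (37-residue skin variants only).
STFS_AFP1 = "MDAPARAAAATAAAAKAAAEATAAAAAKAAAATKAAR"
VARIANTS = {
    "Stfs-AFP2": "MDAPARAAAATAAAAKAAAEATKAAAAKAAAATKAAR",
    "Stfs-AFP4": "MDAPAKAAAATAAAAKAAAEATAAAAAKAAAATKAGR",
    "Stfs-AFP5": "MDAPARAAAATAAAARATAEATEAAAAKAAAATKAAR",
}

for name, seq in VARIANTS.items():
    print(name, list_substitutions(STFS_AFP1, seq))
```

Variants 3 and 8 differ in length from variant 1, so they would need to be aligned before a comparison of this kind is meaningful.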

The frequency with which each variant was encoded by a unique gene was also assessed. Variants 1 and 2 were isolated three and five times, respectively. Variant 1 was clearly encoded by more than one gene, as two sequences isolated using 5'int were of significantly different sizes (g4 and H1, Appendix F). Variant 2 is likely also encoded by at least two genes, as one sequence (g7) contains two silent single nucleotide substitutions in the coding region. Aside from these two silent substitutions, the five sequences coding for variant 2 contained six unique single substitutions scattered in the intron and one in a non-coding portion of exon 2. Variants 3 through 8 were each isolated only once, but the number of changes at the nucleotide level coupled with the fact that many of these changes were shared with other sequences suggested that these variants are real. The only exception was variant 6, where the DNA sequence (g10) was identical to one encoding variant 2 (g5) over 686 bp, except for a unique C → T transition near a poly(G) tract and a unique G → A transition (Appendix F) that resulted in an A19T substitution. Thus, there are at least seven variants of the skin isoform in the starry flounder genome.

A comparison of the starry flounder skin AFPs to those of the winter flounder revealed that the same residues tend to show variability in both species. For example, Lys, Ala or Arg are seen in both species at position 6 (Figure 13), but at different frequencies. Similarly, both Ala and Lys are found at position 23 for both species, with an additional substitution (Glu) in stfs-AFP5. There were also differences in terms of protein length, with all insertions and deletions confined to the C-terminal portion of the proteins. Most of the winter flounder skin AFPs were 39 residues long, whereas most of the starry flounder skin AFPs were 37 residues long. All of the ice-binding residues and the 11-amino acid repeat motifs were conserved between all isoforms of both species, except in variant 6 as noted above. Helical wheel diagrams including all variants from Figure 13 show clear conservation of the ice-binding face in both the starry and winter flounders (Figure 15), suggesting that the starry flounder skin AFPs are functional.
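To illustrate why an 11-residue repeat keeps equivalent residues on one face of the helix, each residue can be assigned an angle on a helical wheel. The sketch below assumes an ideal alpha-helix with 100° of rotation per residue (3.6 residues per turn); this is a textbook idealization and not necessarily the geometry used to draw Figure 15, and the function name is hypothetical.

```python
def wheel_angle(index: int, degrees_per_residue: float = 100.0) -> float:
    """Angular position (0 to 360 degrees) of a residue on a helical wheel.

    `index` counts residues from 0 at the start of the helix.
    """
    return (index * degrees_per_residue) % 360.0


# Residues spaced one 11-residue repeat apart: each repeat shifts the
# position by only 20 degrees, so equivalent residues stay on one face.
for i in (0, 11, 22):
    print(i, wheel_angle(i))
```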

3.8 The American plaice has multiple variants of the skin AFPs

The American plaice was found early on to have variable levels of plasma TH activity but did not show evidence of a large AFP gene dosage when its genomic DNA was probed with a cDNA of the flounder liver AFP sequences [62, 73]. Subsequently, it was realized that the lone source of its plasma TH activity was a large thermolabile hyperactive isoform [71]. Chronologically, this isoform was actually first discovered in the American plaice, but to maximize the impact of the discovery it was initially reported in the winter flounder [63].

Having clarified the source of the antifreeze activity in the plasma, we wanted to know if the American plaice had skin AFPs. Given the success of the PCR experiments on starry flounder DNA, the skin-specific primer allsk was paired with the 3'univ primer and applied to American plaice genomic DNA. Standard conditions were used, with the following modifications: (NH4)2SO4 buffer (Fermentas), 1.5 mM MgCl2, annealing temperature of 62 °C, 30 cycles. PCR products were TA cloned, and nineteen of the twenty clones selected were sequenced. These results were obtained only weeks prior to submission of this thesis and thus, exhaustive analysis cannot be reported here. However, all products were quite similar, with a total of 37 nucleotide substitutions in the ~760 bp long sequences (Appendix G). Only 10 of


Figure 15. Helical wheel diagrams of skin AFPs from starry flounder, winter flounder and American plaice

The shaded areas indicate the frequency of naturally-occurring amino acid substitutions in all known starry flounder (A), winter flounder (B), and American plaice (C) skin AFPs, and the black arcs indicate the residues that make up the ice-binding face. The diagrams omit residues prior to and including the first Pro (Figure 13) because Pro is a known helix breaker and because these residues are thought to form an N-terminal cap structure that arches away from the ice-binding face. Also omitted is the terminal residue, which is thought to curl away from the ice-binding surface (L. Graham, personal communication).

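[Figure 15: helical wheel diagrams for (A) starry flounder, (B) winter flounder and (C) American plaice skin AFPs; residue labels appearing in the figure: Gly, Glu, Lys, Arg, Thr, Asp, Ala.]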

these substitutions (37%) were unique whereas the rest were present in multiple clones, and there were no major insertions or deletions in any of the clones relative to the rest. A sequence similar to the Element S enhancer [59] was present, with one T → C substitution present at position 16 in all clones and a C → T substitution at position 17 in fifteen out of nineteen clones. The effects of these substitutions in this context remain to be determined. Use of the 3'univ primer precluded conservation analysis of the 3' UTR, but the eight nucleotides present between the end of the coding region and the start of the primer were identical in all clones.

Overall, we sequenced seventeen unique genes encoding four different proteins. The proteins are identical aside from small insertions in the C-terminal region and a terminal Pro in aps-AFP4 in place of the His seen in the other variants (Figure 13). Residues 35 – 37 have been deleted from aps-AFP2, relative to aps-AFP1, but this does not affect the ice-binding face. Similarly, residues 38 – 40 have been deleted from aps-AFP4 and this also does not affect the binding face. The last variant, aps-AFP3, contains a six-residue deletion (positions 35 – 40 in aps-AFP1). A helical wheel diagram combining the sequences of all four American plaice skin variants was constructed (Figure 15C), and there is clear conservation of the ice-binding face, suggesting that these proteins are functional. Comparison of a representative skin variant from each of the winter flounder, starry flounder and American plaice indicates that the ice-binding face is conserved in spite of a trend of higher C-terminal variation across species (Figure 13).

3.9 The starry flounder has multiple variants of the hyperactive AFPs

Alignments used to design the 5'int, 3'int, and 3'univ PCR primers included the intron of the wfh-5a “pseudogene” and the 3' UTRs of both wfh-5a and wfh-AFP1. However, no bands large enough to contain a hyperactive isoform were observed after optimization of PCR reaction conditions on starry flounder genomic DNA. To improve the chances of cloning sections that code for the hyperactive AFP, four new primers were developed. Primers 2 and 3 were designed in unique and relatively Ala-poor regions of wfh-AFP1 (Table 1), while primers 1 and 4 were based on 3'int and 3'univ, respectively, but shifted inwards to include unique sequence at the ends of the second wfh-AFP1 exon (Figure 6C). PCR conditions for these primers were optimized on starry flounder genomic DNA. Four magnesium concentrations were assessed (1.5, 2.0, 2.5, and 3.0 mM MgCl2) and an annealing temperature of 53 °C was used, a temperature 7 – 9 °C below the calculated Tm, in order to encourage priming. Two successive rounds of amplification generated a large number of bands, but only two conditions produced bands of the expected size (261 bp). Sequencing of three TA clones made from these PCR products confirmed that they were of the hyperactive type (Appendix H). These clones were highly similar, with only four single nucleotide substitutions. One clone contained two silent nucleotide substitutions while another contained two nucleotide substitutions that resulted in two amino acid substitutions (Table 3). Codon usage for Ala reflected typical type I AFP frequencies (Table 2). Alignment of one sequence, stfh-AFP2, with wfh-AFP1 and wfh-5a showed moderately high identity (87% and 83%, respectively); there was 82% identity between the two winter flounder genes (Appendix I). Many of these changes led to alterations in the protein sequence, relative to the expressed winter flounder AFP (Figure 16). In the winter flounder hyperactive isoform, the 11-amino acid repeat motif is not as well-conserved as it is in the liver and skin isoforms, and the same trend was observed for the starry flounder sequence, indicated in Figure 16. However, the ice-binding face has only been modelled to date [65] and it is not possible at this time to say if these amino acid substitutions in the starry flounder AFP would affect antifreeze activity.


Figure 16. Protein alignment of hyperactive AFP variants from starry flounder and winter flounder

The starry flounder hyperactive (stfh) AFPs are shown, along with the winter flounder hyperactive AFP and the putative pseudogene wfh-5a. The starry flounder sequence stfh-AFP2 was obtained from insert #2 (Figure 17). The end of the signal peptide is italicized, but the rest of the signal peptide sequence was not present in the insert and is unknown. The sequence of stfh-AFP1 is a partial hyperactive variant isolated from genomic DNA via PCR. Putative ice-binding residues are marked with an asterisk [65] and the number of residues is noted at the end of each row. Genbank accession numbers for wfh-AFP1 and wfh-5a are EU188795 and M63477, respectively.

* * * *
wfh-AFP1   ITEA NIDPAARAAAAAAASKAAVTAADAAAAAATIAASAASVAAATAADD  50
wfh-5a     ITEA -IDPAAKAAAAAAAATAVVTAAAAAAAAAAIAATAAAVAGATAADA  49
stfh-AFP2  ITEA -IDPAAQAAAAAAAAAAVVTAADAAAAAANAAANAAAVAAATAADV  49

* * * * *
wfh-AFP1   AAASIATINAASAAAKSIAAAAAMAAKDTAAAAASAAAAAVASAAKALET  100
wfh-5a     AAASIASINANTAAAAAIAAAAAKAAEEAAATAAAAAATTAATAATAQAT  99
stfh-AFP2  ATASIATIKANAAAAAATAAAAAIAAEEAATAAATAAAAAAATAATAQAA  99
stfh-AFP1           ANAAAAAATAAAAAIAAEEAATAAATAAAAAAATAATAQAA  42

* * * *
wfh-AFP1   INVKAAYAAATTANTAAAAAAATATTAAAAAAAKATIDNAAAAKAAAVAT  150
wfh-5a     IKDKAAAAAASTATNAAAAAAATATTAAAAAVAKTTIDKAAAAVAVAAAT  149
stfh-AFP2  IFDKAAAAASTTATTAATAAAATATTAAAAAAATETIDKAAAAAAAAAAT  149
stfh-AFP1  IFDKAAAAASTTATTAATAAATIATTAAAAA                     72

* * * *
wfh-AFP1   AVSDAAATAATAAAVAAATLEAAAAKAAATAVSAA-AAAAAAAIAFAAAP  199
wfh-5a     AVAAAAATAATAAATAAATLGAATVKAAATAVNAAAAAAAATAAAAAAPP  199
stfh-AFP2  AVATAAAAAATAAATAAATLGAAAAKAAATAVAAAAAAAIAAAAAAAAPP  199

The small portion of the hyperactive isoform that was isolated indicated that the gene was quite divergent from the winter flounder sequence, and this was likely the cause behind the difficulty in isolating it using winter flounder-based primers. Using this new sequence, five new primers specific to the starry flounder hyperactive isoform were designed. Primer 2a was based on primer 2, but was shifted downstream by nine bases, so that the 3' end of the primer would be specific to the starry flounder sequence. Primers 3a1 and 3a2 were based on primer 3, but were shifted downstream by three and nine nucleotides, respectively, for the same reason. Primers 2b and 3b were designed to regions that were unique to the starry flounder hyperactive gene. PCR conditions were again optimized for these primers, using three magnesium concentrations (1.5, 2.0, and 2.5 mM MgCl2) as well as six annealing temperatures (52.2, 54.0, 56.0, 58.0, 59.9, 61.6 °C), so as to maximize specific binding. Using optimal conditions (2.0 mM MgCl2 and 58.0 °C), two successive rounds of PCR were performed using first the 2a and 3b primers, then nesting the reaction products with the 2b and 3b primers. A single band of the expected size was isolated from plaque H11. TA cloning and sequencing of this 148 bp product confirmed that it was identical to positions 59 – 116 of the three genomic clones previously examined (Appendix H).

3.10 Genes for starry flounder hyperactive and skin AFPs are closely linked

Having confirmed the presence of a hyperactive isoform in the H11 primary plaque via PCR analysis, this phage was plaque purified as described above and sequenced using the 454 method (Genome Quebec). Three gaps (~500 bp) remained following assembly using DNAMAN, and preliminary analysis indicated that the unresolved regions were highly repetitive. Repetitive and highly similar sequence data are not amenable to the 454 method due to the short length of the 454-generated sequence reads (average 218 bp), and Genome Quebec attempted unsuccessfully to sequence through these gaps. Therefore, primers were designed to the ends of available sequence in order to PCR-amplify the regions and resolve the gaps via conventional methods. No optimization was performed, and reactions were carried out on the lambda DNA under standard conditions, with the following modifications: (NH4)2SO4 buffer (Fermentas), 1.5 mM MgCl2, annealing temperature of 55 °C, 3 minute extension step, 30 cycles. One 414 bp gap was resolved with the primers Right arm and 3'univ (Table 1), whereas the other two gaps (21 bp and 63 bp) were resolved using 3'int with Contig1#1 and maxisfstop with con2end2#1, respectively.

This phage insert (insert #2) was 18024 bp in length and contained a complete gene for a skin variant and exon 2 of a hyperactive variant (Figure 17). The 414 bp gap in the GC-rich interior of the hyperactive exon was presumably not amplified at all using the 454 method, as it was not once represented in the approximately 12000 reads (~2.6 Mb) sequenced. As well, the short sequence reads could not be used to reliably assemble the sequence immediately downstream of the two genes, as these regions contained (GT)n repeats and were highly similar, with 98% identity over 661 bp, after exclusion of a 44 bp gap (data not shown). The two genes in this insert were oriented in the same direction and separated by 5.6 kb of intergenic sequence (Figure 18). Based on the PCR experiments, we were expecting one of the hyperactive variants and stfs-AFP8; we obtained stfh-AFP2 and stfs-AFP8, and no other AFP genes were present in the insert. These genes were located within a 7.6 kb stretch at one end of the insert.

As this sequence was received only a few weeks prior to the submission of this thesis, exhaustive analysis has not yet been performed. The sequence between the two genes as well as that downstream of the skin gene were used to query Genbank nucleotide databases in search of


Figure 17. Schematic diagram of starry flounder genomic DNA insert #2

The 18024 bp insert is indicated by the solid line and the lambda arms by the dotted lines. Predicted exons of the two starry flounder AFP genes are shown by the thicker grey boxes. Note that the short (23 bp) non-coding 5' exon of the skin AFP appears as a solid line at this scale and that the 5' region of the hyperactive gene is missing as the insert starts within the intron. The diamond-topped lines indicate EcoRI cut sites. The four underlined regions indicate regions outside the AFP genes that are similar to known sequences in the Genbank database: (A) 100 bp segment, beginning at base 2838, shows 77 to 92% identity to the D8 domain of the 28S rRNA gene in several fish, (B) and (C) 217 bp and 178 bp segments at positions 3998 and 5073 respectively, with 79% identity to the same non-coding sequence in the 5' flank of wfl-AFP6, (D) 393 bp segment beginning at base 8265 with 82% identity to a non-coding sequence in a cluster of MHC class II antigen genes in the three-spined stickleback Gasterosteus aculeatus.

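[Figure 17: schematic of insert #2 (positions 1 to 18024) between the lambda arms (λR, λL), showing the hyperactive and skin genes, EcoRI sites at 5388, 5910 and 17324, matched regions A to D, and a 1 kb scale bar.]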

Figure 18. Schematic diagram of the AFP gene organization in genomic DNA insert #2

The portion of the insert from Figure 17 containing the two AFP genes is shown with intronic and intergenic sequences represented by black lines and exons by boxes in which the translated portions are indicated with shading. The 5.6 kb of intergenic sequence between the AFP genes was shortened for clarity with the break marked by diagonal lines. A putative TFIID binding site is marked by a grey vertical arrow and a putative enhancer sequence is denoted with an asterisk.

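[Figure 18: schematic of the hyperactive and skin AFP genes in insert #2, with the 3'int, 5'int and 3'univ primer sites, labelled lengths of 5.6 kb and 0.8 kb, a 250 bp scale bar, and a legend for promoter, untranslated, translated and enhancer (*) regions.]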

other genes, and preliminary results showed four regions with significant identity to other genes or gene segments. Two matches in the intergenic region were 178 bp (match C in Figure 17) and 217 bp (match B in Figure 17) in length, and both showed 79% identity to a non-coding region upstream of the main winter flounder liver isoform, wfl-AFP6 (respective E-values 5 x 10^-31 and 3 x 10^-40). Another stretch of approximately 100 bp in the intergenic region showed 77 – 92% identity to domain 8 of the 28S rRNA gene in several fish; the minimum E-value was 1 x 10^-14 (match A in Figure 17). Finally, a 393 bp region approximately 650 bp downstream of the skin gene aligned with 82% identity to a non-coding sequence in a cluster of MHC class II antigen genes (E-value 3 x 10^-104) (match D in Figure 17). Other matches aligned with highly repetitive (GT)n and (CA)n repeats and were not considered further. Restriction analysis revealed three EcoRI sites starting at positions 5388, 5910 and 17324 (Figure 17). The latter two sites produce an 11.4 kb fragment, which presents as a faint band on all Southern blots presented here (Figure 8). No SacI sites were present in the insert, indicating that these two genes are likely not included in the predicted tandem array.
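The 11.4 kb fragment size follows directly from the site positions given above. As a rough sketch, the fragment lengths can be computed by treating each reported site position as the cut point; this is a simplifying assumption that ignores the exact cleavage position within the recognition sequence, and the function name is hypothetical.

```python
from typing import List


def fragment_sizes(length: int, cut_sites: List[int]) -> List[int]:
    """Sizes of the linear fragments produced by cutting at the given positions."""
    bounds = [0] + sorted(cut_sites) + [length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


INSERT_LENGTH = 18024               # insert #2, from the text
ECORI_SITES = [5388, 5910, 17324]   # site start positions, from the text

sizes = fragment_sizes(INSERT_LENGTH, ECORI_SITES)
print(sizes)  # the third value (17324 - 5910) is the ~11.4 kb fragment
```

Only the two internal fragments correspond to genomic EcoRI fragments; the first and last values end at the insert boundaries, so their sizes in a genomic digest depend on flanking sequence that is not part of the insert.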

3.11 Regulatory elements are conserved in stfs-AFP8

Like the stfs-AFP1 gene, stfs-AFP8 is also quite similar to the winter flounder skin genes. Alignment with wfs-11-3 showed 96% identity in the first exon and 96% in the second, with 95% identity in the intron (68% including gaps) (Appendix J). The stfs-AFP8 intron was 835 bp in length, compared to the 794 bp wfs-11-3 intron and the 1212 bp stfs-AFP1 intron, and contained both a 96 bp deletion and a 149 bp insertion relative to wfs-11-3. Target sequence for the stfsk and allsk primers was not present within the intron of stfs-AFP8, but the Element S enhancer was conserved except for two substitutions at positions 4 and 21. The coding regions were 94% identical, and of the seven substitutions present, five were silent. The 5' and 3' UTRs showed a comparable degree of identity, 97% and 99% respectively, and the polyadenylation signal and putative TFIID binding site were conserved perfectly.
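The two intron figures (95% versus 68% including gaps) differ because of how gapped alignment columns are treated. The thesis does not state the exact calculation, so the sketch below assumes that "including gaps" means gap columns are counted as mismatches in the denominator; the function name and the toy alignment are hypothetical and are not taken from Appendix J.

```python
def percent_identity(a: str, b: str, include_gaps: bool = False) -> float:
    """Percent identity between two aligned sequences ('-' marks a gap).

    With include_gaps=False, columns containing a gap are ignored.
    With include_gaps=True, they are kept and counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(a, b))
    if not include_gaps:
        columns = [(x, y) for x, y in columns if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical, for illustration only).
seq_a = "ACGTACGT--AC"
seq_b = "ACGTTCGTGGAC"
print(percent_identity(seq_a, seq_b))                     # gap columns excluded
print(percent_identity(seq_a, seq_b, include_gaps=True))  # gap columns counted
```

A large insertion or deletion, such as those described for the stfs-AFP8 intron, lowers the gap-inclusive value sharply while leaving the gap-free value almost unchanged.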

3.12 Starry flounder hyperactive AFP is homologous to its winter flounder counterpart

Isolation of a substantial portion of the starry flounder hyperactive gene and its alignment with the corresponding portion of the winter flounder gene marked the first interspecific comparison for this isoform. The choice of the maxi5'mid primer (Table 1) initially used to search for this isoform was fortunate, as the sequence is conserved perfectly in the starry flounder gene. Some of the initial difficulty in isolating this gene may have arisen from the choice of the maxi3'mid primer, which contains three mismatches in the 3' half of the 21-bp primer (Appendix I). Over the length of exon 2, there was 83% identity (82% including gaps), a noticeable drop from the same comparison in the other isoforms (92% for liver and 96 – 97% for skin). Within the coding region, the sequences were 81% identical (80% including gaps). There are no major insertions or deletions, but it is interesting to note that a third of the substitutions in the coding region are silent. There was high conservation of the 3' UTR, with 92% identity and complete conservation of the polyadenylation signal. The codon bias for Ala is typical of type I AFPs, with GCC being most common, followed by GCA and GCT (Table 2). Because sequence from both the wfh-AFP1 intron and the first exon of stfh-AFP2 are currently unavailable, no other analysis could be performed for regulatory elements or gene structure. However, it appears that the end of a signal peptide has been conserved in stfh-AFP2, and the gene structure is expected to be conserved as well. The predicted product of the stfh-AFP2 exon contained many substitutions compared to its winter flounder homologue; there was 75% identity between the proteins. As with the winter flounder hyperactive isoform, the 11-amino acid repeat motif is less well-defined (Figure 16) than in the smaller isoforms, but the implications of this have yet to be determined.

3.13 Starry flounder-specific probes do not alter Southern blot banding patterns

Because the PCR-isolated starry flounder hyperactive gene fragment was significantly different from that of the winter flounder, it was used to reprobe the same blot. The banding pattern in the winter flounder lanes was similar to that seen with the winter flounder hyperactive probe, except for an increase in the intensity of some of the larger bands in the SacI digest (3.7, 4.8 and >10 kb) (Appendix K). The relative intensities in the BamHI digest remained the same as before. In the starry flounder lanes, the banding pattern did not change significantly; the 3.5 kb and 4.7 kb bands in the SacI digest still hybridized strongly, as did the 7.8 kb band in the EcoRI digest. Differences between fish were the same as described above – there was a relatively weak signal from fish 4 (EcoRI digest), possibly caused by partial degradation or variable signal strength due to intrapopulational polymorphism. It was not possible to draw any conclusions based on relative band intensities due to multiple rounds of stripping and probing of the blot, differences in the washing conditions, and lack of densitometric analysis, all of which led to increased background hybridization.

Few Southern blots have been probed with the type I skin AFPs. With the ready availability of a starry flounder skin probe from the abovementioned PCR experiments, the

Southern blot was stripped for a fourth time and re-probed. The signal in the winter flounder lanes was much weaker than previously observed with any probe (Figure 8C) despite a strong

71 signal in the starry flounder lanes and high identity between the skin isoforms of the two species

(Appendix B). This may have been due to heavy use of the blot, and these signals must be confirmed on a fresh blot. The band intensities in the starry flounder lanes were significantly stronger than observed previously, and the 4.7 kb Sac I and 7.8 kb Eco RI bands were saturated on the film (Figure 8C). In the Sac I digest, weak bands are still visible at 3.5 kb, but bands that were barely visible in the previous blots now hybridize clearly at 1.5 kb; similarly, previously weak bands are still visible at 4.2 kb in the Eco RI digest, but others at 2.5 kb intensify significantly.

The increased intensity of the previously observed weak bands is likely due to different wash conditions, but the bands appearing at 1.5 kb (Sac I) and 2.5 kb (Eco RI) may contain a cluster of skin genes. In spite of the poor quality of the blot, the fact that some of the same bands hybridize strongly to skin probes as well as liver and hyperactive probes provides further evidence of close linkage of the skin AFP genes with the other two isoforms.


Chapter 4 Discussion

The first correlation between AFP gene dosage and the degree of exposure to icy seawater was made with the ocean pout [88]. Fish sampled from two discrete populations off the New Brunswick and Newfoundland coasts had very different mid-winter plasma AFP levels, with the former being 2 – 3 mg/mL and the latter 20 – 25 mg/mL [101]. Despite this order of magnitude difference in AFP concentrations, the two populations showed similar AFP isoform complexity in their plasma, with approximately twelve different types resolved by reverse-phase HPLC. These differences in plasma AFP levels correlated with type III AFP gene dosage: the more southerly New Brunswick population had 30 – 40 copies of the AFP gene and the Newfoundland population had ~150 copies [88]. However, as only two locations were sampled, the correlation between AFP gene amplification and selection by a harsh environment was tenuous. Similar but less spectacular results were seen between shallow- and deep-water populations of winter flounder in the Canadian Maritimes [89], but results were confounded by contributing factors besides habitat depth, such as latitude and exposure to different currents, so again, no solid conclusions could be drawn.

Therefore, a better system was needed to investigate the influence of environment on AFP gene amplification. The starry flounder was chosen because it inhabits a long continuous stretch of the Pacific coast spanning a wide range of latitudes from California to the Arctic [90]. Starry flounder are one of the few species of Pleuronectidae in which fish can be right-eyed or left-eyed, and the proportion of left-eyed individuals varies clinally from approximately 50% in California to almost 100% in Japan [102]. Such graded variation suggests that populations experience limited mixing over the range. Initial samples were collected in the late 1980s and genomic Southern blots were probed with wfl-AFP6 (Figure 5). The results looked promising, as there were large increases in signal strength between fish from California, BC and Alaska. However, with a limited sample size (only two Alaskan fish), more individuals and additional sample sites were required to establish a trend. A collection of starry flounder tissues was assembled in the years since the initial blot, but preparation of genomic DNAs from those tissues in the experiments described here revealed that the new Alaska samples were too degraded to be used for genomic Southern blots. DNA prepared from the QCI fish was of high quality, and Southern blots showed that their hybridization patterns were very similar to those of fish from Vancouver (Figure 8A vs. Figure 5).

When the original starry flounder Southern blot was probed with the winter flounder liver isoform cDNA, it was assumed that the observed signal measured the one and only set of AFP genes present in the genome. The analysis of gene dosage in this manner has become more complex since this blot was done, due to the subsequent discoveries of the skin [50, 51] and hyperactive isoforms [63, 65]. Knowing the extent to which these AFP probes cross-hybridize to other isoforms is vital for the accuracy of gene dosage assessment. Thus, we chose to study the contribution of each isoform to the AFP gene signals in starry flounder before making further attempts to broaden the collection of samples.

The signal intensity observed in each of the four QCI starry flounder DNAs with the winter flounder liver isoform probe was comparable to that seen in the winter flounder controls. The winter flounder signal is known to represent 30 – 40 copies of this isoform, suggesting that the QCI fish have at least this many AFP genes. This is consistent with the 1990 blot, where, in the BC sample, what appear to be single-copy gene signals are seen together with intense bands of hybridization that could easily contain 20 – 30 gene copies. The fact that the QCI AFP gene signals are largely concentrated in a 4.7 kb Sac I fragment and in much larger Eco RI fragments strongly suggests that the genes are present in tandem repeats. In the winter flounder, a single liver AFP gene is present in a tandemly repeated 7.8 kb Bam HI fragment that lacks an Eco RI cut site [32]. Therefore, Eco RI and other restriction enzymes that do not cut within the amplicon produce uncharacteristically long tracts of genomic DNA containing numerous tandem repeats. Such repeats might also occur in the starry flounder genome. Although it is difficult to estimate the length of the large Eco RI fragments near the top of the gel, they appear large enough to contain several of the smaller Sac I fragments. Migration distances in this area of the gel are considerably compressed, such that 1 mm of migration may translate to a difference in fragment length of 1 × 10^4 or 1 × 10^5 bp. As well, it is difficult to avoid shearing DNA of this size during extraction, so the fragments observed on the gel may be much shorter than the corresponding tracts in the genome.
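The compression of migration distances near the top of the gel can be illustrated with a minimal sketch, assuming the usual semi-log approximation in which log10 of fragment length falls linearly with migration distance. The marker ladder and the function names below are hypothetical and are not measurements from the blots in this study. Real agarose gels depart from the semi-log relationship for very large fragments, so the sketch is only a rough guide to why a 1 mm shift near the wells corresponds to a much larger change in fragment length than the same shift further down the gel.

```python
import math

# Hypothetical marker ladder: (fragment length in bp, migration distance in mm).
# These values are invented for illustration, not measured from any blot.
LADDER = [
    (23000, 18.0),
    (9400, 26.0),
    (6600, 30.0),
    (4400, 35.0),
    (2300, 42.0),
    (2000, 44.0),
]


def fit_semilog(ladder):
    """Least-squares fit of log10(length) = intercept + slope * distance."""
    xs = [distance for _, distance in ladder]
    ys = [math.log10(length) for length, _ in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def size_at(distance_mm, slope, intercept):
    """Estimated fragment length (bp) at a given migration distance."""
    return 10 ** (intercept + slope * distance_mm)


def bp_per_mm(distance_mm, slope, intercept):
    """Length difference (bp) between two bands 1 mm apart, moving toward the wells."""
    nearer_well = size_at(distance_mm - 1.0, slope, intercept)
    return nearer_well - size_at(distance_mm, slope, intercept)


slope, intercept = fit_semilog(LADDER)
for distance in (15.0, 25.0, 40.0):
    print(f"{distance:.0f} mm: ~{size_at(distance, slope, intercept):.0f} bp, "
          f"{bp_per_mm(distance, slope, intercept):.0f} bp per mm")
```

Under this model the length difference per millimetre is proportional to the fragment length itself, which is why size estimates for the largest Eco RI fragments carry the greatest uncertainty.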

The stripped and re-probed Southern blot of the QCI DNAs strongly suggests that genes for the hyperactive isoform are present in the same 4.7 kb Sac I fragment observed with the liver probe, as well as a smaller 3.5 kb Sac I fragment to which the liver probe hybridized weakly. Consistent with this suggestion is the shift in the hybridization signal to larger fragments in the Eco RI digests, some of which are clearly the same as those detected with the liver probe.

Another interpretation of these results could be that the probes cross-hybridize. We suggest this is unlikely because, while both the liver and hyperactive probes are GC-rich (63% and 72%, respectively), dot matrix comparisons (allowing 2 mismatches per 20 bp) show that only a few small segments between them are well matched. As well, the hyperactive probe contains only mature sequence and a 15 bp stretch of 3' UTR that is known to be conserved in the liver probe. A more obvious argument against cross-hybridization is that the two probes detect completely different restriction fragments in the winter flounder control DNAs. This result is consistent with what is known about the organization of AFP genes in winter flounder. The 7.8 kb Bam HI repeats, which give rise to the 3 kb Sac I fragments, contain only the liver isoform [32]. This is the first time winter flounder DNA has been probed with the hyperactive AFP, and the intensity of several different-sized bands in the hybridization signal was surprising. As well, the sequencing of one winter flounder control showed two single nucleotide substitutions compared to wfh-AFP1. Together, these results indicate that this isoform is also encoded by multiple genes.
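The kind of probe comparison used above to argue against cross-hybridization can be sketched as follows. This is a minimal illustration, assuming a simple sliding-window dot matrix in which a dot is recorded whenever two 20 bp windows differ at no more than 2 positions, together with a GC-content calculation. The function names and the toy sequences are hypothetical; the actual probe sequences and the software used for the comparison in this work are not reproduced here.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def dot_matrix(seq_a, seq_b, window=20, max_mismatches=2):
    """Return (i, j) window start positions that count as dots.

    A dot is recorded when the window starting at i in seq_a and the
    window starting at j in seq_b differ at no more than max_mismatches
    positions.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    hits = []
    for i in range(len(seq_a) - window + 1):
        window_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            window_b = seq_b[j:j + window]
            mismatches = sum(a != b for a, b in zip(window_a, window_b))
            if mismatches <= max_mismatches:
                hits.append((i, j))
    return hits


# Toy sequences invented for illustration only (not the probes used here).
probe_a = "GCCGCCGCTGCAGCCGCCACCGCAGCCGCTAAGGCC"
probe_b = "ATGGCTGCAGCCGCCACTGCAGCCAAGGCCGCCGCA"

print(f"GC content: {gc_percent(probe_a):.0f}% and {gc_percent(probe_b):.0f}%")
print(f"Windows matched: {len(dot_matrix(probe_a, probe_b))}")
```

A comparison of this type reports few dots when two sequences share base composition but not sequence order, which is the distinction drawn above between GC-richness and true sequence similarity.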

The signal obtained from probing the blot a third time with the skin isoform gave a hybridization pattern that appeared to be a combination of the two preceding blots. Here, cross-hybridization between the skin and liver AFP probes is more likely because the genes are similarly sized and because larger sections of the probes are conserved between isoforms. For example, the liver probe ends with a portion of the 3' UTR that is highly conserved across all three isoforms (positions 1 – 18, identical to wfl-AFP6 in Appendix C), while the skin probe ends with a similar segment that is twice as long (positions 1 – 34, identical to stfs-AFP8 in Appendix C). As well, the skin probe begins with a 31 bp sequence corresponding to the end of the intron, a region that is also highly conserved across all three isoforms (data not shown). Together, these two regions make up almost 34% of the total length of the skin probe. Nevertheless, the intensity of the signal and the persistence of certain bands in all blots strongly suggest that the skin, liver and hyperactive isoforms are closely linked in the same restriction fragments in the starry flounder genome.


Our attempt to follow up on these results via genomic cloning was only partially successful. The first lambda clone, which was selected with the hyperactive probe, produced insert #1 and was sequenced in its entirety. It contained both a liver and a skin isoform in close proximity, which is consistent with the idea that AFP genes are closely linked in the starry flounder. However, the 4.7 kb Sac I fragment was not present in the insert. The second lambda clone that was sequenced, insert #2, was similar in this respect. It was isolated using a starry flounder hyperactive probe and contained part of a hyperactive AFP gene and a different skin variant, but was lacking a 4.7 kb Sac I amplicon. These results led to suspicions that these repeats might be incompatible with phage amplification and growth. This has been observed before, particularly with the AFP complement in the wolffish (Anarhichas lupus) [103], and at that time, extraordinary measures were required to obtain genomic clones of the repeated sequences. To check on this possibility for the starry flounder, another member of the lab prepared several crude phage cultures from phage plaques picked during the primary library screen. Three of the plaques had been isolated using the winter flounder hyperactive probe and three had been isolated using the liver probe. Southern blots revealed that none of these clones included the 4.7 kb Sac I fragment, but instead appeared to contain the ends of tandem arrays or linked but irregularly spaced AFP genes (S. Gauthier, personal communication).

Previous work by the Davies research group has found that it is relatively easy to obtain clones of linked but irregularly spaced AFP genes from species that have a tandemly repeated gene cluster without ever isolating clones that contain the entire tandem array. This has occurred while characterizing AFP genes from the wolffish [103] and the winter flounder [31]. In the latter case, the AFP genes were missed from an initial winter flounder genomic library because it was made with Eco RI-digested genomic DNA. Since most of the liver-specific AFP genes in winter flounder are in tandem repeats that do not contain an Eco RI restriction site, the fragments generated were far too big to be cloned into lambda phage. Lambda clones that were successfully purified from this winter flounder library showed that the three main isoforms were indeed linked, and two of the gene pairings were coincidentally a hyperactive isoform next to a skin isoform, and a skin isoform adjacent to a liver isoform [31]. The results of the present experiments suggest two possibilities: 1) that the tandem AFP array in the starry flounder genome contains all three isoforms in the same repeat unit, or 2) that the genes for the three different isoforms amplified separately to generate repeats of identical size and restriction site distribution. The former is much more likely, and it may be worthwhile to isolate and sequence one of the 4.7 kb Sac I fragments to prove the point.

Genomic PCR followed by cloning and sequencing has been a useful way to sample sequence variation and overcome the bias of missing the 4.7 kb Sac I fragments from the lambda library. Because these sequences were obtained directly from the starry flounder genome, the genes missing from the lambda clones should be well represented. The cloning and sequencing of the three isoforms has produced a great deal of structural information that can be used to test the current model of structure-function relationships for the type I AFPs. However, it must be stressed that some of these genomic sequences might not be functional. Random mutations that prevent a gene from being expressed may lead to the accumulation of mutations in the coding region, which would in turn complicate the structure-function interpretations. Similarly, an expressed gene may acquire mutations that render it less functional, with similar effects on structure-function studies. Despite that disclaimer, all of the starry flounder liver variants fit with the current view that the type I AFP is an amphipathic helix with an Ala-rich ice-binding surface immediately adjacent to a line of Thr residues. This hydrophobic surface is conserved, whereas the opposite, hydrophilic side is more plastic and contains some features that help reinforce the helix and increase its solubility. The compilation of numerous skin variants from the winter flounder, starry flounder and American plaice resulted in similar conclusions. The N-terminal amino acids including the Pro at position 4 were excluded from the helical wheel projections, as these initial residues are likely involved in a helix-capping structure. Potentially, the more informative sequences will be those of the hyperactive isoform. There is currently limited sequence information for this isoform and only a very rough model of the dimer [65]. Here, sequence conservation between isoforms and orthologues will potentially reveal the ice-binding site(s) and the dimerization interface, both of which are thought to run the entire length of this approximately 200-residue alpha-helix.

The discovery of the skin and hyperactive isoforms has spurred a re-evaluation of the phylogenetic Southern blot (Figure 3), which was also probed with the cDNA of wfl-AFP6. The initial interpretation of this blot was that only a subset of the right-eye flounders (those in lanes C8 to E3 showing strong hybridization signals) had AFP genes, and that the faint signals in other fishes were non-specific due to the GC-richness of the probe (P. Davies, personal communication). This suggested that the type I AFP gene was very recent in its origin. However, with the discovery of the new isoforms, the fainter bands in A3 to C6 and E6 to F6 could now be interpreted as skin or hyperactive AFP genes in species that lack the liver isoform. The liver isoform may well have arisen recently from the splicing of a liver gene-specific signal peptide exon onto a 3' exon that is a derivative of a skin or hyperactive AFP gene.

The extent to which the skin AFPs alone can protect a fish is unknown, as is whether the hyperactive isoform can replace the smaller liver AFPs. Studies on American plaice showed that the hyperactive AFPs are the only type present in the plasma [71]. We therefore decided to use universal PCR primers for the skin isoform on the American plaice genome to determine if these AFPs were present. This strategy was extremely effective and produced many unique sequences.

To further support the hypothesis that the liver isoform evolved recently from the older skin or hyperactive isoforms, one could use PCR to verify that the presence of liver AFP isoforms is limited to that subset of closely related flounders that hybridize strongly to the liver probe. As well, PCR could be used to look for skin and hyperactive isoforms in the other flounders on the phylogenetic blot that show signals akin to those of the American plaice (B3). If present, these isoforms would support the hypothesis that the liver AFP developed from one of the other isoforms. If this fusion of the 3' coding region to a 5' signal sequence driven by a liver-specific promoter/enhancer occurred, then it might have happened in the common ancestor of the species in lanes C8 to E3.

4.1 Conclusions

It has been nearly 40 years since the discovery of AFPs in flounder, yet their evolutionary origin remains a mystery. In an attempt to test the hypothesis that these proteins arose due to the acute selection pressure of icy seawater, I examined a close relative of the well-studied winter flounder, the starry flounder. I have shown that its genome contains a large family of genes for the type I AFPs, including multiple variants of each of the liver, skin and hyperactive isoforms. Many regulatory elements in the starry flounder genes are highly similar to those of other well-characterized type I AFP genes, although the expression and functionality of the encoded proteins require characterization through transcriptional analysis and antifreeze activity assays. Using Southern blotting and genomic cloning, I established that the three AFP isoforms in the starry flounder are interspersed and are likely closely linked in a tandem array. Future screening of an unamplified genomic library will be required to determine the nature of the gene clusters. The refinement here of a PCR-based gene-sampling approach has allowed rapid detection of hyperactive and skin AFP sequences in two other closely related flounders, and further screening of other pleuronectid fish will no doubt provide insight as to the impetus and direction behind this evolutionary innovation.


References

1. Volff, J.-N. (2005). Genome evolution and biodiversity in teleost fish. Heredity. 94(3):280-294.

2. Scholander, P.F., van Dam, L., Kanwisher, J.W., Hammel, H.T., and Gordon, M.S. (1957). Supercooling and osmoregulation in arctic fish. Journal of Cellular and Comparative Physiology. 49(1):5-24.

3. Green, J.M. (1974). A localized mass winter kill of cunners in Newfoundland. Canadian Field-Naturalist. 88:1.

4. DeVries, A.L. and Wohlschlag, D.E. (1969). Freezing resistance in some Antarctic fishes. Science. 163(3871):1073-1075.

5. Fletcher, G.L., Hew, C.L., and Davies, P.L. (2001). Antifreeze proteins of teleost fishes. Annual Review of Physiology. 63:359-390.

6. Raymond, J.A. and DeVries, A.L. (1977). Adsorption inhibition as a mechanism of freezing resistance in polar fishes. Proceedings of the National Academy of Sciences of the USA. 74(6):2589-2593.

7. Wilson, P.W. (1993). Explaining thermal hysteresis by the Kelvin effect. Cryo-Letters. 14:31-36.

8. Duman, J.G. and DeVries, A.L. (1974). Freezing resistance in winter flounder Pseudopleuronectes americanus. Nature. 247(5438):237-238.

9. Slaughter, D., Fletcher, G.L., Ananthanarayanan, V.S., and Hew, C.L. (1981). Antifreeze proteins from the sea raven, Hemitripterus americanus: Further evidence for diversity among fish polypeptide antifreezes. Journal of Biological Chemistry. 256(4):2022-2026.

10. Hew, C.L., Slaughter, D., Joshi, S., Fletcher, G.L., and Ananthanarayanan, V.S. (1984). Antifreeze polypeptides from the Newfoundland ocean pout, Macrozoarces americanus: Presence of multiple and compositionally diverse components. Journal of Comparative Physiology B. 155(1):81-88.

11. Deng, G., Andrews, D.W., and Laursen, R.A. (1997). Amino acid sequence of a new type of antifreeze protein, from the longhorn sculpin Myoxocephalus octodecimspinosis. FEBS Letters. 402(1):17-20.

12. Ananthanarayanan, V.S. and Hew, C.L. (1977). Structural studies on the freezing-point-depressing protein of the winter flounder Pseudopleuronectes americanus. Biochemical and Biophysical Research Communications. 74(2):685-689.


13. Davies, P.L., Roach, A.H., and Hew, C.L. (1982). DNA sequence coding for an antifreeze protein precursor from winter flounder. Proceedings of the National Academy of Sciences of the USA. 79(2):335-339.

14. Hew, C.L. and Yip, C. (1976). The synthesis of freezing-point-depressing protein of the winter flounder Pseudopleuronectes americanus in Xenopus laevis oocytes. Biochemical and Biophysical Research Communications. 71(3):845-850.

15. Davies, P.L. and Hew, C.L. (1980). Isolation and characterization of the antifreeze protein messenger RNA from the winter flounder. Journal of Biological Chemistry. 255(18):8729-8734.

16. Pickett, M.H., Scott, G.K., Davies, P.L., Wang, N.-C., Joshi, S., and Hew, C.L. (1984). Sequence of an antifreeze protein precursor. European Journal of Biochemistry. 143(1):35-38.

17. Gourlie, B., Lin, Y., Price, J., DeVries, A.L., Powers, D., and Huang, R.C.C. (1984). Winter flounder antifreeze proteins: a multigene family. Journal of Biological Chemistry. 259(23):14960-14965.

18. Hew, C.L., Liunardo, N., and Fletcher, G.L. (1978). In vivo biosynthesis of the antifreeze protein in the winter flounder - evidence for a larger precursor. Biochemical and Biophysical Research Communications. 85(1):421-427.

19. Kenward, K.D., Altschuler, M., Hildebrand, D., and Davies, P.L. (1993). Accumulation of type I fish antifreeze protein in transgenic tobacco is cold-specific. Plant Molecular Biology. 23(2):377-385.

20. Yang, D.S.C., Sax, M., Chakrabartty, A., and Hew, C.L. (1988). Crystal structure of an antifreeze polypeptide and its mechanistic implications. Nature. 333(6170):232-237.

21. Scott, G.K., Davies, P.L., Shears, M.A., and Fletcher, G.L. (1987). Structural variations in the alanine-rich antifreeze proteins of the Pleuronectinae. European Journal of Biochemistry. 168(3):629-633.

22. Sicheri, F. and Yang, D.S.C. (1995). Ice-binding structure and mechanism of an antifreeze protein from winter flounder. Nature. 375:427-431.

23. Duman, J.G. and DeVries, A.L. (1976). Isolation, characterization, and physical properties of protein antifreezes from the winter flounder, Pseudopleuronectes americanus. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology. 54(3):375-380.

24. DeVries, A.L. and Lin, Y. (1977). Structure of a peptide antifreeze and mechanism of adsorption to ice. Biochimica et Biophysica Acta. 495(2):388-392.

25. Wen, D. and Laursen, R.A. (1992). A model for binding of an antifreeze polypeptide to ice. Biophysical Journal. 63:1659-1662.


26. Knight, C.A., Cheng, C.-H.C., and DeVries, A.L. (1991). Adsorption of alpha-helical antifreeze peptides on specific ice crystal surface planes. Biophysical Journal. 59:409-418.

27. Baardsnes, J., Kondejewski, L.H., Hodges, R.S., Chao, H., Kay, C., and Davies, P.L. (1999). New ice-binding face for type I antifreeze protein. FEBS Letters. 463(1-2):87-91.

28. Davies, P.L. and Hew, C.L. (1990). Biochemistry of fish antifreeze proteins. FASEB Journal. 4(8):2460-2468.

29. Knight, C.A., DeVries, A.L., and Oolman, L.D. (1984). Fish antifreeze protein and the freezing and recrystallization of ice. Nature. 308(5956):295-296.

30. Low, W.K., Lin, Q., Stathakis, C., Miao, M., Fletcher, G.L., and Hew, C.L. (2001). Isolation and characterization of skin-type, type I antifreeze polypeptides from the longhorn sculpin, Myoxocephalus octodecemspinosus. Journal of Biological Chemistry. 276(15):11582-11589.

31. Davies, P.L., Hough, C., Scott, G.K., Ng, N., White, B.N., and Hew, C.L. (1984). Antifreeze protein genes of the winter flounder. Journal of Biological Chemistry. 259(14):9241-9247.

32. Scott, G.K., Hew, C.L., and Davies, P.L. (1985). Antifreeze protein genes are tandemly linked and clustered in the genome of the winter flounder. Proceedings of the National Academy of Sciences of the USA. 82(9):2613-2617.

33. Low, W.K., Miao, M., Ewart, K.V., Yang, D.S.C., Fletcher, G.L., and Hew, C.L. (1998). Skin-type antifreeze protein from the shorthorn sculpin, Myoxocephalus scorpius: Expression and characterization of a Mr 9,700 recombinant protein. Journal of Biological Chemistry. 273(36):23098-23103.

34. Gauthier, S.Y., Wu, Y., and Davies, P.L. (1990). Nucleotide sequence of a variant antifreeze protein gene. Nucleic Acids Research. 18(17):5303.

35. Chao, H., Hodges, R.S., Kay, C.M., Gauthier, S.Y., and Davies, P.L. (1996). A natural variant of Type I antifreeze protein with four ice-binding repeats is a particularly potent antifreeze. Protein Science. 5(6):1150-1156.

36. Fourney, R.M., Joshi, S., Kao, M.H., and Hew, C.L. (1984). Heterogeneity of antifreeze polypeptides from the Newfoundland winter flounder, Pseudopleuronectes americanus. Canadian Journal of Zoology. 62(1):28-33.

37. Griffiths, A.J.F., Miller, J.H., Suzuki, D.T., Lewontin, R.C., and Gelbart, W.M. (2000). An Introduction to Genetic Analysis, 7th ed. W.H. Freeman and Company: New York. p. 860.

38. Davies, P.L. (1992). Conservation of antifreeze protein-encoding genes in tandem repeats. Gene. 112(2):163-170.


39. Rancourt, D.E., Walker, V.K., and Davies, P.L. (1987). Flounder antifreeze protein synthesis under heat shock control in transgenic Drosophila melanogaster. Molecular and Cellular Biology. 7(6):2188-2195.

40. Chan, S.-L., Miao, M., Fletcher, G.L., and Hew, C.L. (1997). The role of CCAAT/enhancer-binding protein alpha and a protein that binds to the activator-protein-1 site in the regulation of liver-specific expression of the winter flounder antifreeze protein gene. European Journal of Biochemistry. 247(1):44-51.

41. Fletcher, G.L. (1981). Effects of temperature and photoperiod on the plasma freezing point depression, Cl− concentration, and protein "antifreeze" in winter flounder. Canadian Journal of Zoology. 59(2):193-201.

42. Fourney, R.M., Fletcher, G.L., and Hew, C.L. (1984). Accumulation of winter flounder messenger RNA after hypophysectomy. General and Comparative Endocrinology. 54(3):392-401.

43. Fletcher, G.L., Idler, D.R., Vaisius, A., and Hew, C.L. (1989). Hormonal regulation of antifreeze protein gene expression in winter flounder. Fish Physiology and Biochemistry. 7:387-393.

44. Slaughter, D. and Hew, C.L. (1982). Radioimmunoassay for the antifreeze polypeptides of the winter flounder: seasonal profile and immunological cross-reactivity with other fish antifreezes. Canadian Journal of Biochemistry. 60(8):824-829.

45. Duncker, B.P., Koops, M.D., Walker, V.K., and Davies, P.L. (1995). Low temperature persistence of type I antifreeze protein is mediated by cold-specific mRNA stability. FEBS Letters. 377(2):185-188.

46. Gong, Z., King, M.J., Fletcher, G.L., and Hew, C.L. (1995). The antifreeze protein genes of the winter flounder, Pleuronectus americanus, are differentially regulated in liver and non-liver tissues. Biochemical and Biophysical Research Communications. 206(1):387-392.

47. Schneppenheim, R. and Theede, H. (1982). Freezing-point depressing peptides and glycoproteins from arctic-boreal and antarctic fish. Polar Biology. 1(2):115-123.

48. Valerio, P.F., Kao, M.H., and Fletcher, G.L. (1990). Thermal hysteresis activity in the skin of the cunner, Tautogolabrus adspersus. Canadian Journal of Zoology. 68(5):1065-1067.

49. Valerio, P.F., Kao, M.H., and Fletcher, G.L. (1992). Fish skin: An effective barrier to ice crystal propagation. Journal of Experimental Biology. 164:135-151.

50. Gong, Z., Fletcher, G.L., and Hew, C.L. (1992). Tissue distribution of fish antifreeze protein mRNAs. Canadian Journal of Zoology. 70(4):810-814.


51. Gong, Z., Ewart, K.V., Hu, Z., Fletcher, G.L., and Hew, C.L. (1996). Skin antifreeze protein genes of the winter flounder, Pleuronectes americanus, encode distinct and active polypeptides without the secretory signal and prosequences. Journal of Biological Chemistry. 271(8):4106-4112.

52. Davies, P.L. and Gauthier, S.Y. (1992). Antifreeze protein pseudogenes. Gene. 112:171-178.

53. Low, W.K., Lin, Q., Ewart, K.V., Fletcher, G.L., and Hew, C.L. (2002). The skin-type antifreeze polypeptides: A new class of type I AFPs, in Fish Antifreeze Proteins, K.V. Ewart and C.L. Hew, eds. World Scientific Publishing Co. Pte. Ltd.: River Edge, NJ. p. 161-186.

54. Lin, Q., Ewart, K.V., Yan, Q., Wong, W.K., Yang, D.S.C., and Hew, C.L. (1999). Secretory expression and site-directed mutagenesis studies of the winter flounder skin-type antifreeze polypeptides. European Journal of Biochemistry. 264:49-54.

55. Low, W.K., Lin, Q., and Hew, C.L. (2003). The role of N and C termini in the antifreeze activity of winter flounder (Pleuronectes americanus) antifreeze proteins. Journal of Biological Chemistry. 278(12):10334-10343.

56. Murray, H.M., Hew, C.L., and Fletcher, G.L. (2003). Spatial expression patterns of skin-type antifreeze protein in winter flounder (Pseudopleuronectes americanus) epidermis following metamorphosis. Journal of Morphology. 257(1):78-86.

57. Mignatti, P., Morimoto, T., and Rifkin, D.B. (1992). Basic fibroblast growth factor, a protein devoid of secretory signal sequence, is released from cells via a pathway independent of the ER-Golgi complex. Journal of Cellular Physiology. 151(1):81-93.

58. Nickel, W. (2003). The mystery of nonclassical protein secretion. European Journal of Biochemistry. 270(10):2109-2119.

59. Miao, M., Chan, S.-L., Hew, C.L., and Gong, Z. (1998). The skin-type antifreeze protein gene intron of the winter flounder is a ubiquitous enhancer lacking a functional C/EBP binding motif. FEBS Letters. 426(1):121-125.

60. Murray, H.M., Hew, C.L., and Fletcher, G.L. (2002). Skin-type antifreeze protein expression in integumental cells of larval winter flounder. Journal of Fish Biology. 60(6):1391-1406.

61. Pereira, J.J., Goldberg, R., Ziskowski, J.J., Berrien, P.L., Morse, W.W., and Johnson, D.L. (1999). Essential fish habitat source document: Winter flounder, Pseudopleuronectes americanus, life history and habitat characteristics. US Department of Commerce: Woods Hole, MA. p. 48.

62. Scott, G.K., Davies, P.L., Kao, M.H., and Fletcher, G.L. (1988). Differential amplification of antifreeze protein genes in the Pleuronectinae. Journal of Molecular Evolution. 27(1):29-35.


63. Marshall, C.B., Fletcher, G.L., and Davies, P.L. (2004). Hyperactive antifreeze protein in a fish. Nature. 429(6988):153.

64. Marshall, C.B., Chakrabartty, A., and Davies, P.L. (2005). Hyperactive antifreeze protein from winter flounder is a very long rod-like dimer of alpha-helices. Journal of Biological Chemistry. 280(18):17920-17929.

65. Graham, L.A., Marshall, C.B., Lin, F.-H., Campbell, R.L., and Davies, P.L. (2008). Hyperactive antifreeze protein from fish contains multiple ice-binding sites. Biochemistry. 47(7):2051-2063.

66. Sakamoto, K. (1984). Interrelationships of the family Pleuronectidae. PhD thesis from the Department of Fisheries Science, Hokkaido University, Hokkaido, Japan. p. 95-215.

67. Duman, J.G. and DeVries, A.L. (1975). The role of macromolecular antifreezes in cold water fishes. Comparative Biochemistry and Physiology Part A: Physiology. 52(1):193-199.

68. Berendzen, P.B. and Dimmick, W.W. (2002). Phylogenetic relationships of pleuronectiformes based on molecular evidence. Copeia. 3(3):642-652.

69. Cooper, J.A. and Chapleau, F. (1998). Monophyly and intrarelationships of the family Pleuronectidae (Pleuronectiformes), with a revised classification. Fishery Bulletin. 96(4):686-726.

70. Johnson, D.L., Morse, W.W., Berrien, P.L., and Vitaliano, J.J. (1999). Essential fish habitat source document: Yellowtail flounder, Limanda ferruginea, life history and habitat characteristics. US Department of Commerce: Woods Hole, MA. p. 38.

71. Gauthier, S.Y., Marshall, C.B., Fletcher, G.L., and Davies, P.L. (2005). Hyperactive antifreeze protein in flounder species: The sole freeze protectant in American plaice. FEBS Journal. 272:4439-4449.

72. Johnson, D.L., Morse, W.W., Berrien, P.L., and Vitaliano, J.J. (1999). Essential fish habitat source document: American plaice, Hippoglossoides platessoides, life history and habitat characteristics, 1st ed. US Department of Commerce: Woods Hole, MA. p. 40.

73. Goddard, S.V. and Fletcher, G.L. (2002). Physiological ecology of antifreeze proteins - a northern perspective, in Fish Antifreeze Proteins, K.V. Ewart and C.L. Hew, eds. World Scientific: River Edge, NJ. p. 17-60.

74. DeVries, A.L. (1983). Antifreeze peptides and glycopeptides in cold-water fishes. Annual Review of Physiology. 45:245-260.

75. Mouches, C., Pasteur, N., Berge, J.B., Hyrien, O., Raymond, M., de Saint Vincent, B.R., de Silvestri, M., and Georghiou, G.P. (1986). Amplification of an esterase gene is responsible for insecticide resistance in a California Culex mosquito. Science. 233(4765):778-780.


76. Ohno, S. (1970). Evolution by gene duplication. Springer-Verlag: New York. p. 160.

77. Alt, F.W., Kellems, R.E., Bertino, J.R., and Schimke, R.T. (1978). Selective multiplication of dihydrofolate reductase genes in methotrexate-resistant variants of cultured murine cells. Journal of Biological Chemistry. 253(5):1357-1370.

78. Scott, G.K., Fletcher, G.L., and Davies, P.L. (1986). Fish antifreeze proteins: recent gene evolution. Canadian Journal of Fisheries and Aquatic Sciences. 43(5):1028-1034.

79. Maisey, J.G. (1996). Discovering Fossil Fishes, 1st ed. Henry Holt & Company. p. 223.

80. Hurley, I.A., Mueller, R.L., Dunn, K.A., Schmidt, E.J., Friedman, M., Ho, R.K., Prince, V.E., Yang, Z., Thomas, M.G., and Coates, M.I. (2007). A new time-scale for ray-finned fish evolution. Proceedings of the Royal Society B: Biological Sciences. 274(1609):489-498.

81. Greenwood, P.H., Rosen, D.E., Weitzman, S.H., and Myers, G.S. (1966). Phyletic studies of teleostean fishes, with a provisional classification of living forms. Bulletin of the American Museum of Natural History. 141:341-455.

82. Munroe, T.A. (2005). Systematic diversity of the Pleuronectiformes, in Flatfishes: Biology and Exploitation, R.N. Gibson, ed. Blackwell Publishing: Ames, IA. p. 10-41.

83. Friedman, M. (2008). The evolutionary origin of flatfish asymmetry. Nature. 454(7201):209-212.

84. Trevisani, E., Papazzoni, C.A., Ragazzi, E., and Roghi, G. (2005). Early Eocene amber from the “Pesciara di Bolca” (Lessini Mountains, Northern Italy). Palaeogeography, Palaeoclimatology, Palaeoecology. 223(3-4):260-274.

85. Sellwood, B.W. and Valdes, P.J. (2006). Mesozoic climates: General circulation models and the rock record. Sedimentary Geology. 190(1-4):269-287.

86. Sluijs, A., Schouten, S., Pagani, M., Woltering, M., Brinkhuis, H., Sinninghe Damste, J.S., Dickens, G.R., Huber, M., Reichart, G.-J., Stein, R., Matthiessen, J., Lourens, L.J., Pedentchouk, N., Backman, J., and Morimoto, T. (2006). Subtropical Arctic Ocean temperatures during the Palaeocene/Eocene thermal maximum. Nature. 441(7093):610-613.

87. Moran, K., Backman, J., Brinkhuis, H., Clemens, S.C., Cronin, T., Dickens, G.R., Eynaud, F., Gattacceca, J., Jakobsson, M., Jordan, R.W., Kaminski, M., King, J., Koc, N., Krylov, A., Martinez, N., Matthiessen, J., McInroy, D., Moore, T.C., Onodera, J., O'Regan, M., Pälike, H., Rea, B., Rio, D., Sakamoto, T., Smith, D.C., Stein, R., St John, K., Suto, I., Suzuki, N., Takahashi, K., Watanabe, M., Yamamoto, M., Farrell, J., Frank, M., Kubik, P., Jokat, W., and Kristoffersen, Y. (2006). The Cenozoic palaeoenvironment of the Arctic Ocean. Nature. 441(7093):601-605.


88. Hew, C.L., Wang, N.-C., Joshi, S., Fletcher, G.L., Scott, G.K., Hayes, P.H., Buettner, B., and Davies, P.L. (1988). Multiple genes provide the basis for antifreeze protein diversity and dosage in the ocean pout, Macrozoarces americanus. Journal of Biological Chemistry. 263(24):12049-12055.

89. Hayes, P.H., Davies, P.L., and Fletcher, G.L. (1991). Population differences in antifreeze protein gene copy number and arrangement in winter flounder. Genome. 34:174-177.

90. Orcutt, H.G. (1950). The life history of the starry flounder, Platichthys stellatus (Pallas). State of California, Department of Natural Resources, Division of Fish and Game, Bureau of Marine Fisheries. 64 pp.

91. Eschmeyer, W.N. (1983). A field guide to Pacific Coast fishes of North America: from the Gulf of Alaska to Baja California. The Peterson field guide series. Houghton Mifflin: Boston.

92. Blin, N. and Stafford, D.W. (1976). A general method for isolation of high molecular weight DNA from eukaryotes. Nucleic Acids Research. 3(9):2303-2308.

93. Sambrook, J. and Russell, D.W. (2001). Molecular Cloning: A Laboratory Manual, 3rd ed. Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY.

94. Bendtsen, J.D., Nielsen, H., von Heijne, G., and Brunak, S. (2004). Improved prediction of signal peptides: SignalP 3.0. Journal of Molecular Biology. 340(4):783-795.

95. Reese, M.G. (1999). Neural Network Promoter Prediction Tool, v.2.2. 9 May 2008 [cited 6 November 2008]. Available from: http://www.fruitfly.org/seq_tools/promoter.html.

96. Hinegardner, R. and Rosen, D.E. (1972). Cellular DNA content and the evolution of teleostean fishes. American Naturalist. 106(951):621-644.

97. Douglas, S.E., Patrzykat, A., Pytyck, J., and Gallant, J.W. (2003). Identification, structure and differential expression of novel pleurocidins clustered on the genome of the winter flounder, Pseudopleuronectes americanus (Walbaum). European Journal of Biochemistry. 270(18):3720-3730.

98. Cunningham, D.F. and O'Connor, B. (1997). Proline specific peptidases. Biochimica et Biophysica Acta. 1343(2):160-186.

99. Lambeir, A.-M., Durinx, C., Scharpé, S., and De Meester, I. (2003). Dipeptidyl-peptidase IV from bench to bedside: An update on structural properties, functions, and clinical aspects of the enzyme DPP IV. Critical Reviews in Clinical Laboratory Sciences. 40(3):209-295.

100. Breathnach, R. and Chambon, P. (1981). Organization and expression of eucaryotic split genes coding for proteins. Annual Review of Biochemistry. 50:349-383.


101. Fletcher, G.L., Hew, C.L., Li, X.M., Haya, K., and Kao, M.H. (1985). Year-round presence of high levels of plasma antifreeze peptides in a temperate fish, ocean pout (Macrozoarces americanus). Canadian Journal of Zoology. 63(3):488-493.

102. Bergstrom, C.A. (2007). Morphological evidence of correlational selection and ecological segregation between dextral and sinistral forms in a polymorphic flatfish, Platichthys stellatus. Journal of Evolutionary Biology. 20(3):1104-1114.

103. Scott, G.K., Hayes, P.H., Fletcher, G.L., and Davies, P.L. (1988). Wolffish antifreeze protein genes are primarily organized as tandem repeats that each contain two genes in inverted orientation. Molecular and Cellular Biology. 8(9):3670-3675.


Appendix A DNA alignment of winter and starry flounder liver AFPs

An alignment of the starry flounder liver AFP from lambda insert #1 (stfl-AFP1, Figure 10) and the winter flounder liver variant wfl-AFP6 is shown. Identical bases are marked with a dot (.) and gaps with a hyphen (-). Intronic and flanking sequences are in lower case, while exonic sequence is in upper case. The coding region is bolded, with the translated sequence of stfl-AFP1 above and that of wfl-AFP6 below the respective gene sequence, and silent substitutions are underlined. The number of nucleotides is noted at the end of each row. The core promoter sequences ( | ), the transcriptional start site (+), and polyadenylation signals (#) are marked based on previous characterization of wfl-AFP6. The intronic enhancer, Element B, is in white text and highlighted in black [40]. The GenBank accession number for wfl-AFP6 is M62415.
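The dot notation described above can be reproduced programmatically. The following is a minimal Python sketch, not the procedure used to prepare this appendix. The function name `dot_notation` and the short test strings are hypothetical and chosen only for illustration. It assumes that both sequences are already aligned to equal length, compares bases case-insensitively (because case encodes exonic versus intronic sequence here), and keeps hyphens as gaps.

```python
def dot_notation(reference: str, other: str) -> str:
    """Render `other` against `reference`: '.' for identical bases,
    '-' for gaps, and the differing base otherwise."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    rendered = []
    for ref_base, base in zip(reference, other):
        if base == "-":
            rendered.append("-")          # gap in the compared sequence
        elif base.upper() == ref_base.upper():
            rendered.append(".")          # identical base
        else:
            rendered.append(base)         # substitution: show the base itself
    return "".join(rendered)


# Hypothetical 10-base example, not taken from the thesis alignment.
print(dot_notation("aacaaaactg", "aacaacac-g"))  # prints ".....c..-."
```

Only the second sequence is abbreviated, so the reference row stays fully readable while substitutions and gaps stand out in the row beneath it.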

||||| stfl-AFP1 aacaaaactgggggagtgttgta ccaat ctgctcagattggtcgacagtc 50 wfl-AFP6 c....c...... 50

||||||| stfl-AFP1 aagcgatgactcaggctccatttac tacaaaa cagactcacattcgcctg 100 wfl-AFP6 ...... c...... g...... t...... t...... ga.... 100

+ stfl-AFP1 tatctttACCACATCTTCCTTTTGTAGTGAAGCAGTGCTCCCTAAAAACT 150 wfl-AFP6 g..a..c...... A...... C...... C..GT. 150

M A L S L F T V G Q F I F L F stfl-AFP1 CTCAAA ATGGCTCTCTCACTCTTCACCGTCGGACAATTCATTTTCTTATT 200 wfl-AFP6 ...... T..... T...... G...... 200 M A L S L F T V G Q L I F L F

W T I R stfl-AFP1 TTGGACAATAAG gtacgtgaacactcactttgtttcttctataaatctgg 250 wfl-AFP6 ...... G...... g...... 250 W T M R stfl-AFP1 ttttactgtaaatatcttgggaaggaaggaatgatatctgcattatccca 300 wfl-AFP6 ...... g...... c 300

stfl-AFP1 gaggggcaggcacgtgcacagatattttggggggcaagtgctaaaacccc 350 wfl-AFP6 ...... catttgt.tt....cc.gcggt.------...gatg 337

stfl-AFP1 caaaaaagggcacccatcgccaaaatgtaaagctgacaacacagtacaca 400 wfl-AFP6 a.g.tcttcatc.gtg.tcatctgt.tg.ccctgatt...... ag.tggt 387

stfl-AFP1 caca------atcttttttttactgttttctgacagagttgaagtattag 444 wfl-AFP6 ....tggacc...... a...aca.aa.g.t.c.tcagcacttcc.g..tt 437

stfl-AFP1 cagcattacactaataatatgccacaatatacctcaccagactggcatat 494 wfl-AFP6 ....ccg.a...t.a.gagg------.....tgga.actt.c.ga 476

stfl-AFP1 agctcaacttaagacctaccgttcatttcaataattacttgattatttga 544 wfl-AFP6 t.a..tgg.g.c-....g.t.g.tgaagg..ac.gagt....gaggcg.c 525

stfl-AFP1 attttaaatgaaacaaaactagagaacttgtctgatttgtagaacagtaa 594 wfl-AFP6 .gaaa....t.tttt.gtt.ga.tg.agaag...tca.t.gatttca.gt 575

stfl-AFP1 aactgcctttaatttctatcacacacagatattgaacactgtcatcactg 644 wfl-AFP6 tggg.ggggggggggtc...... ta...... 625

stfl-AFP1 ggttcggtgaaagtgacggaccagtaaatgttgtgatatataatattatc 694 wfl-AFP6 a...t...... 675

I S E A N stfl-AFP1 atc---atttcaataataccattaatctctgcag AATCAGTGAAGCCAAC 741 wfl-AFP6 ..aata...at...... C...... GA 725 I T E A R

P D P A A K A A A V A D P A A A A stfl-AFP1 CCCGACCCCGCAGCCAAAGCTGCCGCAGTCGCCGACCCTGCCGCAGCCGC 791 wfl-AFP6 ...... C...------.. A..A.....T.. 763 P D P A A K A A P A A A A

* V A P A A D A F S A A A D T A S stfl-AFP1 TGTAGCCCCCGCCGCTGACGCCTTCTCAGCAGCCGCCGACACTGCCTCTG 841 wfl-AFP6 C.CC------C..GC.G....C...C.A.....C...... 798 A P A A A A P D T A S

* * * * * D A A A A A A A T A A A A K A A A stfl-AFP1 ACGCCGCCGCCGCCGCCGCCGCCACCGCCGCTGCCGCCAAAGCCGCCGCA 891 wfl-AFP6 ...... T..A...... CTT...... CAA...... T..C 848 D A A A A A A L T A A N A K A A A

* * * * E K T A R D A A A A A A A T A R G stfl-AFP1 GAAAAAACCGCCCGGGACGCCGCTGCAGCAGCCGCAGCCACCGCCAGAGG 941 wfl-AFP6 ...CTC..T...GCCA...... C..C..C..A...... 898 E L T A A N A A A A A A A T A R G

stfl-AFP1 TTAA GGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGC 991 wfl-AFP6 ...... 948

###### stfl-AFP1 GAGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTGAaaac 1041 wfl-AFP6 ...... A.....A..... 998

stfl-AFP1 caagtgtccagttcatttcatctctgaaactccttcacagtttctgtaga 1091 wfl-AFP6 ...... t...... g 1048

stfl-AFP1 tcatgtttttaacacataaacctccagaaatcatgatgcgtcacgtttgg 1141 wfl-AFP6 ...g..------1054

stfl-AFP1 actttgggttagaataaaatgacggactgcagctacataagatatgatat 1191 wfl-AFP6 ------1054

stfl-AFP1 gttagtgatcttaaagaggttcttgtttccattatgctaagctaacagtt 1241 wfl-AFP6 ------1054

stfl-AFP1 catatttacacgtagactccaggaagtgatgccattgtgctgcctgaaac 1297 wfl-AFP6 ------...... tt--g... 1089

stfl-AFP1 ctgcaggtctacaaggtttcataactgatttagattttaaaatactgact 1347 wfl-AFP6 ...... t...... 1139

stfl-AFP1 aattattcacattttcgttctcaccagctctatgagtatttctccttcaa 1397 wfl-AFP6 t...... a...... 1189

stfl-AFP1 gtacagatgtggacagtgttggaggaa-tcctgaagtttagtacttaagt 1446 wfl-AFP6 ...... g...... g.a...... 1239

stfl-AFP1 aaaagtacaagtacccaggaaaatatatacttaagtaaaagtaaaagtac 1496 wfl-AFP6 ...... a....t...... 1289

###### stfl-AFP1 tacatcaacaatcctacttatttaaaagtaaaaagtacttacttttaaat 1546 wfl-AFP6 ...... a...... g...... 1339

stfl-AFP1 ttactataagtattataagtaaaagtattgacgcaatgggttgcctctca 1596 wfl-AFP6 ...... a...... c.c...... 1389

stfl-AFP1 atgtctaggctgtgccattttgataaagaatgcatatatagctactggta 1646 wfl-AFP6 ...... g...... 1439

stfl-AFP1 atactcatgcctctacagatgtcactactagtaataattataagcaacaa 1696 wfl-AFP6 ...... c...... 1489 stfl-AFP1 catttgtttattggaaaggttggtgtacttattgtgcttaccctctgtaa 1746 wfl-AFP6 ...... t...... 1539

stfl-AFP1 cactgttcacactctatcttacaattctgcggatgacaagttatctacca 1796 wfl-AFP6 t...... g...... g...... 1589

###### stfl-AFP1 gggattatctgcaaagttaaaaccattaagacaaatcaataaagacaaca 1846 wfl-AFP6 ...... c..ct....c...... 1639

stfl-AFP1 agttatatctttaaat--cttatatttaattgtaagtgtgtaaaaaatgg 1894 wfl-AFP6 t....a...... gtt...... a.t...... 1689

stfl-AFP1 aaacatggaacatgaaaaacaac--taaaactggtcagaacaaggcagat 1942 wfl-AFP6 ...... tg...... c... 1739

stfl-AFP1 cttagaatgagaaaatttaaa-tgaaggaccttgaaatgaaaatttgagc 1991 wfl-AFP6 ...... c-...... a...... c. 1788

stfl-AFP1 attggtggtttagactcaggcagtcaaaatattcatcttctgaatatttt 2041 wfl-AFP6 ...... c...... 1838

stfl-AFP1 tgaaaggaaggagaaatgttatacttttattttgaaaaggtagttcctga 2091 wfl-AFP6 -...... 1887

stfl-AFP1 aaaacgaaaaaggtcgctaaatggcaggtgttcctcgttgttgcatgaaa 2141 wfl-AFP6 ...... a...... ---.g..a...... ------1927

stfl-AFP1 aagcattacgcttagtt--gtttagcacctggctgataaaggcacaagca 2189 wfl-AFP6 ---...... gtt...... ggtg.a.acg.g...atgctgca 1974


Appendix B DNA alignment of winter and starry flounder skin AFPs

An alignment of the starry flounder skin AFP from lambda insert #1 (stfs-AFP1, Figure 10) and the winter flounder skin variant wfs-11-3 is shown. Identical bases are marked with a dot (.) and gaps with a hyphen (-). Intronic and flanking sequences are in lower case, while exonic sequence is in upper case. The coding region is bolded, with the translated sequence of stfs-AFP1 above and that of wfs-11-3 below the respective gene sequence, and silent substitutions are underlined. The number of nucleotides is noted at the end of each row. The putative TFIID binding site ( | ), the transcriptional start sites (+), and the polyadenylation signal (#) are marked based on previous characterization of the winter flounder skin AFPs. The intronic enhancer, Element S, is in white text and highlighted in black [59]. The GenBank accession number for wfs-11-3 is M63478.

stfs-AFP1 ttacaaaacaagttcatactggccaggatgttcgccacaccttccttttg 50 wfs-11-3 ...... -----...g.t...... g... 45

| stfs-AFP1 ttgtgaaccagtcggagccgacaacctgctgcgtcgcaaacttgaagtga 100 wfs-11-3 a...... gc...... a.g..a.ca...... 95

|||||| ++ stfs-AFP1 ataaat aagagctgctccctaaaagttttcatcaggactcacacACTTTT 150 wfs-11-3 ...... gag...... A...... 145

stfs-AFP1 CACTGTCGAACACTCAGgtacgtgaacactcactttgtttctcctacaaa 200 wfs-11-3 ...... C...... a...... 195

stfs-AFP1 tctggttt-actgtaaatatcttgggaaggaaggaaggatatctgcatta 249 wfs-11-3 ...... t...... 245

stfs-AFP1 tcccagaggggccatttgttttacagccagtggtaaaagttgaagatctt 299 wfs-11-3 ...t-...... c...g....a...... 294

stfs-AFP1 catctgtgttcgtcggatggaaagtttgttctgaaaccttcagtggaagt 349 wfs-11-3 ....ca...... t...... - 343

stfs-AFP1 gtagtatattccccttagcaaatatccatagccttgaatcttaagttcaa 399 wfs-11-3 ------343

stfs-AFP1 acctttaagtattatctccagatgtgttcagtgtgtgtctccttgtctga 449 wfs-11-3 ------343

stfs-AFP1 actatccttgaactgcctatggaataatgagaggagagatggtttccagc 499 wfs-11-3 ------343

stfs-AFP1 gggtccttaaatcttaaggtacgacacattcccaaatttaggcagaaggc 549 wfs-11-3 ------343

stfs-AFP1 cgggttgtgtgacgtcattatatctctaggtttgtggtaaacaacccctc 599 wfs-11-3 ------343

stfs-AFP1 ctatttaacgccttaccttgcagagtcaaggcggattttcactattcggc 649 wfs-11-3 ------343

stfs-AFP1 ttgtgtgttatctccgagttttctagaaactcgtcctgacctataatact 699 wfs-11-3 ------343

stfs-AFP1 cattatacttgtaagtactgggtccgcgtctcctctcttcgaacaccgac 749 wfs-11-3 ------343

stfs-AFP1 ttctacaagacactactgcgggaaacatacgatagaagaaagagattcat 799 wfs-11-3 ------...c...... 355

stfs-AFP1 gtgttcaggcctaaacctgaaaaaatctgagctctgttcaatcatgggaa 849 wfs-11-3 ..c...... t...... c...... a...... 405

stfs-AFP1 acaactttttaattgagtcatggctgcaaaactcttttatatgaacagaa 899 wfs-11-3 ...... c.....g.....g...... a...... c...... 455

stfs-AFP1 gaagaagaagtgatctttagttcatcactgtggaaacatcagcagcagtt 949 wfs-11-3 ...... ca...... t...... 505

stfs-AFP1 aaattctgtctgcttcagtatcaccggccagttccagtgctcatgtttct 999 wfs-11-3 ...g...... 555

stfs-AFP1 gatcagcttggtttgaatgatatgaaa-cggatggagtccctgtttgacc 1048 wfs-11-3 ...... a...... 605

stfs-AFP1 ctgtttaacacaagat-ggccaagtggaccatcttt attaacataatgtt 1097 wfs-11-3 ...... t..a.gca...... t...... 655

stfs-AFP1 ttacatgagcacttcctg ttttcagccctaaacctaaagaggcctcatgg 1147 wfs-11-3 ...... c...... t...... 705

stfs-AFP1 aaacttcctgatgatctggtgacacctgctggttgaaggaaacagagttc 1197 wfs-11-3 ...... t 755

stfs-AFP1 gagaggcagctgaacaaattattttagtttgaaagaagaagctgtcattt 1247 wfs-11-3 ...... a...... g...... t...... 805

stfs-AFP1 gagattatgtt-gtagggggggggggg------atactg 1279 wfs-11-3 t.tt....a..t.g...... gggatcaccacacacag...t.. 855

stfs-AFP1 aactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgt 1329 wfs-11-3 ...a...... t...... a.a..c...... 905

stfs-AFP1 gataaataattatatcataataattataat-aataccattaatttctgca 1378 wfs-11-3 ....t.....at...... t...... c...... 955

M D A P A R A A A A T stfs-AFP1 gAATCACTAAAACGAAC ATGGACGCCCCAGCCAGAGCCGCCGCAGCCACC 1428 wfs-11-3 ...... G.C.TC...... A...... A...... 1005 M D A P A K A A A A T

A A A A K A A A E A T A A A A A K stfs-AFP1 GCCGCCGCCGCCAAAGCCGCCGCAGAAGCCACCGCCGCCGCAGCTGCCAA 1478 wfs-11-3 ...... G...... 1055 A A A A K A A A E A T A A A A A K

A A A A T K A A R Ter stfs-AFP1 AGCAGCAGCCGCCACCAA------AGCAGCCCGTTAA TGATCGTGGTCGT 1522 wfs-11-3 ...... AGCCGC...... 1105 A A A A T K A A A A R Ter

stfs-AFP1 CTTGATGTGGGATCATGTGAACATCTGAGCAGCGAGATGTTACCAATATG 1572 wfs-11-3 ...... C.. 1155

###### stfs-AFP1 CTGAATAAACCTGAGAAGCTGTTTGTTGAaaaccaagtgtcctgttcatt 1622 wfs-11-3 ...... T...... 1205

stfs-AFP1 tcatctctggaactccttcacactttctgtagatcatgtttttattttgt 1622 wfs-11-3 ...... a.....a...... 1205

stfs-AFP1 ccagacgatgttgaactggagcagaatccagaaacgatcc 1642 wfs-11-3 t...... t...... c... 1245


Appendix C DNA alignment of the 3' regions from all three type I AFP isoforms of starry flounder and winter flounder and the liver isoform of yellowtail flounder

The sequences downstream of winter flounder (wfl-AFP6, -AFP8, -AFP9, wfs-11-3, -F2, wfh-AFP1, -5a) and yellowtail flounder (ytl-AFP1) AFP genes were obtained from the non-redundant GenBank database, whereas the starry flounder sequences correspond to the liver (stfl-AFP1) and skin (stfs-AFP1) variants found in lambda insert #1 (Figure 9) as well as the skin (stfs-AFP8) and hyperactive (stfh-AFP1) variants isolated from lambda insert #2 (Figure 17). The alignment begins with the stop codon, and positions that are identical in all sequences are marked with an asterisk (*). Exonic sequence is capitalized, differences are highlighted in grey, and polyadenylation signals are marked with pound signs (#). The number of nucleotides is noted at the end of each row. GenBank accession numbers for deposited sequences are M62415 (wfl-AFP6), X53718 (wfl-AFP9), M63478 (wfs-11-3), M63479 (wfs-F2), X06356 (ytl-AFP1), M63477 (wfh-5a), and EU188795 (wfh-AFP1).
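The asterisk line described above can be derived from the aligned rows. The following is a minimal Python sketch under stated assumptions, not the tool used to generate this appendix. The function name `consensus_line` is hypothetical, rows are assumed to be left-aligned, a row that ends early is simply ignored beyond its last base, and any column containing a gap is treated as non-identical.

```python
def consensus_line(rows):
    """Return '*' for each column in which every row reaching that column
    carries the same base (case-insensitive), and ' ' otherwise."""
    width = max(len(row) for row in rows)
    marks = []
    for col in range(width):
        # Assumption: rows shorter than this column are ignored.
        bases = {row[col].upper() for row in rows if col < len(row)}
        identical = len(bases) == 1 and "-" not in bases
        marks.append("*" if identical else " ")
    return "".join(marks)


# First ten bases of stfl-AFP1 and stfs-AFP8 from the first alignment row.
print(consensus_line(["TAAGGATCGT", "TAATGATCAT"]))  # prints "*** **** *"
```

Comparing only the rows that reach a given column mirrors how the asterisk line continues past the ends of the shorter database sequences.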

          Ter
stfl-AFP1 TAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
wfl-AFP6  TAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
wfl-AFP9  TAAGGATCGTCGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
ytl-AFP1  TAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGTG 50
stfs-AFP1 TAATGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
wfs-11-3  TAATGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
wfs-F2    TAATGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
stfs-AFP8 TAATGATCATGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
5a        TAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
wfh-AFP1  TAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAATG 50
stfh-AFP1 TAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCG 50
consensus *** **** * ************************************  *

                             ######
stfl-AFP1 AGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTGAaaacc 100
wfl-AFP6  AGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGATTGTTAAaaacc 100
wfl-AFP9  AGATGTTACCAATCTGTTGAATAAAGCTGAGAAGCTGTTTGTTTAaaacc 100
ytl-AFP1  AGATGTTATTAATCTGATGAATAAACCTGAGAAGCTGTTTGTTGA 95
stfs-AFP1 AGATGTTACCAATATGCTGAATAAACCTGAGAAGCTGTTTGTTGAaaacc 100
wfs-11-3  AGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTTAaaacc 100
wfs-F2    AGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGATTGTTAAaaacc 100
stfs-AFP8 AGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTGAaaacc 100
5a        AGATGTTACCAATCTGCTGAATAAAC 76
wfh-AFP1  AGATATCACCAATCTGTTGAATAAAGCTGAGAAGCTGTTTGTT 93
stfh-AFP1 AGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTGAaaacc 100
consensus **** * *  *** ** ******** *********** ***** ******

stfl-AFP1 aagtgtccagttcatttcatctctgaaactccttcacagtttctgtagat 150
wfl-AFP6  aagtgtcctgttcatttcatctctgaaagtccgtcacagtttctgtagat 150
wfl-AFP9  aagtgtcctgttcatttcatctctgaaactccgtcacagtttctttagat 150
stfs-AFP1 aagtgtcctgttcatttcatctctggaactccttcacactttctgtagat 150
wfs-11-3  aagtgtcctgttcatttcatctctgaaactcattcacagtttctgtagat 150
wfs-F2    aagtgtcctgttcatttcatctctgaaagtccgtcacagtttctgtagat 150
stfs-AFP8 aagtgtccagttcatttcatctctgaaactccttcacactttctgtagat 150
stfh-AFP1 aagtgtcctgttcatttcatcactgaaactccttcacactttctgtagat 150
consensus ******** **************** ** **  ***** ***** *****

stfl-AFP1 catgtttttaacacataaacctccagaaatcatgatgcgtcacgtttgga 200
wfl-AFP6  catgtagactccaggaagtgatgccattgtgctgttgaacctgcaggtct 200
wfl-AFP9  catgtttttcttaacacataaacctccagaaatcatgatgcgtcacgttt 200
stfs-AFP1 catgtttttattttgtccagacgatgttgaactggagcagaatccagaaa 200
wfs-11-3  catgtttttattttgttcagacgatgttgaactggatcagaatccagaaa 200
wfs-F2    catgtagactccaggaagtgatgccattgtgctgttgaacctgcaggg 198
stfs-AFP8 catgtttttaacacataaacctccagaaatcatgatgtgtcacatttgga 200
stfh-AFP1 catgtttttaacacataaacctccagaaatcatgatgtgtcacatttgga 200
consensus *****                           *


Appendix D DNA alignment of the upstream region and exon 1 in the liver and hyperactive AFPs of winter, starry and yellowtail flounders

Sequences spanning exon 1 and flanking 5' sequences of liver and hyperactive AFP genes were obtained from the non-redundant GenBank database for the winter flounder and yellowtail flounder. The starry flounder liver isoform found in lambda insert #1 (stfl-AFP1, Figure 10) was also included. The alignment terminates at the 3' end of the first exon, which is marked by uppercase letters with the coding region in bold. Positions that are identical in all sequences are marked with an asterisk (*) and differences are highlighted in grey. The number of nucleotides is noted at the end of each row. GenBank accession numbers for sequences used are M62415 (wfl-AFP6), X53718 (wfl-AFP9), X06356 (ytl-AFP1), M63476 (wfh-5a), and EU188795 (wfh-AFP1).

wfh-5a    ggattgttgtaccaatctgctctgattggtcgacagtcaagcgatgactc 50
wfl-AFP6  ggagtgttgtaccaatctgctcagattggtcgacagtcaagcgatgaccc 50
wfl-AFP9  ggagtgttgtaccaatctgctcagattggtcgacagtcaagcgatgaccc 50
wfl-AFP8                              gtcgacagtcaagcgatgaccc 22
stfl-AFP1 ggagtgttgtaccaatctgctcagattggtcgacagtcaagcgatgactc 50
consensus *** ****************** ************************* *

wfh-5a    aggctcaaattactacaaaacagattcacactcacctggatattcACCAC 100
wfl-AFP6  aggctccagttactataaaacagattcacattgacctggatattcACCAC 100
wfl-AFP9  aggctccagttactataaaacagattcacattgacctggatattcACCAC 100
wfl-AFP8  aggctccagttactataaaacagattcacattcacctggatattcACCAC 72
stfl-AFP1 aggctccatttactacaaaacagactcacattcgcctgtatctttACCAC 100
consensus ****** * ****** ******** ***** *  **** ** ** *****

wfh-AFP1                                   AAGTTCTCAAAATGGCT 17
wfh-5a    ATCTTCATTTTCTAGTGAACCACTGCTCCCTAAAAGTTCTCAAAATGGCT 150
wfl-AFP6  ATCTTCATTTTGTAGTGAACCAGTGCTCCCTACAAGTTCTCAAAATGGCT 150
wfl-AFP9  ATCTTCATTTTGTAGTAAACCAGTGCTCCCTACAAGTTCTCAAAATGGCT 150
wfl-AFP8  ATCTTCATTTTGTAGTGAACCAGTGCTCCCTACAAGTTCTCAAAATGGCT 122
stfl-AFP1 ATCTTCCTTTTGTAGTGAAGCAGTGCTCCCTAAAAACTCTCAAAATGGCT 150
ytl-AFP1                GTGAAGCAGTGCTCCCTAAAAGTTCTCAAAATGGCT 36
consensus ****** **** **** ** ** ********* **  *************

wfh-AFP1  CTCTCACTTTTCACTGTCGGACAATTCATTTTCTTATTTTGGACAATCAG 67
wfh-5a    CTCTCACTCTTCACTGTCGGACAATTCATTTTCTTATTTTGGACAATCAG 200
wfl-AFP6  CTCTCACTTTTCACTGTCGGACAATTGATTTTCTTATTTTGGACAATGAG 200
wfl-AFP9  CTCTCACTTTTCACTGTCGGACAATTGATTTTCTTATTTTGGACAATGAG 200
wfl-AFP8  CTCTCACTTTTCACTGTCGGACAATTGATTTTCTTATTTTGGACAATGAG 172
stfl-AFP1 CTCTCACTCTTCACCGTCGGACAATTCATTTTCTTATTTTGGACAATAAG 200
ytl-AFP1  CTCTCACTCTTCACTGTTGGACAATTAATTTTCTTATTTTGGACACTCAG 86
consensus ******** ***** ** ******** ****************** * **


Appendix E DNA alignment of four unique starry flounder liver AFP gene sequences

DNA sequences were obtained from genomic DNA (g1 and g3) or phage insert DNA (L5) by PCR (Figure 14) or from lambda insert #1 (H1, Figure 10, Appendix A). Each of the ten additional sequence sources listed in Table 3 was identical to one of these four sequences. Differences are highlighted in grey and the number of nucleotides is noted at the end of each row. Intronic sequence is in lower case, while exonic sequence is in upper case with bolding used to denote coding sequence. The silent substitution is underlined. The primers used for PCR amplification to isolate these sequences are labelled as per Figure 6 and are marked with asterisks. Notes detailing the frequency with which each sequence was isolated are below.

     *********5’int*********
H1   ggaaggaatgatatctgcattatcccagaggggcaggcacgtgcacagatattttggggg 60
g1                          cccagaggggcaggcacgtgcacagatattttggggg 37
g3                          cccagaggggcaggcacgtgcacagatattttggggg 37

H1   gcaagtgctaaaacccccaaaaaagggcacccatcgccaaaatgtaaagctgacaacaca 120
g1   gcaagtgctaaaacccccaaaaaagggcacccatcgccaaaacgtaaagctgacaacaca 97
g3   gcaagtgctaaaacccccaaaaaagggcacccatcgccaaaatgtaaagctgacaacaca 97

H1   gtacacacacaatcttttttttactgttttctgacagagttgaagtattagcagcattac 180
g1   gtacacacacaatcttttttttactgttttctgacagagttgaagtattagcagcattac 157
g3   gtacacacacaatcttttttttactgttttctgacagagttgaagtattagcagcattac 157

H1   actaataatatgccacaatatacctcaccagactggcatatagctcaacttaagacctac 240
g1   actaataatatgccacaatatacctcaccagactggcatatagctcaacttaagacctac 217
g3   actaataatatgccacaacatacctcaccagactggcatatagctcaacttaagacctac 217

H1   cgttcatttcaataattacttgattatttgaattttaaatgaaacaaaactagagaactt 300
g1   cgttcatttcaataattacttgattatttgaattttaaatgaaacaaaactagagaactt 277
g3   cgttcatttcaataattacttgattatttgaattttaaatgaaacaaaactagagaactt 277

H1   gtctgatttgtagaacagtaaaactgcctttaatttctatcacacacagatattgaacac 360
g1   gtctgatttgtagaacagtaaaactgcctttaatttctatcacacacagatattgaacac 337
g3   gtctgatttgtagaacagtaaaactgcctttaatttctatcacacacagatattgaacac 337

H1   tgtcatcactgggttcggtgaaagtgacggaccagtaaatgttgtgatatataatattat 420
g1   tgtcatcactgggttcggtgaaagtgacggaccagtaaatgttgtgatatataatattat 397
g3   tgtcatcactgggttcggtgaaagtgacggaccagtaaatgttgtgatatataatattat 397

             **********3'int*********
H1   catcatttcaataataccattaatctctgcagAATCAGTGAAGCCAACCCCGACCCCGCA 480
g1   catcatttcaataataccattaatctctgcagAATCAGTGAAGCCAACCCCGACCCCGCA 457
g3   catcatttcaataataccattaatctctgcagAATCAATGAAGCCAACCCCGACCCCGCA 457
L5-2                                 AATCAATGAAGCCAACCCCGACCCCGCA 28

H1   GCCAAAGCTGCCGCAGTCGCCGACCCTGCCGCAGCCGCTGTAGCCCCCGCCGCTGACGCC 540
g1   GCCAAAGCTGCCGCAGTCGCCGACCCTGCCGCAGCCGCTGTAGCCCCCGCCGCTGACGCC 517
g3   GCCAAAGCTGCCGCAGTCGCCGACCCTGCCGCAGCCGCTGTAGCCCCCGCCGCTGACGCC 517
L5-2 GCCAAAGCTGCCGCAGTCGCCGACCCTGCCGCAGCCGCTGTAGCCCCCGCCGCTGACGCC 88

H1   TTCTCAGCAGCCGCCGACACTGCCTCTGACGCCGCCGCCGCCGCCGCCGCCACCGCCGCT 600
g1   TTCTCAGCAGCCGCCGACACTGCCTCTGACGCCGCCGCCGCCGCCGCCGCCACCGCCGCT 577
g3   TTCTCAGCAGCCGCCGACACTGCCTCTGACGCCGCCGCCGCCGCCGCCGCCACCGCCGCT 577
L5-2 TTCTCAGCAGCCGCCGACACTGCCTCTGACGCCGCCGCCGCCGCCGCCGCCACCGCCGCT 148

H1   GCCGCCAAAGCCGCCGCAGAAAAAACCGCCCGGGACGCCGCTGCAGCAGCCGCAGCCACC 660
g1   GCCGCCAAAGCCGCCGCAGAAAAAACCGCCCGGGACGCCGCTGCAGCAGCCGCAGCCACC 637
g3   GCCGCCAAAGCCGCCGCAGAAAAAACCGCCCGAGACGCCGCTGCAGCAGCCGCAGCCACC 637
L5-2 GCCGCCAAAGCCGTCGCAGAAAAAACCGCCCGAGACGCCGCTGCAGCAGCCGCAGCCACC 208

                         *********3’univ********
H1   GCCAGAGGTTAAGGATCGTGGTCGTCTTGATGTGGGATCATGT 703
g1   GCCAGAGGTTAAGGATCGTG 657
g3   GCCAGAGGTTAAGGATCGTG 657
L5-2 GCCAGAGGTTAAGGATCGTG 228

Stfl-AFP1 was isolated 7x in total: g1, g2, H1, H7-3, L1-4, L2-1, L4-1; the last 4 were isolated from the 3’int primer only. The sequence of g1 is identical to that of H1, except for a single nucleotide in the intron.

Stfl-AFP2 was isolated 6x in total: g3, H9-2, H13-4, H14-2, L7-2, L8-1; the last 5 were isolated from the 3’int primer only.

Stfl-AFP3 was isolated once, from L5-2. It is identical to a genomic sequence (g3) coding for Stfl-AFP2, except for a single nucleotide in the cds.


Appendix F DNA alignment of fourteen unique genes encoding starry flounder skin AFPs

DNA sequences were obtained by PCR from either genomic DNA (g4 to g10, as labelled in Figure 14) or phage insert DNA (H5-4 and H5-2), and from the lambda library (H1, in Figure 10 and Appendix B, and H11, in Figure 18 and Appendix J). Together, they encode the skin AFP variants listed in Table 3. Polymorphisms are highlighted in grey and the number of nucleotides is noted at the end of each row. Intronic sequence is in lower case, while exonic sequence is in upper case with bolding used to indicate coding sequence. Silent substitutions are underlined, gaps are marked with hyphens (-), and the Element S enhancer [59] is in white text that is highlighted in black. The PCR primers used are labelled as in Figure 6 and are marked with asterisks. Notes detailing the frequency with which each sequence was isolated are below.

*********5’int********* H1 ggaaggaaggatatctgcattatcccagaggggccatttgttttacagccagtggtaaaa 60 g4 cccagaggggccatttgttttacagccagtggtaaaa 37 g5 cccagaggggccatttgttttacagccagtggtaaaa 37 g6 cccagaggggccatttgttttacagccagtggtaaaa 37 g8 cccagaggggccatttgttttacagccagtggtaaaa 37 g9 ccccgaggggccattcgttttacagccagtggtaaaa 37 H11 ggaaggaaggatatctgcattatcccagaggggccatttgttttacagccagcggtaaaa 60

H1 gttgaagatcttcatctgtgttcgtcggatggaaagtttgttctgaaaccttcagtggaa 120 g4 gatgaagatcttcatccttattcgtctgatggaaagtttgttctgaaaca------87 g5 gttgaagatcttcatctgtgttcgtcggatggaaagtttgttctgaaaccttcagtggaa 97 g6 gttgaagatcttcatctgtgttcgtcggatggaaagtttgttctgaaaccttcagtggaa 97 g8 gatgaagatcttcatccttattcgtctgatggaaagtttgttctgaaaca------87 g9 gatgaagatcttcatccgtgttcgtctgatggaaagtttgttc------80 H11 gatgaagatcttcaaccgtgttcgtctgatggaaagtttgttctgaaaca------87

H1 gtgtagtatattccccttagcaaatatccatagccttgaatcttaagttcaaacctttaa 180 g4 ------87 g5 gtgtagtatattccccttagcaaatatccatagccttgaatcttaagttcaaacctttaa 157 g6 gtgtagtatattccccttagcaaatatccatagccttgaatcttaagttcaaacctttaa 157 g8 ------87 g9 ------80 H11 ------87

H1 gtattatctccagatgtgttcagtgtgtgtctccttgtctgaactatccttgaactgcct 240 g5 gtattatctccagatgtgttcagtgtgtgtctccttgtctgaactatcgttgaactgcct 217 g6 gtattatctccagatgtgttcagtgtgtgtctccttgtctgaactatccttgaactgcct 217 g8 ------87 g9 ------80 H11 ------87

H1 atggaataatgagaggagagatggtttccagcgggtccttaaatcttaaggtacgacaca 300 g4 ------87 g5 atggaataatgagaggagagatggtttccagcgggtccttaaatcttaaggtacgacaca 277 g6 atggaataatgagaggagagatggtttccagcgggtccttaaatcttaaggtacgacaca 277 g8 ------87 g9 ------80 H11 ------87

H1 ttcccaaatttaggcagaaggccgggttgtgtgacgtcattatatctctaggtttgtggt 360 g4 ------87 g5 ttcccaaatttaggcagaaggccgggttgtgtgacgtcattatatctctaggtttgtggt 337 g6 ctcccaaatttaggcagaaggccgggttgtgtgacgtcattatatctctaggtttgtggt 337 g8 ------87 g9 ------80 H11 ------87

H1 aaacaacccctcctatttaacgccttaccttgcagagtcaaggcggattttcactattcg 420 g4 ------87 g5 aaacaacccctcctatttaacgccttaccttgcagagtcaaggcggattttcactattcg 397 g6 aaacaacccctcctatttaacgccttaccttgcagagtcaaggcggattttcactattcg 397 g8 ------87 g9 ------80 H11 ------87

H1 gcttgtgtgttatctccgagttttctagaaactcgtcctgacctataatactc------473 g4 ------87 g5 gcttgtgtgttatctccgagttttctagaaactcgtcctgacctataatactcttgtgta 457 g6 gcttgtgtgttatctccgagttttctagaaactcgtcctgacctataatactcttgtgta 457 g8 ------87 g9 ------80 H11 ------87

H1 ------attatacttgtaagtactgggtccgcgtctcctctcttcgaacaccgacttcta 527 g4 ------87 g5 ataaacattatacttgtaagtactgggtccgcgtctcctctcttcgaacaccgacttcaa 517 g6 ataaacattatacttgtaagtactgggtccgcgtctcctctcttcgaacaccgacttcaa 517 g8 ------87 g9 ------80 H11 ------87

*********stfsk********* ******** H1 caagacactactgcgggaaacatacgatagaagaaagagattcatgtgttcaggcctaaa 587 g4 ------87 g5 caagacactactgcgggaaacatacgatagaagaaagagattcatgtgttcaggcctaaa 577 g6 caagacaatactgcgggaaacatacgatagaagaaagagattcatgtgttcaggcctaaa 577 g8 ------87 g9 ------80 H11 ------87

**allsk********** H1 cctgaaaaaatctgagctctgttcaatcatgggaaacaactttttaattgagtcatggct 647 g4 ------87 g5 cctgaaaaaatctgagctctgttcaatcatgggaaacaactttttaattgagtcatggct 637 g6 cctgaaaaaatctgagctctgttcaatcatggggaacaactttttaattgagtcatggct 637 g7 tctgttcaatcatgggaaacaactttttaattgagtcatggct 43 g8 ------87 g9 -----aaaaatctgagctctgttca----tcggaaacaaattt-gaattaagtcagggct 130 H5-4 tctgttcaatcatgggaaacaactttttaattaagtcatggct 43 g10 tctgttcaatcatgggaaacaactttttaattgagtcatggct 43 H5-2 tctgttcaatcatgggaaacaaccttttaattaagtcatggct 43 H11 ------87

H1 gcaaaactcttttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 707 g4 ggaaaactatttgatatgcacagaagaagaagaagtgatctttagttcatcactgtggaa 147 g5 gcaaaactcttttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 697 g6 gcaaaactcttttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 697 g7 gcaaaactcttttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 103 g8 ggaaaactatttgatatgcacagaagaagaagaagtgatctttagttcatcactgtggaa 147 g9 ggaaaact---ttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 187 H5-4 gca------46 g10 gcaaaactcttttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 103 H5-2 gca------46 H11 gtaaaactcttttatatgaacagaagaagaagaagtgatctttagttcatcactgtggaa 147

H1 acatcagcagcagttaaattctgtctgcttcagtatcaccggccagttccagtgctcatg 767 g4 acatcagcagcagttaaagtctgtctgcttcagtatcaccggccagttccagtgctcatg 207 g5 acatcagcagcagttaaattctgtctgcttcagtatcaccggccagttccagtgctcatg 757 g6 acatcagcagcagttaaattctgtctgcttcagtatcaccggccagttccagtgctcatg 757 g7 acatcagcagcagttaaattctgtctgcttcagtatcaccggccagttccagtgctcatg 163 g8 acatcagcagcagttaaagtctgtctgcttcagtatcaccggccagttccagtgctcatg 207 g9 acatcagcagcagttaaagtctgtctgcttcagtatcactggccagttccagtgctcatg 247 H5-4 ------tcagtatcaccggccagttccagtgctcatg 77 g10 acatcagcagcagttaaattctgtctgcttcagtatcaccggccagttccagtgctcatg 163 H5-2 ------tcagtatcaccggccagttccagtgctcatg 77 H11 acatcagcagcagttaaagtctgtctgcttcagtatcactggccagttccagtgctcatg 207

H1 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 827 g4 -ttctgatcagcttgttttgaatgatataaaacggatggagtccctgtttgaccctgttt 267 g5 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 817 g6 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 817 g7 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 223 g8 -ttctgatcagcttgttttgaatgatataaaacggatggagtccctgtttgaccctgttt 267 g9 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 307 H5-4 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 137 g10 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 223 H5-2 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 137 H11 tttctgatcagcttggtttgaatgatatgaaacggatggagtccctgtttgaccctgttt 267

H1 aacacaagatggccaagtggaccatcttt attaacataatgttttacatgagcacttcct 887 g4 aacacaagatggcaacgtggaccatcttt attaacataatgttttacatgagcacttcct 327 g5 aacacaagatggccaagtggaccatcttt attaacataatgttttacatgagcacttcct 877 g6 aacacaagatggccacgtggatcatcttt attaacataatgttttacatgagcacttcct 877 g7 aacacaagatggccaagtggaccatcttt attaacataatgttttacatgagcacttcct 283 g8 aacacaagatggcaacgtggaccatcttt attaacataatgttttacatgagcacttcct 327 g9 aacacaagatgg------accatcttt attaacataatgttttacat cagcacttcct 359 H5-4 aacacaagatggccacgtggatcatcttt attaacataatgttttacatgagcacttcct 197 g10 aacacaagatggccaagtggaccatcttt attaacataatgttttacatgagcacttcct 283 H5-2 aacacaagatggccacgtggaccatcttt attaacataatgttttacatgagcacttcct 197 H11 aacacaagatgg------accatcttt attaacataatgttttacatgagcacttcct 319

110 H1 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 947 g4 gttttcagccctaaacttaaagaa-cctcatggaaacttcctgatgatctggtgacacct 386 g5 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 937 g6 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 937 g7 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 343 g8 gttttcagccctaaacttaaagaa-cctcatggaaacttcctgatgatctggtgacacct 386 g9 gttttcagcccgaaacttaaagaa-cctcatggaatcttcctgatgatctggtgacacct 418 H5-4 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 257 g10 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 343 H5-2 gttttcagccctaaacctaaagaggcctcatggaaacttcctgatgatctggtgacacct 257 H11 gttttcagccctaaacctaaagaggcctcatagaaacttcctgatgatctggtgacacct 379

H1 gctggttgaaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 1007 g4 gctggttgaaggaaacagagtttgagagtcagaagaacaaatgattttagtttgaatgaa 446 g5 gctggttgaaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 997 g6 gctggttgaaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 997 g7 gctggttgaaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 403 g8 gctggttgaaggaaacagagtttgagagtcagaagaacaaatgattttagtttgaatgaa 446 g9 gctggttgaaggaaacagagttttagaggcagccgaacaaatgattttagtttgaatgaa 478 H5-4 gctggttgaaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 317 g10 gctggttgaaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 403 H5-2 gctggttggaggaaacagagttcgagaggcagctgaacaaattattttagtttgaaagaa 317 H11 gctggttgaaggaaacagagtttgagagtcagaagaacaaatgattttagtttgaaacaa 439

H1 gaagctgtcatttgagattatgttgtagggggggggggg------a 1047 g4 gaagctgtcatttga--ttttgtt------468 g5 gaagctgtcatttgagattatgttgtaggggggggggg------a 1036 g6 gaagctgtcatttgagattatgttgtagggggggg------a 1033 g7 gaagctgtcatttgagattatgttgtaggggggggggggg------a 444 g8 gaagctgtcatttga--ttttgtt------468 g9 gaagctgtcattttattttatgttgggggggggggggggg------518 H5-4 gaagctgtcatttgagattatgttgtagggggggg------a 353 g10 gaagctgtcatttgagattacgttgtagggggggggggg------a 443 H5-2 gaagctgtcatttgagattatgttgtagggggggg------a 353 H11 gaagctgtcatttgatattatgttgtgggggggggggcgggggtggtcatcacacacaga 499

H1 tactgaactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgtgataa 1107 g4 ------tcatcactgggttcggtgaaagtgacggaccagtacatgttgtgataa 516 g5 tactgaactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgtgataa 1096 g6 tactgaactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgtgataa 1093 g7 tactgaactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgtgataa 504 g8 ------tcatcactgggttcggtgaaagtgacggaccagtacatgttgtgatat 516 g9 tactgaacactgtcatcactgggttcggtgaaagtgacggaccagtacatgttgtgatat 578 H5-4 tactgaactctgtcatcaccgggttcggtgaaagtgacggagcagtacatgttgtgataa 413 g10 tactgaactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgtgataa 503 H5-2 tactgaactctgtcatcaccgggttcggtgaaagtgacggacaagtacatgttgtgataa 413 H11 tattgaacactgtcatcactgggttcagtgaaagtgacggaccagtacatgttgtgataa 559

111 **********3’int********* H1 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 1167 g4 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 576 H7-2 AATCACTAAAACGAA 15 g5 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 1156 g6 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 1153 g7 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 564 L5 AATCACTAAAACGAA 15 L7-3 AATCACCAAAACGAA 15 g8 ataacattatcataataattataataataccattaatctctgcagAATCACTCATACTAA 576 g9 ataatattttcaaaataattttaataataccattaatttctgcagAATCACTGACATCAA 638 H5-4 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 473 g10 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 563 H5-2 ataattatatcataataattataataataccattaatttctgcagAATCACTAAAACGAA 473 H11 ataa---tatcataataattataataataccattaatctctgcagAATCACTGACATCAA 617

H1 C ATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 1227 g4 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 636 H7-2 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 75 g5 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 1216 g6 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 1213 g7 CATGGACGCCCCAGCCAGAGCCGCGGCAGCCACGGCCGCCGCCGCCAAAGCCGCCGCAGA 624 L5 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 75 L7-3 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 75 g8 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACTGCCGCTGCAGCCAAAGCCGCAGCCGA 636 g9 CATGGACGCCCCAGCCAAAGCCGCCGCAGCCACCGCAGCCGCCGCCAAAGCCGCCGCAGA 698 H5-4 CATGGACGCCCCGGCCAGAGCCGCCGCTGCCACCGCCGCCGCCGCCAGAGCCACCGCAGA 533 g10 CATGGACGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCACAGA 623 H5-2 CATGGGCGCCCCAGCCAGAGCCGCCGCAGCCACCGCCGCCGCCGCCAAGGCCGCCGCAGA 533 H11 CATGGACGCCCCAGCCGCCGCCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGA 677

H1 AGCCACCGCCGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCAGCC------1276 g4 AGCCACCGCCGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCAGCC------685 H7-2 AGCCACCGCCGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCAGCC------124 g5 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------1265 g6 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------1262 g7 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------673 L5 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------124 L7-3 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------124 g8 AGCCACCGCCGCAGCAGCTGCCAAAGCAGCAGCCGACACCAAAGCTGCCGCAGCCGCCGC 696 g9 AGCCACCGCCGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGGC------747 H5-4 AGCCACCGAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------582 g10 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCAGCCGCCACCAAAGCCGCC------672 H5-2 AGCCACCAAAGCCGCAGCTGCCAAAGCAGCGGCCGCCACCAAAGCCGCC------582 H11 AGCCACCGCCGCCGCAGCTGCCAAAGCAGCAGCCGCAACCAAAGCTGCCGCAGCC----- 732

112 *********3’univ******** H1 ------CGTTAA TGATCGTGGTCGTCTTGATGTGGGATCATGT 1313 g4 ------CGTTAA TGATCGTG 699 H7-2 ------CGTTAA TGATCGTG 138 g5 ------CGTTAA TGATCATG 1279 g6 ------CGTTAA TGATCATG 1276 g7 ------CGTTAA TGATCATG 687 L5 ------CGTTAA TGATCATG 138 L7-3 ------CGTTAA TGATCATG 138 g8 CGCCGCCCTTTGAGGATCGTG 717 g9 ------CGTTAA TGATCATG 761 H5-4 ------CGTTAA TGATCATG 596 g10 ------CGTTAA TGATCATG 686 H5-2 ------CGTTAA TGATCATG 596 H11 ------CGTTAA TGATCATGGTCGTCTTGATGTGGGATCATGT 746

Stfs-AFP1 was isolated 3x in total: g4, H1, H7-2. H7-2 is from the 3’int primer only.

Stfs-AFP2 was isolated 5x in total: g5, g6, g7, L5, L7. The last two were isolated from the 3’int primer only.

Stfs-AFP3 was isolated once, from g8.

Stfs-AFP4 was isolated once, from g9.

Stfs-AFP5 was isolated once, from H5.

Stfs-AFP6 was isolated once, from g10.

Stfs-AFP7 was isolated once, from H5.

Stfs-AFP8 was isolated once, from H11.


Appendix G DNA alignment of the American plaice skin AFPs

An alignment of seventeen unique sequences isolated from American plaice genomic DNA is shown. The sequences were obtained using the allsk and 3'univ primers (Table 1) and are grouped by the encoded protein variant. Intronic sequence is in lower case, exonic sequence is in upper case, and the coding region is bolded. Differences are highlighted in grey, silent substitutions are underlined, and the number of nucleotides is noted at the end of each row.
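The case convention described above (lower case for intronic sequence, upper case for exonic sequence) can be read back out of a plain-text row programmatically. The following is a minimal sketch only, not the procedure used in this work; the function name `split_by_case` is hypothetical, and the test fragment is the intron/exon junction copied from the aps-AFP10 row of the alignment.

```python
import re

def split_by_case(seq):
    """Split a mixed-case alignment row into (kind, segment) runs.

    Assumption for this sketch: lower case marks intronic sequence and
    upper case marks exonic sequence. Alignment gaps ('-') and spaces
    are discarded before splitting.
    """
    cleaned = seq.replace("-", "").replace(" ", "")
    runs = []
    for match in re.finditer(r"[a-z]+|[A-Z]+", cleaned):
        segment = match.group(0)
        kind = "exon" if segment.isupper() else "intron"
        runs.append((kind, segment))
    return runs

# Junction fragment taken from the aps-AFP10 row of the alignment.
fragment = "taatctctgcagAATCACTGACATCAAC"
for kind, segment in split_by_case(fragment):
    print(kind, len(segment), segment)
```

The sketch reports only runs of consistent case; it does not check splice-site consensus or reading frame.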

114 aps-AFP10 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP18 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP09 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP11 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP07 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP12 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP05 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP13 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP17 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP15 tctgttaaaaccatggacctgctttattaggtttgatttaatgttaatca 50 aps-AFP08 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP04 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP03 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP06 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP01 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP14 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP16 tctgttaaaaccatggacctgttttattaggtttgatttaatgttaatca 50 aps-AFP10 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP18 tttcaaatcatgggaaacagctttttaattaagtccgggctggaaaactc 100 aps-AFP09 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP11 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP07 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP12 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP05 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP13 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP17 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP15 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP08 tttcaaatcatgggaaacagctttttaattaagtccgggctggaaaactc 100 aps-AFP04 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP03 ttgcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP06 ttgcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 
100 aps-AFP01 ttgcaaatcatgggaaacaactttttaattaagtccgggctggaaaactc 100 aps-AFP14 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaacta 100 aps-AFP16 tttcaaatcatgggaaacaactttttaattaagtccgggctggaaaacta 100

aps-AFP10 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP18 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP09 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP11 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP07 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP12 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP05 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP13 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP17 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP15 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP08 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP04 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP03 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP06 ttttatatgcacagaagaagaagaagtgatctttagttcatcactgtgga 150 aps-AFP01 ttttatatgcacagaagaagaagaagtgatctttagtccatcactgtgga 150 aps-AFP14 ttttatatgcacagaaggagaagaagtgatctttagttcatcactgtgga 150 aps-AFP16 ttttatatgcacagaaggagaagaagtgatctttagttcatcactgtgga 150 115 aps-AFP10 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP18 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP09 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP11 aacatcagcagcagttaaagtctgtctgcctcagtgtcaccggccagttc 200 aps-AFP07 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP12 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP05 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP13 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP17 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP15 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP08 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP04 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP03 aacatcagcagaagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP06 
aacatcagcagaagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP01 aacatcagcagcagttaaagtctgtctgcttcagtgtcaccggccagttc 200 aps-AFP14 aacatcagcagcagttaaagtctgtctgcttcagtatcagcggccagttc 200 aps-AFP16 aacatcagcagcagttaaagtctgtctgcttcagtatcagcggccagttc 200

aps-AFP10 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP18 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP09 cagtgctcatgtttctgatcagcctggtttgaatgatataaaacggatcg 250 aps-AFP11 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP07 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP12 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP05 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP13 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP17 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP15 cagtgctcatgtttctggtcagcttggtttgaatgatataaaacggatcg 250 aps-AFP08 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP04 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP03 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP06 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP01 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250 aps-AFP14 cagtgctcatgtttctggtcagcttggtttgaatgatataaaacggatcg 250 aps-AFP16 cagtgctcatgtttctgatcagcttggtttgaatgatataaaacggatcg 250

aps-AFP10 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP18 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP09 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP11 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP07 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP12 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP05 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP13 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP17 agtgcctgtgtgaccctgtttaatacaagatggccacgtggaccatcttt 300 aps-AFP15 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP08 agtgcctgtgtgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP04 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP03 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP06 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP01 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP14 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 aps-AFP16 agtgcctgtttgaccctgtttaacacaagatggccacgtggaccatcttt 300 116 aps-AFP10 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP18 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP09 atttacataatgtttcatatcagcacttcctg tttttagccctaaaccta 350 aps-AFP11 atttacataatgtttcatatcagcacttcctg tttttagccctaaaccta 350 aps-AFP07 atttacataatgtttcatatcagcacttcctg tttttagccctaaaccta 350 aps-AFP12 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP05 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP13 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP17 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP15 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP08 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP04 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP03 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP06 
atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP01 atttacataatgtttcatatcagcacttcctg ttttcagccctaaaccta 350 aps-AFP14 atttacataatgtttca catcagcacttcctg ttttcagccctaaaccta 350 aps-AFP16 atttacataatgtttca catcagcacttcctg ttttcagccctaaaccta 350

aps-AFP10 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP18 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP09 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP11 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP07 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP12 aagaggcctcatggaaacttcctgatgatccggtgacaactgctggttga 400 aps-AFP05 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP13 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP17 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP15 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP08 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP04 aagaggcctcatggaaacttcctgatgatctggtgacaactgctggttga 400 aps-AFP03 aagaggcctcatgaaaacttcctgatgatctggtgacacctgctggttga 400 aps-AFP06 aagaggcctcatgaaaacttcctgatgatctggtgacacctgctggttga 400 aps-AFP01 aagaggcctcatgaaaacttcctgatgatctggtgacacctgctggttga 400 aps-AFP14 aaaaggcctcatggaaacttcctgatgatctggtgacacctgctggttga 400 aps-AFP16 aaaaggcctcatggaaacttcctgatgatctggtgacacctgctggttga 400

aps-AFP10 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP18 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP09 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP11 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP07 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP12 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP05 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP13 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP17 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP15 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP08 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP04 aggaaacaatgtttgagaggcagcagaacaaatgattttagttttaatga 450 aps-AFP03 aggaaacagagtttgagaggcagcagaacaaatgattttagtttgaatga 450 aps-AFP06 aggaaacagagtttgagaggcagcagaacaaatgattttagtttgaatga 450 aps-AFP01 aggaaacagagtttgagaggcagcagaacaaatgattttagtttgaatga 450 aps-AFP14 aggaaacagagtttgagaggcagcagaacaaatgattttagtttgaatga 450 aps-AFP16 aggaaacagagtttgagaggcagcagaacaaatgattttagtttgaatga 450 117 aps-AFP10 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP18 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP09 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP11 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP07 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP12 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP05 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP13 agaagctgtcatttaattttatgttgtggggg-----ggggggtcatcac 495 aps-AFP17 agaagctgtcatttaattttatgttgtggggg-----ggggggtcatcac 495 aps-AFP15 agaagctgtcatttaattttatgttgtggggg-----ggggggtcatcac 495 aps-AFP08 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP04 agaagctgtcatttaattttatgttgtggggga-----gggggtcatcac 495 aps-AFP03 agaagctgtcctttgattttatgttctgggg-aggggggggggtcatcac 499 aps-AFP06 
agaagctgtcctttgattttatgttctgggg-aggggggggggtcatcac 499 aps-AFP01 agaagctgtcctttgattttatgttctgggggaggggggggggtcatcac 500 aps-AFP14 agaagctgtcatttgattttatgttgtgggg---gggggggggtcatcac 497 aps-AFP16 agaagctgtcatttgattttatgttgtgggg----ggggggggtcatcac 496

aps-AFP10 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP18 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP09 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP11 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP07 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP12 acgagaatattgaatactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP05 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP13 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP17 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP15 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP08 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP04 acgagaatattgaacactgtcatcactgggttctgtgaaagtgacggacc 545 aps-AFP03 acgaggatattgaacactgtcatcactgggttctgtgaaagtgacggacc 549 aps-AFP06 acgaggatattgaacactgtcatcactgggttctgtgaaagtgacggacc 549 aps-AFP01 acgaggatattgaacactgtcatcactgggttctgtgaaagtgacggacc 550 aps-AFP14 acgaggatattgaacactgtcatcactgggttctgtgaaagtgacggacc 547 aps-AFP16 acgaggatattgaacactgtcatcactgggttctgtgaaagtgacggacc 546

aps-AFP10 cgtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP18 cgtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP09 cgtacatgttgtgatatataatattatcataataattataataatgccat 595 aps-AFP11 cgtacatgttgtgatatataatattatcataataattataataatgccat 595 aps-AFP07 cgtacatgttgtgatatataatattatcataataattataataatgccat 595 aps-AFP12 cgtacatgttgtgatatataatattatcataataattataataatgccat 595 aps-AFP05 agtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP13 cgtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP17 cgtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP15 cgtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP08 agtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP04 cgtacatgttgtgatatataatattatcataataattataata---ccat 592 aps-AFP03 cgtacatgttgtgatatataatattatcataataattataata---ccat 596 aps-AFP06 cgtacatgttgtgatatataatattatcataataattataata---ccat 596 aps-AFP01 cgtacatgttgtgatatataatattatcataataattataata---ccat 597 aps-AFP14 cgtacatgttgtgaaatataatattatcataataattataata---ccat 594 aps-AFP16 cgtacatgttgtgaaatataatattatcataataattataata---ccat 593 118 aps-AFP10 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 642 aps-AFP18 taatctctgcagAATCACTGGCATCAAC ATGGACCCAGCAAAAGCCGCCG 642 aps-AFP09 tcatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 645 aps-AFP11 tcatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 645 aps-AFP07 tcatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 645 aps-AFP12 tcatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 645 aps-AFP05 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 642 aps-AFP13 taaaat--gcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 640 aps-AFP17 taaaat--gcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 640 aps-AFP15 taaaat--gcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 640 aps-AFP08 taaaat--gcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 640 aps-AFP04 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 642 aps-AFP03 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGTTG 646 aps-AFP06 
taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 646 aps-AFP01 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 647 aps-AFP14 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 644 aps-AFP16 taatctctgcagAATCACTGACATCAAC ATGGACCCAGCAAAAGCCGCCG 643

aps-AFP10 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 692 aps-AFP18 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 692 aps-AFP09 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 695 aps-AFP11 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 695 aps-AFP07 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 695 aps-AFP12 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 695 aps-AFP05 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 692 aps-AFP13 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 690 aps-AFP17 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 690 aps-AFP15 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 690 aps-AFP08 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 690 aps-AFP04 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 692 aps-AFP03 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 696 aps-AFP06 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 696 aps-AFP01 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 697 aps-AFP14 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 694 aps-AFP16 CAGCCACCGCCGCCAAAGCCAAAGCCGACGCCGAAAAGACTGCAGCCGCC 693

aps-AFP10 GCCGCCAAGGCCGCTGCCGACACCGCCGCTGCCGCCGCCAAAGCCGCCAA 742 aps-AFP18 GCCGCCAAGGCCGCTGCCGACACCGCCGCTGCCGCCGCCAAAGCCGCCAA 742 aps-AFP09 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAAAGCCGCCAA 736 aps-AFP11 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAAAGCCGCCAA 736 aps-AFP07 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAAAGCCGCCAA 736 aps-AFP12 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAAAGCCGCCAA 736 aps-AFP05 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAAAGCCGCCAA 733 aps-AFP13 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAA 722 aps-AFP17 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAA 722 aps-AFP15 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAA 722 aps-AFP08 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------CAA 722 aps-AFP04 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------TGCCGCCGCCAA 733 aps-AFP03 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------TGCCGCCGCCAA 737 aps-AFP06 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------TGCCGCCGCCAA 737 aps-AFP01 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------TGCCGCCGCCAA 738 aps-AFP14 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------TGCCGCCGCCAA 735 aps-AFP16 GCCGCCAAGGCCGCCGCCGACACCGCCGC ------TGCCGCCGCCAA 734 119 aps-AFP10 AGCCGCCGCCCATTAA GGATCGTG 766 aps-AFP18 AGCCGCCGCCCATTAA GGATCGTG 766 aps-AFP09 AGCCGCCGCCCATTAA GGATCGTG 760 aps-AFP11 AGCCGCCGCCCATTAA GGATCGTG 760 aps-AFP07 AGCCGCCGCCCATTAA GGATCGTG 760 aps-AFP12 AGCCGCCGCCCATTAA GGATCGTG 760 aps-AFP05 AGCCGCCGCCCATTAA GGATCGTG 757 aps-AFP13 AGCCGCCGCCCATTAA GGATCGTG 746 aps-AFP17 AGCCGCCGCCCATTAA GGATCGTG 746 aps-AFP15 AGCCGCCGCCCATTAA GGATCGTG 746 aps-AFP08 AGCCGCCGCCCATTAA GGATCGTG 746 aps-AFP04 AGCCGCCGCCCCTTAA GGATCGTG 757 aps-AFP03 AGCCGCCGCCCCTTAA GGATCGTG 761 aps-AFP06 AGCCGCCGCCCCTTAA GGATCGTG 761 aps-AFP01 AGCCGCCGCCCCTTAA GGATCGTG 762 aps-AFP14 AGCCGCCGCCCCTTAA GGATCGTG 759 aps-AFP16 AGCCGCCGCCCCTTAA GGATCGTG 758


Appendix H DNA alignment of three unique gene sequences encoding starry flounder hyperactive AFPs

An alignment of three unique sequences obtained from PCR on genomic DNA is shown. Sequences are labeled by codes assigned to genomic clones after sequencing (g#, Table 3). Different nucleotides are highlighted in grey and the number of nucleotides is noted at the end of each row. Notes detailing the result of a nested PCR on primary phage plaque H11 (Figure 14) are at the bottom.
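Because grey highlighting does not survive in plain text, the positions at which the clones differ can be recovered by comparing the aligned rows column by column. The sketch below is illustrative only; the function name `variable_columns` is hypothetical, and the input is the first 60-nucleotide block of the g11, g12 and g13 rows of this appendix.

```python
def variable_columns(rows):
    """Return (1-based position, column bases) for every alignment column
    in which the aligned sequences are not all identical."""
    names = list(rows)
    seqs = [rows[name] for name in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned rows must have equal length")
    differences = []
    for i in range(length):
        column = "".join(s[i] for s in seqs)
        if len(set(column)) > 1:
            differences.append((i + 1, column))
    return differences

# First alignment block (positions 1-60) of the three genomic clones.
block1 = {
    "g11": "AGCCAATGCCGCCGCCGCAGCAGCCACCGCCGCTGCCGCAGCAATAGCAGCCGAGGAAGC",
    "g12": "AGCCAATGCCGCCGCCGCAGCAGCCACCGCCGCTGCCGCAGCAATAGCAGCCGAGGAAGC",
    "g13": "AGCCAATGCCGCCGCCGCAGCAGCCACCGCCGCCGCCGCAGCAATAGCAGCCGAGGAAGC",
}
print(variable_columns(block1))
```

Each reported column lists the bases in row order (g11, g12, g13), so a single call identifies which clone carries the variant base.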

g11 AGCCAATGCCGCCGCCGCAGCAGCCACCGCCGCTGCCGCAGCAATAGCAGCCGAGGAAGC 60
g12 AGCCAATGCCGCCGCCGCAGCAGCCACCGCCGCTGCCGCAGCAATAGCAGCCGAGGAAGC 60
g13 AGCCAATGCCGCCGCCGCAGCAGCCACCGCCGCCGCCGCAGCAATAGCAGCCGAGGAAGC 60

g11 CGCAACCGCGGCTGCTACCGCCGCAGCTGCTGCCGCCGCCACCGCCGCCACAGCCCAGGC 120
g12 CGCAACCGCGGCTGCTACCGCCGCAGCTGCTGCCGCCGCCACCGCCGCCACAGCCCAGGC 120
g13 CGCAACCGCGGCTGCTACCGCCGCAGCTGCTGCCGCCGCCACCGCCGCCACAGCCCAGGC 120

g11 AGCCATCTTTGACAAAGCCGCAGCCGCCGCATCCACAACCGCCACCACCGCCGCCACGGC 180
g12 AGCCATCTTTGACAAAGCCGCAGCCGCCGCATCCACAACCGCCACCACCGCCGCCACGGC 180
g13 GGCCATCTTTGACAAAGCCGCAGCCGCCGCATCCACAACCGCCACCACCGCCGCCACGGC 180

g11 GGCCGCCACCATAGCCACCACCGCCGCAGCCGCAGCCGC 219
g12 GGCCGCCGCCACAGCCACCACCGCCGCAGCCGCAGCCGC 219
g13 GGCCGCCGCCACAGCCACCACCGCCGCAGCCGCAGCCGC 219

The product of the nested PCR on phage H11 prior to insert isolation and sequencing was identical to bp 59–116 of the genomic clones.

The portion of the H11 insert that corresponds to this portion of the gene is identical to that of g12.


Appendix I DNA alignment of winter and starry flounder hyperactive AFPs

An alignment of the starry flounder hyperactive AFP from lambda insert #2 (stfh-AFP2, Figure 17) and the winter flounder hyperactive variant wfh-AFP1 is shown. Identical bases are marked with a dot (.), the polyadenylation site is marked with pound signs (#), and the number of nucleotides is noted at the end of each row. The translated sequences of stfh-AFP2 and wfh-AFP1 are above or below the respective gene sequence. Substitutions that are silent in all three translated products are underlined, substitutions that are silent between wfh-5a and stfh-AFP2 are italicized, and those that are silent between wfh-5a and wfh-AFP1 are bolded. The primers used to isolate hyperactive sequences from starry flounder DNA (Table Z) are marked with asterisks and labelled. The GenBank accession numbers for wfh-AFP1 and wfh-5a are EU188795 and M63477, respectively.
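Whether a substitution is silent depends only on whether the two codons encode the same amino acid under the standard genetic code. The sketch below illustrates that test; it is not the analysis performed for this appendix, the names `CODON_TABLE` and `classify_substitution` are hypothetical, and the example codons are illustrative rather than taken from a specific alignment position.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(codon_a, codon_b):
    """Classify a codon pair as identical, silent, or an amino acid replacement."""
    a, b = codon_a.upper(), codon_b.upper()
    if a == b:
        return "identical"
    aa_a, aa_b = CODON_TABLE[a], CODON_TABLE[b]
    if aa_a == aa_b:
        return "silent"
    return f"replacement {aa_a}->{aa_b}"

print(classify_substitution("GCT", "GCC"))  # both codons encode alanine
print(classify_substitution("CAA", "CGA"))  # glutamine versus arginine
```

Applying the test to an alignment additionally requires the reading frame, which in this appendix is given by the translated rows above and below the gene sequences.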


          I T E A I D P A A Q A A A A A
stfh-AFP2 AATCACTGAAGCCA---TCGACCCCGCAGCCCAAGCCGCTGCAGCCGCAG 47
wfh-5a    C..T...... ---...... A...... C...... 47
wfh-AFP1  C...... ACA...... AG...... A...... 50
          I T E A N I D P A A R A A A A A

          A A A A A V V T A A D A A A A A A
stfh-AFP2 CCGCCGCCGCCGCCGTTGTCACCGCCGCCGACGCTGCCGCAGCCGCCGCC 97
wfh-5a    ....A...A....A...... T.....A.C...... 97
wfh-AFP1  .A...T..AAA....CA...... T... 100
          A A S K A A V T A A D A A A A A A

          N A A A N A A A V A A A T A A D V
stfh-AFP2 AACGCCGCCGCCAACGCCGCCGCAGTCGCTGCGGCCACCGCCGCCGACGT 147
wfh-5a    GC.AT ...... C...... G...... T.C 147
wfh-AFP1  .C.AT...... ATC...... T...... T..T..T..T.A 150
          T I A A S A A S V A A A T A A D D

          ******maxi5'mid******
          A T A S I A T I K A N A A A A A
stfh-AFP2 CGCCACCGCATCCATAGCAACCATCAAAGCCAATGCCGCCGCCGCAGCAG 197
wfh-5a    .... G...... G...... C..... CA...... 197
wfh-AFP1  ....G...... C...GCAT....T.....GAA.T 200
          A A A S I A T I N A A S A A A K

          A T A A A A A I A A E E A A T A A
stfh-AFP2 CCACCGCCGCTGCCGCAGCAATAGCAGCCGAGGAAGCCGCAACCGCGGCT 247
wfh-5a    ... T...... C...... A...... A...... CG..A.C..C 247
wfh-AFP1  ...T...... C...... G...... A....CA.T..CG....C..G 250
          S I A A A A A M A A K D T A A A A

          A T A A A A A A A T A A T A Q A A
stfh-AFP2 GCTACCGCCGCAGCTGCTGCCGCCGCCACCGCCGCCACAGCCCAGGCAGC 297
wfh-5a    ..C.....T..C..CA.AA...... A...... A... A. 297
wfh-AFP1  ..C.G...... C..C..C....TT...T...... A.....TA.A.A. 300
          A S A A A A A V A S A A K A L E T

          I F D K A A A A A S T T A T T A
stfh-AFP2 CATCTTTGACAAAGCCGCAGCCGCCGCATCCACAACCGCCACCACCGCCG 347
wfh-5a    ....AAG.. T.....G...... G..T.C...... A..... 347
wfh-AFP1  ....AAC.T...... TA...... TG....C.....T.AT.....T. 350
          I N V K A A Y A A A T T A N T A


          ****
          A T A A A A T A T T A A A A A A A
stfh-AFP2 CCACGGCGGCCGCCGCCACAGCCACCACCGCCGCAGCCGCAGCCGCAGCC 397
wfh-5a    .. G....C..G...... C...... C....T.... 397
wfh-AFP1  .TG.C..C...... C...... T.....C...... 400
          A A A A A A T A T T A A A A A A A

          **maxi3'mid******
          T E T I D K A A A A A A A A A A T
stfh-AFP2 ACAGAAACCATCGACAAAGCCGCCGCAGCCGCCGCAGCCGCAGCCGCCAC 447
wfh-5a    .A.AC...... G...... C....TG....TT..G.C...... 447
wfh-AFP1  .A..C...... C...... T...AAG...... T...... 450
          K A T I D N A A A A K A A A V A T

          A V A T A A A A A A T A A A T A
stfh-AFP2 CGCCGTGGCAACTGCCGCAGCCGCCGCAGCCACCGCCGCCGCCACCGCCG 497
wfh-5a    ...... G...... A....C.....T...... T. 497
wfh-AFP1  ...... TT..GA...... A....C.....T...... GTT.... 500
          A V S D A A A T A A T A A A V A

          A A T L G A A A A K A A A T A V A
stfh-AFP2 CCGCAACCCTCGGGGCTGCCGCCGCAAAAGCCGCAGCCACCGCAGTCGCT 547
wfh-5a    ...... A.A.T...... C...AA. 547
wfh-AFP1  .T...... AA...... T...... T.. 550
          A A T L E A A A A K A A A T A V S

          A A A A A A I A A A A A A A A P P
stfh-AFP2 GCCGCCGCAGCCGCTGCCATCGCTGCTGCCGCCGCCGCCGCCGCCCCCCC 597
wfh-5a    ...... C...... AGCA ..CA...... 597
wfh-AFP1  ...... ---...... TGC...C..C...AT....TT...... TG.... 597
          A A A A A A A A A I A F A A A P

          *
stfh-AFP2 TTAAGGATCGTGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGC 647
wfh-5a    A...... 647
wfh-AFP1  A...... AT 647
          *

          ######
stfh-AFP2 GAGATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTG 692
wfh-5a    ...... 674
wfh-AFP1  .....A.C...... T...... G...... T 692


Appendix J DNA alignment of winter and starry flounder skin AFPs

An alignment of the starry flounder skin AFP from lambda insert #2 (stfs-AFP8, Figure 17) and the winter flounder skin variant wfs-11-3 is shown. Identical bases are marked with a dot (.) and gaps with a hyphen (-). Intronic and flanking sequences are in lower case, while exonic sequence is in upper case. The coding region is bolded with the translated sequences of stfs-AFP8 or wfs-11-3 above or below the respective gene sequence, and silent substitutions are underlined. The number of nucleotides is noted at the end of each row. The putative TFIID binding site ( | ), the transcriptional start sites (+), and the polyadenylation signal (#) are marked based on previous characterization of the winter flounder skin AFPs. The intronic enhancer, Element S, is in white text and highlighted in black [59]. The GenBank accession number for wfs-11-3 is M63478.
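The dot convention used in this alignment can be reversed to recover the full sequence of the second row from the reference row. The sketch below is a hedged illustration: the function name `expand_dots` is hypothetical, the reference string is the first ten bases of a stfs-AFP8 row, and the dotted string is an invented example rather than a row copied from the alignment.

```python
def expand_dots(reference, dotted):
    """Rebuild a full sequence from a dot-notation row.

    A dot means 'same base as the reference at this column'. Any other
    character, including the gap symbol '-', is kept as written.
    """
    if len(reference) != len(dotted):
        raise ValueError("rows must be the same aligned length")
    return "".join(
        ref_base if base == "." else base
        for ref_base, base in zip(reference, dotted)
    )

reference = "gtggtgaacc"   # start of a stfs-AFP8 row
dotted = ".at......."      # hypothetical dot-notation row for illustration
print(expand_dots(reference, dotted))
```

The function requires rows of equal aligned length, so it can only be applied where the runs of dots in a row are intact.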

stfs-AFP8 gttacaaaacaagttcatactggcctggatgttcgccacaccttcctttt 50 wfs-11-3 ...... -----...g.t...... g.. 45

stfs-AFP8 gtggtgaaccagtcggagccgacaacatgctgcgtcacaaactcgaagtg 100 wfs-11-3 .at...... gc.c...... g..a..a..... 95

||||||| ++ stfs-AFP8 aataaataagggatgctccctaaaggttttcatcaggactcaaccACTTT 150 wfs-11-3 ...... ga..c...... a...... a...... 145

stfs-AFP8 TCACTGTCGAACACTCAGgtaagtgaacactcactttatttagcaccgca 200 wfs-11-3 ...... C...... c...... g...------186

stfs-AFP8 cgtgcccataactatgttactgtatatgttttgttctatattgtttttat 250 wfs-11-3 ------186

stfs-AFP8 attgttgttttatgtgcagtgcaccaacgacaccaaggcaatttcctgta 300 wfs-11-3 ------186

stfs-AFP8 tgtgcaaacatacttggcaaataaaataattctgattctgattctacaaa 350 wfs-11-3 ------c.ca...... 196

stfs-AFP8 tctggttt-actgtaaatatcttgggaaggaaggaaggatatctgcatta 399 wfs-11-3 ...... t...... 246

stfs-AFP8 tcccagaggggccatttgttttacagccagcggtaaaagatgaagatctt 449 wfs-11-3 ...t-...... g...... 295

stfs-AFP8 caaccgtgttcgtctgatggaaagtttgttctgaaac------486 wfs-11-3 ..t..a...... cttcagtggaaga 345

stfs-AFP8 ------486 wfs-11-3 aacagattcatgtcttcaggcttaaacctgcaaaaatctgagctctgtta 395

stfs-AFP8 ------actgtaaaactctttta 503 wfs-11-3 aatcatgggaaacaactttttaattcagtcaggg...g...... a..... 445

stfs-AFP8 tatgaacagaagaagaagaagtgatctttagttcatcactgtggaaacat 553 wfs-11-3 ....c...... ca...... 495

stfs-AFP8 cagcagcagttaaagtctgtctgcttcagtatcactggccagttccagtg 603 wfs-11-3 ..t...... c...... 545

stfs-AFP8 ctcatgtttctgatcagcttggtttgaatgatatgaaa-cggatggagtc 652 wfs-11-3 ...... a...... 595

stfs-AFP8 cctgtttgaccctgtttaacacaagat------ggaccatcttt att 693 wfs-11-3 ...... tggacgcat...... 645

stfs-AFP8 aacataatgttttacatgagcacttcc tgttttcagccctaaacctaaag 743 wfs-11-3 t...... c...... t..... 695

stfs-AFP8 aggcctcatagaaacttcctgatgatctggtgacacctgctggttgaagg 793 wfs-11-3 ...... g...... 745

stfs-AFP8 aaacagagtttgagagtcagaagaacaaatgattttagtttgaaacaaga 843 wfs-11-3 ...... g...c...... tg.... 795

stfs-AFP8 agctgtcatttgatattatgtt-gtgggggggggggcgggggtggtcatc 892 wfs-11-3 ...... t..t....a..t.ga...... g....---a...c. 842

stfs-AFP8 acacacagatattgaacactgtcatcactgggttcagtgaaagtgacgga 942 wfs-11-3 ...... g...... a.a. 892

stfs-AFP8 ccagtacatgttgtgataaataat---atcataataattataat-aatac 988 wfs-11-3 ...... t.....att...... t..... 942

M D A P A A stfs-AFP8 cattaatctctgcagAATCACTGACATCAAC ATGGACGCCCCAGCCGCCG 1038 wfs-11-3 ...... A...... AA.. 992 M D A P A K

A A A A T A A A A K A A A E A T A stfs-AFP8 CCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGAAGCCACCGCC 1088 wfs-11-3 ...... G...... 1042 A A A A T A A A A K A A A E A T A

A A A A K A A A A T K A A A A R * stfs-AFP8 GCCGCAGCTGCCAAAGCAGCAGCCGCAACCAAAGCTGCCGCAGCCCGTTA 1138 wfs-11-3 ...... C...... C.. A...... 1092 A A A A K A A A A T K A A A A R *

stfs-AFP8 ATGATCATGGTCGTCTTGATGTGGGATCATGTGAACATCTGAGCAGCGAG 1188 wfs-11-3 ...... G...... 1142


###### stfs-AFP8 ATGTTACCAATCTGCTGAATAAACCTGAGAAGCTGTTTGTTGAaaaccaa 1238 wfs-11-3 ...... T...... 1192

stfs-AFP8 gtgtccagttcatttcatctctgaaactccttcacactttctgtagatca 1288 wfs-11-3 ...... t...... a...... 1242
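As a consistency check on the translation printed above the stfs-AFP8 coding region, the upper-case coding bases can be translated with the standard genetic code. The sketch below is an illustration only: the helper `translate` and the constant names are hypothetical, and the coding sequence was transcribed by hand from the three alignment rows spanning the start and stop codons, so any transcription slip is an error in the sketch rather than in the alignment.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon ('*')."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# stfs-AFP8 coding region, ATG through the stop codon, transcribed from
# the alignment rows (one string fragment per row).
STFS_AFP8_CDS = (
    "ATGGACGCCCCAGCCGCCG"
    "CCGCCGCAGCCACCGCCGCCGCCGCCAAAGCCGCCGCAGAAGCCACCGCC"
    "GCCGCAGCTGCCAAAGCAGCAGCCGCAACCAAAGCTGCCGCAGCCCGTTA"
    "A"
)

protein = translate(STFS_AFP8_CDS)
print(protein)
print(f"alanine fraction: {protein.count('A') / (len(protein) - 1):.2f}")
```

The printed sequence can be compared row by row with the single-letter translation shown above the stfs-AFP8 sequence. The alanine fraction excludes the terminal stop symbol.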


Appendix K Genomic Southern blot probed with hyperactive AFP cDNAs from the winter flounder and starry flounder

Genomic DNA (10 µg) from individual fish was digested with SacI or BamHI for winter flounder, or with SacI or EcoRI for starry flounder. The four starry flounder (1, 4, 6, 9) were obtained from location 4 (Figure 4). Digested DNA was electrophoresed on a 0.8% agarose gel, blotted onto nylon, and probed with cDNA for the winter flounder hyperactive isoform (A) or the starry flounder hyperactive isoform (B). The positions of DNA size markers (kb) are indicated on the left.

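[Figure: genomic Southern blot, panels A and B. Lanes in each panel: winter flounder DNA digested with SacI (S) and BamHI (B), then starry flounder 1, 4, 6 and 9 digested with SacI, then starry flounder 1, 4, 6 and 9 digested with EcoRI. Size markers (kb): 10, 8, 6, 5, 4, 3.5, 3, 2, 0.75.]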
