FOUNDING FISH: GENE DUPLICATION CONTRIBUTES TO IMMUNOLOGICAL DIVERSITY IN BOTTLENECKED POPULATIONS OF INTRODUCED RAINWATER KILLIFISH

A Thesis submitted to the faculty of San Francisco State University In partial fulfillment of the requirements for the Degree

Master of Science

In

Biology: Marine

by

Danielle Nicole Desmet

San Francisco, California

December 2015

Copyright by Danielle Nicole Desmet 2015

CERTIFICATION OF APPROVAL

I certify that I have read FOUNDING FISH: GENE DUPLICATION CONTRIBUTES TO IMMUNOLOGICAL DIVERSITY IN BOTTLENECKED POPULATIONS OF INTRODUCED RAINWATER KILLIFISH by Danielle Nicole Desmet, and that in my opinion this work meets the criteria for approving a thesis submitted in partial fulfillment of the requirement for the degree Master of Science in Biology (Marine Biology) at San Francisco State University.

C. Sarah Cohen, Ph.D. Professor, Biology

Eric J. Routman, Ph.D. Professor, Biology

Vanessa Miller-Sims, Ph.D. Adjunct Faculty

FOUNDING FISH: GENE DUPLICATION CONTRIBUTES TO IMMUNOLOGICAL DIVERSITY IN BOTTLENECKED POPULATIONS OF INTRODUCED RAINWATER KILLIFISH

Danielle Nicole Desmet San Francisco, California 2015

Population bottlenecks, and subsequent loss of genetic diversity, are a common occurrence for introduced species. The MHC Class IIDB locus must constantly adapt to detect shifting pathogen communities and is used in this study to assess functional diversity in 2 regions with introduced populations of Rainwater killifish, Lucania parva. L. parva was likely introduced to San Francisco Bay from the Pecos River, New Mexico through a series of sport fish stockings to southern then northern California during the 1940’s and 1950’s. Demographic patterns of San Francisco and New Mexico populations were assessed using microsatellite, D-loop and MHCIIDB diversity. A single D-loop allele was found in all populations and three of four microsatellite loci were fixed at shared alleles, suggesting a strong bottleneck and common ancestry between all populations sampled. MHC diversity was lowest in New Mexico and one very similar allele was shared between the regions sampled. Inferred functional diversity was also assessed using physicochemical properties of MHC codons under positive selection and alleles were grouped into supertypes. Supertype usage and allelic diversity were relatively uniform between San Francisco and New Mexico. Despite low allelic diversity, functional diversity was spread across duplicated MHCIIDB loci. Two supertypes (out of four) were found to dominate in all populations. This reduced MHC diversity may be a result of relaxed parasite-mediated selection as decreased immune response, assortative mating or new parasite pressures.

I certify that the Abstract is a correct representation of the content of this thesis.

C. Sarah Cohen, Chair, Thesis Committee Date

PREFACE AND/OR ACKNOWLEDGEMENTS

I would like to thank Daniel Chase for his help collecting samples from San Francisco Bay and Evan Carson, Colleen Caldwell, Stephen Davenport and Chris Hoagstrom for their assistance in obtaining samples from New Mexico. I am also grateful to Laura Melroy, Vanessa Miller-Sims, Riley Smith, Kathryn Nuessly and Benson Chow for their assistance in the field, and to Bradley Johnson and Peter Drell for statistical advice.

Collection and euthanization methods were approved by the California Department of Fish and Wildlife (SC-12713, permit holder Danielle Desmet) and the San Francisco State University Institutional Review Board and Institutional Animal Care and Use Committee (IACUC protocol #A13-04).

TABLE OF CONTENTS

List of Tables

List of Figures

List of Appendices

Introduction

Methods

Fish Collection

Molecular Methods

Data Analysis

Sequence Analysis

MHC loci

MHC class II variability and tests for selection and supertypes

Results

Microsatellites

Mitochondrial Control Region Sequences

MHC IIDB Sequences

MHC loci

Selection

Functional Diversity

Discussion

Common Ancestry

Recovery and Drift

Selection

MHC and Functional Diversity

Conclusions

Funding

References

Appendices

LIST OF TABLES

Populations sampled

Pairwise Fst values

Loci summary

Diversity measures by population

Analysis of molecular variance

Codons under positive selection

LIST OF FIGURES

Map of study sites with MHCIIDB exon 2 frequencies

Loci prevalence by population

Codon level analysis of selection

Supertype tree

Supertype prevalence by population

LIST OF APPENDICES

Microsatellite heterozygosity

Loci Fishers Exact Test

Positively selected codon and PBR

Supertype Fishers Exact Test

Codon level analysis of selection

PCR gel of imaged loci

Supertype trees

Alignment of MHC exons

Alignment of amino acid haplotypes

Alignment of MHC introns


Introduction

Eco-immunology is an emerging field that focuses on the relationship between an ecosystem and the immune systems of the resident organisms. One aspect of this field is to understand how invaders respond to new environments despite often reduced immunological genetic diversity by assessing the functional diversity present in introduced populations. Several theories exist regarding how genetic diversity is maintained, and sometimes even increased, following an introduction. Various selective and molecular mechanisms, such as mate choice (Aeschlimann et al. 2003) and gene conversion (Spurgin et al. 2011), have been found to maintain or increase immunological genetic diversity following reductions in effective population size. But it remains unclear how much functional immunological diversity is needed for an invader to be successful in the event of a very strong bottleneck or small founding event.

During the establishment phase of an invasion, the invader generally loses genetic diversity through a population bottleneck (White and Perkins 2012). The severity of a bottleneck and subsequent diversity loss are strongly dependent on the number of founders (Uller and Leimu 2011) and whether there have been multiple introductions (Dlugosch and Parker 2008) or admixture of different source populations, which can increase diversity (Kolbe et al. 2004). Maintaining adequate genetic diversity through a bottleneck allows the invader to avoid inbreeding depression and respond to selection (Frankham et al. 2002; White and Searle 2008). Genetic diversity is crucial for genes involved with pathogen resistance to be effective, particularly for genes that exhibit overdominance (Charbonnel and Cosson 2012; Webster et al. 2011). For functional genes, such as those involved with pathogen recognition, a minimum amount of functional diversity may be required for an invader to successfully establish.

A key consequence of low host densities and founder effects during an invasion is that parasites are usually lost from the invading species. The enemy release hypothesis may also impact diversity, particularly at pathogen recognition loci.

Enemy release, or release from co-evolved natural enemies such as predators and parasites, results in fitness tradeoffs for the alien species (Keane and Crawley 2002). For instance, an invader may lose protection against specialist pathogens in its home range and shift resources to defend against generalist enemies (Joshi and Vrieling 2005). They may also reallocate resources favoring fitness, fecundity and growth over maintaining unneeded immune diversity (White and Perkins 2012). Further, novel pathogens in the invaded range typically elicit a strong immune response because they lack the adaptations to avoid it (Mansfield and Olivier 2002; Lee and Klasing 2004). This has implications for diversity at pathogen recognition genes as the severity of the bottleneck will influence what kinds of fitness tradeoffs can be made.

The immunological capacity to recognize pathogens mentioned above is provided by the major histocompatibility complex (MHC). The MHC is a multi-gene family that codes for cell surface receptors responsible for detecting pathogen presence and activating the adaptive immune response (Klein 1987) and is among the most polymorphic genes in jawed vertebrates. The MHC class II must constantly adapt to detect and protect against ever shifting pathogen communities (Eizaguirre et al. 2010).

MHC diversity is thought to be generated by a birth and death process through gene duplication and deletion which results in copy number variation among individuals (Nei et al. 1997; Eimes et al. 2011). Variation at individual MHC loci is also generated by recombination and mutation, with diversity maintained by balancing selection on the peptide binding region (PBR) (Spurgin and Richardson 2010; Piertney and Oliver 2006).

The mechanisms underlying how the MHC’s exceptional diversity is maintained have been debated at great length. The overdominance hypothesis suggests MHC heterozygotes should have higher fitness due to their ability to recognize a wider array of pathogens (Doherty and Zinkernagel 1975). Overdominance may be advantageous for responding to novel pathogens. Conversely, there is evidence that specific alleles are more effective against specific parasites and high allelic diversity is maintained by opposing directional selection or negative frequency-dependent selection (Oliver and Piertney 2012; Eizaguirre et al. 2012). In practice, these concepts are difficult to separate but the host’s loss of parasites during an invasion may mitigate the impact of reduced MHC diversity resulting from a founder event (White and Perkins 2012).

MHC class II β diversity, particularly at the PBR involved in pathogen recognition, is influenced by pathogen-mediated selection (Spurgin and Richardson 2010). Mate choice has been shown to preserve MHC diversity through an extreme bottleneck (Aeschlimann et al. 2003). Molecular mechanisms, like gene conversion, can also increase MHC diversity. Spurgin et al. (2011) found the Berthelot’s pipit (Anthus berthelotii) retained only 11-15 of its MHC haplotypes when it spread across its island range and has since generated at least 26 new haplotypes in situ by gene conversion, with the rate of gene conversion exceeding point mutations by an order of magnitude. However, as these changes occurred within the last 75,000 years, it remains unknown whether gene conversion is an important mechanism over the shorter time scale relevant for introduced species (White and Perkins 2012).

While there is a current dearth of experimental studies looking at the patterns of MHC diversity in invasive species, the combination of drift, relaxed selection and directional selection is expected to reduce MHC diversity in invasive species (White and Perkins 2012). Yet, in Chile, Monzón-Argüello et al. (2013; 2014) found levels of MHC diversity in two introduced trout species (Oncorhynchus mykiss and Salmo trutta) comparable with their native ranges but reduced MHC diversity compared to the aquaculture populations from which invaders initially escaped. Salmonids are an emerging system for assessing the eco-immunology of introduced species as they have been translocated all over the world for aquaculture and typically have artificially high diversity in captive populations. Yet, there remains a lack of experimental evidence concerning MHC diversity for invasions with small founding populations or low propagule pressure.

Experimental studies examining how MHC diversity is influenced by strong bottlenecks have illustrated various dynamics of drift and selection. In an inbred population of water voles (Arvicola terrestris), selection was found to counter drift, thus maintaining MHC polymorphism through a population bottleneck of six individuals (Oliver and Piertney 2012). Death Valley pupfish (Cyprinodon diabolis) also maintain MHC diversity through overdominant natural selection in spite of drift attributed to historical and ongoing bottlenecks and heavily reduced population sizes (Fisher 2000). However, losses in MHCII diversity and fixation of the same allele across duplicated loci were attributed to drift and reductions in gene copy number following significant reductions in the population size of the greater prairie chicken (Tympanuchus cupido) (Eimes et al. 2011). Maintaining functional diversity across duplicated loci may be an important mechanism for preserving functional diversity through a bottleneck.

The Rainwater killifish, Lucania parva, presents an opportunity to study functional immune diversity in introduced populations in San Francisco Bay and New Mexico. L. parva is native to the Atlantic and Gulf Coasts of North America (Moyle 2002) and the lower portion of the Pecos River, New Mexico (Cohen and Carlton 1995). Since its introduction to San Francisco Bay, L. parva has established populations around the Bay that persist in many brackish waterways (Moyle 2002) and is often among the dominant taxa (Hubbs and Miller 1965; D. Desmet personal observation). L. parva was first documented in Berkeley’s Aquatic Park in the spring of 1958, and was subsequently collected from the north, east and south bay within four years (Cohen and Carlton 1995). The source population of L. parva in San Francisco Bay is unknown but at least two theories exist regarding the potential source and transport vector. The most recent proposal by Cohen and Carlton (1995) suggests a single vector was responsible due to L. parva’s almost concurrent appearance in five water bodies across the western United States between 1958 and 1963. L. parva was likely first introduced to southern California with six shipments of game fish from two hatcheries on the Pecos River, New Mexico during the 1940’s (Cohen and Carlton 1995; Hubbs and Miller 1965). A single shipment of 120 juvenile sunfish was sent from southern California to the Central Valley Hatchery in 1956 and subsequently “planted in a number of waters in northern California” between 1956 and 1959 (Dill and Cordone 1997; Cohen and Carlton 1995). Serial bottlenecks and founding events are expected to result in lower genetic diversity in San Francisco relative to L. parva populations from the putative source in the Pecos River drainage in New Mexico.

The alternative vector suggested by Hubbs and Miller (1965) is that L. parva was transported with oyster shipments from the east coast of the United States between 1869 and 1940. However, the oyster shipments ended eighteen years before L. parva was discovered in San Francisco Bay and L. parva has not been documented in other California estuaries that received these oyster shipments. As L. parva was discovered in Berkeley in 1958, it seems likely that game fish introductions were responsible for the San Francisco Bay L. parva introduction. DNA sequence data from mitochondrial D-loop, MHC and microsatellites were used here to address whether killifish in San Francisco Bay were related to killifish from Lea Lake, New Mexico. Lea Lake is an invaded lake adjacent to the Pecos River (E. Carson personal communication) and between the two hatcheries that supplied the game fish to southern California in the 1940’s. The D-loop has been used to describe regional differentiation and population structure in killifish (Cohen 2002; Tirindelli 2008; Haney et al. 2009; Li et al. 2009).

Neutral loci are often used to detect bottlenecks and changes in effective population size. Based on available habitat, San Francisco Bay and Lea Lake presumably had very different founding and establishment scenarios. L. parva established around San Francisco Bay within four years of its discovery; the estuary is >2500 km² (Cohen and Carlton 1995; Rubissow and Macris 1997). Lea Lake’s invasion history is unknown but it is a small lake, 15 acres, with available habitat limited to the lake margins as it quickly drops off to a significant depth (Caran 1988). Based on the rapid colonization of San Francisco Bay, neutral diversity may have been maintained. Fast recovery following a bottleneck has been shown to mitigate further diversity loss (Nei et al. 1975; Zenger et al. 2003). Serial bottlenecking, though, may have significantly eroded diversity. Lea Lake is expected to have higher diversity due to its proximity to the Pecos River. However, drift may also influence diversity since it is a geographically isolated population.

Killifish are thought to have high site fidelity, not migrating far from where they are born (Lotrich 1975; Jordan 2002; Skinner et al. 2005; Burnett et al. 2007). Local adaptation to contaminants, even in the presence of gene flow, has been documented in Fundulus heteroclitus (Cohen 2002; Cohen et al. 2006; Nacci et al. 2010; Nacci et al. 1999; Meyer and Di Giulio 2003). Depending on the strength of selection, local adaptation can persist despite gene flow. A range of contaminants exist in patches of San Francisco Bay (SFEI 2014). Given serial bottlenecks and a putatively small L. parva founding population into the San Francisco Bay watershed, we were interested to see whether a population in a PCB-contaminated site would exhibit any signals of local adaptation such as differing substitutions in the MHC PBR, or population structure.

This is the first study to assess MHC diversity in serially bottlenecked invasive fish populations. To evaluate how bottlenecking influences genetic diversity, demographic patterns of introduced L. parva populations were evaluated using functional (MHCIIDB) and neutral (microsatellites and mitochondrial D-loop) loci. First, genetic diversity was assessed in populations from San Francisco Bay, California and Lea Lake, New Mexico. With large populations and high site fidelity, we expected to see detectable differences between populations due to some combination of drift and selection. Lea Lake was also expected to have higher genetic diversity than San Francisco Bay due to its close proximity to the putative common source population. Diversity, especially at neutral loci, was expected to be significantly reduced in San Francisco Bay due to the serial bottlenecks that occurred prior to establishment in northern California. Severe bottlenecking may have caused MHC diversity in San Francisco populations to be low due to drift outweighing balancing selection. We also used a variety of techniques to test for selection at the MHCIIDB codon level. This information was then used to determine functional variation present across populations. L. parva was expected to show signals of positive selection in the PBR.

Methods

Fish Collection

In the fall of 2013, fish were collected from one slough in Oakland (East Bay) and four brackish sloughs in Marin County (North Bay) (Table 1). Forty fish were seined from each location, euthanized with an overdose of MS-222 and stored in 95% ethanol.

The four Marin sites are characterized as natural or restored salt marshes. The Oakland population was sampled from a polluted creek which drains into San Leandro Bay, a site on California’s list of Toxic Hot Spots (BPTCP 1999; Daum et al. 2000). PCB levels in the area were measured at >320 ng/g in both sediment and fish tissue: 359.6 ng/g in sediment immediately adjacent to the creek channel in 2008 and 326 ng/g in shiner perch sampled in San Leandro Bay 680 meters from the mouth of the creek in 2000 (SFEI 2014). Fin clips from twenty individuals were provided by the University of New Mexico Museum of Southwestern Biology collection, catalog numbers EC 1301 and EC 1302. These samples were collected from the undeveloped Lea Lake wetlands in the Bottomless Lakes State Park.

Molecular Methods

DNA was extracted from L. parva muscle tissue using a Nucleospin tissue extraction kit (Macherey-Nagel). DNA was diluted to 100-120 ng/µl prior to PCR. PCR was used to amplify the mitochondrial control region (D-loop), microsatellites, and intron 1 through exon 2 of the MHCIIDB region. Molecular methods were performed in the shared molecular facilities at the Romberg Tiburon Center for Environmental Studies in Tiburon, California.

D-loop PCR used primers Pro-5 (Palumbi 1996) and M-b3 (referenced in Cohen 2002). The PCR reaction volume was 25 µl and consisted of 1 µl template, 1X PEII Buffer, 0.05mg/ml BSA, 2mM MgCl2, 1mM dNTPs, 0.45nM each primer and 0.625U Taq. Thermal cycling conditions were 94°C for 2 mins; 35 cycles of 94°C for 1 min, 51°C for 1 min, 72°C for 1.5 mins; 25°C for 2 mins.

Microsatellite primers used were Fh-ATG6, Fh-ATG17, Fh-ATG18, Fh-ATGB101 (Adams et al. 2005) and Lg-1 (Creer and Trexler 2006). The PCR reaction volume was 25 µl for Fh-ATG and Fh-ATGB primers and consisted of 1 µl template, 1X PEII Buffer, 0.05mg/ml BSA, 2mM MgCl2, 1mM dNTPs, 5pM each primer, 0.5U Taq. Thermal cycling conditions were 94°C for 2 mins; 35 cycles of 94°C for 30 sec, 54°C for 30 sec, 72°C for 75 sec; with a final extension of 72°C for 10 mins. For Lg-1 a 25 µl reaction consisted of 1 µl template, 1X PEII Buffer, 0.05mg/ml BSA, 4mM MgCl2, 1mM dNTPs, 5pM each primer, 0.5U Taq. Thermal cycling conditions were 94°C for 3 mins; 35 cycles of 94°C for 30 sec, 51°C for 30 sec, 72°C for 1 min; with a final extension of 72°C for 10 mins.

Intron 1 through exon 2 of the MHCIIDB region was amplified using degenerate MHC primers FI2 (5’ - CST CHR CWC WGC AGG TAG GA - 3’) (Wright 2003) and MRS (Cohen 2002). PCRs were performed in 25 µl reactions using AmpliTaq Gold® (LifeTechnologies). Two PCRs were carried out following measures to reduce PCR artifacts described by Lenz and Becker (2008). An initial PCR was followed by a reconditioning PCR that used a threefold dilution of product from the first PCR as template. The reaction volume for the first PCR was 25 µl and consisted of 1 µl template, 1X GeneAmp® PCR Buffer II (MgCl2 included), 0.5mM MgCl2, 0.8mM dNTPs, 0.32 µM each primer, and 0.9375 U AmpliTaq Gold®. Reaction conditions for the reconditioning PCR were the same as the first except 2 µl of dilute template were used. Cycling conditions were 10 min at 95°C for hot-start polymerase activation, followed by 29 cycles (first PCR) or 17 cycles (second PCR) of 94°C for 45s, 51°C for 30s, and 72°C for 1 min with a final extension step of 3 min.

The reconditioned PCR product was cloned using the pGEM-T Easy Vector System (Promega). For each individual, plasmid DNA from up to 16 positive colonies was amplified using the universal vector primers SP6 and T7. The amplified product was then visualized on an agarose gel in order to select colonies that represented all bands in each individual. This was done in order to capture all products amplified by the initial 29 cycle PCR. Typically four clones, representing each previously visualized band (from the initial PCR) in the individual, were then sequenced. For some samples the larger bands (>800bp) were rare in the clone screen PCR, even when 16 additional clones were screened, and in these cases a minimum of 2 clones were sequenced of these large bands. A random sub-sample of clone sequences was submitted to GenBank’s BLASTN algorithm (Altschul et al. 1990) to confirm the sequences were MHC IIDB. PCR products for both D-loop and MHC clones were purified using ExoSAP-IT® for PCR product cleanup (Affymetrix, Santa Clara, CA, USA) and sequenced with BigDye® Terminator v3.1 (Applied Biosystems Inc., Carlsbad, CA, USA) on an ABI 3130 Genetic Analyzer (Applied Biosystems) at the Romberg Tiburon Center for Environmental Studies.

Data Analysis

Sequence Analysis

D-loop and MHC sequences were aligned separately in Geneious v7.0.1 (Biomatters Limited, Auckland, New Zealand). Sequences were assembled using the De Novo Assemble feature set to the Highest Sensitivity. To avoid PCR artifacts, only identical MHC sequences from more than one separate PCR reaction were considered confirmed and used in subsequent analyses. Numbers of alleles and nucleotide diversity (π) were estimated for MHC and D-loop using DnaSP v.5 (Rozas et al. 2010). Arlequin v3.5 (Excoffier and Lischer 2010) was used to estimate Fst, allele frequencies and θ(k), an index of diversity. FSTAT v2.9 (Goudet 2002) was used to estimate allelic richness. Microsatellite Fst was estimated with GenAlEx 6.5 (Peakall and Smouse 2006; Peakall and Smouse 2012).
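As an illustration of the nucleotide diversity measure named above, π can be computed as the average proportion of differing sites over all pairs of aligned sequences. The following is a minimal Python sketch under that textbook definition. It is not the DnaSP implementation, and the function name and toy alignment are hypothetical rather than thesis data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise proportion of differing sites (pi) among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Proportion of mismatched sites for each pair, summed over all pairs
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) / length
        for s1, s2 in pairs
    )
    return total / len(pairs)


# Hypothetical toy alignment (not thesis data)
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
pi = nucleotide_diversity(toy_alignment)
```

A sample in which every sequence is identical, as reported for the D-loop, gives π = 0 under this definition.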

MHC Loci

Primers amplified multiple MHC class IIB loci based on imaging the initial MHC PCR (not the reconditioned PCR) on an agarose gel (Fig. A1). Loci were assigned names according to their estimated base length following separation by gel electrophoresis (450, 660, 750, 827 and 850+). The available MHCIIDB intron 1 sequences, which differed considerably in length, were used to assess duplicated loci. Intron sequences have been suggested as better indicators of loci in MHC genes than coding sequences (Miller and Lambert 2004; Sato et al. 2000) due to high rates of microrecombination in the exon. The proportion of each locus found in a population was calculated by summing the number of individuals that had each locus (presence of an appropriately sized band on the gel) and dividing by the total number of individuals (n=20 in all populations). Fisher’s Exact test was used to determine if there was a significant pairwise difference between each population at each locus.
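The locus proportion and the pairwise comparison described above can be sketched as follows. This is a minimal standard-library illustration of a two-sided Fisher's Exact test on a 2x2 table of band presence and absence. The function names and the counts are hypothetical and are not thesis data, and the software actually used for the test is not stated in the text.

```python
from math import comb


def locus_proportion(n_with_band, n_total):
    """Proportion of individuals showing a band of the expected size."""
    return n_with_band / n_total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts: band present in 18 of 20 fish in one population
# and in 4 of 20 fish in another (rows = populations, columns = present/absent)
prop_a = locus_proportion(18, 20)
prop_b = locus_proportion(4, 20)
p_value = fisher_exact_two_sided(18, 2, 4, 16)
```

With n=20 per population, each row of the table sums to 20, so only the number of individuals carrying the band needs to be recorded for each population.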

MHC class II tests for selection and supertypes

Positive selection across the MHCIIDB exon 2 was assessed using two separate approaches based on different principles. The first involved the conventional approach where dn/ds is estimated separately for PBR (thought to be under positive selection) and nonPBR sites, and the resulting values are then compared (Cizkova et al. 2011). We estimated dn/ds by calculating the average values of nonsynonymous (dn) and synonymous (ds) substitutions in MEGA v6 (Tamura et al. 2013) using the modified Nei-Gojobori method with Jukes-Cantor correction with 1000 bootstrap replicates. The dn/ds was estimated for the entire exon sequence, the PBR and nonPBR sites. A Z-test was also conducted using the same parameters to test whether dn>ds. PBR nucleotide positions were based on the crystallized MHC structure (Brown et al. 1993). This method is often used in studies addressing selection at the MHC (Cohen 2002; Miller and Lambert 2004; Anmarkrud et al. 2010; Cizkova et al. 2011; Eimes et al. 2011; Kiemnec-Tyburczy et al. 2012).
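The Jukes-Cantor correction mentioned above converts an observed proportion of differing sites p into a distance d = -(3/4) ln(1 - 4p/3), and the dn/ds ratio is then the ratio of the corrected nonsynonymous and synonymous distances. The sketch below shows only that final step in Python. Counting synonymous and nonsynonymous sites by the modified Nei-Gojobori method is left to MEGA, and the function names and input proportions here are hypothetical.

```python
from math import log


def jukes_cantor(p):
    """Jukes-Cantor corrected distance from a proportion of differing sites p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)


def dn_ds_ratio(p_n, p_s):
    """Ratio of corrected nonsynonymous to synonymous distances."""
    d_n = jukes_cantor(p_n)
    d_s = jukes_cantor(p_s)
    if d_s == 0:
        raise ValueError("ds is zero; the ratio is undefined")
    return d_n / d_s


# Hypothetical proportions of nonsynonymous and synonymous differences
ratio = dn_ds_ratio(0.20, 0.10)
```

A ratio above one is taken as a signal of positive selection, which is why the Z-test is framed as a one-sided test of dn>ds.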

The second approach involved identifying codons under positive selection with no prior assumptions of their binding properties. Assumptions of many analyses are violated if recombination is present, thus the programs omegaMap (Wilson and McVean 2006) and DataMonkey (Pond and Frost 2005) were selected because they take recombination into account. OmegaMap uses a Bayesian approach combined with a population genetic approximation of the coalescent to examine selection on codons of sequences known to experience recombination (Wilson and McVean 2006) while DataMonkey uses a “classical phylogenetic approach” (Cizkova et al. 2011).

OmegaMap was used to assess which exon 2 codons are experiencing positive selection in L. parva. To ensure convergence, the model was run twice on each population. OmegaMap estimates dn/ds ratio (ω), recombination rate (ρ), transition transversion rate ratio (κ), the rate of synonymous transversion (μ), and the rate of insertion and deletion (φ). Inverse priors were used for ω and ρ, which were allowed to vary. Average block size was 10 for ω and 30 for ρ. The remaining variables were set to improper inverse. Priors were set following Wilson and McVean’s (2006) recommendations for instances where the parameter distributions are unknown for the organism. 500,000 Markov-chain Monte Carlo iterations were run in each simulation and thinned every 1000 iterations. The first 50,000 iterations were discarded as “burn-in.” Codons were assumed to have equal equilibrium frequencies because we have no prior knowledge of codon frequencies across the L. parva genome.
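To make the chain settings above concrete, the short Python sketch below counts how many samples remain per run after thinning and burn-in. It assumes that iterations are indexed from zero and that every 1000th iteration is saved, which is an illustrative convention and not necessarily how omegaMap does its own bookkeeping. The function name is hypothetical.

```python
def retained_iterations(n_iter, thin, burn_in):
    """Saved iteration indices that fall at or after the end of burn-in."""
    return [i for i in range(0, n_iter, thin) if i >= burn_in]


# Settings reported in the text: 500,000 iterations, thinned every 1000,
# first 50,000 iterations discarded as burn-in
kept = retained_iterations(500_000, 1_000, 50_000)
n_kept = len(kept)
```

Under these assumptions each run contributes a few hundred posterior samples, and the two runs per population can be compared to check convergence.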

An extra allele may not provide greater pathogen detection because differences in nucleotide sequence may not necessarily change MHC binding capabilities (Ellison et al. 2012). However, amino acid sequence changes in the binding pocket could change an individual’s ability to detect pathogen peptides (Engelhard 1994). In order to assess functional diversity, all MHC alleles were grouped by “supertypes” (Monzón-Argüello et al. 2014; Ellison et al. 2012) based on the properties of all positively selected sites, or PSS (Ellison et al. 2012; Doytchinova et al. 2005). PSS were identified using two methods implemented in the DataMonkey web server (Pond and Frost 2005) with HyPhy (Pond, Frost, and Muse 2005). The mixed effects model of evolution (MEME) detects episodic and pervasive positive selection (Murrell et al. 2012), while the Fast Unbiased Bayesian Approximation (FUBAR) detects positive selection using a model that allows for site to site variation (Murrell et al. 2013). Only sites that had a significance level of <0.05 in MEME and a posterior probability of >0.95 in FUBAR were considered. Prior to analysis the presence of recombination was assessed using the GARD analysis in DataMonkey in order to avoid confounding effects of recombination. PSS were also identified from posterior probability values (>95%) generated by omegaMap using the same priors described previously. To determine the functional supertypes used by L. parva, all populations were pooled. Codons identified by at least two methods were considered PSS (Cizkova et al. 2011). The amino acid sequences for identified PSS were aligned and alleles were clustered into supertypes as described in Ellison et al. (2012) using the procedure of Doytchinova et al. (2005).
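The rule that a codon counts as a PSS only when at least two methods identify it can be sketched as a simple tally. The Python example below is illustrative only. The codon positions are hypothetical placeholders rather than results from this study, and the function name is an assumption.

```python
from collections import Counter


def consensus_sites(calls, min_methods=2):
    """Codon positions flagged by at least `min_methods` of the methods in `calls`."""
    counts = Counter(
        site for sites in calls.values() for site in set(sites)
    )
    return sorted(site for site, n in counts.items() if n >= min_methods)


# Hypothetical codon positions passing each method's threshold
# (MEME P < 0.05, FUBAR posterior > 0.95, omegaMap posterior > 95%)
calls = {
    "MEME": [9, 28, 37, 61],
    "FUBAR": [9, 37, 70],
    "omegaMap": [28, 37, 78],
}
pss = consensus_sites(calls)
```

Only the amino acids at the consensus positions would then be carried forward to the supertype clustering step.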

Results

Microsatellites

Twenty individuals from each population were sized using all loci. Three of the 5 microsatellite loci were fixed across all populations. Fh-ATG18, Fh-ATG17 and Fh-ATGB101 were fixed for alleles of 143, 134 and 102 bp respectively. Fh-ATG6 had only two alleles in San Francisco (peaks at 197 and 200), but was fixed in New Mexico (peak at 197) and mostly homozygous in San Francisco Bay populations (Table A1). Lg-1 was variable in all populations; however, severe stutters excluded this locus from analysis. Pairwise Fst (Table 2) showed no significant differentiation between any populations.

Mitochondrial Control Region Sequences

From 35 individuals, 331 base pairs of the mitochondrial control region were analyzed (n=10 Mill Valley, n=7 Corte Madera, n=8 Oakland, n=10 Lea Lake). A single shared allele was found across all populations.

MHC IIDB Sequences

MHCIIDB sequences spanning intron 1 and exon 2 were obtained from 51

individuals (San Francisco Bay: n=19 Oakland, n=10 Mill Valley, n=14 Corte Madera and Lea Lake n=8). Exon 2 sequences from all populations showed similarity to 17

GenBank MHC sequences in BLAST searches. No indels or stop codons were found in the coding sequences. Intron 1 was found to be 192 to 608bp in length. Exon 2 for every allele was composed of 257 bases and trimmed to 255 bases to remove the incomplete codon at the end of the sequence.

In all cases, unique exon sequences had corresponding unique intron sequences. In one case (L4 and L6), a single exon sequence was found with two highly similar intron sequences (99.5% pairwise identity, a difference of 4 nucleotides). In every other case, each exon was always found with the same exactly matching intron. In no case was an identical exon allele found at more than one intron size locus (Table 3). A total of 14 unique confirmed MHCIIDB alleles, each comprising both the intron and the exon, were obtained from the populations sampled. With respect to the exon alone, however, there were only 13 unique alleles (101 of 255 sites variable, 39.6%), which yielded 12 unique amino acid sequences (48 of 85 codons variable, 56.5%). To simplify presentation of the results and discussion, all further mention of alleles refers to exon (nucleotide) alleles unless otherwise noted. There were 9 alleles in Oakland, 5 in Mill Valley, 7 in Corte Madera and 2 in Lea Lake. In Lea Lake, a single confirmed allele (intron and exon) was present at locus 450 and locus 750. Lea Lake and San Francisco populations had one very similar allele (L3 and L20), which shared 99.3% pairwise sequence identity, a 5 nucleotide difference, across the intron and exon.
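The exon trimming and the proportions of variable sites reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the pipeline used in this study: the function names are hypothetical, the sequence is a placeholder of the reported length, and it assumes the reading frame starts at the first base so that the incomplete codon falls at the 3' end.

```python
def trim_to_codons(seq: str) -> str:
    """Drop trailing bases that do not complete a codon (frame assumed to start at base 1)."""
    return seq[: len(seq) - len(seq) % 3]


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Placeholder 257-base sequence; only its length matters for this check.
exon = "A" * 257
trimmed = trim_to_codons(exon)
n_bases = len(trimmed)      # bases kept after trimming
n_codons = n_bases // 3     # complete codons in the trimmed exon

pct_variable_sites = percent(101, n_bases)    # variable nucleotide sites
pct_variable_codons = percent(48, n_codons)   # variable codons

print(n_bases, n_codons, pct_variable_sites, pct_variable_codons)
```

Trimming by `len(seq) % 3` removes two bases from a 257-base exon, which is consistent with the 255 bases and 85 codons reported in the text.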

L1 and L4 were the most common alleles in San Francisco Bay (Fig. 1); of all individuals sampled, 69% had both alleles and 74% had one or both. Corte Madera and Oakland had one (L2) and three (L3, L8 and L12) private alleles, respectively. L1, L4, L5 and L9 were found in all San Francisco Bay populations.

Measures of genetic diversity by population revealed diversity across San Francisco Bay to be fairly even, while New Mexico had the lowest diversity (Table 4). Based on the MHC, only Lea Lake was significantly differentiated from the other populations, and no genetic structure was observed using neutral loci (Table 2). Though not significant, AMOVA revealed that 90.47% of the variation was due to variation within populations and only about 10% came from differences between San Francisco Bay and New Mexico (Table 5).

MHC Loci

Several MHCII loci were found in L. parva based on intron 1 length variation (Fig. A1). Locus presence was similar across San Francisco Bay populations. In San Francisco Bay, 450 and 750 were the most abundant loci and had the most allelic diversity, with three and five alleles respectively (Fig. 2, Table 3). The 450 and >850 loci were not present in Lea Lake (Fig. 2). Lea Lake had a significantly higher proportion of individuals with the 660 locus than the two northern San Francisco Bay populations (Fisher’s Exact Test, P<0.001, Lea Lake vs Mill Valley; P0.001, Lea Lake vs Corte Madera; Table A2).
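The comparison of locus presence between two populations is a 2x2 contingency problem. The sketch below is a plain-Python two-sided Fisher's exact test, offered only to illustrate the calculation; it is not the software used in this study, and the counts are hypothetical (the per-population counts are in Table A2, which is not reproduced here).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# HYPOTHETICAL counts (not from the thesis): fish with / without the locus.
lea_lake = (8, 0)      # 8 fish sampled
mill_valley = (1, 9)   # 10 fish sampled
p = fisher_exact_two_sided(*lea_lake, *mill_valley)
print(f"P = {p:.5f}")
```

With samples this small an exact test is the natural choice, since the expected cell counts are too low for a chi-square approximation to be reliable.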

Selection

The ratio of nonsynonymous to synonymous substitution rates was significantly larger than one across all sites in exon 2 (dn/ds=1.984; Z=3.216, P=0.001) and across the nonPBR sites (dn/ds=2.108; Z=2.733, P=0.004). In the PBR, dn/ds was not significantly larger than one (dn/ds=1.915; Z=1.551, P=0.062) (Table A5 for dn and ds). Without any assumptions of binding function, positive selection was found to vary along the MHCIIDB exon 2 using omegaMap (Wilson and McVean 2006). Elevated dn/ds was not detected in 14 of the 24 putative PBR sites, while >22% of the nonPBR sites had elevated dn/ds (Fig. 3). Signals of positive selection (high ω values and posterior probabilities >95%) corresponded to particular PBR residues (Fig. 3a, 3b) and binding pocket codons (Table S3). Patterns of selection (ω values and posterior probabilities) were consistent across populations when analyzed separately and when populations were pooled (Fig. 3a-b and Fig. A2). All populations in San Francisco and New Mexico had ω >1 and >95% posterior probability values for codons in the β1 sheet and α-helix (codons 11, 12, 13 and 61-72).

Functional Diversity

OmegaMap, FUBAR and MEME identified several codons in the MHCIIDB exon 2 as being under positive selection (Table 6). GARD analysis did not identify any codons as potential breakpoints for recombination. Similarly, the results of omegaMap showed little variation in ρ (Fig. 3c). Nine codons were identified as PSS (Table 6). These codons were used in cluster analysis to group alleles into supertypes based on functional characteristics. PSS codons 11, 13, 38, 61, 62, 65, 68, 70, 71 corresponded with PBR sites, while codons 29 and 31 were adjacent to PBR sites.

Cluster analysis yielded 4 supertypes (Fig. 4), each having between 2 and 4 alleles. Clusters and their bootstrapping values were consistent among all methods (Paired & Ward's Method, Fig. S3). Individual fish possessed between one and three alleles and between one and three supertypes. Allele L10 was a low frequency allele that was ultimately excluded from further analysis due to ambiguous clustering into supertypes 3 and 4. The exclusion of this allele from the supertype frequency analysis affected only a single individual in Mill Valley; thus, the overall results do not change whether L10 is included or not. Supertypes 1 and 3 were the most common supertypes in every population, with 68.6% of all individuals sampled having both supertypes 1 and 3. Alleles from supertypes 1, 2 and 4 were found at multiple loci (Table 3). Specific supertypes were associated with each other within loci: supertypes 1 and 2 were associated, as were supertypes 3 and 4 (Table 3). Across all populations, fish had an average of 2 supertypes per individual, and in San Francisco Bay individuals had an average of 2.2 supertypes.

Supertype usage was similar across all populations (Fig. 5). Despite Lea Lake’s lack of supertypes 2 and 4, no significant difference was observed between populations based on the abundance of each supertype using Fisher’s Exact Test after Bonferroni correction (Table A4). Only prior to Bonferroni correction did Oakland have significantly fewer individuals with supertype 3 than Lea Lake or Mill Valley. With respect to supertype 4, Lea Lake was significantly different from Oakland and Mill Valley prior to Bonferroni correction but not after.
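The effect of the Bonferroni correction described above can be illustrated with a short sketch. The P-values below are hypothetical, and the family of six tests (all pairwise comparisons among four populations) is an assumption made for illustration; the grouping of tests actually used is given in Table A4.

```python
def bonferroni(p_values: dict, alpha: float = 0.05) -> dict:
    """Return (adjusted P, significant?) per test under Bonferroni correction.

    Each raw P-value is multiplied by the number of tests and capped at 1.
    """
    m = len(p_values)
    adjusted = {}
    for name, p in p_values.items():
        p_adj = min(1.0, p * m)
        adjusted[name] = (p_adj, p_adj < alpha)
    return adjusted


# HYPOTHETICAL raw P-values for six pairwise comparisons of one supertype.
raw = {
    "Oakland vs Lea Lake": 0.03,
    "Oakland vs Mill Valley": 0.04,
    "Oakland vs Corte Madera": 0.20,
    "Mill Valley vs Lea Lake": 0.50,
    "Mill Valley vs Corte Madera": 0.75,
    "Corte Madera vs Lea Lake": 1.00,
}

for name, (p_adj, significant) in bonferroni(raw).items():
    print(f"{name}: raw P = {raw[name]:.2f}, adjusted P = {p_adj:.2f}, significant = {significant}")
```

In this toy example two comparisons fall below 0.05 before correction and none do afterwards, which mirrors the pattern reported for supertypes 3 and 4.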

Discussion

In this study, MHCIIDB, D-loop and microsatellite diversity were assessed in contemporary introduced L. parva populations from San Francisco Bay and New Mexico. Diversity across neutral markers was extremely low for all populations, with three fixed microsatellites and only one D-loop allele. The low neutral genetic differentiation between two geographically distinct regions suggests a common source population. At the MHC, signals of positive (diversifying) selection were evident despite reduced allelic diversity. Both geographic regions shared one highly similar allele, and just four functional supertypes were discovered in total, with two widely used in every population. The MHCIIB has experienced duplication events in L. parva. Spreading supertypes across duplicated loci appears to maintain functional diversity despite low within-locus allelic diversity.

Common Ancestry

Our findings support the hypothesis that L. parva in San Francisco Bay originated from New Mexico and was not transported with oyster shipments from the Atlantic (Cohen and Carlton 1995). Introduced populations founded by migrants from a single source population typically harbor reduced neutral diversity. The L. parva populations sampled here were bottlenecked at neutral loci and share the same neutral profile, which suggests a common historical source population. Further, data from a native population in Florida revealed variation at all the neutral markers used in this study (T. Fuller unpublished data). A feasible explanation for the low neutral diversity in San Francisco Bay is low propagule pressure in conjunction with sequential founder effects that occurred prior to establishment in northern California. An invader’s genetic diversity is influenced by propagule pressure, which results from the number of invading individuals, the number of introduction events and the number of source populations (Roman and Darling 2007). Non-native striped bass (Morone saxatilis) showed decreasing mitochondrial diversity with serially bottlenecked populations, from the Atlantic Coast to San Francisco Bay, then into Coos Bay, Oregon (Waldman et al. 1997).

The stepping stone model (Kimura and Weiss 1964) would predict higher variation in Lea Lake than in San Francisco because of its close proximity to the common source population in the Pecos River. For example, the Bluegill sunfish (Lepomis macrochirus) in Japan was established by a small one-time founding population of 15 individuals from a single native population. The original Bluegill population in Lake Biwa, Japan maintained about 80% of the nuclear microsatellite genetic diversity from the native range, while genetic diversity declined with increasing distance from the founding population (Kawamura et al. 2010). Contemporary Bluegill populations close to the original Lake Biwa population showed considerably higher microsatellite diversity than was observed in this study. However, serially bottlenecked Bluegill populations far from Lake Biwa showed microsatellite diversity comparable to that observed in this study. The unexpected lack of diversity in Lea Lake suggests low variation in the source population.

No evidence of elevated diversity or admixture was found at putatively neutral loci for the San Francisco Bay populations. A single D-loop allele was found in all four populations of L. parva. The D-loop is useful for describing regional differentiation and population structure in killifish (Haney et al. 2009; Li et al. 2009). Two MHC alleles that share >99% identity at the nucleotide level across both the intron and exon provide further evidence of shared ancestry between San Francisco and Lea Lake. All five introductions in western North America have been hypothesized to be genetically more closely related to New Mexico than to Atlantic stocks, as gamefish plantings were the likely transport vector (Cohen and Carlton 1995).

Recovery and Drift

With large populations and high site fidelity, we expected to see detectable differences between killifish populations due to some combination of drift and selection. Based on the functional locus, evidence of population structure was found between the San Francisco populations and Lea Lake, but no evidence of structure was found within San Francisco Bay. The San Francisco Bay founders likely experienced rapid growth during the establishment phase, as L. parva spread around the bay within four years of initial documentation (Cohen and Carlton 1995). Thus, it is interesting that contemporary San Francisco Bay populations show such homogeneity at all loci despite the stochastic processes that occur during founding events and range expansions. Rapid population growth following a bottleneck minimizes further loss of alleles caused by drift (Nei et al. 1975), but it does not explain the lack of population structure observed at the MHCIIDB in San Francisco Bay. Perhaps not enough generations have passed for drift, selection or gene conversion to cause significant differentiation, although this may occur rapidly with strong selection (see discussion below). Alternatively, L. parva may be reliant on these MHC alleles in San Francisco Bay. For naturalized rainbow trout (Oncorhynchus mykiss) across Chile, the hypothesis for highly admixed MHC and structuring at neutral loci was that only certain MHC alleles were needed for wild populations to be successful; thus, many alleles artificially maintained in fish farms were purged following escape into the wild (Monzón-Argüello et al. 2013).

A possible explanation for Lea Lake’s extreme lack of neutral and functional diversity is that it is a small isolated population subject to strong drift. Loss of average numbers of alleles per locus is strongly affected by bottleneck size but not so strongly by the rate of recovery (Nei et al. 1975). Alternatively, if a population experiences a long period of low effective population size, for example during a range expansion or slow recovery following a bottleneck, fixation across multiple loci can occur. Hedrick and Hurt (2012) found two species of endangered Sonoran topminnows (Poeciliopsis spp.) to be invariant at three mtDNA loci. That only two MHC alleles were found in this isolated lake is consistent with findings from other fragmented and isolated populations (Hedrick et al. 2001; Hedrick and Parker 1998; Miller and Lambert 2004).

Selection

Despite low allelic diversity across duplicated MHC loci and bottlenecked neutral loci, positive selection was evident at several MHCIIDB codons. The overall dn/ds ratios were significantly >1, which is consistent with historical long-term balancing selection. All PSS codons were either putative PBR codons or adjacent to putative PBR codons. Codons under positive selection adjacent to PBR codons may contribute to conformational changes in the binding pocket that affect binding affinity rather than directly interacting with foreign peptides (Foote and Winter 1992). Salmonids and haplochromine cichlids have shown evidence of positive selection adjacent to putative PBR codons, supporting evidence of PBR shifts between fish and mammals (Pavey et al. 2013; Fraser et al. 2010).

While positive selection was evident in our data despite reduced diversity, bottlenecking has been shown to impact selection at duplicated MHCIIB loci in a variety of ways. Montane voles (Microtus montanus) showed strong signals of purifying selection with weak historical balancing selection across duplicated MHCIIB loci as a result of cyclical population fluctuations that may cause functional differentiation of each locus (Winternitz and Wares 2013). In contrast, L. parva supertypes were spread across duplicated loci. In San Francisco Bay populations, the possibility of loci becoming functionally fixed remains because at most two supertypes were found at any one locus across the populations sampled. However, more than two supertypes were found in San Francisco, though only two were at high frequencies. The greater prairie chicken (Tympanuchus cupido) suffered great losses in MHCIIB variation following a bottleneck, where population level losses were attributed to drift. Individual level losses, however, were attributed to reductions in gene copy number or alleles becoming fixed across multiple loci (Eimes et al. 2011). Drift across loci likely contributed to the fixation of each locus in Lea Lake. However, it is unlikely that drift alone would cause fixation of loci in San Francisco Bay, given the large populations found there. L. parva is often among the dominant taxa in San Francisco Bay (Hubbs and Miller 1965; D. Desmet personal observation).

Selection is evident in the evolutionary history of the MHC in our data, yet selection is unlikely to have generated significant diversity since L. parva invaded San Francisco Bay. Simulations and studies have shown selection does not usually maintain MHC variation over the shorter time scales relevant for recent invasions (Ejsmond and Radwan 2011; Miller and Lambert 2004). The consistent ω (dn/ds) and posterior probability profiles (revealed by omegaMap) between populations and at the species level indicate that these signals of selection were generated in the original source population, prior to introduction. Significant dn/ds ratios may not represent current selective pressures, as they take a long time to accumulate and possibly an equally long time to disappear in the absence of selection (Garrigan and Hedrick 2003). Contemporary MHC diversity in San Francisco Bay was probably maintained through rapid population expansion. However, extreme bottlenecks can overwhelm balancing selection. Drift was found to outweigh balancing selection in shaping MHC diversity in a bottlenecked population of inbred, endangered black robins (Petroica traversi) (Miller and Lambert 2004). The black robin population consisted of five individuals and only one breeding pair following an extreme population bottleneck in the 1980s.

MHC and Functional Diversity

The MHC is among the most polymorphic genes in jawed vertebrates. Though we expected to see variation in terms of allelic diversity across populations, L1 and L4 were the dominant alleles in the San Francisco Bay populations. These alleles disproportionately contributed to each of the dominant supertypes (L1 for supertype 1 and L4 for supertype 3), rather than a more even distribution of alleles from within each supertype. During the population expansion phase of the invasion, strong drift may have resulted in this uneven distribution. If these dominant alleles are adequate for survival, there may be consequences for the continued persistence of L. parva. For small populations, particularly Lea Lake, there is an unclear relationship between long-term persistence and MHC diversity (Radwan et al. 2010).

Insight was gained into how L. parva maintains diversity within the MHCIIDB by examining duplicated loci and corresponding supertype patterns. Functional diversity appears to be maintained by spreading supertypes across duplicated loci in L. parva. The effects of MHC evolutionary dynamics, such as frequent recombination and duplications, may cause differentiation in gene composition even within a species; haplotypes differing in the number of loci have been described for a number of species (Babik 2010). The same supertype was present at more than one locus. Yet allelic diversity was low within individual loci, and identical alleles, whether nucleotide or amino acid, were not shared across loci. Identical MHCII alleles have been found at multiple loci in guppies (Poecilia reticulata), turkeys (Meleagris spp.) and prairie chickens (McMullan 2010; Chaves et al. 2010; Eimes et al. 2011). Maintaining supertype diversity across multiple loci may be an important mechanism for preserving functional diversity through a bottleneck or founding event. These introduced L. parva populations have low allelic diversity at individual MHCIIDB loci but have managed to maintain some functional diversity.

The pattern of functional diversity found in this study may be a product of reduced parasite pressure or sexual selection. Two supertypes dominated in all populations sampled, with at least 68% of individuals having alleles from both supertypes 1 and 3. Enemy release from parasites is, in theory, predicted to reduce immune activity for successful invaders, with reallocation of resources towards growth and reproduction (Lee and Klasing 2004). As these two supertypes occur at high frequency in all populations, they may be advantageous against a wide range of novel pathogens or against specific novel pathogens. However, the presence of the same dominant supertypes in two geographically distinct regions suggests pathogen pressure may not be the only factor. Assortative mating has been suggested as a mechanism by which female stickleback (Gasterosteus aculeatus) select males with “good genes” in order to confer resistance to certain parasites to their offspring (Eizaguirre et al. 2010). Female stickleback select mates complementary to their own MHC genes in order to optimize the number of alleles for their offspring, and this strategy has been shown to maintain MHCIIB diversity through a bottleneck of two individuals (Aeschlimann et al. 2003). Reliance on certain supertypes suggests some degree of assortative mating may be important for these L. parva populations.

Pollution from organic contaminants has been shown to impact killifish MHC and parasite loads (Cohen 2002). Overall, the San Francisco Bay populations were not significantly different from each other despite differing levels of pollution. Between the geographic regions sampled in this study, allelic diversity was highest in San Francisco Bay, with Oakland having the most allelic richness. However, across essentially every measure of genetic diversity (richness, allele frequencies and theta values) the MHC showed a rather uniform pattern across the San Francisco Bay populations. Further, AMOVA revealed the majority of variance was due to differences within a population, though the results were not significant. Significant changes in amino acid substitution have been detected between populations with differing levels of PCB contamination. Cohen (2002) and Cohen et al. (2006) observed significant differences between Atlantic killifish (Fundulus heteroclitus) populations in a PCB contaminated Superfund site and unpolluted reference sites in terms of amino acid substitutions in the α-helix and significantly different patterns in the β-pleated sheet of the PBR. No such substitution patterns were found in the Oakland population, which may be partially due to severe bottlenecking. Cohen (2002) found 41 unique alleles from 35 F. heteroclitus individuals sampled from native populations, while 14 unique alleles were found from the 51 L. parva individuals sampled from introduced populations. The one notable difference between Oakland and the other populations was the more even distribution of different supertypes (Table 5). Yet more than half the individuals in Oakland still had supertype 3 and most individuals had supertype 1.

Conclusions

This study illustrates the importance of using both adaptive and neutral loci to assess genetic variability in highly bottlenecked systems. It also contributes to the emerging field of eco-immunology of invasive species with the observation that specific supertypes may contribute to invasion success. Of all the individuals sampled from introduced wild (non-aquaculture) populations, 68% had both dominant supertypes and most had at least one of the two. Microsatellite, D-loop and MHCIIDB diversity was analyzed in introduced populations of L. parva from San Francisco and New Mexico. Neutral loci were severely bottlenecked in the populations sampled, while the MHCIIDB maintained some allelic and functional diversity. The exon showed signals of positive selection despite low allelic diversity. While the San Francisco Bay populations and Lea Lake did not share any identical MHC alleles, they did share one very similar allele. Coupled with the neutral loci, this suggests a common ancestry. Two supertypes of four were found to dominate in all populations, suggesting low functional MHC diversity. In San Francisco Bay a high percentage of individuals had the same alleles for each dominant supertype, rather than a more even mix of alleles. Further, only two alleles were found in Lea Lake. Reduced MHC diversity could reflect the impacts of a) relaxed parasite mediated selection as decreased immune response, b) assortative mating or c) new parasite pressures. A mechanism for maintaining functional diversity across duplicated loci in the face of bottlenecking is also noted. Alleles from the same supertype may be spread across loci even though identical alleles are not shared across loci.

Neither low neutral genetic diversity nor reduced diversity at adaptive loci appears to have hindered L. parva's invasion success, especially in San Francisco Bay where it occurs at high abundance. For L. parva, wide environmental tolerances (thermal, salinity, pollution) may be important in invasion success. Genetic diversity contributes to an invader’s success, but phenotypic plasticity is another important factor to consider (Blanchet 2012). There is growing awareness of the need to consider genetic diversity at both neutral and adaptive loci in order to assess the impacts of low genetic diversity on adaptation in introduced populations and the role of phenotypic plasticity in invasion success (Kawamura et al. 2010; Lee 2002; Blanchet 2012).

Funding

This work was supported by the National Science Foundation FSML (Grant 0435033) for use of the Gene lab at the Romberg Tiburon Center for Environmental Studies, San Francisco State University. This work was also made possible by donations from the Depot Biolink at City College of San Francisco and the Instructionally Related Research Awards of San Francisco State University to D. Desmet. This project was largely supported by Dr. Sarah Cohen’s start-up funding.

References

Adams, S. M., Oleksiak, M. F. and Duvernell, D. D. 2005. “Microsatellite Primers for the Atlantic Coastal Killifish, Fundulus Heteroclitus, with Applicability to Related Fundulus Species.” Molecular Ecology Notes 5 (2): 275-77. doi:10.1111/j.1471-8286.2005.00898.x.

Aeschlimann, P. B., Haberli, M. A., Reusch, T. B. H., Boehm, T. and Milinski, M. 2003. “Female Sticklebacks Gasterosteus Aculeatus Use Self-Reference to Optimize MHC Allele Number during Mate Selection.” Behavioral Ecology and Sociobiology 54 (2): 119-26.

Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. “Basic Local Alignment Search Tool.” Journal of Molecular Biology 215 (3): 403-10. doi:10.1016/S0022-2836(05)80360-2.

Anmarkrud, J. A., A. Johnsen, L. Bachmann, and J. T. Lifjeld. 2010. “Ancestral Polymorphism in Exon 2 of Bluethroat (Luscinia Svecica) MHC Class II B Genes: MHC Class II Diversity in Bluethroats.” Journal of Evolutionary Biology 23 (6): 1206-17. doi:10.1111/j.1420-9101.2010.01999.x.

Babik, W. 2010. “Methods for MHC Genotyping in Non-Model Vertebrates.” Molecular Ecology Resources 10 (2): 237-51. doi:10.1111/j.1755-0998.2009.02788.x.

Blanchet, S. 2012. “The Use of Molecular Tools in Invasion Biology: An Emphasis on Freshwater Ecosystems: USING MOLECULAR TOOLS IN BIOLOGICAL INVASIONS.” Fisheries Management and Ecology 19 (2): 120-32. doi:10.1111/j.1365-2400.2011.00832.x.

BPTCP. 1999. “Regional Toxic Hot Spot Cleanup Plan. Bay Protection and Toxic Cleanup Program (BPTCP) Final Report.” Oakland, CA: San Francisco Bay Region Regional Water Quality Control Board. http://www.swrcb.ca.gov/publications_forms/publications/general/docs/1598.pdf.

Brown, Jerry H., Theodore S. Jardetzky, Joan C. Gorga, Lawrence J. Stern, Robert G. Urban, Jack L. Strominger, and Don C. Wiley. 1993. “Three-Dimensional Structure of the Human Class II Histocompatibility Antigen HLA-DR1.” Nature 364 (6432): 33-39. doi:10.1038/364033a0.

Burnett, Karen G., Lisa J. Bain, William S. Baldwin, Gloria V. Callard, Sarah Cohen, Richard T. Di Giulio, David H. Evans, Marta Gomez-Chiarri, Mark E. Hahn, and Cindi A. Hoover. 2007. “Fundulus as the Premier Teleost Model in Environmental Biology: Opportunities for New Insights Using Genomics.” Comparative Biochemistry and Physiology Part D: Genomics and Proteomics 2 (4): 257-86.

Caran, S. 1988. “Bottomless Lakes, New Mexico - A Model for the Origin and Development of Ground Water Lakes (abs.).” In Abstracts with Programs, 20:93. 2.

Charbonnel, Nathalie, and Jean-Francois Cosson. 2012. “Molecular Epidemiology of Disease Resistance Genes with Perspectives for Researches on Biological Invasions and Hybrid Zones.” In New Frontiers of Molecular Epidemiology of Infectious Diseases, 255-90. Dordrecht, Netherlands: Springer.

Chaves, Lee D., Gretchen M. Faile, Stacy B. Krueth, Julie A. Hendrickson, and Kent M. Reed. 2010. “Haplotype Variation, Recombination, and Gene Conversion within the Turkey MHC-B Locus.” Immunogenetics 62 (7): 465-77. doi:10.1007/s00251-010-0451-2.

Cizkova, D., J. Gouy de Bellocq, S. J. E. Baird, J. Pialek, and J. Bryja. 2011. “Genetic Structure and Contrasting Selection Pattern at Two Major Histocompatibility Complex Genes in Wild House Mouse Populations.” Heredity 106 (5): 727-40.

Cohen, A. N., and J. T. Carlton. 1995. “Nonindigenous Aquatic Species in a United States Estuary: A Case Study of the Biological Invasions of the San Francisco Bay and Delta.” PB96-166525. United States Fish and Wildlife Service and The National Sea Grant College Program, Connecticut Sea Grant. http://www.anstaskforce.gov/Documents/sfinvade.htm.

Cohen, Sarah. 2002. “Strong Positive Selection and Habitat-Specific Amino Acid Substitution Patterns in Mhc from an Estuarine Fish Under Intense Pollution Stress.” Molecular Biology and Evolution 19 (11): 1870-80.

Cohen, Sarah, Tirindelli, Joelle, Gomez-Chiarri, Maria, and Nacci, Diane. 2006. “Functional Implications of Major Histocompatibility (MH) Variation Using Estuarine Fish Populations.” Integrative and Comparative Biology 46 (6): 1016-29.

Creer, Douglas A., and Joel C. Trexler. 2006. “New Polymorphic Microsatellite Loci in Two Fish Species: Bluefin Killifish (Lucania Goodei) and Yellow Bullhead (Ameiurus Natalis).” Molecular Ecology Notes 6 (1): 167-69. doi:10.1111/j.1471-8286.2005.01179.x.

Daum, Ted, Sarah Lowe, Rob Toia, G. Bartow, R. Fairey, J. Anderson, and J. Jones. 2000. Sediment Contamination in San Leandro Bay, CA. San Francisco Estuary Institute. http://www.sfei.org/sites/default/files/finalslbay.pdf.

Dill, William, and Almo Cordone. 1997. “History And Status of Introduced Fishes In California, 1871 - 1996.” Fish Bulletin 178. State of California The Resource Agency Department of Fish and Game. http://content.cdlib.org/view?docId=kt8p30069f&brand=calisphere&doc.view=entire_text.

Dlugosch, K. M., and I. M. Parker. 2008. “Founding Events in Species Invasions: Genetic Variation, Adaptive Evolution, and the Role of Multiple Introductions.” Molecular Ecology 17 (1): 431-49. doi:10.1111/j.1365-294X.2007.03538.x.

Doherty, P. C., and R. M. Zinkernagel. 1975. “Enhanced Immunological Surveillance in Mice Heterozygous at the H-2 Gene Complex.” Nature 256 (5512): 50-52.

Doytchinova, Irini A., Valerie Walshe, Persephone Borrow, and Darren R. Flower. 2005. “Towards the Chemometric Dissection of Peptide - HLA-A*0201 Binding Affinity: Comparison of Local and Global QSAR Models.” Journal of Computer-Aided Molecular Design 19 (3): 203-12. doi:10.1007/s10822-005-3993-x.

Eimes, J. A., J. L. Bollmer, L. A. Whittingham, J. A. Johnson, C. Van Oosterhout, and P. O. Dunn. 2011. “Rapid Loss of MHC Class II Variation in a Bottlenecked Population Is Explained by Drift and Loss of Copy Number Variation: Reduced MHC Variation Following a Bottleneck.” Journal of Evolutionary Biology 24 (9): 1847-56. doi:10.1111/j.1420-9101.2011.02311.x.

Eizaguirre, Christophe, Tobias L. Lenz, Martin Kalbe, and Manfred Milinski. 2012. “Rapid and Adaptive Evolution of MHC Genes under Parasite Selection in Experimental Vertebrate Populations.” Nature Communications 3 (January): 621. doi:10.1038/ncomms1632.

Eizaguirre, Christophe, Tobias L. Lenz, Ralf D. Sommerfeld, Chris Harrod, Martin Kalbe, and Manfred Milinski. 2010. “Parasite Diversity, Patterns of MHC II Variation and Olfactory Based Mate Choice in Diverging Three-Spined Stickleback Ecotypes.” Evolutionary Ecology 25 (3): 605-22. doi:10.1007/s10682-010-9424-z.

Ejsmond, Maciej Jan, and Jacek Radwan. 2011. “MHC Diversity in Bottlenecked Populations: A Simulation Model.” Conservation Genetics 12 (1): 129-37. doi:10.1007/s10592-009-9998-6.

Ellison, A., J. Allainguillaume, S. Girdwood, J. Pachebat, K. M. Peat, P. Wright, and S. Consuegra. 2012. “Maintaining Functional Major Histocompatibility Complex Diversity under Inbreeding: The Case of a Selfing Vertebrate.” Proceedings of the Royal Society B: Biological Sciences 279 (1749): 5004-13. doi:10.1098/rspb.2012.1929.

Engelhard, Victor H. 1994. “Structure of Peptides Associated with Class I and Class II MHC Molecules.” Annual Review of Immunology 12 (1): 181-207.

Excoffier, L., and H. E. L. Lischer. 2010. Arlequin Suite Ver 3.5: A New Series of Programs to Perform Population Genetics Analyses under Linux and Windows. Molecular Ecology Resources.

Fisher, Michael Todd. 2000. “Variation at Major Histocompatibility Complex Class I Loci in Two Killifish Species with Reduced Genetic Variance.” Virginia Polytechnic Institute and State University. http://scholar.lib.vt.edu/theses/available/etd-04242001-095754/.

Foote, J., and G. Winter. 1992. “Framework Residues Affecting the Conformation of the Hypervariable Loops.” Journal of Molecular Biology 224 (2): 487-99.

Frankham, R, J.D Ballou, and D.A Briscoe. 2002. Introduction to Conservation Genetics. Cambridge, UK: Cambridge University Press.

Fraser, B. A., I. W. Ramnarine, and B. D. Neff. 2010. “Selection at the MHC Class IIB Locus across Guppy (Poecilia Reticulata) Populations.” Heredity 104 (2): 155-67.

Garrigan, Daniel, and Philip W. Hedrick. 2003. “Perspective: Detecting Adaptive Molecular Polymorphism: Lessons from the MHC.” Evolution 57 (8): 1707-22.

Goudet, Jerome. 2002. FSTAT, A Program to Estimate and Test Gene Diversities and Fixation Indices, Version 2.9.3. Institute d’Ecologie: Universite de Laussane, Switzerland.

Haney, R. A., M. Dionne, J. Puritz, and D. M. Rand. 2009. “The Comparative Phylogeography of East Coast Estuarine Fishes in Formerly Glaciated Sites: Persistence versus Recolonization in Cyprinodon Variegatus Ovinus and Fundulus Heteroclitus Macrolepidotus.” Journal o f Heredity 100 (3): 284-96. doi:10.1093/jhered/esnl07.

Hedrick, Philip W., and Carla R. Hurt. 2012. “Conservation Genetics and Evolution in an Endangered Species: Research in Sonoran Topminnows*: Conservation Genetics and Evolution in an Endangered Species.” Evolutionary Applications 5 (8): 806-19. doi:10.1111/j.1752-4571.2012.00259.x.

Hedrick, Philip W., and Karen M. Parker. 1998. “MHC Variation in the Endangered Gila Topminnow.” Evolution 52 (1): 194. doi:10.2307/2410934.

Hedrick, P. W., K. M. Parker, and R. N. Lee. 2001. “Using Microsatellite and MHC Variation to Identify Species, ESUs, and MUs in the Endangered Sonoran Topminnow.” Molecular Ecology 10 (6): 1399-1412.

Hubbs, C, and R.R Miller. 1965. “Studies of Cyprinodont Fishes. XXII. Variation in Lucania Parva, Its Establishment in Western United States, and Description of a New Species from Interior Basin in Coahuila, Mexico.” Miscellaneous Publications Museum of Zoology, University of Michigan, no. 127: 1-104.

Jordan, Frank. 2002. “Field and Laboratory Evaluation of Habitat Use by Rainwater Killifish (Lucania Parva) in the St. Johns River Estuary, Florida.” Estuaries 25 (2): 288-95.

Joshi, J, and K Vrieling. 2005. “The Enemy Release and EICA Hypothesis Revisited: Incorporating the Fundamental Difference between Specialist and Generalist Herbivores.” Ecology Letters 8: 704-14. doi:10.1111/j.1461-0248.2005.00769.x.

Kawamura, Kouichi, Ryuji Yonekura, Yuiko Ozaki, Osamu Katano, Yoshinori Taniguchi, and Kenji Saitoh. 2010. “The Role of Propagule Pressure in the Invasion Success of Bluegill Sunfish, Lepomis Macrochirus, in Japan: Propagule Pressure and Invasion Success.” Molecular Ecology 19 (24): 5371-88. doi:10.1111/j.1365-294X.2010.04886.x.

Keane, R.M, and M.J Crawley. 2002. “Exotic Plant Invasions and the Enemy Release Hypothesis.” Trends in Ecology & Evolution 17 (4): 164-70. doi:10.1016/S0169-5347(02)02499-0.

Kiemnec-Tyburczy, K. M., J. Q. Richmond, A. E. Savage, K. R. Lips, and K. R. Zamudio. 2012. “Genetic Diversity of MHC Class I Loci in Six Non-Model Frogs Is Shaped by Positive Selection and Gene Duplication.” Heredity 109 (3): 146-55.

Kimura, M., and G. H. Weiss. 1964. “The Stepping Stone Model of Population Structure and the Decrease of Genetic Correlation with Distance.” Genetics 49 (4): 561-76.

Klein, Jan. 1987. “Natural History of the Major Histocompatibility Complex.” American Journal of Human Genetics 40 (5): 468.

Kolbe, Jason J., Richard E. Glor, Lourdes Rodriguez Schettino, Ada Chamizo Lara, Allan Larson, and Jonathan B. Losos. 2004. “Genetic Variation Increases during Biological Invasion by a Cuban Lizard.” Nature 431 (7005): 177-81. doi:10.1038/nature02807.

Lee, Carol Eunmi. 2002. “Evolutionary Genetics of Invasive Species.” Trends in Ecology & Evolution 17 (8): 386-91.

Lee, Kelly A., and Kirk C. Klasing. 2004. “A Role for Immunology in Invasion Biology.” Trends in Ecology & Evolution 19 (10): 523-29. doi:10.1016/j.tree.2004.07.012.

Lenz, Tobias L., and Sven Becker. 2008. “Simple Approach to Reduce PCR Artefact Formation Leads to Reliable Genotyping of MHC and Other Highly Polymorphic Loci - Implications for Evolutionary Analysis.” Gene 427 (1-2): 117-23. doi:10.1016/j.gene.2008.09.013.

Li, C., M. L. Bessert, J. Macrander, and G. Orti. 2009. “Low Variation but Strong Population Structure in Mitochondrial Control Region of the Plains Topminnow, Fundulus Sciadicus.” Journal of Fish Biology 74 (5): 1037-48. doi:10.1111/j.1095-8649.2008.02097.x.

Lotrich, Victor A. 1975. “Summer Home Range and Movements of Fundulus Heteroclitus (Pisces: Cyprinodontidae) in a Tidal Creek.” Ecology 56 (1): 191. doi:10.2307/1935311.

Mansfield, John M., and Martin Olivier. 2002. “Immune Evasion by Parasites.” In Immunology of Infectious Diseases, edited by Alan Sher, Rafi Ahmed, and Stefan H. E. Kaufmann, 379-92. American Society of Microbiology. http://www.asmscience.org/content/book/10.1128/9781555817978.chap25.

McMullan, Mark. 2010. “Host-Parasite Co-Evolution and Genetic Variation at the Major Histocompatibility Complex in the Trinidadian Guppy (Poecilia Reticulata).” The University of Hull. https://hydra.hull.ac.uk/resources/hull:5820.

Meyer, Joel N., and Richard T. Di Giulio. 2003. “Heritable Adaptation and Fitness Costs in Killifish (Fundulus Heteroclitus) Inhabiting a Polluted Estuary.” Ecological Applications 13 (2): 490-503.

Miller, Hilary C., and David M. Lambert. 2004. “Genetic Drift Outweighs Balancing Selection in Shaping Post-Bottleneck Major Histocompatibility Complex Variation in New Zealand Robins (Petroicidae).” Molecular Ecology 13 (12): 3709-21. doi:10.1111/j.1365-294X.2004.02368.x.

Monzón-Argüello, Catalina, Carlos Garcia de Leaniz, Gonzalo Gajardo, and Sofia Consuegra. 2013. “Less Can Be More: Loss of MHC Functional Diversity Can Reflect Adaptation to Novel Conditions during Fish Invasions.” Ecology and Evolution, August, n/a-n/a. doi:10.1002/ece3.701.

Monzón-Argüello, C., C. Garcia de Leaniz, G. Gajardo, and S. Consuegra. 2014. “Eco-Immunology of Fish Invasions: The Role of MHC Variation.” Immunogenetics 66 (6): 393-402. doi:10.1007/s00251-014-0771-8.

Moyle, Peter B. 2002. Inland Fishes of California. University of California Press.

Murrell, Ben, Joel O. Wertheim, Sasha Moola, Thomas Weighill, Konrad Scheffler, and Sergei L. Kosakovsky Pond. 2012. “Detecting Individual Sites Subject to Episodic Diversifying Selection.” Edited by Harmit S. Malik. PLoS Genetics 8 (7): e1002764. doi:10.1371/journal.pgen.1002764.

Murrell, B., S. Moola, A. Mabona, T. Weighill, D. Sheward, S. L. Kosakovsky Pond, and K. Scheffler. 2013. “FUBAR: A Fast, Unconstrained Bayesian AppRoximation for Inferring Selection.” Molecular Biology and Evolution 30 (5): 1196-1205. doi:10.1093/molbev/mst030.

Nacci, D., L. Coiro, D. Champlin, S. Jayaraman, R. McKinney, T. R. Gleason, W. R. Munns Jr, J. L. Specker, and K. R. Cooper. 1999. “Adaptations of Wild Populations of the Estuarine Fish Fundulus Heteroclitus to Persistent Environmental Contaminants.” Marine Biology 134 (1): 9-17.

Nacci, Diane E., Denise Champlin, and Saro Jayaraman. 2010. “Adaptation of the Estuarine Fish Fundulus Heteroclitus (Atlantic Killifish) to Polychlorinated Biphenyls (PCBs).” Estuaries and Coasts 33 (4): 853-64. doi:10.1007/s12237-009-9257-6.

Nei, Masatoshi, Xun Gu, and Tatyana Sitnikova. 1997. “Evolution by the Birth-and-Death Process in Multigene Families of the Vertebrate Immune System.” Proceedings of the National Academy of Sciences 94 (15): 7799-7806.

Nei, Masatoshi, Takeo Maruyama, and Ranajit Chakraborty. 1975. “The Bottleneck Effect and Genetic Variability in Populations.” Evolution 29 (1): 1. doi:10.2307/2407137.

Oliver, M. K., and S. B. Piertney. 2012. “Selection Maintains MHC Diversity through a Natural Population Bottleneck.” Molecular Biology and Evolution 29 (7): 1713-20. doi:10.1093/molbev/mss063.

Palumbi, S.R. 1996. “PCR and Molecular Systematics.” In Molecular Systematics, 2nd Edition, D. Hillis, C. Moritz, and B. Mable, Eds. Sinauer Press.

Pavey, Scott A., Maelle Sevellec, William Adam, Eric Normandeau, Fabien C. Lamaze, Pierre-Alexandre Gagnaire, Marie Filteau, Francois Olivier Hebert, Halim Maaroufi, and Louis Bernatchez. 2013. “Nonparallelism in MHCIIβ Diversity Accompanies Nonparallelism in Pathogen Infection of Lake Whitefish (Coregonus Clupeaformis) Species Pairs as Revealed by next-Generation Sequencing.” Molecular Ecology 22 (14): 3833-49. doi:10.1111/mec.12358.

Peakall, Rod, and Peter Smouse. 2012. “GenAlEx 6.5: Genetic Analysis in Excel. Population Genetic Software for Teaching and Research - an Update.” Bioinformatics, July, bts460. doi:10.1093/bioinformatics/bts460.

Peakall, Rod, and Peter E. Smouse. 2006. “Genalex 6: Genetic Analysis in Excel. Population Genetic Software for Teaching and Research.” Molecular Ecology Notes 6 (1): 288-95. doi:10.1111/j.1471-8286.2005.01155.x.

Piertney, S. B., and M. K. Oliver. 2006. “The Evolutionary Ecology of the Major Histocompatibility Complex.” Heredity 96 (1): 7-21.

Pond, S. L. K., and S. D. W. Frost. 2005. “Datamonkey: Rapid Detection of Selective Pressure on Individual Sites of Codon Alignments.” Bioinformatics 21 (10): 2531-33. doi:10.1093/bioinformatics/bti320.

Pond, S. L. K., S. D. W. Frost, and S.V. Muse. 2005. “HyPhy: Hypothesis Testing Using Phylogenies.” Bioinformatics 21 (5): 676-79.

Radwan, Jacek, Aleksandra Biedrzycka, and Wiesław Babik. 2010. “Does Reduced MHC Diversity Decrease Viability of Vertebrate Populations?” Biological Conservation 143 (3): 537-44. doi:10.1016/j.biocon.2009.07.026.

Roman, J, and J Darling. 2007. “Paradox Lost: Genetic Diversity and the Success of Aquatic Invasions.” Trends in Ecology & Evolution 22 (9): 454-64. doi:10.1016/j.tree.2007.07.002.

Rubissow, Ariel, and Natalie Macris. 1997. “Land Use and Population.” San Francisco Estuary Project. http://sfep.sfei.org/wp-content/uploads/2012/12/15Land_Use-Population.pdf.

Sato, Akie, Felipe Figueroa, Werner E. Mayer, Peter R. Grant, B. Rosemary Grant, and Jan Klein. 2000. “Mhc Class II Genes of Darwin’s Finches: Divergence by Point Mutations and Reciprocal Recombination.” In Major Histocompatibility Complex, 518-41. Tokyo: Springer Japan.

SFEI. 2014. “Contaminant Data Display & Download.” San Francisco Estuary Institute. Accessed March 9. http://www.sfei.org/tools/wqt.

Skinner, Marc A., Simon C. Courtenay, W. Roy Parker, and R. Allen Curry. 2005. “Site Fidelity of Mummichogs (Fundulus Heteroclitus) in an Atlantic Canadian Estuary.” Water Quality Research Journal of Canada 40 (3): 288-98.

Spurgin, Lewis G., Cock Van Oosterhout, Juan Carlos Illera, Stephen Bridgett, Karim Gharbi, Brent C. Emerson, and David S. Richardson. 2011. “Gene Conversion Rapidly Generates Major Histocompatibility Complex Diversity in Recently Founded Bird Populations: Gene Conversion at the MHC.” Molecular Ecology 20 (24): 5213-25. doi:10.1111/j.1365-294X.2011.05367.x.

Spurgin, L. G., and D. S. Richardson. 2010. “How Pathogens Drive Genetic Diversity: MHC, Mechanisms and Misunderstandings.” Proceedings of the Royal Society B: Biological Sciences 277 (1684): 979-88. doi:10.1098/rspb.2009.2084.

Tamura, K., G. Stecher, D. Peterson, N. Filipski, and S. Kumar. 2013. “MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0.” Molecular Biology and Evolution 30: 2725-29. doi:10.1093/molbev/msr121.

Tirindelli, Joelle. 2008. “Variation in the Major Histocompatibility Class IIDB Gene Region between Polychlorinated Biphenyl (PCB)-Contaminated and Reference Populations of the Mummichog Fish, Fundulus Heteroclitus.” Masters Thesis, Romberg Tiburon Center: San Francisco State University.

Uller, Tobias, and Roosa Leimu. 2011. “Founder Events Predict Changes in Genetic Diversity during Human-Mediated Range Expansions.” Global Change Biology 17 (11): 3478-85. doi:10.1111/j.1365-2486.2011.02509.x.

Waldman, John, Reese Bender, and Isaac Wirgin. 1997. “Multiple Population Bottlenecks and DNA Diversity in Populations of Wild Striped Bass, Morone Saxatilis.” Fishery Bulletin 96: 614-20.

Webster, L. M. I., S. Paterson, F. Mougeot, J. Martinez-Padilla, and S. B. Piertney. 2011. “Transcriptomic Response of Red Grouse to Gastrointestinal Nematode Parasites and Testosterone: Implications for Population Dynamics.” Molecular Ecology 20 (5): 920-31. doi:10.1111/j.1365-294X.2010.04906.x.

White, Thomas A., and Sarah E. Perkins. 2012. “The Ecoimmunology of Invasive Species.” Edited by Alison Dunn. Functional Ecology 26 (6): 1313-23. doi:10.1111/1365-2435.12012.

White, Thomas A., and J.B Searle. 2008. “Mandible Asymmetry and Genetic Diversity in Island Populations of the Common Shrew, Sorex Araneus.” Journal of Evolutionary Biology 21 (2): 636-41. doi:10.1111/j.1420-9101.2007.01481.x.

Wilson, D. J., and G. McVean. 2006. “Estimating Diversifying Selection and Functional Constraint in the Presence of Recombination.” Genetics. doi:10.1534/genetics.105.044917.

Winternitz, Jamie C., and John P. Wares. 2013. “Duplication and Population Dynamics Shape Historic Patterns of Selection and Genetic Variation at the Major Histocompatibility Complex in Rodents.” Ecology and Evolution 3 (6): 1552-68. doi: 10.1002/ece3.567.

Wright, William. 2003. “The Genomic Structure of the MHC Class II B in the Estuarine Fish Fundulus Heteroclitus.” Harvard University.

Zenger, K. R., B. J. Richardson, and A.-M. Vachot-Griffin. 2003. “A Rapid Population Expansion Retains Genetic Diversity within European Rabbits in Australia.” Molecular Ecology 12 (3): 789-94.

List of Tables

Table 1. Rainwater killifish populations sampled.

Population (Acronym)           Longitude / Latitude             Location Name                              Genetic Data Collected
Oakland (OAK)                  122°11'31.53"W, 37°44'54.21"N    Elmhurst Creek                             MHC, Microsatellites, D-loop
Corte Madera, Hospital (CMH)   122°32'14.72"W, 37°56'45.06"N    Creekside Park Marsh, Corte Madera Creek   MHC, Microsatellites, D-loop
Mill Valley (CCT)              122°31'35.27"W, 37°52'34.68"N    Coyote Creek Tributary                     MHC, Microsatellites, D-loop
Lea Lake (LL)                  104°19'50.39"W, 33°18'59.05"N    Lea Lake Outflow and Wetlands              MHC, Microsatellites, D-loop

Table 2. Pairwise Fst values between populations of Lucania parva for MHCIIDB (above diagonal) and the microsatellite loci (below diagonal). Bold indicates significant values (P<0.05) using the Mantel

       CMH     CCT     OAK     LL
CMH    -       0.014   0.004   0.095*
CCT    0.000   -       0.007   0.116*
OAK    0.000   0.000   -       0.091*
LL     0.079   0.053   0.079   -

Table 3. Summary of each i-locus including alleles, and supertypes present.

i-Locus   Intron-Exon Allele length, bp   Exon Alleles           Supertype
450       452 & 455                       L2, L1, L1A            1, 2
660       660, 669                        L7, L3, L20            1, 2
750       752, 782 & 785, 786             L8, L9, L10, L4, L21   3, 4
827       827                             L5                     4
850+      867                             L12                    2

Table 4. Population analysis of MHCIIDB including sample size (N), number of alleles (na), nucleotide diversity (π), mean observed heterozygosity (Ho) and expected heterozygosity (He), allelic richness (AR), Theta k (θ(k)), lower and upper 95% confidence interval for θ(k), θ Hom with SD, and θ Pi with SD.

Pop   N    na   π                 Ho      He       AR      θ(k)    θ(k) L95%-U95%   θHom (SD)       θPi (SD)
OAK   19   9    0.165+/-0.081     0.894   0.852    3.361   3.179   1.478-6.508      4.745 (1.135)   42.641 (20.949)
CMH   14   7    0.161+/-0.079     0.928   0.772    2.605   2.423   1.023-5.409      2.661 (0.728)   41.496 (20.551)
CCT   10   6    0.150+/-0.075     0.9     0.74     2.256   2.181   0.856-5.224      2.201 (0.642)   38.666 (19.379)
LL    8    2    0.135+/-0.0711    0.625   0.5128   1       0.379   0.087-1.55       0.786 (0.261)   34.871 (18.276)

Table 5. Analysis of Molecular Variance for MHCIIDB exon 2. Populations are grouped by geographic region (San Francisco and New Mexico). (ns = not significant)

Source of Variation                 d.f.   Sum of Squares   Variance Components   % of Variation   Fixation indices   Significance
Among Regions                       1      70.75            2.296 Va              10.24            Fsc: -0.007        ns
Among Populations within Regions    2      30.229           -0.157 Vb             -0.7             Fst: 0.095         ns
Within Populations                  110    2232.635         20.296 Vc             90.47            Fct: 0.102         ns
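The percentages and fixation indices in Table 5 can be cross-checked against its variance components. The sketch below assumes the standard three-level AMOVA definitions (FCT = Va/total, FSC = Vb/(Vb+Vc), FST = (Va+Vb)/total); the function name amova_summary is hypothetical and this is not the Arlequin implementation. Under those definitions the value 0.102 corresponds to the among-region index and 0.095 to FST, so the fixation indices appear to be listed in program output order rather than aligned with the table rows.

```python
def amova_summary(va, vb, vc):
    """Percent variation and fixation indices from three hierarchical
    variance components: among regions (va), among populations within
    regions (vb), and within populations (vc).

    Assumes the standard AMOVA definitions; illustrative only.
    """
    total = va + vb + vc
    percent = {
        "among_regions": 100.0 * va / total,
        "among_pops_within_regions": 100.0 * vb / total,
        "within_pops": 100.0 * vc / total,
    }
    indices = {
        "FCT": va / total,         # regions relative to total
        "FSC": vb / (vb + vc),     # populations relative to their region
        "FST": (va + vb) / total,  # populations relative to total
    }
    return percent, indices


# Variance components Va, Vb, Vc as reported in Table 5.
percent, indices = amova_summary(2.296, -0.157, 20.296)
for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Small differences in the last digit relative to the table are expected, because the inputs here are the rounded components printed in the table.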

Table 6. San Francisco results from codon-based maximum likelihood approaches: Mixed Effects Model of Evolution (MEME), Fast, Unconstrained Bayesian AppRoximation (FUBAR), and Bayesian posterior probabilities (>95%) generated by omegaMap to estimate positive selection on the codons. Codon numbers are based on the mature protein (Ono et al. 1993).

Method      Positively Selected Sites
MEME        9, 29, 36, 39, 42, 61, 75, 87, 90
FUBAR       9, 11, 12, 13, 39, 62, 87
omegaMap    8, 9†, 10, 11†, 12†, 13†, 29, 31†, 38, 39, 47, 56, 60, 61, 62, 63, 64, 65, 66, 67, 68†, 69†, 70†, 71†, 72, 73

†Posterior probability = 100%. PSS in bold.

List of Figures

Figure 1. The frequency of MHC exon 2 alleles collected across populations sampled in California and New Mexico. MHC alleles are not shared between San Francisco Bay and New Mexico.

[Figure 2 plot: bar chart of locus usage by population. Legend: Oakland, Mill Valley, Corte Madera, Lea Lake. X-axis: Locus (the category label 850+ survives); y-axis scale 0.0 to 0.9.]

Figure 2. The proportion of loci usage across individuals in each population. Data were collected by imaging the initial MHC PCR on an agarose gel and counting the bands present at each locus (n=20 for every population). * denotes a significant difference based on Fisher’s Exact Test.

[Figure 3 plots: a) Variation in ω; b) Posterior probability; c) Variation in ρ. X-axis for each panel: codon position (ticks from 6 to 86).]

Figure 3. Codon level analysis of positive selection across the MHCIIDB exon 2 of the Rainwater killifish. (a) Variation in omega across exon 2, where the solid line represents the mean estimate (ω), the grey field represents the 95% highest posterior probability density intervals, and the dotted line corresponds to neutral estimates of nonsynonymous/synonymous substitutions (ω=1). (b) Posterior probability of positive selection; crosses indicate PBR sites and the dotted line represents the 95% posterior probability level. (c) Variation in rho across exon 2, where the solid line represents the mean estimate (ρ) and the grey field represents the 95% highest posterior probability density intervals.

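[Figure 4 tree: alleles clustered into four supertypes, with bootstrap values at the nodes. Legible tip labels by cluster: Supertype 4: L9, L5; Supertype 3: L4, L21; Supertype 2: L12, L2, L7; Supertype 1: L1A, L1, L3, L20.]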

Figure 4. Cluster analysis tree (Ward’s algorithm with 1000 bootstrap replicates) grouping MHC alleles into supertypes, based on physiochemical properties of positively selected codons.

[Figure 5 plot: bar chart. Legend: Oakland, Mill Valley, Corte Madera, Lea Lake. X-axis categories: supertype 1, supertype 2, supertype 3, supertype 4.]

Figure 5. The proportion of each supertype used in each population. There was no significant difference between populations based on supertype usage following Bonferroni correction of Fisher’s Exact Test.

Appendix

Table A1. Microsatellite heterozygosity: sample size (N), number of alleles (Na), mean observed heterozygosity (Ho) and expected heterozygosity (He). Sample size of N=20 for all populations.

Locus         Statistic   OAK     CMH     CCT     LL
Fh-ATG18      Ho          0       0       0       0
              He          0       0       0       0
              Na          1       1       1       1
Fh-ATG6       Ho          0.2     0.2     0.15    0
              He          0.18    0.18    0.139   0
              Na          2       2       2       1
Fh-ATG17      Ho          0       0       0       0
              He          0       0       0       0
              Na          1       1       1       1
Fh-ATGB101    Ho          0       0       0       0
              He          0       0       0       0
              Na          1       1       1       1
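The Fh-ATG6 values in Table A1 can be reproduced with a short worked example. The sketch below assumes a biallelic locus in which every copy of the rarer allele is carried by a heterozygote (an assumption, since genotype counts are not given) and uses the uncorrected formula He = 1 - Σp². Both function names are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Uncorrected expected heterozygosity, He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_freqs)


def biallelic_freqs_from_hets(n_individuals, observed_het):
    """Allele frequencies at a two-allele locus.

    Assumption (not stated in the table): no individual is homozygous
    for the rarer allele, so each heterozygote contributes exactly one
    copy of it to the 2N sampled gene copies.
    """
    n_hets = round(observed_het * n_individuals)
    q = n_hets / (2 * n_individuals)
    return (1.0 - q, q)


# Fh-ATG6 with N = 20 diploid individuals per population.
for pop, ho in (("OAK", 0.2), ("CMH", 0.2), ("CCT", 0.15)):
    freqs = biallelic_freqs_from_hets(20, ho)
    print(pop, round(expected_heterozygosity(freqs), 3))
```

With Ho = 0.2 the rarer allele has frequency 4/40 = 0.1, giving He = 2(0.9)(0.1) = 0.18; with Ho = 0.15 it has frequency 3/40 = 0.075, giving He = 0.13875. Both agree with the tabulated values to the precision shown, which is consistent with the assumption above but does not prove it.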

Table A2. P-values from Fisher’s Exact Test. All four populations were tested against each other to look for significant differences at each locus. * indicates significance (P<0.05). Bold values indicate significance following Bonferroni correction (P<0.0083). Populations: CMH = Corte Madera, CCT = Mill Valley, OAK = Oakland, LL = Lea Lake.

               450 locus   660 locus   750 locus   827 locus   850+ locus
CMH vs. CCT    0.342       1           1           0.731       0.235
CMH vs. OAK    0.044       0.127       0.605       0.731       1
CCT vs. OAK    0.48        0.127       1           1           0.451
CMH vs. LL     0.000*      0.000*      1           0.182       0.487
CCT vs. LL     0.000*      0.000*      1           0.044       0.02
OAK vs. LL     0.000*      0.113       0.605       0.044       0.231

Table A3. Amino acids under positive selection with respect to binding pockets. Positively selected codons identified using omegaMap (Wilson and McVean 2006) and pocket codons (Stern et al. 1994).

Pocket #   Codon position under positive selection
1          86
4          13, 70, 71
6          11, 13
7          61, 71
9          9
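The Bonferroni threshold quoted for the pairwise Fisher's Exact Tests (P<0.0083) follows from dividing 0.05 by the six possible pairings of four populations. The sketch below shows that arithmetic and applies it to the 450-locus P-values from Table A2; the function name is hypothetical, and the values printed as 0.000 in the table are entered as 0.0 because their exact magnitudes are not reported.

```python
from math import comb


def bonferroni_threshold(alpha, n_groups):
    """Per-test threshold when every pair of groups is compared once."""
    n_tests = comb(n_groups, 2)  # 4 populations give 6 pairwise tests
    return alpha / n_tests, n_tests


threshold, n_tests = bonferroni_threshold(0.05, 4)

# P-values for the 450 locus as printed in Table A2.
p_450 = {
    "CMH vs. CCT": 0.342,
    "CMH vs. OAK": 0.044,
    "CCT vs. OAK": 0.48,
    "CMH vs. LL": 0.0,
    "CCT vs. LL": 0.0,
    "OAK vs. LL": 0.0,
}
significant = [pair for pair, p in p_450.items() if p < threshold]
print(n_tests, round(threshold, 4), significant)
```

Under this threshold, CMH vs. OAK at the 450 locus (P = 0.044) is below 0.05 but does not remain significant after correction.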

Table A4. P-values from Fisher’s Exact Test. Supertype usage was tested between all four populations to look for significant differences between populations. * indicates significance (P<0.05). Bold values indicate significance following Bonferroni correction (P<0.0083). Populations: CMH = Corte Madera, CCT = Mill Valley, OAK = Oakland, LL = Lea Lake.

              supertype 1   supertype 2   supertype 3   supertype 4
OAK v CMH     0.305         0.293         0.095         0.161
OAK v CCT     0.245         0.057         0.046*        0.226
OAK v LL      0.293         0.092         0.02*         0.02*
CCT v CMH     0.343         0.094         0.343         0.193
CCT v LL      0.183         1             0.556         0.029*
LL v CMH      0.273         0.137         0.236         0.137

Figure A1. An image of the initial MHC PCR for the Corte Madera population showing multiple loci separated by electrophoresis on an agarose gel. (Gel label: 100 bp Ladder.)

Table A5. Estimates of average codon-based evolutionary divergence over sequence pairs across exon 2. The number of nonsynonymous substitutions per nonsynonymous site and the number of synonymous substitutions per synonymous site, from averaging over all sequence pairs within each group, are shown. Standard error estimate(s) are shown in the second column and were obtained by a bootstrap procedure (1000 replicates).

[Table A5 body: rows All Sites, PBR and NonPBR; columns dN (SE) and dS (SE). The cell order is scrambled; surviving value fragments: .1, 0.101, 0.213, 046, 0.212, 0.406, 0.13, 0.258; surviving standard error fragments: 008, (0.027), (0.038), 017, (0.079), (0.117), (0.026), (0.038).]


[Figure A2 plots, legible panel titles: a1) Oakland variation in ω; b1) Oakland posterior probability; a2) Corte Madera variation in ω; a3) Mill Valley variation in ω; b3) Mill Valley posterior probability; a4) Lea Lake variation in ω; b4) Lea Lake posterior probability. X-axis for each panel: Codon (ticks from 6 to 86).]

Figure A2. Codon level analysis of positive selection across the MHCIIDB exon 2 of the four Rainwater killifish populations. (a#) Variation in omega across exon 2, where the solid line represents the mean estimate (ω), the grey field represents the 95% highest posterior probability density intervals, and the dotted line corresponds to neutral estimates of nonsynonymous/synonymous substitutions (ω=1). (b#) Posterior probability of positive selection; crosses indicate putative PBR sites and the dotted line represents the 95% posterior probability. (PBR sites annotated from Brown et al. 1993.) (Population numbers: 1 = Oakland, 2 = Corte Madera, 3 = Mill Valley, 4 = Lea Lake.)

[Figure A3 trees: panels a) to d), each grouping the alleles into Supertypes 1 to 4 with bootstrap values at the nodes. Legible tip labels in panel a): Supertype 4: L5, L8, L10; Supertype 3: L4, L21; Supertype 2: L2, L7; Supertype 1: L1A, L1, L3, L20.]

Figure A3. Results of cluster analysis using a) Ward’s algorithm, and paired linkage with b) Euclidean distance, c) cosine distance and d) Pearson’s correlation, defining MHC supertypes for L. parva, derived from physiochemical properties of positively selected codons. Bootstrapping values (n=1000) indicated.

[Figure A4 alignment: amino acid sequences of alleles L1, L1A, L2, L7, L12, L3, L20, L4, L21, L5, L8, L9 and L10, shown as differences from the consensus. Legible start of the consensus: REYEVDRCVFNSSELNDIEFIRSHFYNKLEYIRFSSSVGKFVGYTEY. Legible end of the consensus: KAQREAYCLHNVGIDYQN. Column alignment of the per-allele rows is lost.]

Figure A4. Amino acid sequences from the thirteen MHCIIDB haplotypes detected in this study. PBR codons are annotated from Brown et al. 1993 and marked with the box below the consensus sequence. The beta sheets (arrows) and alpha helix (cylinder) have been annotated above the consensus sequence. Codon numbering corresponds to the mature protein (Ono et al. 1993, Cohen et al. 2002).

[Figure A5 alignment: nucleotide sequences of alleles L9, L10, L4, L21, L5, L8, L2, L3, L20, L1A, L1, L7 and L12, shown as differences from the consensus. Column alignment of the per-allele rows is lost. Legible consensus segments, in order:
GGGTTTAGACAATATGTRGTGGATCGTTGTGTGTTTAACTCCTCTGAGCTGAACGACATCGAGTTCATCAGCTCCCACTTTTACAACAAGTTGGAGTTCATCAG
GTTCAGCAGCAGTGTGGGGAAGTTTGTTGGATACACTGAGTTTGGAGTGAAGAACGCAGACTACTKGAACAGTSAGACTTCATATATTGATAGGATGAAAGCTCAGAGGGAGGCCTACT
GCCTGCACAACGTTGGTATTGACTACCAGAAT]

Figure A5. Nucleotide sequences from the thirteen MHCIIDB haplotypes detected in this study. PBR codons are marked with the box below the consensus sequence and are annotated from Brown et al. 1993. The beta sheets (arrows) and alpha helix (cylinder) have been annotated above the consensus sequence. Numbering corresponds to the codon position of the mature protein (Ono et al. 1993, Cohen et al. 2002).

[Figure A6 alignment: intron 1 nucleotide sequences of haplotypes LI21, LI9, LI10, LI4, LI6, LI5, LI8, LI1A, LI1, LI3, LI20, LI2, LI7 and LI12, shown as differences from the consensus. Column alignment of the consensus and per-haplotype rows is lost.]

Figure A6. Nucleotide alignment from the fourteen MHCIIDB intron 1 haplotypes detected in this study. Intron alleles named LI correspond to exon alleles named L.