UNIVERSITY OF CALIFORNIA
SANTA CRUZ

CREATION AND UTILIZATION OF NOVEL GENETIC METHODS FOR STUDYING AND IMPROVING MANAGEMENT OF CHINOOK SALMON POPULATIONS

A dissertation submitted in partial satisfaction of the requirements for the degree of

DOCTOR OF PHILOSOPHY
in
OCEAN SCIENCES

by

Anthony J. Clemento

December 2013

The Dissertation of Anthony J. Clemento is approved:

Dr. John Carlos Garza, Chair
Dr. Jonathan Zehr
Dr. Grant Pogson
Dr. Eric Anderson

Dean Tyrus Miller
Vice Provost and Dean of Graduate Studies

Copyright © by Anthony J. Clemento 2013

Table of Contents

List of Figures
List of Tables
Abstract
Dedication
Acknowledgments
Introduction

1 Discovery and characterization of single nucleotide polymorphisms in Chinook salmon, Oncorhynchus tshawytscha
  1.1 Abstract
  1.2 Introduction
  1.3 Methods
    1.3.1 Primer Design and PCR
    1.3.2 Sequencing and SNP Assay Development
  1.4 Results
  1.5 Discussion

2 Evaluation of a SNP baseline for genetic stock identification of Chinook salmon (Oncorhynchus tshawytscha) in the California Current Large Marine Ecosystem
  2.1 Abstract
  2.2 Acknowledgments
  2.3 Introduction
  2.4 Methods
    2.4.1 Baseline Populations
    2.4.2 Markers and Genotyping
    2.4.3 Marker Selection
    2.4.4 Population Genetics Analyses
    2.4.5 Power Analyses
    2.4.6 Mixed Fishery Samples
  2.5 Results
    2.5.1 Genotyping and Basic Population Genetics
    2.5.2 Assignment and Mixture Estimation Accuracy
    2.5.3 Fishery Sample
  2.6 Discussion
    2.6.1 Methodological Considerations
    2.6.2 Implications for Management
  2.7 Conclusions

3 Large-scale genetic tagging experiment in a hatchery population of Chinook salmon (Oncorhynchus tshawytscha) allows for pedigree-based inference
  3.1 Introduction
  3.2 Methods
    3.2.1 Study Site
    3.2.2 Hatchery Sampling
    3.2.3 DNA Extraction and Genotyping
    3.2.4 Population Genetic Analyses
    3.2.5 Pedigree Reconstruction
    3.2.6 Age Structure, Reproductive Success and Length-at-spawning
    3.2.7 Relatedness
    3.2.8 Fishery Samples
  3.3 Results
    3.3.1 Population Genetic Parameters
    3.3.2 Hatchery Pedigree Reconstruction
    3.3.3 Age Structure
    3.3.4 Variance in Family Size and Reproductive Success
    3.3.5 Heritability of Length-at-spawning
    3.3.6 Relatedness
    3.3.7 Fishery Samples
  3.4 Discussion
    3.4.1 Technical Issues
    3.4.2 Parentage Assignments
    3.4.3 Heritability of Length-at-maturity
    3.4.4 Age Structure of Returning Adults and Spawning Broodstock
    3.4.5 Inbreeding and Reproductive Success
    3.4.6 Fishery Assignments
  3.5 Conclusions

Conclusions and Future Directions
References

List of Figures

2.1 Unrooted neighbor-joining tree based on chord distances of 67 Chinook salmon populations from California to Alaska in the GSI baseline (see Table 2.1 for population details). Dashed lines indicate the position of populations which fall at tree junctions or have very short branch lengths. Sinona Creek and the coho salmon were omitted for missing data.

2.2 Estimates of mixing proportions from cross-validation over gene copies (CV-GC) and K-Fold simulations for the eight most abundant reporting units in California Chinook salmon fisheries. The x-axis gives the true proportion of fish from each reporting unit, and the y-axis gives the estimated proportion. The dashed line is the y=x line. Shaded regions give the range between the 5% and 95% quantiles of estimates that would be achieved with perfect assignment of fish to reporting unit; i.e., they represent the uncertainty due to the fact that fishery proportions are estimated with a finite sample (in our simulations, a sample of 200 fish). The 5% and 95% quantiles of the estimates using genetic data from the CV-GC and the K-Fold methods are shown with vertical line segments and open diamonds, respectively. The mean over 20,000 CV-GC simulation replicates and 1,000 K-Fold replicates are given by filled circles and open triangles, respectively. These points fall along the dotted line when the estimator is unbiased.

3.1 Age structure of returning adults (male and female) for two cohorts (2006 and 2007) from the Feather River Hatchery, CA. Numbers in parentheses indicate the total number of fish in each category, while white bars denote two-year olds, grey bars three-year olds and black bars four-year old fish.

3.2 Age structure of spawning adults (male and female) for two years of spawner broodstock from the Feather River Hatchery, CA. Numbers in parentheses indicate the total number of fish in each category, while white bars denote two-year olds, grey bars three-year olds and black bars four-year old fish.

3.3 Number of offspring that returned to the hatchery for females (white bars), males (grey bars) and mated pairs (dark bars) over all study years. The similarity over comparisons is expected as generally one male is spawned with one female at the hatchery.

3.4 Number of offspring (full-siblings) that returned to the hatchery for parents spawned in each study year, 2006-2009. Note that offspring of 2009 spawners are under-represented as sampling permitted assignment of only two-year old fish.

3.5 Relationship between the length of a mother and the number of her offspring that returned to the hatchery as adults at ages two, three or four. The size of full-sibling families here ranges from one to thirteen.

3.6 Linear regression of parental length on the length of their 3-year old adult offspring. Independent comparisons were made for: mean parent length and all offspring, male offspring, and female offspring, as well as, fathers and male offspring and mothers and female offspring.

3.7 Distribution of the relatedness coefficient (Rxy; Queller and Goodnight 1989) between all possible pairs of individuals in each collection of spawning broodstock and over all samples. Values are normally distributed, so the range, mean, standard deviation (Std. Dev.) and skew are reported.

3.8 Mated pairs were recorded at the FRH for spring-run spawners from 2006-2009. Parentage assignment allowed for the comparison of the distribution of relatedness (Rxy) among pairs that successfully had offspring return to the hatchery as adults (left side) and those that did not (right side). Again, values were normally distributed, and the range, mean, standard deviation (Std. Dev.) and skew are reported.

3.9 Linear regression of the degree of relatedness between a parent pair (as estimated by Rxy) and the number of offspring that returned to the hatchery in subsequent years. This includes Rxy values for parents that had no offspring return.

List of Tables

1.1 Summary of EST sequencing effort to identify genetic variation in populations of Chinook salmon (O. tshawytscha) from the west coast of North America. The weighted estimates account for unobserved variation in consensus sequence derived from less than 24 individuals.

1.2 Description of the 117 SNP assays developed in this project with the target polymorphism, primer and probe sequences, length of the consensus sequence in base pairs (bp), and GenBank (dbGSS) and NCBI (dbSNP) accession numbers indicated.

1.3 Summary statistics for 117 SNP loci in five Chinook salmon populations. N is the number of individuals genotyped. HE is expected (unbiased) heterozygosity and HO is observed heterozygosity. FST is over all five populations. AF is the observed frequency of the minor allele from the Feather River stock in each population. Asterisks (*) indicate significant (p<0.001) deviations from Hardy-Weinberg equilibrium.

1.4 Preliminary BLAST results (BLAST hit and e-value) and annotation of the target SNP for the loci described here (Reference 1) and for an additional 24 loci (References 2, 3, 4 and unpublished) that are part of the final genotyping panel described in Chapter 2. Also included is whether the variation is present in an intron or exon and its location with respect to the described gene, either in coding sequence (CDS) or untranslated regions (UTR). No translation (n.t.) was available for 10 loci. For CDS exons, a single amino acid is indicated for synonymous substitutions, while both amino acids are included for non-synonymous substitutions. Reference codes are as follows: 1. Clemento et al. 2011; 2. Smith et al. 2005a; 3. Campbell and Narum 2008; 4. Smith et al. 2005b.

2.1 Populations and reporting groups in the single-nucleotide polymorphism baseline for genetic stock identification of Chinook salmon from the West Coast of North America. Shown are the names used on the phylogeographic tree (Figure 2.1), the total number of individuals sampled (n), the number used in the training set (nt), estimates of unbiased expected (Exp.) and observed (Obs.) heterozygosity (Hz), and the mean number of alleles (A); also shown are the proportion of individuals that self-assign (Assign.) to the population (pop.) from which they were sampled and the proportion that self-assign to the correct reporting (rep.) group, as well as the mean FST for each population within and between reporting groups. Note that mean summary values shown were calculated excluding the coho salmon sample.

2.2 List of the 96 single nucleotide polymorphism loci used to construct the baseline for genetic stock identification of Chinook salmon from the West Coast of North America, including dbSNP accession numbers (at the NCBI on-line repository for short genetic variations) and source reference (SR) where available: 1.