Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 1

A selective breeding programme for the marine finfish Australasian Snapper (Chrysophrys auratus)

David Thomas Ashton1, Peter Alan Ritchie2, Maren Wellenreuther1*

1The Institute for Plant & Food Research Limited, 293-297 Akersten St, Port Nelson, Nelson, 7010, New Zealand 2School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand *Corresponding and presenting author: [email protected] Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 2

Summary

The ever increasing global demand for protein-rich food source is a significant challenge for sustainable seafood production. In New Zealand, wild-capture fisheries have already peaked or are falling due to exploitation pressures and other stressors, as well as variable recruitment, leading to substantial annual fluctuations of natural fish stocks. Promising solutions to alleviate these pressures and protect the future of wild stocks include alternative seafood production routes via domestication and subsequent breeding of new aquaculture species. Selective breeding programmes have significant potential to help address this challenge via cumulative improvements in key production traits, such as growth rate to disease resistance. However, the majority of farmed fish and shellfish production is based on stocks with limited or no selective breeding, and this is in part due to the lack of available genomic resources (e.g. fully assembled genomes, genetic markers). The native marine finfish Australasian snapper (Chrysophrys auratus) is an iconic species and of significant recreational and commercial interest in New Zealand. Work at Plant and Food Research is being undertaken to develop genomic and phenomic resources for this the Australasian snapper to improve the breeding of cultured populations. Here we provide a short summary of the breeding progress to date, including a description of the genomic resources that have been build, and highlight the future directions of the aquaculture breeding programme at Plant and Food Research.

Keywords: Snapper, QTL, GBS, genome, New Zealand, selective breeding. Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 3

Introduction

Globally, less than 10% of aquaculture production comes from selectively bred stocks [1], lagging significantly behind the terrestrial and plant farming industries [2]. Encouragingly, genetic gains for aquatic species are generally higher than that of terrestrial farm [3, 4]. However, the majority of farmed fish and shellfish production is based on stocks with limited or no selective breeding, and this is in part due to the lack of available genomic resources (e.g. fully assembled genomes, genetic markers) [2, 5]. New Zealand has the fourth largest exclusive economic zone in the world and seafood is consequently a major source of revenue for this country ($1.42 billion). Wild-catch fisheries are the primary source for most seafood production, but with many species at or above maximum production capacity there has been an increasing focus on aquaculture [6]. Currently, however, only three species make up 90% of New Zealand’s aquaculture production [7, 8], namely Pacific oysters (Crassostrea gigas), Greenshell mussel (Perna canaliculus) and Chinook salmon (Oncorhynchus tshawytscha), with the latter one being introduced from North America. This lack of species diversity makes domestication of new species, particularly native finfish species, an urgent need. The Australasian Snapper (Chrysophrys auratus, Figure 1) is one of New Zealand’s most important and valuable inshore finfish species ($36.8m pa in exports) [9]. In addition to the commercial fishery, snapper populations also support a major recreational fishery [10]. While snapper is not a commercial aquaculture species in New Zealand, the close sister species (Chrysophrys major) is a cultured fish in that has been bred since the 1950s [11]. Plant and Food Research has started a domestication and breeding programme for snapper over ten years ago and has established a four generation cultured population in Nelson, New Zealand. Until recently, the only genetic resources available for this species were a few microsatellites transferred from C. major and short mitochondrial DNA sequences [12, 13, 14]. As part of the snapper breeding programme at Plant and Food Research, efforts have been underway to “genomically enable” snapper by developing a range of genomic resources, including a fully assembled and annotated genome, development of a large dataset of genetic SNP markers, transcriptome, linkage map and the discovery of QTLs. These resources allow genomic selection procedures to select the best breeders from the broodstock, even when selecting polygenic traits [15], and control for the genetic relatedness of individual in the broodstock population. Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 4

Figure 1. Cultured Australasian snapper (Chrysophrys auratus) at the Nelson Maitai hatchery of Plant and Food Research. Material and Methods

Snapper populations The snapper breeding programme at Plant and Food Research was established with samples collected from a three generation cultured snapper pedigree held in Nelson. The F0 generation (grandparents) is comprised of individuals (n = 50) collected from Tasman Bay in the South Island of New Zealand. These were held in two separate populations and used to produce five F1 year classes using tank based spawning. In 2013, a selection of individuals (n = 75) from the five F1 year classes were combined into a single population and used to produce a F2 generation (n = 577) using tank based spawning.

Genome wide marker discovery using Genotyping By Sequencing A modified Genotyping by Sequencing (GBS) approach was used to genotype all individuals in the three generation pedigree [16]. A double digest (restriction enzymes: PstI and MspI) GBS approach was used to generate the libraries. A total of 672 individuals where split into 96 individuals per sequencing lane, with 2-3 replicates used for each of the F0 and F1 individuals. A total of 8x lanes of sequencing were carried out on a HiSeq 2000 at the Australian Genome Research Facility (AGRF) in . Single nucleotide polymorphisms (SNPs) were extracted from the raw data using the STACKs pipeline [17] after checking and trimming reads with fastqc and fastq-mcf, respectively. Relatedness estimates were calculated for individuals using an R package developed by Dodds et al. [18] and checked with the software package Cervus [19]. A linkage map was constructed using Lep-MAP2 [20], which was based on the 10 largest F1 and F2 families in the data set.

Genome sequencing and assembly A genome assembly for C. auratus was constructed using one F1 individual. Sequencing data included a 160bp (PE151), 3kbp (PE101), 8kbp (PE101), 10kbp (PE50), and 20kbp (PE50) mate pair libraries. The initial genome assembly was carried out using ALLPATHS [21]. The scaffolds from the initial assembly and the linkage map from the GBS data were used to check the ordering between the linkage map and genome scaffolds. Comparisons between the linkage map and scaffolds was then plotted using the CIRCOS software [22]. Optical mapping was also carried out to aid the genome assembly process.

Phenotyping and Quantitative Trait Loci (QTL) mapping Phenotyping and QTL mapping were carried out using the GBS data and three generation pedigree. Phenotypic traits measured include length, colour, and shape data over time (each fish is measured repeatedly). Automated phenotyping scripts have been being developed to extract a wide range of phenotypic shape and growth information from the images collected. QTL scans are carried out on 1 and 3 year old fish. Given that the generations were generated using tank based breeding, the Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 5

pedigree was expected to consist of unrelated as well as closely related individuals. Having such a complex pedigree means that the use of software is severely limited, as most are designed to deal with simple family based designs [23]. Results

A total of 1.6 GB of data was produced from the 8 lanes of sequencing with each individual having between 2 – 6 Million reads. FastQC results indicated that the data quality was high, with 1.5 GB remaining after filtering out low quality reads. STACKs generated a total of 249,468 SNPs with greater than 7x coverage. 20,311 SNPs remained after filtering for those with a Minor Allele Frequency (MAF) above 0.05 and that were present in 75% of the populations. This data was used to determine the pedigree structure of the cultured population, generate a dense linkage map, and identify several QTLs for growth and colour traits. The linkage map consists of 10,968 markers in 24 linkage groups. Data from a range of sequencing libraries is being used to construct a full genome sequence. The linkage map and optical mapping have also proved suitable to help link together genome scaffolds. Before linkage mapping the genome assembly consisted of 5,998 scaffolds. Of those, 1,479 were able to be placed somewhere on the linkage map. Those 1,479 scaffolds comprised a total of 701Mb (95% of the total bp in the genome). The 1,479 scaffolds were combined into 24 linkage groups which match our expected 24 chromosomes, and together had a N50 scaffold value of 1.5 Mb. Following integration of our optical mapping data to improve the genome assembly the N50 scaffold value increased to 4.4 Mb. A number of QTLs were detected at the genome-wide significance threshold, for both growth related traits, as well as other traits. Growth information was used to estimate breeding values for the third generation snapper to select suitable parents for the 2017/18 breeding season. Discussion

The development of high-throughput genomic techniques (e.g. next-generation sequencing) has led to the transition from relatively basic genetic methods into a new area of research known as genomics. While prior genetic methods used a relatively low number of neutral genetic markers, genomic methods can use large numbers of neutral and adaptive genetic markers, full genome sequencing data, and therefore more effectively address the complex array of interactions that occur across the genome and affect the phenotype [see 24, 25]. Due the significant increases in genomic power over previous genetic methods there is increasing interest in using genomic informed methods to better manage wild populations [26] and species important for aquaculture, the latter particularly when selective breeding programmes are underway [6]. Despite significant benefits for aquaculture and research, genomics is still underutilized in New Zealand’s aquatic species. One of the main reasons for this is the high resource requirements of genomics based approaches. Additionally, unlike land based animal culture, which is dominated by relatively few species (e.g. sheep, cattle, pigs), fish, which are still largely wild-caught, have the highest species diversity for any vertebrate group. This results in most genomic research being spread thinly across the different species and limited genomic resources being available for many key fisheries or aquaculture species. However, the reduced cost of sequencing and the development of new genotyping method for species with no prior information (e.g. GBS) are beginning to Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 6

overcome this limitation [e.g. 16]. The Australasian snapper is a good example of a fish species reflecting the opportunity that genomic resources can have. Snapper is an important recreational and commercial teleost fish species in New Zealand and worth ~ $35M pa in exports and supports an important local fishery. Some population structure may be present although it is still poorly understood (due to being investigated with a few weak genetic markers) [14]. Although snapper is currently only wild- caught, there is interest in the development of this species for aquaculture and potentially restocking of the wild populations. Wild snapper populations have recently undergone a significant reduction in total biomass (down to 20% of its unfished stock biomass) [9] and restocking could present on method for recovering this stock. Despite a wide range of potential applications, no genomic approaches have been used in the research or management of this species. We presented the development of significant genomic resources for snapper, and have outlined how they can be used to inform and accelerate the ongoing breeding programme of snapper and Plant and Food Research in New Zealand. We think that an additional focus for future work is likely to centre on the investigation of natural populations (to enable restocking, and also to understand adaptive wild variation), annotation of the assembled genome, and further QTL mapping. Investigation of the natural populations will likely include resequencing and genotyping of individuals to determine the structure of the wild populations, direction and strength of gene flow, how environmental gradients affect the population genetic strcyuture and the amount of adaptive genetic variation in different regions of New Zealand. Acknowledgements

We would like to acknowledge all the Plant and Food Research staff who have been involved with breeding and husbandry for the snapper populations; in particular Warren Fantham who manages the fish breeding and Therese Wells who manages the post-juvenile snapper husbandry. We would also like to thank the laboratory and bioinformatics staff at Plant and Food Research involved in the snapper breeding programme, in particular, Elena Hilario, Peter Jaksons and Ross Crowhurst. This project was funded through the “Accelerated breeding for enhanced seafood production” MBIE research programme project (#C11X1603).

References

1. Gjedrem, T., et al. (2012) The importance of selective breeding in aquaculture to meet future demands for animal protein: A review. Aquaculture 350–353, 117-129. 2. Yáñez, J.M., et al. (2015) Genomics in aquaculture to better understand species biology and accelerate genetic progress. Front. Genet. 6. 3. Nguyen, N.H. (2016) Genetic improvement for important farmed aquaculture species with a reference to carp, tilapia and prawns in Asia: achievements, lessons and challenges. Fish. Fish. 17, 483-506. 4. Gjedrem, T. and Rye, M. (2016) Selection response in fish and shellfish: a review. Reviews in Aquaculture. Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 7

5. Robledo, D., et al. (2017) Applications of genotyping by sequencing in aquaculture breeding and genetics. Reviews in Aquaculture. 6. Bernatchez, L., et al. (2017) Harnessing the power of genomics to secure seafood future. Trends Ecol. Evol. 9, 665-680. 7. Camara, M.D. and Symonds, J.E. (2014) Genetic improvement of New Zealand aquaculture species: programmes, progress and prospects. New Zealand Journal of Marine and Freshwater Research, 1-26. 8. Symonds, J., et al. (2012) Developing broodstock resources for farmed marine fish. In Proceedings of the New Zealand Society of Animal Production, pp. 222-226. 9. MPI (2013) Review of sustainability and other management controls for snapper 1 (SNA 1). In MPI Discussion Paper, pp. 1-80, Ministry of Primary Industries. 10. Parsons, D., et al. (2014) Snapper (Chrysophrys auratus): a review of life history and key vulnerabilities in New Zealand. New Zealand Journal of Marine and Freshwater Research, 1-28. 11. Murata, O., et al. (1996) Selective breeding for growth in red sea bream. Fisheries science 62, 845-849. 12. Adcock, G.J., et al. (2000) Screening of DNA polymorphisms in samples of archived scales from New Zealand snapper. Journal of Fish Biology 56, 1283-1287. 13. Hauser, L., et al. (2002) Loss of microsatellite diversity and low effective population size in an overexploited population of New Zealand snapper ( auratus). Proceedings of the National Academy of Sciences 99, 11742-11747. 14. Bernal-Ramírez, J.H., et al. (2003) Temporal stability of genetic population structure in the New Zealand snapper, Pagrus auratus, and relationship to coastal currents. Marine Biology 142, 567-574. 15. Wellenreuther, M. and Hansson, B. (2016) Detecting polygenic evolution: problems, pitfalls, and promises. TIG 32, 155–164. 16. Elshire, R.J., et al. (2011) A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6, 1-10. 17. Catchen, J., et al. (2013) Stacks: an analysis tool set for population genomics. Molecular ecology 22, 3124-3140. 18. Dodds, K.G., et al. (2015) Construction of relatedness matrices using genotyping-by- sequencing data. BMC Genomics 16, 1-15. 19. Kalinowski, S.T., et al. (2007) Revising how the computer program cervus accommodates genotyping error increases success in paternity assignment. Molecular Ecology 16, 1099- 1106. 20. Rastas, P., et al. (2015) Construction of ultra-dense linkage maps with Lep-MAP2: stickleback F2 recombinant crosses as an example. Genome Biology and Evolution, 1-48. 21. Butler, J., et al. (2008) ALLPATHS: de novo assembly of whole-genome shotgun microreads. Genome Research 18, 810-820. Proceedings of the World Congress on Genetics Applied to Livestock Production, 11 8

22. Krzywinski, M.I., et al. (2009) Circos: An information aesthetic for comparative genomics. Genome Research. 23. Ashton, D.T., et al. (2017) 15 years of QTL studies in fish: challenges and future directions. Mol. Ecol. 26, 1465–1476. 24. Valenzuela-Quiñonez, F. (2016) How fisheries management can benefit from genomics? Briefings in Functional Genomics, 352-357. 25. Wenne, R., et al. (2007) What role for genomics in fisheries management and aquaculture? Aquat. Living Resour. 20, 241-255. 26. Bernatchez, L. and Wellenreuther, M. (2017) Synergistic Integration of Genomics and Ecoevolutionary Dynamics for Sustainable Fisheries: A Reply to Kuparinen and Uusi- Heikkilä. Trends in Ecology & Evolution.