bioRxiv preprint doi: https://doi.org/10.1101/668384; this version posted June 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

TITLE PAGE

A chromosome-level assembly of the Atlantic herring – detection of a supergene and other signals of selection

Mats E. Pettersson1, Christina M. Rochus1,*, Fan Han1,*, Junfeng Chen1, Jason Hill1, Ola Wallerman1, Guangyi Fan2,3, Xiaoning Hong2,4, Qiwu Xu2, He Zhang2, Shanshan Liu2, Xin Liu2,5,6, Leanne Haggerty7, Toby Hunt7, Fergal J. Martin7, Paul Flicek7, Ignas Bunikis8, Arild Folkvord9,10, Leif Andersson1,11,12

* These authors contributed equally

1 Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
2 BGI-Qingdao, BGI-Shenzhen, Qingdao 266555, China
3 State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macao, China
4 BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
5 BGI-Shenzhen, Shenzhen 518083, China
6 China National GeneBank, BGI-Shenzhen, Shenzhen 518120, China
7 European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
8 Science for Life Laboratory Uppsala, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
9 Department of Biological Sciences, University of Bergen, Bergen, Norway
10 Institute of Marine Research, Bergen, Norway
11 Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
12 Department of Veterinary Integrative Biosciences, Texas A&M University, Texas, United States

ABSTRACT

The Atlantic herring is a model species for exploring the genetic basis for ecological adaptation, due to its huge population size and extremely low genetic differentiation at selectively neutral loci. However, such studies have so far been hampered by a highly fragmented genome assembly. Here, we deliver a chromosome-level genome assembly based on a hybrid approach combining a de novo PacBio assembly with Hi-C-supported scaffolding. The assembly comprises 26 autosomes with sizes ranging from 12.4 to 33.1 Mb and a total size, in chromosomes, of 726 Mb. The development of a high-resolution linkage map confirmed the global chromosome organization and the linear order of genomic segments along the chromosomes. A comparison of the herring genome assembly with other high-quality assemblies from bony fishes revealed few interchromosomal but frequent intrachromosomal rearrangements. The improved assembly makes the analysis of previously intractable large-scale structural variation more feasible, allowing, for example, the detection of a 7.8 Mb inversion on chromosome 12 underlying ecological adaptation. This supergene shows strong genetic differentiation between populations from the northern and southern parts of the species distribution.
The chromosome-based assembly also markedly improves the interpretation of previously detected signals of selection, allowing us to reveal hundreds of independent loci associated with ecological adaptation in the Atlantic herring.

INTRODUCTION

The Atlantic herring (Clupea harengus) is a model system to study the genetic basis for ecological adaptation and the consequences of natural selection (Martinez Barrio et al., 2016; Lamichhaney et al., 2017). A major merit of this system for evolutionary studies is the minute genetic drift resulting from the enormous population size, which facilitates the detection of how natural selection affects the populations. The Atlantic herring is in fact one of the most abundant vertebrates on Earth, with schools comprising more than a billion individuals and an estimated global population in excess of 10^11 fish (Feng et al., 2017). It is also one of very few marine species to successfully colonize the Baltic Sea, a brackish body of water formed after the last Ice Age, giving rise to the phenotypically distinct Baltic herring, which is classified as a subspecies of the Atlantic herring.

Earlier work provided the first draft version of the herring genome (Martinez Barrio et al., 2016), and revealed regions with strong signals of selection related to both adaptation to the brackish Baltic Sea and differences in spawning time between herring populations (Martinez Barrio et al., 2016; Lamichhaney et al., 2017).
In contrast, there is essentially no genetic differentiation at selectively neutral loci even between geographically distant populations, a fact documented already by isozyme and microsatellite analyses (Andersson et al., 1981; Larsson et al., 2010; Limborg et al., 2012; Ryman et al., 1984) and verified by whole genome sequencing and Fst analysis (Lamichhaney et al., 2012; Lamichhaney et al., 2017). However, while the signals of selection were strong, the fragmented nature of the draft genome made it challenging to determine the number of independent loci under selection and to study the impact of large-scale inversions and other structural variations.

Here, by combining a de novo long-read assembly of an Atlantic herring with long-range chromatin interaction information gathered via the Hi-C method (Lieberman-Aiden et al., 2009), we remedy this fragmentation and deliver a chromosome-level assembly of the herring genome comprising 26 autosomes with sizes ranging from 12.4 to 33.1 Mb and a total size of 726 Mb. We also show how this new assembly has a major impact on our ability to interpret the signals of selection. The final assembly version is publicly available via the European Nucleotide Archive (https://www.ebi.ac.uk/ena/data/view/GCA_900700415).

RESULTS

Compiling the hybrid assembly

The assembly is based on a new de novo assembly of an Atlantic herring, as opposed to the Baltic herring used for the previously published version 1.2 (Martinez Barrio et al., 2016).
We generated 63 Gb (approximately 75x coverage) of sequence using Pacific Biosciences RSII cells and assembled the genome with FALCON-unzip (Chin et al., 2016). The FALCON-unzip assembly was processed through the PurgeHaplotigs pipeline (Roach et al., 2018) in order to remove redundant sequences from the primary assembly. This procedure resulted in a de novo assembly with a total size of 792.6 Mb and a contig N50 of 1.61 Mb, comparable to the scaffold N50 (1.84 Mb) of the published v1.2 genome. Thus, the PacBio assembly achieves a similar level of organization while eliminating a substantial degree of uncertainty, as v1.2 contains close to 10% undetermined bases (Ns), compared to zero Ns in the FALCON-unzip assembly.

In order to obtain chromosome-level organization, a Hi-C library was prepared from liver and brain tissue from the same individual as used for the PacBio assembly, and it was sequenced on a BGISEQ-500 sequencer. Mapping with Juicer v1.5.6 (Durand et al., 2016b) yielded 99 million informative Hi-C read pairs, which were used to scaffold the PacBio de novo assembly into chromosome-level organization using the 3D-DNA workflow (Dudchenko et al., 2017), followed by manual correction using Juicebox v1.9.8 (Durand et al., 2016a). The output assembly was polished using Pilon v1.22 (Walker et al., 2014), based on 50x Illumina paired-end coverage from the same individual.
Finally, a custom R script was applied to eliminate a set of small, nearly identical repeats that were deemed likely to be redundant haplotypes, based on analysis of the mapped read depth of a set of Illumina short reads from a previously sequenced herring population (Martinez Barrio et al., 2016).
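
For readers unfamiliar with the contiguity statistic used above to compare the PacBio contigs with the v1.2 scaffolds, the N50 is the length L such that sequences of length L or longer together contain at least half of all assembled bases. The following Python function is a minimal sketch of that standard definition. It is not the authors' code, the function name `n50` is illustrative, and the contig lengths are invented toy values rather than data from the herring assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the sequence at which the
    running total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("lengths must not be empty")


# Toy contig lengths (hypothetical, in arbitrary units); they sum to 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

Because the statistic depends only on the list of sequence lengths, the same function applies to contigs and to scaffolds. This is why a contig N50 and a scaffold N50 can be compared directly, even though scaffolds may contain undetermined bases (Ns) and contigs do not.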