bioRxiv preprint doi: https://doi.org/10.1101/233700; this version posted March 2, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Cryptic, extensive and non-random chromosome reorganization revealed by a butterfly chromonome

Authors

Jason Hill1,Ɨ, Pasi Rastas2, Emil A. Hornett11, Ramprasad Neethiraj1, Nathan Clark3, Nathan Morehouse4, Maria de la Paz Celorio-Mancera1, Jofre Carnicer Cols5,6, Heinrich Dircksen10, Camille Meslin3, Naomi Keehnen1, Peter Pruisscher1, Kristin Sikkink7, Maria Vives5,6, Heiko Vogel9, Christer Wiklund1, Alyssa Woronik1, Carol L. Boggs8, Sören Nylin1, Christopher Wheat1,Ɨ

Affiliations:
1 Population Genetics, Department of Zoology, Stockholm University, Stockholm, Sweden
2 Institute of Biotechnology (DNA Sequencing and Genomics), University of Helsinki, Helsinki, Finland
3 Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, USA
4 Department of Biological Sciences, University of Cincinnati, Cincinnati, USA
5 Department of Evolutionary Biology, Ecology and Environmental Sciences, University of Barcelona, 08028 Barcelona, Spain
6 CREAF, Global Ecology Unit, Autonomous University of Barcelona, 08193 Cerdanyola del Vallès, Spain
7 Department of Ecology, Evolution and Behavior, University of Minnesota, St Paul, MN, USA
8 Department of Biological Sciences, University of South Carolina, Columbia, SC, USA
9 Department of Entomology, Max Planck Institute for Chemical Ecology, D-07745 Jena, Germany
10 Functional Morphology, Department of Zoology, Stockholm University, Stockholm, Sweden
11 Department of Zoology, University of Cambridge, Cambridge, UK

Ɨ Authors for correspondence:
Jason Hill <[email protected]>
Christopher Wheat <[email protected]>

Abstract

Chromosome evolution, an important component of micro- and macroevolutionary dynamics [1–5], presents an enigma in the mega-diverse Lepidoptera [6]. While most species exhibit constrained chromosome evolution, with nearly identical haploid chromosome counts and chromosome-level shared gene content and collinearity among species despite more than 140 million years of divergence [7], a small fraction of species independently exhibit dramatic changes in chromosomal count due to extensive fission and fusion events that are facilitated by their holocentric chromosomes [7–9]. Here we address this enigma of simultaneous conservation and dynamism in chromosome evolution in our analysis of the chromonome (chromosome-level assembly [10]) of the green-veined white butterfly, Pieris napi (Pieridae, Linnaeus, 1758). We report an unprecedented reorganization of the standard Lepidopteran chromosome structure via more than 90 fission and fusion events that are cryptic at other scales, as the haploid chromosome number is identical to related genera and gene collinearity within the large rearranged segments matches other Lepidoptera. Furthermore, these rearranged segments are significantly enriched for clusters of functionally related genes and the maintenance of ancient telomeric ends. These results suggest an unexpected role for selection in shaping chromosomal evolution when the structural constraints of monocentric chromosomes are relaxed.
Butterflies and moths comprise nearly 10% of all described species [6] and inhabit diverse niches with varied life histories, yet they exhibit a striking similarity in their genome architecture despite 140 million generations [19] of divergence. The vast majority have a haploid chromosome number between 28 and 32 [7,11,12], and within chromosomes the gene content and order are remarkably similar among divergent species, as adduced by three previous chromonomes [7,13], BAC sequencing and chromosomal structure analyses [14,15]. Complicating this picture of conservation, haploid chromosome counts in species of Lepidoptera, as compared to all non-polyploid animals, exhibit the highest variance in number between species within a genus (n = 5 to 226 [16–18]), the highest single count (n = 226 [9]), and polymorphism in counts that does not affect fertility in crosses [3,19]. Lepidoptera tolerate such chromosomal variation due to their holokinetic chromosomes, which facilitate the successful inheritance of novel fission or fusion fragments [20]. Thus, while Lepidoptera can avoid the deleterious consequences of large-scale bouts of chromosomal fission and fusion, in the vast majority of cases such events are only found in young clades, suggesting an evolutionary cost of chromosomal rearrangement.

The P. napi chromonome was generated using DNA from inbred siblings from Sweden, a genome assembly using variable fragment size libraries (180 bp to 100 kb; N50-length of 4.2 Mb and a total length of 350 Mb), and a high-density linkage map generated using 275 full-sib larvae, which placed 122 scaffolds into 25 linkage groups, consistent with previous karyotyping of P. napi [21,22]. After assessment and correction, the total chromosome-level assembly was 299 Mb, comprising 85% of the total assembly size and 114% of the k-mer estimated haploid genome size, with 2943 scaffolds left unplaced (Supplementary Note 3).
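The assembly figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the `n50` helper and the toy scaffold lengths are hypothetical and are not taken from the authors' pipeline, and the k-mer genome size is back-calculated from the reported 114% rather than quoted from the text.

```python
def n50(lengths):
    """Return the N50: the largest length L such that scaffolds of
    length >= L together cover at least half of the total assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy scaffold lengths in Mb (hypothetical, NOT P. napi data).
toy_scaffolds_mb = [8, 6, 4, 3, 2, 1]
toy_n50 = n50(toy_scaffolds_mb)

# Figures reported in the text, in Mb.
assembly_mb = 350      # total length of the scaffold assembly
chromonome_mb = 299    # length placed into the 25 linkage groups

# Fraction of the assembly placed on chromosomes (text reports 85%).
placed_fraction = chromonome_mb / assembly_mb

# Genome size implied by "114% of the k-mer estimate" (an inference,
# not a value stated in the text).
implied_kmer_genome_mb = chromonome_mb / 1.14

print(f"toy N50: {toy_n50} Mb")
print(f"placed fraction: {placed_fraction:.1%}")
print(f"implied k-mer genome size: {implied_kmer_genome_mb:.0f} Mb")
```

The placed fraction rounds to the 85% reported in the text. The implied k-mer estimate should be read only as a consistency check on the two reported percentages.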
Subsequent annotation predicted 13,622 gene models with 9,346 functional predictions (Supplementary Note 4) and 94% of expected single copy genes (BUSCOs) were found complete (Supplementary Note 1), along with a typical mtDNA genome (Supplementary Figure 6).

The content and structure of the P. napi chromonome were compared with the available Lepidopteran chromonomes: the silk moth Bombyx mori (Bombycidae), the postman butterfly Heliconius melpomene (Nymphalidae), and the Glanville fritillary Melitaea cinxia (Nymphalidae). These latter three species exhibit gene collinearity along their chromosomes that is maintained even after chromosomal fusion and fission events, readily reflecting the history of these few events [13] (Fig. 1). After identifying the shared single copy orthologs (SCO) among these four species’ chromonomes, we placed these results into a comparative chromosomal context. Unexpectedly, nearly every P. napi chromosome was uniquely reorganized on the scale of megabasepairs in what appeared to be the result of a massive bout of fission and subsequent fusion events, including the Z chromosome (Fig. 1a). A detailed comparison of the size and number of these rearrangements was then made between P. napi and B. mori, as the latter has a high quality chromonome and a haploid chromosome count (n=28) closest to the Lepidopteran mode of n=31. Using the shared SCOs, 99 well defined blocks of collinear gene order (hereafter referred to as “collinear blocks”) were identified, with each collinear block having an average of 69 SCOs. Each P. napi chromosome contained an average of 3.96 (SD = 1.67) collinear blocks, which derived from an average of 3.5 different B.
mori chromosomes. In P. napi, the average collinear block length was 2.82 Mb (SD = 1.97 Mb) and contained 264 genes in our annotation (SD = 219).

This novel chromosomal reorganization was validated using four complementary but independent approaches to assess the scaffold joins used to make the chromosomal level assembly. First, we generated a second linkage map for P. napi and used this to confirm the 25 linkage groups, our assembly scaffolds and their placement within chromosomes (Fig. 2b,c; Supplementary Fig. 2). Second, since mate-pair (MP) reads spanning the scaffold joins that were indicated by the first linkage map provide an independent assessment of the validity of those joins, we quantified the number of MP reads spanning each base pair position along chromosomes, revealing consistent support for the scaffold joins within chromosomes (Fig. 2a; Supplementary Fig. 2, Note 7). Third, we aligned the scaffolds of a recent high quality genome of P. rapae [23] to our P. napi chromonome, identifying P. rapae scaffolds that spanned the scaffold joins within P. napi, finding support for 71 of the 97 joins (Supplementary Fig. 5). Fourth, since the P. napi scaffold joins are not associated with B. mori collinear blocks of SCOs, we identified boundaries of B. mori collinear blocks that spanned scaffold joins within P. napi chromosomes, finding support for 62 of the 97 scaffold joins in P. napi and none against them (Fig. 2d; Supplementary Fig. 2, Notes 8, 9). Thus, the Pieris napi chromonome is well supported.

We next compared the P. napi chromosomal structure to the additional genomes available in the Lepidoptera (n=20).
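The counts reported for the collinear blocks and for the scaffold-join validation are mutually consistent, as the short check below illustrates. It assumes (this is not stated in the text) that scaffolds are ordered linearly within each linkage group, so that the number of joins equals the number of placed scaffolds minus the number of linkage groups. All variable names are illustrative.

```python
# Figures reported in the text.
placed_scaffolds = 122   # scaffolds placed by the linkage map
linkage_groups = 25      # P. napi haploid chromosome number
collinear_blocks = 99    # blocks shared with B. mori
reported_joins = 97      # scaffold joins assessed in the validation

# Assumption: a chain of k scaffolds on one chromosome has k - 1 joins,
# so the total is (placed scaffolds) - (linkage groups).
expected_joins = placed_scaffolds - linkage_groups

# Mean number of collinear blocks per chromosome (text reports 3.96).
blocks_per_chromosome = collinear_blocks / linkage_groups

# Joins supported by each independent line of evidence.
supported = {
    "P. rapae scaffolds": 71,
    "B. mori collinear blocks": 62,
}
support_pct = {
    source: 100 * count / reported_joins
    for source, count in supported.items()
}

print(f"expected joins: {expected_joins}")
print(f"blocks per chromosome: {blocks_per_chromosome:.2f}")
for source, pct in support_pct.items():
    print(f"{source}: {pct:.0f}% of joins supported")
```

Under that assumption the expected join count matches the 97 joins assessed in the text, and the mean of 3.96 blocks per chromosome follows directly from 99 blocks across 25 chromosomes.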
Like most eukaryotic genomes, most Lepidopteran genome assemblies are not chromonomes, complicating comparative assessments of chromosomal structure (Supplementary Figure 6). To overcome this limitation, we queried each scaffold of each of these genomes for SCOs that were shared with B. mori and P. napi, and then quantified whether the scaffold supported the chromosomal structure of B.