Evolution of Gene-Rich Germline Restricted Chromosomes in Black-Winged Fungus Gnats Through Introgression (Diptera: Sciaridae)
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.08.430288; this version posted February 9, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Evolution of gene-rich germline restricted chromosomes in black-winged fungus gnats through introgression (Diptera: Sciaridae)

Christina N. Hodson1*, Kamil S. Jaron1, Susan Gerbi2, Laura Ross1

1. Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3JT, UK
2. Brown University, Division of Biology and Medicine, Providence, RI 02912, USA

* corresponding author: Christina Hodson, Institute of Evolutionary Biology, University of Edinburgh, UK, email: [email protected]

Short title: Evolution of germline restricted chromosomes in a fly

Abstract

Germline restricted DNA has evolved in diverse animal taxa, and is found in several vertebrate clades, nematodes, and flies. In these lineages, either portions of chromosomes or entire chromosomes are eliminated from somatic cells early in development, restricting portions of the genome to the germline. Little is known about why germline restricted DNA has evolved, especially in flies, in which three diverse families, Chironomidae, Cecidomyiidae, and Sciaridae, exhibit germline restricted chromosomes (GRCs).
We conducted a genomic analysis of germline restricted chromosomes in the fungus gnat Bradysia (Sciara) coprophila (Diptera: Sciaridae), which carries two large germline restricted “L” chromosomes. We sequenced and assembled the genome of B. coprophila, and used differences in sequence coverage and k-mer frequency between somatic and germ tissues to identify GRC sequence and compare it to the other chromosomes in the genome. We found that the GRCs in B. coprophila are large, gene-rich, and have many genes with paralogs on other chromosomes in the genome. We also found that the GRC genes are extraordinarily divergent from their paralogs, and have sequence similarity to another Dipteran family (Cecidomyiidae) in phylogenetic analyses, suggesting that these chromosomes have arisen in Sciaridae through introgression from a related lineage. These results suggest that the GRCs may have evolved through an ancient hybridization event, raising questions about how this may have occurred, how these chromosomes became restricted to the germline after introgression, and why they were retained over time.

Keywords: hybridization, L chromosome, non-Mendelian inheritance, reproduction, programmed DNA elimination, segregation distortion

Introduction

An underlying tenet of heredity is that all cells within an organism have the same genomic sequence. However, there are a surprising number of exceptions to this rule.
For instance, Boveri [1] noted in Ascaris nematodes that fragments of chromosomes were eliminated from somatic cells early in development, showing that in some cases germline/soma differentiation involves changes in the genomic composition of cells as well as regulatory changes. In addition to the loss of chromosomal fragments (referred to as “chromatin diminution”), another type of germline specialization involves the elimination of whole chromosomes from somatic cells, a phenomenon we believe was first noted in the Dipteran gnat Bradysia (Sciara) coprophila [2]. Both chromatin diminution and chromosome elimination are examples of programmed DNA elimination, which occurs in a developmentally regulated manner across a broad evolutionary range from ciliates to mammals, including more than 100 species from nine major taxonomic groups [3]. Programmed DNA elimination is not a rare phenomenon, yet remains poorly understood. Recently, however, genomic studies in several species have begun to address questions regarding its function and evolution.

Many examples of programmed DNA elimination involve regulated DNA elimination from somatic cells so that portions of the genome are restricted to the germline [3]. Germline restricted DNA, involving either portions of chromosomes (chromatin diminution) or entire chromosomes (chromosome elimination), has evolved repeatedly and is found in lampreys and hagfish (the most basal vertebrates), songbirds, nematodes, and flies [1,4–7]. Recent genomic work on lampreys and nematodes (with chromatin diminution) and songbirds (with chromosome elimination) has found that the germline restricted portions of the genome often carry protein coding genes involved in germ tissue maturation and function [8–11]. Therefore, a leading hypothesis is that germline restricted DNA may help resolve intralocus conflict between the germline and somatic cells [10,12].
However, although chromatin diminution and chromosome elimination have similar consequences, the initial evolution of these systems probably differs, as the mechanism of elimination is substantially different in these two systems.

In species with chromosome elimination, entire chromosomes are exclusively found in the germline: the germline restricted chromosomes (GRCs). Little is known about how these chromosomes arise and how they are related to the rest of the genome. One hypothesis is that they originate from B chromosomes [13], which are accessory non-essential chromosomes that are widespread in eukaryotes [14]. GRCs are similar to B chromosomes in that they are chromosomes in addition to the core genome (i.e. the chromosomes which are found in the somatic cells as well as the germ cells), with greater variation in presence/number of chromosomes than the core chromosome set. However, while B chromosomes are non-essential, recent genomic work in songbirds suggests that GRCs likely play an important, and perhaps fundamental, role in zebra finches [10] and are evolutionarily conserved across songbirds [15]. Furthermore, there is no clear evidence that GRCs spread through drive; unlike B chromosomes, they therefore most likely persist due to their functional importance, rather than as reproductive parasites. So while it is still possible that GRCs originated from B chromosomes and were subsequently “domesticated”, alternative explanations for their origin cannot be excluded, especially as studies of the origins of GRCs have so far focused only on their single origin among birds.
Here we focus on a different origin of GRCs: their origin and evolution in flies (Diptera).

GRCs are found in three dipteran families: the “K” chromosomes of non-biting midges (Chironomidae), the “E” chromosomes of gall gnats (Cecidomyiidae), and the “L” chromosomes of black-winged fungus gnats (Sciaridae) [4,16,17]. Each instance appears to have an independent origin, as GRCs show different properties in each lineage, and the three families are not sister clades [18,19]. While the evolutionary origins of these chromosomes remain obscure, GRCs are expected to have some function relating to reproduction; otherwise, they likely would not have been retained over time. The origin and evolution of GRCs in Sciaridae and Cecidomyiidae are particularly intriguing, as these families are relatively closely related, both belonging to the infraorder Bibionomorpha (although they are not sister clades [19]). Therefore, understanding how GRCs arose in these two lineages and what factors led to their evolution can provide a foundation from which we can answer many questions. For instance, we can start to unravel why GRCs arose in some Bibionomorpha families but not others, and compare the gene content and expression of GRC genes in two relatively closely related families.

Although both Sciaridae and Cecidomyiidae carry GRCs, the characteristics of these chromosomes differ between the two families, with Sciaridae carrying few (up to 4) large GRCs, and Cecidomyiidae carrying many (between 16 and 67) small GRCs (reviewed in [18,20]).
Therefore, theories for how GRCs arose differ between the two lineages. In Cecidomyiidae, the GRCs show some similarities in appearance to the core genome, and so it was originally proposed that they evolved through whole genome duplications followed by restriction of the duplicated chromosomes to the germline [21,22]. However, this idea remains controversial and lacks empirical support. In Sciaridae, by contrast, a comprehensive theory for the evolution of GRCs suggests that the GRCs evolved from the X chromosome through a series of conflicts between different parts of the genome [23].