
Supplementary Information for Evolution of gene-rich germline restricted chromosomes in black-winged fungus gnats through introgression (Diptera: Sciaridae)

Christina N. Hodson, Kamil S. Jaron, Susan Gerbi, Laura Ross

Supplementary Text 1: Detailed description of the chromosome inheritance system in Bradysia coprophila.

The chromosome system in B. coprophila, and in sciarids generally, is unique in several ways, including chromosome transmission patterns, sex determination, and the presence of GRCs (see Fig 1 for transmission patterns). All sciarids studied to date have a system of reproduction known as paternal genome elimination, in which males transmit only maternally inherited chromosomes to offspring [1,2]. Paternal genome elimination has evolved independently in at least seven lineages, including the related gnat family Cecidomyiidae [3]. In all lineages with paternal genome elimination, meiosis occurs in a Mendelian manner in females, but male meiosis is aberrant. In male meiosis in sciarids, there is a monopolar spindle in meiosis I. Maternally inherited chromosomes move towards the monopolar spindle, while paternally derived chromosomes move away from it and are discarded in a bud of cytoplasm [2]. Thus, only the maternal complement of chromosomes is transmitted to the sperm. This phenomenon in B. coprophila was, to our knowledge, the first example of “imprinting”, by which the cell recognizes the maternal or paternal origin of a chromosome [4]. Interestingly, the GRCs always segregate with the maternal set of chromosomes. Therefore, all of the GRCs (typically two in B. coprophila) are transmitted through sperm, regardless of whether they are of maternal or paternal origin [4]. This is one of the few examples of chromosomes that seem to evade paternal genome elimination. In the second division of meiosis in B. coprophila there is a bipolar spindle; however, there is nondisjunction of the maternal X chromosome in this division, such that only one sperm develops through male meiosis. This sperm contains a haploid set of autosomes, typically two GRCs, and two X chromosomes [1,2]. There is some variation in the number of GRCs in each sperm, ranging from 0 to 4 in B. coprophila [4]. Variation in GRC number is thought to be due to nondisjunction events that can occur in early germ cell divisions; however, the majority of sperm (78%) carry two GRCs [4]. In female meiosis, the GRCs form a bivalent, and one GRC segregates into each egg (i.e. meiosis is typical) [4].

As a result of the unusual type of meiosis in male sciarids, B. coprophila zygotes typically carry a diploid set of autosomes, three X chromosomes (one inherited from their mother and two from their father), and three GRCs. All sciarids have an XO sex chromosome system (i.e. males are XO and females are XX, and there is no Y chromosome), but sex is determined via X chromosome elimination from somatic cells early in development. In the 7th to 9th cleavage divisions, either one X chromosome (for females) or two X chromosomes (for males) are eliminated from somatic cells [5]. Elimination occurs due to a failure of separation of the sister chromatid arms during mitosis, resulting in the chromosomes being left on the metaphase plate and not being incorporated into daughter nuclei [6]. It is thought that the number of X chromosomes eliminated is maternally controlled, since B. coprophila females are monogenic and produce exclusively female or exclusively male progeny [7]. Females that produce female offspring carry a large inversion on the X chromosome, and this inversion is always associated with the female-producing phenotype [2,8]. GRC elimination from somatic cells occurs in a remarkably similar manner, except that it occurs in the 5th to 6th cleavage divisions and all GRCs are eliminated from somatic cells [5].

In germ cells, there is also an elimination of one X chromosome and typically one GRC. In this case, elimination occurs in a somewhat mysterious manner in early germ cell development, when one X chromosome and all but two GRCs are ejected from the germ cell through a cytoplasmic bud [9,10]. Therefore, early germ cells of both males and females in B. coprophila have the same chromosome constitution, with a diploid set of autosomes, X chromosomes, and GRCs. This mechanism also regulates the number of GRCs and prevents their accumulation over time, as all but two GRCs are always eliminated from early germ cells.
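The typical chromosome counts described in the three paragraphs above can be tallied in a short bookkeeping sketch. It is only an illustration of the numbers given in the text: the class and function names are hypothetical, autosomes are counted as whole haploid sets, and the known variation in GRC number per sperm (0 to 4) is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Karyotype:
    """Chromosome counts for one cell (names are illustrative only)."""
    autosome_sets: int  # number of haploid autosome sets
    x: int              # number of X chromosomes
    grc: int            # number of germline restricted chromosomes


# Typical gametes as described in the text.
EGG = Karyotype(autosome_sets=1, x=1, grc=1)
SPERM = Karyotype(autosome_sets=1, x=2, grc=2)


def fertilise(egg: Karyotype, sperm: Karyotype) -> Karyotype:
    """Combine the egg and sperm complements into the zygote."""
    return Karyotype(
        egg.autosome_sets + sperm.autosome_sets,
        egg.x + sperm.x,
        egg.grc + sperm.grc,
    )


def somatic(zygote: Karyotype, sex: str) -> Karyotype:
    """Somatic cells lose one X (female) or two X (male) and all GRCs."""
    x_lost = {"female": 1, "male": 2}[sex]
    return Karyotype(zygote.autosome_sets, zygote.x - x_lost, 0)


def germline(zygote: Karyotype) -> Karyotype:
    """Early germ cells lose one X and all but two GRCs."""
    return Karyotype(zygote.autosome_sets, zygote.x - 1, min(zygote.grc, 2))


if __name__ == "__main__":
    zygote = fertilise(EGG, SPERM)
    print("zygote:", zygote)
    print("female soma:", somatic(zygote, "female"))
    print("male soma:", somatic(zygote, "male"))
    print("germline:", germline(zygote))
```

With the typical gametes, the zygote carries two autosome sets, three X chromosomes, and three GRCs, and the germline of both sexes ends up with two of each.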

Less is known about the mechanism of the chromosome system in other sciarid species, but across the family all species studied exhibit paternal genome elimination and X chromosome elimination early in development as the means of sex determination. Although only a handful of species have been studied in detail, evidence suggests that most, but not all, sciarid species carry GRCs, with the number of GRCs ranging from 0 to 4 [2]. The two species in which GRCs are absent are closely related to each other, suggesting that GRCs were lost in these species. Additionally, monogeny, in which females produce offspring of only one sex, is present in some, but not all, species across Sciaridae [2]. There seem to be many transitions in this trait across Sciaridae, with some species being monogenic, some being digenic (i.e. females produce offspring of both sexes), and some having a mix of these two types of females. Very little is known about the genetic underpinnings of this trait.

Overall, the evidence suggests that paternal genome elimination and X chromosome elimination as a means of sex determination evolved once at the base of Sciaridae. It is less clear how GRCs and monogeny evolved. It was originally suggested that the presence of GRCs and monogeny are related, as Bradysia ocellaris, a species that has lost GRCs, is digenic. Additionally, a lab line of Bradysia impatiens that was bred to lose GRCs transitioned from monogenic to digenic reproduction [4]. However, these observations are anecdotal, and there are also several species with digenic reproduction that do carry GRCs (reviewed in [2]).

Cecidomyiidae, gall gnats also in the infraorder Bibionomorpha, have a similar reproduction system to Sciaridae, in that both families exhibit paternal genome elimination, X chromosome elimination as a means of sex determination, GRCs, and a mix of monogenic and digenic species [11]. However, cecidomyiid species have two X chromosomes (i.e. females are X1X1X2X2 and males are X1X2OO), and the factor that controls X chromosome elimination in offspring is associated with an inversion on an autosome (rather than on the X chromosome, as in B. coprophila) [12]. Additionally, GRC characteristics are quite different in this family, with species containing many small GRCs, which are maternally transmitted and do not seem to form bivalents during female meiosis.

Supplementary Text 2: Supplementary Methods

DNA extraction procedure

For gDNA extractions, we followed a similar protocol for both short read and long read libraries. All centrifugation steps took place at 4°C and 13,000 rpm, unless otherwise stated. Tissue samples were stored at -80°C until DNA extraction. Before extraction, we briefly froze the samples in liquid nitrogen and crushed the tissue with a micro-pestle. We then added 360 μl of Cell Lysis Buffer (Qiagen) with 40 μl of Proteinase K (20 mg/ml) (Qiagen), and incubated overnight in a shaking incubator at 55°C. We then added 4 μl of RNase A (100 mg/ml), mixed by inverting the sample tube, and incubated the sample for 1 hour at 37°C. We cooled the sample on ice for 5 min, then added 133 μl of Protein Precipitate Buffer (Qiagen), mixed by gently vortexing the sample, and incubated on ice for 10 min. We then centrifuged for 15 min at 4°C, transferred the supernatant to a new tube containing 400 μl isopropanol, and mixed by inversion. For the short read samples, we then incubated the sample overnight at -20°C, while for the long read samples, we incubated the sample for 10 min at room temperature. We then centrifuged the sample for 20 min, and discarded the supernatant by inverting the tube. We washed the DNA pellet twice with 300 μl freshly prepared 70% EtOH, then centrifuged the sample for 20 min, and carefully removed the supernatant by pipetting. We air dried the DNA pellet for approximately 30 min, and resuspended the pellet in 60 μl TE after it dried.

Long-read assembly

In addition to the short read sequencing data, we also extracted DNA (using the protocol above) from approximately 250 male testes to generate long read germline data. We sequenced the sample at Liverpool Genomics using a low input library prep procedure and PacBio sequencing on 3 SMRT cells. We used Redbean (previously known as wtdbg2; v2.5) with the parameters -L 1000 -x sq for the initial genome assembly [13]. We then polished the assembly three times with the long read library, using minimap2 (v2.17-r941) with the parameters -c -x map-pb to map the long reads to the assembly [14] and racon (v1.4.10) with the parameter -u to polish the assembly [15]. Finally, we polished the assembly twice with short read data (using only the germline library), again mapping with minimap2 and polishing with racon (see Supplementary Table 1 for assembly statistics).
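The assembly and long read polishing steps described above can be sketched as a small script that only builds and prints the command lines (it does not run them). The flags -L 1000 -x sq, -c -x map-pb, and -u are those stated in the text. The file names, the input/output flags (-i, -fo), and the wtpoa-cns consensus step are assumptions taken from the usual wtdbg2 workflow, not details reported here. The two short read polishing rounds are not sketched because their mapping parameters are not given.

```python
import shlex


def assembly_commands(reads, prefix="asm"):
    """Initial assembly with the stated parameters (-L 1000 -x sq).

    The -i/-fo flags and the wtpoa-cns consensus step are assumed from the
    standard wtdbg2 workflow; file names are hypothetical.
    """
    return [
        ["wtdbg2", "-x", "sq", "-L", "1000", "-i", reads, "-fo", prefix],
        ["wtpoa-cns", "-i", f"{prefix}.ctg.lay.gz", "-fo", f"{prefix}.raw.fa"],
    ]


def long_read_polishing_commands(assembly, reads, rounds=3, prefix="polish"):
    """Return (argv, stdout_file) pairs for repeated minimap2 + racon rounds.

    Each round maps the long reads to the current assembly and then polishes
    it; the polished output becomes the target of the next round.
    """
    steps = []
    current = assembly
    for i in range(1, rounds + 1):
        paf = f"{prefix}.round{i}.paf"
        polished = f"{prefix}.round{i}.fa"
        # minimap2 writes PAF (with CIGAR because of -c) to stdout.
        steps.append((["minimap2", "-c", "-x", "map-pb", current, reads], paf))
        # racon takes <sequences> <overlaps> <target> and writes FASTA to stdout.
        steps.append((["racon", "-u", reads, paf, current], polished))
        current = polished
    return steps


if __name__ == "__main__":
    reads = "germline_pacbio.fa.gz"  # hypothetical file name
    for argv in assembly_commands(reads):
        print(shlex.join(argv))
    for argv, stdout_file in long_read_polishing_commands("asm.raw.fa", reads):
        print(f"{shlex.join(argv)} > {stdout_file}")
```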

Compared to the short read assembly, the long read assembly showed two problems: much lower mapping rates of GRC k-mers (Supplementary Fig 2B) and a lower BUSCO score (93.6% complete BUSCOs vs. 98.3% for the short read assembly). We suspect that these problems stemmed from the high error rates of long reads combined with high levels of paralogy across the genome, which hindered precise genome polishing and subsequently led to frameshifts in gene models. This problem could be resolved in the future with newer sequencing approaches, such as HiFi reads with much lower error rates, or haplotagging. However, the long read genome assembly still featured much higher contiguity (N50: 576,242 bp vs. 18,920 bp for the short read assembly). Therefore, we used the short read assembly for annotation and gene level comparisons, but used the long read assembly to link individual GRC genes found in the short read assembly for the collinearity analysis (Fig 3C).
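For readers unfamiliar with the contiguity statistics compared above, the following is a minimal sketch of how N50 and L50 are conventionally computed from a list of scaffold lengths. The function name and the toy lengths are illustrative and are not taken from the assemblies in this study.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total of lengths,
    taken from longest to shortest, first reaches half of the assembly size.
    L50 is the number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


if __name__ == "__main__":
    toy_lengths = [10, 8, 6, 4, 2]  # toy values, total size 30
    print(n50_l50(toy_lengths))
```

A higher N50 together with a lower L50, as in the long read assembly, indicates that the same amount of sequence is contained in fewer, longer scaffolds.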

Supplementary Figures/Tables

[Figure: blobplot with marginal histograms. Axis labels: GC proportion, Coverage, Span (kb).]

Supplementary Fig 1. Blobplot of the unfiltered assembly generated from both germ and somatic libraries, showing scaffold coverage vs. scaffold GC (the size of each dot indicates scaffold size and the colour indicates taxonomic assignment). Reads mapping to scaffolds with a GC content between 0.14 and 0.51 and a coverage higher than 7 were retained for the final assembly.
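The filter stated in the caption above can be written as a simple predicate. This is a sketch only: the function name and the example scaffolds are hypothetical, and whether the GC bounds are inclusive is an assumption, since the caption does not say.

```python
def keep_scaffold(gc, coverage, gc_min=0.14, gc_max=0.51, min_cov=7.0):
    """Apply the GC and coverage thresholds given in the figure caption.

    Assumption: the GC bounds are treated as inclusive; coverage must be
    strictly higher than the minimum.
    """
    return gc_min <= gc <= gc_max and coverage > min_cov


if __name__ == "__main__":
    # Hypothetical scaffolds: name -> (GC proportion, mean coverage).
    scaffolds = {
        "scaffold_1": (0.35, 30.0),  # retained
        "scaffold_2": (0.62, 25.0),  # GC too high, e.g. a likely contaminant
        "scaffold_3": (0.33, 3.0),   # coverage too low
    }
    retained = [name for name, (gc, cov) in scaffolds.items() if keep_scaffold(gc, cov)]
    print(retained)
```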

Supplementary Table 1. Summary statistics for the short read and long read assemblies used in this study. The short read assembly was used for gene prediction as it was closer to the expected genome size than the long read assembly and was also more complete according to BUSCO assessment.

                      Short read (Illumina)   Long read (PacBio)
Size                  398 Mb                  415 Mb
# scaffolds           46,532                  3,505
N50                   18.9 kb                 576 kb
L50                   5,203                   135
GC                    35.4%                   35.8%
BUSCO completeness    98.3%                   93.6%

[Figure: two histograms. Panel A: short read assembly; panel B: long read assembly. x-axis: score (mapped k-mers/scaffold length); y-axis: frequency; series: GRC, A, X.]

Supplementary Fig 2. Distributions of scores used in the k-mer identification technique. A. Histogram of k-mer assignment scores for scaffolds of each chromosome type in the short read assembly used throughout the manuscript. For each scaffold, the score is defined as the number of k-mers with an exact match to the scaffold from the chromosome type that contributes the majority of matching k-mers, divided by the scaffold length. We assigned scaffolds to the GRCs (orange) only if their score was higher than 0.8, while we assigned scaffolds to the autosomes or the X chromosome (green and blue, respectively) if their score was higher than 0.4. B. Histogram of k-mer assignment scores in the long read assembly (see Supplementary Methods). The scores are substantially lower, especially for differentiating autosomes and the X chromosome. This assembly was used only for anchoring GRC genes in longer blocks for the collinearity analysis.
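The scoring rule in the caption above can be illustrated with a short sketch. The thresholds (0.8 for GRC, 0.4 for autosomes and X) come from the caption; the function name, the input format (a dictionary of exact-match k-mer counts per chromosome type), and the example numbers are assumptions made for illustration.

```python
# Score thresholds from the caption: stricter for GRC than for A and X.
THRESHOLDS = {"GRC": 0.8, "A": 0.4, "X": 0.4}


def assign_scaffold(kmer_hits, scaffold_length):
    """Return (assignment, score) for one scaffold.

    kmer_hits maps each chromosome type ("GRC", "A", "X") to the number of
    its k-mers with an exact match to the scaffold. The type with the most
    matching k-mers is the candidate; the score is its count divided by the
    scaffold length. The scaffold is left unassigned (None) if the score
    does not exceed the threshold for that type.
    """
    best = max(kmer_hits, key=kmer_hits.get)
    score = kmer_hits[best] / scaffold_length
    assignment = best if score > THRESHOLDS[best] else None
    return assignment, score


if __name__ == "__main__":
    # Hypothetical counts for three 1,000 bp scaffolds.
    print(assign_scaffold({"GRC": 900, "A": 50, "X": 10}, 1000))
    print(assign_scaffold({"GRC": 700, "A": 50, "X": 10}, 1000))
    print(assign_scaffold({"GRC": 0, "A": 500, "X": 100}, 1000))
```

In this toy example the second scaffold stays unassigned, because a GRC candidate needs a score above 0.8, whereas the third is assigned to the autosomes with a score of only 0.5.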

Supplementary Table 2. Size and proportion of the reference genome [16] anchored to specific chromosomes in B. coprophila, and the number of GRC paralogs and GRC collinear blocks anchored to each chromosome in the reference assembly.

                                  A-II      A-III     A-IV      X
Size (Mb)                         48-62     66-71     88-94     48-62
Proportion anchored               20-46%    8-19%     37-52%    93-100%
GRC paralogs (number of genes)    128       7         108       119
GRC collinear blocks              14        1         12        12

[Figure: histogram. x-axis: GRC gene coverage; y-axis: frequency.]

Supplementary Fig 3. Histogram of the mean coverage of all GRC genes. The coverage of GRC genes is bimodal, with one peak at 24.6x coverage and another at 30.3x coverage.

[Figure: a central histogram (x-axis: proportion of genes fitting coverage pattern; y-axis: frequency) surrounded by four scatter plots (x-axis: gene position in block; y-axis: gene coverage) labelled with the scores 0.64 (3 excluded genes), 0.67 (all genes low coverage), 0.83, and 1.0.]

Supplementary Fig 4. Coverage of genes in GRC-GRC collinear blocks. The two blocks are coloured with different shades of orange. The histogram shows the proportion of genes in each block that fit the expected coverage pattern (i.e. one block having all genes with a higher coverage than the other). Only genes with a coverage less than 45x were considered. Four blocks with different coverage patterns are shown, with the score assessing how many paralogs fit the expected coverage pattern in the top corner of each plot. The plot in the top left corner has several genes with a higher coverage than expected (which were excluded), with the other paralogs in the block having inconsistent coverage patterns. The top right corner shows a case where all genes have coverage that would fit in the lower coverage peak (in Supplementary Fig 3) and an inconsistent coverage pattern (likely because these blocks are both located on the same GRC). The lower left corner shows a case where most paralogs fit the expected coverage pattern except one set of paralogs, which has an intermediate coverage level. Most blocks (13/23) show the same pattern as the block in the lower right corner, where all genes in one block have a higher coverage than the genes in the other block.
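One way to compute a score of the kind shown in each panel is sketched below. This is an interpretation of the caption rather than the procedure used in the study: it assumes that paralogs are compared pairwise by position in the two collinear blocks, that a pair is dropped if either gene has a coverage of 45x or more, and that the score is the larger of the two fractions of pairs in which one block has the higher coverage. The function name and coverage values are hypothetical.

```python
def coverage_pattern_score(block_a, block_b, max_cov=45.0):
    """Proportion of paralog pairs fitting the expected coverage pattern.

    block_a and block_b hold the coverage of paralogous genes in the same
    order. Pairs with any gene at or above max_cov are excluded (assumed
    handling of the 45x cutoff). The expected pattern is that one block has
    the higher coverage for every pair, so the score is the fraction of
    pairs agreeing with the majority direction (between 0.5 and 1.0 when
    there are no ties).
    """
    pairs = [(a, b) for a, b in zip(block_a, block_b) if a < max_cov and b < max_cov]
    if not pairs:
        raise ValueError("no paralog pairs below the coverage cutoff")
    a_higher = sum(a > b for a, b in pairs)
    b_higher = sum(b > a for a, b in pairs)
    return max(a_higher, b_higher) / len(pairs)


if __name__ == "__main__":
    # Hypothetical block pair in which one block is consistently higher.
    print(coverage_pattern_score([30, 31, 29, 32], [24, 25, 23, 26]))
    # Hypothetical block pair with one reversed pair and one excluded pair.
    print(coverage_pattern_score([30, 24, 31, 50], [24, 30, 25, 26]))
```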

Supplementary Fig 5. Summary of universal single-copy ortholog (BUSCO) results for all dipteran species in the phylogenetic analyses. Exechia fusca was excluded from analyses as the proportion of complete BUSCOs was low (54%). In Bradysia coprophila, 39.2% of the BUSCO genes were duplicated.

Supplementary Table 3. Genomic location of universal single-copy orthologs (BUSCO) in Bradysia coprophila. BUSCO assessment was conducted with the insecta_odb10 database. The genomic location of genes was identified with both coverage and k-mer identification techniques. Categories indicated with * were used in phylogenetic analyses (Fig 5A) to determine the phylogenetic position of GRC genes, and individual gene trees were examined (Fig 5C) for the paralogs in the categories indicated with * and **.

BUSCO type     Chromosome          Frequency    GRC related
Single-copy    A                   521          No
               X                   182          No
               GRC                 106          Yes
Duplicated     A-GRC*              291          Yes
               X-GRC*              81           Yes
               GRC-GRC             18           Yes
               A-A                 6            No
               A-X                 1            No
Multi-copy     A-GRC-GRC**         56           Yes
               X-GRC-GRC**         30           Yes
               GRC-GRC-GRC         3            Yes
               A-A-GRC             3            Yes
               A-X-GRC             2            Yes
               A-GRC-GRC-GRC       1            Yes
               X-X-GRC-GRC-GRC     1            Yes

[Figure: six gene trees (panels A to F) with node support values and scale bars. Tips include Bradysia coprophila (core) and Bradysia coprophila (GRC) copies alongside the other dipteran species in the analysis.]

Supplementary Fig 6. Examples of GRC gene trees with various topologies: (A) with one GRC copy rooted in Cecidomyiidae; (B) with one GRC copy rooted in Sciaridae; (C) with two GRC copies, both in Cecidomyiidae; (D) with two GRC copies, both in Sciaridae; (E) with two GRC copies, one in Cecidomyiidae and the other in Sciaridae; (F) with GRC copies unplaced (without significant nodes) or branching with a species from any other family. These six gene trees are representative examples of the individual categories of gene tree topologies in Fig 5B.

Supplementary Fig 7. Terminal branch length distribution of GRC genes. Branch length distribution of GRC copies of BUSCO genes plotted with respect to their phylogenetic position (at family level); means are shown by dashed lines. Branch lengths of BUSCO genes on GRCs within Cecidomyiidae (violet) are significantly longer than branches found within Sciaridae (teal; p-value < 0.0001), suggesting that the genes on GRCs found within Sciaridae might be due to gene duplications and translocations within Sciaridae after the GRCs were acquired.

Supplementary Text 3: Approximate dating of the introgression of GRCs

We used Baltic amber records to roughly estimate the date of the introgression of GRCs from Cecidomyiidae to Sciaridae. The records of the early diversification of Sciaridae found in the amber are dated to ~44 my old [17,18]. The common ancestor of the family is presumably somewhat older than that; we therefore roughly estimate that the common ancestor of Sciaridae lived 50 mya. We can use this estimate to time calibrate the phylogeny of 340 BUSCO genes (see Fig 5). Using this logic, the common ancestor of Sciaridae and Cecidomyiidae lived 147 mya (calculated using the sum of branches from Sciaridae backwards). The isolation of Sciaridae then happened ~31 my after the split from Cecidomyiidae. We hypothesise that the introgression must have happened after that, as no other Bibionomorpha families have GRCs or show any other signs of hybridization with ancestors of Cecidomyiidae [19,20]. The hybridisation therefore probably happened on the Sciaridae branch before the diversification of the family, approximately 116-50 mya, that is, between 31 and 97 my after the split of the original ancestors of the two families.
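The arithmetic behind the window stated above can be checked with a few lines. The three input values are the estimates given in the text; the variable and function names are illustrative.

```python
# Estimates stated in the text (millions of years).
SPLIT_MYA = 147                # common ancestor of Sciaridae and Cecidomyiidae
CROWN_MYA = 50                 # assumed age of the common ancestor of Sciaridae
ISOLATION_AFTER_SPLIT_MY = 31  # isolation of the Sciaridae branch after the split


def introgression_window():
    """Return the window as (oldest_mya, youngest_mya, min_divergence, max_divergence)."""
    oldest_mya = SPLIT_MYA - ISOLATION_AFTER_SPLIT_MY  # start of the Sciaridae branch
    youngest_mya = CROWN_MYA                           # diversification of the family
    min_divergence = SPLIT_MYA - oldest_mya            # my since the split, lower bound
    max_divergence = SPLIT_MYA - youngest_mya          # my since the split, upper bound
    return oldest_mya, youngest_mya, min_divergence, max_divergence


if __name__ == "__main__":
    print(introgression_window())
```

Run as is, this reproduces the window of 116-50 mya and the divergence of 31 to 97 my quoted in the text.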

These calculations must be taken with a large grain of salt, as substitution rates change over time and our calibration is relatively crude, using only one reference point. However, they suggest that hybridisation of extremely divergent species (31-97 my) can have important evolutionary consequences. This is not the first record of viable hybrids of two extremely diverged species. Recently, a successful hybridization of Russian sturgeon and American paddlefish was accomplished [21]. However, this was a lab-generated hybrid, not a result of mating in the wild. Other hybridisation events between diverged animals that resulted in gene flow are found in Nasonia wasps [22], sea squirts [23], and burrowing frogs [24]. The B chromosome in Nasonia wasps is thought to have arisen through hybridization with a Trichomalopsis wasp; the two lineages are estimated to have diverged for 2.6 my [25]. The B chromosome also has a substantial effect on Nasonia wasps, as it affects sex determination in individuals that carry it. In the sea squirt, gene flow appeared after secondary contact following more than 3 my of divergence of the two lineages [23], which is already at the upper boundary of known systems with ongoing gene flow. Gene flow between more remote lineages, as in burrowing frogs, seems to be facilitated by polyploidisation [24]. A similar case is found in the Arabidopsis lyrata and A. arenosa species complex. Both species have diploid and tetraploid forms, and while the diploid variants are fully reproductively isolated, the tetraploid variants form viable hybrids, generating an indirect route for gene flow between these ~20 my diverged lineages [26]. It appears that hybridization events of extremely diverged species are always associated with polyploidy. Isolation of the two genomic copies in separate instances increases stability, as recombination does not break up already-working sets of within-subgenome co-adapted genes. It seems likely, given that the size of the GRCs in B. coprophila is comparable to the size of an entire cecidomyiid genome (the genome size of Mayetiola destructor, for example, is 158 Mb [27]), that the originally introgressed GRCs in the common ancestor of Sciaridae carried a full genomic copy of the ancestral cecidomyiid genome. We speculate that GRCs originated in Sciaridae through introgression of the full cecidomyiid genome. The introgression might have directly resulted in GRCs or might have been followed by restriction of the introgressed genome to the germline. Sciaridae species therefore represent a rare case of germline-specific polyploids.

Supplementary References

1. Metz CW. Chromosome Behavior, Inheritance and Sex Determination in Sciara. Am Nat. 1938;72: 485–520.

2. Gerbi SA. Unusual chromosome movements in sciarid flies. Results Probl Cell Differ. 1986;13: 71–104. Available: http://www.ncbi.nlm.nih.gov/pubmed/3529273

3. Gardner A, Ross L. Mating ecology explains patterns of genome elimination. Ecol Lett. 2014;17: 1602–12. doi:10.1111/ele.12383

4. Crouse HV, Brown A, Mumford BC. L-Chromosome Inheritance and the Problem of Chromosome “Imprinting” in Sciara (Sciaridae, Diptera). Chromosoma. 1971;34: 324–339.

5. Du Bois AM. Chromosome behavior during cleavage in the eggs of Sciara coprophila (Diptera) in the relation to the problem of sex determination. Zeitschrift für Zellforsch und Mikroskopische Anat. 1933;19: 595–614. doi:10.1007/BF00393361

6. de Saint Phalle B, Sullivan W. Incomplete sister chromatid separation is the mechanism of programmed chromosome elimination during early Sciara coprophila embryogenesis. Development. 1996;122: 3775–3784.

7. Metz CW, Schmuck LM. Unusual progenies and the sex chromosome mechanism in Sciara. Proc Natl Acad Sci U S A. 1929;15: 863–866.

8. Crouse HV, Gerbi SA, Liang CM, Magnus L, Mercer IM. Localization of ribosomal DNA within the proximal X heterochromatin of Sciara coprophila (Diptera, Sciaridae). Chromosoma. 1977;64: 305–318. doi:10.1007/BF00294938

9. Rieffel SM, Crouse HV. The elimination and differentiation of chromosomes in the germ line of Sciara. Chromosoma. 1966;19: 231–276.

10. Perondini ALP, Ribeiro AF. Chromosome elimination in germ cells of Sciara embryos: Involvement of the nuclear envelope. Invertebr Reprod Dev. 1997;32: 131–141. doi:10.1080/07924259.1997.9672614

11. White MJD. Animal cytology and evolution. 3rd ed. Cambridge: Cambridge Univ Press; 1973.

12. Benatti TR, Valicente FH, Aggarwal R, Zhao C, Walling JG, Chen MS, et al. A neo-sex chromosome that drives postzygotic sex determination in the Hessian fly (Mayetiola destructor). Genetics. 2010;184: 769–777. doi:10.1534/genetics.109.108589

13. Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat Methods. 2020;17: 155–158. doi:10.1038/s41592-019-0669-3

14. Li H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34: 3094–3100. doi:10.1093/bioinformatics/bty191

15. Vaser R, Sović I, Nagarajan N, Šikić M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27: 737–746. doi:10.1101/gr.214270.116

16. Urban JM, Foulk MS, Bliss JE, Coleman CM, Lu N, Mazloom R, et al. Single-molecule sequencing of long DNA molecules allows high contiguity de novo genome assembly for the fungus fly, Sciara coprophila. bioRxiv. 2020; 1–65. doi:10.1017/CBO9781107415324.004

17. Roschmann F, Mohrig W. Die Trauermücken des Sächsischen Bernsteins aus dem Untermiozän von Bitterfeld/Deutschland (Diptera, Sciaridae). Dtsch Entomol Zeitschrift. 1995;42: 17–54.

18. Blagoderov V, Grimaldi D. Fossil Sciaroidea (Diptera) in Cretaceous Ambers, Exclusive of Cecidomyiidae, Sciaridae, and Keroplatidae. Am Museum Novit. 2004;3433: 1. doi:10.1206/0003-0082(2004)433<0001:fsdica>2.0.co;2

19. Le Calvez J. Morphologie et comportement des chromosomes dans la spermatogenese de quelques Mycetophilides. Chromosoma. 1947; 137–165.

20. Fahmy OG. The mechanism of chromosome pairing during meiosis in male Apolipthisa subincana (Mycetophilidae, Diptera). J Genet. 1949;49: 246–263. doi:10.1007/BF02986079

21. Káldy J, Mozsár A, Fazekas G, Farkas M, Fazekas DL, Fazekas GL, et al. Hybridization of Russian sturgeon (Acipenser gueldenstaedtii, Brandt and Ratzeberg, 1833) and American paddlefish (Polyodon spathula, Walbaum 1792) and evaluation of their progeny. Genes (Basel). 2020;11: 1–17. doi:10.3390/genes11070753

22. McAllister BF, Werren JH. Hybrid origin of a B chromosome (PSR) in the parasitic wasp Nasonia vitripennis. Chromosoma. 1997;106: 243–253. doi:10.1007/s004120050245

23. Roux C, Tsagkogeorga G, Bierne N, Galtier N. Crossing the species barrier: Genomic hotspots of introgression between two highly divergent Ciona intestinalis species. Mol Biol Evol. 2013;30: 1574–1587. doi:10.1093/molbev/mst066

24. Novikova PY, Brennan IG, Booker W, Mahony M, Doughty P, Lemmon AR, et al. Polyploidy breaks speciation barriers in Australian burrowing frogs Neobatrachus. PLoS Genet. 2020;16: 1–24. doi:10.1371/journal.pgen.1008769

25. Martinson EO, Mrinalini, Kelkar YD, Chang CH, Werren JH. The Evolution of Venom by Co-option of Single-Copy Genes. Curr Biol. 2017;27: 2007-2013.e8. doi:10.1016/j.cub.2017.05.032

26. Lafon-Placette C, Johannessen IM, Hornslien KS, Ali MF, Bjerkan KN, Bramsiepe J, et al. Endosperm-based hybridization barriers explain the pattern of gene flow between Arabidopsis lyrata and Arabidopsis arenosa in Central Europe. Proc Natl Acad Sci U S A. 2017;114: E1027–E1035. doi:10.1073/pnas.1615123114

27. Stuart JJ, Chen M, Harris MO. Hessian fly. Publ from USDA-ARS / UNL Fac US. 2008.