Supplementary Information for Evolution of gene-rich germline restricted
chromosomes in black-winged fungus gnats through introgression (Diptera:
Sciaridae)
Christina N. Hodson, Kamil S. Jaron, Susan Gerbi, Laura Ross

Supplementary Text 1: Detailed description of the chromosome inheritance system in
Bradysia coprophila.

The chromosome system in B. coprophila, and in sciarids generally, is unique in
several ways, including chromosome transmission patterns, sex determination, and the
presence of GRCs (see Fig 1 for transmission patterns). All sciarids studied to date have a
system of reproduction known as paternal genome elimination, in which males only transmit
maternally inherited chromosomes to offspring [1,2]. Paternal genome elimination has
evolved independently in at least seven arthropod lineages, including the related gall gnat
family Cecidomyiidae [3]. In all species with paternal genome elimination, meiosis occurs in
a Mendelian manner in females, but in males meiosis is aberrant. In male meiosis in
sciarids, there is a monopolar spindle in meiosis I. Maternally inherited chromosomes move
towards the single spindle pole, while paternally derived chromosomes move away from it
and are discarded in a bud of cytoplasm [2]. Thus, only the maternal complement of
chromosomes is transmitted to the sperm. This phenomenon in B. coprophila was, to our
knowledge, the first example of “imprinting”, by which the cell recognizes the maternal or
paternal origin of a chromosome [4]. Interestingly, the GRCs always segregate with the
maternal set of chromosomes. Therefore, all of the GRCs (typically two in B. coprophila) are
transmitted through sperm, regardless of whether they are of maternal or paternal origin [4].
This is one of the few examples of chromosomes which seem to evade paternal genome
elimination. In the second division of meiosis in B. coprophila there is a bipolar spindle;
however, there is nondisjunction of the maternal X chromosome in this division such that
only one sperm develops through male meiosis. This sperm contains a haploid set of
autosomes, typically two GRCs, and two X chromosomes [1,2]. There is some variation in
the number of GRCs in each sperm, ranging from 0-4 in B. coprophila [4]. Variation in GRC
number is thought to be due to nondisjunction events which can occur in early germ cell
divisions; however, the majority of sperm (78%) carry two GRCs [4]. In female meiosis, the
GRCs form a bivalent, and one GRC segregates into each egg (i.e. meiosis is typical) [4].

As a result of the unusual type of meiosis in male sciarids, B. coprophila zygotes
typically carry a diploid set of autosomes, three X chromosomes (one inherited from their
mother and two from their father), and three GRCs. All sciarids have an XO sex chromosome
system (i.e. males are XO and females are XX, and there is no Y chromosome), but sex is
determined via X chromosome elimination from somatic cells early in development. In the
7th-9th cleavage divisions, either one X chromosome (for females) or two X chromosomes (for
males) are eliminated from somatic cells [5]. Elimination occurs due to a failure of separation
of the sister chromatid arms during mitosis, resulting in the chromosomes being left on the
metaphase plate and not being incorporated into daughter nuclei [6]. It is thought that the
number of X chromosomes eliminated is maternally controlled, since B. coprophila females
are monogenic and produce exclusively female or exclusively male progeny [7]. Females that
produce female offspring carry a large inversion on the X chromosome, and this inversion is
always associated with female-producing females [2,8]. GRC elimination from somatic cells
occurs in a remarkably similar manner, except that it occurs in the 5th-6th cleavage
divisions and all GRCs are eliminated from somatic cells [5].

In germ cells, there is also an elimination of one X chromosome and typically one
GRC. In this case, elimination occurs in a somewhat mysterious manner in early germ cell
development, when one X chromosome and all but two GRCs are ejected from the germ cell
through a cytoplasmic bud [9,10]. Therefore, early germ cells of both males and females in
B. coprophila have the same chromosome constitution, with a diploid set of autosomes,
X chromosomes, and GRCs. This mechanism also regulates the number of GRCs and prevents
their accumulation over time, as all but two GRCs are always eliminated from early germ
cells.

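The chromosome counts described above can be tallied in a short Python sketch. It is purely illustrative bookkeeping for the typical case given in the text (two X chromosomes and two GRCs per sperm, one of each per egg). The class and function names are hypothetical and were introduced here for illustration; they are not part of any analysis in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Karyotype:
    """Chromosome counts for one nucleus (hypothetical helper type)."""
    autosome_sets: int  # 1 = haploid, 2 = diploid
    x: int              # number of X chromosomes
    grc: int            # number of germline restricted chromosomes


# Typical gametes as described in the text.
TYPICAL_SPERM = Karyotype(autosome_sets=1, x=2, grc=2)
TYPICAL_EGG = Karyotype(autosome_sets=1, x=1, grc=1)


def fertilise(egg: Karyotype, sperm: Karyotype) -> Karyotype:
    """Zygote counts are the sum of the egg and sperm contributions."""
    return Karyotype(
        egg.autosome_sets + sperm.autosome_sets,
        egg.x + sperm.x,
        egg.grc + sperm.grc,
    )


def somatic_cells(zygote: Karyotype, sex: str) -> Karyotype:
    """Soma loses one X (female) or two X (male), and all GRCs."""
    eliminated_x = {"female": 1, "male": 2}[sex]
    return Karyotype(zygote.autosome_sets, zygote.x - eliminated_x, 0)


def early_germ_cells(zygote: Karyotype) -> Karyotype:
    """Germ cells lose one X and keep at most two GRCs."""
    return Karyotype(zygote.autosome_sets, zygote.x - 1, min(zygote.grc, 2))


zygote = fertilise(TYPICAL_EGG, TYPICAL_SPERM)
print(zygote)                           # diploid autosomes, 3 X, 3 GRCs
print(somatic_cells(zygote, "female"))  # XX soma, no GRCs
print(somatic_cells(zygote, "male"))    # XO soma, no GRCs
print(early_germ_cells(zygote))         # 2 X and 2 GRCs in both sexes
```

Capping the germline at two GRCs with `min` also covers the atypical sperm carrying 0-4 GRCs, which is how the text describes GRC number being regulated over generations.
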
Less is known about the mechanism of the chromosome system in other sciarid
species, but across the family all species studied exhibit paternal genome elimination and X
chromosome elimination early in development as the means of sex determination. Although
only a handful of species have been studied in detail, evidence suggests that most, but not
all, sciarid species carry GRCs, with the number of GRCs ranging from 0-4 [2]. The two
species in which GRCs are absent are closely related to each other, suggesting that GRCs
were likely lost in these species. Additionally, monogeny (females producing offspring
of only one sex) is present in some, but not all, species across Sciaridae [2]. There seem to
be many transitions in this trait across Sciaridae, with some species being monogenic, some
being digenic (i.e. females produce offspring of both sexes), and some species having a mix
of these two types of females. Very little is known about the genetic underpinnings of this
trait.

Overall, the evidence suggests that paternal genome elimination and X chromosome
elimination as a means of sex determination evolved once at the base of Sciaridae. It is less
clear how GRCs and monogeny evolved. It was originally suggested that the presence of
GRCs and monogeny are related, as Bradysia ocellaris, a species that has lost GRCs, is
digenic. Additionally, a lab line of Bradysia impatiens that was bred to lose GRCs
transitioned from monogenic to digenic reproduction [4]. However, these observations are
anecdotal, and there are also several species with digenic reproduction that do carry GRCs
(reviewed in [2]).

Cecidomyiidae, gall gnats also in the Infraorder Bibionomorpha, have a similar
reproductive system to Sciaridae, in that both families exhibit paternal genome elimination
and X chromosome elimination as a means of sex determination, GRCs, and a mix of
monogenic and digenic species [11]. However, cecidomyiid species have two X
chromosomes (i.e. females are X1X1X2X2 and males are X1X2OO), and the factor that
controls X chromosome elimination in offspring is associated with an inversion on an
autosome (rather than the X chromosome in B. coprophila) [12]. Additionally, GRC
characteristics are quite different in this family, with species containing many small GRCs,
which are maternally transmitted and do not seem to form bivalents during female meiosis.

Supplementary Text 2: Supplementary Methods
DNA extraction procedure
We followed a similar protocol for gDNA extractions for both the short read and long read
libraries. All centrifugation steps took place at 4°C and 13,000 rpm, unless
otherwise stated. Tissue samples were stored at -80°C until DNA extraction. Before
extraction, we briefly froze the samples in liquid nitrogen and crushed the tissue with a
micro-pestle. We then added 360 μl of Cell Lysis Buffer (Qiagen) with 40 μl of Proteinase K
(20 mg/ml) (Qiagen), and incubated overnight in a shaking incubator at 55°C. We then
added 4 μl of RNase A (100 mg/ml), mixed by inverting the sample tube, and incubated the
sample for 1 hour at 37°C. We cooled the sample on ice for 5 min, then added 133 μl of
Protein Precipitate Buffer (Qiagen), mixed by gently vortexing the sample, and incubated on
ice for 10 min. We then centrifuged for 15 min at 4°C, transferred the supernatant to a new
tube containing 400 μl isopropanol, and mixed by inversion. For the short read samples, we
then incubated the sample overnight at -20°C, while for the long read samples, we incubated
the sample for 10 min at room temperature. We then centrifuged the sample for 20 min, and
discarded the supernatant by inverting the tube. We washed the DNA pellet twice with 300 μl
freshly prepared 70% EtOH, then centrifuged the sample for 20 min, and carefully removed
the supernatant by pipetting. We air dried the DNA pellet for approximately 30 min, and
resuspended the dried pellet in 60 μl TE.

Long-read assembly
In addition to the short read sequencing data, we also extracted DNA (using the
protocol above) from approximately 250 male testes to generate long read germline data.
We sequenced the sample at Liverpool Genomics using a low input library prep procedure
and PacBio sequencing on 3 SMRT cells. We used red bean (previously known as wtdbg2;
v2.5) with the parameters -L 1000 -x sq for the initial genome assembly [13]. We then
polished the assembly three times with the long read library, using minimap2 (v2.17-r941)
with parameters -c -x map-pb to map the long reads to the assembly [14] and racon
(v1.4.10) with parameter -u to polish the assembly [15]. Finally, we polished the assembly
twice with short read data (from the germline library only), again mapping with minimap2
and polishing with racon (see Supplementary Table 1 for assembly statistics).

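The order of the assembly and polishing steps above can be sketched as a small Python function that only builds the command lines, without running anything. The flags `-L 1000 -x sq`, `-c -x map-pb` and `-u` are taken from the text. Everything else is an assumption made for illustration and is not confirmed by the methods: the file names, the `wtpoa-cns` consensus step (part of the standard wtdbg2 workflow), the minimap2 `-x sr` preset for the short read rounds, and the use of a single short read file. Options such as thread counts and genome size are omitted.

```python
def polishing_round(reads, assembly, tag, minimap2_opts, racon_opts):
    """Return the two steps of one polishing round and the polished file name.

    Each step is (argv, stdout_path): both tools write results to stdout.
    """
    paf = f"{tag}.paf"
    polished = f"{tag}.fa"
    map_step = (["minimap2", *minimap2_opts, assembly, reads], paf)
    # racon takes: <reads> <overlaps> <target assembly>
    polish_step = (["racon", *racon_opts, reads, paf, assembly], polished)
    return [map_step, polish_step], polished


def build_pipeline(long_reads, short_reads, prefix):
    """List the commands for assembly plus 3 long read and 2 short read polishing rounds."""
    raw = f"{prefix}.raw.fa"
    steps = [
        # Initial assembly with the parameters given in the text.
        (["wtdbg2", "-x", "sq", "-L", "1000", "-i", long_reads, "-fo", prefix], None),
        # Consensus step (assumption: standard wtdbg2 usage, not stated in the text).
        (["wtpoa-cns", "-i", f"{prefix}.ctg.lay.gz", "-fo", raw], None),
    ]
    assembly = raw
    for i in range(1, 4):  # three long read polishing rounds
        new_steps, assembly = polishing_round(
            long_reads, assembly, f"{prefix}.long{i}", ["-c", "-x", "map-pb"], ["-u"]
        )
        steps.extend(new_steps)
    for i in range(1, 3):  # two short read polishing rounds
        # "-x sr" is an assumed preset; the text gives no short read parameters.
        new_steps, assembly = polishing_round(
            short_reads, assembly, f"{prefix}.short{i}", ["-x", "sr"], []
        )
        steps.extend(new_steps)
    return steps


for argv, stdout_path in build_pipeline("long_reads.fq.gz", "germline_short.fq.gz", "asm"):
    redirect = f" > {stdout_path}" if stdout_path else ""
    print(" ".join(argv) + redirect)
```

Each round polishes the output of the previous one, so the assembly path is threaded through the loop rather than fixed. To execute the steps, each `argv` could be passed to `subprocess.run` with the named file opened as `stdout`.
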
The long read assembly, compared to the short read assembly, showed two
problems: much lower mapping rates of GRC k-mers (Supplementary Fig 2B)
and a lower BUSCO score (93.6% complete BUSCOs vs. 98.3% for the short
read assembly). We suspect the problems stemmed from the high error rates of long reads
combined with high levels of paralogy across the genome, which hindered precise genome
polishing and subsequently led to frame shifts in gene models. This problem could be
resolved in the future with newer sequencing approaches, such as HiFi reads with much
lower error rates, or haplotagging. However, the long read genome assembly still featured
much higher contiguity (N50: 576,242 bp vs. 18,920 bp for the short read assembly).
Therefore, we used the short read assembly for annotation and gene level comparisons but
used the long read assembly to link individual GRC genes found in the short read assembly
for the collinearity analysis (Fig 3C).

Supplementary Figures/Tables

[Figure: blobplot. Axes: GC proportion and coverage; marginal histograms: span (kb)]

Supplementary Fig 1. Blobplot of unfiltered assembly generated from both germ and
somatic libraries showing scaffold coverage vs. scaffold GC (size of dot indicates scaffold
size and colour taxonomic assignment). Reads mapping to scaffolds with a GC content
between 0.14 and 0.51 and a coverage higher than 7 were retained for the final assembly.

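The retention rule in the caption of Supplementary Fig 1 can be written as a simple filter. This is a minimal sketch: the function name and the example scaffold records are hypothetical, and treating the GC bounds as inclusive is an assumption, since the caption only says "between".

```python
GC_MIN, GC_MAX = 0.14, 0.51   # GC bounds from the caption (assumed inclusive)
MIN_COVERAGE = 7.0            # coverage must be higher than this value


def keep_scaffold(gc: float, coverage: float) -> bool:
    """True if a scaffold passes the GC and coverage filter."""
    return GC_MIN <= gc <= GC_MAX and coverage > MIN_COVERAGE


# Hypothetical scaffold records: (name, GC proportion, coverage).
scaffolds = [
    ("scaffold_1", 0.35, 42.0),   # kept
    ("scaffold_2", 0.62, 30.0),   # GC too high
    ("scaffold_3", 0.33, 5.0),    # coverage too low
]
retained = [name for name, gc, cov in scaffolds if keep_scaffold(gc, cov)]
print(retained)
```
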
Supplementary Table 1. Summary statistics for the short read and long read
assemblies used in this study. The short read assembly was used for gene prediction as it
was closer to the expected genome size compared to the long read assembly and also was
more complete according to BUSCO assessment.

                      Short read (Illumina)   Long read (PacBio)
Size                  398 Mb                  415 Mb
# scaffolds           46,532                  3,505
N50                   18.9 kb                 576 kb
L50                   5,203                   135
GC                    35.4%                   35.8%
BUSCO completeness    98.3%                   93.6%

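For readers unfamiliar with the contiguity statistics in Supplementary Table 1, the sketch below computes N50 and L50 from a list of scaffold lengths using the usual definitions: N50 is the length of the scaffold at which the running total, counted from the longest scaffold down, first reaches half of the assembly size, and L50 is the number of scaffolds needed to get there. The function name and the toy lengths are hypothetical; this is not the tool used to produce the table.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example: total length 30, so half is 15; 10 + 8 = 18 reaches it.
print(n50_l50([2, 4, 6, 8, 10]))
```
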
[Figure: two histogram panels, A. Short read asm and B. Long reads asm. x-axis: Score
(mapped k-mers/scaffold length), 0.0-1.0; y-axis: Frequency; series: GRC, A, X]

Supplementary Fig 2. Distributions of scores used in the k-mer identification
technique. A. Histogram of k-mer assignment scores for scaffolds of each chromosome
type in the short read assembly used throughout the manuscript. The score is defined as the
maximum number of k-mers with an exact match to the scaffold from the chromosome with
the majority of k-mers matching the scaffold, divided by the scaffold length. For GRC
scaffolds (orange), we only assigned scaffolds with a score higher than 0.8 as GRC
scaffolds, while for autosomal and X chromosome scaffolds (green and blue respectively)
we assigned scaffolds with a score higher than 0.4. B. Histogram of k-mer assignment
scores in the long read assembly (see Supplementary Methods). The scores are substantially
lower, especially for differentiating autosomes and the X chromosome. This assembly was
used only for anchoring GRC genes in longer blocks for the collinearity analysis.

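The score and thresholds described in the caption of Supplementary Fig 2 can be sketched as follows. The function name and input format (a dictionary of exact-match k-mer counts per chromosome type) are hypothetical, and the sketch assumes the counts have already been computed; it illustrates the assignment rule only and is not the code used in this study.

```python
# Minimum score needed to assign a scaffold to each chromosome type (from the caption).
THRESHOLDS = {"GRC": 0.8, "A": 0.4, "X": 0.4}


def assign_scaffold(kmer_hits, scaffold_length):
    """Return (assignment, score) for one scaffold.

    kmer_hits maps a chromosome type ("GRC", "A", "X") to the number of
    k-mers of that type with an exact match to the scaffold. The assignment
    is None when the score does not exceed the threshold.
    """
    best_type = max(kmer_hits, key=kmer_hits.get)     # type with most matching k-mers
    score = kmer_hits[best_type] / scaffold_length
    assignment = best_type if score > THRESHOLDS[best_type] else None
    return assignment, score


print(assign_scaffold({"GRC": 900, "A": 50, "X": 10}, 1000))   # confidently GRC
print(assign_scaffold({"GRC": 700, "A": 50, "X": 10}, 1000))   # below the GRC threshold
print(assign_scaffold({"GRC": 20, "A": 450, "X": 30}, 1000))   # autosomal
```

The stricter threshold for GRC scaffolds means a scaffold whose best match is the GRC type can remain unassigned even though the same score would be sufficient for an autosomal or X-linked assignment.
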
Supplementary Table 2. Size and proportion of the reference genome [16] anchored to specific
chromosomes in B. coprophila, and the number of GRC paralogs and GRC collinear
blocks anchored to each chromosome in the reference assembly.

                               A-II     A-III    A-IV     X
Size (Mb)                      48-62    66-71    88-94    48-62
Proportion anchored            20-46%   8-19%    37-52%   93-100%
GRC paralogs (number genes)    128      7        108      119
GRC collinear blocks           14       1        12       12

[Figure: histogram. x-axis: GRC gene coverage, 0-100; y-axis: Frequency]

Supplementary Fig 3. Histogram of the mean coverage of all GRC genes. The coverage
of GRC genes is bimodal, with one peak at 24.6x coverage and another at 30.3x coverage.

[Figure: four scatter plots of gene coverage vs. gene position in block, annotated with the
scores 0.64 (3 excluded genes), 0.67 (all genes low coverage), 0.83 and 1.0, and a histogram
of the proportion of genes fitting the coverage pattern (x-axis 0.5-1.0; y-axis: Frequency)]

Supplementary Fig 4. Coverage of genes in GRC-GRC collinear blocks. The two blocks
are coloured with different shades of orange. The histogram shows the proportion of genes
in each block that fit the expected coverage pattern (i.e. one block having all genes with a
higher coverage than the other). Only genes with a coverage less than 45x were considered.
Four blocks with different coverage patterns are shown, with the score assessing how many
paralogs fit the expected coverage pattern in the top corner of each plot. The plot in the top
left corner has several genes with a higher coverage than expected (which were excluded),
with the other paralogs in the block having inconsistent coverage patterns. The top right
corner shows a case where all genes have coverage that would fit in the
lower coverage peak (in Supplementary Fig 3) and an inconsistent coverage pattern (likely
because these blocks are both located on the same GRC). The lower left corner shows a
case where most paralogs fit the expected coverage pattern except one set of paralogs,
which has an intermediate coverage level. Most blocks (13/23) show the same pattern as
the block in the lower right corner, where all genes in one block have a higher coverage than
the genes in the other block.

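One way to compute a score of the kind shown in Supplementary Fig 4 is sketched below. The caption specifies the 45x coverage cut-off and the expected pattern (all genes in one block having higher coverage than their paralogs in the other block). The rest is an assumption made for illustration: pairing paralogs by their position in the block, counting ties against the first block, and taking the larger of the two possible proportions so that the score ranges from 0.5 to 1.0, as on the histogram axis. The function name and example coverages are hypothetical.

```python
MAX_COVERAGE = 45.0  # only genes with coverage below this value are considered


def coverage_pattern_score(block_a, block_b, max_cov=MAX_COVERAGE):
    """Proportion of paralog pairs that fit the 'one block is higher' pattern.

    block_a and block_b are coverages of paralogous genes, paired by position
    (an assumption). Pairs in which either gene reaches max_cov are excluded.
    """
    pairs = [(a, b) for a, b in zip(block_a, block_b) if a < max_cov and b < max_cov]
    if not pairs:
        raise ValueError("no gene pairs left after coverage filtering")
    # Ties are counted as not higher in block_a (an assumption).
    a_higher = sum(1 for a, b in pairs if a > b)
    fraction = a_higher / len(pairs)
    # Either block may be the high-coverage one, so take the better-fitting direction.
    return max(fraction, 1.0 - fraction)


# Hypothetical block pair: five of six paralog pairs have higher coverage in block_a.
block_a = [30.0, 31.0, 29.0, 32.0, 30.0, 25.0]
block_b = [24.0, 25.0, 23.0, 26.0, 24.0, 27.0]
print(round(coverage_pattern_score(block_a, block_b), 2))
```
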
Supplementary Fig 5. Summary of universal single-copy orthologs (BUSCO) results
for all Dipteran species in phylogenetic analyses. Exechia fusca was excluded from
analyses as the proportion of complete BUSCOs was low (54%). In Bradysia coprophila,
39.2% of the insect BUSCO genes were duplicated.

Supplementary Table 3. Genomic location of universal single-copy orthologs
(BUSCO) in Bradysia coprophila. BUSCO assessment was conducted with the
insecta_odb10 database. The genomic location of genes was identified with both coverage
and k-mer identification techniques. Categories indicated with * were used in phylogenetic
analyses (Fig 5A) to determine the phylogenetic position of GRC genes, and individual gene
trees were examined (Fig 5C) for the paralogs in the categories indicated with * and **.

BUSCO type     Chromosome          Frequency   GRC related
Single-copy    A                   521         No
               X                   182         No
               GRC                 106         Yes
Duplicated     A-GRC*              291         Yes
               X-GRC*              81          Yes
               GRC-GRC             18          Yes
               A-A                 6           No
               A-X                 1           No
Multi-copy     A-GRC-GRC**         56          Yes
               X-GRC-GRC**         30          Yes
               GRC-GRC-GRC         3           Yes
               A-A-GRC             3           Yes
               A-X-GRC             2           Yes
               A-GRC-GRC-GRC       1           Yes
               X-X-GRC-GRC-GRC     1           Yes

[Figure: six gene trees (panels A-F) with node support values; tips are the Dipteran species
included in the phylogenetic analyses, with Bradysia coprophila (core) and Bradysia
coprophila (GRC) gene copies labelled]

Supplementary Fig 6. Examples of GRC gene trees with various topologies. (A) One GRC
copy rooted in Cecidomyiidae; (B) one GRC copy rooted in Sciaridae; (C) two GRC copies,
both in Cecidomyiidae; (D) two GRC copies, both in Sciaridae; (E) two GRC copies, one in
Cecidomyiidae and the other in Sciaridae; (F) GRCs unplaced (without significant nodes) or
branching with a species from any other family. These six gene trees are representative
examples of the individual categories of gene tree topologies in Fig 5B.

Supplementary Fig 7. Terminal branch length distribution of GRC genes. Branch length
distribution of GRC copies of BUSCO genes plotted with respect to phylogenetic position
(at family level); means are shown by dashed lines. Branch lengths of BUSCO genes on GRCs
within Cecidomyiidae (violet) are significantly longer than branches found within Sciaridae
(teal; p-value < 0.0001), suggesting that the genes on GRCs found within Sciaridae might be
due to gene duplications and translocations within Sciaridae after the GRCs were acquired.

Supplementary Text 3: Approximate dating of the introgression of GRCs

We used Baltic amber records to roughly estimate the date of the introgression of
GRCs from Cecidomyiidae to Sciaridae. The records of the early diversification of the
family Sciaridae found in the amber are dated to be ~44 my old [17,18]. Presumably, the
common ancestor of the family is somewhat older than that; we therefore roughly estimate
that the common ancestor of Sciaridae lived 50 mya. We can use this estimate to time-calibrate
the phylogeny of 340 BUSCO genes (see Fig 5). Using this logic, the common ancestor of
Sciaridae and Cecidomyiidae lived 147 mya (calculated using the sum of branches from
Sciaridae backwards). The isolation of Sciaridae then happened ~31 my after the split with
Cecidomyiidae. We hypothesise that the introgression must have happened after that, as no
other Bibionomorpha families have GRCs or show any other signs of hybridisation with
ancestors of Cecidomyiidae [19,20]. The hybridisation therefore probably happened on the
Sciaridae branch before the diversification of the family, approximately 116-50 mya, i.e.
between 31 and 97 my after the split of the original ancestors of the two families.

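The arithmetic behind the date range above can be reproduced in a few lines. The three input values are taken from the text; the variable names are hypothetical, and the sketch only restates the subtraction, not the branch length calibration itself.

```python
# Inputs from the text (all in millions of years).
SPLIT_MYA = 147               # common ancestor of Sciaridae and Cecidomyiidae
ISOLATION_AFTER_SPLIT_MY = 31 # isolation of Sciaridae, measured from the split
CROWN_SCIARIDAE_MYA = 50      # estimated common ancestor of extant Sciaridae

# Oldest possible introgression: just after Sciaridae became isolated.
oldest_mya = SPLIT_MYA - ISOLATION_AFTER_SPLIT_MY

# Youngest possible introgression: just before Sciaridae diversified.
youngest_mya = CROWN_SCIARIDAE_MYA

# Divergence between the hybridising lineages at the time of introgression.
min_divergence_my = SPLIT_MYA - oldest_mya
max_divergence_my = SPLIT_MYA - youngest_mya

print(f"introgression window: {oldest_mya}-{youngest_mya} mya")
print(f"divergence at introgression: {min_divergence_my}-{max_divergence_my} my")
```
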
These calculations must be taken with a large grain of salt, as substitution rates
change over time and our calibration is relatively crude, using only one reference point.
However, they suggest that hybridisation of extremely divergent species (31-97 my) can have
important evolutionary consequences. This is not the first record of viable hybrids of two
extremely diverged animals. Recently, a successful hybridisation of Russian sturgeon and
American paddlefish was accomplished [21]. However, this was a lab-generated hybrid, not
a result of mating in the wild. Other hybridisation events between diverged animals that
resulted in gene flow are found in Nasonia wasps [22], sea squirts [23], and burrowing frogs
[24]. The B chromosome in Nasonia wasps is thought to have arisen through hybridisation
with a Trichomalopsis wasp; the two lineages are estimated to have been diverging for
2.6 my [25]. The B chromosome also has a substantial effect on Nasonia wasps, as it affects
sex determination in individuals that carry it. In the sea squirt, gene flow appeared after
secondary contact following more than 3 my of divergence between the two lineages [23],
which is already at the upper boundary of known systems with ongoing gene flow. Gene flow
between more distant lineages, as in burrowing frogs, seems to be facilitated by
polyploidisation [24]. A similar case is found in the Arabidopsis lyrata and A. arenosa
species complex. Both species have diploid and tetraploid forms, and while the diploid
variants are fully reproductively isolated, the tetraploid variants form viable hybrids,
generating an indirect route for gene flow between these ~20 my diverged lineages [26]. It
appears that hybridisation events between extremely diverged species are always associated
with polyploidy. Keeping the two genomic copies isolated from each other increases
stability, as recombination does not break up already-working sets of co-adapted genes
within each subgenome. It seems likely, given that the size of the GRCs in B. coprophila is
comparable to the size of the entire cecidomyiid genome (the genome size of Mayetiola
destructor, for example, is 158 Mb [27]), that the originally introgressed GRCs in the common
ancestor of Sciaridae carried a full genomic copy of the ancestral Cecidomyiidae genome. We
speculate that GRCs originated in Sciaridae through introgression of the full cecidomyiid
genome. The introgression might have directly resulted in GRCs, or it might have been
followed by restriction of the introgressed genome to the germline. Sciaridae species
therefore represent a rare case of germline-specific polyploids.

Supplementary References
1. Metz CW. Chromosome Behavior, Inheritance and Sex Determination in Sciara. Am
Nat. 1938;72: 485–520.
2. Gerbi SA. Unusual chromosome movements in sciarid flies. Results Probl Cell Differ.
1986;13: 71–104. Available: http://www.ncbi.nlm.nih.gov/pubmed/3529273
3. Gardner A, Ross L. Mating ecology explains patterns of genome elimination. Ecol
Lett. 2014;17: 1602–12. doi:10.1111/ele.12383
4. Crouse H V., Brown A, Mumford BC. L-Chromosome Inheritance and the Problem of
Chromosome “Imprinting” in Sciara (Sciaridae, Diptera). Chromosoma. 1971;34:
324–339.
5. Du Bois AM. Chromosome behavior during cleavage in the eggs of Sciara coprophila
(Diptera) in the relation to the problem of sex determination. Zeitschrift für Zellforsch
und Mikroskopische Anat. 1933;19: 595–614. doi:10.1007/BF00393361
6. de Saint Phalle B, Sullivan W. Incomplete sister chromatid separation is the
mechanism of programmed chromosome elimination during early Sciara coprophila
embryogenesis. Development. 1996;122: 3775–3784.
7. Metz CW, Schmuck LM. Unusual progenies and the sex chromosome mechanism in
Sciara. Proc Natl Acad Sci U S A. 1929;15: 863–866.
8. Crouse H V., Gerbi SA, Liang CM, Magnus L, Mercer IM. Localization of ribosomal
DNA within the proximal X heterochromatin of Sciara coprophila (Diptera, Sciaridae).
Chromosoma. 1977;64: 305–318. doi:10.1007/BF00294938
9. Rieffel SM, Crouse H V. The elimination and differentiation of chromosomes in the
germ line of Sciara. Chromosoma. 1966;19: 231–276.
10. Perondini ALP, Ribeiro AF. Chromosome elimination in germ cells of sciara embryos:
Involvement of the nuclear envelope. Invertebr Reprod Dev. 1997;32: 131–141.
doi:10.1080/07924259.1997.9672614
11. White MJD. Animal cytology and evolution. 3rd ed. Cambridge: Cambridge Univ
Press; 1973.
12. Benatti TR, Valicente FH, Aggarwal R, Zhao C, Walling JG, Chen MS, et al. A neo-sex
chromosome that drives postzygotic sex determination in the Hessian fly
(Mayetiola destructor). Genetics. 2010;184: 769–777.
doi:10.1534/genetics.109.108589
13. Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat Methods.
2020;17: 155–158. doi:10.1038/s41592-019-0669-3
14. Li H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics.
2018;34: 3094–3100. doi:10.1093/bioinformatics/bty191
15. Vaser R, Sović I, Nagarajan N, Šikić M. Fast and accurate de novo genome assembly
from long uncorrected reads. Genome Res. 2017;27: 737–746.
doi:10.1101/gr.214270.116
16. Urban JM, Foulk MS, Bliss JE, Coleman CM, Lu N, Mazloom R, et al. Single-molecule
sequencing of long DNA molecules allows high contiguity de novo genome assembly
for the fungus fly, Sciara coprophila. bioRxiv. 2020; 1–65.
doi:10.1017/CBO9781107415324.004
17. Roschmann F, Morhig W. Die trauermucken des sächsischen bernsteins aus dem
untermiozän von Bitterfeld/Deutschland (Diptera, Sciaridae). Dtsch Entomol
Zeitschrift. 1995;42: 17–54.
18. Blagoderov V, Grimaldi D. Fossil Sciaroidea (Diptera) in Cretaceous Ambers,
Exclusive of Cecidomyiidae, Sciaridae, and Keroplatidae. Am Museum Novit.
2004;3433: 1. doi:10.1206/0003-0082(2004)433<0001:fsdica>2.0.co;2
19. Le Calvez J. Morphologie et comportement des chromosomes dans la
spermatogenese se quelques Mycetophilides. Chromosoma. 1947; 137–165.
20. Fahmy OG. The mechanism of chromosome pairing during meiosis in male
Apolipthisa subincana (Mycetophilidae, Diptera). J Genet. 1949;49: 246–263.
doi:10.1007/BF02986079
21. Káldy J, Mozsár A, Fazekas G, Farkas M, Fazekas DL, Fazekas GL, et al.
Hybridization of russian sturgeon (Acipenser gueldenstaedtii, Brandt and Ratzeberg,
1833) and american paddlefish (Polyodon spathula, Walbaum 1792) and evaluation of
their progeny. Genes (Basel). 2020;11: 1–17. doi:10.3390/genes11070753
22. McAllister BF, Werren JH. Hybrid origin of a B chromosome (PSR) in the parasitic
wasp Nasonia vitripennis. Chromosoma. 1997;106: 243–253.
doi:10.1007/s004120050245
23. Roux C, Tsagkogeorga G, Bierne N, Galtier N. Crossing the species barrier: Genomic
hotspots of introgression between two highly divergent ciona intestinalis species. Mol
Biol Evol. 2013;30: 1574–1587. doi:10.1093/molbev/mst066
24. Novikova PY, Brennan IG, Booker W, Mahony M, Doughty P, Lemmon AR, et al.
Polyploidy breaks speciation barriers in Australian burrowing frogs Neobatrachus.
PLoS Genet. 2020;16: 1–24. doi:10.1371/journal.pgen.1008769
25. Martinson EO, Mrinalini, Kelkar YD, Chang CH, Werren JH. The Evolution of Venom
by Co-option of Single-Copy Genes. Curr Biol. 2017;27: 2007-2013.e8.
doi:10.1016/j.cub.2017.05.032
26. Lafon-Placette C, Johannessen IM, Hornslien KS, Ali MF, Bjerkan KN, Bramsiepe J,
et al. Endosperm-based hybridization barriers explain the pattern of gene flow
between Arabidopsis lyrata and Arabidopsis arenosa in Central Europe. Proc Natl
Acad Sci U S A. 2017;114: E1027–E1035. doi:10.1073/pnas.1615123114
27. Stuart JJ, Chen M, Harris MO. Hessian fly. Publ from USDA-ARS / UNL Fac US.
2008.