2 Short Report

3

4 Transposable element abundance correlates with mode of transmission in

5 microsporidian parasites

6

7 Nathalia Rammé Medeiros de Albuquerque1, Dieter Ebert2 and Karen Luisa Haag1*

8

9

10 1 Department of Genetics and Post-Graduation Program of Genetics and Molecular Biology,

11 Federal University of Rio Grande do Sul, Av. Bento Gonçalves 9500, 91501-970 RS, Brazil.

12 2 Department of Environmental Sciences, Zoology, Basel University, Vesalgasse 1, 4051 Basel,

13 Switzerland.

14

15

16 *Corresponding author: Karen L. Haag, Department of Genetics, Federal University of Rio

17 Grande do Sul, Porto Alegre, Brazil. Tel.: +55 51 3308 6721. E-mail: [email protected].

18

19 20

21 Abstract

22

23 The extreme genome reduction and physiological simplicity of some has been

24 attributed to their intracellular, obligate parasitic lifestyle. Although not all microsporidian

25 genomes are small (size range from about 2 to 50 MB), it is suggested that the size of their

26 genomes has been streamlined by natural selection. We explore the hypothesis that vertical

27 transmission in microsporidia produces population bottlenecks, and thus reduces the

28 effectiveness of natural selection. Here we compare the transposable element (TE) content of

29 47 microsporidian genomes, and show that genome size is positively correlated with the

30 amount of TEs, and that species that experience vertical transmission have larger genomes

31 with higher proportion of TEs. Our findings are consistent with earlier studies inferring that

32 nonadaptive processes play an important role in microsporidian evolution.

33

34 Background

35

36 Microsporidia, a phylum of fast evolving intracellular parasites related to fungi [1–3], are

37 interesting models for studying the mechanisms shaping eukaryotic genome evolution. Their

38 genomes are small and highly diverse. Two contrasting genome architectures are found in

39 microsporidians. The extremely compact, gene-dense genomes of the encephalitozoonids (≅ 2

40 Mb), which include human parasites, are the smallest known eukaryote parasites [4,5]. At the

41 other extreme are the large, gene-sparse genomes of several phylogenetically unrelated

42 species, such as Edhazardia aedis (≅ 50 Mb), Hamiltosporidium spp. (≅ 17-25 Mb), Nosema

2 43 bombycis (≅ 15 Mb) and Anncaliia algerae (≅ 12-17 Mb) [6–10]. Despite showing an about

44 25-fold difference in size, the gene-sparse and gene-dense genomes only differ by a factor of

45 2-3 in the number of proteins they encode, and a large fraction of the noncoding regions in the

46 gene-sparse genomes is populated by repetitive sequences, most of which have been

47 identified as transposable elements [11]. It is suggested that the majority of transposable

48 element (TE) insertions cause mildly deleterious effects in eukaryotes [12,13]. Therefore,

49 their spread and accumulation depend on the cellular mechanisms of TE silencing [14] as well

50 as on the efficacy of natural selection to eliminate or retain them [15]. TEs are also known to

51 escape cellular defense mechanisms by transferring themselves horizontally to different

52 genomes [16,17]. When a TE enters a genome for the first time, it can often duplicate freely

53 before becoming epigenetically silenced [14]. Symbiotic organisms living in close proximity

54 are more likely to exchange TEs [16,18]; not surprisingly, a large number of horizontal gene

55 transfer (HGT) events involving TEs have been identified in the evolutionary history of

56 microsporidia [11].

57

58 We have recently proposed that nonadaptive processes have a major role in the evolution of

59 microsporidian species that are predominantly vertically transmitted [8]. In such species

60 genetic drift might override selection because the translocation of parasites from the

61 somatic cells to gametes reduces the parasite effective population size (Ne) [19]. Moreover,

62 vertical transmission might be predominant in host species with a metapopulation structure,

63 where local extinction and re-colonization is common and where the parasite enters new

64 populations with the hosts dispersal stage and thus experiences severe bottlenecks [20].

3 65 Natural selection is efficient only if selection coefficients are greater than about 1/2Ne [21]. In

66 our previous study we showed that vertical transmission in a subset of microsporidian

67 genomes is associated with a higher ratio of non-synonymous to synonymous substitutions

68 [8]. Furthermore, microsporidia that experience vertical transmission usually have gene-

69 sparse genomes [8]. Here we expand our analyses by characterizing the mobilomes from a

70 larger number of genomes, and explore TE evolutionary dynamics in the light of

71 microsporidian modes of transmission.

72

73 Results and Discussion

74

75 All 47 microsporidian genome assemblies available as of March 2019 were retrieved from

76 GenBank (Additional file 1) and used for the identification of TEs. DNA transposons are the

77 dominant class of TEs in microsporidian genomes, but their abundance varies greatly

78 between species, and is not phylogenetically conserved (Fig. 1A; Additional file 2). The tree

79 presented in Fig. 1A, built from a dataset of 37 highly conserved proteins (Additional file 3)

80 established to reduce long-branch-attraction artifacts in microsporidian phylogenies [1,2], is

81 congruent with previously published microsporidian phylogenetic inferences based on other

82 data [22,23].

83

84 To evaluate whether TE accumulation in microsporidian lineages represents ancient or recent

85 events, we used the Kimura 2-parameter distance (K2P) to generate age distributions of TEs

86 within each genome (Additional file 4). The evolutionary dynamics of TEs shows very distinct

87 patterns, even within species. For example, the three A. algerae genomes show variable

4 88 amounts of TE content (1.64% in strain PRA109, 1.34% in PRA339 and 15.75% in Undeen;

89 Additional file 1; Fig. 1A), and the accumulation of TEs in A. algerae Undeen apparently results

90 from a recent spread of DNA transposons, mainly from the Merlin superfamily (Additional

91 Files 5 and 6). A sharply bimodal distribution of TE abundance per age class suggests two

92 successive expansions in A. algerae Undeen that are not observed in the remaining two strains

93 (Fig. 1B). Similarly, the Nosema genomes from N. bombycis (33.41 % of TEs) and N. ceranae

94 BRL01 (21.64% of TEs) show an excess of young elements, which also seems to result from

95 the recent spread of Merlin (Fig. 1B; Additional Files 5 and 6).

96

97 We reasoned that these recent events of TE expansion in some evolutionary lineages could

98 result from the loss of genes encoding components from cellular defense mechanisms, such as

99 members of the RNAi pathway. We test this idea using the example of Dicer and Argonaute,

100 two genes essential for the RNAi pathway. It was suggested that Dicer and Argonaute genes

101 are selected as a unit in microsporidia, and that they were selectively maintained or lost

102 during lineage divergence [24]. Therefore, we searched for orthologs of proteins related to

103 cellular defense against TEs in the 47 proteomes (Additional file 6). No RIP/MIP-related

104 proteins, which silence TEs by inducing mutation and/or cytosine methylation within

105 repetitive sequences, were found in any lineage. Most strains carrying Dicer and Argonaute

106 orthologs also carry the protein sets necessary for somatic quelling and meiotic silencing, but

107 this is still no evidence of the activity of these mechanisms in microsporidia. The number of

108 Dicer and Argonaute gene copies varies between 0-5 and 0-7, respectively, and in one lineage

109 (A. algerae PRA 109) we found 2 copies of Dicer, but no Argonaute (Additional file 6). Thus,

110 Dicer and Argonaute seem not necessarily selected as a unit. Interestingly, although there is an

5 111 overall congruence between the Dicer + Argonaute tree (Fig. 1C) and the one built with 37

112 conserved proteins (Fig. 1A), there are two inconsistencies, i.e., the A. algerae, and the E. aedis

113 clades, though with low statistical support.

114

115 A strong negative correlation was found between the percentage of total coding sequences

116 (CDS) and genome size (rho = -0.92, p-value = 2.2e-16; Fig. 2A), indicating that non-coding

117 regions are responsible for variations in genome size. On the other hand, there is a positive

118 correlation between the fraction of genomes occupied by TEs and genome size (rho = 0.66, p-

119 value = 4.375e-07; Fig. 2A). Regression analysis between TE fraction and genome size is also

120 significant (R-squared = 0.33, p-value = 1.501e-05; Additional file 7). We further applied

121 phylogenetic independent contrasts to the regression analysis [25], which showed that the

122 positive correlation between genome size and the amount of TEs in microsporidia does not

123 result exclusively from phylogenetic inertia (R-squared = 0.16, p-value = 0.003; Additional file

124 7). However, some lineages with comparatively large genomes such as A. algerae PRA 109,

125 PRA339 and Edhazardia aedis have TE fractions similar to small genomes such as

126 Encephalitozoon cuniculi GBM1 and Enterocytozoon bieneusi. Both E. cuniculi GBM1 and E.

127 bieneusi genomes suggest recent expansions of miniature inverted-repeat transposable

128 elements according to the K2P age distributions of TEs (MITEs; Additional files 3 and 5).

129 MITEs are deletion derivatives of DNA transposons from the Tc1/Mariner superfamily [26].

130 Tc1/Mariner elements are absent in Encephalitozoon cuniculi and Enterocytozoon bieneusi but

131 are found in closely related microsporidia (Nosema spp. and Hepatospora eriocheir; Additional

132 files 3 and 5).

133

6 134 We identified Dicer and Argonaute orthologs in 27 proteomes (Additional file 6). Strains with

135 putative Dicer and Argonaute genes have, on average, a higher proportion of TEs, and larger

136 genome sizes compared to strains in which these proteins are absent (Wilcoxon rank sum; p-

137 value <0.004, and p-value < 2.2e-16, respectively; Fig. 2B). Although the association between

138 the presence of both Dicer and Argonaute with the accumulation of TEs seems to support the

139 idea that a RNAi machinery is being selectively maintained in microsporidian genomes

140 attacked by selfish elements, we think that random events such as gene deletions,

141 duplications and/or HGT may have had an equivalent or perhaps more important role. First,

142 there is only partial phylogenetic congruence between Dicer + Argonaute proteins (Fig. 1C)

143 and the history of microsporidia inferred from 37 highly conserved proteins (Fig. 1A).

144 Second, Argonaute was apparently lost in A. algerae PRA109 but not in A. algerae PRA339,

145 although both strains seem to have a similar TE age structure (Fig. 1B). Third, we found

146 strong evidence of Argonaute HGT in N. bombycis; its Argonaute protein sequence is identical

147 to Papilio xuthus, the Asian swallowtail butterfly (Additional file 8). The ortholog protein in N.

148 apis has only 62.3% identity with the P. xuthus Argonaute, indicating that the N. bombycis gene

149 may have resulted from a recent HGT event. Previous comparative studies focusing on the

150 enlargement of the N. bombycis genome suggested a major implication of HGT in the

151 acquisition of new genes and TEs from its lepidopteran host Bombyx mori [10]. Taken

152 together, these findings are consistent with the hypothesis that the evolutionary dynamics of

153 the RNAi machinery may be under the influence of other mechanisms beyond natural

154 selection.

155

7 156 Horizontal transfer of different genes has been reported in microsporidia [23,27–30]. Among

157 TEs, DNA transposons are the most often horizontally transferred in eukaryotes [16], and in A.

158 algerae, several Merlin and PiggyBac transposon insertions were shown to result from HGT

159 [11]. We hypothesize that the invasion and accumulation of TEs in microsporidian genomes is

160 facilitated by their intracellular mode of life and by the low power of purifying selection. We

161 further hypothesize that the latter is caused by low Ne in association with vertical

162 transmission. Microsporidia use vertical transmission as a survival strategy when

163 opportunities to transmit horizontally are reduced [31], thus being of particular importance

164 during times of low density or the colonization of new host population, both of which cause

165 population bottlenecks. Therefore, this mode of transmission may reduce Ne and thus the

166 strength of natural selection [8]. Species with larger mobilomes, such as N. bombycis and

167 Hamiltosporidium spp. show mixed mode transmission (a combination of horizontal and

168 vertical transmission) [32,33]. Microsporidia with mixed-mode transmission have, on

169 average, a higher proportion of TEs than species with horizontal transmission. The difference

170 in TE percentage between species capable of vertical transmission and those showing only

171 horizontal transmission is statistically significant (Wilcoxon rank sum test, p-value <0.0006;

172 Fig. 2B). The difference in genome size between these two groups is also significant (Wilcoxon

173 rank sum test, p-value=1.77e-05; Fig. 2B), with vertically transmitted species having larger

174 genome sizes, on average.

175

176 On closer inspection, one can see that some microsporidia that use vertical transmission do

177 exhibit a low proportion of TEs (such as A. algerae PRA109, PRA339 and E. aedis; Fig. 1;

178 Additional files 2 and 4). While these examples are few and do not reverse the overall

8 179 statistical signal, they seemingly contradict our hypothesis. It is unclear if these low

180 abundances are genuine or artefacts resulting from current methodological approaches. TEs

181 are a challenge for genome assembly and annotation, and some sequencing pipelines may

182 discard TEs during the assembly process and/or later during the filtering of contaminants

183 [34]. Different sequencing platforms and pipelines were used to sequence and assemble the

184 genomes used in our study, which might have interfered in subsequent TE annotation. Most

185 TE annotation methods rely on -based approaches that limit the discovery of novel

186 families of TEs, or require previous knowledge of structural features of TEs, being prone to

187 false negatives [35]. Moreover, highly degenerated TEs without recognizable structural

188 features can be missed from annotation [36,37].

189

190 Taken together with our previous study [8] the here presented results show that small

191 effective population sizes, and thus increased levels of genetic drift, are correlated with

192 genome expansions and the spread of TEs in microsporidia. Thus, we suggest that

193 nonadaptive forces play a strong role in shaping the evolution of genome size in populations

194 of microsporidian parasites [13].

195

196 Materials and Methods

197

198 TE annotation

199 For de novo identification of TE in all genomes we used RepeatModeler software with default

200 parameters. RepeatModeler creates consensus sequences for each TE family and classifies

201 based in homology. The output is TE lineage-specific libraries for each genome. We combined

9 202 these TE lineage-specific libraries with all TE from fungi obtained from Repbase to be used in

203 RepeatMasker software [38,39]. RepeatMasker is a homology-based method that screens

204 genomic sequences and annotates TEs by using BLAST (e-value threshold of 1e-10) as search

205 engine. Low complexity DNA sequences and simple repeats were not masked. The annotation

206 of Miniature Inverted-Repeat Transposable Elements (MITEs) insertions was performed with

207 MITE Tracker [40]⁠. The software identifies elements with valid Terminal Inverted Repeats

208 (TIRs) between 50 and 800 nt, and Target Site Duplications (TSDs). MITE candidates are

209 filtered by flanking sequence (sequences outside the TSDs with 50 nt in length) similarity and

210 grouped into families. MITEs undetected or classified as “unknown” by RepeatModeler were

211 added or reclassified in the TE lineage-specific libraries. The tool “One code to find them all”

212 [41]⁠ was used to filtrate retrotransposons longer than 80 bp and DNA transposons longer

213 than 50 bp, with >80% of identity with the reference sequences. The output summarizes the

214 number of TE copies and genome coverage for each TE family.

215

216 Kimura distance and genome coverage

217 The Kimura 2-Parameter (K2P) divergence metric was calculated for TE sequences present in

218 all genomes analyzed with RepeatMasker package RepeatLandscape [39,42].

219 RepeatLandscape creates graphs depicting the contribution of repeat classes for each genome

220 assembly and K2P distance from the consensus TE sequences contained in the RepeatMasker

221 libraries.

222

223 Highly conserved proteins for phylogenetic analyses

10 224 We searched the 47 genome assemblies for highly conserved proteins [1,2] using BLASTp (e-

225 value threshold of 1e-15). The conserved proteins needed to be present in at least half of the

226 analyzed proteomes for not being removed in the multiple alignment trimming phase. We

227 identified 37 conserved proteins in at least 50% of the proteomes (Additional file 3).

228

229 Genome defense mechanisms against TEs

230 We searched for proteins related to genome defense mechanisms against TE [43]. BLASTp

231 searches with e-value threshold of 1e-5 were performed for using Neurospora crassa proteins

232 as queries: RNA-dependent RNA polymerase (Q1K6C4), DNA repair protein RAD51 (Q1K929),

233 DNA repair protein RAD52 (Q9HGI2), DNA repair protein Rad54 (Q9P978), Replication

234 protein A (Q1K7R9), RNA helicase (Q7S3A3), Masc1 (Methyltransferase from Ascobolus 1)

235 (O13369), RID (RIP defective) (V5INW3). BLASTp searches with e-value threshold of 1e-8

236 were performed using Hamiltosporidium tvaerminnensis FI-OER-3-3 Dicer (A0A4Q9LBM7)

237 and Argonaute (A0A4Q9L0B4) as queries (Additional file 6).

238

239 Phylogenies

240 Protein sequences were aligned with MUSCLE and uninformative columns in the alignments,

241 containing at least 50% of gaps, were trimmed using Geneious Prime 2019.0.4 [44,45]. The

242 best amino acid substitution models fitting our data were selected with ProtTest 3.4.2 [46].

243 The 37 conserved proteins were concatenated, and the final alignment contained 12,467 sites.

244 The best model according with the AIC and BIC criteria was VT+I+G. Argonaute and Dicer

245 proteins were concatenated and the final alignment contained 1,822 sites. The best model was

11 246 LG+I+G. The maximum likelihood phylogenies were estimated using in RAxML 8.2.10 [47],

247 and the statistical support for each branch was estimated by 1,000 bootstrap replicates.

248

249 Statistical tests

250 In order to test correlations between TE content and genome size all data were

251 logarithmically transformed. We applied Shapiro-Wilks method [48] to test the normality of

252 our data with the function shapiro.test() in R version 3.6.2. The data were not normally

253 distributed therefore nonparametric tests were applied. The Spearman correlation between

254 TE percentage in the genome and genome size, as well as total CDS length and genome size,

255 was determined by using the function cor.test() in R [49]. The package ape in R [50] was used

256 to perform a regression analysis between the amount of TEs and genome size using the

257 function fit.pic(). We performed a Wilcoxon rank sum test with the function wilcox.test() in R

258 to test for a difference in TE percentage between lineages horizontally transmitted vs. lineages

259 with mixed-mode transmission, and genome size between lineages horizontally transmitted

260 and lineages with mixed-mode transmission (Additional file 9). We also tested the difference

261 in TE percentage between genomes encoding for both Dicer and Argonaute proteins vs.

262 genomes where these proteins are absent. The plots were constructed using ggplot2 package

263 in R [51].

264

265 Abbreviations

266

267 HGT:

268 K2P: Kimura 2-Parameter distance

12 269 RIP: Repeat-induced Point mutation

270 MIP: Methylation Induced Premeiotically

271 MITE: Miniature Inverted-repeat Transposable Element

272 TIR: Terminal Inverted Repeat

273 TSD: Target Site Duplication

274 CDS: Coding sequence

275 AIC: Akaike Information Criterion

276 BIC: Bayesian Information Criterion

277

278 References

279

280 1. Capella-Gutierrez S, Marcet-Houben M, Gabaldon T. Phylogenomics supports microsporidia 281 as the earliest diverging clade of sequenced fungi. BMC Biology. 2012;10:47.

282 2. Haag KL, James TY, Pombert J-F, Larsson R, Schaer TMM, Refardt D, et al. Evolution of a 283 morphological novelty occurred before genome compaction in a lineage of extreme parasites. 284 PNAS. 2014;111:15480–5.

285 3. James TY, Pelin A, Bonen L, Ahrendt S, Sain D, Corradi N, et al. Shared signatures of 286 and phylogenomics unite cryptomycota and microsporidia. Curr Biol. 287 2013;23:1548–53.

288 4. Corradi N, Pombert J-F, Farinelli L, Didier ES, Keeling PJ. The complete sequence of the 289 smallest known nuclear genome from the microsporidian Encephalitozoon intestinalis. Nat 290 Commun. 2010;1:77.

291 5. Katinka MD, Duprat S, Cornillot E, Méténier G, Thomarat F, Prensier G, et al. Genome 292 sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature. 293 2001;414:450–3.

294 6. Williams BA, Lee RC, Becnel JJ, Weiss LM, Fast NM, Keeling PJ. Genome sequence surveys of 295 Brachiola algerae and Edhazardia aedis reveal microsporidia with low gene densities. BMC 296 Genomics. 2008;9:200.

13 297 7. Keeling P, Fast NM, Corradi N. Microsporidian genome structure and function. In: Louis M. 298 Weiss, Becnel JJ, editors. Microsporidia: Pathogens of Opportunity. John Wiley & Sons; 2014. p. 299 221–9.

300 8. Haag KL, Pombert J-F, Sun Y, de Albuquerque NRM, Batliner B, Fields P, et al. Microsporidia 301 with vertical transmission were likely shaped by nonadaptive processes. Genome Biol Evol. 302 2020;12:3599–614.

303 9. Corradi N, Haag KL, Pombert J-F, Ebert D, Keeling PJ. Draft genome sequence of the 304 pathogen Octosporea bayeri: insights into the gene content of a large microsporidian genome 305 and a model for host-parasite interactions. Genome Biol. 2009;10:R106.

306 10. Pan G, Xu J, Li T, Xia Q, Liu S-L, Zhang G, et al. Comparative genomics of parasitic silkworm 307 microsporidia reveal an association between genome expansion and host . BMC 308 Genomics. 2013;14:186.

309 11. Parisot N, Pelin A, Gasc C, Polonais V, Belkorchia A, Panek J, et al. Microsporidian genomes 310 harbor a diverse array of transposable elements that demonstrate an ancestry of horizontal 311 exchange with metazoans. Genome Biol Evol. 2014;6:2289–300.

312 12. Hua-Van A, Le Rouzic A, Boutin TS, Filée J, Capy P. The struggle for life of the genome’s 313 selfish architects. Biology Direct. 2011;6:19.

314 13. Lynch M, Conery JS. The origins of genome complexity. Science. 2003;302:1401–4.

315 14. Slotkin RK, Martienssen R. Transposable elements and the epigenetic regulation of the 316 genome. Nature Reviews Genetics. 2007;8:272–85.

317 15. Song MJ, Schaack S. Evolutionary Conflict between Mobile DNA and Host Genomes. The 318 American Naturalist. 2018;192:263–73.

319 16. Peccoud J, Loiseau V, Cordaux R, Gilbert C. Massive horizontal transfer of transposable 320 elements in insects. PNAS. 2017;114:4721–6.

321 17. Schaack S, Gilbert C, Feschotte C. Promiscuous DNA: horizontal transfer of transposable 322 elements and why it matters for eukaryotic evolution. Trends in Ecology & Evolution. 323 2010;25:537–46.

324 18. Panaud O. Horizontal transfers of transposable elements in eukaryotes: The flying genes. 325 Comptes Rendus Biologies. 2016;339:296–9.

326 19. Mira A, Moran NA. Estimating population size and transmission bottlenecks in maternally 327 transmitted endosymbiotic bacteria. Microb Ecol. 2002;44:137–43.

328 20. Sheikh-Jabbari E, Hall MD, Ben-Ami F, Ebert D. The expression of for a mixed- 329 mode transmitted parasite in a diapausing host. Parasitology. 2014;141:1097–107.

14 330 21. Ohta T. The nearly neutral theory of molecular evolution. Annual Review of Ecology and 331 Systematics. 1992;23:263–86.

332 22. Vossbrinck CR, Debrunner-Vossbrinck BA. Molecular phylogeny of the Microsporidia: 333 ecological, ultrastructural and taxonomic considerations. Folia Parasitol. 2005;52:131–42; 334 discussion 130.

335 23. Pombert J-F, Haag KL, Beidas S, Ebert D, Keeling PJ. The Ordospora colligata genome: 336 evolution of extreme reduction in microsporidia and host-to-parasite horizontal gene 337 transfer. mBio. 2015;6:e02400-14.

338 24. Huang Q. Evolution of Dicer and Argonaute orthologs in microsporidian parasites. 339 , Genetics and Evolution. 2018;65:329–32.

340 25. Felsenstein J. Phylogenies and the Comparative Method. The American Naturalist. 341 1985;125:1–15.

342 26. Feschotte C, Mouchès C. Evidence that a family of Miniature Inverted-Repeat Transposable 343 Elements (MITEs) from the Arabidopsis thaliana genome has arisen from a pogo-like DNA 344 transposon. Mol Biol Evol. 2000;17:730–7.

345 27. Pombert J-F, Selman M, Burki F, Bardell FT, Farinelli L, Solter LF, et al. Gain and loss of 346 multiple functionally related, horizontally transferred genes in the reduced genomes of two 347 microsporidian parasites. PNAS. 2012;109:12638–43.

348 28. Corradi N. Microsporidia: Eukaryotic intracellular parasites shaped by gene loss and 349 horizontal gene transfers. Annual Review of Microbiology. 2015;69:null.

350 29. Selman M, Pombert J-F, Solter L, Farinelli L, Weiss LM, Keeling P, et al. Acquisition of an 351 animal gene by microsporidian intracellular parasites. Current Biology. 2011;21:R576–7.

352 30. Tsaousis AD, Kunji ERS, Goldberg AV, Lucocq JM, Hirt RP, Embley TM. A novel route for 353 ATP acquisition by the remnant mitochondria of Encephalitozoon cuniculi. Nature. 354 2008;453:553–6.

355 31. Turner PE, Cooper VS, Lenski RE. Tradeoff between horizontal and vertical modes of 356 transmission in bacterial plasmids. Evolution. 1998;52:315–29.

357 32. Han M-S, Watanabe H. Transovarial transmission of two microsporidia in the silkworm, 358 Bombyx mori, and disease occurrence in the progeny population. Journal of Invertebrate 359 Pathology. 1988;51:41–5.

360 33. Vizoso DB, Ebert D. Phenotypic plasticity of host–parasite interactions in response to the 361 route of infection. Journal of Evolutionary Biology. 2005;18:911–21.

362 34. Goerner-Potvin P, Bourque G. Computational tools to unmask transposable elements. Nat 363 Rev Genet. 2018;19:688–704.

15 364 35. Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, et al. Benchmarking transposable 365 element annotation methods for creation of a streamlined, comprehensive pipeline. Genome 366 Biology. 2019;20:275.

367 36. Juretic N, Bureau TE, Bruskiewich RM. Transposable element annotation of the rice 368 genome. Bioinformatics. 2004;20:155–60.

369 37. Bergman CM, Quesneville H. Discovering and detecting transposable elements in genome 370 sequences. Brief Bioinform. 2007;8:382–92.

371 38. Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in 372 eukaryotic genomes. Mob DNA. 2015;6:11.

373 39. Smit A, Hubley R, Green P. RepeatMasker [Internet]. Institute for Systems Biology; 2013. 374 Available from: http://www.repeatmasker.org/RepeatModeler/

375 40. Crescente JM, Zavallo D, Helguera M, Vanzetti LS. MITE Tracker: an accurate approach to 376 identify miniature inverted-repeat transposable elements in large genomes. BMC 377 Bioinformatics. 2018;19:348.

378 41. Bailly-Bechet M, Haudry A, Lerat E. “One code to find them all”: a perl tool to conveniently 379 parse RepeatMasker output files. Mobile DNA. 2014;5:13.

380 42. Kimura M. A simple method for estimating evolutionary rates of base substitutions 381 through comparative studies of nucleotide sequences. J Mol Evol. 1980;16:111–20.

382 43. Gladyshev E. Repeat-induced point mutation and other genome defense mechanisms in 383 fungi. The Fungal Kingdom [Internet]. John Wiley & Sons, Ltd; 2017 [cited 2020 Feb 22]. p. 384 687–99. Available from: 385 https://onlinelibrary.wiley.com/doi/abs/10.1128/9781555819583.ch33

386 44. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: 387 An integrated and extendable desktop software platform for the organization and analysis of 388 sequence data. Bioinformatics. 2012;28:1647–9.

389 45. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. 390 Nucleic Acids Res. 2004;32:1792–7.

391 46. Darriba D, Taboada GL, Doallo R, Posada D. ProtTest 3: fast selection of best-fit models of 392 protein evolution. Bioinformatics. 2011;27:1164–5.

393 47. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large 394 phylogenies. Bioinformatics. 2014;30:1312–3.

395 48. Shapiro SS, Wilk MB. An analysis of variance test for normality (Complete Samples). 396 Biometrika. 1965;52:591–611.

16 397 49. R Core Team. R: The R Project for Statistical Computing [Internet]. 2019 [cited 2020 Feb 398 22]. Available from: https://www.r-project.org/

399 50. Paradis E, Schliep K. ape 5.0: an environment for modern phylogenetics and evolutionary 400 analyses in R. Bioinformatics. Oxford Academic; 2019;35:526–8.

401 51. Wickham H. ggplot2: Elegant Graphics for Data Analysis. 1st ed. 2009. Corr. 3rd printing 402 2010 edition. New York: Springer; 2010.

403

404 Declarations

405

406 Availability of data and materials

407 The datasets generated as part of the study are available as supplementary information.

408 Please contact the authors for further data request.

409

410 Funding

411 This work was supported by the Swiss National Science Foundation grant numbers

412 31003A_146462 and 310030B_166677 to D.E., Conselho Nacional de Desenvolvimento

413 Científico e Tecnológico (CNPq) grant PQ 302121/2017-0 to K.L.H, and fellowship to N.R.M.A.

414

415 Competing interests

416 The authors declare that they have no competing interests.

417

418 Authors’ contributions

419 N.R.M.A. acquired and analyzed the data, designed the study, interpreted the data, and wrote

420 the manuscript. D.E. interpreted the data, and wrote the manuscript. K.L.H. designed the

17 421 study, interpreted the data, and wrote the manuscript. All the authors approved the final

422 manuscript.

423

424 Acknowledgements

425 We especially thank to Sidia Maria Callegari Jacques for her help with the statistical analyses,

426 and to Jean-François Pombert for the critical reading of this manuscript.

427

428 Ethics approval and consent to participate

429 Not applicable

430

431 Consent for publication

432 Not applicable

433

434 Figure captions

435

436 Figure 1 – TE dynamics in the history of microsporidia. (A) Assembly sizes (Mb) and their

437 percentages of distinct TE classes are displayed next to the phylogeny of 47 microsporidia.

438 The tree was inferred using 37 highly conserved proteins; only bootstrap support values

439 lower than 90% are shown. Known modes of transmission are indicated in bold letters:

440 H=horizontal; HV=horizontal and vertical; HS=Horizontal sexual. (B) Age distribution of

441 different classes of TEs; larger Kimura distances (K2P) represent older elements. (C)

442 Phylogeny inferred using concatenated Argonaute and Dicer proteins from 25 lineages of

443 microsporidia. Only lineages where both proteins were found have been included in the

18 444 analysis. Rozella alomycis, an early diverging [3], is used as root in both (A) and (B)

445 phylogenies⁠. The major clades of microsporidia bearing both Argonaute and Dicer are

446 highlighted in different colors.

447

448 Figure 2 – TE accumulation associated with parasite vertical transmission. (A) Spearman

449 correlations of assembly sizes (Mb) with their coding (CDS) and TE fractions. (B)

450 Distributions of assembly sizes and TE fractions in lineages grouped according to the

451 presence or absence of both Argonaute and Dicer, and to the presence or absence of vertical

452 transmission.

453

454 Additional files

455

456 Additional file 1 – Microsporidian assemblies analyzed in our study.

457

458 Additional file 2 – Content of distinct TE classes in the microsporidian assemblies.

459

460 Additional file 3 – List of the 37 highly conserved proteins employed for phylogenetic

461 reconstruction.

462

463 Additional file 4 – Distribution of the Kimura 2-Parameter (K2P) divergence metric calculated

464 for all TE sequences present in all genomes analyzed in our study.

465

466 Additional file 5 – Frequency of subtypes of TEs in microsporidian mobilomes.

19 467

468 Additional file 6 – Results of BLASTp searches for proteins involved in genome defense

469 mechanisms against TEs.

470

471 Additional file 7 – Regression analyses of the amount of TEs in relation to genome size with

472 and without phylogenetic independent contrasts.

473

474 Additional file 8 – Protein alignments of N. bombycis vs. P. xuthus Argonaute, and N. apis vs. P.

475 xuthus Argonaute.

476

477 Additional file 9 – Known transmission modes of the microsporidia included in our study.

20 Mb A B Anncaliia algerae PRA109 A. algerae Undeen 11.85 Rozella allomycis 0.24 1.00 Mitosporidium daphniae H 5.63 0.18 0.75 Amphiamblys sp. 5.62

%TE 0.12 0.50 2.29 Encephalitozoon cuniculi EcunIII-L H 0.06 0.25 2.26 E. cuniculi EC3 H 0.00 0.00 E. cuniculi EC2 H 2.28 0 25 50 0 25 50 E. cuniculi GBM1 H 2.49 A. algerae PRA339 0.16 E. cuniculi EC1 H 2.28 0.12 E. intestinalis H 2.21

%TE 0.08 E. hellem H 2.25 E. romaleae H 2.18 0.04 0.00 Ordospora colligata FI-SK-17-1 H 2.26 0 25 50 O. colligata OC4 H 2.29 Nosema bombycis N. apis 64 O. colligata GB-EP-1 H 2.30 2.0 0.30 45 O. colligata NO-V-7 H 2.32 1.5 0.22

Nosema bombycis HV 15.68 1.0 0.15 %TE N. apis HS 8.56 0.5 0.07 N. ceranae BRL01 HS 7.86 0.0 0.00 N. ceranae PA08 HS 5.69 0 25 50 00 2525 5050 N. ceranae BRL01 N. ceranae PA08 Vittaforma corneae 3.21 0.8 0.16 Hepatospora eriocheir GB1 4.57 0.6 0.12 H. eriocheir Canceri 4.83 0.4 0.08

Enterospora canceri 3.10 %TE 0.2 0.04 Enterocytozoon hepatopenaei H 3.25 67 0.00 E. bieneusi HV 3.86 0 25 50 0 25 50 Anncaliia algerae PRA339 HV 12.16 K2P K2P A. algerae Undeen HV 13.82 A. algerae PRA109 HV 17.53 C 79 Pseudoloma neurophilia HV 5.24 Rozella allomycis Vavraia culicus H 6.11 Mitosporidium daphniae Anncaliia algerae PRA339 Trachipleistophora hominis 8.49 A. algerae Undeen Spraguea lophii EM120 H 5.76 Pseudoloma neurophilia S. lophii 42_110 H 4.98 Trachipleistophora hominis S. lophii Celtic deep H 5.76 79 Vavraia culicus S. lophii North Atlantic H 5.85 Spraguea lophii 42_110 S. lophii RA12034 H 5.80 S. lophii RA12034 S. lophii North Atlantic Edhazardia aedis HV 51.35 54 S. lophii EM120 Hamiltosporidium magnivora BEOM2 V 20.73 S. lophii Celtic deep H. tvaerminnensis FIOER33 HV 18.34 Hamiltosporidium tvaerminnensis ILG3 H. tvaerminnensis ILG3 HV 25.20 H. tvaerminnensis FIOER33 45 80 H. magnivora ILBN2 V 17.19 H. magnivora BEOM2 Nematocida displodere H 3.06 H. magnivora ILBN2 Nematocida sp. ERTm6 H 4.27 Edhazardia aedis Nematocida sp. ERTm2 H 4.70 Nosema ceranae PA08 52 N. ceranae BRL01 Nematocida sp. ERTm5 H 4.39 N. apis N. parisii ERTm3 H 4.14 N. bombycis N. parisii ERTm1 H 4.07 Vittaforma corneae 0 5 15 25 35 % Hepatospora eriocheir GB1 Enterospora canceri Enterocytozoon hepatopenaei Unknown LINE LTR DNA 3.0 A 2.00 Enc. intestinalis 1.5 No. bombycis

1.75 1.0 Enc. cuniculi Ed. aedis 1.50 GB-M1

0.5 Log10(%TE) Log10(%CDS) 1.25 No. bombycis

0.0 A. algerae PRA339 1.00 Ed. aedis Enc. cuniculi EC2 0.5 1.0 1.5 0.5 1.0 1.5 Log10(Assembly size [Mb]) Log10(Assembly size [Mb])

1 10 20 30 40 50 Mb B

50 50 30 30 40 40

30 20 30 20

20 %TE

20 %TE 10 10 Assembly size (Mb) Assembly size (Mb) 10 10

0 0 0 0 Absence Presence Absence Presence Absence Presence Absence Presence Dicer and Argonaute Dicer and Argonaute Vertical transmission Vertical transmission Additional file 1: List of microsporidian assemblies and genome parameters.

Lineage GenBank assembly accession Size (Mb) CDS (Mb) GC% Scaffolds Gene Protein Amphiamblys sp. GCA_001875675.1 5.6201 3.57 50.2 1727 3695 3642 Anncaliia algerae PRA109 GCA_000385855.2 17.5349 3.03 28.60 7113 5307 5245 Anncaliia algerae PRA339 GCA_000385875.2 12.1634 2.88 28.10 431 3659 3598 Anncaliia algerae Undeen GCA_000313815.1 13.8234 2.15 25.50 8427 - - Edhazardia aedis GCA_000230595.3 51.3480 4.44 22.8 1429 4262 4190 Encephalitozoon cuniculi EC1 GCA_000221285.2 2.2881 - 46.90 71 - - Encephalitozoon cuniculi EC2 GCA_000221265.2 2.2812 - 46.80 54 - - Encephalitozoon cuniculi EC3 GCA_000221245.2 2.2647 - 46.80 60 - - Encephalitozoon cuniculi EcunIII-L GCA_001078035.1 2.2923 1.96 46.90 15 1896 1834 Encephalitozoon cuniculi GB-M1 GCA_000091225.1 2.4975 2.15 47.30 11 2029 1996 Encephalitozoon hellem GCA_000277815.3 2.2517 1.98 43.36 12 - 1847 Encephalitozoon intestinalis GCA_000146465.1 2.2169 1.98 41.5182 11 - 1938 Encephalitozoon romaleae GCA_000280035.2 2.1875 1.94 40.3432 13 - 1831 Enterocytozoon bieneusi GCA_000209485.1 3.8607 2.61 33.7 1733 3806 3632 Enterocytozoon hepatopenaei GCA_002081675.1 3.2535 2.33 25.50 62 2536 2536 Enterospora canceri GCA_002087915.1 3.0954 1.82 40.2 537 2178 2169 Hamiltosporidium magnivora BE-OM-2 GCA_004325065.1 20.7341 4.9 25.80 3550 4658 4658 Hamiltosporidium magnivora IL-BN-2 GCA_004325035.1 17.1898 4.278 25.70 3833 4180 4180 Hamiltosporidium tvaerminnensis FI-OER-3-3 GCA_004325045.1 18.3444 4.518 25.80 2915 4121 4121 Hamiltosporidium tvaerminnensis IL-G-3 GCA_004325075.1 25.2027 6.643 26.10 2738 6203 6203 Hepatospora eriocheir Canceri GCA_002087875.1 4.8348 1.983 23.20 2344 3058 3052 Hepatospora eriocheir GB1 GCA_002087885.1 4.5703 1.891 22.40 1300 2716 2691 Mitosporidium daphniae GCA_000760515.1 5.6370 3.831 43 612 3330 3330 Nematocida displodere GCA_001642395.1 3.0664 2.64 49.1 42 2278 2278 Nematocida parisii ERTm1 GCA_000250985.1 4.0713 2.936 34.60 65 2722 2661 Nematocida parisii ERTm3 GCA_000190615.1 4.1476 2.875 34.50 53 2788 2726 Nematocida sp. 1 ERTm2 GCA_000250695.1 4.7007 3.002 38.40 202 2831 2770 Nematocida sp. ERTm5 GCA_001642415.1 4.3976 3.204 33.5 186 2709 2709 Nematocida sp. 1 ERTm6 GCA_000738915.1 4.2760 2.886 38.40 24 2491 2433 Nosema apis BRL 01 GCA_000447185.1 8.5695 2.28 18.9 554 2764 2764 Nosema bombycis CQ1 GCA_000383075.1 15.6898 3.34 32.2 1607 4468 4468 Nosema ceranae BRL01 GCA_000182985.1 7.8602 2.023 25.30 5465 2376 2060 Nosema ceranae PA08 GCA_000988165.1 5.6907 2.51 25.40 536 3264 3209 Ordospora colligata FI-SK-17-1 GCA_004325055.1 2.2600 1.927 38.7 26 1849 1800 Ordospora colligata GB-EP-1 GCA_004324935.1 2.3000 1.939 38.70 18 1857 1808 Ordospora colligata NO-V-7 GCA_004324945.1 2.3200 1.946 38.60 21 1862 1813 Ordospora colligata OC4 GCA_000803265.1 2.2905 1.948 38.5 15 1879 1820 Pseudoloma neurophilia GCA_001432165.1 5.2484 2.885 29.6 1603 3645 3644 Rozella allomycis GCA_000442015.1 11.8593 7.439 35 1059 6350 6350 Spraguea lophii 42_110 GCA_000430065.1 4.9809 2.539 23.40 1392 2585 2499 Spraguea lophii Celtic Deep GCA_001887945.1 5.7648 - 23.20 390 - - Spraguea lophii EM120 GCA_001887935.1 5.7648 - 23.20 390 - - Spraguea lophii North Atlantic GCA_001888035.1 5.8500 - 23.20 439 - - Spraguea lophii RA12034 GCA_001887975.1 5.8022 - 23.10 422 - - Trachipleistophora hominis GCA_000316135.1 8.4982 2.809 35.3 310 3253 3212 Vavraia culicis GCA_000192795.1 6.1187 2.869 39.9 379 2825 2773 Vittaforma corneae GCA_000231115.1 3.2135 2.15 36.7 220 2293 2239 Additional file 2: Content of distinct TE classes in each lineage.

Tes LTR DNA LINE Unknown Lineage Genome size (Mb) % Total Fragments Total_Mb % TE in genome Fragments Total_Mb % TE in genome Fragments Total_Mb % TE in genome Fragments Total_Mb % TE in genome Amphiamblys sp. 5.6201 23.4249 599 0.1791 3.1870 3955 0.8192 14.5766 1535 0.3182 5.6613 0 0.0000 0.0000 Anncaliia algerae PRA109 17.5349 1.6448 81 0.0001 0.0006 1778 0.2883 1.6442 0 0.0000 0.0000 0 0.0000 0.0000 Anncaliia algerae PRA339 12.1634 1.3454 137 0.0000 0.0000 1382 0.1636 1.3454 0 0.0000 0.0000 0 0.0000 0.0000 Anncaliia algerae Undeen 13.8234 15.7476 1159 0.2766 2.0007 6804 1.9003 13.7470 0 0.0000 0.0000 0 0.0000 0.0000 Edhazardia aedis 51.3480 8.2129 337 0.0541 0.1054 21525 4.1399 8.0625 69 0.0231 0.0450 0 0.0000 0.0000 Encephalitozoon cuniculi EC1 2.2881 1.1104 0 0.0000 0.0000 95 0.0254 1.1104 0 0.0000 0.0000 0 0.0000 0.0000 Encephalitozoon cuniculi EC2 2.2812 0.5446 0 0.0000 0.0000 74 0.0124 0.5446 0 0.0000 0.0000 0 0.0000 0.0000 Encephalitozoon cuniculi EC3 2.2647 0.7918 0 0.0000 0.0000 65 0.0179 0.7918 0 0.0000 0.0000 0 0.0000 0.0000 Encephalitozoon cuniculi EcunIII-L 2.2923 #VALUE! ------Encephalitozoon cuniculi GB-M1 2.4975 8.9475 0 0.0000 0.0000 185 0.2090 8.3701 0 0.0000 0.0000 4 0.0144 0.5774 Encephalitozoon hellem 2.2517 #VALUE! ------Encephalitozoon intestinalis 2.2169 #VALUE! ------Encephalitozoon romaleae 2.1875 #VALUE! ------Enterocytozoon bieneusi 3.8607 12.8582 232 0.0342 0.8847 410 0.3960 10.2577 246 0.0662 1.7158 0 0.0000 0.0000 Enterocytozoon hepatopenaei 3.2535 2.3494 208 0.0764 2.3494 0 0.0000 0.0000 0 0.0000 0.0000 0 0.0000 0.0000 Enterospora canceri 3.0954 4.0685 47 0.0104 0.3363 253 0.1059 3.4215 23 0.0096 0.3107 0 0.0000 0.0000 Hamiltosporidium magnivora BE-OM-2 20.7341 12.7582 64 0.0164 0.0791 10321 2.0242 9.7628 3986 0.6047 2.9163 0 0.0000 0.0000 Hamiltosporidium magnivora IL-BN-2 17.1898 11.0000 54 0.0200 0.1166 7964 1.4179 8.2482 3299 0.4530 2.6352 0 0.0000 0.0000 Hamiltosporidium tvaerminnensis FI-OER-3-3 18.3444 10.5686 96 0.0239 0.1300 9463 1.4799 8.0671 3268 0.4350 2.3714 0 0.0000 0.0000 Hamiltosporidium tvaerminnensis IL-G-3 25.2027 15.2094 104 0.0252 0.1002 13161 2.6040 10.3322 4977 1.2020 4.7695 8 0.0019 0.0075 Hepatospora eriocheir Canceri 4.8348 13.2159 168 0.0379 0.7835 1536 0.6011 12.4324 0 0.0000 0.0000 0 0.0000 0.0000 Hepatospora eriocheir GB1 4.5703 8.2339 100 0.0173 0.3790 1310 0.2620 5.7325 0 0.0000 0.0000 466 0.0970 2.1224 Mitosporidium daphniae 5.6370 2.3523 46 0.0007 0.0117 372 0.1319 2.3406 0 0.0000 0.0000 0 0.0000 0.0000 Nematocida displodere 3.0664 3.3685 103 0.1033 3.3685 0 0.0000 0.0000 0 0.0000 0.0000 0 0.0000 0.0000 Nematocida parisii ERTm1 4.0713 2.3924 10 0.0000 0.0000 393 0.0974 2.3924 0 0.0000 0.0000 0 0.0000 0.0000 Nematocida parisii ERTm3 4.1476 5.5851 8 0.0000 0.0000 826 0.2074 5.0015 112 0.0242 0.5836 0 0.0000 0.0000 Nematocida sp. 1 ERTm2 4.7007 8.8987 6 0.0000 0.0000 819 0.4183 8.8987 0 0.0000 0.0000 0 0.0000 0.0000 Nematocida sp. ERTm5 4.3976 3.3543 8 0.0000 0.0000 495 0.1475 3.3543 0 0.0000 0.0000 0 0.0000 0.0000 Nematocida sp. 1 ERTm6 4.2760 3.4072 5 0.0001 0.0019 365 0.1456 3.4053 0 0.0000 0.0000 0 0.0000 0.0000 Nosema apis BRL 01 8.5695 6.6645 186 0.0271 0.3166 956 0.2756 3.2158 1072 0.2684 3.1322 0 0.0000 0.0000 Nosema bombycis CQ1 15.6898 33.4147 1140 0.2545 1.6224 13133 4.8017 30.6039 436 0.1852 1.1804 58 0.0013 0.0080 Nosema ceranae BRL01 7.8602 21.6436 234 0.0486 0.6180 5359 1.6463 20.9441 45 0.0064 0.0815 0 0.0000 0.0000 Nosema ceranae PA08 5.6907 6.4726 224 0.0748 1.3152 726 0.2755 4.8405 96 0.0180 0.3170 0 0.0000 0.0000 Ordospora colligata FI-SK-17-1 2.2600 ------Ordospora colligata GB-EP-1 2.3000 ------Ordospora colligata NO-V-7 2.3200 ------Ordospora colligata OC4 2.2905 ------Pseudoloma neurophilia 5.2484 4.8244 612 0.0971 1.8495 627 0.1329 2.5329 180 0.0232 0.4419 0 0.0000 0.0000 Rozella allomycis 11.8593 1.4946 650 0.0826 0.6969 713 0.0946 0.7977 0 0.0000 0.0000 0 0.0000 0.0000 Spraguea lophii 42_110 4.9809 3.2143 118 0.0076 0.1534 1143 0.1479 2.9685 108 0.0046 0.0924 0 0.0000 0.0000 Spraguea lophii Celtic Deep 5.7648 6.6857 237 0.0076 0.1317 1351 0.3778 6.5540 0 0.0000 0.0000 0 0.0000 0.0000 Spraguea lophii EM120 5.7648 6.3706 12 0.0001 0.0016 1349 0.3607 6.2571 230 0.0064 0.1118 0 0.0000 0.0000 Spraguea lophii North Atlantic 5.8500 8.1766 310 0.0301 0.5141 1337 0.4350 7.4365 208 0.0132 0.2261 0 0.0000 0.0000 Spraguea lophii RA12034 5.8022 7.1791 11 0.0002 0.0037 1434 0.3977 6.8550 285 0.0186 0.3204 0 0.0000 0.0000 Trachipleistophora hominis 8.4982 2.2363 30 0.0004 0.0047 2133 0.1576 1.8550 478 0.0320 0.3766 0 0.0000 0.0000 Vavraia culicis 6.1187 3.6481 11 0.0005 0.0083 617 0.2227 3.6398 0 0.0000 0.0000 0 0.0000 0.0000 Vittaforma corneae 3.2135 3.1654 3 0.0000 0.0000 206 0.1017 3.1654 0 0.0000 0.0000 0 0.0000 0.0000 Additional file 3: List of proteins used in philogeny reconstruction.

Protein Accession Amphiamblys sp. Anncaliia algerae PRA109 Anncaliia algerae PRA339 Anncaliia algerae Undeen Edhazardia aedis Encephalitozoon cuniculi EC1 APN1 - apurinic endonuclease sp|P22936; kegg|K10771 OIR57476.1 KCZ78017.1 KCZ79265.1 CAIR01007146.1 EJW02098.1 AEWD01000006.1 Brr2p – pre-mRNA-splicing helicase yop|YER172C; kegg|K12854 OIR57348.1 KCZ75913.1 KCZ81989.1 CAIR01005312.1 EJW01922.1 AEWD01000009.1 CDC9 - coiled-coil domain containing protein sp|P04819; kegg|K10747 OIR58482.1 KCZ73797.1 KCZ80121.1 CAIR01007793.1 EJW04879.1 AEWD01000003.1 Chitin deacetylase ref|NP_586357; sp|Q8SU65 - KCZ75820.1 KCZ80434.1 CAIR01000676.1 EJW01818.1 AEWD01000015.1 Chitin synthase I ref|XP_965977; sp|Q8SSI7 OIR59110.1 KCZ75424.1 KCZ79943.1 CAIR01000467.1 EJW04385.1 AEWD01000007.1 Dmc1p - meiotic recombination protein gb|AAB64706 OIR58962.1 KCZ78199.1 KCZ81425.1 CAIR01000894.1 EJW05219.1 AEWD01000015.1 Kin28p – Serine-threonine kinase/subunit of RNA polymerase TFIIK sp|P06242; kegg|K02202 OIR58313.1 KCZ78322.1 KCZ80572.1 CAIR01005011.1 EJW01940.1 AEWD01000003.1 Mpe1p - mRNA surveillance pathway protein yop|YKL059C; kegg|K15541 OIR58491.1 KCZ76173.1 KCZ81146.1 CAIR01001936.1 EJW02101.1 AEWD01000013.1 Mre11p - double-strand break repair protein gb|CAA90195 OIR58894.1 KCZ75155.1 KCZ79369.1 CAIR01001673.1 EJW04387.1 AEWD01000002.1 Msh4p - DNA mismatch repair protein ref|NP_116652 OIR58207.1 KCZ74784.1 KCZ82232.1 CAIR01004222.1 EJW03235.1 AEWD01000005.1 Msh5p - DNA mismatch repair protein ref|NP_010127 OIR57052.1 KCZ76853.1 KCZ79487.1 CAIR01002824.1 EJW03106.1 AEWD01000005.1 OGG2 - DNA-(apurinic or apyrimidinic site) lyase sp|P31378; kegg|K10773 OIR58571.1 KCZ75943.1 KCZ79402.1 CAIR01000860.1 - AEWD01000010.1 Pap1p – polynucleotide adenylyltransferase yop|YKR002W; kegg|K14376 OIR57278.1 KCZ74836.1 KCZ80841.1 CAIR01005782.1 EJW03519.1 AEWD01000003.1 Pfs2p – polyadenylation factor subunit 2 yop|YNL317W; kegg|K15542; NP_014082.1 OIR57158.1 KCZ74651.1 KCZ82134.1 CAIR01002476.1 EJW04962.1 AEWD01000003.1 Pol30p - DNA polymerase processivity accessory factor sp|P15873; kegg|K04802 OIR56377.1 KCZ73849.1 KCZ81594.1 CAIR01007999.1 EJW04886.1 AEWD01000002.1 Pol3p - DNA polymerase delta subunit 2 sp|P15436; kegg|K02327 OIR58514.1 - KCZ82291.1 CAIR01000143.1 EJW01429.1 AEWD01000013.1 Ppz2p – serine/threonineprotein Phosphatase yop|YDR436W; kegg|K06269 OIR59052.1 KCZ77188.1 KCZ82521.1 CAIR01000290.1 EJW01629.1 AEWD01000015.1 Prp28p - ATP-dependent RNA helicase yop|YDR243C; kegg|K12858 OIR56775.1 KCZ74643.1 KCZ80710.1 CAIR01005671.1 EJW03398.1 AEWD01000012.1 Prp4p - U4/U6 small nuclear ribonucleoprotein yop|YPR178W; kegg|K12662 OIR57158.1 KCZ74651.1 KCZ82134.1 CAIR01002030.1 EJW04962.1 AEWD01000003.1 Prp8p – pre-mRNA-processing factor 8 yop|YHR165C; kegg|K12856 - KCZ74086.1 KCZ79510.1 CAIR01006747.1 - AEWD01000001.1 Rad27p – structure-specific endonuclease sp|P26793; kegg|K04799 OIR57890.1 KCZ80819.1 - CAIR01006334.1 EJW03248.1 AEWD01000005.1 Rad3p - DNA polymerase eta sp|P06839; kegg|K10844 OIR58369.1 KCZ75279.1 KCZ79444.1 CAIR01003274.1 EJW01367.1 AEWD01000022.1 Rad51p - DNA repair protein ref|NP_011021 OIR57801.1 KCZ78199.1 KCZ79458.1 CAIR01004715.1 EJW05219.1 AEWD01000015.1 RIB72 - nucleosidediphosphokinase regulatory subunit p72 jgi|81047; tr|O81047 OIR57714.1 KCZ74343.1 KCZ82499.1 CAIR01004252.1 EJW04018.1 AEWD01000002.1 Snu114p - U4/U6 small nuclear ribonucleoprotein yop|YKL173W; kegg|K12853; NP_012748.1 OIR57247.1 KCZ75693.1 KCZ79868.1 CAIR01007340.1 EJW02121.1 AEWD01000006.1 Snu13p - U4/U6 small nuclear ribonucleoprotein yop|YEL026W; kegg|K12845 OIR57678.1 - KCZ80905.1 CAIR01007271.1 EJW05147.1 AEWD01000019.1 Spt15p – TATA-binding Protein sp|P13393; kegg|K03120 OIR57408.1 KCZ74950.1 KCZ81287.1 CAIR01004727.1 EJW04618.1 AEWD01000001.1 Ssl1p - transcription initiation factor TFIIH subunit 2 sp|Q04673; kegg|K03142 - KCZ74283.1 KCZ80298.1 CAIR01006766.1 EJW04086.1 AEWD01000001.1 Ssl2p - DNA excision repair protein sp|Q00578; kegg|K10843 OIR58268.1 KCZ77281.1 KCZ81015.1 CAIR01002836.1 EJW01911.1 AEWD01000007.1 Sua7p - RNA Pol II transcription factor TFIIB sp|P29055; kegg|K03124 OIR58104.1 KCZ75384.1 KCZ79298.1 CAIR01004903.1 EJW04304.1 AEWD01000004.1 Taf1p - TBP associated factor TFIID subunit 1 sp|P46677; kegg|K03125 OIR59081.1 KCZ75223.1 KCZ82142.1 CAIR01001331.1 EJW02845.1 AEWD01000011.1 Tfa1p – transcription initiation factor TFIIE sp|P36100; kegg|K03136 OIR58871.1 KCZ76358.1 KCZ80998.1 CAIR01008422.1 EJW01418.1 AEWD01000020.1 Tfb2p – transcription initiation factor TFIIH subunit 4/DNA excision repair sp|Q02939; kegg|K03144 OIR58219.1 KCZ77406.1 KCZ82495.1 CAIR01004032.1 EJW01474.1 AEWD01000007.1 Tfb3p - role in transcription by RNA polymerase II/DNA excision repair sp|Q03290; kegg|K10842 OIR57740.1 KCZ74590.1 KCZ81317.1 CAIR01000719.1 EJW05210.1 AEWD01000025.1 UNG1 – uracil-DNA glycosylase sp|P12887; kegg|K03648 OIR56646.1 KCZ76491.1 KCZ81821.1 CAIR01001919.1 EJW04878.1 AEWD01000011.1 Ysh1p - cleavage and polyadenylation specificity factor subunit 3 yop|YLR277C; kegg|K14403 OIR58047.1 KCZ77102.1 KCZ80052.1 CAIR01005373.1 EJW01455.1 AEWD01000008.1 Yth1p - cleavage and polyadenylation specificity factor subunit 4 yop|YPR107C; kegg|K14404 OIR57963.1 KCZ74439.1 KCZ80367.1 CAIR01007413.1 EJW01267.1 AEWD01000007.1

Additional file 3.xlsx Page 1 Encephalitozoon cuniculi EC2 Encephalitozoon cuniculi EC3 Encephalitozoon cuniculi EcunIII-L Encephalitozoon cuniculi GB-M1 Encephalitozoon hellem Encephalitozoon intestinalis Encephalitozoon romaleae Enterocytozoon bieneusi Enterocytozoon hepatopenaei AEWQ01000001.1 AEWR01000005.1 KMV65113.1 NP_586461.1 XP_003888398.1 XP_003074003.1 XP_009265622.1 XP_002649640.1 OQS54804.1 AEWQ01000007.1 AEWR01000009.1 KMV65807.1 NP_586011.1 XP_003887588.1 XP_003073201.1 XP_009264826.1 XP_001828035.1 OQS55304.1 AEWQ01000005.1 AEWR01000003.1 KMV66640.1 NP_584647.1 XP_003886850.1 XP_003072475.1 XP_009264085.1 XP_001827810.1 OQS55255.1 AEWQ01000001.1 AEWR01000017.1 KMV65008.1 NP_586357.1 XP_003888293.1 XP_003073896.1 XP_009265516.1 - OQS55075.1 AEWQ01000008.1 AEWR01000006.1 KMV66795.1 XP_965977.1 XP_003886734.1 XP_003072355.1 XP_009263964.1 XP_002651382.1 OQS54297.1 AEWQ01000001.1 AEWR01000017.1 KMV65040.1 NP_586388.1 XP_003888325.1 XP_003073929.1 XP_009265548.1 XP_001828021.1 OQS55103.1 AEWQ01000005.1 AEWR01000003.1 KMV66663.1 NP_584670.1 XP_003886873.1 XP_003072498.1 XP_009264108.1 XP_002649841.1 OQS55236.1 AEWQ01000011.1 AEWR01000012.1 KMV65513.1 XP_955586.1 XP_003887909.1 XP_003073523.1 XP_009265147.1 XP_002649383.1 OQS55191.1 AEWQ01000004.1 AEWR01000002.1 KMV66185.1 NP_597471.1 XP_003887330.1 XP_003072941.1 XP_009264561.1 XP_001827857.1 OQS55264.1 AEWQ01000009.1 AEWR01000007.1 KMV66418.1 NP_597565.1 XP_003886923.1 XP_003072549.1 XP_009264161.1 XP_002649625.1 OQS54718.1 AEWQ01000009.1 AEWR01000007.1 KMV66418.1 NP_597565.1 XP_003886923.1 XP_003072549.1 XP_009264161.1 XP_002649625.1 OQS54569.1 AEWQ01000002.1 AEWR01000008.1 KMV65616.1 NP_597218.1 XP_003887767.1 XP_003073377.1 XP_009265000.1 XP_002650318.1 OQS53714.1 AEWQ01000005.1 AEWR01000003.1 KMV66592.1 NP_584599.1 XP_003886802.1 XP_003072427.1 XP_009264037.1 XP_002649523.1 OQS53507.1 AEWQ01000005.1 AEWR01000003.1 KMV66658.1 NP_584665.1 XP_003886868.1 XP_003072493.1 XP_009264103.1 XP_001828114.1 OQS55242.1 AEWQ01000004.1 AEWR01000002.1 KMV66160.1 NP_597446.1 XP_003887303.1 XP_003072914.1 XP_009264534.1 XP_002649393.1 OQS55179.1 AEWQ01000011.1 AEWR01000012.1 KMV65523.1 XP_955596.1 XP_003887919.1 XP_003073533.2 XP_009265158.1 XP_001827844.1 OQS54890.1 AEWQ01000001.1 AEWR01000017.1 KMV65024.1 NP_586372.1 XP_003888309.1 XP_003073913.1 XP_009265532.1 XP_001828015.1 OQS55228.1 AEWQ01000002.1 AEWR01000011.1 KMV65637.1 NP_597238.1 XP_003887788.1 XP_003073400.1 XP_009265022.1 XP_002650226.1 OQS55835.1 AEWQ01000005.1 AEWR01000003.1 KMV66658.1 NP_584665.1 XP_003886868.1 XP_003072493.1 XP_009264103.1 XP_002649805.1 OQS55375.1 AEWQ01000003.1 AEWR01000001.1 KMV66279.1 NP_584759.1 XP_003887099.1 XP_003072721.1 XP_009264335.1 - - AEWQ01000009.1 AEWR01000007.1 KMV66470.1 NP_597617.1 XP_003886976.1 XP_003072601.1 XP_009264214.1 XP_002649556.1 OQS55703.1 AEWQ01000019.1 AEWR01000022.1 KMV65916.1 NP_585776.1 XP_003887367.1 XP_003072979.1 XP_009264603.1 XP_002650201.1 OQS54631.1 AEWQ01000001.1 AEWR01000017.1 KMV65040.1 NP_586388.1 XP_003888325.1 XP_003073929.1 XP_009265548.1 XP_001828021.1 OQS55103.1 AEWQ01000004.1 AEWR01000002.1 KMV66168.1 NP_597454.1 XP_003887311.1 XP_003072922.1 XP_009264542.1 XP_001827783.1 - AEWQ01000001.1 AEWR01000005.1 KMV65104.1 NP_586452.1 XP_003888389.1 XP_003073994.1 XP_009265613.1 XP_001827979.1 OQS53458.1 AEWQ01000014.1 AEWR01000015.1 KMV65770.1 NP_585974.1 XP_003887551.1 XP_003073164.1 XP_009264789.1 XP_001828101.1 OQS53962.1 AEWQ01000003.1 AEWR01000001.1 KMV66349.1 NP_584829.1 XP_003887170.1 XP_003072792.1 XP_009264408.1 XP_001827801.1 OQS53793.1 AEWQ01000003.1 AEWR01000001.1 KMV66230.1 NP_584711.1 XP_003887054.1 XP_003072673.1 XP_009264290.1 XP_002649891.1 OQS54851.1 AEWQ01000008.1 AEWR01000006.1 KMV66760.1 XP_965942.1 XP_003886698.1 XP_003072319.1 XP_009263927.1 XP_001827839.1 OQS54883.1 AEWQ01000006.1 AEWR01000004.1 KMV66004.1 NP_585866.1 XP_003887456.1 XP_003073070.1 XP_009264694.1 XP_002650868.1 OQS54248.1 AEWQ01000012.1 AEWR01000014.1 KMV65227.1 NP_586191.1 XP_003888150.1 XP_003073757.1 XP_009265376.1 XP_002649612.1 OQS54584.1 AEWQ01000014.1 AEWR01000015.1 KMV65745.1 NP_585950.1 XP_003887526.1 XP_003073140.1 XP_009264763.1 XP_002650291.1 OQS55071.1 AEWQ01000008.1 AEWR01000006.1 KMV66720.1 XP_965902.1 XP_003886658.1 XP_003072278.1 XP_009263887.1 XP_002649880.1 OQS54745.1 AEWQ01000023.1 AEWR01000025.1 KMV64978.1 NP_586328.1 XP_003888267.1 XP_003073871.1 XP_009265490.1 - OQS54802.1 AEWQ01000012.1 AEWR01000014.1 KMV65204.1 NP_586168.1 XP_003888127.1 XP_003073733.2 XP_009265353.1 XP_002650466.1 OQS54534.1 AEWQ01000012.1 AEWR01000032.1 KMV65242.1 NP_586205.1 XP_003888165.1 XP_003073770.1 XP_009265391.1 XP_002650133.1 OQS54558.1 AEWQ01000008.1 AEWR01000006.1 KMV66696.1 XP_965877.1 XP_003886634.1 XP_003072253.1 XP_009263863.1 XP_001827743.1 OQS53906.1

Additional file 3.xlsx Page 2 Enterospora canceri Hamiltosporidium magnivora BE-OM-2 Hamiltosporidium magnivora IL-BN-2 Hamiltosporidium tvaerminnensis FI-OER-3-3 Hamiltosporidium tvaerminnensis IL-G-3 Hepatospora eriocheir Canceri Hepatospora eriocheir GB1 Mitosporidium daphniae ORD95155.1 CWI36_0288p0040 CWI39_3823p0010 CWI37_0207p0010 CWI38_0178p0010 ORD98427.1 ORD95658.1 XP_013238461.1 ORD93671.1 CWI36_1015p0010 CWI39_0814p0010 CWI37_2057p0010 CWI38_2081p0010 ORE00312.1 ORD98069.1 XP_013237046.1 ORD93561.1 CWI36_0367p0030 CWI39_0868p0020 CWI37_1415p0020 CWI38_0185p0040 ORD99497.1 ORD97002.1 XP_013236980.1 ORD93868.1 CWI36_0470p0020 CWI39_0738p0010 CWI37_0066p0050 CWI38_0367p0040 ORD99351.1 ORD95791.1 - ORD93597.1 CWI36_0762p0020 CWI39_0492p0020 CWI37_0056p0020 CWI38_0516p0030 ORD99035.1 ORD97500.1 XP_013236426.1 ORD94454.1 CWI36_2050p0010 CWI39_0375p0030 CWI37_1098p0030 CWI38_1679p0020 - ORD96532.1 XP_013237960.1 ORD94086.1 CWI36_0039p0040 CWI39_1228p0010 CWI37_2741p0010 CWI38_0015p0050 ORD99613.1 ORD98014.1 XP_013236680.1 ORD94377.1 CWI36_0358p0030 CWI39_0047p0040 CWI37_0054p0040 CWI38_0018p0060 - ORD96821.1 - ORD94643.1 CWI36_0114p0040 CWI39_0978p0010 CWI37_0098p0040 CWI38_0006p0110 ORD99960.1 ORD98125.1 XP_013236603.1 ORD95081.1 CWI36_1429p0020 CWI39_0894p0010 CWI37_1754p0010 CWI38_0541p0030 ORD98711.1 ORD96253.1 XP_013237719.1 ORD95081.1 CWI36_0516p0020 CWI39_0770p0010 CWI37_2568p0010 CWI38_0032p0020 ORD98711.1 ORD96253.1 XP_013237719.1 - CWI36_1064p0020 CWI39_2529p0020 CWI37_1400p0010 CWI38_2067p0020 ORE00467.1 ORD97537.1 - ORD93587.1 CWI36_2533p0010 CWI39_2604p0010 CWI37_1542p0020 CWI38_1119p0030 - ORD95885.1 XP_013236795.1 ORD94092.1 CWI36_0919p0010 CWI39_2524p0010 CWI37_2805p0010 CWI38_1671p0010 ORD99475.1 - XP_013239055.1 ORD95062.1 CWI36_0641p0030 CWI39_1140p0010 CWI37_0186p0030 CWI38_0218p0020 ORD99339.1 ORD97627.1 XP_013236653.1 ORD94630.1 CWI36_1401p0010 CWI39_2480p0010 CWI37_1479p0010 CWI38_0930p0030 ORE00025.1 ORD98141.1 XP_013239544.1 ORD93926.1 CWI36_1547p0010 CWI39_2364p0010 CWI37_0435p0020 CWI38_0359p0010 ORE00627.1 ORD96264.1 XP_013237966.1 ORD93579.1 CWI36_0620p0010 CWI39_1898p0010 CWI37_2310p0010 CWI38_0989p0020 ORE00500.1 ORD96797.1 XP_013236391.1 ORD94092.1 CWI36_0919p0010 CWI39_0859p0010 CWI37_2805p0010 CWI38_1671p0010 ORD99475.1 - XP_013238961.1 - CWI36_2186p0010 CWI39_1752p0010 CWI37_0627p0020 CWI38_2159p0010 ORD99604.1 - XP_013238622.1 ORD94214.1 CWI36_0118p0010 CWI39_0636p0010 CWI37_1995p0010 CWI38_0181p0050 ORE00581.1 ORD96148.1 XP_013238699.1 ORD95121.1 CWI36_0532p0020 CWI39_1192p0010 CWI38_0111p0060 ORD99221.1 ORD96895.1 ORD99221.1 XP_013239062.1 ORD94454.1 CWI36_1066p0010 CWI39_0411p0030 CWI37_1407p0010 CWI38_1445p0010 - ORD96532.1 XP_013237684.1 ORD94759.1 CWI36_0035p0020 CWI39_0036p0040 CWI37_0011p0050 CWI38_0746p0020 ORD99692.1 ORD97758.1 XP_013238849.1 ORD94033.1 CWI36_0506p0010 CWI39_1178p0010 CWI37_1578p0010 CWI38_0087p0020 ORD98480.1 - XP_013238927.1 ORD94699.1 CWI36_2312p0010 CWI39_3146p0010 CWI37_2497p0010 CWI38_0001p0140 ORE00486.1 ORD95606.1 XP_013239678.1 ORD93962.1 CWI36_0027p0010 CWI39_1983p0010 - CWI38_0602p0010 ORD98345.1 ORD95363.1 XP_013238012.1 - CWI36_0752p0020 CWI39_1790p0010 CWI37_1471p0020 CWI38_0801p0020 ORD99706.1 ORD95520.1 XP_013237656.1 ORD93927.1 CWI36_1118p0020 CWI39_0045p0040 CWI37_0364p0020 CWI38_0364p0040 ORE00034.1 ORD98161.1 XP_013238497.1 ORD94185.1 CWI36_0989p0010 CWI39_1806p0010 CWI37_0411p0020 CWI38_2242p0010 ORD98846.1 ORD97165.1 XP_013238998.1 ORD95090.1 CWI36_0568p0030 CWI39_0140p0040 CWI37_0163p0020 CWI38_0845p0010 ORE00537.1 ORD95880.1 XP_013237891.1 ORD93872.1 CWI36_2494p0010 CWI39_1339p0010 CWI37_0520p0020 CWI38_0006p0020 ORE00453.1 ORD96808.1 XP_013236429.1 ORD93855.1 CWI36_2004p0010 CWI39_0907p0020 CWI37_0070p0040 CWI38_0248p0020 - - XP_013239360.1 ORD95166.1 CWI36_2052p0020 CWI39_1994p0020 CWI37_0184p0050 CWI38_0134p0020 ORD99603.1 ORD96578.1 - ORD93943.1 CWI36_0251p0010 CWI39_0034p0040 CWI37_0025p0040 CWI38_0786p0020 ORD92852.1 ORD97160.1 XP_013238904.1 ORD95079.1 CWI36_0907p0010 CWI39_2026p0010 CWI37_0932p0020 CWI38_1482p0030 ORD98609.1 ORD97141.1 XP_013239640.1 - CWI36_0716p0010 CWI39_0407p0020 CWI37_0621p0020 CWI38_0811p0030 ORD98703.1 ORD96777.1 XP_013237586.1

Additional file 3.xlsx Page 3 Nematocida displodere Nematocida parisii ERTm1 Nematocida parisii ERTm3 Nematocida sp. 1 ERTm2 Nematocida sp. 1 ERTm5 Nematocida sp. 1 ERTm6 Nosema apis BRL 01 Nosema bombycis CQ1 Nosema ceranae BRL01 Nosema ceranae PA08 OAG29340.1 XP_013058613.1 EIJ89681.1 EHY66489.1 OAG29813.1 KFG26804.1 EQB61436.1 EOB15058.1 XP_002996767.1 XP_024332422.1 OAG29571.1 XP_013058803.1 EIJ89498.1 EHY66683.1 OAG31004.1 KFG26626.1 EQB60823.1 EOB14662.1 XP_002996101.1 XP_024332097.1 OAG29643.1 XP_013058893.1 EIJ89409.1 EHY66775.1 OAG32826.1 KFG26545.1 EQB59894.1 EOB13379.1 XP_002996411.1 XP_024331279.1 OAG31768.1 XP_013058202.1 EIJ88596.1 EHY65189.1 OAG30642.1 KFG27182.1 EQB60675.1 EOB14616.1 XP_002995641.1 XP_024329808.1 OAG32181.1 XP_013060203.1 EIJ88776.1 EHY66347.1 OAG33367.1 KFG25996.1 EQB60652.1 EOB13805.1 XP_002996712.1 XP_024332313.1 OAG31748.1 XP_013058221.1 EIJ88614.1 EHY65206.1 OAG30624.1 KFG27164.1 EQB60630.1 EOB11895.1 XP_002996416.1 XP_024330899.1 OAG29646.1 XP_013058899.1 EIJ89399.1 EHY66772.1 OAG33645.1 KFG26548.1 EQB59709.1 EOB14304.1 XP_002995817.1 XP_024332155.1 OAG30185.1 XP_013059696.1 EIJ88920.1 EHY66208.1 OAG30218.1 KFG26128.1 EQB61642.1 EOB13126.1 XP_002995917.1 XP_024330099.1 OAG29765.1 XP_013060304.1 EIJ89239.1 EHY66925.1 OAG33119.1 KFG25104.1 EQB60527.1 EOB11973.1 XP_002995263.1 XP_024331497.1 OAG29968.1 XP_013059117.1 EIJ87849.1 EHY64933.1 OAG30055.1 KFG25688.1 EQB60089.1 EOB11713.1 XP_002996315.1 XP_024331331.1 OAG31246.1 XP_013059117.1 EIJ87849.1 EHY64933.1 OAG30055.1 KFG25430.1 EQB60089.1 EOB11713.1 XP_002996315.1 XP_024331331.1 ------EQB61724.1 EOB14033.1 XP_002995417.1 XP_024330240.1 OAG29667.1 XP_013058930.1 EIJ89367.1 EHY66800.1 OAG33616.1 KFG26520.1 EQB61207.1 EOB15538.1 - XP_024331471.1 OAG29639.1 XP_013058888.1 EIJ89414.1 EHY66767.1 OAG32821.1 KFG26551.1 EQB62261.1 - XP_002996190.1 XP_024331298.1 OAG31682.1 XP_013058296.1 EIJ88688.1 EHY65273.1 OAG30558.1 KFG27100.1 EQB61860.1 EOB13207.1 XP_002996751.1 XP_024331033.1 OAG31840.1 XP_013058124.1 EIJ88518.1 EHY65106.1 OAG30716.1 KFG27260.1 EQB60706.1 EOB12369.1 XP_002995919.1 XP_024330101.1 OAG31726.1 XP_013058243.1 EIJ88636.1 EHY65229.1 OAG30602.1 KFG27142.1 EQB60448.1 EOB13635.1 XP_002996272.1 XP_024332358.1 OAG31078.1 XP_013059458.1 EIJ89161.1 EHY65974.1 OAG32665.1 KFG26357.1 EQB60133.1 EOB14849.1 XP_002996065.1 XP_024332276.1 OAG31665.1 XP_013058317.1 EIJ88707.1 EHY65293.1 OAG30540.1 KFG27084.1 EQB61656.1 EOB12313.1 XP_002996190.1 XP_024331298.1 OAG29725.1 XP_013060349.1 EIJ89285.1 EHY66880.1 OAG31490.1 KFG26445.1 EQB60240.1 EOB14667.1 XP_002995556.1 XP_024330844.1 OAG29986.1 XP_013060098.1 EIJ87644.1 EHY64727.1 OAG32455.1 KFG25876.1 EQB62347.1 EOB14980.1 XP_002995811.1 XP_024332148.1 OAG30001.1 XP_013059164.1 EIJ88169.1 EHY65739.1 OAG29363.1 KFG25398.1 EQB60927.1 EOB15206.1 XP_002995725.1 XP_024331246.1 OAG31748.1 XP_013058221.1 EIJ88614.1 EHY65206.1 OAG30624.1 KFG27164.1 EQB60701.1 EOB11895.1 XP_002996481.1 XP_024330457.1 OAG29608.1 XP_013059571.1 EIJ89044.1 EHY66086.1 OAG29144.1 KFG26245.1 EQB62331.1 EOB12068.1 XP_002996741.1 XP_024331045.1 OAG29677.1 XP_013058942.1 EIJ89355.1 EHY66810.1 OAG33604.1 KFG26510.1 EQB60509.1 EOB15403.1 XP_002996761.1 XP_024332413.1 OAG32000.1 XP_013057888.1 EIJ87452.1 EHY64461.1 OAG30789.1 KFG27455.1 EQB61398.1 EOB14455.1 XP_002995532.1 XP_024332322.1 OAG29022.1 XP_013059345.1 EIJ87301.1 EHY65566.1 OAG33674.1 KFG25244.1 EQB61916.1 EOB15340.1 XP_002994966.1 XP_024330705.1 OAG31210.1 XP_013059795.1 EIJ87787.1 EHY65001.1 OAG33252.1 KFG25626.1 EQB61060.1 EOB11915.1 XP_002996816.1 XP_024331419.1 OAG29435.1 XP_013058621.1 EIJ89673.1 EHY66480.1 OAG29821.1 KFG26812.1 EQB61079.1 EOB14675.1 XP_002995956.1 XP_024330286.1 OAG29454.1 XP_013058648.1 EIJ89645.1 EHY66531.1 OAG29847.1 KFG26762.1 EQB61033.1 EOB13568.1 XP_002996560.1 XP_024331910.1 OAG29291.1 XP_013059610.1 EIJ89006.1 EHY66118.1 OAG32096.1 KFG26209.1 EQB61686.1 EOB12728.1 XP_002996284.1 XP_024331809.1 OAG32584.1 XP_013059028.1 EIJ88306.1 EHY65878.1 OAG30287.1 KFG25526.1 EQB61633.1 EOB14500.1 XP_002996244.1 XP_024332115.1 OAG28824.1 XP_013059925.1 EIJ87926.1 EHY64880.1 OAG33232.1 KFG25739.1 - EOB14326.1 XP_002996037.1 XP_024331984.1 OAG29183.1 XP_013059539.1 EIJ89076.1 EHY66005.1 OAG29172.1 KFG26327.1 EQB61415.1 EOB14227.1 XP_002996837.1 XP_024331444.1 OAG31774.1 XP_013058193.1 EIJ88587.1 EHY65177.1 OAG30651.1 KFG27195.1 EQB60225.1 - XP_002996544.1 XP_024331674.1 OAG31342.1 XP_013058509.1 EIJ89784.1 EHY66388.1 OAG32544.1 KFG26897.1 EQB60746.1 EOB12953.1 XP_002995499.1 XP_024331645.1 OAG29942.1 XP_013059092.1 EIJ88242.1 EHY65804.1 OAG30352.1 KFG25459.1 EQB61562.1 EOB12696.1 XP_002995591.1 XP_024332008.1

Additional file 3.xlsx Page 4 Ordospora colligata FISK Ordospora colligata GBEP Ordospora colligata NOV7 Ordospora colligata OC4 Pseudoloma neurophilia Rozella allomycis Spraguea lophii 42_110 Spraguea lophii Celtic Deep Spraguea lophii EM120 Spraguea lophii North Atlantic CWI42_121430 CWI41_121430 CWI40_121430 XP_014562962.1 KRH95221.1 EPZ32500.1 EPR78925.1 LDOQ01000106.1 LDWE01000106.1 MQSS01000061.1 CWI42_080800 CWI41_080790 CWI40_080810 XP_014563387.1 KRH93266.1 EPZ35860.1 EPR79303.1 LDOQ01000354.1 LDWE01000354.1 MQSS01000297.1 CWI42_021170 CWI41_021190 CWI40_021190 XP_014564322.1 KRH93611.1 EPZ30811.1 EPR80072.1 LDOQ01000069.1 LDWE01000069.1 MQSS01000180.1 CWI42_120360 CWI41_120360 CWI40_120360 XP_014562859.1 KRH94475.1 - EPR78516.1 LDOQ01000311.1 LDWE01000311.1 MQSS01000038.1 CWI42_011330 CWI41_011330 CWI40_011330 XP_014564521.1 KRH94744.1 EPZ33633.1 EPR79483.1 LDOQ01000303.1 LDWE01000303.1 MQSS01000165.1 CWI42_120700 CWI41_120700 CWI40_120700 XP_014562892.1 KRH93297.1 EPZ30915.1 EPR79355.1 LDOQ01000176.1 LDWE01000176.1 MQSS01000012.1 CWI42_021410 CWI41_021430 CWI40_021430 XP_014564346.1 KRH92523.1 EPZ31846.1 EPR80059.1 LDOQ01000066.1 LDWE01000066.1 MQSS01000321.1 CWI42_040330 CWI41_040330 CWI40_040330 XP_014563883.1 KRH93025.1 EPZ35476.1 EPR78511.1 LDOQ01000069.1 LDWE01000069.1 MQSS01000180.1 CWI42_051860 CWI41_051900 CWI40_051880 XP_014563824.1 KRH92600.1 EPZ31753.1 EPR78088.1 LDOQ01000030.1 LDWE01000030.1 MQSS01000404.1 CWI42_031460 CWI41_031130 CWI40_030050 XP_014564172.1 KRH95200.1 EPZ31156.1 EPR78556.1 LDOQ01000029.1 LDWE01000029.1 MQSS01000297.1 CWI42_031460 CWI41_031130 CWI40_030050 XP_014564172.1 KRH93991.1 EPZ32426.1 EPR77901.1 LDOQ01000174.1 LDWE01000174.1 MQSS01000233.1 CWI42_090860 CWI41_090860 CWI40_090870 XP_014563202.1 KRH94601.1 EPZ33924.1 EPR78377.1 LDOQ01000263.1 LDWE01000263.1 MQSS01000240.1 CWI42_020160 CWI41_020180 CWI40_020180 XP_014564225.1 KRH91870.1 EPZ34860.1 EPR78646.1 LDOQ01000098.1 LDWE01000098.1 MQSS01000237.1 CWI42_021360 CWI41_021380 CWI40_021380 XP_014564341.1 KRH92067.1 EPZ31761.1 EPR79496.1 LDOQ01000311.1 LDWE01000311.1 MQSS01000038.1 CWI42_011490 CWI41_011490 CWI40_011490 XP_014564537.1 KRH92960.1 EPZ32490.1 EPR78508.1 LDOQ01000218.1 LDWE01000218.1 MQSS01000329.1 CWI42_040440 CWI41_040440 CWI40_040440 XP_014563893.1 KRH95256.1 EPZ31855.1 EPR79322.1 LDOQ01000176.1 LDWE01000176.1 MQSS01000012.1 CWI42_120530 CWI41_120530 CWI40_120530 XP_014562876.1 KRH92751.1 EPZ35540.1 EPR79823.1 LDOQ01000311.1 LDWE01000311.1 MQSS01000370.1 CWI42_091060 CWI41_091070 CWI40_091040 XP_014563222.1 KRH93581.1 EPZ35242.1 EPR79471.1 LDOQ01000066.1 LDWE01000066.1 MQSS01000321.1 CWI42_021360 CWI41_021380 CWI40_021380 XP_014564341.1 KRH92067.1 EPZ31374.1 EPR79496.1 LDOQ01000166.1 LDWE01000311.1 MQSS01000298.1 CWI42_041190 CWI41_041190 CWI40_041190 XP_014563965.1 - EPZ33458.1 EPR77805.1 LDOQ01000212.1 LDWE01000212.1 MQSS01000225.1 CWI42_030960 CWI41_030630 CWI40_031030 XP_014564122.1 KRH93809.1 EPZ31665.1 EPR80140.1 LDOQ01000326.1 LDWE01000326.1 MQSS01000243.1 CWI42_060210 CWI41_060200 CWI40_060200 XP_014563565.1 KRH94866.1 EPZ33822.1 EPR78274.1 LDOQ01000300.1 LDWE01000300.1 MQSS01000234.1 CWI42_120700 CWI41_120700 CWI40_120700 XP_014562892.1 KRH93297.1 EPZ33448.1 EPR78654.1 LDOQ01000018.1 LDWE01000018.1 MQSS01000203.1 CWI42_110240 CWI41_110240 CWI40_110240 XP_014563037.1 KRH94636.1 EPZ34811.1 EPR79669.1 LDOQ01000176.1 LDWE01000176.1 MQSS01000012.1 CWI42_121350 CWI41_121350 CWI40_121350 XP_014562954.1 KRH94929.1 EPZ33144.1 EPR80094.1 LDOQ01000255.1 LDWE01000255.1 MQSS01000109.1 CWI42_080380 CWI41_080390 CWI40_080390 XP_014563346.1 KRH93459.1 EPZ35009.1 EPR79239.1 LDOQ01000166.1 LDWE01000166.1 MQSS01000298.1 CWI42_041940 CWI41_041940 CWI40_041940 XP_014564037.1 KRH92525.1 EPZ36225.1 EPR78458.1 LDOQ01000212.1 LDWE01000212.1 MQSS01000202.1 CWI42_040720 CWI41_040720 CWI40_040720 XP_014563919.1 KRH95272.1 EPZ33710.1 EPR78082.1 LDOQ01000029.1 LDWE01000029.1 MQSS01000297.1 CWI42_010960 CWI41_010960 CWI40_010960 XP_014564485.1 KRH94463.1 EPZ32209.1 EPR78461.1 LDOQ01000212.1 LDWE01000212.1 MQSS01000202.1 CWI42_070110 CWI41_070110 CWI40_070110 XP_014563487.1 KRH92712.1 EPZ32711.1 EPR79275.1 LDOQ01000225.1 LDWE01000225.1 MQSS01000278.1 CWI42_050650 CWI41_050670 CWI40_050690 XP_014563703.1 KRH93170.1 EPZ34390.1 EPR78186.1 LDOQ01000088.1 LDWE01000088.1 MQSS01000163.1 CWI42_080130 CWI41_080140 CWI40_080140 XP_014563322.1 KRH92390.1 EPZ36306.1 EPR78766.1 LDOQ01000010.1 LDWE01000010.1 MQSS01000438.1 CWI42_010550 CWI41_010550 CWI40_010550 XP_014564445.1 KRH95242.1 EPZ36140.1 EPR77921.1 LDOQ01000051.1 LDWE01000051.1 MQSS01000402.1 CWI42_120070 CWI41_120070 CWI40_120070 XP_014562832.1 KRH95205.1 EPZ31445.1 EPR78661.1 LDOQ01000199.1 LDWE01000199.1 MQSS01000058.1 CWI42_050420 CWI41_050440 CWI40_050460 XP_014563680.1 KRH94454.1 EPZ33335.1 - LDOQ01000300.1 LDWE01000300.1 MQSS01000234.1 CWI42_050790 CWI41_050810 CWI40_050830 XP_014563717.1 KRH93732.1 EPZ32166.1 EPR79919.1 LDOQ01000088.1 LDWE01000088.1 MQSS01000163.1 CWI42_010290 CWI41_010290 CWI40_010290 XP_014564419.1 KRH92931.1 - EPR78956.1 LDOQ01000342.1 LDWE01000342.1 MQSS01000242.1

Additional file 3.xlsx Page 5 Spraguea lophii RA12034 Trachipleistophora hominis Vavraia culicis Vittaforma corneae LDXA01000374.1 ELQ76223.1 XP_008073030.1 XP_007604043.1 LDXA01000223.1 ELQ76710.1 XP_008075132.1 XP_007604868.1 LDXA01000245.1 ELQ76685.1 XP_008075121.1 XP_007604812.1 LDXA01000288.1 ELQ75344.1 XP_008074727.1 XP_007603639.1 LDXA01000266.1 ELQ75318.1 XP_008073705.1 XP_007604572.1 LDXA01000120.1 ELQ75893.1 XP_008075496.1 XP_007603695.1 LDXA01000083.1 ELQ74835.1 XP_008074365.1 XP_007605341.1 LDXA01000245.1 ELQ75594.1 XP_008073074.1 XP_007604488.1 LDXA01000016.1 ELQ74002.1 XP_008074748.1 XP_007604049.1 LDXA01000076.1 ELQ74420.1 XP_008075613.1 XP_007603507.1 LDXA01000397.1 ELQ76449.1 XP_008073404.1 XP_007603507.1 LDXA01000052.1 ELQ76619.1 XP_008073895.1 XP_007604090.1 LDXA01000260.1 ELQ76169.1 XP_008075173.1 XP_007605263.1 LDXA01000288.1 ELQ76967.1 XP_008075058.1 XP_007604798.1 LDXA01000170.1 ELQ74933.1 XP_008073850.1 XP_007605035.1 LDXA01000120.1 ELQ76017.1 XP_008073927.1 XP_007603937.1 LDXA01000336.1 ELQ74999.1 XP_008074109.1 XP_007603704.1 LDXA01000083.1 ELQ74738.1 XP_008074611.1 XP_007605478.1 LDXA01000393.1 ELQ76967.1 XP_008075058.1 XP_007604798.1 LDXA01000249.1 ELQ74649.1 XP_008074765.1 - LDXA01000401.1 ELQ74443.1 XP_008074500.1 XP_007604223.1 LDXA01000135.1 ELQ75301.1 XP_008073675.1 XP_007603567.1 LDXA01000003.1 ELQ75018.1 XP_008074127.1 XP_007603695.1 LDXA01000120.1 ELQ74785.1 XP_008074170.1 XP_007605132.1 LDXA01000023.1 ELQ75572.1 XP_008075030.1 XP_007605390.1 LDXA01000393.1 ELQ76364.1 XP_008074698.1 XP_007603886.1 LDXA01000173.1 ELQ76708.1 XP_008075131.1 XP_007604282.1 LDXA01000076.1 ELQ76030.1 XP_008073905.1 XP_007604027.1 LDXA01000173.1 ELQ75388.1 XP_008075477.1 XP_007603946.1 LDXA01000312.1 ELQ74241.1 XP_008075109.1 XP_007604919.1 LDXA01000349.1 ELQ74900.1 XP_008073566.1 XP_007603526.1 LDXA01000067.1 ELQ74789.1 XP_008074174.1 XP_007603650.1 LDXA01000057.1 ELQ76006.1 XP_008073916.1 XP_007604678.1 LDXA01000128.1 ELQ74430.1 XP_008075353.1 XP_007603920.1 LDXA01000135.1 ELQ74281.1 XP_008073435.1 XP_007603468.1 LDXA01000349.1 ELQ75870.1 XP_008073438.1 XP_007603495.1 LDXA01000189.1 ELQ76329.1 XP_008074378.1 XP_007605092.1

Additional file 3.xlsx Page 6 Anncaliia algerae PRA109 Anncaliia algerae PRA339 0,30 0,30 LTR/Gypsy LTR/Gypsy e LTR/Copia e LTR/Copia m m o o 0,22 DNA/TcMar 0,22 DNA/TcMar n n e e DNA/Merlin DNA/Merlin g g

f f

DNA/MITE o DNA/MITE o 0,15 0,15 t t en en c c r r e e

0,07 p 0,07 p

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Anncaliia algerae Undeen Amphiamblys sp. 1,60 LTR/Gypsy 2,40 LINE/Dong-R4 LTR/Copia LINE/RTE e e m 1,20 DNA/TcMar LINE m 1,80 o o n LTR/Gypsy DNA/Merlin n e e g

LTR/Copia g f DNA/MITE

0,80 f o 1,20

o LTR/DIRS t

DNA t DNA/MITE en en c DNA/Ginger r c DNA/Maverick

0,40 r e 0,60 e p p

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Edhazardia aedis Encephalitozoon culiculi EC1 0,60 LINE/Jockey-I 0,30 LTR LINE/RTE e LTR/Gypsy m e

0,45 o m 0,22 LTR/Gypsy n LTR/Copia o e n g e LTR/Copia DNA/MITE f g o 0,30 f

DNA/MITE t 0,15 o t DNA en c r en e c 0,15 r

p 0,07 e p

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Encephalitozoon culiculi EC2 Encephalitozoon culiculi EC3 0,24 1,60 LTR LTR e

LTR/Gypsy m LTR/Gypsy e 0,18 o 1,20 n m LTR/Copia LTR/Copia e o g n

DNA/MITE f DNA/MITE e o g 0,80 0,12 t f DNA DNA o en t c r en e 0,40

c 0,06 p r e p 0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted) Encephalitozoon culiculi EC2 Enterocytozoon bieneusi 3,00 6,00 LTR LINE/L1 e e LTR/Gypsy LTR/Gypsy m m 2,25 4,50 o o LTR/Copia LTR/Copia n n e e DNA/MITE g g DNA/MITE

f f DNA o o 1,50 3,00 t t Unknown en en c c r r e e 0,7 5 1,50 p p

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Enterocytozoon hepatopenaei Enterospora canceri 2,00 1,60 LTR/Gypsy LINE/CR1 LTR/Copia LTR 1,20 e 1,50 e LTR/Gypsy m

LTR/Pao m o o LTR/Copia n n e e 0,80 DNA/TcMar g

1,00 g

f f o o DNA/Merlin t t DNA/MITE en 0,50 en 0,40 c c r r e e p p 0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Hamiltosporidium magnivora Hamiltosporidium magnivora BEOM2 ILBN2 LINE/Dong-R4 1,00 1,20 LINE/L2 LINE/Dong-R4 LINE/RTE LINE/RTE e e m LTR/Gypsy m 0,7 5 LTR/Gypsy o 0,90 o n n LTR/Copia e RC/Helitron e g g f DNA/TcMar

f RC/Helitron 0,50 o 0,60 o t

t DNA/TcMar DNA/Merlin en c en

DNA/Merlin r DNA/MITE c r 0,25 e 0,30 e DNA/MITE p

p DNA DNA DNA/hAT 0,00 DNA/hAT 0,00 DNA/Ginger 0 10 20 30 40 50 DNA/Ginger 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Hamiltosporidium tvaerminnensis Hamiltosporidium tvaerminnensis FI-OER-3-3 LINE/Dong-R4 ILG3 LINE/Dong-R4 1,20 LINE/L2 1,00 LINE/Jockey-I LINE/RTE LINE/RTE e

LTR/Gypsy e LTR/Gypsy m 0,90 0,7 5 o LTR/Copia m n LTR/Copia o e n

g RC/Helitron e RC/Helitron f g o 0,60 DNA/TcMar f 0,50 DNA/TcMar t o

DNA/Merlin t DNA/Merlin en c en r DNA/MITE

c DNA/MITE e

0,30 r 0,25 p DNA e DNA p DNA/Maverick DNA/hAT 0,00 0,00 DNA/hAT DNA/Ginger 0 10 20 30 40 50 DNA/Ginger 0 10 20 30 40 50 Unknown Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted) Hepatospora eriocheir Canceri Hepatospora eriocheir GB1

1,20 0,60 LTR/Gypsy LTR/Gypsy e LTR/Copia e LTR/Copia m m o 0,90 DNA/TcMar o 0,45 DNA/TcMar n n e e DNA/Merlin

g DNA/Merlin g f f DNA/MITE o 0,60 DNA/MITE o 0,30 t DNA t DNA en en c c r r e e

p 0,30 p 0,15

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50

Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Mitosporidium daphniae Nematocida displodere

0,40 LINE 1,60 LTR/Gypsy e

e LTR

m LTR/Copia m LTR/Gypsy o o 0,30 n 1,20 RC/Helitron n e

e LTR/Copia g g f f

DNA/MITE o o t t 0,20 DNA 0,80 en en c r c r e e p p 0,10 0,40

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50

Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Nematocida parisii ERTm1 Nematocida parisii ERTm3 0,30 0,80 LTR/Gypsy LINE/L1 e LTR/Copia e LTR/Gypsy m m o 0,22 DNA/MITE o 0,60 LTR/Copia n n e e DNA/MITE g DNA g

f f o 0,15 o 0,40 DNA t t en en c c r r e 0,07 e p p 0,20

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50

Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Nematocida sp. 1 ERTm2 Nematocida sp. ERTm5

1,00 0,30 LTR/Gypsy LTR/Gypsy e e m LTR/Copia m LTR/Copia o 0,7 5 o 0,22 n DNA/Dada n DNA/MITE e e g g

f DNA/MITE f DNA o o t 0,50 t 0,15 en en c c r r e e p 0,25 p 0,07

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50

Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted) Nematocida sp. 1 ERTm6 Nosema apis

0,6 0 0,6 0 LTR/Gypsy LINE/Dong-R4 e e LTR/ERV1 LTR/Copia m m o 0,45 o 0,45 DNA/Dada n LTR/Gypsy n e e g

LTR/Copia

g DNA/MITE f

f o

o DNA/PiggyBac

t 0,30

t 0,30

en DNA/MITE en c r c r e e

p 0,15 p 0,15

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Nosema bombycis Nosema ceranae BRL1 LINE/Dong-R4 LTR 3,00 2,00 LINE/Dong-R4 LTR/Gypsy

e LTR LTR/Copia e m m o 2,25 1,50 LTR/Gypsy

RC/Helitron o n n e LTR/Copia e

g DNA/TcMar

g f RC/Helitron f o DNA/PiggyBac

1,50 o 1,00 t DNA/TcMar DNA/Merlin t en en c DNA/Merlin

r DNA/MITE c r e

e DNA/MITE p 0,75 DNA 0,50 DNA/Maverick p DNA DNA/hAT 0,00 0,00 0 10 20 30 40 50 DNA/Ginger 0 10 20 30 40 50 Unknown Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Nosema ceranae PA08 Pseudoloma neurophilia

0,6 0 0,6 0 LINE/R1

LINE/Dong-R4 e

e LTR/Gypsy m m LTR/Gypsy o LTR/Copia o 0,45 n 0,45 e n LTR/Copia RC/Helitron e g

f g DNA/TcMar DNA/TcMar f o o t 0,30 DNA/Merlin t 0,30 DNA/Merlin en c en DNA/MITE DNA/MITE r c e r DNA e p p 0,15 0,15 DNA/hAT

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Rozella allomycis Spraguea lophii 42_110

0,20 LTR/ERV1 0,6 0 e LTR/Gypsy e

m LTR m o LTR/Copia

0,15 o n LTR/Gypsy n e 0,45 DNA/Merlin e g

LTR/Copia g f

f DNA/MITE o

RC/Helitron o t 0,10 t 0,30 DNA en DNA/MITE c en r c r

e DNA e p

0,05 p 0,15

0,00 0 10 20 30 40 50 0,00 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted) Spraguea lophii Celtic Deep Spraguea lophii EM120

0,60 0,8 0 LTR/Gypsy LINE/L2 e e m m LTR/Copia LTR/Gypsy o o n n 0,45 DNA/MITE 0,60 LTR/Copia e e g g

DNA f DNA/MITE f o o t t 0,30 0,40 DNA en en c c r r e e p p 0,15 0,20

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Spraguea lophii North Atlantic Spraguea lophii RA12034

1,00 0,60 LINE/L2 LINE/L2 e e m m LTR/Gypsy LTR/Gypsy o o

n 0,45 n 0,7 5 LTR/Copia LTR/Copia e e g g

DNA/MITE DNA/MITE f f o o DNA DNA t t 0,50 0,30 en en c c r r e e p p 0,25 0,15

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Trachipleistophora hominis Vavraia culicis

0,30 0,8 0 LINE/RTE LTR e e m LTR/Gypsy m LTR/Gypsy o 0,22 o 0,60 n LTR/Copia n LTR/Copia e e g g

RC/Helitron f

f RC/Helitron o DNA/MITE o t 0,15 t 0,40 DNA/MITE en DNA en DNA c c r r e e p 0,07 p 0,20

0,00 0,00 0 10 20 30 40 50 0 10 20 30 40 50 Kimura substitution level (CpG adjusted) Kimura substitution level (CpG adjusted)

Vittaforma corneae

0,30 LTR/Gypsy e

m LTR/Copia o 0,22 n DNA/MITE e g f o

t 0,15 en c r e

p 0,07

0,00 0 10 20 30 40 50

Kimura substitution level (CpG adjusted) Additional file 5: Frequency of subtypes of TEs in microsporidian mobilomes.

DNA (%) Assembly Size (Mb) MITE TcMar-ISRm11TcMar-Tc1 TcMar-Tc2 TcMar-StowawayTcMar-Tigger TcMar-Pogo Merlin PiggyBac Maverick Ginger hAT-Charlie Helitron MuLE/MuDR Amphiamblys sp. 5.6201 61.5052 0 0 0 0 0 0 0 0 0.7217 0 0 0 0 Anncaliia algerae PRA109 17.5349 22.6846 12.9391 34.352 0 0 0 0 29.9689 0 0 0 0 0 0 Anncaliia algerae PRA339 12.1634 27.6911 0 9.7404 0 0 0 0 62.5684 0 0 0 0 0 0 Anncaliia algerae Undeen 13.8234 11.9656 0.9309 15.4344 0 0 0 0.494 57.3944 0 0 1.0759 0 0 0 Edhazardia aedis 51.3480 98.1807 0 0 0 0 0 0 0 0 0 0 0 0 0 Encephalitozoon cuniculi EC1 2.2881 100 0 0 0 0 0 0 0 0 0 0 0 0 0 Encephalitozoon cuniculi EC2 2.2812 100 0 0 0 0 0 0 0 0 0 0 0 0 0 Encephalitozoon cuniculi EC3 2.2647 100 0 0 0 0 0 0 0 0 0 0 0 0 0 Encephalitozoon cuniculi EcunIII-L 2.2923 ------Encephalitozoon cuniculi GB-M1 2.4975 93.5466 0 0 0 0 0 0 0 0 0 0 0 0 0 Encephalitozoon hellem 2.2517 ------Encephalitozoon intestinalis 2.2169 ------Encephalitozoon romaleae 2.1875 ------Enterocytozoon bieneusi 3.8607 79.775 0 0 0 0 0 0 0 0 0 0 0 0 0 Enterocytozoon hepatopenaei 3.2535 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Enterospora canceri 3.0954 57.3834 11.8376 0 0 0 0 14.8764 0 0 0 0 0 0 Hamiltosporidium magnivora BE-OM-2 20.7341 59.5194 2.1583 1.0726 0 0.7886 0 0 2.8076 0 0 5.637 3.4583 1.0742 0 Hamiltosporidium magnivora IL-BN-2 17.1898 61.1107 2.4848 1.6337 0 1.078 0 0 1.1203 0 0 2.9891 3.2276 1.2747 0 Hamiltosporidium tvaerminnensis FI-OER-3-3 18.3444 59.4588 1.1421 2.9522 0 0.6143 0 0 1.5989 0 0.8287 3.5238 4.7131 1.4926 0 Hamiltosporidium tvaerminnensis IL-G-3 25.2027 56.4699 2.5106 0.7522 0 1.4204 0 0 1.3914 0 0 1.8772 2.7912 0.7198 0 Hepatospora eriocheir Canceri 4.8348 22.2078 0 51.6842 0 0 0 0 20.1794 0 0 0 0 0 0 Hepatospora eriocheir GB1 4.5703 35.7659 0 13.9807 0 0 0 0 44.0512 0 0 0 0 0 0 Mitosporidium daphniae 5.6370 99.4547 0 0 0 0 0 0 0 0 0 0 0 0 0 Nematocida displodere 3.0664 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Nematocida parisii ERTm1 4.0713 99.9312 0 0 0 0 0 0 0 0 0 0 0 0 0 Nematocida parisii ERTm3 4.1476 89.5216 0 0 0 0 0 0 0 0 0 0 0 0 0 Nematocida sp. 1 ERTm2 4.7007 93.9103 0 0 0 0 0 0 0 0 0 0 0 0 0 Nematocida sp. ERTm5 4.3976 100 0 0 0 0 0 0 0 0 0 0 0 0 0 Nematocida sp. 1 ERTm6 4.2760 83.0227 0 0 0 0 0 0 0 0 0 0 0 0 0 Nosema apis BRL 01 8.5695 44.1199 0 0 0 0 0 0 0 4.1322 0 0 0 0 0 Nosema bombycis CQ1 15.6898 60.7627 1.2486 4.7323 0.2483 0 0.7463 0 7.6584 4.0386 0.2031 0.8936 7.6369 1.3002 2.0916 Nosema ceranae BRL01 7.8602 66.4331 0 9.735 0 0 0 0.6631 16.7267 0 0 0 0 0.5775 2.6324 Nosema ceranae PA08 5.6907 42.5585 0 10.4254 0 0 0 0 21.8005 0 0 0 0 0 0 Ordospora colligata FI-SK-17-1 2.2600 ------Ordospora colligata GB-EP-1 2.3000 ------Ordospora colligata NO-V-7 2.3200 ------Ordospora colligata OC4 2.2905 ------Pseudoloma neurophilia 5.2484 11.1305 9.0735 0 0 0 0 0 25.5919 0 0 0 6.6364 0.0228 0 Rozella allomycis 11.8593 53.301 0 0 0 0 0 0 0 0 0 0 0 0.0705 0 Spraguea lophii 42_110 4.9809 74.6208 0 0 0 0 0 0 17.7314 0 0 0 0 0 0 Spraguea lophii Celtic Deep 5.7648 98.0299 0 0 0 0 0 0 0 0 0 0 0 0 0 Spraguea lophii EM120 5.7648 98.2194 0 0 0 0 0 0 0 0 0 0 0 0 0 Spraguea lophii North Atlantic 5.8500 90.9481 0 0 0 0 0 0 0 0 0 0 0 0 0 Spraguea lophii RA12034 5.8022 95.4852 0 0 0 0 0 0 0 0 0 0 0 0 0 Trachipleistophora hominis 8.4982 77.1507 0 0 0 0 0 0 0 0 0 0 5.7985 0 0 Vavraia culicis 6.1187 3.6045 0 0 0 0 0 0 0 0 0 0 0 96.1678 0 Vittaforma corneae 3.2135 100 0 0 0 0 0 0 0 0 0 0 0 0 0

Additional file 5.xlsx Page 1 LTR (%) LINE (%) Unknown Dada En/Spm Gypsy Copia Pao ERV1 DIRS Penelope-like Dong-R4 RTE-BovB RTE-X L1 L2 R1 CR1 Jockey Unclassified 0 0 2.2592 0 0 0 11.3368 0.0091 0.9823 9.2349 4.7111 0 0 0 0 0 9.2393 0 0 0 0.019 0.036 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12.606 0 0 0 0 0.0984 0 0 0 0 0 0 0 0 0 0 0 0 1.2717 0 0 0 0 0 0 0.3039 0 0 0 0 0 0.2436 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ------0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.4533 ------0 0 6.8806 0 0 0 0 0 0 0 0 13.3442 0 0 0 0 0 0 0 0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8.0754 0.1913 0 0 0 0 - - 0 0 0 0 7.6355 0 0 0 0 0 0.6203 0 0 0 0 0 12.3356 10.036 0 0 0.4862 0 0 0 0 0 0 0 1.0666 0 0 0 0 0 15.9782 8.0358 0 0 0 0 0 0 0 0 0 0 1.225 0.0054 0 0 0 0 12.9718 9.3541 0 0 0.1125 0 0 0 0 0 0 0 0.6586 0 0 0 0 0 24.0538 7.2829 0 0 0 0 0 0.0219 0 0.0493 0 0 5.9285 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6.202 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0475 0.0693 0.2404 0 0 0 0.1877 0 0 0 0 0 0 0 0 0 0 0 0 0 100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0687 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0289 0 0 0 0 0 0 0 0 0 10.4494 0 0 0 0 0 0 6.0896 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16.9202 0 0 0.0569 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.2024 0 0 0.5476 0 0 46.9976 0 0 0 0 0 0 0 0 0 0 0.001 4.4902 0.3847 0 0 0 0 3.5484 0 0 0 0 0 0 0 0 0.024 0 0 2.8501 0 0 0 0 0.0052 0.3764 0 0 0 0 0 0 0 0 0 0 0 20.3187 0 0 0 0 0 4.8968 0 0 0 0 0 0 0 0 0 ------0 0.0213 38.3176 0 0 0 0 0 0 0 0 0 0 9.1556 0 0 0 0.0501 0 0 14.4281 16.0468 0 16.1534 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4.772 0 0 0 0 0 0 0 0 0 2.8757 0 0 0 0 0 0 0 1.97 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0253 0 0 0 0 0 0 0 0 0 1.7552 0 0 0 0 0 0 0 6.287 0 0 0 0 0 0 0 0 0 2.7648 0 0 0 0 0 0 0 0.0513 0 0 0 0 0 0 0 0 0 4.4633 0 0 0 0 0 0 0 0 0.2094 0 0 0 0 0 0 16.8411 0 0 0 0 0 0 0 0 0 0 0.1084 0 0 0 0.1191 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Additional file 5.xlsx Page 2 Additonal file 6: Results of BLASTp searches for proteins involved in genome defense mechanisms against TEs.

Lineage Dicer (A0A4Q9LBM7) Argonaute (A0A4Q9L0B4) RNA-dependent RNA polymerase (Q1K6C4) DNA repair protein RAD51 (Q1K929) Amphiamblys sp. 0 0 0 2 Anncaliia algerae PRA109 2 0 1 5 Anncaliia algerae PRA339 1 4 1 2 Anncaliia algerae Undeen 5 7 2 7 Edhazardia aedis 3 1 1 1 Encephalitozoon cuniculi EC1 0 0 0 1 Encephalitozoon cuniculi EC2 0 0 0 1 Encephalitozoon cuniculi EC3 0 0 0 1 Encephalitozoon cuniculi EcunIII-L 0 0 0 1 Encephalitozoon cuniculi GB-M1 0 0 0 1 Encephalitozoon hellem 0 0 0 1 Encephalitozoon intestinalis 0 0 0 1 Encephalitozoon romaleae 0 0 0 1 Enterocytozoon bieneusi 0 0 0 1 Enterocytozoon hepatopenaei 2 1 1 1 Enterospora canceri 1 1 2 1 Hamiltosporidium magnivora BE-OM-2 2 2 1 3 Hamiltosporidium magnivora IL-BN-2 1 1 1 2 Hamiltosporidium tvaerminnensis FI-OER-3-3 1 1 1 3 Hamiltosporidium tvaerminnensis IL-G-3 1 1 2 2 Hepatospora eriocheir Canceri 2 0 1 1 Hepatospora eriocheir GB1 2 1 1 1 Mitosporidium daphniae 1 1 0 2 Nematocida displodere 0 0 0 1 Nematocida parisii ERTm1 0 0 0 1 Nematocida parisii ERTm3 0 0 0 1 Nematocida sp. 1 ERTm2 0 0 0 1 Nematocida sp. ERTm5 0 0 0 1 Nematocida sp. 1 ERTm6 0 0 0 1 Nosema apis BRL 01 4 3 1 2 Nosema bombycis CQ1 1 1 0 3 Nosema ceranae BRL01 1 1 0 2 Nosema ceranae PA08 1 1 1 2 Ordospora colligata FI-SK-17-1 0 0 0 1 Ordospora colligata GB-EP-1 0 0 0 1 Ordospora colligata NO-V-7 0 0 0 1 Ordospora colligata OC4 0 0 0 1 Pseudoloma neurophilia 1 2 1 1 Rozella allomycis 2 2 1 3 Spraguea lophii 42_110 1 1 0 2 Spraguea lophii Celtic Deep 1 1 1 2 Spraguea lophii EM120 1 1 1 2 Spraguea lophii North Atlantic 1 1 1 2 Spraguea lophii RA12034 1 1 1 2 Trachipleistophora hominis 2 1 1 2 Vavraia culicis 1 1 1 2 Vittaforma corneae 1 1 1 1

Additional file 6.xlsx Page 1 DNA repair protein RAD52 (Q9HGI2) DNA repair protein Rad54 (Q9P978) Replication protein A (Q1K7R9) RNA helicase (Q7S3A3) Masc1 (Methyltransferase from Ascobolus 1) (O13369) RID (RIP defective) (V5INW3) 1 7 2 3 0 0 1 12 4 3 0 0 1 7 1 2 0 0 3 18 6 7 0 0 1 8 1 5 0 0 1 6 1 3 0 0 1 6 1 2 0 0 1 6 1 2 0 0 1 6 1 2 0 0 1 6 1 3 0 0 1 6 1 2 0 0 1 6 1 2 0 0 1 7 1 2 0 0 0 4 1 1 0 0 1 6 1 0 0 0 1 6 1 0 0 0 1 10 1 4 0 0 1 9 1 4 0 0 1 11 1 4 0 0 1 11 1 6 0 0 1 4 1 1 0 0 1 5 1 1 0 0 1 8 1 2 0 0 1 8 1 2 0 0 1 8 1 2 0 0 1 8 1 2 0 0 1 9 1 2 0 0 1 8 1 2 0 0 1 10 1 2 0 0 1 10 1 3 0 0 3 4 1 2 0 0 1 5 1 2 0 0 1 5 1 2 0 0 1 6 1 0 0 0 1 6 1 0 0 0 1 6 1 0 0 0 1 6 1 0 0 0 1 9 1 1 0 0 1 17 4 8 0 0 1 6 1 2 0 0 1 7 1 3 0 0 1 7 1 3 0 0 1 7 1 4 0 0 1 7 1 4 0 0 1 7 1 3 0 0 1 7 1 3 0 0 1 7 1 2 0 0

Additional file 6.xlsx Page 2 Regression analyses

Without contrasts (log transformed data) With phylogenetic independent contrasts (PIC) Residual standard error: 0.2768 on 45 degrees of freedom Residual standard error: 0.2574 on 45 degrees of freedom Multiple R-squared: 0.3435, Adjusted R-squared: 0.3289 Multiple R-squared: 0.1802, Adjusted R-squared: 0.1619 F-statistic: 23.55 on 1 and 45 DF, p-value: 1.501e-05 F-statistic: 9.889 on 1 and 45 DF, p-value: 0.002944 Nosema bombycis vs. Papilio xuthus Argonaute

Nosema apis vs. Papilio xuthus Argonaute Additional file 9: Transmission modes of the microsporidia included in the study. ? = lack of information about transmission mode in the literature; HV = horizontal and vertical transmission; H = horizontal transmission; HS = horizontal and sexual transmission. Lineage Transmission mode References Statistical group Amphiamblys sp. ? - Not included in the statistical test Anncaliia algerae PRA109 HV Anncaliia algerae PRA339 HV 1, 4, 19 2 Anncaliia algerae Undeen HV Edhazardia aedis HV Encephalitozoon cuniculi EC1 H Encephalitozoon cuniculi EC2 H Encephalitozoon cuniculi EC3 H Encephalitozoon cuniculi EcunIII-L H Encephalitozoon cuniculi GB-M1 H 2, 7, 8, 10, 11 1 Encephalitozoon hellem H Encephalitozoon intestinalis H Encephalitozoon romaleae H Enterocytozoon bieneusi H Enterocytozoon hepatopenaei H 9, 16, 18 Enterospora canceri ? - Not included in the statistical test Hamiltosporidium magnivora BE-OM-2 V? Hamiltosporidium magnivora IL-BN-2 V? 20.25 2 Hamiltosporidium tvaerminnensis FI-OER-3-3 HV Hamiltosporidium tvaerminnensis IL-G-3 HV Hepatospora eriocheir Canceri ? - Not included in the statistical test Hepatospora eriocheir GB1 ? - Not included in the statistical test Mitosporidium daphniae H 24 Nematocida displodere H Nematocida parisii ERTm1 H Nematocida parisii ERTm3 H 1 21 Nematocida sp. 1 ERTm2 H Nematocida sp. ERTm5 H Nematocida sp. 1 ERTm6 H Nosema apis BRL 01 HS 15 Nosema bombycis CQ1 HV 6 2 Nosema ceranae BRL01 HS 15 Nosema ceranae PA08 HS Ordospora colligata FI-SK-17-1 H Ordospora colligata GB-EP-1 H 13 1 Ordospora colligata NO-V-7 H Ordospora colligata OC4 H Pseudoloma neurophilia HV 17 2 Rozella allomycis ? Not included in the statistical test Spraguea lophii 42_110 H Spraguea lophii Celtic Deep H Spraguea lophii EM120 H 5 1 Spraguea lophii North Atlantic H Spraguea lophii RA12034 H Trachipleistophora hominis ? Not included in the statistical test Vavraia culicis H 14 1 Vittaforma corneae ? Not included in the statistical test

1. Anthony DW, Lotzkar MD, Avery SW. 1978. Fecundity and longevity of Anopheles albimanus exposed at each larval instar to of Nosema algerae. 116–121. 2. Baneux PJR, Pognan F. 2003. In utero transmission of Encephalitozoon cuniculi strain type I in rabbits. Lab. Anim. 37:132–138. doi: 10.1258/00236770360563778. 3. BECNEL JJ, SPRAGUE V, FUKUDA T, HAZARD EI. 1989. Development of Edhazardia aedis (Kudo, 1930) N. G., N. Comb. (Microsporida: Amblyosporidae) in the Mosquito Aedes aegypti (L.) (Diptera: Culicidae). J. Protozool. 36:119–130. doi: 10.1111/j.1550-7408.1989.tb01057.x. 4. Canning E, Hulls RH. 1970. A Microsporidan Infection of Anopheles gambiae Giles, from Tanzania, Interpretation of Its Mode of Transmission and Notes on Nosema in Mosquitoes. J Protozool. 17:531–539. 5. Freeman MA et al. 2011. Spraguea (Microsporida: Spraguidae) infections in the nervous system of the Japanese anglerfish, Lophius litulon (Jordan), with comments on transmission routes and host pathology. J. Fish Dis. 34:445–452. doi: 10.1111/j.1365-2761.2011.01255.x. 6. Han MS, Watanabe H. 1988. Transovarial transmission of two microsporidia in the silkworm, Bombyx mori, and disease occurrence in the progeny population. J. Invertebr. Pathol. 51:41–45. doi: 10.1016/0022-2011(88)90086-9. 7. Hunt RD, King NW, Foster HL. 1972. Encephalitozoonosis: Evidence for vertical transmission. J. Infect. Dis. 126:212–214. doi: 10.1093/infdis/126.2.212. 8. Juan-Sallés C et al. 2006. Disseminated encephalitozoonosis in captive, juvenile, cotton-top (Saguinus oedipus) and neonatal emperor (Saguinus imperator) tamarins in North America. Vet. Pathol. 43:438–446. doi: 10.1354/vp.43-4-438. 9. Karthikeyan K, Sudhakaran R. 2019. Experimental horizontal transmission of Enterocytozoon hepatopenaei in post-larvae of whiteleg shrimp, Litopenaeus vannamei. J. Fish Dis. 42:397–404. doi: 10.1111/jfd.12945. 10. Ozkan O, Karagoz A, Kocak N. 2019. First molecular evidence of ocular transmission of Encephalitozoonosis during the intrauterine period in rabbits. Parasitol. Int. 71:1–4. doi: 10.1016/j.parint.2019.03.006. 11. Patterson-Kane JC, Caplazi P, Rurangirwa F, Tramontin RR, Wolfsdorf K. 2003. Encephalitozoon cuniculi placentitis and abortion in a Quarterhorse mare. J. Vet. Diagnostic Investig. 15:57–59. doi: 10.1177/104063870301500113.

Additional file 9.xlsx Page 1 12. Peng Y, Grassl J, Millar AH, Baer B. 2016. Seminal fluid of honeybees contains multiple mechanisms to combat infections of the sexually transmitted pathogen nosema apis. Proc. R. Soc. B Biol. Sci. 283. doi: 10.1098/rspb.2015.1785. 13. Pombert JF, Haag KL, Beidas S, Ebert D, Keeling PJ. 2015. The Ordospora colligata genome: Evolution of extreme reduction in microsporidia and host-to-parasite horizontal gene transfer. MBio. 6. doi: 10.1128/mBio.02400-14. 14. Rivero A, Agnew P, Bedhomme S, Sidobre C, Michalakis Y. 2007. Resource depletion in Aedes aegypti mosquitoes infected by the microsporidia Vavraia culicis. Parasitology. 134:1355–1362. doi: 10.1017/S0031182007002703. 15. Roberts KE, Evison SEF, Baer B, Hughes WOH. 2015. The cost of promiscuity: Sexual transmission of Nosema microsporidian parasites in polyandrous honey bees. Sci. Rep. 5:1–7. doi: 10.1038/srep10982. 16. Salachan PV, Jaroenlak P, Thitamadee S, Itsathitphaisarn O, Sritunyalucksana K. 2017. Laboratory cohabitation challenge model for shrimp hepatopancreatic microsporidiosis (HPM) caused by Enterocytozoon hepatopenaei (EHP). BMC Vet. Res. 13:1–7. doi: 10.1186/s12917-016-0923-1. 17. Sanders JL, Watral V, Clarkson K, Kent ML. 2013. Verification of Intraovum Transmission of a Microsporidium of : Pseudoloma neurophilia Infecting the Zebrafish, Danio rerio. PLoS One. 8. doi: 10.1371/journal.pone.0076064. 18. Tang KFJ et al. 2016. Dense populations of the microsporidian Enterocytozoon hepatopenaei (EHP) in feces of Penaeus vannamei exhibiting white feces syndrome and pathways of their transmission to healthy shrimp. J. Invertebr. Pathol. 140:1–7. doi: 10.1016/j.jip.2016.08.004. 19. Vavra J, Undeen AH. 1970. Nosema algerae n. sp. (Cnidospora, Microsporida) a Pathogen in a Laboratory Colony of Anopheles stephensi Liston (Diptera, Culicidae). J Protozool. 17:240–249. 20. Vizoso DB, Ebert D. 2004. Within-host dynamics of a microsporidium with horizontal and vertical transmission: Octosporea bayeri in . Parasitology. 128:31–38. doi: 10.1017/S0031182003004293. 21. Bakowski MA, Luallen, RJ, Troemel, ER. 2014. Microsporidia Infections in Caenorhabditis elegans and Other Nematodes. In: Weiss LM, Becnel JJ, editors. Microsporidia: Pathogens of Opportunity. Ames: John Wiley & Sons. p. 341–355. 22. Zilio G, Thiévent K, Koella JC. 2018. Host genotype and environment affect the trade-off between horizontal and vertical transmission of the parasite Edhazardia aedis. BMC Evol. Biol. 18:1–9. doi: 10.1186/s12862-018-1184-3. 23. Didier ES. 2014. Mammalian Animal Models of Human Microsporidiosis. In: Weiss LM, Becnel JJ, editors. Microsporidia: Pathogens of Opportunity. Ames: John Wiley & Sons. p. 327–334. 24. Haag KL et al. 2014. Evolution of a morphological novelty occurred before genome compaction in a lineage of extreme parasites. Proc Natl Acad Sci U S A. 111:15480–15485. 25. Haag KL, Traunecker E, Ebert D. 2013. Single-nucleotide polymorphismsof two closely related microsporidian parasites suggest a clonal pop-ulation expansion after the last glaciation. Mol Ecol. 22(2):314–326.25

Additional file 9.xlsx Page 2