bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

1 Anaerobic derivates of mitochondria and in the free-living

2 schiedti revealed by single-cell genomics

3

4 Kristína Záhonová1,2,*, Sebastian Cristian Treitli1, Tien Le1, Ingrid Škodová-Sveráková2,3, Pavla

5 Hanousková4, Ivan Čepička4, Jan Tachezy1, Vladimír Hampl1,*

6

7 1 Department of Parasitology, Faculty of Science, Charles University, BIOCEV, Vestec, Czech

8 Republic

9 2 Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice

10 (Budweis), Czech Republic

11 3 Department of Biochemistry, Faculty of Natural Sciences, Comenius University, Bratislava,

12 Slovakia

13 4 Department of Zoology, Faculty of Science, Charles University, Prague, Czech Republic

14

15 * Corresponding authors: [email protected] (KZ); [email protected] (VH)

16

17

18

19

20

21

22

1 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

23 Abstract

24 Pelomyxa schiedti is a free-living amoeba belonging to the group , which

25 encompasses anaerobes bearing -related (MROs) –

26 in free-living balamuthi and in the human pathogen

27 histolytica. Anaerobic peroxisomes, another adaptation to anaerobic lifestyle, were identified

28 only recently in M. balamuthi. We found evidence for both these organelles in the single-cell-

29 derived genome and transcriptome of P. schiedti, and corresponding vesicles were tentatively

30 revealed in electron micrographs. In silico reconstructed MRO metabolism seems similar to that

31 of M. balamuthi harboring respiratory complex II, electron-transferring flavoprotein, partial TCA

32 cycle running presumably in reductive direction, pyruvate:ferredoxin oxidoreductase, [FeFe]-

33 hydrogenases, glycine cleavage system, and sulfate activation pathway. The cell disposes with

34 an expanded set of NIF enzymes for iron sulfur cluster assembly, but their localization remains

35 unclear. Quite contrary, out of 67 predicted peroxisomal enzymes, only four were reported also

36 in M. balamuthi, namely peroxisomal processing peptidase, nudix hydrolase, inositol 2-

37 dehydrogenase, and D-lactate dehydrogenase. Other putative functions of peroxisomes could

38 be pyridoxal 5-phosphate biosynthesis, amino acid and carbohydrate metabolism, and

39 hydrolase activities. Future experimental evidence is necessary to define functions of this

40 surprisingly enzyme-rich anaerobic .

41

42

43 Author summary

2 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

44 A major part of the microbial diversity cannot be cultured in isolation, and so it escapes from

45 traditional ways of investigation. In this paper, we demonstrate the successful approach for

46 generating good-quality genome and transcriptome drafts from a peculiar amoeba Pelomyxa

47 schiedti using single-cell methods. P. schiedti is a member of Archamoebae clade harboring

48 microaerobic including a free-living Mastigamoeba balamuthi and a human parasite

49 . Mitochondria and peroxisomes represent two organelles that are most

50 affected during adaptation to microoxic or anoxic environments. Mitochondria are known to

51 transform to anaerobic mitochondria, hydrogenosomes, mitosomes, and various transition

52 stages in between, all of which encompass different enzymatic capacity. Anaerobic

53 peroxisomes have been first noticed in M. balamuthi, but their function remained unclear for

54 now. Data obtained in this study were used for revealing the presence and for the detailed

55 functional annotations of anaerobic derivates of mitochondria and peroxisomes in P. schiedti,

56 which were corroborated by electron microscopy.

57

58

59 Introduction

60 Transition to life in low oxygen environments requires significant modifications of cell

61 biochemistry and organellar make up. Several lineages of protists have undergone such

62 transitions and exemplify partly convergent solutions [1–3]. Mitochondria and peroxisomes

63 have been most significantly remodeled in this process, as they are the key places of oxygen-

64 dependent metabolism and oxygen detoxification.

3 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

65 Mitochondria are double-membrane-bound organelles, which have arisen from

66 engulfment of a prokaryotic lineage related to [2,4–6]. Since then, they

67 have diverged into a range of categories [1] and plethora of transitional forms [7,8], collectively

68 designated as mitochondrion related organelles (MROs), while only a single case of complete

69 loss has been reported [9]. Substantial collection of typical mitochondrial functionalities, such

70 as oxidative phosphorylation, carbon, amino acid and fatty-acid metabolism, iron-sulfur (FeS)

71 cluster assembly, homeostasis, and apoptosis, has been reduced to various extent in MROs

72 [10–12].

73 Peroxisomes are bound by a single membrane and characterized by a highly conserved

74 set of proteins (peroxins) essential for their biogenesis [13,14]. The matrix content and

75 consequently the repertoire of metabolic pathways is very variable reflecting high versatility of

76 peroxisomal functions [15]. Most frequently, they possess oxidases reducing molecular oxygen

77 to hydrogen peroxide (H2O2), and catalase for its detoxification. Not surprisingly, they are

78 absent from most anaerobes, such as Giardia and Trichomonas [16]; however, anaerobic

79 peroxisomes were recently reported from Mastigamoeba balamuthi [17].

80 Archamoebae represents a clade of microaerophilic protists nested within a broader

81 group of predominantly aerobic amoebozoans [18,19] represented e.g. by Dictyostelium

82 discoideum, known to bear a classical aerobic mitochondrion [20], or by their more distant

83 amoebozoan relative castellanii (Centramoebida) with mitochondria potentially

84 adapted to periods of anaerobiosis and exhibiting a highly complex proteome [12,21]. Small to

85 almost inconspicuous MROs have been characterized in two Archamoebae, the parasitic

86 Entamoeba histolytica and the free-living M. balamuthi. The only known function of E.

4 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

87 histolytica is production and export of activated sulfate – phosphoadenosine-5’-

88 phosphosulfate (PAPS) [22]. Metabolic capacity of M. balamuthi is

89 substantially broader involving pyruvate and amino acid metabolism, ATP production, and FeS

90 cluster assembly [23–25]. Another adaptation of M. balamuthi to the low oxygen environment

91 is represented by anaerobic peroxisomes that lack catalase and enzymes of β-oxidation of fatty

92 acids but harbor several enzymes of pyrimidine and CoA biosynthesis and acyl-CoA and

93 carbohydrate metabolism [17].

94 Pelomyxa is a free-living archamoeba distantly related to M. balamuthi [18], and so it

95 represents valuable point for tracing the evolution of anaerobic adaptations. There is a single

96 report on MROs in the giant P. palustris [26] but their metabolism in unknown. Using

97 methods of single-cell -omics and electron microscopy, we bring clear evidence for the

98 presence of both MROs and peroxisomes in its smaller cousin P. schiedti [27].

99

100

101 Results and discussion

102 General features of assemblies

103 P. schiedti single-cell genome assembly of 52.4 Mb contained 5,338 scaffolds with an

104 N50=51,552 bp (S1 Table) and 19,965 predicted proteins. We identified a single small subunit

105 ribosomal RNA (18S rDNA). In the 18S phylogeny, P. schiedti was sister to other Pelomyxa

106 species inside the Pelomyxidae clade (88% standard bootstrap) within a robust clade (94%

107 standard bootstrap) of Archamoebae (Fig 1, S1 Fig). The decontaminated transcriptome

108 assembly of 76.6 Mb comprised 43,993 contigs. BUSCO was used to estimate completeness of

5 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

109 assemblies and to compare them to the M. balamuthi genome (S2 Fig, S1 Table). Transcriptome

110 contained 83.2% of complete and 2.0% of fragmented BUSCO , while in the genome-

111 derived proteome the proportions were 81.9% and 3.6%, respectively. With 82.8% complete

112 and 3.0% fragmented genes [28], the completeness of M. balamuthi data was comparable.

113 36.0% of BUSCOs were duplicated in the transcriptome assembly, while only 8.6% in genomic,

114 reflecting a higher number of contigs or presence of isoforms in the former. It should be noted

115 that for non-model , which Pelomyxa certainly is, the BUSCO completeness is not

116 expected to reach 100%, because some of the orthologues might be absent and/or diverged

117 beyond recognition. Altogether, our analyses showed considerably high completeness of both

118 assemblies.

119

120 Fig 1. Phylogenetic analysis of amoebozoan 18S rDNA.

121 The Maximum Likelihood tree places Pelomyxa schiedti in monophyletic Pelomyxidae group

122 inside monophyletic Archamoebae. Standard bootstrap support values are shown when ≥ 50%.

123 Outgroup was collapsed for simplicity (for full tree see S1 Fig).

124

125 P. schiedti genes encompass 149,016 introns (S1 Table), which accounts for an intron

126 density of 7.46, almost twice higher than in M. balamuthi (3.74). While protists’ genomes have

127 usually lower intron densities, several organisms in IntronDB [29] exhibit similar intron density

128 as Pelomyxa, e.g., the Monosiga brevicolis (6.53) and Salpingoeca rosetta

129 (7.44), the chromerid Vitrella brassicaformis (7.45), or the chlorarachniophyte Bigelowiella

130 natans (7.85). The vast majority of introns (98.41%) contained canonical GT-AG boundaries,

6 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

131 1.59% possessed GC-AG boundaries, and one an unusual GT-GG intron boundary (S1 Table).

132 Similar frequencies of intron boundaries were observed in M. balamuthi (S1 Table and [28]).

133

134 Putative MRO proteome

135 The major focus of this study was to reveal the presence and to characterize the putative

136 proteomes of MRO and peroxisome of P. schiedti. We used a combined approach to search for

137 proteins possibly involved in the MRO metabolism and biogenesis by: (i) retrieving homologues

138 of MRO- or mitochondrion-targeted proteins of E. histolytica, M. balamuthi, and A. castelanii,

139 and (ii) predicting N-terminal mitochondrial targeting sequence (NTS) by four tools. The

140 resulting in silico predicted MRO proteome consists of 51 proteins (Fig 2, S2 Table) and provides

141 functionalities described below.

142

143 Fig 2. Overview of the Pelomyxa schiedti MRO metabolism.

144 Proteins were identified by BLAST or HMMER searches and their intracellular localization was

145 predicted by TargetP, PSORT II, MultiLoc2, and NommPred tools. Confidence of MRO

146 localization is enhanced by shades of blue as explained in graphical legend above the scheme.

147 Multiple copies of a protein are shown as overlapping ovals. Potential end-products are boxed

148 in dark-fuchsia color. ATP production and consumption are highlighted by dark- and light-pink

149 boxes around ATP, respectively. Abbreviations: AAT, aspartate alanine transferase; ACL, ATP-

150 citrate lyase; ACO, aconitase; ACS, acetyl-CoA synthetase; AK, adenylate kinase; APS,

151 adenosine-5'-phosphosulfate; APSK, adenosine-5'-phosphosulfate kinase; AS, ATP sulfurylase;

152 cpn10, chaperonin 10; cpn60, chaperonin 60; CIC, citrate carrier; CoA, coenzyme A; ETFa,

7 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

153 electron transferring flavoprotein subunit alpha; ETFb, electron transferring flavoprotein

154 subunit beta; ETFDH, electron transferring flavoprotein dehydrogenase; Fe-ADH, iron-

155 containing alcohol dehydrogenase; Fd, ferredoxin; FH, fumarase; GDH, glutamate

156 dehydrogenase; H, GCSH protein; HSP70, heat shock protein 70; HydA, [FeFe]-hydrogenase;

157 HydE, hydrogenase maturase; HydF, hydrogenase maturase; HydG, hydrogenase maturase; IDH,

158 isocitrate dehydrogenase; IPP, inorganic pyrophosphatase; L, GCSL protein; D-LDH, D-lactate

159 dehydrogenase; LPLA, lipoamide protein ligase; MCF, mitochondrial carrier family; MDH, malate

160 dehydrogenase; MPP a+b, mitochondrial processing peptidase subunit alpha and beta; NaS,

161 sodium/sulfate symporter; NifS, cysteine desulfurase; NifU+FdhD, scaffold protein + formate

162 dehydrogenase accessory sulfurtransferase; OGC, 2-oxoglutarate carrier; P, GCSP protein;

163 P5CDH, pyrroline-5-carboxylate dehydrogenase; P5CR, pyrroline-5-carboxylate reductase; PAPS,

164 3’-phosphoadenosine 5’-phosphosulfate; PC, pyruvate carboxylase; PFO, pyruvate:ferredoxin

165 oxidoreductase; PNT, pyridine nucleotide transhydrogenase; ProDH, proline dehydrogenase;

166 RQ, rodoquinone; RQH2, rhodoquinol; RquA, RQ methyltransferase; SAM, sorting and assembly

167 machinery; SDH5, succinate dehydrogenase assembly factor; SDHA, succinate dehydrogenase

168 subunit A; SDHB, succinate dehydrogenase subunit B; SDHC, succinate dehydrogenase subunit

169 C; SDHD, succinate dehydrogenase subunit D; SHMT, serine hydroxymethyltransferase; T, GCST

170 protein; THF, tetrahydrofolate; THF-CH2, N5,N10-methylenetetrahydrofolate; TOM/TIM,

171 translocase of the outer/inner membrane; UQ, ubiquinone; Zn-ADH, zinc-containing alcohol

172 dehydrogenase.

173

174 Protein import machinery

8 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

175 Despite sensitive HMMER searching we identified only three subunits of the outer membrane

176 translocase (TOM) and the sorting and assembly machinery (SAM) complexes — Tom40,

177 Sam50, and Sam37 (Fig 2). All three proteins had corresponding domains predicted by

178 InterProScan. Many homologues of the canonical subunits are missing (S2 Table),

179 as are all parts of the translocase of the inner membrane (TIM), and so the mechanism of

180 protein import across this membrane remains unclear. The situation resembles other

181 Archamoebae [23,28,30], suggesting that their translocons are either highly streamlined and/or

182 contain highly divergent or lineage-specific subunits as reported from trichomonads or

183 trypanosomes [31,32].

184 Enzymes involved in processing (matrix processing peptidase) and folding (chaperonins

185 cpn10 and cpn60) are present. HSP70 was detected in 14 copies, none of them confidently

186 predicted to mitochondrion (S2 Table). Phylogenetic analysis revealed a single MRO candidate

187 (Pelo10550) branching sister to M. balamuthi mtHSP70 within the mitochondrial clade (S3 Fig).

188 The other HSP70 paralogues fell into the ER or cytosolic clades, the latter being diversified in

189 ten copies all forming robust clades with M. balamuthi sequences.

190 Although we have probably revealed only a fragment of the inventory needed for the

191 protein import into the P. schiedtii MRO, the presence of the hallmarks—Tom40, Sam50,

192 mtHSP70, and cpn60—conclusively shows that the MRO is truly present.

193

194 Tricarboxylic acid cycle and electron transport chain

195 P. schiedti encodes four enzymes of the tricarboxylic acid (TCA) cycle possessing NTS (S2 Table)

196 and catalyzing consecutive reactions. ATP citrate lyase (ACL) is typical for the reductive

9 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

197 direction of TCA, while others, fumarate hydratase (fumarase/FH), malate dehydrogenase

198 (MDH), and four subunits of the succinate dehydrogenase complex (SDH/complex II/CII), are

199 common for both, oxidative and reductive TCA. The absence of CII subunit SDH5/SDHAF

200 involved in the flavination of SDHA subunit [33] is likely common for Archamoebae as it is

201 absent also from M. balamuthi [24].

202 Homologues of A. castellanii respiratory complexes were not identified, except for the

203 aforementioned CII and a quinone-dependent electron-transferring flavoprotein (ETF; S2

204 Table). Both soluble subunits, alpha (ETFa) and beta (ETFb), and the membrane-bound ETF

205 dehydrogenase (ETFDH), are present but only ETFDH and ETFa contain a recognizable NTS. It

206 has been proposed in M. balamuthi that electrons may be transferred in an unknown direction

207 between ETF and rhodoquinone (RQ), a quinone molecule with a lower electron potential than

208 ubiquinone [3,34]. RQ is in M. balamuthi synthetized by a hydrogenosomal methyltransferase

209 dubbed RquA [34], which was detected also in P. schiedti (S2 Table). RQ presence allows

210 delivery of electrons to CII that could function as fumarate reductase [35] producing succinate,

211 the putative end product of the partial reverted TCA in both Archamoebae [24], which may be

212 secreted as in Trypanosoma [36].

213

214 Pyruvate and ATP metabolism

215 Pyruvate is in aerobic mitochondria oxidatively decarboxylated to acetyl-coenzyme A (CoA) by

216 the pyruvate dehydrogenase (PDH) complex. In most MROs, PDH is substituted by

217 pyruvate:ferredoxin oxidoreductase (PFO), pyruvate:NADP+ oxidoreductase (PNO), or pyruvate

218 formate lyase (PFL) [2]. We identified six copies of PFO and one copy of PNO in the P. schiedti

10 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

219 genome, all without NTS (S2 Table). However, one of the P. schiedti PFOs was sister to one of

220 the M. balamuthi putatively hydrogenosomal PFOs [24] (S4 Fig). We assume that this PFO

221 homologue operates in P. schiedti MRO. Another pyruvate-metabolizing enzyme predicted to

222 MRO is pyruvate carboxylase (PC; S2 Table) producing oxaloacetate [37], a substrate of MDH. In

223 M. balamuthi, pyruvate may be produced by the activity of NAD+-dependent D-lactate

224 dehydrogenase (D-LDH), of which one is present in hydrogenosome and the other in

225 peroxisome [17,24]. P. schiedti bears only one homologue of D-LDH that is predicted to

226 peroxisomes (S3 Table), thus pyruvate is likely imported to MRO from .

227 Two ATP-synthesizing enzymes are putatively present. Acetyl-CoA synthetase (ACS),

228 enzyme converting acetyl-CoA to acetate, CoA, and ATP, was found in eight copies, four of

229 which possessed a putative NTS. ATP may be formed also by the adenylate kinase (AK)

230 catalyzing interconversion of adenine nucleotides. Three of the six AKs are putatively localized

231 in the MRO (S2 Table). In this respect, the situation resembles M. balamuthi hydrogenosome

232 [24]. Third putative source of ATP is the antiport against PAPS.

233

234 Amino acid metabolism

235 Glycine cleavage system (GCS) is at least partially retained in many MROs [38]. It consists of

236 four enzymes (H-, L-, T-, and P-protein) and methylates tetrahydrofolate (THF) while

237 decomposing glycine into CO2 and ammonia. THF methylation is also provided by the serine

238 hydroxymethyltransferase (SHMT) [39]. We identified all GCS enzymes and SHMT in P. schiedti,

239 all with predicted NTS (Fig 2, S2 Table). L-protein was present in two copies with only one

240 bearing NTS, similarly to M. balamuthi. The function of the second copy is unknown [24].

11 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

241 Lipoamide protein ligase (LPLA) necessary for lipoamide attachment to GCSH was present with

5 10 242 NTS. The resulting N ,N -methylenetetrahydrofolate (CH2-THF) is an intermediate in one-

243 carbon metabolism and cofactor for the synthesis of pyrimidines and methionine in both

244 mitochondria and cytosol. Two cytosolic enzymes requiring this cofactor, B12-dependent

245 methionine synthase and THF dehydrogenase/cyclohydrolase, were detected (S2 Table).

246 Glycine can be produced in mitochondria from threonine by threonine dehydrogenase (TDH)

247 and alpha-amino-beta-ketobutyrate CoA ligase (AKL) [40] but both proteins lack a recognizable

248 NTS in P. schiedti (S2 Table). Consistently, TDH activity was measured only in the cytosolic

249 fraction of M. balamuthi [24]. It is highly probable that this pathway operates in the cytosol of

250 P. schiedti and glycine is imported to MRO.

251 To our surprise, we identified remnants of the proline degradation pathway presumably

252 residing in P. schiedti MRO (Fig 2). In mitochondria, proline is usually degraded to glutamate by

253 the function of proline dehydrogenase (ProDH) and pyrroline-5-carboxylate dehydrogenase

254 (P5CDH) [41]. While ProDH is missing in P. schiedti, an alternative enzyme pyrroline-5-

255 carboxylate reductase (P5CR) was predicted to be mitochondrion-targeted by one predictor (S2

256 Table). P5CDH is present in P. schiedti but lacks predictable NTS. Glutamate can be further

257 metabolized to 2-oxoglutarate by glutamate dehydrogenase (GDH) [41], which is present and

258 predicted to be mitochondrion-targeted also by one tool (S2 Table).

259

260 Cofactor regeneration

261 NADH produced by GCS or during putative proline degradation would be in most mitochondria

262 reoxidized by NADH dehydrogenases in the electron transport chain [41]. Since this is absent in

12 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

263 P. schiedti, we explored other ways for regeneration of this cofactor. One possibility is

264 fermentation of aldehydes to alcohols by alcohol dehydrogenases [42] putatively targeted to

265 the MRO (S2 Table). Another option is the reductive partial TCA cycle running from citrate to

266 succinate consuming not only NADH but also electrons from ETFDH via CII producing succinate

267 [43]. Citrate or oxaloacetate are necessary to fuel this pathway. We identified a mitochondrial

268 citrate carrier (CIC; S2 Table) which belongs to SLC25A family and is known to exchange malate

269 for cytosolic citrate in cancer cells under low concentration of oxygen [44]. ACL produces acetyl-

270 CoA and oxaloacetate from citrate on the expense of ATP [45]. Acetyl-CoA may become a

271 substrate for anabolic reactions or be used by ACS to regenerate both ATP and CoA (Fig 2),

272 while oxaloacetate may enter the reverse TCA cycle becoming substrate of MDH, regenerating

273 NAD+. The malate pool is maintained also by 2-oxoglutarate carrier (OGC; S2 Table). In the

274 cytosol, 2-oxoglutarate can be reductively carboxylated to replenish citrate [46]. Oxaloacetate

275 may alternatively be produced from pyruvate by PC with ATP consumption or by the action of

276 aspartate amino transferase (S2 Table). The latter enzyme may balance the ratio of 2-

277 oxoglutarate + aspartate: oxalacetate + glutamate; however, the origin and fate of aspartate is

278 unclear due to the absence of the glutamate-aspartate antiporter (S2 Table).

279 Pyridine nucleotide transhydrogenase (PNT) is predicted to MRO by a single predictor

280 (S2 Table). PNT usually localizes in the inner mitochondrial membrane and pumps protons while

281 transferring electrons between NADH and NADPH [47]. PNT is present in M. balamuthi and E.

282 histolytica [23,48], however in E. histolytica, it was shown to localize outside mitosomes [49],

283 which calls into question its MRO location in other Archamoebae.

13 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

284 MRO contains two additional electron sinks with unclear purpose. ETF and ETFDH

285 proteins are known to use electrons from oxidation of fatty acids, which is absent in P. schiedti

286 MRO. Finally, [FeFe]-hydrogenases uptake electrons from reduced ferredoxins and produce

287 molecular hydrogen. Three of the six detected hydrogenases bear putative NTS. Hydrogenases

288 contain catalytic H cluster and its maturation is dependent on maturases (HydE, HydF, and

289 HydG) [50], which are all present and contain NTS (S2 Table). Reduced ferredoxin may originate

290 from pyruvate oxidation.

291

292 Iron-sulfur cluster assembly

293 Mitochondria usually house the iron-sulfur cluster assembly (ISC) pathway inherited from

294 alphaproteobacteria serving for maturation of both, mitochondrial and cytosolic FeS proteins

295 [51]. Some organisms, including Archamoebae, have replaced it by another machinery via

296 horizontal gene transfer [25,52]. M. balamuthi bears two copies of the nitrogen fixation (NIF)

297 system, both comprising NifS and NifU proteins. While one pair of NIFs operates in cytosol, the

298 other localizes in the hydrogenosome [25]. In E. histolytica, only cytosolic copy has been

299 retained [24].

300 In P. schiedti MRO, hydrogenases and their maturases HydE and HydF, SDH, ferredoxin,

301 and PFO are putative clients for NIF system. We identified seven NifS and three NifU proteins,

302 of which only NifU (Pelo10620) contained predicted NTS (S2 Table). Interestingly, this protein

303 consists of a NifU N-terminal fused to a formate dehydrogenase accessory

304 sulfurtransferase (FdhD) C-terminal domain (S5A Fig). The FdhD transfers sulfur

305 from IscS to formate dehydrogenase (FdhF) and is essential for its activity [53]. P. schiedti

14 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

306 indeed encodes a FdhF homologue without NTS (S2 Table). In the NifU phylogeny (Fig 3A), the

307 NifU domain of the fusion protein formed a long branch within a moderately supported (80%

308 ultrafast bootstrap) clade of all Archamoebae NifUs. The other P. schiedti NifU sequences

309 branched sister to hydrogenosomal and cytosolic M. balamuthi homologues. All three P.

310 schiedti NifU sequences contained conserved cysteine residues (S5B Fig) necessary for their

311 function [54].

312

313 Fig 3. Analyses of NIF system components.

314 (A-B) The Maximum Likelihood phylogenetic trees show that Pelomyxa schiedti possesses

315 orthologues of hydrogenosomal and cytosolic NifU (A) and NifS (B) proteins from

316 Mastigamoeba balamuthi. Hydrogenosomal proteins of M. balamuthi are marked with stars.

317 The Maximum Likelihood tree was estimated with standard (BS) and ultrafast bootstrapping

318 (UFB). The tree topology shown is from the ultrafast bootstrap analysis. Support values for

319 <50% BS and <75% UFB are denoted by a dash (-), whereas an asterisk (*) marks a topology that

320 does not exist in a particular analysis. Fully supported nodes are shown as black circles, while

321 nodes that were not supported are without any value. (C) Heterologous expression of two NifS

322 sequences of Pelomyxa schiedti showed their cytosolic localization. Proteins were expressed in

323 Saccharomyces cerevisiae with a GFP-tag at their C-terminus. Mitochondria were stained with

324 MitoTracker. DIC, differential interference contrast. Scale bar: 5 μm.

325

326 None of the seven NifS proteins was predicted to MRO (S2 Table). Two sequences were

327 identical but incomplete at their C-termini and could not be completed by read mapping or PCR

15 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

328 amplification. In the phylogenetic analysis (Fig 3B), Pelo13211 and Pelo6206 branched sister to

329 the hydrogenosomal and cytosolic M. balamuthi sequences, respectively. Pelo14142 was sister

330 to candidate Riflebacteria species and the last four formed a long branch nested within the

331 Archamoebae clade. All amino acid residues required for function [55] were present in both P.

332 schiedti sequences that were sister to M. balamuthi (S5C Fig).

333 The phylogenetic pattern offers an elegant hypothesis in which NifS Pelo6206 and NifU

334 Pelo14273 act in the cytosol, while NifS Pelo13211 and NifU Pelo19958 in MRO. The remaining

335 NifS copies might be functional partners of the NifU-FdhD fusion protein (Pelo10620).

336 Surprisingly, our experiments with heterologous localization of the MRO and cytosolic NifS

337 candidates in Saccharomyces cerevisiae revealed cytosolic localization of both (Fig 3C), leaving

338 the question of the FeS cluster assembly in P. schiedti MRO unresolved.

339

340 Sulfate activation pathway

341 Sulfate activation pathway produces PAPS necessary for sulfolipid synthesis [22]. It is present in

342 E. histolytica [22,56] and M. balamuthi [24] MROs, and we identified all of its components also

343 in P. schiedti (Fig 2, S2 Table). The pathway requires two transporters. A sodium/sulfate

344 symporter (NaS) is necessary for substrate delivery, however, its homologues in P. schiedti (S2

345 Table) are unrelated to E. histolytica mitosomal NaS [22] (S6A Fig) yielding their role unclear.

346 The PAPS exporter belongs to the mitochondrial carrier family (MCF) and, indeed, one of P.

347 schiedti MCF proteins branched sister to a clade of PAPS transporters of E. histolytica and M.

348 balamuthi [28,57] (S6B Fig). As this transporter exchanges PAPS to ATP, it plays role in

16 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

349 supplementing the ATP pool in MRO, yet cannot provide a net ATP gain, because two ATP

350 molecules are required for production of one PAPS.

351

352 Anaerobic peroxisomes

353 We have also investigated the presence of anaerobic peroxisomes, which were recently

354 characterized in M. balamuthi [17]. P. schiedti encodes genes for 13 proteins required for

355 peroxisome biogenesis (peroxins, Pexs) strongly suggesting presence of peroxisomes. Identified

356 peroxins include Pex5 and Pex7 required for the recognition of peroxisomal targeting signal 1

357 and 2 (PTS1 and PTS2), respectively, Pex13 and 14 mediating the protein import, Pex1, 2, 6, 10,

358 and 12, which are receptor-recycling peroxins, Pex3, 16, and 19 involved in protein import to

359 the peroxisomal membrane, and Pex11 participating in the peroxisome (S3 Table).

360 Prediction of putative peroxisomal matrix proteins based on the PTS1/PTS2 presence revealed

361 67 candidates (S3 Table). Interestingly, only four candidates were previously found in anaerobic

362 peroxisomes of M. balamuthi that include peroxisomal processing peptidase (PPP), inositol 2-

363 dehydrogenase, nudix hydrolase, and D-lactate dehydrogenase, all with clear support for

364 localization in P. schiedti peroxisomes (S3 Table). Unlike in M. balamuthi, P. schiedti

365 peroxisomes possibly contain pyridoxamine 5'-phosphate oxidase (PNPO) that utilizes

366 molecular oxygen as an electron acceptor to catalyze the last step of the pyridoxal 5-

367 phosphate (PLP) biosynthesis with concomitant formation of ammonia and H2O2. The presence

368 of PNPO raises a question how H2O2 is detoxified as typical antioxidant enzymes, such as

369 catalase and peroxidase, are not present. However, H2O2 could be decomposed also

370 nonenzymatically by antioxidants, such as 2-oxoglutarate, in which the ketone group of the α-

17 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

371 carbon atom reacts with H2O2 to form succinate, CO2, and water [58]. P. schiedti contains a

372 putative glutamate dehydrogenase that may produce 2-oxoglutarate and that possesses –SKL

373 triplet, a typical PTS1. However, the peroxisomal targeting was not supported by the PTS

374 predictor, which considers twelve C-terminal amino acid residues. The other proteins with

375 predicted peroxisomal localization include several enzymes of amino acid synthesis and

376 degradation, carbohydrate metabolism and hydrolases, however without clear biochemical

377 context. More experimental studies are required to verify predicted localizations and to

378 delineate function of peroxisomes in P. schiedti.

379

380 Electron microscopy

381 Finally, we were interested whether the two organelles described by the genomic data can be

382 visualized by microscopy. Careful inspection of electron micrographs, indeed, revealed two

383 populations of small vesicles, one presumably bounded by a double membrane while the other

384 by a single (Fig 4). We ascribe them to putative MROs and peroxisomes in silico characterized in

385 this work but leave the confirmation for further studies.

386

387 Fig 4. Transmission electron micrograph of Pelomyxa schiedti, ultra-thin sections. (A) The

388 nuclear area. N, nucleus; black arrow, putative mitochondrion-related ; white arrow,

389 small dense body (putative peroxisome); asterisk (*), prokaryotic endosymbiont. (B-C) High

390 magnification of putative mitochondrion-related organelle; black arrow, bounding double

391 membrane. (D) Detail of the bounding double membrane. (E) High magnification of the small

392 dense body (putative peroxisome). Scale bars: 400 nm for (A); 50 nm for (B-E).

18 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

393

394 Conclusions

395 Our bioinformatic survey of the putative proteome of Pelomyxa schiedti MRO revealed several

396 interesting insights and opened many questions for further investigation of this amoeba. Most

397 importantly, P. schiedti clearly does harbor an MRO with a very streamlined or lineage specific

398 set of protein translocases, and peroxisomes with a set of 13 soluble and membrane associated

399 peroxins. Our in silico predictions showed that the MRO provides the cell with the synthesis of

400 PAPS, contains glycine cleavage system, [FeFe]-hydrogenase, and likely also a part of a TCA

401 cycle running in reverse direction from citrate enabling concomitantly the production of acetyl-

402 CoA. The electron transport chain is reduced to complex II and electron-transferring

403 flavoprotein dehydrogenase, and possibly uses rhodoquinone as the electron transporter. We

404 predict that the source of reduced ferredoxin for [FeFe]-hydrogenase comes from pyruvate.

405 The situation with the FeS cluster assembly in this amoeba seems rather complex as it contains

406 the most diverse set of NIF pathway proteins of all previously investigated Archamoebae. These

407 proteins very likely provide parallel FeS synthesis in MRO and cytoplasm, but in addition to this,

408 some may be involved in the activation of formate dehydrogenase as seen in some prokaryotes.

409 P. schiedti anaerobic peroxisomes, similarly to M. balamuthi, lack enzymes of β-oxidation of

410 fatty acids and catalase. Although the function of these peroxisomes needs to be clarified, the

411 set of predicted enzymes suggested significant metabolic diversity between the two amoebae

412 as well as from their aerobic counterparts.

413

414

19 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

415 Materials and methods

416 Cell culture

417 Polyxenic (and polyeukaryotic) culture of Pelomyxa schiedti strain SKADARSKE was maintained

418 in Sonneborn’s Paramecium medium [59] as described previously [27].

419

420 Genome and transcriptome sequencing and assembly

421 Genome sequencing was performed from whole genome amplified DNA (WGA). Individual cells

422 were picked by micromanipulation and washed twice in Trager U media [60]. Genomic DNA was

423 amplified using Illustra Single Cell GenomiPhi DNA Amplification Kit (GE Healthcare Life

424 Sciences) according to the manufacturer’s protocol and purified using ethanol precipitation.

425 Presence of the eukaryotic DNA was confirmed by amplification of a partial actin gene using

426 specific primers (S4 Table). Sequencing libraries from seven positive samples were prepared

427 using Illumina TruSeq DNA PCR-Free kit (Illumina). Samples Pelo2 and Pelo5 were sequenced on

428 Illumina MiSeq (2x300 bp; Genomic Core facility, Faculty of Science) and Nanopore (Oxford

429 Nanopore Technologies), samples P1 – P5 on Illumina HiSeq X (Macrogen Inc.). The Nanopore

430 library was prepared using Oxford Nanopore Technologies ligation sequencing kit (SQK-LSK108)

431 from 4 μg of T7 endonuclease I (New England Biolabs) treated WGA. Sequencing was

432 performed using a R9.4.1 Spot-On Flow cell (FLO-MIN106) for 48 hours.

433 For transcriptome sequencing, single-cells of P. schiedti were washed twice in Trager U

434 and amplification by 19 cycles was performed [61]. Five libraries were prepared using Nextera

435 XT DNA Library preparation Kit (Illumina) and sequenced on Illumina MiSeq (PE 2x300bp;

436 Genomic Core facility, Faculty of Science).

20 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

437 Raw Illumina DNA- and RNA-Seq reads were quality and adapter trimmed using BBDuk

438 v36.92 (part of BBTools suite: https://jgi.doe.gov/data-and-tools/bbtools/). Firstly, individual

439 single-cell genome assemblies for Pelo2, Pelo5, and P1 – P5 were generated with SPAdes

440 v3.11.1 [62] using single-cell (--sc) mode and a k-mer size of 127. As the 18S rDNAs extracted

441 from individual assemblies were identical, all reads (i.e., Illumina HiSeqs and MiSeq, and

442 Nanopore) were assembled together by SPAdes v3.11.1 using --sc and k-mers of 21, 33, 55, 77,

443 99, 121. The resulting assembly was binned and decontaminated using tetraESOM [63] and a

444 BLASTing strategy described previously [64]. The final assembly was scaffolded using

445 P_RNA_scaffolder [65]. Prediction was done using Augustus v3.3.1 [66], and further improved

446 by PASA and EVM [67] using the transcriptomic data. RNA-Seq reads were assembled using

447 Trinity v2.6.5 [68] with default parameters, and contaminants were removed by BLASTing

448 against the decontaminated genome assembly. RNA-Seq reads were mapped to the

449 transcriptome using Bowtie2 v2.3.0 [69] and to the genome using HISAT2 v2.0.5 [70]. Genome

450 and transcriptome completeness were assessed using BUSCO v3 with the eukaryota_odb9

451 dataset [71].

452

453 Sequence searches and localization predictions

454 Proteins predicted to localize in M. balamuthi hydrogenosome, E. histolytica mitosome, and A.

455 castellanii mitochondria served as queries in BLAST v2.6.0+ [72] searches through P. schiedti

456 assemblies. Sensitive searches for components of TOM/TIM machinery were done using

457 HMMER v3.3 [73]. Protein domains were predicted by InterProScan [74] implemented in

458 Geneious Prime v2020.2.3 [75].

21 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

459 Potentially mitochondrion-targeted proteins were identified using TargetP v2 [76],

460 PSORT II [77], MultiLoc2 [78], and NommPred [79] tools. Since P. schiedti does not harbor

461 plastid, the setting from MultiLoc2 was omitted and NommPred was used in the MRO and

462 in the Dictyostelium settings [19]. A protein was considered as mitochondrial if predicted by at

463 least one setting of MultiLoc2 or NommPred.

464 Peroxins were identified by BLAST searches using M. balamuthi queries. Peroxisomal

465 matrix proteins were predicted by searching for peroxisomal targeting signals (PTS). The

466 tripeptides SRI and [SAP][KR][LM] (excluding AKM, PKM, and PRM) were used to search for the

467 C-terminal PTS1. Proline at position -3 and methionine at position -1 were included based on

468 experimental verification in M. balamuthi [17]. Two nanopeptides R[LI](x5)HL were used for N-

469 terminal PTS2 searches [80]. All putative transmembrane proteins determined by TMHMM

470 Server v2.0 [81] were filtered out. PTS1 candidates were submitted to the PTS1 Predictor using

471 GENERAL function [82] evaluating twelve C-terminal residues.

472

473 Phylogenetic analyses

474 An 18S rRNA gene dataset was aligned by MAFFT v7 [83] server with the G-INS-i algorithm at

475 default settings and manually edited in BioEdit v7.0.4.1 [84] resolving 1,437 positions.

476 Phylogenetic tree was constructed using Maximum-Likelihood in RAxML v8.0.0 [85] under the

477 GTRGAMMAI model, 100 starting trees, and 1,000 bootstrap pseudoreplicates.

478 For selected proteins, datasets were aligned by MAFFT v7.313 [83], trimmed by trimAl

479 v1.4 [86] and Maximum-Likelihood trees were inferred by IQ-TREE v1.6.8 [87] using the

22 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

480 posterior mean site frequency method [88], LG+C20+F+G model, with the guide tree inferred

481 under LG+F+G. Branch supports were obtained by the ultrafast bootstrap approximation [89].

482

483 Immunofluorescence analysis

484 NifS genes (Pelo6206 and Pelo13211) were amplified from cDNA using specific primers (S4

485 Table) and PrimeSTAR® Max DNA Polymerase (Takara Bio Inc.) premix, cloned into pUG35

486 vector containing C-terminal green fluorescence protein (GFP), and transformed to S. cerevisiae

487 strain YPH499 using the lithium acetate method [90]. Transformants were grown on selective

488 medium without uracil (SD-URA) at 30 °C. For localization, transformed cells were incubated

489 with MitoTracker Red CMXRos (1:10,000; Thermo Fisher Scientific) for 10 minutes, followed by

490 two washes with PBS, and mounted in 1% low-melting agarose and imaged using a Leica SP8

491 confocal microscope. Deconvolution was performed using Huygens Professional v17.10 and

492 ImageJ v1.50b.

493

494 Transmission electron microscopy

495 A grown culture of P. schiedti was pelleted by centrifugation and fixed one hour on ice with

496 2.5% glutaraldehyde in 0.1 M cacodylate buffer (pH 7.2). After washing in 0.1 M cacodylate

497 buffer, the cells were postfixed one hour on ice with 1% OsO4. After washing with distilled

498 water, the fixed cells were dehydrated in a graded series of ethanol, transferred to acetone,

499 and embedded in EPON resin. Ultrathin sections were prepared on an ultramicrotome

500 (Reichert-Jung Ultracut E) with a diamond knife. Sections were stained with uranyl acetate and

501 lead citrate and examined using JEOL 1011 transmission electron microscope.

23 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

502

503

504 Data availability

505 The raw sequencing data are available at NCBI (https://www.ncbi.nlm.nih.gov/) as BioProject

506 PRJNA672820. Final assemblies are available from Zenodo at

507 https://zenodo.org/record/4733726#.YI_RPWYza3I.

508

509 Author contributions

510 SCT cultured cells, prepared sequencing libraries, and performed Nanopore sequencing. KZ and

511 SCT assembled genome and transcriptome, and predicted proteins. IČ provided the culture of P.

512 schiedti and conducted 18S rDNA phylogeny. KZ performed most of the bioinformatic and

513 phylogenetic analyses. KZ, IŠ-S, JT, and VH analyzed the metabolism of the mitochondrion-

514 related organelle. TL and JT analyzed the peroxisomal metabolism. PH performed the

515 microscopic observations. JT and VH supervised the project. KZ, JT, and VH wrote the

516 manuscript. All authors contributed to the editing of the final manuscript.

517

518 Acknowledgements

519 We thank Dr. Zoltán Füssy (Charles University, BIOCEV, Prague, Czech Republic) and Dr.

520 Courtney W. Stairs (Lund University, Lund, Sweden) for helpful discussions and advice.

521

522 Funding

24 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

523 This project has received funding from the European Research Council (ERC) under the

524 European Union’s Horizon 2020 research and innovation programme (grant agreement No.

525 771592) and from Ministry of Education, Youth and Sports (MEYS) of Czech Republic (CR) in

526 Centre for research of pathogenicity and virulence of parasites (project No.

527 CZ.02.1.01/0.0/0.0/16_019/0000759). We acknowledge Imaging Methods Core Facility at

528 BIOCEV, supported by the MEYS CR (Large RI Project LM2018129 Czech-BioImaging) and ERDF

529 (project No. CZ.02.1.01/0.0/0.0/16_013/0001775 and CZ.02.1.01/0.0/0.0/18_046/0016045) for

530 their support with obtaining imaging data. Computational resources were supplied by the

531 project "e-Infrastruktura CZ" (e-INFRA LM2018140) provided within the program Projects of

532 Large Research, Development and Innovations Infrastructures.

533

534 Competing interests

535 The authors declared no competing interests.

536

537

538 References

539 1. Müller M, Mentel M, van Hellemond JJ, Henze K, Woehle C, Gould SB, et al. Biochemistry

540 and evolution of anaerobic energy metabolism in eukaryotes. Microbiol Mol Biol Rev.

541 2012;76: 444–495.

542 2. Roger AJ, Muñoz-Gómez SA, Kamikawa R. The origin and diversification of mitochondria.

543 Curr Biol. 2017;27: R1177–R1192.

544 3. Gawryluk RMR, Stairs CW. Diversity of electron transport chains in anaerobic protists.

25 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

545 Biochim Biophys Acta - Bioenerg. 2021;1862: 148334.

546 4. Zaremba-Niedzwiedzka K, Caceres EF, Saw JH, Bäckström Di, Juzokaite L, Vancaester E, et

547 al. Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature.

548 2017;541: 353–358.

549 5. Sagan L. On the origin of mitosing cells. J Theor Biol. 1967;14: 255–274.

550 6. Martijn J, Vosseberg J, Guy L, Offre P, Ettema TJG. Deep mitochondrial origin outside the

551 sampled alphaproteobacteria. Nature. 2018;557: 101–105.

552 7. Gawryluk RMR, Kamikawa R, Stairs CW, Silberman JD, Brown MW, Roger AJ. The earliest

553 stages of mitochondrial adaptation to low oxygen revealed in a novel rhizarian. Curr Biol.

554 2016;26: 2729–2738.

555 8. Leger MM, Kolisko M, Kamikawa R, Stairs CW, Kume K, Čepička I, et al. Organelles that

556 illuminate the origins of Trichomonas hydrogenosomes and Giardia mitosomes. Nat Ecol

557 Evol. 2017;1: 0092.

558 9. Karnkowska A, Vacek V, Zubáčová Z, Treitli SC, Petrželková R, Eme L, et al. A

559 without a mitochondrial organelle. Curr Biol. 2016;26: 1274–1284.

560 10. Panigrahi AK, Ogata Y, Zíková A, Anupama A, Dalley RA, Acestor N, et al. A

561 comprehensive analysis of mitochondrial proteome. Proteomics.

562 2009;9: 434–450.

563 11. Lee CP, Taylor NL, Harvey Millar A. Recent advances in the composition and

564 heterogeneity of the Arabidopsis mitochondrial proteome. Front Plant Sci. 2013;4: 4.

565 12. Gawryluk RMR, Chisholm KA, Pinto DM, Gray MW. Compositional complexity of the

566 mitochondrial proteome of a unicellular eukaryote (Acanthamoeba castellanii,

26 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

567 supergroup ) rivals that of , fungi, and . J Proteomics. 2014;109:

568 400–416.

569 13. Žárský V, Tachezy J. Evolutionary loss of peroxisomes - not limited to parasites. Biol

570 Direct. 2015;10: 74.

571 14. Schlüter A, Fourcade S, Ripp R, Mandel JL, Poch O, Pujol A. The evolutionary origin of

572 peroxisomes: An ER-peroxisome connection. Mol Biol Evol. 2006;23: 838–845.

573 15. Gabaldón T. Peroxisome diversity and evolution. Philos Trans R Soc B Biol Sci. 2010;365:

574 765–773.

575 16. Gabaldón T, Ginger ML, Michels PAM. Peroxisomes in parasitic protists. Mol Biochem

576 Parasitol. 2016;209: 35–45.

577 17. Le T, Žárský V, Nývltová E, Rada P, Harant K, Vancová M, et al. Anaerobic peroxisomes in

578 Mastigamoeba balamuthi. Proc Natl Acad Sci U S A. 2020;117: 2065–2075.

579 18. Pánek T, Zadrobílková E, Walker G, Brown MW, Gentekaki E, Hroudová M, et al. First

580 multigene analysis of Archamoebae (Amoebozoa: ) robustly reveals its phylogeny

581 and shows that represents a deep lineage of the group. Mol Phylogenet

582 Evol. 2016;98: 41–51.

583 19. Kang S, Tice AK, Spiegel FW, Silberman JD, Pánek T, Cepicka I, et al. Between a pod and a

584 hard test: The deep evolution of Amoebae. Mol Biol Evol. 2017;34: 2258–2270.

585 20. Pearce XG, Annesley SJ, Fisher PR. The Dictyostelium model for mitochondrial biology

586 and disease. Int J Dev Biol. 2019;63: 497–508.

587 21. Leger MM, Gawryluk RMR, Gray MW, Roger AJ. Evidence for a hydrogenosomal-type

588 anaerobic ATP generation pathway in Acanthamoeba castellanii. PLoS One. 2013;8:

27 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

589 e69532.

590 22. Mi-ichi F, Yousuf MA, Nakada-Tsukui K, Nozaki T. Mitosomes in Entamoeba histolytica

591 contain a sulfate activation pathway. Proc Natl Acad Sci. 2009;106: 21731–21736.

592 23. Gill EE, Diaz-Triviño S, Barberà MJ, Silberman JD, Stechmann A, Gaston D, et al. Novel

593 mitochondrion-related organelles in the anaerobic amoeba Mastigamoeba balamuthi.

594 Mol Microbiol. 2007;66: 1306–1320.

595 24. Nývltová E, Stairs CW, Hrdý I, Rídl J, Mach J, Pačes J, et al. Lateral gene transfer and gene

596 duplication played a key role in the evolution of Mastigamoeba balamuthi

597 hydrogenosomes. Mol Biol Evol. 2015/01/07. 2015;32: 1039–1055.

598 25. Nývltová E, Šuták R, Harant K, Šedinová M, Hrdy I, Paces J, et al. NIF-type iron-sulfur

599 cluster assembly system is duplicated and distributed in the mitochondria and cytosol of

600 Mastigamoeba balamuthi. Proc Natl Acad Sci U S A. 2013;110: 7371–7376.

601 26. Seravin L, Goodkov A. Cytoplasmic microbody-like granules of the amoeba Pelomyxa

602 palustris. Tsitologiya. 1987;29: 600–603.

603 27. Zadrobílková E, Walker G, Čepička I. Morphological and molecular evidence support a

604 close relationship between the free-living Archamoebae and Pelomyxa.

605 . 2015;166: 14–41.

606 28. Žárský V, Klimeš V, Pačes J, Vlček Č, Hradilová M, Beneš V, et al. The Mastigamoeba

607 balamuthi genome and the nature of the free-living ancestor of Entamoeba. Mol Biol

608 Evol. 2021;msab020: .

609 29. Wang D, Hancock J. IntronDB: A database for eukaryotic intron features. Bioinformatics.

610 2019;35: 4400–4401.

28 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

611 30. Dolezal P, Dagley MJ, Kono M, Wolynec P, Likić VA, Foo JH, et al. The essentials of protein

612 import in the degenerate mitochondrion of Entamoeba histolytica. PLoS Pathog. 2010;6:

613 e1000812.

614 31. Schneider A. Evolution of mitochondrial protein import - Lessons from trypanosomes.

615 Biol Chem. 2020;401: 663–676.

616 32. Makki A, Rada P, Žárský V, Kereïche S, Kováčik L, Novotný M, et al. Triplet-pore structure

617 of a highly divergent TOM complex of hydrogenosomes in . PLoS

618 Biol. 2019;17: e3000098.

619 33. Hao H-X, Khalimonchuk O, Schraders M, Dephoure N, Bayley J-P, Kunst H, et al. SDH5, a

620 gene required for flavination of succinate dehydrogenase, is mutated in paraganglioma.

621 Science. 2009;325: 1139–1142.

622 34. Stairs CW, Eme L, Muñoz-Gómez SA, Cohen A, Dellaire G, Shepherd JN, et al. Microbial

623 eukaryotes have adapted to hypoxia by horizontal acquisitions of a gene involved in

624 rhodoquinone biosynthesis. Elife. 2018;7: e34292.

625 35. Castro-Guerrero NA, Jasso-Chávez R, Moreno-Sánchez R. Physiological role of

626 rhodoquinone in gracilis mitochondria. Biochim Biophys Acta. 2005;1710: 113–

627 121.

628 36. Coustou V, Besteiro S, Rivière L, Biran M, Biteau N, Franconi JM, et al. A mitochondrial

629 NADH-dependent fumarate reductase involved in the production of succinate excreted

630 by procyclic Trypanosoma brucei. J Biol Chem. 2005;280: 16559–16570.

631 37. Jitrapakdee S, St Maurice M, Rayment I, Cleland WW, Wallace JC, Attwood P V. Structure,

632 mechanism and regulation of pyruvate carboxylase. Biochem J. 2008;413: 369–387.

29 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

633 38. Leger MM, Eme L, Hug LA, Roger AJ. Novel hydrogenosomes in the microaerophilic

634 Stygiella incarcerata. Mol Biol Evol. 2016/06/08. 2016;33: 2318–2336.

635 39. Kikuchi G. The glycine cleavage system: composition, reaction mechanism, and

636 physiological significance. Mol Cell Biochem. 1973;1: 169–187.

637 40. Dale RA. Catabolism of threonine in mammals by coupling of L-threonine 3-

638 dehydrogenase with 2-amino-3-oxobutyrate-CoA ligase. Biochim Biophys Acta. 1978;544:

639 496–503.

640 41. Schertl P, Braun H-P. Respiratory electron transfer pathways in plant mitochondria. Front

641 Plant Sci. 2014;5: 163.

642 42. Thomson JM, Gaucher EA, Burgan MF, De Kee DW, Li T, Aris JP, et al. Resurrecting

643 ancestral alcohol dehydrogenases from yeast. Nat Genet. 2005;37: 630–635.

644 43. Hügler M, Wirsen CO, Fuchs G, Taylor CD, Sievert SM. Evidence for autotrophic CO2

645 fixation via the reductive tricarboxylic acid cycle by members of the ε subdivision of

646 . J Bacteriol. 2005;187: 3020–3027.

647 44. Jiang L, Boufersaoui A, Yang C, Ko B, Rakheja D, Guevara G, et al. Quantitative metabolic

648 flux analysis reveals an unconventional pathway of fatty acid synthesis in cancer cells

649 deficient for the mitochondrial citrate transport protein. Metab Eng. 2017;43: 198–207.

650 45. Verschueren KHG, Blanchet C, Felix J, Dansercoer A, De Vos D, Bloch Y, et al. Structure of

651 ATP citrate lyase and the origin of citrate synthase in the Krebs cycle. Nature. 2019;568:

652 571–575.

653 46. Taylor EB. Functional properties of the mitochondrial carrier system. Trends Cell Biol.

654 2017;27: 633–644.

30 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

655 47. Jackson JB, Peake SJ, White SA. Structure and mechanism of proton-translocating

656 transhydrogenase. FEBS Lett. 1999;464: 1–8.

657 48. Yu Y, Samuelson J. Primary structure of an Entamoeba histolytica nicotinamide

658 nucleotide transhydrogenase. Mol Biochem Parasitol. 1994;68: 323–328.

659 49. Yousuf MA, Mi-ichi F, Nakada-Tsukui K, Nozaki T. Localization and targeting of an unusual

660 pyridine nucleotide transhydrogenase in Entamoeba histolytica. Eukaryot Cell.

661 2010/04/09. 2010;9: 926–933.

662 50. Kuchenreuther JM, Britt RD, Swartz JR. New insights into [FeFe] hydrogenase activation

663 and maturase function. PLoS One. 2012;7: e45850.

664 51. Tachezy J, Doležal P. Iron–sulfur proteins and iron–sulfur cluster assembly in organisms

665 with hydrogenosomes and mitosomes. In: Martin WF, Müller M, editors. Origin of

666 Mitochondria and Hydrogenosomes. Berlin, Heidelberg: Springer Berlin Heidelberg;

667 2007. pp. 105–133.

668 52. Stairs CW, Eme L, Brown MW, Mutsaers C, Susko E, Dellaire G, et al. A SUF Fe-S cluster

669 biogenesis system in the mitochondrion-related organelles of the anaerobic protist

670 Pygsuia. Curr Biol. 2014;24: 1176–1186.

671 53. Thomé R, Gust A, Toci R, Mendel R, Bittner F, Magalon A, et al. A sulfurtransferase is

672 essential for activity of formate dehydrogenases in Escherichia coli. J Biol Chem.

673 2012;287: 4671–4678.

674 54. Mansy SS, Wu G, Surerus KK, Cowan JA. Iron-sulfur cluster biosynthesis. Thermatoga

675 maritima IscU is a structured iron-sulfur cluster assembly protein. J Biol Chem. 2002;277:

676 21397–21404.

31 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

677 55. Kaiser JT, Clausen T, Bourenkow GP, Bartunik HD, Steinbacher S, Huber R. Crystal

678 structure of a NifS-like protein from Thermotoga maritima: implications for iron sulphur

679 cluster assembly. J Mol Biol. 2000;297: 451–464.

680 56. Mi-ichi F, Makiuchi T, Furukawa A, Sato D, Nozaki T. Sulfate activation in mitosomes plays

681 an important role in the proliferation of Entamoeba histolytica. PLoS Negl Trop Dis.

682 2011;5: e1263.

683 57. Mi-ichi F, Nozawa A, Yoshida H, Tozawa Y, Nozaki T. Evidence that the Entamoeba

684 histolytica mitochondrial carrier family links mitosomal and cytosolic pathways through

685 exchange of 3-phosphoadenosine 5-phosphosulfate and ATP. Eukaryot Cell. 2015;14:

686 1144–1150.

687 58. Liu S, He L, Yao K. The antioxidative function of alpha-ketoglutarate and its applications.

688 Biomed Res Int. 2018;2018: 3408467.

689 59. Sonneborn TM. Methods in the general biology and genetics of Paramecium aurelia. J

690 Exp Zool. 1950;113: 87–147.

691 60. Trager W. The cultivation of a cellulose-digesting , Trichomonas termopsidis,

692 and of certain other . Biol Bull. 1934;66: 182–190.

693 61. Picelli S, Faridani OR, Björklund AK, Winberg G, Sagasser S, Sandberg R. Full-length RNA-

694 seq from single cells using Smart-seq2. Nat Protoc. 2014;9: 171–181.

695 62. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new

696 genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol.

697 2012/04/16. 2012;19: 455–477.

698 63. Dick GJ, Andersson AF, Baker BJ, Simmons SL, Thomas BC, Yelton AP, et al. Community-

32 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

699 wide analysis of microbial genome sequence signatures. Genome Biol. 2009/08/21.

700 2009;10: R85–R85.

701 64. Treitli SC, Kolisko M, Husník F, Keeling PJ, Hampl V. Revealing the metabolic capacity of

702 Streblomastix strix and its bacterial symbionts using single-cell metagenomics. Proc Natl

703 Acad Sci U S A. 2019;116: 19675–19684.

704 65. Zhu B-H, Xiao J, Xue W, Xu G-C, Sun M-Y, Li J-T. P_RNA_scaffolder: a fast and accurate

705 genome scaffolder using paired-end RNA-sequencing reads. BMC Genomics. 2018;19:

706 175.

707 66. Stanke M, Schöffmann O, Morgenstern B, Waack S. Gene prediction in eukaryotes with a

708 generalized hidden Markov model that uses hints from external sources. BMC

709 Bioinformatics. 2006;7: 62.

710 67. Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, et al. Automated eukaryotic gene

711 structure annotation using EVidenceModeler and the Program to Assemble Spliced

712 Alignments. Genome Biol. 2008;9: R7–R7.

713 68. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length

714 transcriptome assembly from RNA-Seq data without a reference genome. Nat

715 Biotechnol. 2011/05/17. 2011;29: 644–652.

716 69. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods.

717 2012/03/06. 2012;9: 357–359.

718 70. Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and

719 genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37: 907–915.

720 71. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva E V, Zdobnov EM. BUSCO: assessing

33 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

721 genome assembly and annotation completeness with single-copy orthologs.

722 Bioinformatics. 2015/06/11. 2015;31: 3210–3212.

723 72. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J

724 Mol Biol. 1990;215: 403–410.

725 73. Eddy SR. A new generation of homology search tools based on probabilistic inference.

726 Genome Inf. 2010/02/25. 2009;23: 205–211.

727 74. Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-

728 scale protein function classification. Bioinformatics. 2014/01/24. 2014;30: 1236–1240.

729 75. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic:

730 an integrated and extendable desktop software platform for the organization and

731 analysis of sequence data. Bioinformatics. 2012/05/01. 2012;28: 1647–1649.

732 76. Almagro Armenteros JJ, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson

733 A, et al. Detecting sequence signals in targeting peptides using deep learning. Life Sci

734 Alliance. 2019;2: e201900429.

735 77. Horton P, Nakai K. Better prediction of protein cellular localization sites with the k

736 nearest neighbors classifier. Proc Int Conf Intell Syst Mol Biol. 1997;5: 147–152.

737 78. Blum T, Briesemeister S, Kohlbacher O. MultiLoc2: Integrating phylogeny and gene

738 ontology terms improves subcellular protein localization prediction. BMC Bioinformatics.

739 2009/09/03. 2009;10: 274.

740 79. Kume K, Amagasa T, Hashimoto T, Kitagawa H. NommPred: Prediction of mitochondrial

741 and mitochondrion-related organelle proteins of nonmodel organisms. Evol Bioinform

742 Online. 2018;14: 1176934318819835.

34 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

743 80. Reumann S. Specification of the peroxisome targeting signals type 1 and type 2 of plant

744 peroxisomes by bioinformatics analyses. Plant Physiol. 2004;135: 783–800.

745 81. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein

746 topology with a hidden Markov model: application to complete genomes. J Mol Biol.

747 2001/01/12. 2001;305: 567–580.

748 82. Neuberger G, Maurer-Stroh S, Eisenhaber B, Hartig A, Eisenhaber F. Prediction of

749 peroxisomal targeting signal 1 containing proteins from amino acid sequence. J Mol Biol.

750 2003;328: 581–592.

751 83. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7:

752 improvements in performance and usability. Mol Biol Evol. 2013/01/19. 2013;30: 772–

753 780.

754 84. Hall TA. BioEdit: a user-friendly biological sequence alignment editor and analysis

755 program for Windows 95/98/ NT. Nucleic Acids Symp Ser. 1999;41: 95–98.

756 85. Stamatakis A. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large

757 phylogenies. Bioinformatics. 2014;30: 1312–1313.

758 86. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated

759 alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009/06/10.

760 2009;25: 1972–1973.

761 87. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic

762 algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2014/11/06.

763 2015;32: 268–274.

764 88. Wang H-C, Minh BQ, Susko E, Roger AJ. Modeling site heterogeneity with posterior mean

35 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

765 site frequency profiles accelerates accurate phylogenomic estimation. Syst Biol. 2018;67:

766 216–235.

767 89. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: Improving the

768 ultrafast bootstrap approximation. Mol Biol Evol. 2017;35: 518–522.

769 90. Gietz RD, Woods RA. Yeast transformation by the LiAc/SS Carrier DNA/PEG method. In:

770 Xiao W, editor. Yeast Protocols. Totowa, NJ: Humana Press; 2006. pp. 107–120.

771

772

773 Supporting information

774 S1 Fig. Phylogenetic analysis of amoebozoan 18S rDNA. The Maximum Likelihood tree places

775 Pelomyxa schiedti in monophyletic Pelomyxidae group inside monophyletic Archamoebae.

776 Standard bootstrap support values are shown when ≥ 50%.

777

778 S2 Fig. BUSCO analysis of the Pelomyxa schiedti transcriptome and predicted proteins.

779 Completeness of P. schiedti datasets were assessed using the odb9_eukaryota dataset and

780 compared with completeness of predicted proteins from Mastigamoeba balamuthi.

781

782 S3 Fig. Phylogenetic analysis of HSP70 proteins. The Maximum Likelihood phylogenetic tree

783 documents that one of the Pelomyxa schiedti HSP70 sequence is related to mitochondrial

784 orthologues from other eukaryotes. Ultrafast bootstrap support values are shown when ≥ 75%.

785

36 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

786 S4 Fig. Phylogenetic analysis of PFO enzymes. The Maximum Likelihood phylogenetic tree

787 identified a PFO version putatively operating in Pelomyxa schiedti MRO. Hydrogenosomal PFO

788 copies of Mastigamoeba balamuthi are marked with stars. Number in parenthesis shows

789 number of species in the collapsed clade. Ultrafast bootstrap support values are shown when ≥

790 75%.

791

792 S5 Fig. Sequences of Pelomyxa schiedti components of NIF system. (A) The diagram depicts P.

793 schiedti protein Pelo10620 composed of a NifU N-terminal domain fused to a FdhD (formate

794 dehydrogenase accessory sulfurtransferase) C-terminal domain as determined by InterProScan.

795 (B-C) Sequence alignment of NifU (B) and NifS (C) proteins from P. schiedti and Mastigamoeba

796 balamuthi in comparison with bacterial homologues from Thermotoga maritima. The amino

797 acid residues necessary for the function of NifU and NifS are labeled according to the legend.

798

799 S6 Fig. Phylogenetic analysis of transporters involved in the sulfate activation pathway. (A)

800 The phylogenetic analysis did not resolve which one of the sodium/sulfate symporters of

801 Pelomyxa schiedti is related to the Entamoeba histolytica mitosomal transporter. (B) The

802 Maximum Likelihood phylogenetic tree confirms one P. schiedti transporter as PAPS (3’-

803 phosphoadenosine 5’-phosphosulfate) transporter, while two others belong to a broader

804 mitochondrial carrier family of transporters. Experimentally proven mitosomal transporters of

805 E. histolytica are marked with stars. Ultrafast bootstrap support values are shown when ≥ 75%.

806

37 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.

807 S1 Table: Statistics of Pelomyxa schiedti assemblies were compared with those of

808 Mastigamoeba balamuthi.

809

810 S2 Table: Proteins targeted to MRO of Pelomyxa schiedti. Localization of proteins was

811 predicted by several tools, as listed in columns E - J. Mitochondrial predictions are highlighted

812 by white font on blue background. Column K shows final inferred prediction of localization. mit,

813 mitochondrial; cyt, cytosolic; SP, signal peptide; ER, endoplasmic reticulum; nuc, nuclear;

814 extracell, extracellular; sec, secretory system; perox, peroxisomal; PM, plasma membrane;

815 Other, other localization; -, protein not localized in MRO; +, protein localized in MRO; +?,

816 protein localized in MRO with low confidence.

817

818 S3 Table. Proteins required for peroxisome biogenesis and targeted to peroxisome. Proteins

819 identified in Mastigamoeba balamuthi are highlighted by white font on blue background.

820 Proteins were considered peroxisome-targeted, if they contained PTS1 (SRI or [SAP][KR][LM]) or

821 PTS2 motif (R[LIV](x5)HL), and/or were predicted by PTS1 predictor [82].

822

823 S4 Table: Primers used in this study.

38 Ougroup 0.1 substitution/site Rhizomastix libera JX157645 Rhizomastix elongata KP343633 91 Rhizomastixidae 100 Rhizomastix vacuolata KP343626 bioRxiv100 preprintRhizomastix doi: bicoronata https://doi.org/10.1101/2021.05.20.444135 KP343636 ; this version posted May 21, 2021. The copyright holder for this preprint (which 64 AF149913 was not certified by peerEntamoeba review) coli is AF149915 the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made 100 D28490available under aCC-BY-NC-ND 4.0 International license. 78 52 Entamoeba equi DQ286371 93 Entamoeba bovis FN666248 Entamoebidae 96 Entamoeba histolytica X65163 77 Entamoeba hartmanni FR686379 94 91 AF149905 100 Entamoeba ranarum AF149908 Mastigella eilhardi KJ879584 Mastigella erinacea KJ879586 88 Pelomyxa schiedti SKADARSKE 88 Pelomyxidae 100 Pelomyxa belevskii JX157662 99 Pelomyxa stagnalis JX157660 68 Pelomyxa palustris JX157661 Mastigamoeba errans JX157637 100 Mastigamoeba sp. AM114798 69 Mastigamoeba sp. JX157664 99 Mastigamoeba balamuthi L23799 95 71 Mastigamoeba punctachora JX157653 Mastigamoeba lenta JX157656 Mastigamoeba simplex AF421218 Mastigamoebidae 95 100 Mastigamoeba guttula JX157649 82 Mastigamoeba abducta JX157636 53 56 piscium JX101953 Endolimax nana AF149916 100 Iodamoeba sp. JN635746 Archamoebae 96 Iodamoeba sp. JN635740 mit prediction by all four tools mit prediction by at least two tools no mit prediction mit prediction by at least three tools mit prediction by at least one tool not identified SAM37

TOM40 SAM50 outer membrane

RQH SDHD SDHC e- 2 bioRxiv preprintTIM doi: https://doi.org/10.1101/2021.05.20.444135ETFDH; this version postedRQ e May- 21, 2021. The copyrightinner membrane holder for this preprint (which was not certified by peer review) is the author/funder, who has grantedUQ RquA bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. FAD SDHB HSP70 PROTEIN QUINONE RESPIRATORY FADH2 ETFa SDH5 IMPORT UTILIZATION ETFa COMPLEX II ETFb SDHA cpn10 ETFb cpn60 2-oxoglutarate MPP succinate fumarate a+b PROLINE DEGRADATION proline + NADH+H H2 O + 2-oxoglutarate FH FeS CLUSTER NAD + GDH OGC 2-oxoglutarate P5CR NAD(P) alanine ASSEMBLY ProDH + aspartate NAD(P)H+H CO2 malate malate cysteine NifS + AAT NifU+FdhD pyrroline-5- NAD 2+ + Fe NADH+H + carboxylate NADPH+H P5CDH glutamate + IDH NAD + + NADP + NADH+H Na NaS PYRUVATE METABOLISM MDH isocitrate HydE ATP 2- HydF HydG H2 pyruvate ADP SO4 + Fd PC ATP CO2 oxaloacetate ACO PFO AS HydA ADP CIC - PPi IPP Pi Fd acetyl-CoA citrate + APS H CO2 ACL ACS ATP APSK CoA ADP acetate ATP ATP PAPS GLYCINE CLEAVAGE SULFATE ATP SYNTHESIS citrate MCF SYSTEM ACTIVATION ATP ATP 2 ADP AK serine SHMT glycine AMP

NADH+H+ THF + THF-CH NAD 2 COFACTOR REGENERATION MCF + TRANS- glycine P L NADH+H T H aldehyde + MCF PORTERS Fe-ADH NAD + + H + NADH+H CO2 NH3 + NAD+ NADH LPLA NAD + Zn-ADH NADPH NADP lipoamide alcohol

PNT

H+ A B Hippea jasoniae WP_035589794.1 BS/UFB Micromonas pusilla XP_003062581.1 96 BS/UFB 80/96 Solanum lycopersicum XP_004231957.1 mitochondrial Cys Caminibacter mediatlanticus WP_007473033.1 100 desulfurase clade 90/99 0.5 substitution/site Vitis vinifera XP_010646324.1 0.5 substitution/site Nautilia profundicola WP_012663496.1 Arabidopsis thaliana NP_201373.1 Campylobacter curvus WP_018137404.1 Brassica oleracea XP_013612151.1 67/88 1 substitution/site Campylobacter concisus WP_103617351.1 Danio rerio NP_001038458.1 87 Falco peregrinus XP_027639524.1 CampylobacterbioRxiv preprint homin doi:is WP_011991512.1 https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which 100 Xenopus laevis XP_018090723.1 Campylobacterwas not certifiedgracilis WP_005871570.1 by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0Ailuropoda International melanoleuca license XP_019652178.1. WP_052800029.1 92/99 Mus musculus NP_035041.2 80/100 Homo sapiens NP_066923.3 Campylobacter coli WP_075479177.1 Pelomyxa schiedti Pelo14142 Campylobacter avium WP_094325730.1 89 Candidatus Ozemobacter sibiricus RCK81131.1 Campylobacter showae WP_122871831.1 65/81 100 Candidatus Riflebacteria bacterium NLI78869.1 Campylobacter fetus WP_039363019.1 Deltaproteobacteria bacterium NLJ29472.1 Candidatus Poribacteria bacterium RKY01216.1 Sulfurovum sp. UBA12169 DAB39917.1 Candidatus Poribacteria bacterium HDX76784.1 Arcobacter mytili WP_114841279.1 -/87 bacterium HGH06956.1 Candidatus Micrarchaeota archaeon HGD47134.1 99 Arcobacter pacificus WP_130232783.1 -/87 52/- 100 Peptococcaceae bacterium BRH_c4b KJS17296.1 Arcobacter ellisii WP_118916683.1 67/97 Clostridia bacterium NLW25537.1 -/95 89/96 Arcobacter butzleri WP_014468333.1 60/99 Calderihabitans maritimus WP_088553978.1 88/90 Arcobacter aquimarinus WP_129095780.1 -/82 Carboxydocella WP_078665453.1 61/95 Heliophilum fasciatum WP_131919075.1 Sulfuricurvum sp. IAE1 WP_132813287.1 70/82 99 Heliorestis acidaminivorans WP_151621636.1 99 Sulfurimonas denitrificans WP_011373621.1 100 100 Peptococcaceae bacterium HHV63980.1 pylori WP_121036488.1 Desulfitobacterium dichloroeliminans WP_015262887.1 57/- Helicobacter jaachi WP_034352150.1 Desulfitobacterium sp. PCE1 WP_019850928.1 100/98 Desulfitobacterium dehalogenans HHY28140.1 83/98 Helicobacter anseris WP_115578997.1 64/- Desulfitobacterium dehalogenans WP_014794781.1 52/79 Helicobacter mustelae WP_013023432.1 85/96 Desulfitobacterium chlororespirans WP_072773467.1 59/81 Helicobacter bilis WP_004086798.1 Desulfitobacterium hafniense WP_018211119.1 60/79 Desulfitobacterium WP_011460327.1 Spirochaetales bacterium PIE05054.1 80/- Desulfitobacterium hafniense WP_005810829.1 Chrysiogenes arsenatis WP_027389677.1 Caminibacter mediatlanticus WP_007473035.1 73/88 81 Nautilia profundicola WP_015902151.1 98 Spirochaetes bacterium RBG_13_68_11 OHD74626.1 Desulfuromonas acetoxidans WP_005997885.1 Chrysiogenales bacterium TFG81061.1 -/82 Geothermobacter sp. HR-1 WP_103115417.1 Entamoeba invadens XP_004254879.1 96 Geothermobacter sp. EPR-M WP_085011872.1 Archamoebae 100 Entamoeba histolytica XP_ 655888.1 64 Spirochaetes bacterium GWF1_51_8 OHD56356.1 Spirochaetes bacterium GWF1_49_6 OHD53472.1 Entamoeba nuttalli XP_008855081.1 81 Chrysiogenales bacterium TFG81062.1 -/80 Pelomyxa schiedti Pelo19958 81/98 88/100 Spirochaetes bacterium RBG_13_68_11 OHD74625.1 78/94 Mastigamoeba balamuthi AGA37365.1 54/83 Spirochaetes bacterium DG_61 KPJ87573.1 Spirochaetes bacterium RKX89561.1 Pelomyxa schiedti Pelo10620 -/78 Spirochaetaceae bacterium 4572_7 OQY39222.1 Pelomyxa schiedti Pelo14273 53/87 Spirochaeta perfilievii WP_149567788.1 88/96 Mastigamoeba balamuthi AGA37363.1 Pelomyxa schiedti Pelo6206 Mastigamoeba balamuthi AGA37364.1 61 98 Pelomyxa schiedti Pelo3572 -/84 Pelomyxa schiedti Pelo594 Pelomyxa schiedti Pelo19960 93/- Pelomyxa schiedti Pelo19961 -/83 91/99 Pelomyxa schiedti Pelo13211 Mastigamoeba balamuthi AGA37366.1 55/92 Entamoeba invadens XP_004257277.1 C Entamoeba dispar XP_001740105.1 GFP MitoTracker Merged DIC Entamoeba histolytica XP_655257.1 Archamoebae Entamoeba nuttalli EKE40908.1 Pelo13211 Elusimicrobia bacterium HAM34917.1 NifS Elusimicrobia bacterium HAH07944.1 99 Wolinella succinogenes WP_011139978.1 100 Wolinella succinogenes WP_129545331.1 50/89 Helicobacter rodentium WP_026943642.1 WP_000941636.1 83/97 85 Helicobacter bilis WP_080732238.1 100 Hydrogenimonas thermophila 5ZSP 92/100 Arcobacter ebronensis WP_128981712.1 55/96 Arcobacter defluvii WP_014473466.1 99/100 Arcobacter ellisii WP_118918609.1 -/82 bacterium HFS85011.1 Pelo6206 Sulfurovum sp. RUM71353.1 98/100 NifS 98/100 Sulfurovum sp. CAA6827990.1 84/99 Epsilonproteobacteria bacterium HFQ60893.1 58/93 Campylobacter pinnipediorum WP_069636381.1 97/100 Campylobacter coli WP_057032214.1 82/99 Campylobacter ureolyticus WP_016646418.1 74/93 Sulfurospirillum deleyianum WP_012857911.1 81/98 Epsilonproteobacteria bacterium NCD11503.1 99/100 Sulfurospirillum sp. UCH001 WP_067178139.1 Epsilonproteobacteria bacterium NCB54812.1 72/90 Sulfurospirillum sp. MES KHG34166.1 */88 Sulfurospirillum cavolei DAB36060.1 64/100 Sulfurospirillum cavolei WP_041963135.1 bioRxiv preprint doi: https://doi.org/10.1101/2021.05.20.444135; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.