<<

A cnidarian parasite of (: Henneguya) lacks a mitochondrial genome

Dayana Yahalomia,1, Stephen D. Atkinsonb,1, Moran Neuhofc, E. Sally Changd,e, Hervé Philippef,g, Paulyn Cartwrightd, Jerri L. Bartholomewb, and Dorothée Huchona,h,2

aSchool of , George S. Wise Faculty of Life Sciences, Tel Aviv University, 6997801 Tel Aviv, Israel; bDepartment of Microbiology, Oregon State University, Corvallis, OR 97331; cSchool of Neurobiology, Biochemistry & Biophysics, George S. Wise Faculty of Life Sciences, Tel-Aviv University, 6997801 Tel- Aviv, Israel; dDepartment of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS 66045; eComputational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892; fCNRS, Station d’Ecologie Expérimentale du CNRS, 09200 Moulis, France; gDépartement de Biochimie, Centre Robert-Cedergren, Universitéde Montréal, Montréal, QC H3C 3J7, Canada; and hThe Steinhardt Museum of Natural History and National Research Center, Tel Aviv University, 6997801 Tel Aviv, Israel

Edited by W. Ford Doolittle, Dalhousie University, Halifax, Canada, and approved January 10, 2020 (received for review June 25, 2019) Although aerobic respiration is a hallmark of eukaryotes, a few uni- have highly divergent genome structures, with large multipartite cellular lineages, growing in hypoxic environments, have second- circular mt chromosomes and unusually high evolutionary rates (8, arily lost this ability. In the absence of oxygen, the mitochondria of 9). To gain further insight into the evolution of the myxozoan these organisms have lost all or parts of their genomes and evolved mt genome, we studied two closely related freshwater species, into mitochondria-related organelles (MROs). There has been debate regarding the presence of MROs in . Using deep sequencing approaches, we discovered that a member of the Significance , the myxozoan Henneguya salminicola, has no mitochon- drial genome, and thus has lost the ability to perform aerobic Mitochondrial respiration is an ancient characteristic of eukary- cellular respiration. This indicates that these core eukaryotic fea- otes. However, it was lost independently in multiple eukaryotic tures are not ubiquitous among animals. Our analyses suggest lineages as part of adaptations to an anaerobic lifestyle. We that H. salminicola lost not only its mitochondrial genome but also show that a similar adaptation occurred in a member of the nearly all nuclear genes involved in transcription and replication of Myxozoa, a large group of microscopic parasitic animals that are the mitochondrial genome. In contrast, we identified many genes closely related to and hydroids. Using deep sequencing that encode proteins involved in other mitochondrial pathways approaches supported by microscopic observations, we present and determined that genes involved in aerobic respiration or mi- evidence that an has lost its mitochondrial genome. The tochondrial DNA replication were either absent or present only as myxozoan cells retain structures deemed -related pseudogenes. As a control, we used the same sequencing and organelles, but have lost genes related to aerobic respiration annotation methods to show that a closely related myxozoan, and mitochondrial genome replication. Our discovery shows squamalis, has a mitochondrial genome. The molecular that aerobic respiration, one of the most important metabolic results are supported by fluorescence micrographs, which show pathways, is not ubiquitous among animals. the presence of mitochondrial DNA in M. squamalis, but not in H. salminicola. Our discovery confirms that adaptation to an anaerobic Author contributions: P.C., J.L.B., and D.H. designed research; D.Y., S.D.A., and D.H. per- environment is not unique to single-celled eukaryotes, but has also formed research; D.Y., J.L.B., and D.H. contributed new reagents/analytic tools; D.Y., M.N., E.S.C., H.P., and D.H. analyzed data; and D.Y., S.D.A., and D.H. wrote the paper. evolved in a multicellular, parasitic animal. Hence, H. salminicola provides an opportunity for understanding the evolutionary transi- The authors declare no competing interest. tion from an aerobic to an exclusive anaerobic metabolism. This article is a PNAS Direct Submission. Published under the PNAS license. Cnidaria | mitochondrial evolution | mitochondria-related organelle | Data deposition: Voucher paratype material was deposited at the US National Parasite MRO | cristae Collection, Smithsonian Institution, Washington, DC (https://collections.nmnh.si.edu/ search/iz/) under the following accession numbers: Henneguya salminicola myxospores: USNM1611578 (from genome sample) and USNM1611579 (from transcriptome sample); he acquisition of the mitochondrion was a fundamental event Myxobolus squamalis myxospores: USNM1611580 (from genome sample); M. squamalis Tin the evolution of eukaryotes, and most extant eukaryotes myxospores and developmental stages: USNM1611581 (from transcriptome sample). All sequence data have been deposited in the National Center for Biotechnology Information cannot survive without oxygen. Interestingly, the loss of aerobic (NCBI) database (https://www.ncbi.nlm.nih.gov/). The H. salminicola data are available respiration has occurred independently in several eukaryotic under the BioProject accession number PRJNA485580. The SSU rDNA sequence was de- lineages that adapted to low-oxygen environments and replaced posited under MK480607. The raw transcriptome and genome reads are available under accession numbers SRR7754566 and SRR7754567, respectively. The transcriptome and the standard mitochondrial (mt) oxidative phosphorylation path- genome shotgun assembly projects were deposited at under the accession GHBP00000000 way with novel anaerobic metabolic mechanisms (Fig. 1) (1, 2). and SGJC00000000, respectively. The M. squamalis data are available under the BioProject Such anaerobic metabolism occurs within mitochondria-related accession number PRJNA485581. The SSU rDNA sequence was deposited under organelles (MROs), which have often lost their cristae, and in- MK480606. The raw transcriptome and genome reads are available under the accession numbers SRR7760054 and SRR7760053, respectively. The transcriptome and genome shot- clude hydrogenosomes and mitosomes (1, 2). There is debate gun assembly projects were deposited under accession numbers GHBR00000000 and regarding the existence of exclusively anaerobic animals and QWKW00000000, respectively. The mt genome of M. squamalis was deposited under accompanying MROs (3). Although it was reported that some the accession number MK087050. These accession numbers are provided in Dataset S6. loriciferans found in anoxic conditions possess hydrogenosomes The Bayesian trees and all alignments were deposited in the Dryad Digital Repository (https://doi.org/10.5061/dryad.v15dv41sm). Finally, the uncropped images underlying (4, 5), genomic data are not yet available for these organisms, and Fig. 2 and SI Appendix,Figs.S7andS8were deposited in the Figshare repository (https:// alternative explanations have been proposed (3). Here, we show doi.org/10.6084/m9.figshare.8300003, https://doi.org/10.6084/m9.figshare.9897284, that a myxozoan parasite (Cnidaria) has lost both its mt genome https://doi.org/10.6084/m9.figshare.9897320). and aerobic metabolic pathways, and has a novel type of anaerobic 1D.Y. and S.D.A. contributed equally to this work. MRO. Myxozoans are a large group of enigmatic, parasitic, cni- 2To whom correspondence may be addressed. Email: [email protected]. darians with complex life cycles that require two hosts, usually a This article contains supporting information online at https://www.pnas.org/lookup/suppl/ and an (6). They have a substantial negative economic doi:10.1073/pnas.1909907117/-/DCSupplemental. impact on fisheries and (7). Myxozoan mitochondria First published February 24, 2020.

5358–5363 | PNAS | March 10, 2020 | vol. 117 | no. 10 www.pnas.org/cgi/doi/10.1073/pnas.1909907117 Downloaded by guest on September 25, 2021 EVOLUTION

Fig. 1. Eukaryote phylogenetic relationships infer- red from a supermatrix of 9490 amino acid positions for 78 species. Bayesian majority-rule consensus tree reconstructed using the CAT + Γ model from two in- dependent Markov-chain Monte Carlo chains. Branches with low node support (posterior probabili- ties PP < 0.7) were collapsed. Most nodes were highly supported (PP > 0.98), and PP are only indicated for nodes with PP < 0.98. The eukaryote classification is based on Adl et al. (47). Species known to have lost their mt genome are indicated in bold with an aster- isk. Myxozoan species form a well-supported group (PP = 1.0) and our reconstructions agree with previous studies (14), which show monophyly of the fresh-wa- ter/oligochaete lineage (10).

Henneguya salminicola and Myxobolus squamalis (SI Appendix, (15). This view is supported by calculations using only the most Fig. S1), both of which are parasites of salmonid fish (10–12). conserved CEGMA genes, which have higher recovery in both H. salminicola and M. squamalis (76.9% and 56.9%, respectively). Results Assembly of the mt genomes revealed striking differences We assembled transcriptomes and genomes from both species between the two parasites. For M. squamalis, we successfully using identical protocols and computational pipelines. Our phylo- recovered a circular mt genome composed of a single chromo- genetic analyses based on 78 nuclear ribosomal protein-encoding some, which phylogenetic analyses confirmed was myxozoan (SI genes from taxa representative of eukaryotic diversity confirmed Appendix, Supplementary Results and Figs. S5 and S6). Similar to that the organisms we sequenced are closely related myxozoans, other myxozoans (8), the M. squamalis mt genome lacked tRNAs, and not contaminants (Fig. 1 and SI Appendix, Fig. S2). The and has a fast evolutionary rate (SI Appendix, Supplementary Re- genome assembly statistics revealed that H. salminicola has a sults and Figs. S5 and S6). In stark contrast, we could not identify more complete assembly with higher coverage and more pre- any mt sequence among the contigs of H. salminicola, despite the dicted protein sequences than M. squamalis (Table 1 and SI higher quality of that assembly compared with that of M. squa- Appendix, Figs. S3 and S4). Targeted searches in the genomes malis. To identify whether DNA was present in the myxozoan identified 75/78 nuclear ribosomal protein genes, which sug- mitochondria, we stained living multicellular developing stages of gested that the completeness is >90% for both species. However, M. squamalis and H. salminicola with DAPI (Fig. 2). Cells of M. estimates of genome completeness using the Core Eukaryotic squamalis showed the characteristic eukaryotic staining of both Genes Mapping Approach (CEGMA) (13) recovered only 53.6% of nuclei and mitochondria (as much smaller blue dots; Fig. 2A), core eukaryotic genes for H. salminicola and 37.5% for M. squamalis. whereas H. salminicola showed only nuclear staining (Fig. 2B). We hypothesize that the fast evolutionary rates of myxozoans (14) The microscopy results, together with the lack of mt contigs in reduced our ability to detect many common eukaryotic genes, a the genome and transcriptome assemblies, supported our central challenge also known with other fast-evolving eukaryotic lineages hypothesis that this animal has lost its mt genome. Electron

Yahalomi et al. PNAS | March 10, 2020 | vol. 117 | no. 10 | 5359 Downloaded by guest on September 25, 2021 Table 1. Assembly statistics, presence of mt genome, and number of nuclear-encoded mt genes identified for myxozoan genomes (gen.) and transcriptomes (trans.) K. iwatai T. kitauei M. cerebralis H. salminicola M. squamalis (14) (18) (14)

Gen./Trans. Gen./Trans. Gen./Trans. Gen. Trans.

Genome size, Mb 60.0/— 53.1/— 22.5/— 188.5 — Coverage 311/— 86.1/— 1,000/— 37 — DNA assembly size, Mb 61.4/— 43.7/— 23.7/— 150.7 — No. of contigs 18,330/31,825 37,919/11,236 22,174/6,528 5,610 52,821 N50 7,570/600 1,286/714 40,195/1,662 — 11,965 CEGs (complete) 53.6%/26.6% 37.5%/23.8% 73.0%/76.6% 46.8% 39.1% CEGs (complete group 4) 76.9%/33.9% 56.9%/27.7% 96.9%/95.4% 66.2% 55.4% CEGs (partial group 4) 87.7%/75.4% 76.9%/70.8% 96.9%/96.9% 73.9% 84.6% %GC 29/— 27/— 23.6/— 37.5 — No. of predicted proteins 8,188/— 5,725/— 5,533/— 16,638 — Presence/absence of mt genome Absent Present Present (8, 9) Present (8) Unknown No. of nuclear genes involved in mtDNA replication 658524149 and translation No. of nuclear genes involved in electron-transport 721251821 chains No. of genes involved in pyruvate metabolism 0* 3* 3* 3* 3* No. of genes involved in other mt pathways 52 55 64 45 46

*There are two additional proteins, involved in pyruvate metabolism and present in all Myxozoa, that appear also in other metabolic pathways.

microscopy images, however, showed mt-like double membrane H. salminicola, and was absent from the H. salminicola tran- organelles with cristae in H. salminicola (Fig. 2C and SI Appendix, scriptome assembly, whereas we identified homologous contigs in Fig. S7)andM. squamalis (SI Appendix,Fig.S8). Accordingly, all other myxozoan transcriptomes (Dataset S3). The presence of genes involved in cristae organization were also detected in the a pseudogene copy of this polymerase has several implications. genome of both species, in particular DNAJC11 and MTX1, First, it supports our central conclusion that H. salminicola has lost which have been linked to the presence of cristae (16, 17) (Dataset its mtDNA, as it has no mtDNA replication machinery. Second, it S1). Together, these results confirm that an MRO without an mt shows that the absence of protein homologs in this species is the genome, but with cristae, is present in this species. result of pseudogenization, and not an assembly artifact. In animals, most of the mt proteome is encoded in the nucleus. The loss of the mt genome should impact aerobic respiration, Accordingly, we identified 51 and 57 genes involved in key mt since animal mt genomes code for essential proteins of the metabolic pathways (e.g., amino acid, carbohydrate, or nucleo- electron-transport chain (20). To verify whether the loss of the tide metabolism) in H. salminicola and M. squamalis, respectively mt genome meant loss of aerobic respiration in H. salminicola,we (Fig. 3, Table 1, and Dataset S2). This suggests that the MROs of searched for homologs of known Drosophila nuclear genes that H. salminicola still perform diverse metabolic functions, similar typically encode ∼100 proteins from the mt electron-transport to the mitochondria of M. squamalis. In contrast, almost all chain complexes (Fig. 3 and SI Appendix, Supplementary Meth- nuclear-encoded proteins involved in mt genome replication and ods). Our searches of all myxozoan genomes available revealed translation were absent from the H. salminicola genome. Using a that nuclear genes for only seven of these mt proteins remain in H. database of 118 such nuclear-encoded genes in Drosophila,we salminicola, whereas 18 to 25 are present in other myxozoans (Fig. identified 41 to 58 homologous mt genes in M. squamalis and 3, Table 1, and Dataset S2). Specifically, all complex I, III, and IV among published myxozoan data (14, 18), but only six of these genes that we identified in other myxozoans are absent in H. genes in H. salminicola (Table 1 and Dataset S3). In addition, we salminicola (Fig. 3B, Dataset S2,andSI Appendix, Supplementary calculated that H. salminicola does not have a faster evolutionary Results and Fig. S10) or present as pseudogenes (SI Appendix, rate than other myxozoans, which might otherwise have pre- Fig. S9). Since complex IV interacts with O2 molecules, we cluded gene discovery (Fig. 1 and SI Appendix, Fig. S2). conclude that H. salminicola might not be capable of standard Interestingly, in H. salminicola, we found that the mt DNA cellular aerobic respiration. In concurrence with the absence of polymerase subunit gamma-1 (19) gene is a pseudogene, as it the complexes that pump protons into the mitochondrial in- contains three point mutations that create premature stop codons termembrane space (i.e., complexes I, III, and IV), most genes (SI Appendix,Fig.S9). Furthermore, this gene is not expressed in that encode the Fo subunit of the adenosine triphosphate (ATP)

Fig. 2. Microscopic evidence for the absence of mito- chondria in H. salminicola.(A and B) DAPI staining of normal 7-cell presporogonic developmental stages of two myxozoan parasites of salmonid fish. (A) M. squamalis, showing large nuclei with many smaller mitochondrial nucleosomes (arrowed). (B) H. salminicola, showing large nuclei but surprisingly no mitochondrial nucleosomes. (C) TEM image of H. salminicola mitochondrion-related organelle with few cristae. Uncropped images are available in the Figshare repository.

5360 | www.pnas.org/cgi/doi/10.1073/pnas.1909907117 Yahalomi et al. Downloaded by guest on September 25, 2021 A Aerobic mitochondrion B Henneguya MRO

Mitochondrial pathways: Mitochondrial pathways: Amino acid metabolism Amino acid metabolism Cell rescue defense and cell death Cell rescue defense and cell death Metabolism of cofactors and vitamins Metabolism of cofactors and vitamins Lipid metabolism Lipid metabolism Nucleotide metabolism Nucleotide metabolism Proteolysis DNA, RNA and Proteolysis DNA, RNA and Protein folding and stabilization Protein synthesis Protein folding and stabilization Protein synthesis Protein targeting sorting and translocation Protein targeting sorting and translocation ψDNA pol DNA pol Fe-S cluster biosynthesis Fe-S cluster biosynthesis Transport facilitation Transport facilitation No mt DNA DNAD RNA pol RNA pol Pyruvate metabolism Pyruvate metabolism Pyruvate Acetyl-CoA Pyruvate Acetyl-CoA No RNA mt RNA ψPDH PDH ribosomes ? Tricarboxylic mt encoded Tricarboxylic mt encoded TCA proteins TCA -acid pathway ATP -acid pathway proteins cycle ADP cycle ATP ADP unknown electron ? O H O 2 2 acceptor CV CV CI CII e- CII CI UQ CIII CIV UQ CIII CIV e- C C - e- H+ e + + H - H+ H e Oxidative phosphorylation Oxidative phosphorylation Present Absent EVOLUTION C Mitochondrial / MRO pathways present in selected species synthesis

2 Legend H ASCT ACS CI CIII CIV TCA cycle PDH PNO PFL AOX SCS Propionate Glycine DNA Cristae CV PFO CII Aerobic mitochondria Hydra magnipapillata (Metazoa) + + + + + + + + + - - c - - + + - + + + = present iwatai (Metazoa) + + + + + + + + + ------+ - + p - = absent Thelohanellus kitauei (Metazoa) + + + + + + + + + ------+ - p p p = partial Myxobolus squamalis (Metazoa) ? + + + + ? + + + ------+ - - p c = in cytosol (Metazoa) ? + + + + ? + + + ------+ - p p ? = unknown Hydrogen-producing Bastocystis sp. (Stramenopiles) + + + + - - - p + + + - + + + + c + + mitochondria Acanthamoeba castellani (Amoebozoa) + + + + + + + + + + - - + + + + - + + Myxozoa Hydrogenosomes Trichomonas vaginalis (Metamonada) - - p ------+ - - - + + + - - p Piromyces sp. (Nucletmyceta) - - + + + + + + - - + + - + + + - - p Mitosomes Giardia intestinalis (Metamonada) ------c - - - c - - c - - Nematocida parisii (Microsporidia) ------+ ------Cryptosporidium muris (Alveolata) - + - + - - + + - - c - + - - + c - p Henneguya salminicola (Metazoa) -+-+--p+------+--p

Fig. 3. Comparison between the pathways present in (A) a typical aerobic mitochondrion and (B)theH. salminicola MRO. (C) Mitochondrial/MRO pathways present in selected species (see refs. 1 and 2). The presence and absence of organellar genomes are indicated. ACS, acetyl-CoA synthetase; acetate AOX, alternative oxidase; ASCT, acetate succinyl-CoA transferase; DNA pol, mtDNA polymerase; RNA pol, mtDNA-dependent RNA polymerase; CI-CV, respiratory complexes I-V; C, cytochrome c; PDH, pyruvate dehydrogenase; PFL, pyruvate formate lyase; PFO, pyruvate ferredoxin oxidoreductase; PNO, pyruvate NADPH oxidoreductase; SCS, succinyl-CoA − + synthetase; TCA cycle, tricarboxylic acid cycle; UQ, ubiquinone; e ,electrons;H , protons; ψ indicates the presence of a pseudogene in the nuclear genome.

synthase complex (i.e., the proton channel of complex V) are also times independently, some of them present striking similarities missing in H. salminicola (Dataset S4), while being present in (1, 2). Not only have MROs often lost the same mt pathways Myxobolus (Dataset S4). This suggests that a proton gradient is (e.g., pyruvate dehydrogenase or electron transport chain en- absent across the inner organelle membrane in H. salminicola.In zymes) but also, in several cases, homologous enzymes, such as contrast, for complex II, which is part of the Krebs cycle, and for the hydrogenases or pyruvate formate lyases, have been acquired F1 subunit of the ATP synthase, H. salminicola encodes a similar independently by horizontal gene transfer. These enzymes allow number of protein coding genes as other myxozoans (Dataset S4). ATP production by anaerobic pyruvate metabolism and H2 syn- thesis. MROs with such abilities are called hydrogen-producing Discussion mitochondria or hydrogenosomes, the latter having lost their Structurally, H. salminicola has lost its mt genome, but has retained ability to utilize oxygen (Fig. 3C) (1, 2). an organelle that resembles a mitochondrion. However, as mito- As our H. salminicola assemblies did not contain any hydrogenase chondria are defined based on the use of oxygen as electron ac- or other genes of prokaryotic origin (Fig. 3C and Dataset S5), we ceptor (21), and usually the presence of an mt genome (but see conclude that the MROs in H. salminicola are not hydrogenosomes. ref. 22), we conclude that H. salminicola possesses MROs rather The presence of cristae in H. salminicola’s MRO is surprising since than true mitochondria. Although MROs have evolved several these membrane invaginations are usually absent in anaerobic

Yahalomi et al. PNAS | March 10, 2020 | vol. 117 | no. 10 | 5361 Downloaded by guest on September 25, 2021 MROs (1, 16, 17). However, we note that the MROs of H. visualized under a Leica DMR compound fluorescence microscope at 630× salminicola share these characteristics with the MRO of the and 1,000× magnification. apicomplexan Cryptosporidium muris, which has also lost com- plexes I, III, and IV, but possesses an alternative oxidase and Electron Microscopy. Fresh parasite pseudocysts were dissected from the tissue retains cristae (Fig. 3C) (23). The presence of cristae together with of a single host and then fixed in a solution comprising 1% glutaraldehyde and 2% paraformaldehyde in 0.1 M phosphate buffer. Larger pseudocysts the identification of pseudogenes suggest that the loss of mtDNA were sliced to permit penetration of the fixative. Fixed tissue was then and aerobic respiration may be a recent evolutionary event in the stained with osmium tetroxide and then uranyl acetate, before being Henneguya lineage. Future experiments are needed to better dehydrated in a graded alcohol series and embedded in Epon resin. Ultrathin characterize the metabolic energy pathways of H. salminicola. sections were mounted on copper grids and examined using a Helios 650 FEG However, such experiments are challenging because it is currently dual-beam SEM (Thermo Scientific) in transmission mode, at the Oregon State not possible to culture H. salminicola in the laboratory. University Electron Microscope Facility. Similar to most Myxozoa, H. salminicola likely alternates be- tween two hosts (6). In its fish host, it undergoes proliferation Filtering and Assembling the Genomic and Transcriptomic Data. Stringent and sporogenesis in pseudocysts within the white muscle (11), a filtering, involving multiple steps, was performed to eliminate host and bacterial contamination from M. squamalis and H. salminicola data. This tissue known to have anaerobic metabolism (24). While the obli- involved mapping reads to the corresponding host genomes and several gate invertebrate host of H. salminicola is unknown, it is probably rounds of BLAST searches against the National Center for Biotechnology In- an annelid from the family Naididae, based on known life cycles of formation nucleotide database (SI Appendix, Supplementary Methods). Reads related myxozoans (25). Members of the Naididae can grow and from contaminant sequences were eliminated and the remaining DNA and reproduce in anoxic environments (26). As all that have RNA reads assembled using IDBA (version 1.1.1) (34) and Trinity (35), re- lost their mt genomes live in anaerobic environments, we specu- spectively (SI Appendix, Supplementary Methods). late that the loss of the mt genome in H. salminicola was driven by low-oxygen environments in both of its hosts. Assembly of the M. squamalis mt Genome and Absence of mt Sequences in H. salminicola Loss of superfluous genes likely conveys an evolutionary ad- . Local blastn and tblastn searches using cnidarian (including published myxozoan) mt genome and protein sequences, respectively, were vantage, as it has been shown that the bioenergetic cost of a gene performed against our myxozoan assemblies of M. squamalis and H. salmi- is higher in small genomes (27). Myxozoans have smaller ge- nicola. After manual inspection of all sequences with E-values <1e−1,no > nomes [22 to 180 Mb (14, 18)] than free-living Cnidaria [ 250 mitochondrial sequence could be identified for H. salminicola. In contrast, a Mb (28, 29)]. Therefore, the loss of the mt genome and associ- putative mitochondrial contig was identified for M. squamalis. The coverage ated nuclear genes involved in its replication and electron of the mt genome was 185, about twice the nuclear coverage. To further pathways may be advantageous for a myxozoan living in anaer- search the H. salminicola data, Hidden Markov Model (HMM) profiles were obic environments. However, the loss of useless genes by random built based on alignments of myxozoan mt proteins, using HMMer3.0 (36). drift cannot be excluded. Interestingly, our results also open the These profiles were used to search protein predictions of H. salminicola by way to new treatment options against this pathogen, since an- Maker2 v2.31.10 (37) (see SI Appendix, Supplementary Methods, for details regarding Maker2 annotation), but no mt proteins were identified. Similarly, aerobic protists are known to be sensitive to specific drugs (30). HMM profiles were built based on alignments of myxozoan RNA sequences Myxozoans have gone through outstanding morphological and ge- using Infernal 1.1.1 (38), following the approach of Yahalomi et al. (8), and nomic simplifications during their adaptation to from a the profiles used to search genomic and transcriptomic assemblies. Again, no free-living cnidarian ancestor (31). It is remarkable that these myxo- mt sequences were identified. zoan simplifications do not appear to be ancestral, but rather the result The Perl script Novoplasty v2.6.3 (39) was used to reconstruct a first draft of secondary losses (14). Here we show that at least one myxozoan of the mitochondrial sequence of M. squamalis based on the mt contig species has lost a core animal feature: the genetic basis for aerobic identified in the BLAST search. The draft sequence was then corrected using respiration in its mitochondria. As a highly diverse group with >2,400 read mapping (see SI Appendix, Supplementary Methods, for details re- species, which inhabit marine, freshwater, and even terrestrial envi- garding the assembly and annotation of the mt genome of M. squamalis). ronments (32), evolutionary loss and simplification has clearly been a Estimating Completeness of Genomic and Transcriptomic Assemblies. CEGMA successful strategy for Myxozoa, which shows that less is more (33). (13) was used to estimate the completeness of our assemblies. Because Materials and Methods myxozoans show extreme evolutionary rates (14), the completeness was estimated based on the most conserved set of CEGMA genes (Group 4). Samples and Sequencing. Samples of H. salminicola and M. squamalis were We also used the program BUSCO V3 (40), but found that it performed identified on the basis of morphology (SI Appendix, Fig. S1), tissue poorly on Myxozoa, which was in concordance with other studies that show tropism, host, and SSU rDNA sequence similarity with published sequence that BUSCO underestimates completeness of fast-evolving organisms (15). available at the National Center for Biotechnology Information (SI Appen- dix, Supplementary Methods). Genome Size Estimation. Genome sizes were calculated based on K-mer fre- DNA and RNA of H. salminicola were each extracted from a single large quency estimation using the GenomeScope web server (41) (last accessed 2018/ cyst (4 to 8 mm diameter) sampled from Chinook salmon (Oncorhynchus 02). For both species, k-mer frequency histograms were generated for k = 17, tschawytscha). For M. squamalis, which develops in smaller cysts (2 to 4 mm using the program Jellyfish 2.2.7 (42) on the filtered reads with the following diameter), DNA and RNA were extracted from several cysts collected from a parameters: count -C -m 17 -s 1000000000 -t 10 (SI Appendix, Figs. S3 and S4). coho salmon (Oncorhynchus kisutch) and rainbow trout (Oncorhynchus mykiss), respectively. The multiisolate extract from M. squamalis may explain Characterization of Myxozoan Proteins Interacting with mtDNA and mtRNA. To the differences in assembly quality between H. salminicola and M. squa- create an exhaustive database of nuclear-encoded proteins that interact with malis, since polymorphism is known to complicate assembly. mt DNA and RNA, we downloaded three protein datasets: all mt ribosomal DNA and RNA were extracted with the DNeasy Blood & Tissue Kit (Qiagen) protein sequences from FlyBase (last accessed November 2017) (43), all and the High Pure RNA extraction kit (Roche), respectively, following manufac- Drosophila melanogaster proteins with either the functional classifications turers’ instructions. The samples were sent for library construction and se- “DNA and RNA” or “DNA and RNA/Protein synthesis/Others” from the quencing at the Center for Genome Research and Biocomputing at Oregon State MitoDrome database (last accessed January 2018) (44), and sequences of University (Corvallis, OR). Paired-end sequencing with 150-bp reads derived from human proteins known to bind to mt RNA and described in Rackham et al. fragments of average length ∼350 bp was performed on a HiSeq3000 platform. (45) from the National Center for Biotechnology Information. We then performed reciprocal blastp searches against the Drosophila proteome to Light Microscopy. Myxozoan cells were prefixed with 3:1 methanol:acetic acid, identify the corresponding homologs (SI Appendix, Supplementary Meth- and three drops of cell suspension were put on slides. DNA staining was ods). These sets of Drosophila proteins were used to perform reciprocal performed with VECTASHIELD (Vectorlabs) antifade mounting medium, BLAST searches against the proteome of the cnidarian Hydra vulgaris. The which contained DAPI (SI Appendix, Supplementary Methods). Cells were Hydra and Drosophila sequences were then used to identify homologous

5362 | www.pnas.org/cgi/doi/10.1073/pnas.1909907117 Yahalomi et al. Downloaded by guest on September 25, 2021 sequences in the myxozoan genome and transcriptome assemblies, after selected from this database: the first included 78 species representative of they had been filtered from contaminants. Detailed information about ho- major eukaryote lineages (47), and the second included 129 species that molog identification is provided in SI Appendix, Supplementary Methods. encompassed animal diversity and their closest outgroup (choanoflagellates, ichthyosporeans, and ministeriids). In both datasets, sequences were con- Characterization of Myxozoan Mitochondrial Metabolic Pathways. Drosophila catenated with SCaFoS (48). After removal of any ambiguously aligned po- proteins involved in the different mitochondrial metabolic pathways were sitions using Gblocks Version 0.91b (49), with default parameter values, downloaded from the MitoDrome database (44). Reciprocal BLAST searches these Eukaryota and Metazoa datasets included 9,490 and 11,352 amino were conducted using the Drosophila sequences as queries to identify ho- acid positions, respectively. Phylogenetic reconstructions were performed mologous copies in Hydra. The Drosophila and Hydra sequences were then using the site-heterogeneous CAT model (50), which reduces the impact of used as queries to identify nuclear-encoded mitochondrial proteins in the long branch attraction (51), as implemented in Phylobayes MPI Version 1.5 myxozoans (SI Appendix, Supplementary Methods). It is worth noting that all (52). For both datasets, two independent chains were run for 10,000 cycles. Kudoa proteins previously identified by Muthye and Lavrov (46) using HMM The first 5,000 trees from each chain were discarded as burn-in. Chain con- profiles were identified also in our reciprocal BLAST searches, indicating that vergence was assessed using the bpcomp and tracecomp scripts, which are the use of HMM profiles did not improve protein identifications in our case. part of Phylobayes. Specifically, for both analyses, the bpcomp maxdiff values Metabolic pathway components known from MROs were also searched were <0.3 and the tracecomp effsize values were >70 (except for eukaryotes, for. To do this, MRO-associated proteins from across the eukaryotes were where the tree length value was 21), indicating a proper convergence. gathered based on the supplementary data from Stairs et al. (2) and used as queries against the cnidarian assemblies (SI Appendix, Supplementary Methods). ACKNOWLEDGMENTS. We thank Mark Dasenko and the Center for Genome Research and Biocomputing at Oregon State University (OSU), and Teresa Phylogenetic Reconstructions. The phylogenetic analyses used a reference Sawyer of the OSU Electron Microscope Facility for their assistance. This work database of 78 ribosomal protein-coding genes, which was curated manually was supported by the Binational Science Foundation (Grant No. 2015010 to to avoid contamination and structural annotation errors. Two datasets were D.H. and P.C.).

1. A. J. Roger, S. A. Muñoz-Gómez, R. Kamikawa, The origin and diversification of mi- 26. P. Famme, J. Knudsen, Anoxic survival, growth and reproduction by the freshwater tochondria. Curr. Biol. 27, R1177–R1192 (2017). annelid, Tubifex sp., demonstrated using a new simple anoxic chemostat. Comp. Bio- 2. C. W. Stairs, M. M. Leger, A. J. Roger, Diversity and origins of anaerobic metabolism in chem. Physiol. A Comp. Physiol. 81, 251–253 (1985). mitochondria and related organelles. Philos. Trans. R. Soc. Lond. B Biol. Sci. 370, 27. M. Lynch, G. K. Marinov, The bioenergetic costs of a gene. Proc. Natl. Acad. Sci. U.S.A. 20140326 (2015). 112, 15690–15695 (2015). 3. J. M. Bernhard et al., Metazoans of redoxcline sediments in Mediterranean deep-sea 28. J. A. Chapman et al., The dynamic genome of Hydra. 464, 592–596 (2010). hypersaline anoxic basins. BMC Biol. 13, 105 (2015). 29. N. H. Putnam et al., Sea anemone genome reveals ancestral eumetazoan gene rep- 4. R. Danovaro et al., The first metazoa living in permanently anoxic conditions. BMC ertoire and genomic organization. Science 317,86–94 (2007).

Biol. 8, 30 (2010). 30. S. Löfmark, C. Edlund, C. E. Nord, Metronidazole is still the drug of choice for treat- EVOLUTION 5. R. Danovaro et al., The challenge of proving the existence of metazoan life in per- ment of anaerobic infections. Clin. Infect. Dis. 50 (suppl. 1), S16–S23 (2010). manently anoxic deep-sea sediments. BMC Biol. 14, 43 (2016). 31. E. Kayal et al., Phylogenomics provides a robust topology of the major cnidarian lineages 6. B. Okamura, A. Gruhl, J. L. Bartholomew, “An introduction to myxozoan evolution, ecology and insights on the origins of key organismal traits. BMC Evol. Biol. 18, 68 (2018). and development” in Myxozoan Evolution Ecology and Development, B. Okamura, A. 32. S. D. Atkinson, J. L. Bartholomew, T. Lotan, Myxozoans: Ancient metazoan parasites Gruhl, J. L. Bartholomew, Eds. (Springer International Publishing, 2015), pp. 1–20. find a home in phylum Cnidaria. Zoology (Jena) 129,66–68 (2018). 7. I. Fontes, S. L. Hallett, T. A. Mo, “Comparative epidemiology of myxozoan diseases” 33. M. V. Olson, When less is more: Gene loss as an engine of evolutionary change. Am. J. in Myxozoan Evolution Ecology and Development,B.Okamura,A.Gruhl, Hum. Genet. 64,18–23 (1999). J. L. Bartholomew, Eds. (Springer International Publishing, 2015), pp. 317–341. 34. Y. Peng, H. C. M. Leung, S. M. Yiu, F. Y. L. Chin, “IDBA–A practical iterative de Bruijn 8. D. Yahalomi et al., The multipartite mitochondrial genome of leei graph de novo assembler” in Research in Computational Molecular Biology. RECOMB (Myxozoa): Eight fast-evolving megacircles. Mol. Biol. Evol. 34, 1551–1556 (2017). 2010, B. Berger, Ed. (Lecture Notes Computer Science, 2010), vol. 6044, pp. 426–440. 9. F. Takeuchi et al., The mitochondrial genomes of a myxozoan genus Kudoa are ex- 35. B. J. Haas et al., De novo transcript sequence reconstruction from RNA-seq using the tremely divergent in Metazoa. PLoS One 10, e0132030 (2015). Trinity platform for reference generation and analysis. Nat. Protoc. 8,1494–1512 (2013). 10. I. Fiala, P. Bartošová-Sojková, C. M. Whipps, “Classification and phylogenetics of 36.J.Mistry,R.D.Finn,S.R.Eddy,A.Bateman,M.Punta,Challengesinhomologysearch: Myxozoa” in Myxozoan Evolution Ecology and Development, B. Okamura, A. Gruhl, J. HMMER3 and of coiled-coil regions. Nucleic Acids Res. 41, e121 (2013). L. Bartholomew, Eds. (Springer International Publishing, 2015), pp. 85–110. 37. C. Holt, M. Yandell, MAKER2: An annotation pipeline and genome-database manage- 11. F. F. Fish, Observations on Henneguya salminicola Ward, a myxosporidian parasitic in ment tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011). Pacific salmon. J. Parasitol. 25, 169–172 (1939). 38. E. P. Nawrocki, S. R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches. Bio- 12. T. M. Polley, S. D. Atkinson, G. R. Jones, J. L. Bartholomew, Supplemental description informatics 29, 2933–2935 (2013). of Myxobolus squamalis (Myxozoa). J. Parasitol. 99 , 725–728 (2013). 39. N. Dierckxsens, P. Mardulyn, G. Smits, NOVOPlasty: De novo assembly of organelle 13. G. Parra, K. Bradnam, I. Korf, CEGMA: A pipeline to accurately annotate core genes in genomes from whole genome data. Nucleic Acids Res. 45, e18 (2017). eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007). 40. R. M. Waterhouse et al., BUSCO applications from quality assessments to gene pre- 14. E. S. Chang et al., Genomic insights into the evolutionary origin of Myxozoa within diction and phylogenomics. Mol. Biol. Evol. 35, 543–548 (2018). Cnidaria. Proc. Natl. Acad. Sci. U.S.A. 112, 14912–14917 (2015). 41. G. W. Vurture et al., GenomeScope: Fast reference-free genome profiling from short 15. A. Karnkowska et al., A eukaryote without a mitochondrial organelle. Curr. Biol. 26, reads. Bioinformatics 33, 2202–2204 (2017). 1274–1284 (2016). 42. G. Marçais, C. Kingsford, A fast, lock-free approach for efficient parallel counting of 16. M. A. Huynen, M. Mühlmeister, K. Gotthardt, S. Guerrero-Castillo, U. Brandt, Evolu- occurrences of k-mers. Bioinformatics 27, 764–770 (2011). tion and structural organization of the mitochondrial contact site (MICOS) complex 43. L. S. Gramates et al.; The FlyBase Consortium, FlyBase at 25: Looking to the future. and the mitochondrial intermembrane space bridging (MIB) complex. Biochim. Bio- Nucleic Acids Res. 45, D663–D671 (2017). phys. Acta 1863,91–101 (2016). 44. D. D’Elia et al., The MitoDrome database annotates and compares the OXPHOS nu- 17. S. A. Muñoz-Gómez et al., Ancient homology of the mitochondrial contact site and clear genes of Drosophila melanogaster, Drosophila pseudoobscura and Anopheles cristae organizing system points to an endosymbiotic origin of mitochondrial cristae. gambiae. Mitochondrion 6, 252–257 (2006). Curr. Biol. 25, 1489–1495 (2015). 45. O. Rackham, T. R. Mercer, A. Filipovska, The human mitochondrial transcriptome and 18. Y. Yang et al., The genome of the myxosporean Thelohanellus kitauei shows adapta- the RNA-binding proteins that regulate its expression. Wiley Interdiscip. Rev. RNA 3, tions to nutrient acquisition within its fish host. Genome Biol. Evol. 6,3182–3198 (2014). 675–695 (2012). 19. M. A. Graziewicz, M. J. Longley, W. C. Copeland, DNA polymerase γ in mitochondrial 46. V. Muthye, D. V. Lavrov, Characterization of mitochondrial proteomes of non- DNA replication and repair. Chem. Rev. 106, 383–405 (2006). bilaterian animals. IUBMB Life 70, 1289–1301 (2018). 20. D. V. Lavrov, W. Pett, Animal mitochondrial DNA as we do not know it: Mt-genome or- 47. S. M. Adl et al., Revisions to the classification, nomenclature, and diversity of eu- ganization and evolution in nonbilaterian lineages. Genome Biol. Evol. 8, 2896–2913 (2016). karyotes. J. Eukaryot. Microbiol. 26, 12691 (2018). 21. M. Müller et al., Biochemistry and evolution of anaerobic energy metabolism in eu- 48. B. Roure, N. Rodriguez-Ezpeleta, H. Philippe, SCaFoS: A tool for selection, concatena- karyotes. Microbiol. Mol. Biol. Rev. 76, 444–495 (2012). tion and fusion of sequences for phylogenomics. BMC Evol. Biol. 7 (suppl. 1), S2 (2007). 22. U. John et al., An aerobic eukaryotic parasite with functional mitochondria that likely 49. J. Castresana, Selection of conserved blocks from multiple alignments for their use in lacks a mitochondrial genome. Sci. Adv. 5, eaav1110 (2019). phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000). 23. T. Makiuchi, T. Nozaki, Highly divergent mitochondrion-related organelles in anaer- 50. N. Lartillot, H. Philippe, A Bayesian mixture model for across-site heterogeneities in obic parasitic . Biochimie 100,3–17 (2014). the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109 (2004). 24. I. A. Johnston, Studies on the swimming musculature of the rainbow trout. J. Fish Biol. 51. N. Lartillot, H. Brinkmann, H. Philippe, Suppression of long-branch attraction artefacts in the 7, 459– 467 (1975). animal phylogeny using a site-heterogeneous model. BMC Evol. Biol. 7 (suppl. 1), S4 (2007). 25. J. D. Alexander, B. L. Kerans, M. El-Matbouli, S. L. Hallett, L. Stevens, “Annelid-myxosporean 52. N. Lartillot, N. Rodrigue, D. Stubbs, J. Richer, PhyloBayes MPI: Phylogenetic re- interactions” in Myxozoan Evolution Ecology and Development,B.Okamura,A.Gruhl,J. construction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, L. Bartholomew, Eds. (Springer International Publishing, 2015), pp. 217–234. 611–615 (2013).

Yahalomi et al. PNAS | March 10, 2020 | vol. 117 | no. 10 | 5363 Downloaded by guest on September 25, 2021