Berger et al. Parasites Vectors (2021) 14:436 https://doi.org/10.1186/s13071-021-04933-w Parasites & Vectors

RESEARCH Open Access The parasite secretes proteins with putative host manipulation functions Chloé Suzanne Berger1,2,3, Jérôme Laroche2, Halim Maarouf2, Hélène Martin1,2,4, Kyung‑Mee Moon5, Christian R. Landry1,2,4,6,7, Leonard J. Foster5 and Nadia Aubin‑Horth1,2,3*

Abstract Background: Manipulative parasites are thought to liberate molecules in their external environment, acting as manipulation factors with biological functions implicated in their host’s physiological and behavioural alterations. These manipulation factors are part of a complex mixture called the secretome. While the secretomes of various parasites have been described, there is very little data for a putative manipulative parasite. It is necessary to study the molecular interaction between a manipulative parasite and its host to better understand how such alterations evolve. Methods: Here, we used proteomics to characterize the secretome of a model cestode with a complex life cycle based on trophic transmission. We studied Schistocephalus solidus during the life stage in which behavioural changes take place in its obligatory intermediate fsh host, the threespine stickleback (Gasterosteus aculeatus). We produced a novel genome sequence and assembly of S. solidus to improve protein coding gene prediction and annotation for this parasite. We then described the whole worm’s proteome and its secretome during fsh host infection using LC–MS/MS. Results: A total of 2290 proteins were detected in the proteome of S. solidus, and 30 additional proteins were detected specifcally in the secretome. We found that the secretome contains proteases, proteins with neural and immune functions, as well as proteins involved in cell communication. We detected receptor-type tyrosine-protein phosphatases, which were reported in other parasitic systems to be manipulation factors. We also detected 12 S. solidus-specifc proteins in the secretome that may play important roles in host–parasite interactions. Conclusions: Our results suggest that S. solidus liberates molecules with putative host manipulation functions in the host and that many of them are species-specifc. Keywords: Schistocephalus solidus, Secretome, Proteomics, Manipulation factor, Parasite, Behaviour

Background release manipulation factors that interfere with the host Parasites have major impacts on their hosts, including physiological and central nervous systems [5–7]. Tese on their morphology [1], physiology [2], and behaviour manipulation factors are thought to be part of a complex [3, 4]. To induce these complex changes in their hosts, mixture of molecules called the secretome, which is a key it has been proposed that parasites produce, store, and element of parasite–host interactions [6]. Te secretome of a parasite includes lipids [8], nucleic acids [9], and proteins [10], which are sometimes protected inside *Correspondence: [email protected] extracellular vesicles [11]. Using molecular and bioinfor- 1 Département de Biologie, Université Laval, Quebec, QC, Canada matics approaches, the proteomic fraction of secretomes Full list of author information is available at the end of the article

© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat​ iveco​ mmons.​ org/​ licen​ ses/​ by/4.​ 0/​ . The Creative Commons Public Domain Dedication waiver (http://creat​ iveco​ ​ mmons.org/​ publi​ cdoma​ in/​ zero/1.​ 0/​ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Berger et al. Parasites Vectors (2021) 14:436 Page 2 of 20

of parasites infecting humans [12] and livestock [13] have possible to infer a list of putative manipulation factors, been described, both in terms of protein composition which would target the neural and the immune systems and function [14, 15] (see Table 1 for a review). of the hosts and induce behavioural changes (Table 1). Te secretomes that have been examined so far are Our knowledge regarding if and how many proteins enriched in peptidases and proteases [12, 15], which are with neural and immune functions can be found in the known to weaken the host immunity barriers. Other secretomes of manipulative parasites is very limited, and secreted proteins, such as paramyosin in the blood fuke is based in many cases on inferred proteins rather than Schistosoma mansoni, have been shown to help the para- actual detection. site to escape the host immune response, while secreted One particularly powerful model to study behavioural proteins involved in calcium functions have impacts on manipulation is the cestode Schistocephalus solidus the host neural activity [12]. In the context of behavioural [27]. Tis parasite exhibits a complex life cycle based on manipulation, the secretome is a logical potential source trophic transmission that includes three hosts: a cope- of manipulation factors. However, the secretome con- pod, the threespine stickleback (Gasterosteus aculeatus, tent of a behaviour-manipulating parasite has rarely been obligatory intermediate fsh host), and a fsh-eating bird, investigated, to the point that secretomes are referred in which S. solidus reproduces [27, 28]. S. solidus infects to as “the missing link in parasite manipulation” [7]. the threespine stickleback’s abdominal cavity through Te literature contains several reports from which it is the ingestion of a parasitized copepod [29]. Te

Table 1 Proteomic content (directly measured or inferred) of the secretomes of diferent parasite species Parasitic species Host species Manipulation factor Host system targeted Reference

78 helminth species: 64 Humans, , and plants Peptidases (61% of species) Immune system Helminth secretome data‑ nematodes, 7 trematodes, base (HSD) 7 cestodes (1) Garg and Ranganathan [15] Blood fuke Humans Proteins involved in calcium Neural system Knudsen et al. [12] Schistosoma mansoni (1) binding and regulation (afects cell signalization) Blood fuke Humans Paramyosin and SPO-1 Immune system (evasion) Knudsen et al. [12] Schistosoma mansoni (1) Blood fuke Humans Proteases Immune system (degrada‑ Knudsen et al. [12] Schistosoma mansoni (1) tion of skin barriers) Liver fuke Molluscs (e.g. Galba trun- Proteases and antioxidant Immune system (evasion) Cwiklinski and Dalton [16] Fasciola hepatica (1) catula) enzymes: Gourbal et al. [13] Cu/Zn-superoxide dis‑ mutase thioredoxin + Cestode Snakes Specifc proteins with no Unknown Kim et al. [17] Spirometra erinacei (1) Rhabdophis tigrinus identifed ­homologsa Cestode Rats Antigens Immune system Bień et al. [18] Hymenolepis diminuta (1) Rattus rattus (evasion and modulation) Baculovirus Silkworm Protein tyrosine Neural system (enhanced Kamita et al. [19] Bombyx mori NPV (2) Bombyx mori ­phosphatasea locomotory activity) Hairworm Grass-hopper Meconema Wnta Neural system (modifca‑ Biron et al. [20] Spinochordodes tellinii Par- thalassinum cricket tions of monoamine [21] agordius tricuspidatus+ (2) Nemobius sylvestris+ levels) Protozoan Rat Tyrosine ­hydroxylasea Neural system (increases Prandovszky et al. [22] Toxoplasma gondii (2) Rattus rattus dopamine levels) Fungus Ant Camponotus pennsyl- Guanobutyric acid (GBA) and Neural system (action not De Bekker et al. [23] Ophiocordyceps unilateralis vanicus and sphingosine determined) (2) Formica dolosa Wasp Caterpillar Cytokinea Immune system (activation Adamo et al. [24] Cotesia congregata (2) Manduca sexta that results in feeding reduction) Wasp Cockroach Periplaneta Dopamine Neural system (represses the Libersat et al. [25] Ampulex compressa (2) americana activity of neurons) Aphid Witch hazel BICYCLE protein Development (induces Korgaonkar et al. [26] Hormaphis cornu (2) Hamamelis virginiana galls—novel plant ‘organs’)

(1) Species of human importance. (2) Species studied in the context of behavioural manipulation. aProteins inferred from experimental or bioinformatics evidence, i.e. proteins not directly measured in the secretome Berger et al. Parasites Vectors (2021) 14:436 Page 3 of 20

consequences of the infection by S. solidus on the three- inside the S. solidus tegument, as shown through scan- spine stickleback’s morphology [30], physiology [31], ning and transmission electron microscopy [47]. Moreo- immune system [32], and behaviour [33] are well docu- ver, S. solidus excretes (through passive mechanisms) mented. For example, sticklebacks infected by S. solidus or secretes (through active mechanisms) molecules, show drastic behavioural changes that result in a loss of including (uncharacterized) proteins, and these secre- the anti-predator response [27]: infected fsh are more tions are sufcient to afect its fsh host behaviour [48]. exploratory [34], less anxious [35], and bolder in the pres- Finally, a well-annotated genome of the threespine stick- ence of a predator [36] than non-infected fsh. leback host is also available [49], which is important to Most of these behavioural alterations seen in infected adequately diferentiate proteins coming from the host fsh appear after several weeks, when the worm has and the parasite. Te Schistocephalus–stickleback sys- grown to reach the infective stage within its intermedi- tem is thus appropriate to test the presence of manipula- ate host (i.e. larger than 50 mg) [37]. Te infective stage tion factors. However, the nature of the protein content coincides with the time at which S. solidus is ready to of the S. solidus secretome has never been explored [7, reproduce in the bird [27, 38], which also generally cor- 48]. Analysing the proteomic fraction of the secretome of responds to the activation of the immune response in S. solidus, and its potential enrichment in manipulation the host. In the frst phase of infection, the adaptive factors involved in neuronal and immune functions, will immune response is generally not activated in the fsh. help us to understand if the behavioural changes of the It is only when the worm reaches the infective stage that host could be induced by parasitic manipulation through an inefective upregulation of the respiratory burst activ- the secretome. ity occurs [32]. Nevertheless, activation of the immune Here, we characterized the worm’s whole-body pro- response through the production of reactive oxygen spe- teome and the protein fraction of the secretome of S. cies (ROS) by granulocytes has been shown to occur early solidus using liquid chromatography-tandem mass during infection, depending on the genotype of the stick- spectrometry (LC–MS/MS) [50]. We focused on indi- leback population [39]. Several studies have suggested viduals in the infective stage of development so that the that the manipulation of the stickleback’s behaviour secretome may include manipulation factors that could increases the worm’s transmission rate to its fnal avian be associated with the fsh host’s behavioural changes. In host [27]. Yet, the adaptive value for S. solidus of such helminth parasites, a portion of proteins from the pro- behavioural changes has never been demonstrated [40], teome are passively released in the external environment and it is possible that these behavioural modifcations in as metabolic waste products, contributing to an impor- the fsh may solely result from a side efect of infection tant fraction of their secretome content [51]. Terefore, [7], such as the efect of the parasite mass burden or of the secretome is generally a subset of the proteome in the activation of the host immune response [35, 41]. To terms of protein content [52]. We thus expected that the demonstrate that behavioural changes in the host are the secretome of S. solidus would also mainly be a subset result of direct parasitic manipulation, the frst step is to of its proteome. However, in the context of behavioural determine whether the parasite can liberate molecules in manipulation, we hypothesized that proteins could also its external environment, and if yes, to study their func- be actively liberated by the parasite in its external envi- tions in relation with the host’s phenotype perturbations. ronment [51], so that they would be enriched in the Tis host–parasite system was used for the frst experi- secretome compared to the proteome. Based on what has mental demonstration of a complex life cycle of a trophi- been described in previous parasitic systems (references cally transmitted parasite in 1790 (as reviewed in [27]). reviewed in Table 1), if S. solidus manipulates stickleback A rich body of work on this model system during the behaviour with its secretions as we hypothesize, then its past 50 years has shown, using enzymatic assays, that the secretome would include proteases as well as proteins activity of proteases [42] and transferases [43] is required with neural and immune functions. Because the worm is for S. solidus survival and growth. Furthermore, a par- not in direct contact with the vertebrate host brain [27], tial genome of the worm is available [44], and extensive we also expected to detect proteins involved in cell com- transcriptome data has been produced [45, 46]. Quan- munication, cell–cell signalling, or transport functions. tifcation of the transcriptome dynamics across the life Tese proteins would mediate the communication of the stages has uncovered that when the worm reaches the worm with its host’s brain to induce potential neural and infective stage in its fsh host, genes involved in neural immune changes, and ultimately behavioural alterations. pathways and sensory perception have higher expres- As the genome of S. solidus is used during LC–MS/ sion levels compared to the earlier stages in the same MS as a reference database, a more thorough annotation host, which are characterized by upregulation of growth- of the genome than the one publicly available [44] could related pathways [46]. Furthermore, vesicles are present allow us to detect more proteins, including potential Berger et al. Parasites Vectors (2021) 14:436 Page 4 of 20

manipulation factors. We therefore used a multipronged the NEBNext FFPE Repair Mix (NEB, Ipswich, MA, USA) approach combining genomics with proteomics for 15 min at 20 °C before purifcation with AMPure XP (described in Fig. 1). We frst sequenced the genome of beads. Te repaired DNA was end-prepped with the S. solidus using a combination of long and short reads to NEBNext Ultra II End Repair/dA-Tailing Module (NEB, combine longer contigs and improved annotation. Ten, Ipswich, MA, USA) for 30 min at 20 °C and 30 min at 65 °C we investigated the global proteomic composition of the according to the manufacturer’s instructions and purifed proteome and secretome of S. solidus using LC–MS/MS. with AMPure XP beads. Adapter mix (ONT, Oxford, UK) and Blunt/TA Ligase Master Mix (NEB, Ipswich, MA, Methods USA) were added to the purifed end-prepped DNA and Genome sequencing of Schistocephalus solidus incubated for 15 min at room temperature. Te library Worm collection was then purifed with AMPure XP beads and ABB wash We caught threespine sticklebacks from Lac Témisc- bufer and recovered in elution bufer (ONT, Oxford, UK). ouata (Québec, 47°40′33″N 68°50′15″W), where fsh are Approximately 333 ng of the purifed library was loaded known to be infected by S. solidus [35], using minnow onto a primed R9.4 SpotOn fow cell (FLO-MIN106) traps in July 2015. Fish were brought to the Laboratoire along with RBF (ONT, Oxford, UK) and library loading Aquatique de Recherche en Sciences Environnementales beads (ONT, Oxford, UK). Sequencing was performed et Médicales at Université Laval (Québec, Canada) and with a MinION Mk1B sequencer running for 48 h, and were maintained under a 12 h:12 h light/dark cycle and MinKNOW software (provided by ONT, Oxford, UK) a water temperature of 15 °C. Fish were fed daily with a was used to control the sequencing process. Four librar- mixture of blood worms and Artemia. After 10 months, ies were sequenced using this protocol. Base calling was one fsh (out of 40) exhibited the morphological changes performed with albacore (read_fast5_basecaller.py, ONT typically induced by S. solidus [30] and was consequently Albacore Sequencing Pipeline Software v2.3.1) on all fast5 sacrifced to collect the worm. Te fsh was euthanized fles. A total of 4,636,932 read sequences resulted from the with an overdose of MS-222 (75 mg/l mg/kg) and dis- base calling step for a total of 14,535,522,365 nucleotides. sected to confrm the infection by S. solidus. Te worm was immediately put in ethanol 90% and stored at 4 °C Genome assembly and annotation until genome sequencing. All the other fsh were later All fastq ONT fles were pooled in one fle prior to used during a behavioural experiment [35]. assembly. ONT sequences were assembled using Flye v2.4 [53, 54]. Te fnal assembly produced 17,882 scaf- DNA extraction and sequencing folds, the largest scafold being 919,337 nt, with an N50 Genomic DNA was extracted using the DNeasy Blood of 121,189 nt. Te mean coverage across scafolds was and Tissue kit (Qiagen Inc., Valencia, CA, USA) 20×. A frst phase of correction (polishing) was carried from ~ 20 mg of tissues. After lysis, we added 4 µl of out with nanopolish (v0.10.2) (https://​github.​com/​jts/​ RNase A (10 mg/ml). Elution was done twice in 100 µl nanop​olish). A total of 3,473,881 changes were applied. of elution bufer. To reach the desired concentration for A second correction phase with one Illumina HiSeq X the Nanopore library preparation and Illumina sequenc- paired-end sequence library (20× coverage) was also ing, we concentrated DNA with a SpeedVac concentrator done with Pilon (v1.22) [55]. A total of 4,889,938 changes (Termo Fisher, Waltham, MA, USA) for 1 h. helped to improve the quality of the assembly sequences. Scafolds that had an average coverage of less than 10× Illumina HiSeq X sequencing Te DNA libraries for Illu- coverage and those that were shorter than 500 bp were mina sequencing were prepared using a shotgun PCR- removed because they were of limited use prior to free library preparation (Lucigen) Illumina Library at the genome annotation. Tese represented less than 1% of McGill University and Genome Quebec Innovation Cen- the data, and all contigs that contained hits with the tran- tre (Montréal, Canada). Sequencing was performed at the scripts from S. solidus [46] remained after this selection. same centre on an Illumina HiSeq X instrument, using Tis left a total of 15,357 scafolds and 625,207,408 nucle- paired-end reads (PE150). A total of 157,475,128 reads otides. Tis Whole Genome Shotgun project has been were obtained. Te estimated genome size is 540 MB [44]. deposited at DDBJ/ENA/GenBank under the accession WEDG00000000.1 and the assembly ID ASM1759139v1. Oxford Nanopore Technologies (MinION) sequenc- ing Library preparation for Oxford Nanopore sequenc- Completeness of the genome assembly We used a data- ing was done with a PCR-free ligation sequencing kit set of 24,765 transcripts from S. solidus that we previously SQK-LSK108 (ONT, Oxford, UK). Briefy, approximately published [46] and mapped them on the de novo assem- 2.5 µg of high-molecular-weight DNA was repaired using bly using GMAP (v2019-03–15) [56] as implemented in Berger et al. Parasites Vectors (2021) 14:436 Page 5 of 20

A multi-pronged approach to study the secretome of Schistocephalus solidus

S. solidus infected fish have behavioural perturbations

In PBS Worm for 2 hours stored at 4°C in ethanol90% X 1

Secretome Worm tissues stored at -20°C stored at -80°c X 4 X 4 DNA extraction

Protein precipitation Proteinextraction Sequencingwith Illumina and Oxford nanopore

In-gel digestion

Assembly and annotation LC-MS/MS analysis

Protein prediction using as databases: larger proteome of S. solidus from UniProt NEW GENOME + NEW GENOME of S. solidus

Proteins detected in the secretome but not in tissues MANIPULATION FACTORS Fig. 1 The multipronged approach designed to detect potential manipulation factors in the secretome of Schistocephalus solidus. On the left, the proteomic approach based on LC–MS/MS aims at describing the global proteomic composition of the proteome and the secretome of S. solidus. On the right, the genomic approach aims at producing a novel genome sequence and assembly of S. solidus to improve protein coding gene prediction and annotation for this parasite. The genome, whose quality is improved, can be used as a new reference database to infer proteins in the proteome and the secretome of S. solidus. Combined, these approaches allow us to characterize the secretome of S. solidus at the infective stage, including the uncovering of proteins detected only in that fraction and not in its proteome, thus representing potential manipulation factors Berger et al. Parasites Vectors (2021) 14:436 Page 6 of 20

the pipeline GAWN v0.3.2 (https://​github.​com/​enorm​ Mass spectrometry characterization of the worm proteome andeau/​gawn). An assembly quality analysis was per- and secretome formed using the Benchmarking Universal Single-Copy Worm and secretome collection for mass spectrometry Orthologs (BUSCO), and BUSCO groups were searched analysis in the Metazoa database [57]. Sampling of experimental individuals Whole worms were collected from wild-caught fsh acclimatized to lab- Protein coding gene prediction To fnd out the propor- oratory conditions. Juvenile sticklebacks came from Lac tion of repeated regions in the genome and to obtain a Témiscouata (Québec, 47°40′33″N 68°50′15″W), the same masked assembly prior to running BRAKER2 to predict lake that was used to collect a worm for genome sequenc- protein coding genes, we built a RepeatModeler (1.0.8) ing. Fish were caught using a seine in August 2016. Tey database [58] based on the new genome sequence and were brought to the Laboratoire Aquatique de Recherche ran RepeatMasker (4.0.6) based on that database [59]. en Sciences Environnementales et Médicales at Université We used BRAKER2 [60–65] for protein coding gene pre- Laval where they were raised for 1 year in 80-l tanks under diction using two approaches. Te frst approach was a light/dark photoperiod of 12 h:12 h and a temperature ab initio as it was not based on external data to fnd open of 15 °C refecting their natural environment conditions reading frames (ORFs) but only on genes predicted by (Québec, Canada). Fish were fed daily with brine shrimps. GeneMark-ES that are selected for training Augustus [60, 61]. Te second approach used the alignment (bam fles) Collection of proteome and secretome samples In sum- of the transcripts from S. solidus [46] on the genome. Te mer of 2017, 51 fsh were individually isolated in 2-l tanks. two sets of ORFs obtained with the two approaches were Te day following isolation, fsh were injected with 100 merged, and duplicate sequences were removed, which µl of phosphate-bufered saline (PBS; pH 7.4, Life Tech- allowed us to obtain a fnal number of predicted ORFs. nologies) in their abdominal cavity in order to sample We used a BLAST analysis to quantify the percentage of their fuids to detect infection by S. solidus following the the S. solidus transcriptome we previously published [46] method described in [72]. Tis protocol was repeated that was found in the genes predicted using Augustus and the next day. If fsh were detected as infected, they were vice versa. euthanized with an overdose of MS-222 (75 mg/l mg/kg) and dissected to confrm the infection by S. solidus and Identifcation of sequences specifc to the new assem- to collect the worm and its secretome. Fish sex, size, and bly Te predicted ORFs were locally aligned using mass, and S. solidus mass and number in each fsh were BLAST+ [66, 67] against a database of 43,058 protein noted. We found that fve fsh were infected, each har- sequences from S. solidus obtained from release 230 of the bouring a worm whose weight was above 50 mg (worm NCBI (March 2019, https://​www.​ncbi.​nlm.​nih.​gov). From 1 = 485.3 mg; worm 2 = 504.1 mg; worm 3 = 286.5 mg; the predicted ORFs, we selected those that had no BLAST worm 4 = 544.5 mg; worm 5 = 220.9 mg). All the worms match or that had no signifcant match based on the fact used to collect proteome and secretome samples were far that the length of the alignment was less that 80% of the above the mass threshold of 50 mg, which is commonly length of the query or that had less than 90% of identi- used to defne the parasitic infective stage that coincides cal nucleotides over the length of the query [68]. Tese with the appearance of the behavioural changes in the fsh unmatched sequences that were specifc to the assembly host [27, 37, 38]. Furthermore, behavioural changes in were used as one of the reference databases during LC– fsh infected by worms of similar masses were previously MS/MS analysis (see “Protein identifcation”). reported in the Témiscouata population [48]. Terefore, we considered these worms and secretome samples to be Genome annotation A functional annotation of the pre- representative of the manipulative stage of S. solidus. We dicted ORFs that were obtained with BRAKER2 [60–65] only selected fsh that were infected by a single worm to was performed using HMMER (version 3.3) [69] against prevent potential efects of multiple infections on the pro- the PFAM domain database (release 32) [70] and using teomic content of the secretome. orthology assignment with eggNOG-mapper (Evolution- Te worm secretome was collected according to a pro- ary Genealogy of Genes: Non-supervised Orthologous tocol adapted from [73]. Each worm was rinsed with 1 ml Groups) (version 2.0.1) [71]. of PBS (pH 7.4, Life Technologies) to remove fsh fuids and then immediately put in a 2-ml tube of PBS. Te tube was covered with aluminium foil to protect the worm from light and placed in a water recipient at the same temperature as the fsh tanks (15 °C) for 2 h. Te worm was removed from the tube, and a tablet of complete Berger et al. Parasites Vectors (2021) 14:436 Page 7 of 20

Mini Protease Inhibitor (Sigma) was added to the tube negative control (SDS-PAGE loading bufer and water). to protect the proteins from the protease activity. Te Migration was performed during 60 min at 175 V for the worm tissues were then snap-frozen in liquid nitrogen worm lysates, and during 30 min at 175 V for secretomes. and stored dry at −80 °C for proteomics analysis. Te Coomassie blue G250 (PMID: 15,174,055) was used for liquid in which the worm was incubated (PBS + poten- staining overnight (Additional fle 1). Each migration tial secretome collected) was stored at −20 °C. Further lane was cut into fve fractions for worm lysates and three experiments were therefore conducted with fve worms fractions for secretomes. In-gel digestion was performed and their respective secretome. on these fractions according to a previously developed protocol [75]. Briefy, after cutting the gel into slices, pro- Preparation of worm tissues and secretome for in‑gel teins were reduced using 10 mM dithiothreitol (DTT) for digestion 45 min at 56 °C, then alkylated with 55 mM iodoaceta- Worms were individually washed with 2 ml of PBS (pH mide (IAA) for 30 min at room temperature in the dark. 7.4, Life Technologies) to remove potential remain- Digestion was performed overnight at 37 °C with trypsin ing contaminants from the fsh. Tey were then cut into (Promega V5113; 0.1–1 µg of trypsin depending on three equal pieces that were each put into a tube contain- the gel staining intensity). Te next day, peptides were ing 700 µl of lysis bufer (4% (w/v) sodium dodecyl sul- extracted using an organic solvent (100 µl acetonitrile) fate (SDS) in 100 mM Tris, pH 8–10, mM dithiothreitol and a step-wise protocol (40% acetonitrile then 100% ace- [DTT]). Six sterile ceramic beads (size 2.8 mm) were tonitrile), and dried (< 50 μl). Following in-gel digestion, added to each lysis tube, and the samples were homoge- a STAGE-TIP protocol using C18 extraction disks (3M nized for 20 s at 6000 rpm using a Precellys homogenizer Empore) was performed to desalt the samples [76, 77]. (Bertin Technologies). Homogenization was repeated Samples were acidifed with trifuoroacetic acid (TFA) three times. Samples were put on ice for 1 min between (pH < 2.5) before being passed through the tip. At the end each run. Samples were then spun at 10,000×g at 4 °C for of the STAGE-TIP protocol, samples were dried com- 10 min. For each individual worm, the three homogen- pletely using a Vacufuge Plus concentrator (Eppendorf) ates obtained were pooled together and redistributed into and stored at −20 °C until LC–MS/MS analysis. equal volumes into two tubes. Samples were heated at 95 °C for 10 min, then spun for 10 min at room tempera- LC–MS/MS analysis ture. Te supernatant of each sample was collected into a Te peptides were analysed using a quadrupole time-of- new tube, and protein concentration was calculated using fight mass spectrometer (Impact II; Bruker Daltonics) the extinction coefcient measured with a NanoDrop coupled to an Easy-nLC 1000 high-performance liquid 1000 spectrophotometer (A280 nm; Termo Scientifc). chromatography (HPLC) system (Termo Fisher). More Te concentration obtained for each worm lysate was information about the LC–MS/MS analysis can be found respectively: 24.14 mg/ml; 22.03 mg/ml; 15.66 mg/ml; in the supplementary information (Additional fle 2). 28.53 mg/ml; and 11.37 mg/ml. Worm lysates were kept Protein identifcation at −20 °C before performing in-gel digestion. For the secretome, the protein concentration of We searched the detected mass spectra against the S. soli- each of the liquids in which the worms were incubated dus genome using MaxQuant (version 1.6.1.0) [78]. Two (“secretome” samples) was measured using a Pierce searches were independently performed: the frst search Coomassie (Bradford) Protein Assay Kit (Fisher Sci- used the larger proteome of S. solidus (43,058 entries, entifc). Bovine serum albumin (BSA) at 2 mg/ml was downloaded June 21, 2018, and updated May 6, 2019) used as standard (seven-point standard curve ranging from the Universal Protein Resource release 2018–05 and from 0 to 25 µg/ml). Te concentrations obtained were: 2019–03 (UniProt https://​www.​unipr​ot.​org/) as a refer- 13.11 µg/ml; 15.09 µg/ml; 9.83 µg/ml; 16.59 µg/ml; and ence database, which includes proteins predicted from 7.46 µg/ml. For each secretome sample, 10 µg of pro- the partial genome of S. solidus that was then currently teins were precipitated with trichloroacetic acid (TCA) available [44], as well as proteins predicted from the de [74]. Precipitated samples were directly used for in novo transcriptome [45, 46]. For the second search, we gel-digestion. used the unmatched sequences that we reported to be specifc to our genome assembly (36,140 sequences, see In‑gel digestion Results) as a reference database. Te search included For each sample, 50 µg of proteins from worm lysate and common contaminants and variable modifcations of 10 µg of proteins from secretome were resolved on a 12% methionine oxidation, and N-acetylation of the proteins. SDS–polyacrylamide gel electrophoresis (SDS-PAGE) Te data was fltered for matches passing a 1% false dis- with a BenchMark protein ladder (Invitrogen) and a covery rate set by MaxQuant [78]. We included the larger Berger et al. Parasites Vectors (2021) 14:436 Page 8 of 20

proteome of the threespine stickleback host (29,032 the proteome and the secretome samples, and those entries, downloaded June 21, 2018) from the Universal that were found only in a secretome sample. For the pro- Protein Resource release 2018–05 (UniProt https://​www.​ teins that were shared between the proteome and the unipr​ot.​org/) in the search in order to include proteins secretome samples, we performed two distinct enrich- originating from the fsh host during the MaxQuant data ment analyses: one for the proteins detected in at least search. Subsequent analyses were performed with all the one secretome sample and one for the proteins detected proteins inferred using the larger proteome of S. solidus in all secretome samples. Te enrichment analysis was and the unmatched sequences of the new genome as ref- performed using the FunRich tool (version 3.1.3) [79] erence databases. using the same approach as for the proteome. We investigated the annotation of proteins found only Data analysis of the proteome in the secretome and not the proteome using a three-step Data analysis was performed with Python custom scripts approach. During the frst step, the protein ID obtained (version 3.6.4) using Jupyter notebooks (version 5.4.0). with the MaxQuant analysis was searched in UniProt to A template of the custom script is available in the sup- retrieve its information if available (protein name and plementary information (Additional fle 3). Proteins sequence, corresponding coding gene, protein function, detected in each worm sample were retrieved using localization, and/or structural information). During the MaxQuant. We fltered out of the dataset protein IDs second step, we inferred the function of the protein based that were solely attributed to the threespine stickleback on sequence homologies using the Basic Local Alignment (Additional fle 4), protein IDs with REV coding (reverse Search Tool (BLAST) on NCBI (https://​blast.​ncbi.​nlm.​ hits for false discovery rate fltering), and protein IDs nih.​gov/​Blast.​cgi). We used BLASTP (protein–protein) with CON coding (contaminant hits that were added using the non-redundant protein sequences (nr) database into the search). In some cases, several protein IDs were (August 2018) and TBLASTN (protein-translated nucle- found by MaxQuant for a specifc protein because of high otide) using the nucleotide collection (nr/nt) database sequence similarities between them. Research on NCBI (August 2018). During the last step, when no informa- (https://​www.​ncbi.​nlm.​nih.​gov/) and UniProt (https://​ tion was found with the previous steps or to confrm the www.​unipr​ot.​org/) databases demonstrated that the pro- information previously found, we used the Pfam data- tein IDs detected for one protein were probably isoforms base (version 32.0) [70] to infer the function of the pro- with identical functions. Tese multiple protein IDs were tein based on domain organization homologies. In some kept during annotation to obtain exhaustive functional cases, several protein IDs were found by MaxQuant for information, but only the frst ID of each protein was a specifc protein because of high sequence similarities kept for enrichment analyses to avoid an overrepresenta- between these IDs. We found by applying the previous tion of specifc processes or functions (see below). Tis approach that in all of these cases, the putatively redun- fnal dataset was used to describe the global composition dant protein IDs detected for one protein were indeed of the proteome of S. solidus. isoforms with identical functions. Tese multiple protein We performed two distinct enrichment analyses for the IDs were nevertheless kept during annotation (see below) proteins detected in at least one worm sample and for the to obtain exhaustive functional information. proteins detected in all worm samples. For proteins with several protein IDs, we kept only the frst ID in order to Annotation of proteins prevent enrichment bias during analysis. Enrichment In order to obtain an exhaustive annotation of proteins, analysis was performed using the tool FunRich (version we conducted complementary approaches based on 3.1.3) [79]. We constructed a custom reference database sequence and structure analyses (as described below) [56, using the protein IDs and the Gene Ontology (GO) anno- 80–90] for the proteins detected in the proteome and/or tation (biological process, cellular component, molecular enriched in the secretome for which few or no annota- function) of all the proteins described in the larger pro- tion was available. Tese analyses were also performed teome of S. solidus available on UniProt (https://​www.​ for the proteins that were detected using the unmatched unipr​ot.​org/ 43058 entries). P-values for enrichment sequences specifc to our genome assembly as a refer- were obtained with a hypergeometric test corrected with ence database, for which limited or no annotation was the Bonferroni method. available.

Data analysis of the secretome Protein sequence analysis Data analysis was performed with Python using the same We predicted secreted proteins using the SignalP 5 .0 approach as for the proteome. We separated proteins (classical secretory proteins with option “eukary- into two categories: proteins that were shared between ota”, http://​www.​cbs.​dtu.​dk/​servi​ces/​Signa​lP/) and Berger et al. Parasites Vectors (2021) 14:436 Page 9 of 20

SecretomeP 1.0f (non-classical pathway secreted proteins the new genome of S. solidus (Additional fle 5), which is with option “mammalian”, http://​www.​cbs.​dtu.​dk/​servi​ in accordance with previous studies that demonstrated ces/​Signa​lP-1.​1/#​submi​ssion) servers [80, 81]. In addi- that the genomes of most fatworms include large num- tion, we predicted transmembrane helices in proteins bers of repetitive elements [93]. using TMHMM Server v. 2.0 (http://​www.​cbs.​dtu.​dk/​ servi​ces/​TMHMM-2.​0/) [82]. Te determination of pro- Number of putative genes tein domains was performed using HMMER hmmscan We frst performed gene prediction using an ab initio (http://​hmmer.​org/) against profle-HMM databases such approach, which gave a total of 21,780 ORFs. A second as Pfam, Gene3D, and Superfamily [83]. We predicted approach based on bam fles from the transcript align- protein disordered regions using the PrDOS software ments of S. solidus [46] led to the identifcation of 30,103 (http://​prdos.​hgc.​jp/​about.​html) [56]. ORFs. Te two sets were merged, and we found that 9103 ORFs overlapped between the two sets, with a highly var- Protein structure analysis iable percent identity (ranging from 1 to 100%). Because To construct 3D models of proteins, we searched their of this variation in percent identity and in order to obtain homologs in the Protein Data Bank (PDB) database exhaustive information, we decided to keep duplicate using BLASTP, Delta-BLAST, and HHpred [83]. Ten, sequences for the rest of the analysis, which left a total of 3D structure models were built by homology modelling 51,883 ORF sequences identifed. When comparing these based on their homologous structures (PDB ID: 4xp4_A) predicted genes to the previously published transcrip- using MODELLER [84]. Te quality of the models was tome of S. solidus, we found that 77% of the transcripts assessed by Ramachandran plot analysis through PRO- were found in these predicted genes, and that 66% of the CHECK [85]. Te images were generated with PyMOL predicted genes were found in the transcriptome. software (http://​pymol.​org/) [86]. Identifcation of sequences specifc to the new assembly Results Te 51,883 ORF sequences were aligned against a data- base of 43,058 protein sequences of S. solidus obtained Genome sequencing of S. solidus from NCBI. We identifed 19,853 sequences that had no Genome assembly and completeness BLAST match and 16,287 sequences that had no sig- By combining short and long reads produced with nifcant match (E-value < 1e−5). Te total unmatched Oxford Nanopore technology, we produced a genome sequences were therefore 36,140. Tese unmatched assembly corresponding to 4944 scafolds. Tese scaf- sequences are potential protein-coding genes not repre- folds summed to over 500 Mb (80% of the assembled sented in public databases of S. solidus and were therefore genome). To test the completeness of our assembly, we used as one of the reference databases in the LC–MS/MS mapped all the available transcripts of S. solidus [46] on analysis. our de novo assembly. We found that more than 99.6% (24,676/24,765) of transcripts mapped onto the new Annotation of the new expanded S. solidus genome assembly. Te remaining 89 transcripts that were not Te 51,883 ORF sequences were annotated using domain found were blasted (BLASTN) against the assembly data- information [70] and orthology assignment [71]. First, an set. A total of 46 transcripts had a hit on 28 scafolds, important fraction of the predicted sequences (35,576 but these hits were weak regarding the alignment length. ORFs; 68.6%) did not have a match in any database, and Tis left 43 orphan transcripts that had no match in the therefore, functions could not be assigned to them. We assembly. refer to them as putative S. solidus-specifc genes (Addi- We also searched BUSCO groups to investigate the tional fle 6). We found that 16,307 ORFs were success- completeness of the genome assembly using a comple- fully assigned a putative biological function (Additional mentary approach. A total of 649 complete BUSCOS fle 6: 7796 ORFs had a match on both databases, 4591 (66%) were found out of the 978 BUSCO groups in the ORFs had a match on the domain database only, and Metazoa database, which is close to the proportions 3920 ORFs had a match on the orthology database only). obtained from the assembly of other cestode genomes Enrichment analyses on the new expanded S. solidus (e.g. 72.6% in the genome of Schistosoma japonicum [91]; genome were conducted to identify a potential class of 73.2% in the genome of Schistosoma haematobium [92]). genes that could be overrepresented and that could be Te BUSCO duplication rate was 0.8%. Overall, these associated either with the S. solidus parasitic lifestyle results suggest that the coding genome was well repre- or with the phenotypic changes in the fsh host. Over- sented in the genome assembly. Finally, we identifed a all, enrichment analyses showed that the functions were total of 56% of elements that appeared to be repeated in related to environment sensing: 2.5% of ORFs are rd3 Berger et al. Parasites Vectors (2021) 14:436 Page 10 of 20

genes known to be expressed in photoreceptor cells [94]; detected in all worm samples. For proteins with sev- cell division, growth, and development: 1.7% of ORFs eral protein IDs, we kept only the frst ID in order to encode for proteins involved in the movement of micro- prevent enrichment bias. We found that 14.9% of pro- tubules, and 1.2% of ORFs are atrophin-coding genes teins detected in all worm samples were involved in involved in development [95]; as well as cell physiology: protein metabolism processes, namely “translation” 1% of ORFs encode for PARP proteins involved in various [GO:0006412] and “protein folding” [GO:0006457] cell physiological processes [96]. (p < 0.001). Tese are typically highly expressed proteins, which explains their overrepresentation here [93]. Also, Characterization of the proteome we found that 10.2% of proteins detected in all worm Using mass spectrometry (LC–MS/MS), we detected samples referred to mechanisms important in the func- 2290 proteins in samples of whole S. solidus tissues, tioning and the regulation of the cell cycle: “microtubule- among which 1467 proteins were detected in all worm based process” [GO:0007017], with the microtubules samples (Additional fle 7). Among these 2290 proteins, having major roles in cell division and growth, and “tri- 246 proteins were only detected using the new genome carboxylic acid cycle” [GO:0006099], the latter referring of S. solidus as a reference database during the LC–MS/ to abundant metabolic enzymes that are involved in res- MS analysis, with 113 proteins detected in all worms. Te piration by producing ATP (p < 0.001) (Fig. 2a). We also new genome sequence and annotation therefore provide found that 14.4% of proteins detected in all worm sam- a signifcant additional resource for S. solidus functional ples were proteins with GTP binding [GO:0005525] and genomics. Most of the 2290 proteins were functionally GTPase activity [GO:0003924] (p < 0.001) (Fig. 2b). Con- annotated, with the exception of the 246 proteins previ- sistent with all the functions described above, we found ously mentioned, for which functions were further inves- that 23.2% of proteins detected in all worm samples were tigated using a three-step approach based on sequence, localized in the cytoplasm (p < 0.001). structure, and phylogenetic analyses (see below). Using Te prediction of the functions of the 246 proteins this approach, we found that the 2290 proteins of S. soli- that were detected using the new genome as a refer- dus included 40 proteins encoded by 27 genes that did ence database was further investigated with sequence not have any sequence or domain similarities with other and structure analyses (Additional fle 7). Teir func- known species, and for which functions could not be tions were in accordance with the previous enrichment described (i.e. 1.7% of putative S. solidus-specifc proteins results, as we found for example proteins involved in in the proteome, compared to 68.6% of putative S. soli- microtubules (protein ID: g20896.t1) or with ATPase dus-specifc genes in the new genome) (Additional fle 8). activity (protein ID: g10622.t1). Additionally, we found Enrichment analyses in terms of biological process, that six proteins were peptidases or proteases with con- cellular component, and molecular function (GO terms) served active sites, suggesting that they are functional were performed for all the proteins detected in at least (Table 2) and that a protein contained an amidase one of the worm samples, and also for the proteins domain. Tis protein "g7530.t1″ is a fatty acid amide

translaon GTPase acvity a [GO:0006412] b [GO:0003924]

10,6% 6,5%

8,7% tricarboxylic 10,7% 4,9% acid cycle 2,2% [GO:0006099] protein folding 3,9% [GO:0006457] 4,3% 29,1% 12,3% 1,5% gluconeogenesis 8,0% [GO:0006094] 2,6% microtubule-based process 7,9% glycolyc process [GO:0007017] [GO:0006096] GTP binding [GO:0005525] Fig. 2 The proteome and secretome of S. solidus include proteins involved in protein metabolism/cell growth/energy intake. Results that were signifcant (p < 0.001) for the enrichment analyses performed in terms of biological process (a) and molecular function (b). In each case, analysis was performed for the proteins detected in all worm (outer chart) and secretome (inner chart) samples Berger et al. Parasites Vectors (2021) 14:436 Page 11 of 20

Table 2 Peptidases detected in the proteome of S. solidus using sample and one proteome sample, among which 385 its new expanded genome proteins were detected in all secretomes and proteomes Peptidase name ID Signal peptide Active site studied, and 74 proteins were detected in all secretomes (amino acid) and in at least one proteome. All the proteins that were Cathepsin B g20295.t1 1 to 26 Q131, C137, H306, shared between secretome and proteome samples had a N326 functional annotation, except for 36 proteins that were Aminopeptidase g13261.t1 1 to 30 K280, R355 previously detected in the proteome as S. solidus-specifc M17 and that were also secreted (2.3% of putative S. solidus- Aminopeptidase g19803.t1 1 to 17 K284, R359 specifc proteins in the secretome, compared to 68.6% of M17 putative S. solidus-specifc genes in the new genome and Aminopeptidase g13265.t1 1 to 19 K317, R392 M17 1.7% of putative S. solidus-specifc proteins in the pro- Peptidase family M49 g12315.t1 No E438, E439, H443, teome) (Additional fle 8). E495 We found that the secretome was composed of pro- Peptidase M16B g29053.t1 No E71, E141, L267, R368 teins enriched for biological processes, cellular compo-

The name, the identifcation number from the new genome assembly (ID), the nents, and molecular functions that were similar to the length of the signal peptide (in numbers of amino acids), and the name of the proteome. For the proteins detected in all secretomes, conserved active sites are indicated for each peptidase almost half of the proteins participated in the regulation of cell division and energy production. Tese proteins were enriched in functions such as the “microtubule- hydrolase like (SsFAAH-like). FAAH enzymes degrade based process” [GO:0007017] involved in the regulation signalling lipids of the endocannabinoid class [97]. of cell division, and “glycolytic process” [GO:0006096] Additionally, 141 proteins were assigned both to the and “gluconeogenesis” [GO:0006094], these two pro- worm and to the fsh host during LC–MS/MS analy- cesses being an important source of energy production sis (Additional fle 9). As it was not possible to directly (p < 0.001 hypergeometric test corrected with Bonferroni determine if the 141 proteins were produced by the method) (Fig. 2a). Proteins were also enriched in domains worm or the fsh, we searched if these proteins could with GTP binding [GO:0005525] and GTPase activity be core proteins, i.e. proteins conserved in all eukary- [GO:0003924] (Fig. 2b), as previously reported in the otes and whose functions are well characterized [98]. proteome (p < 0.001). Furthermore, 37.9% of the proteins Using BLASTP and Pfam tools, we found that the 141 detected in all the secretome samples were predicted to proteins had sequence similarities with 248 previously be localized in the cytoplasm (p < 0.001), similar to pro- reported eukaryotic core proteins [98] (mean Expect- teins in the proteome. Notably, we found that a signif- value 2.45e−11) and that they were composed of con- cant number of proteins (6.1% of the proteins detected in served domains. We therefore concluded that these all secretomes p < 0.001) were reported to be specifcally proteins were expressed both in the worm and in the localized into the extracellular space, an enrichment that fsh because of their fundamental functions in the was specifc to the secretome. cell, but that their peptide similarities did not allow us to determine their origin. Enrichment analysis per- formed on these 141 proteins confrmed that they were Proteins unique to the secretome involved in cell physiology and in energy production as We found eight proteins that were detected in all found for the whole proteome. secretome samples, but in none of the proteomes (Table 3). Among them, three proteins had fbronectin Characterization of the secretome type-III domains. Te frst (protein ID: A0A0X3PH69) A total of 1568 proteins were detected in the secretome was a “Neogenin”, which is a protein involved in neural samples (Additional fle 7). Te numbers ranged between development with two fbronectin type-III domains. 781 and 1183 proteins depending on the sample, with 459 Te second (protein ID: A0A0X3Q1B7; A0A0X3PKA1; proteins detected in all secretome samples. As expected, A0A0X3Q8R6) was a receptor-type tyrosine-protein the secretome of S. solidus was mostly a subset of the phosphatase eta that included three fbronectin type-III worm proteome, both in terms of protein number and domains. Te third (protein ID: A0A0V0JBL5) included functions, with a few exceptions described below. Te two fbronectin type-III domains, but no additional whole protein content of the secretome of a given worm information on the function of this protein was avail- represented up to 59% of the whole protein content of able. Furthermore, we detected an uncharacterized pro- the proteome of that same worm. In total, 1538 unique tein with a predicted molecular function corresponding proteins were shared between at least one secretome to “neurotransmitter: sodium symporter activity (NSS)”, Berger et al. Parasites Vectors (2021) 14:436 Page 12 of 20

Table 3 Proteins that are excreted/secreted by S. solidus, detected in fve secretomes and not in proteomes UniProt ID Information from UniProt and BLAST tools Information from sequence, structure, and phylogeny

A0A0X3PH69 Neogenin Fibronectin type-III domains A0A0X3Q1B7;A0A0X3PKA1;A0A0X3Q8R6 Receptor-type tyrosine-protein phosphatase eta Fibronectin type-III domains A0A0V0JBL5 Unknown Fibronectin type-II domains A0A183S8K9;A0A0X3PDV9;A0A0X3PNC8; Neurotransmitter: sodium symporter (NSS) Unknown A0A0V0J682;A0A183T7R5 A0A0X3P711;A0A183SGM7 Phospholipid scramblase Unknown A0A0V0J8W2 Unknown Gene specifc to S. solidus (orphan) A0A0X3P740 Unknown Gene specifc to S. solidus (orphan) A0A0X3Q756 Unknown Gene specifc to S. solidus (orphan) Signal peptide (secreted)

Protein IDs were taken from UniProt. When several protein IDs were assigned to one protein, these protein IDs corresponded to isoforms with identical functions. For each protein, functional annotation was frst retrieved from searches with UniProt and BLAST tools. Complementary analyses based on sequence, structure, and phylogeny were used to obtain information which is involved in transmembrane transport (pro- similar to the diagnostic antigen gp50 commonly used to tein ID: A0A183S8K9; A0A0X3PDV9; A0A0X3PNC8; detect parasitic diseases. A0A0V0J682; A0A183T7R5), and a protein potentially Two proteins were associated with transport func- acting at the cell membrane as a phospholipid scramblase tions. Te frst (protein ID: A0A183TIR8; A0A0X3PCX3; (PLSCR) (protein ID: A0A0X3P711; A0A183SGM7). A0A183TT84; A0A0X3PT59) was detected in four out of Lastly, three other uncharacterized proteins did not fve secretomes and was a sodium/glucose cotransporter have information on function using the UniProt data- involved in glucose homeostasis. Te second (protein ID: base, BLAST tools, and protein domain identifcation. It A0A0X3NT74), which was only detected in the second seems that these three proteins are specifc to S. solidus, largest worm (worm 2), was an intrafagellar transport with the last protein (protein ID: A0A0X3Q756) having a protein required for ciliogenesis. secretory signal peptide. One protein (protein ID: A0A183TDP7), detected in Twenty-two proteins were detected in at least one of four secretomes, had a knottin fold similar to a domain the fve secretome samples, but in none of the proteomes found in the Schistosoma parasitic trematodes, but for (Table 4). Tree proteins were composed of fbronectin which the function is not characterized. It is a cysteine- type-III domains. Te frst (protein ID: A0A0X3NX35) rich protein and interestingly, BLASTP search against the detected in four out of fve secretomes was described UniProt database revealed that it is only found in Platy- as a receptor-type tyrosine-protein phosphatase H. helminthes, and amino acid multiple sequence alignment Te second (protein ID: A0A183TLI3; A0A0X3PWG6; showed that it presents a new motif, C-x[6]-C-x(7)-CC- A0A0X3PW04; A0A0V0J316; A0A0X3PN23; A0A0X- x(4)-C-x(9)-C-x[2]-C-x[6]-C-x(5)-CC-x(3)-C-x(4)-C. 3PJY8), which was detected in three secretomes, was Tis type of protein is resistant to proteases. either a tenascin (an extracellular matrix glycoprotein) Nine proteins had functions which could not be or a receptor-type tyrosine-protein phosphatase F. Te clearly determined by BLAST in UniProt and non- last protein of this type (protein ID: A0A0V0JA38), also redundant protein sequences (GenBank) databases. detected in three secretomes, was, according to the Five other proteins were only detected using the new BLAST result, a collagen-like protein, which is an impor- genome of S. solidus as a database reference. We fur- tant component of cuticle. ther investigated the functions of these last 14 proteins Two proteins appeared to be involved in immunity with sequence, structure, and phylogenetic analysis: processes. Te frst (protein ID: A0A183TPG4;A0A0 nine proteins were specifc to S. solidus. Among them, X3P3D7;A0A0X3PTB8), which was detected in three two proteins (protein ID: A0A0V0J2U1; protein ID: secretomes, was uncharacterized and had a cystatin A0A0X3PIM2) had secretion signal peptides, and one domain which may be involved in immunomodula- protein (protein ID: A0A0X3PXG6) had a transmem- tory functions. Te second (protein ID: A0A0V0JBV1; brane helix (TMH) domain found in transmembrane A0A0V0J795; A0A0X3NVP3; A0A0X3P0M4), which proteins. Furthermore, one protein was identifed as was detected in the secretome of the largest worm only a peptidase M28B (glutamate carboxypeptidase 2) (worm 4), was according to the BLAST results an antigen Berger et al. Parasites Vectors (2021) 14:436 Page 13 of 20

Table 4 Proteins that are excreted/secreted by S. solidus, detected in at least one secretome and not in proteomes Protein ID No. of secretomes for Information from UniProt and BLAST tools Information from sequence, which the protein was structure, and phylogeny detected

A0A0X3NX35 4/5 Receptor-type tyrosine-protein phos‑ Fibronectin type-III domains phatase H A0A183TLI3;A0A0X3PWG6;A0A0X3PW04; 3/5 -Tenascin? Fibronectin type-III domains A0A0V0J316;A0A0X3PN23;A0A0X3PJY8 -Receptor-type tyrosine-protein phos‑ phatase F? A0A0V0JA38 3/5 Collagen-like protein Fibronectin type-III domains A0A183TPG4;A0A0X3P3D7;A0A0X3PTB8 3/5 Unknown Cystatin domain A0A0V0JBV1;A0A0V0J795;A0A0X3NVP3;A 1/5 (worm 4) Antigen Intrinsically disordered 0A0X3P0M4 A0A183TIR8;A0A0X3PCX3;A0A183TT84;A 4/5 Sodium/glucose cotransporter Unknown 0A0X3PT59 A0A0X3NT74 1/5 (worm 2) Intrafagellar transport protein 81 Unknown homolog A0A183TDP7 4/5 Protein with a knottin fold Cysteine-rich protein Specifc to Platyhelminthes A0A0V0J2U1 4/5 Unknown Gene specifc to S. solidus (orphan) Signal peptide (secreted) A0A0X3NRK5 4/5 Unknown Gene specifc to S. solidus (orphan) A0A0X3PMK5;A0A0X3PDL5;A0A0X3PEZ1 4/5 Unknown Gene specifc to S. solidus (orphan) A0A0X3PRL9 4/5 Unknown Gene specifc to S. solidus (orphan) A0A0X3NVW4 3/5 Unknown Gene specifc to S. solidus (orphan) A0A0X3PXG6 3/5 Unknown Gene specifc to S. solidus (orphan) TMH (membrane) A0A0X3PIM2 2/5 Unknown Gene specifc to S. solidus (orphan)—Signal peptide (secreted) A0A0X3PNW2 2/5 Unknown Gene specifc to S. solidus (orphan) A0A183TE24;A0A0X3PRC0 1/5 Unknown Gene specifc to S. solidus (orphan) (worm 4) g11241.t1 3/5 Unknown Intrinsically disordered g1854.t1;g12541.t1 2/5 Unknown Peptidase M28B (glutamate carboxypeptidase 2) g17644.t1 4/5 Unknown Fibronectin type-III domains g2.t1 3/5 Unknown Unknown g6226.t1 1/5 Unknown Unknown Protein IDs were taken from UniProt or from the new genome assembly. When several protein IDs were assigned to one protein, these protein IDs corresponded to isoforms with identical functions. For each protein, functional annotation was frst retrieved from searches with UniProt and BLAST tools. Complementary analyses based on sequence, structure, and phylogeny were used to obtain information

(protein ID: g1854.t1; g12541.t1), and one protein had describe for the frst time the protein component of the fbronectin type-III domains (protein ID: g17644.t1). secretome of a parasite commonly referred as manipu- lative, using mass spectrometry (LC–MS/MS), with the Discussion objective to identify potential manipulation factors that Understanding the molecular mechanisms behind behav- could explain host behavioural changes. As expected, ioural changes in a vertebrate host infected by a non-cer- we found that the proteins that are excreted/secreted ebral parasite is a fascinating challenge. If these changes by S. solidus are mostly a subset of the proteins being are the result of parasite manipulation of its host, one expressed in the whole worm and that both included possible mechanism hinges on molecules secreted by the proteases. We also found that 30 secretome proteins parasite (i.e. manipulation factors) that impact the host’s were not detected in the proteome and were involved physiological, immunological, and central nervous sys- in neural, immune, and cell communication functions, tems, and ultimately the behaviour of the host. Here, we therefore having the potential to interfere with the host’s Berger et al. Parasites Vectors (2021) 14:436 Page 14 of 20

physiological systems and behaviour. Finally, we high- cavity of the stickleback to the PBS tube) encountered by lighted that the secretome of S. solidus included S. soli- S. solidus at this step could have infuenced the levels of dus-specifc proteins that could play important roles in proteins detected in the proteomes and secretomes. One the tight interaction of the parasite with its fsh host. All can speculate that this environmental stress could have together, these proteins represent promising candidates triggered cell growth and energy intake in the worm (for to explain physiological and behavioural changes in the example, to counteract other metabolic efects), explain- stickleback host. ing the discrepancies between the proteomics data we obtained and the previous transcriptomic work [46]. To The secretome of S. solidus is mostly a subset of its confrm our results, future work could replace the PBS proteome with similar functions solution by a medium that would better mimic the fsh Te global protein composition of the proteomes and the abdominal cavity, for example, by a medium already used secretomes of S. solidus highlights functions that are cru- in vitro for breeding [32]. Te global analysis of the pro- cial for its parasitic lifestyle and for its interactions with teome and of the secretome of S. solidus therefore con- the external environment. As it is generally reported [52], frms the importance of energy use for the worm, even we found that the proteins that are excreted/secreted by before it reaches the bird. S. solidus (i.e. the secretome) are mostly a subset of the We detected both in the proteome and the secretome proteins being expressed in the whole worm (i.e. the pro- of S. solidus proteins with GTP binding and GTPase teome), both in terms of protein number and function. activity. GTP-binding proteins also called G-proteins We found that both the proteome and the secretome are known to regulate a variety of biological processes of S. solidus at the infective stage (i.e. when behavioural such as mediating signals by hormones and light, gene changes appear in the fsh host) were enriched in pro- expression, cytoskeletal and microtubule organization, teins involved in cell division, which was also found in or vesicle trafcking [101–104]. In parasites, GTPases the functional analysis of the genome we sequenced, as have been demonstrated to have important roles in the well as energy production. Concerning cell growth, our secretion of virulence factors [103, 104]. For instance, in results may seem counterintuitive because a previous Toxoplasma gondii, which is a parasite known to induce transcriptome analysis revealed that the non-infective behavioural changes in rodents [105], Rab GTPases reg- stage of S. solidus shows higher expression levels of genes ulate the secretion of proteins essential to invade host involved in growth and cell regulatory functions com- cells, and the modifcation of their expression results in pared to the infective stage [46]. However, we did not aberrant transport of proteins [106]. analyse the proteome and the secretome of worms at Te use of the new genome of S. solidus as a reference the non-infective stage, such that we can only quantify database during mass spectrometry analysis allowed us that these processes are enriched at the infective stage to detect six peptidases or proteases in the proteome but not their relative importance compared to other of S. solidus, and one peptidase (peptidase M28B gluta- life stages of the parasite. We therefore only speculate mate carboxypeptidase 2) detected only in the secretome that both the non-infective and the infective stages of S. fraction, with proteases and peptidases being typi- solidus could rely on biological processes involving cell cally described as virulence factors for many parasites growth, but production levels would be much higher at (Table 1). In S. solidus, proteases and peptidases may the non-infective stage, considering the fact that growth have important roles in weakening the fsh immune occurs predominantly in the frst 12 weeks after installa- response at the infective stage when the worm is ready to tion in the fsh host [99]. Concerning energy production, pass into its avian host to reproduce [32]. Injecting these the results are also surprising, as the worm was empiri- proteases and peptidases, alone or in combination, into cally described to use its glycogen reserve (its primary isolated head kidney samples from non-infected fsh as in source of energy) mainly when it reaches its fnal avian Scharsack et al. [32] would be necessary to confrm their host [100]. However, transcriptomic analyses demon- potential role in disrupting the host immunity. Further- strated that glycogen metabolism and energy production more, this shows that the new genome presented here in S. solidus are complex processes [46]. Specifcally, six is a valuable tool to identify proteins that are critical for steps of the glycolysis cycle are highly expressed at the the parasitic lifestyle. We also found an SsFAAH-like infective stage [46]. Furthermore, it is important to keep enzyme that could potentially degrade signalling lipids in mind that for tissue and secretome samples collection, of the endocannabinoid class. Endocannabinoids have worms were placed in a solution of PBS (pH 7.4) in the previously been reported as an important player in host- dark for 2 h. While this method is commonly used to parasite interactions by promoting the activation of the collect parasitic secretomes [73], we cannot dismiss the immune response in the host [97, 107]. Tis enzyme was possibility that the environmental change (from the body not detected in the secretome of S. solidus. Berger et al. Parasites Vectors (2021) 14:436 Page 15 of 20

The secretome of S. solidus also contains proteins (NSS)” annotation. In humans, it is a membrane pro- not detected in the proteome tein involved in the termination of synaptic transmission Neuronal and immune functions and the recycling of neurotransmitters at the membrane Transcriptomic analysis demonstrated that when S. [114]. How this membrane protein could be secreted and solidus reaches the infective stage, genes involved in act as a manipulation factor in the secretome is unclear. neural pathways and sensory perception are expressed Terefore, we are cautious in interpreting the role of this by the worm at higher levels [46]. Tus, we expected protein as a manipulation factor, as it could be solely a the secretome at the infective stage of S. solidus to be waste product from the membrane of the parasite. enriched with proteins involved in such neural func- tions (as described in other parasitic systems, reviewed Cell communication in Table 1). Furthermore, behavioural changes in the Schistocephalus solidus is located in the abdominal cav- stickleback infected by S. solidus generally appear in con- ity of the threespine stickleback [27]. As the worm is comitance with the activation of the immune response of not in direct contact with its host brain, we expected to the fsh [32]. Terefore, we expected the secretome of S. detect proteins involved in cell communication or cell– solidus at the infective stage to include proteins involved cell signalling. During annotation of the new genome, in immunity (as reviewed in Table 1). We found that 2.5% of genes were related to environmental sensing three proteins were playing potential roles in neural and functions. Furthermore, we found that seven proteins immune functions. Te frst protein, which was detected detected only in the secretome fraction were charac- in four secretomes but in none of the proteomes, had a terized by fbronectin type-III (FNIII) domains. FNIII knottin fold called UPF0506 composed of cysteines and domains are widely found in proteins and are generally found in Schistosoma parasites. In Schistosoma, involved in cell–cell interactions [115]. Te frst protein the function of the proteins with such knottin fold is not with fbronectin type-III domains found in all secretomes defned. However, peptides with knottin domains (i.e. but in none of the proteomes was identifed as a neo- knottins) were described in venoms from various ani- genin. In addition to playing roles in cell–cell adhesion, mals. For venomous animals, knottins are neurotoxins neogenins are involved in neural development in humans having high specifcity towards receptors in the nervous [116]. Tree additional proteins that we identifed with system of their prey or aggressor [108]. Furthermore, fbronectin type-III domains were described as receptor- it is a cysteine-rich protein, and in parasites, cysteine- type tyrosine-protein phosphatases (type eta, detected rich proteins play a role in invasion [109] and modula- in all secretomes but in none of the proteomes; type tion [110] of the immune system. Terefore, this protein H, detected in four secretomes but in none of the pro- could be a promising manipulation factor if it could act teomes; type F, detected in three secretomes but in none as a neurotoxin in the brain of infected sticklebacks. Te of the proteomes). Receptor-type tyrosine-protein phos- second protein, which was detected in three secretomes phatases are involved in cell–cell communication [117]. but in none of the proteomes, had a cystatin domain It was previously shown that a baculovirus secretes a pro- which may be involved in immunomodulatory func- tein tyrosine phosphatase, which acts on the neural sys- tions. It was demonstrated in parasitic nematodes that tem of its host, the silkworm Bombyx mori, and enhances cystastins are important secreted molecules that help its locomotory activity, so that it ultimately increases the parasites to evade immunity of the host [111]. Nema- virus dispersal [19] (Table 1). Te set of phosphatases tode cystatins inhibit host proteases involved in antigen identifed here are located in the membrane, such that processing and presentation, leading to a reduction of they are predicted to have specifc biological functions the host immune response [112]. Tis secreted protein that difer from those of the protein tyrosine phosphatase could thereby explain in part why the immune response found in the baculovirus. However, their abundant repre- is only activated late in infected sticklebacks, which sentation in the secretome is intriguing, and the impor- needs to be further studied. Te third protein appeared tance of the process of phosphorylation represents a to be an antigen similar to the diagnostic antigen gp50. future avenue of study in the context of behavioural Tis diagnostic antigen is used to detect parasitic infec- changes in the fsh host, such as increased exploration tion, for example, by Taenia solium [113]. However, the [34]. For the next two proteins identifed with fbronec- antigen was detected only in the secretome from the big- tin type-III domains, the frst protein was found in all gest worm of our study and not in its proteome. Sam- secretomes but in none of the proteomes and had no pling over a longer time frame could alleviate this type clear function (protein ID: A0A0V0JBL5), as for the sec- of discrepancy between samples. One protein, which was ond protein that was detected in four secretomes but in detected in all secretomes but in none of the proteomes, none of the proteomes (protein ID: g17644.t1). Te last had a “neurotransmitter: sodium symporter activity protein appeared to be a collagen-like protein found in Berger et al. Parasites Vectors (2021) 14:436 Page 16 of 20

three secretomes. In Caenorhabditis elegans, the cuti- suggested that 19% of all the protein-coding genes could cle is an extracellular matrix made of small collagen-like be S. solidus-specifc [45]. In this study, producing a proteins [118]. Tus, the role of this protein in cell com- novel genome sequence and assembly of S. solidus led munication is uncertain, and it may rather be a waste us to increase this estimation by three times. S. solidus- product excreted passively from the cuticle of S. solidus. specifc secreted proteins represent promising candidates In summary, fbronectin type-III proteins secreted by S. to explain physiological and behavioural changes in the solidus appear to be good candidates for manipulation stickleback host. Te threespine stickleback is an obliga- factors because of their roles in cell–cell signalling, but tory host of S. solidus [122]. Because of the potential also in potential neural functions. coevolution between the worm and the fsh [27], S. soli- Additionally, we expected to detect proteins with trans- dus may have developed highly specifc molecular mech- port functions in the secretome of S. solidus to mediate anisms targeting the threespine stickleback physiological the communication between the worm localized in the machinery to insure it can establish and grow in this fsh host abdominal cavity and the host brain. We detected only, until it is ready to pass into its fnal avian host. Te three proteins involved in transport or related functions S. solidus-specifc proteins found here are therefore likely at the membrane, including a protein that was detected to be part of this unique set of molecules crucial for the in all secretomes but in none of the proteomes, a phos- survival of the worm, and we can assign them this eco- pholipid scramblase. It is a transmembrane protein that logical annotation [123]. It will be very interesting in the is known in humans to bind to the 5’-promoter region future to produce recombinant S. solidus-specifc pro- of the inositol 1,4,5-triphosphate receptor type 1 gene teins and test their efects on the phenotype of the three- (IP3R1) so that it enhances expression of the receptor spine stickleback by functional analysis. Tese proteins [119]. Since these proteins detected here could be simply could also serve to obtain structural information by X-ray waste products coming from the membranes of the para- crystallography or NMR spectroscopy, which could help site, we thus have to be careful when speculating about us to discover their function. Finally, studies of how these the roles of these proteins, and functional analyses will be proteins evolve between Schistocephalus populations and required. between species of Schistocephalus infecting diferent fsh species [124] will shed additional light on the speci- The secretome includes S. solidus‑specifc proteins fcity and potential function of these proteins. that could play important roles in parasite–host interactions Conclusions During annotation of the new genome, we found that Te secretome of S. solidus appears to be an important 68.6% of the predicted genes did not have any sequences component of the molecular interaction between the or domain similarities with other known species that parasite and its threespine stickleback host. In accord- would enable accurate annotation, therefore represent- ance with our predictions, we detected in the secretome ing putative S. solidus-specifc genes. Part of these genes of S. solidus at the infective stage (i.e. when behavioural are efectively translated, as 1.7% of the proteome and changes appear in the stickleback) proteases, proteins 2.3% of the secretome included putative proteins spe- involved in neural and immune functions, as well as cifc to the worm. Specifcally, we detected a total of 12 proteins involved in cell communication. We also high- proteins only in the secretome fraction for which func- lighted in the secretome the presence of S. solidus-spe- tions could not be clearly determined, but that were cifc proteins. In the future, comparative studies could be specifc to S. solidus. Tree proteins were detected to conducted to validate that the proteins detected in the be secreted signal peptides (protein ID: A0A0X3Q756; secretome of S. solidus act as manipulation factors in the protein ID: A0A0V0J2U1; protein ID: A0A0X3PIM2). behaviour of its fsh host: at the organism level, the anal- We found that the proteins that are specifcally enriched ysis of the secretome of the worm at the non-infective in the secretome are more likely to have secreted sig- stage will confrm if the putative manipulative proteins nal peptides (10% of the proteins detected only in the reported in this study are efectively detected solely at the secretome fraction had secreted signal peptides, while infective stage (or at highest levels) when the behavioural 0.17% of the proteins detected in the proteome had such changes appear in the host. It would also be of special signals). One protein had a transmembrane helix (TMH) interest to study the secretome of S. solidus at the stage domain found in transmembrane proteins (protein ID: when it is able to reproduce, i.e. in the bird host. At the A0A0X3PXG6). Proteins with TMH domains are used to species level, comparing the secretome of S. solidus with diagnose parasitic infection, as they are highly specifc to the secretome of a closely related species Schistocephalus the parasite of interest [120, 121]. Interestingly, analyses pungitii will be of great interest, as S. pungitii does not conducted with the transcriptome of S. solidus previously have known efects on the behaviour of the nine-spine Berger et al. Parasites Vectors (2021) 14:436 Page 17 of 20

stickleback, its specifc intermediate host species [125]. Acknowledgements We thank Karina Nielsen, Alexandra Sébastien, and Tom Clark from the Foster Finally, functional studies testing efects of presence in Lab for their advice during the preparation of the samples for mass spectrom‑ non-infected sticklebacks of interesting proteins identi- etry analysis. The authors thank the personnel of the Laboratoire Aquatique fed in the secretome, such as receptor-type tyrosine- de Recherche en Sciences Environnementales et Médicales at Université Laval for their help with fsh rearing. We thank Eric Normandeau from Ressources protein phosphatases or phospholipid scramblase, would Aquatiques Québec (RAQ) for his help to identify S. solidus-specifc proteins. allow us to better understand the contribution of these proteins in the behavioural changes. To conclude, we Authors’ contributions CSB designed the study with input from NA-H, CRL, LJF, and K-MM. CSB hope that the genomic and proteomic resources we pro- performed the experiments to prepare the worms and their secretomes for vide will help other researchers to investigate general mass spectrometry. K-MM performed the experiments with the mass spec‑ questions on host–parasite interactions in nature. trophotometer and the MaxQuant software. HM prepared the worm samples for genome sequencing and performed the MinION sequencing. JL analysed the genome and prepared the fles used for searches with the MaxQuant Supplementary Information software. JL and H Maarouf assembled and annotated the new genome and The online version contains supplementary material available at https://​doi.​ proteins. CSB analysed the data with input from NA-H, CRL, and LJF. CSB and org/​10.​1186/​s13071-​021-​04933-w. NA-H wrote the manuscript with input from all authors. All authors read and approved the fnal manuscript.

Additional fle 1. Visualization on SDS-PAGE gels of the protein content Funding of the proteome and of the secretome for 5 S. solidus worms. (a) Proteome Funding for the project was provided by the Fonds de Recherche du Québec (i.e. tissues) extracts from 5 distinct worms. Note that for worm 1 and – Nature & Technologies (FRQNT), Projet en Équipe grant to N.A.H. and C.R.L.; worm 2, the sample was loaded on gel either in duplicate or triplicate a fellowship (international internship program) from FRQNT obtained through but only one lane was used for LC–MS/MS. (b) Secretome extracts from 5 the Ressources Aquatiques Québec (RAQ) strategic cluster and a Student/PDF distinct worms. On each gel, a BenchMark protein ladder (Invitrogen) and Research Grant award from the Canadian Society of Zoologists (CSZ) to C.S.B. a negative control (SDS-PAGE loading bufer and water) were also loaded. Mass spectrometry infrastructure used here was supported by the Canada Additional fle 2. Methods information for LC/MS analysis. Foundation for Innovation, the BC Knowledge Development fund, and the BC Proteomics Network. Work in L.J.F.’s group was supported by Genome Canada Additional fle 3. Template custom script for analysis of the proteome and Genome British Columbia via the Pan-Canadian Proteomics Initiative and secretome LC–MS/MS data using Python. (264PRO).). C.R.L. holds the Canada Research Chair in Cellular Systems and Additional fle 4. List of the proteins that were solely attributed to the Synthetic Biology. threespine stickleback host, and that were consequently discarded from analysis. All the protein IDs that were specifc to the threespine stickleback Availability of data and materials Gasterosteus aculeatus are reported. For a proteome or a secretome The datasets supporting the conclusions of this article are included within the sample, a value above 0 indicates protein detection. The Gene Ontology article and its additional fles. Furthermore, the dataset of the new genome (GO) annotation (biological process, cellular component and molecular assembly is available from the NCBI BioProject with accession PRJNA576252. function) of each protein ID is reported. The mass spectrometry proteomics data have been deposited to the Pro‑ teomeXchange Consortium via the PRIDE partner repository with the dataset Additional fle 5. List of the repeated elements that were identifed in the identifer PXD016844. new genome of S. solidus. For each class of repeated elements the follow‑ ing are indicated: the number of elements, their total length (bp) as well as their proportion (%) in the new genome. Declarations Additional fle 6. Annotation results of 16,307 ORFs predicted from the Ethics approval and consent to participate new expanded Schistocephalus solidus genome. Functional annotation This study was approved by the Comité de Protection des Animaux de was performed using PFAM (domain database) and eggNOG (orthology l’Université Laval (CPAUL 2014069-3). database). For each ORF match in each database: the accession number, the E-value and the functional description of the match are indicated. The Consent for publication other ORFs (68.6%) predicted from the new genome did not match with Not applicable. these two databases, and are considered S. solidus-specifc. Additional fle 7. List of the proteins detected by LC–MS/MS in the Competing interests proteome and the secretome of S. solidus. All the protein IDs that were The authors declare that they have no competing interests. detected in each of the proteome samples and/or secretome samples are reported. For a sample, a value above 0 indicates protein detection. The Author details Gene Ontology (GO) annotation (biological process, cellular component 1 Département de Biologie, Université Laval, Quebec, QC, Canada. 2 Institut and molecular function) of each protein ID is reported. de Biologie Intégrative Et Des Systèmes (IBIS), Université Laval, Quebec, QC, 3 Additional fle 8. List of the S. solidus-specifc proteins detected in the Canada. Ressources Aquatiques Québec (RAQ), Institut Des Sciences de La Mer de Rimouski, Quebec, Canada. 4 Département de Biochimie, Microbiologie proteome and/or in the secretome of S. solidus. For each protein, the ID 5 is reported from UniProt. If the protein is “not secreted”, then it was only Et Bioinformatique, Université Laval, Quebec, QC, Canada. Department of Bio‑ chemistry & Molecular Biology, Michael Smith Laboratories, University of Brit‑ detected in the proteome and not in the secretome. The Gene Ontology 6 (GO) annotation (biological process, cellular component and molecular ish Columbia, Vancouver V6T 1Z4, Canada. PROTEO, Le Réseau Québécois de function) of each protein ID is reported. Recherche Sur La Fonction, la structure et l’ingénierie des protéines, Université Laval, Quebec, Canada. 7 Centre de Recherche en Données Massives (CRDM), Additional fle 9. List of the 141 proteins that were assigned both to Université Laval, Quebec, Canada. S. solidus and to the fsh host during LC–MS/MS analysis. Each protein is assigned several protein IDs from the worm and the fsh. The Gene Received: 2 February 2021 Accepted: 6 August 2021 Ontology (GO) annotation (biological process, cellular component and molecular function) of each protein ID is reported. Berger et al. Parasites Vectors (2021) 14:436 Page 18 of 20

References 23. De Bekker C, Quevillon LE, Smith PB, Fleming KR, Ghosh D, Patterson 1. Krist AC. Efect of the digenean parasite Proterometra macrostoma on AD, et al. Species-specifc ant brain manipulation by a specialized host morphology in the freshwater snail Elimia livescens. J Parasitol. fungal parasite. BMC Evol Biol. 2014;14(1):166–78. 2000;86(2):262–7. 24. Adamo SA, Kovalko I, Turnbull KF, Easy RH, Miles CI. The parasitic wasp 2. Jones D, Jones G, Rudnicka M, Click A, Reck-Malleczewen V, Iwaya M. Cotesia congregata uses multiple mechanisms to control host (Manduca Pseudoparasitism of host Trichoplusia ni by Chelonus spp. as a new sexta) behaviour. J Exp Biol. 2016;219(23):3750–8. model system for parasite regulation of host physiology. J Insect 25. Libersat F. Wasp uses venom cocktail to manipulate the behavior of its Physiol. 1986;32(1):315–28. cockroach prey. J Comp Physiol A. 2003;189(7):497–508. 3. Watson DW, Mullens BA, Petersen JJ. Behavioral fever response of Musca 26. Korgaonkar A, Han C, Lemire AL, Siwanowicz I, Bennouna D, Kopec R, domestica (Diptera: Muscidae) to infection by Entomophthora muscae et al. A novel family of secreted proteins linked to plant gall develop‑ (Zygomycetes: Entomophthorales). J Invertebr Pathol. 1993;61(1):10–6. ment. BioRxiv. 2020. https://​doi.​org/​10.​1101/​2020.​10.​28.​359562. 4. Laferty KD, Shaw JC. Comparing mechanisms of host manipulation 27. Barber I, Scharsack JP. The three-spined stickleback-Schistocephalus across host and parasite taxa. J Exp Biol. 2013;216(1):56–66. solidus system: an experimental model for investigating host-parasite 5. Adamo SA. Parasites: evolution’s neurobiologists. J Exp Biol. interactions in fsh. Parasitology. 2010;137(3):411–24. 2013;216(1):3–10. 28. Clarke AS. Studies on the life cycle of the pseudophyllidean cestode 6. Biron DG, Loxdale HD. Host-parasite molecular cross-talk dur‑ Schistocephalus solidus. Proc Zool Soc London. 2010;124(1):257–302. ing the manipulative process of a host by its parasite. J Exp Biol. 29. Hammerschmidt K, Kurtz J. Schistocephalus solidus: establishment 2013;216(1):148–60. of tapeworms in sticklebacks–fast food or fast lane? Exp Parasitol. 7. Herbison R, Lagrue C, Poulin R. The missing link in parasite manipula‑ 2007;116(2):142–9. tion of host behaviour. Parasit Vectors. 2018;11(1):222–7. 30. Dingemanse NJ, Oosterhof C, Van Der Plas F, Barber I. Variation in stick‑ 8. Coppens I, Vielemeyer O. Insights into unique physiological features of leback head morphology associated with parasite infection. Biol J Lin neutral lipids in Apicomplexa: from storage to potential mediation in Soc. 2009;96(1):759–68. parasite metabolic activities. Int J Parasitol. 2005;35(1):597–615. 31. Heins DC, Baker JA, Toups MA, Birden EL. Evolutionary signifcance 9. Fromm B, Ovchinnikov V, Høye E, Bernal D, Hackenberg M, Marcilla of fecundity reduction in threespine stickleback infected by the A. On the presence and immunoregulatory functions of extracellular diphyllobothriidean cestode Schistocephalus solidus. Biol J Lin Soc. microRNAs in the trematode Fasciola hepatica. Parasite Immunol. 2010;100(1):835–46. 2017;39(1):e12399. 32. Scharsack JP, Koch K, Hammerschmidt K. Who is in control of the stick‑ 10. Grimwood J, Smith JE. Toxoplasma gondii: the role of parasite leback immune system: interactions between Schistocephalus solidus surface and secreted proteins in host cell invasion. Int J Parasitol. and its specifc vertebrate host. Proc R Soc B. 2007;274(1629):3151–8. 1996;26(1):169–73. 33. Barber I, Huntingford FA. The efect of Schistocephalus solidus 11. Coakley G, Maizels RM, Buck AH. Exosomes and other extracellular (: ) on the foraging and shoaling behav‑ vesicles: The new communicators in parasite infections. Trends Parasitol. iour of three-spined sticklebacks Gasterosteus aculeatus. Behaviour. 2015;31(1):477–89. 1995;132(1):1223–40. 12. Knudsen GM, Medzihradszky KF, Lim K-C, Hansell E, McKerrow JH. 34. Barber I. The efect of hunger and cestode parasitism on the shoaling Proteomic analysis of Schistosoma mansoni cercarial secretions. Mol Cell decisions of small freshwater fsh. J Fish Biol. 1995;47(1):524–36. Proteomics. 2005;4(12):1862–75. 35. Grécias L, Valentin J, Aubin-Horth N. Testing the parasite mass burden 13. Gourbal BEF, Guillou F, Mitta G, Sibille P, Thèron A, Pointier J-P, et al. efect on alteration of host behaviour in the -stickleback system. J Exp Excretory–secretory products of larval Fasciola hepatica investigated Biol. 2018;221(1):174748. using a two-dimensional proteomic approach. Mol Biochem Parasitol. 36. Giles N. Behavioural efects of the parasite Schistocephalus solidus (Ces‑ 2008;161(1):63–6. toda) on an intermediate host, the three-spined stickleback Gasteros- 14. Ditgen D, Anandarajah EM, Meissner KA, Brattig N, Wrenger C, Liebau E. teus aculeatus L. Animal Behaviour. 1983;131(1):1192–4. Harnessing the helminth secretome for therapeutic immunomodula‑ 37. Tierney JF, Crompton DWT. Infectivity of plerocercoids of Schistocepha- tors. Biomed Res Int. 2014;15(1):964350–63. lus solidus (Cestoda: Ligulidae) and fecundity of the adults in an experi‑ 15. Garg G, Ranganathan S. Helminth secretome database (HSD): a collec‑ mental defnitive host Gallus gallus. J Parasitol. 1992;78(1):1049–54. tion of helminth excretory/secretory proteins predicted from expressed 38. Barber I, Svensson PA, Walker P. Behavioural responses to simu‑ sequence tags (ESTs). BMC Genomics. 2012;13(13):1471–2164. lated avian predation in female three spined sticklebacks: The 16. Cwiklinski K, Dalton JP. Advances in Fasciola hepatica research using efect of experimental Schistocephalus solidus infections. Behaviour. “omics” technologies. Int J Parasitol. 2018;48(1):321–31. 2004;141(1):1425–40. 17. Kim D-W, Yoo WG, Lee M-R, Yang H-W, Kim Y-J, Cho S-H, et al. Transcrip‑ 39. Weber JN, Steinel NC, Shim KC, Bolnick DI. Recent evolution of extreme tome sequencing and analysis of the zoonotic parasite Spirometra cestode growth suppression by a vertebrate host. Proc Natl Acad Sci. erinacei spargana (plerocercoids). Parasit Vectors. 2014;7(1):368–87. 2017;114(25):6575–80. 18. Bień J, Sałamatin R, Sulima A, Savijoki K, Conn DB, Näreaho A, et al. Mass 40. Poulin R. Parasite manipulation of host behavior-Chapter 5: An update spectrometry analysis of the excretory-secretory (E-S) products of the and frequently asked questions. Adv Study Behav. 2010; 41(1):151–86. model cestode Hymenolepis diminuta reveals their immunogenic prop‑ 41. Grécias L. Étude des causes proximales des changements de comporte‑ erties and the presence of new E-S proteins in cestodes. Acta Parasitol. ment de l’épinoche à trois épines (Gasterosteus aculeatus) par son 2016;61(1):429–42. parasite Schistocephalus solidus. Thesis Université Laval 2017; https://​ 19. Kamita SG, Nagasaka K, Chua JW, Shimada T, Mita K, Kobayashi M, et al. corpus.​ulaval.​ca/​jspui/​handle/​20.​500.​11794/​27886 A baculovirus-encoded protein tyrosine phosphatase gene induces 42. Polzer M, Conradt U. Identifcation and partial characterization of the enhanced locomotory activity in a lepidopteran host. Proc Natl Acad proteases from diferent developmental stages of Schistocephalus Sci. 2005;102(1):2584–9. solidus (Cestoda: Pseudophyllidae). Int J Parasitol. 1994;24(7):967–73. 20. Biron DG, Marché L, Ponton F, Loxdale HD, Galéotti N, Renault L, et al. 43. Brophy PM, Papadopoulos A, Touraki M, Coles B, Körting W, Barrett J. Behavioural manipulation in a grasshopper harbouring hairworm: a Purifcation of cytosolic glutathione transferases from Schistocephalus proteomics approach. Proc R Soc B. 2005;272(1):2117–26. solidus (plerocercoid): interaction with anthelmintics and products of 21. Biron DG, Ponton F, Marché L, Galeotti N, Renault L, Demey-Thomas E, lipid peroxidation. Mol Biochem Parasitolol. 1989;36(2):187–96. et al. “Suicide” of crickets harbouring hairworms: a proteomics investiga‑ 44. International Helminth Genomes Consortium. Comparative genomics tion. Insect Mol Biol. 2006;15(6):731–42. of the major parasitic worms. Nat Genet. 2019;51(1):163–74. 22. Prandovszky E, Gaskell E, Martin H, Dubey JP, Webster JP, McConkey 45. Hébert FO, Grambauer S, Barber I, Landry CR, Aubin-Horth N. Transcrip‑ GA. The neurotropic parasite Toxoplasma gondii increases dopamine tome sequences spanning key developmental states as a resource metabolism. PLoS ONE. 2011;6(9):e23866. for the study of the cestode Schistocephalus solidus, a threespine stickle‑ back parasite. GigaScience. 2016;5(1):1–19. Berger et al. Parasites Vectors (2021) 14:436 Page 19 of 20

46. Hébert FO, Grambauer S, Barber I, Landry CR, Aubin-Horth N. Major host 72. Berger CS, Aubin-Horth N. An eDNA-qPCR assay to detect the presence transitions are modulated through transcriptome-wide reprogramming of the parasite Schistocephalus solidus inside its threespine stickleback events in Schistocephalus solidus, a threespine stickleback parasite. host. J Exp Biol. 2018;221(1):jeb178137. Molecular Ecololgy. 2017;26(4):1118–30. 73. Cao X, Fu Z, Zhang M, Han Y, Han Q, Lu K, et al. Excretory/secretory pro‑ 47. Yoneva A, Scholz T, Kuchta R. Comparative morphology of surface teome of 14-day schistosomula, Schistosoma japonicum. J Proteomics. ultrastructure of diphyllobothriidean tapeworms (Cestoda: Diphyllobo‑ 2016;130(1):221–30. thriidea). Invertebr Biol. 2018;137(1):38–48. 74. Link AJ, LaBaer J. Trichloroacetic acid (TCA) precipitation of proteins. 48. Berger CS, Aubin-Horth N. The secretome of a parasite alters its host’s Cold Spring Harbor Protocol. 2011;2011(8):993–4. behaviour but does not recapitulate the behavioural response to infec‑ 75. Shevchenko A, Wilm M, Vorm O, Mann M. Mass spectrometric tion. Proc R Soc B 287, 20200412. sequencing of proteins silver-stained polyacrylamide gels. Anal Chem. 49. Peichel CL, Sullivan ST, Liachko I, White MA. Improvement of the 1996;68(5):850–8. threespine stickleback genome using a Hi-C-based proximity-guided 76. Rappsilber J, Ishihama Y, Mann M. Stop and go extraction tips for assembly. J Hered. 2017;108(6):693–700. matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/ 50. Barrett J, Jeferies JR, Brophy PM. Parasite proteomics. Parasitol Today. MS sample pretreatment in proteomics. Anal Chem. 2003;75(3):663–70. 2000;16(9):400–3. 77. Rappsilber J, Mann M, Ishihama Y. Protocol for micro-purifcation, 51. Harnett W. Secretory products of helminth parasites as immunomodu‑ enrichment, pre-fractionation and storage of peptides for proteomics lators. Mol Biochem Parasitol. 2014;195(1):130–6. using StageTips. Nat Protoc. 2007;2(1):1896–906. 52. Greenbaum D, Luscombe NM, Jansen R, Qian J, Gerstein M. Interrelating 78. Cox J, Mann M. MaxQuant enables high peptide identifcation rates, diferent types of genomic data, from proteome to secretome: ’oming individualized p.p.b.-range mass accuracies and proteome-wide protein in on function. Genome Res. 2001;11(9):1463–8. quantifcation. Nat Biotechnol. 2008;26(1):1367–72. 53. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone 79. Pathan M, Keerthikumar S, Ang C-S, Gangoda L, Quek CYJ, Wil‑ reads using repeat graphs. Nat Biotechnol. 2019;37(5):540–6. liamson NA, et al. FunRich: An open access standalone functional 54. Lin Y, Yuan J, Kolmogorov M, Shen MW, Pevzner PA. Assembly of enrichment and interaction network analysis tool. Proteomics. long error-prone reads using de Bruijn graphs. Proc Natl Acad Sci. 2015;15(15):2597–601. 2016;113(52):E8396–405. 80. Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther 55. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. O, Brunak S, et al. SignalP 5.0 improves signal peptide predictions using Pilon: an integrated tool for comprehensive microbial variant detection deep neural networks. Nat Biotechnol. 2019;37(4):420–3. and genome assembly improvement. PLoS ONE. 2014;9(11):e112963. 81. Bendtsen JD, Jensen LJ, Blom N, Von Heijne G, Brunak S. Feature-based 56. Ishida T, Kinoshita K. PrDOS: prediction of disordered protein regions prediction of non-classical and leaderless protein secretion. Protein Eng from amino acid sequence. Nucleic Acids Res. 2007;35(1):W460–4. Des Sel. 2004;17(4):349–56. 57. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 82. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting trans‑ BUSCO: assessing genome assembly and annotation completeness membrane protein topology with a hidden Markov model: application with single-copy orthologs. Bioinformatics. 2015;31(1):3210–2. to complete genomes. J Mol Biol. 2001;305(3):567–80. 58. Smit A, Hubley R. RepeatModeler Open-1.0. Institute of Systems Biol‑ 83. Potter SC, Luciani A, Eddy SR, Park Y, Lopez R, Finn RD. HMMER web ogy, Seattle USA. . server: 2018 update. Nucleic Acids Res. 2018;46(1):200–4. 59. Hancock JM. RepeatMasker. Dictionary of Bioinformatics and Computa‑ 84. Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen tional Biology. 2004. M-Y, et al. Comparative protein structure modeling using MODELLER. 60. Hof KJ, Lomsadze A, Borodovsky M, Stanke M. Whole-Genome annota‑ Curr Protoc Protein Sci. 2007;2(1):1–30. tion with BRAKER. Methods Mol Biol. 2019;1962(1):65–95. 85. Laskowski RA, Rullmannn JA, MacArthur MW, Kaptein R, Thornton JM. 61. Hof KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M. BRAKER1: AQUA and PROCHECK-NMR: programs for checking the quality of Unsupervised RNA-Seq-Based genome annotation with GeneMark-ET protein structures solved by NMR. J Biomol NMR. 1996;8(4):477–86. and AUGUSTUS. Bioinformatics. 2016;32(5):767–9. 86. DeLano WL, Others. Pymol: An open-source molecular graphics tool. 62. Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and synteni‑ CCP4 Newsletter on protein crystallography. 2002;40(1):82–92. cally mapped cDNA alignments to improve de novo gene fnding. 87. Sievers F, Higgins DG. Clustal Omega for making accurate alignments of Bioinformatics. 2008;24(5):637–44. many protein sequences. Protein Sci. 2018;27(1):135–45. 63. Stanke M, Schöfmann O, Morgenstern B, Waack S. Gene prediction in 88. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel eukaryotes with a generalized hidden Markov model that uses hints O. New algorithms and methods to estimate maximum-likelihood from external sources. BMC Bioinform. 2006;7(1):62–73. phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 64. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The 2010;59(3):307–21. Sequence Alignment/Map format and SAMtools. Bioinformatics. 89. Gouet P, Courcelle E, Stuart DI, Métoz F. ESPript: analysis of multiple 2009;25(16):2078–9. sequence alignments in PostScript. Bioinformatics. 1999;15(4):305–8. 65. Barnett DW, Garrison EK, Quinlan AR, Stromberg MP, Marth GT. 90. Rambaut A. FigTree 1.4. 2 software. Institute of Evolutionary Biology, BamTools: a C API and toolkit for analyzing and managing BAM fles. University of Edinburgh. 2014. Bioinformatics. 2011;27(1):1691–2. 91. Luo F, Yin M, Mo X, Sun C, Wu Q, Zhu B, et al. An improved genome 66. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local align‑ assembly of the fuke Schistosoma japonicum. PLOS Negl Trop Dis. ment search tool. J Mol Biol. 1990;215(3):403–10. 2019;13(8):e0007612. 67. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer 92. Stroehlein AJ, Korhonen PK, Chong TM, Lim YL, Chan KG, Webster B, K, et al. BLAST : architecture and applications. BMC Bioinform. et al. High-quality Schistosoma haematobium genome achieved by sin‑ 2009;10(1):421–30.+ gle-molecule and long-range sequencing. Gigascience. 2019;8(9):1–24. 68. Anderson I, Brass A. Searching DNA databases for similarities to 93. Maule AG, Marks NJ. Parasitic fatworms: Molecular biology, biochemis‑ DNA sequences: when is a match signifcant? Bioinformatics. try, immunology and physiology. Chapter 1. CABI Publishing. 2006. 448 1998;14(4):349–56. p. 69. Eddy SR. Accelerated Profle HMM Searches. PLoS Comput Biol. 94. Azadi S, Molday LL, Molday RS. RD3, the protein associated with Leber 2011;7(10):e1002195. congenital amaurosis type 12, is required for guanylate cyclase trafck‑ 70. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. ing in photoreceptor cells. Proc Natl Acad Sci. 2010;107(49):21158–63. The Pfam protein families database: towards a more sustainable future. 95. Shen Y, Peterson AS. Atrophins’ emerging roles in development and Nucleic Acids Res. 2016;44(1):279–85. neurodegenerative disease. Cell Mol Life Sci. 2009;66(1):437–46. 71. Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mer‑ 96. Vyas S, Chesarone-Cataldo M, Todorova T, Huang Y-H, Chang P. A ing C, et al. Fast Genome-Wide functional annotation through orthol‑ systematic analysis of the PARP protein family identifes new functions ogy assignment by eggNOG-Mapper. Mol Biol Evol. 2017;34(8):2115–22. critical for cell physiology. Nat Commun. 2013;4(1):2240–53. Berger et al. Parasites Vectors (2021) 14:436 Page 20 of 20

97. Batugedara HM, Argueta D, Jang JC, Lu D, Macchietto M, Kaur J, et al. 112. Hartmann S, Lucius R. Modulation of host immune responses by nema‑ Host- and helminth-derived endocannabinoids that have efects tode cystatins. Int J Parasitol. 2003;33(11):1291–302. on host immunity are generated during infection. Infect Immun. 113. Hancock K, Pattabhi S, Greene RM, Yushak ML, Williams F, Khan A, et al. 2018;86(1):1–26. Characterization and cloning of GP50, a Taenia solium antigen diagnos‑ 98. Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate tic for cysticercosis. Mol Biochem Parasitol. 2004;133(1):115–24. core genes in eukaryotic genomes. Bioinformatics. 2007;23(9):1061–7. 114. Shi L, Quick M, Zhao Y, Weinstein H, Javitch JA. The Mechanism of 99. Barber I, Svensson PA. Efects of experimental Schistocephalus solidus a neurotransmitter:sodium symporter—Inward release of Na and infections on growth, morphology and sexual development of substrate is triggered by substrate in a second binding site. Mol Cell. female three-spined sticklebacks Gasterosteus aculeatus. Parasitology. 2008;30(1):667–77. 2003;126(4):359–67. 115. Koide A, Bailey CW, Huang X, Koide S. The fbronectin type III domain as 100. Hopkins CA. Studies on cestode metabolism. I. Glycogen metabolism in a scafold for novel binding proteins. J Mol Biol. 1998;284(4):1141–51. Schistocephalus solidus in vivo. J Parasitol. 1950;36(4):384–90. 116. De Vries M, Cooper HM. Emerging roles for neogenin and its ligands in 101. Bourne HR, Sanders DA, McCormick F. The GTPase superfamily: a con‑ CNS development. J Neurochem. 2008;106(4):1483–92. served switch for diverse cell functions. Nature. 1990;348(6297):125–32. 117. Tonks NK. Protein tyrosine phosphatases: from genes, to function, to 102. Takai Y, Sasaki T, Matozaki T. Small GTP-binding proteins. Physiol Rev. disease. Nat Rev Mol Cell Biol. 2006;7(11):833–46. 2001;81(1):153–208. 118. Johnstone IL. The cuticle of the nematode Caenorhabditis elegans: a 103. Mitra BN, Saito-Nakano Y, Nakada-Tsukui K, Sato D, Nozaki T. Rab11B complex collagen structure. BioEssays. 1994;16(3):171–8. small GTPase regulates secretion of cysteine proteases in the 119. Zhou Q, Ben-Efraim I, Bigcas J-L, Junqueira D, Wiedmer T, Sims PJ. enteric protozoan parasite Entamoeba histolytica. Cell Microbiol. Phospholipid scramblase 1 binds to the promoter region of the inositol 2007;9(9):2112–25. 1,4,5-triphosphate receptor type 1 gene to enhance its expression. J 104. Ward GE, Tilney LG, Langsley G. Rab GTPases and the unusual secretory Biol Chem. 2005;280(41):35062–8. pathway of plasmodium. Parasitol Today. 1997;13(2):57–62. 120. Ohya K, Takahara Y, Kuroda E, Koyasu S, Hagiwara S, Sakamoto M, et al. 105. Vyas A, Kim S-K, Giacomini N, Boothroyd JC, Sapolsky RM. Behavioral Chlamydophila felis CF0218 is a novel TMH family protein with poten‑ changes induced by Toxoplasma infection of rodents are highly specifc tial as a diagnostic antigen for diagnosis of C. felis infection. Clin Vacc to aversion of cat odors. Proc Natl Acad Sci. 2007;104(15):6442–7. Immunol. 2008;15(10):1606–15. 106. Kremer K, Kamin D, Rittweger E, Wilkes J, Flammer H, Mahler S, 121. Wu H, Wang C, Jiang C, Xie Y, Liu L, Song Y, et al. Localization and char‑ et al. An overexpression screen of Toxoplasma gondii Rab-GTPases acterization of two putative TMH family proteins in Chlamydia psittaci. reveals distinct transport routes to the micronemes. PLoS Pathogens. Microbiol Res. 2016;183(1):19–25. 2013;9(3):e1003213. 122. Orr TSC, Hopkins CA, Charles GH. Host specifcity and rejection of Schis- 107. Struck TH, Wey-Fabrizius AR, Golombek A, Hering L, Weigert A, Bleidorn tocephalus solidus. Parasitology. 1969;59(1):683–90. C, et al. Platyzoan paraphyly based on phylogenomic data supports a 123. Pavey SA, Bernatchez L, Aubin-Horth N, Landry CR. What is needed for noncoelomate ancestry of spiralia. Mol Biol Evol. 2014;31(7):1833–49. next-generation ecological and evolutionary genomics? Trends Ecol 108. Sachkova MY, Slavokhotova AA, Grishin EV, Vassilevski AA. Structure of Evol. 2012;27(12):673–8. the yellow sac spider Cheiracanthium punctorium genes provides clues 124. Nishimura N, Heins DC, Andersen RO, Barber I, Cresko WA. Distinct to evolution of insecticidal two-domain knottin toxins. Insect Mol Biol. lineages of Schistocephalus parasites in threespine and ninespine 2014;23(4):527–38. stickleback hosts revealed by DNA sequence analysis. PLoS ONE. 109. Michon P, Stevens JR, Kaneko O, Adams JH. Evolutionary relationships 2011;6(7):e22505. of conserved cysteine-rich motifs in adhesive molecules of malaria 125. Heins DC, Ulinski B, Johnson J, Baker JA. Efect of the cestode macropar‑ parasites. Mol Biol Evol. 2002;19(7):1128–42. asite Schistocephalus pungitii on the reproductive success of ninespine 110. Anandarajah EM, Ditgen D, Hansmann J, Erttmann KD, Liebau E, stickleback Pungitius pungitius. Can J Zool. 2004;82(1):1731–7. Brattig NW. SPARC (secreted protein acidic and rich in cysteine) of the intestinal nematode Strongyloides ratti is involved in mucosa-associated parasite-host interaction. Mol Biochem Parasitol. 2016;207(1):75–83. Publisher’s Note 111. Robinson MW, Massie DH, Connolly B. Secretion and processing of Springer Nature remains neutral with regard to jurisdictional claims in pub‑ a novel multi-domain cystatin-like protein by intracellular stages of lished maps and institutional afliations. Trichinella spiralis. Mol Biochem Parasitol. 2007;151(1):9–17.

Ready to submit your research ? Choose BMC and benefit from:

• fast, convenient online submission • thorough peer review by experienced researchers in your field • rapid publication on acceptance • support for research data, including large and complex data types • gold Open Access which fosters wider collaboration and increased citations • maximum visibility for your research: over 100M website views per year

At BMC, research is always in progress.

Learn more biomedcentral.com/submissions