Journal of General Virology (2011), 92, 1870–1879 DOI 10.1099/vir.0.030585-0

Discovery and initial analysis of novel viral genomes in the soybean cyst nematode

Sadia Bekal,1 Leslie L. Domier,2 Terry L. Niblack1 and Kris N. Lambert1

1Department of Crop Sciences, University of Illinois, Urbana, IL 61810, USA
2United States Department of Agriculture, Agricultural Research Service, Department of Crop Sciences, University of Illinois, Urbana, IL 61810, USA

Correspondence: Kris N. Lambert, [email protected]

Nematodes are the most abundant multicellular animals on earth, yet little is known about their natural viral pathogens. To date, only two nematode virus genomes have been reported. Consequently, nematode viruses have been overlooked as important biotic factors in the study of nematode ecology. Here, we show that one plant parasitic nematode species, Heterodera glycines, the soybean cyst nematode (SCN), harbours four different RNA viruses. The nematode virus genomes were discovered in the SCN transcriptome after high-throughput sequencing and assembly. All four viruses have negative-sense RNA genomes, and are distantly related to nyaviruses and bornaviruses, rhabdoviruses, bunyaviruses and tenuiviruses. Some members of these families replicate in and are vectored by insects, and can cause significant diseases in animals and plants. The novel viral sequences were detected in both eggs and the second juvenile stage of SCN, suggesting that these viruses are transmitted vertically. While there was no evidence of integration of viral sequences into the nematode genome, we detected transcripts from these viruses by using quantitative PCR. These data are the first finding of virus genomes in parasitic nematodes. This discovery highlights the need for further exploration for nematode viruses in all trophic groups of these diverse and abundant animals, to determine how the presence of these viruses affects the fitness of the nematode, strategies of viral transmission and mechanisms of viral pathogenesis.

Received 15 January 2011; Accepted 11 April 2011

INTRODUCTION

The use of genomic approaches for viral discovery has facilitated the rapid identification of unknown viruses from previously recalcitrant organisms and environments (Culley et al., 2006; Edwards & Rohwer, 2005). One group of organisms that has not been adequately explored for infectious viruses is the roundworms, forming the phylum Nematoda. This is surprising since nematodes are among the most highly studied and numerous animals on earth (Riddle et al., 1997; Telford et al., 2008; Wilson, 2003). Plant parasitic nematodes are known to vector plant viruses, but these viruses do not replicate within their nematode vectors (Brown et al., 1995; Gray & Banerjee, 1999). Several lines of evidence suggest that roundworms are naturally infected by viruses. The model organism Caenorhabditis elegans has been experimentally infected with broad-host-range viruses, and such infections have been shown to be useful tools for dissecting the genetics of host–virus interactions (Liu et al., 2006; Lu et al., 2005; Shaham, 2006). Recently, two nodaviruses have been identified in natural populations of C. elegans and Caenorhabditis briggsae and their genomes sequenced. The discovery of natural nematode viruses supports the assertion that nematode viruses might be common but overlooked (Félix et al., 2011).

Other than C. elegans and C. briggsae, very little is known about nematode viruses. An iridovirus was found in the insect parasitic nematode (Thaumamermis cosgrovei) and its isopod insect hosts Porcellio scaber and Armadillidium vulgare (Poinar et al., 1980). Another study described the infection of root-knot nematodes with a virus-like pathogen that caused visible and debilitating symptoms in nematodes; however, no viral particles were visualized (Loewenberg et al., 1959). In addition, a number of electron microscopic studies reported virus-like particles in nematodes, but no biochemical or genomic analysis was conducted (Foor, 1972; Poinar & Hess, 1977; Zuckerman et al., 1973). This dearth of information on nematode-infecting viruses may be due to the difficulty in culturing or collecting large numbers of infected parasitic nematodes required for traditional virus characterization.

The GenBank/EMBL/DDBJ accession numbers for the novel SCN viral sequences reported in this study are HM849038–HM849041.

The soybean cyst nematode (SCN), Heterodera glycines, is a microscopic sedentary obligate plant pathogen that feeds upon the roots of soybean and other closely related plants. SCN is the most destructive pathogen of soybean in the

1870 030585 G 2011 SGM Printed in Great Britain RNA viruses in nematodes

USA and has a complicated interaction with its host (Wrather & Koenning, 2006). SCN hatches from its egg as a pre-parasitic second-stage juvenile (J2), which then migrates through the soil, finds a root, burrows into it and initiates feeding on vascular cells. The nematode completes three more moults within the root to form adult males and females. Determining whether a nematode is infected by a virus is a difficult task due to several factors: SCN's intra-plant parasitic life cycle and microscopic size make it difficult to observe or collect the parasitic stages free of plant material and in quantities sufficient for observation of possible symptoms. Fortunately, new methods of high-throughput DNA sequencing are facilitating viral genome discovery. These methods produce immense amounts of data, require relatively low amounts of starting material (Ansorge, 2009), and allow many potentially infected individuals to be tested in a single experiment. Using these new methods, we describe the discovery of four nematode viral genomes from SCN that were identified during high-throughput sequencing of the SCN transcriptome. To our knowledge, this is the first description of nematode virus genomes in a parasitic nematode.

RESULTS

Identification of virus genomes in the H. glycines transcriptome

Because of the importance of SCN as an agricultural pathogen, a project was initiated to sequence the nematode's transcriptome. Poly(A)+ RNA was extracted from eggs and surface-sterilized J2, and the resulting cDNA was sequenced using 454 and Illumina sequencers. Subsequently, the cDNA was assembled into contigs, which were compared to known proteins in the National Center for Biotechnology Information (NCBI) databases using BLASTX (Altschul et al., 1990). Several large (>6 kb) contigs containing regions similar to viral RNA-dependent RNA polymerase (RdRP) enzymes were identified. Further bioinformatic investigation and assembly revealed nearly complete genomes of four different, likely ssRNA viruses present in both egg and J2 stages: SCN nyavirus (ScNV), SCN rhabdovirus (ScRV), SCN phlebovirus (ScPV) and SCN tenuivirus (ScTV) (Table 1, Fig. 1). The Illumina cDNA sequences were derived from a paired-end library; therefore, when the paired reads were mapped back on to the SCN virus genomes, the quality of the virus genome assemblies could be assessed by noting the frequency of correctly aligned paired cDNA sequences. The large number of SCN virus sequences (883- to 3310-fold coverage) produced continuous alignments of paired ends showing no discontinuities in the alignments (data not shown), indicating the sequences were correctly assembled. The presence of long ORFs also supports the correct assembly of the viral genomes (Table 2).

We designated the first virus genome detected in the SCN transcriptome as ScNV after the top BLASTX result showed this genome was similar to the Midway virus (MIDWV) (Takahashi et al., 1982). MIDWV and Nyamanini virus (NYMV) are closely related members of the proposed genus Nyavirus, and while related to viruses in the family Bornaviridae, appear to be a distinct taxonomic unit (Mihindukulasuriya et al., 2009). The assembled ScNV RNA genome sequence was 11 359 nt in length and had a 1261-fold total coverage when the egg and J2 matching Illumina reads were combined, suggesting abundant replication (Table 1). ScNV is predicted to have at least five ORFs (Fig. 1), which is the minimal number of ORFs needed for a virus in the order Mononegavirales. The arrangement of ORFs in the genomes of members of the order is nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and the RdRP large non-structural protein (L), with occasional additional ORFs added upstream of the L-encoding ORF depending on the virus (Neumann et al., 2002). In ScNV the predicted amino acid sequences of ORFs I, IV and V showed significant similarity to N of Borna disease virus (BDV), and the G and L proteins of MIDWV, respectively (Table 2). The predicted ORF IV amino acid sequence contains a putative N-glycosylation site, an amino-terminal signal peptide sequence and a carboxyl-terminal membrane anchor sequence, which are consistent with ORF IV encoding a membrane-anchored G protein. ORFs II and III showed no significant similarity to proteins in the NCBI database. However, the predicted isoelectric point of the protein encoded by ORF II is 5.1, making it the most acidic protein

Table 1. Properties of SCN-associated virus-like genomes and genome fragments

SCN virus          Most similar virus                    E value*   Coverage†            Length (nt)
ScNV (HM849038‡)   Midway virus (MIDWV)                  5e-160     548E, 713J, 1261T    11 359
ScRV (HM849039‡)   Northern cereal mosaic virus (NCMV)   3e-102     568E, 315J, 883T     12 698
ScPV (HM849040‡)   Uukuniemi virus (UUKV)                0          460E, 2850J, 3310T   6705§
ScTV (HM849041‡)   Rice stripe virus (RSV)               3e-98      393E, 815J, 1208T    8868§

*E value determined by BLASTX of the entire SCN viral genome.
†The fold coverage of each viral genome is shown for eggs (E), second-stage juveniles (J) and the total sum of both (T).
‡GenBank accession numbers.
§Viruses with multi-partite ssRNA genomes; only the portion of the L segment of each genome is listed.
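The fold coverage values in Table 1 can be checked with simple arithmetic: the egg and J2 values should sum to the total, and, as the Methods describe, coverage is the total number of nucleotides in matching reads divided by genome length. The following is a minimal sketch of that arithmetic, assuming fixed 75 bp reads; the function names and the derived read counts are illustrative and are not values reported in the study.

```python
# Sketch only: function and variable names are illustrative, not from the study.


def fold_coverage(total_read_nt, genome_length_nt):
    """Fold coverage = total nucleotides in matching reads / genome length."""
    if genome_length_nt <= 0:
        raise ValueError("genome length must be positive")
    return total_read_nt / genome_length_nt


def reads_for_coverage(fold, genome_length_nt, read_length_nt=75):
    """Approximate number of fixed-length reads that would give `fold` coverage."""
    return round(fold * genome_length_nt / read_length_nt)


# Egg (E), J2 (J) and total (T) fold coverage plus genome length, from Table 1.
TABLE1 = {
    "ScNV": {"E": 548, "J": 713, "T": 1261, "length_nt": 11359},
    "ScRV": {"E": 568, "J": 315, "T": 883, "length_nt": 12698},
    "ScPV": {"E": 460, "J": 2850, "T": 3310, "length_nt": 6705},
    "ScTV": {"E": 393, "J": 815, "T": 1208, "length_nt": 8868},
}


def totals_consistent(table):
    """True when egg + J2 coverage equals the reported total for every virus."""
    return all(row["E"] + row["J"] == row["T"] for row in table.values())


if __name__ == "__main__":
    print("E + J == T for all viruses:", totals_consistent(TABLE1))
    for name, row in TABLE1.items():
        n_reads = reads_for_coverage(row["T"], row["length_nt"])
        print(f"{name}: ~{n_reads} reads of 75 bp for {row['T']}-fold coverage")
```

The read counts printed here are back-calculated estimates under the fixed-read-length assumption, so they are useful only as an order-of-magnitude check.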

Fig. 1. Organization of SCN-associated virus genomes or genome fragments. Numbers above indicate the size of the genomes in nucleotides. Arrows on each genome indicate the position, size and orientation of ORFs, which are labelled with roman numerals. Genome maps are presented for ScNV, Midway virus (MIDWV), ScRV and Northern cereal mosaic virus (NCMV) and L-protein coding sequence segments of ScTV and ScPV. Glycoproteins (G), nucleoproteins (N) and RdRP genes (L) are labelled when significant.

in ScNV (Table 2) and a probable P protein (Mihindukulasuriya et al., 2009). By a process of elimination and its position in the genome, it was assumed ORF III encodes the M protein.

Based upon BLASTX scores and neighbour-joining phylogenetic analysis of RdRP amino acid sequences, ScNV was most similar to viruses in the order Mononegavirales, which are enveloped ssRNA viruses with monopartite genomes of negative polarity. As mentioned above, MIDWV showed the greatest similarity to ScNV, except that ScNV contained one less ORF than MIDWV in the 5′ half of the genome between the N and G protein coding sequences (Fig. 1). ScNV also consistently grouped with NYMV and, to a lesser extent, BDV and other members of the family Bornaviridae (Fig. 2). While the genome organizations of ScNV and BDV were similar, the ScNV genome was considerably longer than BDV (11.4 vs 8.9 kb, respectively). The ScNV genome had intergenic repeated gene-junction sequences downstream of each ORF (Table 3), similar to transcription termination elements found in viruses of the order Mononegavirales (Cubitt et al., 1994;

Table 2. Predicted properties of SCN-associated virus ORFs

Virus   ORF       Length (nt)   Size (aa)   Mass (kDa)   pI*     Top BLASTX match†                  E value
ScNV    ORF I     1239          413         44.67        6.76    N protein (BDV)                    6e-05
ScNV    ORF II    846           282         29.53        5.1     ND                                 –
ScNV    ORF III   243           81          9.12         10.07   ND                                 –
ScNV    ORF IV    1686          562         61.49‡§      6.28    G protein (MIDWV)                  0.004
ScNV    ORF V     6261          2087        236.43       8.79    RdRP (MIDWV)                       6e-160
ScRV    ORF I     1737          579         64.44        6.16    ND                                 –
ScRV    ORF II    888           296         32.58        4.66    ND                                 –
ScRV    ORF III   615           205         21.86        10.29   ND                                 –
ScRV    ORF IV    2367          789         86.50‡§      9.42    Glycoamidase (Candida albicans)    1e-07
ScRV    ORF V     6588          2196        243.41§      8.50    RdRP (NCMV)                        2e-102
ScPV    ORF I     6533          2176        247.98§      7.51    RdRP (UUKV)                        0.0
ScTV    ORF I     8661          2887        321.64§      6.68    RdRP (UUKV)                        2e-129

*The pI, ORF length, size and mass were calculated using CLC Genomics Workbench.
†Data are not provided for virus ORFs where no significant homology was detected (ND).
‡Contains an N-terminal signal peptide and C-terminal membrane anchor.
§Contains N-glycosylation sites.
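The relationship between the ORF length, protein size and mass columns of Table 2 can be illustrated with a short calculation. The sketch below divides ORF length by three to obtain the number of codons and estimates mass with a rule-of-thumb average of about 110 Da per residue. That average is an assumption introduced here for illustration; the study itself reports values calculated with CLC Genomics Workbench, which uses the actual amino acid composition.

```python
# Sketch only: the 0.110 kDa average residue mass is a rule-of-thumb assumption,
# not the method used in the study (which used CLC Genomics Workbench).
AVG_RESIDUE_MASS_KDA = 0.110


def orf_size_aa(orf_length_nt):
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length_nt // 3


def approx_mass_kda(size_aa):
    """Rough protein mass estimate from residue count."""
    return size_aa * AVG_RESIDUE_MASS_KDA


# (label, length in nt, size in aa, reported mass in kDa), copied from Table 2.
TABLE2_ROWS = [
    ("ScNV ORF I", 1239, 413, 44.67),
    ("ScNV ORF II", 846, 282, 29.53),
    ("ScNV ORF V", 6261, 2087, 236.43),
    ("ScRV ORF V", 6588, 2196, 243.41),
]

if __name__ == "__main__":
    for label, length_nt, size_aa, reported_kda in TABLE2_ROWS:
        estimate = approx_mass_kda(orf_size_aa(length_nt))
        print(f"{label}: {size_aa} aa, reported {reported_kda} kDa, "
              f"rough estimate {estimate:.1f} kDa")
```

For these rows the rough estimate lands within a few percent of the reported mass, which is about as close as a composition-independent average can be expected to get.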


Fig. 2. Phylogeny of the conserved region (aa 413–1229 of MIDWV; GenBank accession no. ACQ94979) of the predicted amino acid sequences of RdRp of the SCN-associated and related viruses. Viruses used in comparisons were: Barley yellow striate mosaic virus (BYSMV), Borna disease virus (BDV), Bovine ephemeral fever virus (BEFV), Bunyamwera virus (BUNV), Crimean-Congo hemorrhagic fever virus (CCHFV), Hantaan virus (HTNV), Human respiratory syncytial virus (HRSV), Lettuce necrotic yellows virus (LNYV), Maize mosaic virus (MMV), Measles virus (MeV), Northern cereal mosaic virus (NCMV), Nyamanini virus (NYMV), Orchid fleck virus (OFV), Rabies virus (RABV), Rice grassy stunt virus (RGSV), Rice stripe virus (RSV), Rift Valley fever virus (RVFV), Sandfly fever Naples virus (SFNV), Sonchus yellow net virus (SYNV), Tomato spotted wilt virus (TSWV), Uukuniemi virus (UUKV), Vesicular stomatitis Indiana virus (VSIV) and Zaire ebolavirus (ZEBOV). Genus and/or family designations for viruses are presented to the right of the phylogram. Significant bootstrap values are indicated as a percentage of 1000 iterations that supported that node. Bar indicates amino acid substitutions per site.

Mihindukulasuriya et al., 2009) and contained a consensus termination signal AGAAUUCUUUUUU where the poly(U) tract is used by the RdRP to add poly(A) tails to viral mRNAs via a stutter mechanism (Rassa et al., 2000; Schubert et al., 1980). In contrast, the putative transcription initiation sites were different from those of related viruses, but they were found upstream of each ORF as expected (Table 3). Overall, ScNV shows a closest fit to the genus Nyavirus proposed by Mihindukulasuriya et al. (2009).

Interestingly, a tBLASTX search of ScNV to NCBI's expressed sequence tags (EST) database showed three significant matches (E value 2e-133 to 1e-91) to two H. glycines (GenBank accession nos CD748291.1 and CB376231.1) and one Heterodera schachtii cDNA (GenBank accession no. CF101148.1) that was 77 % identical at the protein level. The presence of related sequences in the EST database suggests that ScNV is present in SCN populations in addition to those analysed here. The match to the closely related nematode species H. schachtii suggests it was infected by a virus very similar to ScNV, thus infection of cyst nematodes by viruses may be common.

The second nematode virus genome was designated ScRV, after its genome's top BLASTX match, the enveloped plant

Table 3. Intergenic sequences from ScNV and ScRV

Virus   Junction                                  Intergenic sequence*
ScNV    Transcript Start ORF I                    CUCGAUCGUAUCAAGG
ScNV    Transcript End ORF I/Start ORF II         AGAAUUCUUUUUU GCUCGGUUGCGUCGAGG
ScNV    Transcript End ORF II/Start ORF III       AGAAUUCUUUUUA GCUCGUUUGUGUCGAGG
ScNV    Transcript End ORF III/Start ORF IV       AGAAUUCUUUUUU GCUCGUUCGUGUCGAGG
ScNV    Transcript End ORF IV/Start ORF V         AGAAUUCUUUUUU GCUCGAUAGUAUACGAG
ScNV    Transcript End ORF V                      AGAAUUCUUUUUU
ScRV    Transcript Start ORF I                    GGUCGU
ScRV    Transcript End ORF I/Start ORF II         GAAAUUCUUUUUUU GGUCUU
ScRV    Transcript End ORF II/Start ORF III       GUUAUUCUUUUUUU GGUUGU
ScRV    Transcript End ORF III/Start IV           GUAAUUCUUUUUUU GGUCUU
ScRV    Transcript Start ORF V                    GACUUUUUUAUUAG GGUCUU
ScRV    Transcript End ORF IV                     GUAGUUCUUUUUUU GGUACU
ScRV    Transcript End ORF V                      GUUAUUAUUUUUUU

*Intergenic repeats were identified using the motif alignment tool MEME (Bailey et al., 2006).

virus, Northern cereal mosaic virus (NCMV). NCMV is also in the order Mononegavirales, but is a member of the family Rhabdoviridae, the genus Cytorhabdovirus. The ScRV genome was 12 698 nt and had an 883-fold total coverage of Illumina reads (Table 1), suggesting abundant replication in SCN. The ScRV genome contained five predicted ORFs similar to ScNV in size (Fig. 1). The predicted amino acid sequence of the largest ORF (V) in the ScRV genome was 46 % similar to RdRP or the L protein of NCMV (Table 2). The predicted amino acid sequences of ScRV ORFs I–III did not show similarity to N, P or M proteins or other virus-encoded proteins in the NCBI database, but we assume this is the order of the coding sequences in the ScRV genome based upon other rhabdoviruses with a similar number of ORFs (Quan et al., 2010). Like ScNV, ScRV ORF II is predicted to encode the only acidic protein (pI 4.66) and since it does not encode a signal peptide, it too may encode a phosphoprotein. The predicted amino acid sequence of the protein encoded by ORF IV had weak similarity to a glycoamidase (Table 2), contains four predicted N-glycosylation sites, and is predicted to have an amino-terminal signal peptide sequence and a putative carboxyl-terminal transmembrane region, suggesting that the protein functions as a viral membrane glycoprotein.

Phylogenetic analysis of the ScRV RdRP sequence, like the BLASTX results (Table 1), showed the closest relation of ScRV to cytorhabdoviruses (Fig. 2). The genome size of ScRV was consistent with those of other rhabdoviruses; however, the number of predicted ORFs in the ScRV genome differed from NCMV (Fig. 1). NCMV contains nine ORFs (Tanno et al., 2000), while the ScRV sequence contained only five, suggesting that ScRV was significantly diverged from NCMV (Fig. 2). The ScRV sequence contained gene-junction repeats downstream of each ORF (Table 3) that were very similar to the transcription termination sites found in NCMV and related viruses (Revill et al., 2005). Putative transcription initiation sites were identified upstream of each ORF; however, the ORF V transcription initiation site appeared before the ORF IV transcription termination site, making it atypical (Table 3).

The third virus-like sequence assembled, ScPV, was 6705 nt in length, contained a single large ORF, and had a total of 3310-fold coverage of Illumina reads, often making it the most abundant virus sequence of the four (Table 1). ScPV was most closely related to enveloped viruses in the family Bunyaviridae, in the genus Phlebovirus (Fig. 2, Table 1). Phleboviruses have tripartite negative-sense ssRNA genomes, but because the protein products of the two shorter RNAs (M and S) are less conserved, only the L segment of the ScPV genome containing the RdRP coding sequence was detected (Fig. 1, Table 2). ScPV grouped with the type virus in this family (Bunyamwera virus; Fig. 2) and was most similar to Uukuniemi virus (UUKV) (Flick et al., 2002).

The final virus-like genome fragment detected, ScTV, was 8868 nt in length, contained a single ORF and had 1208-fold total coverage. ScTV was similar to UUKV based upon the BLASTX results and had a larger L-protein size than ScPV (Tables 1 and 2, Fig. 1). Phylogenetic analysis of the conserved RdRP region indicated that ScTV was related to viruses in the genus Tenuivirus, such as Rice stripe virus (RSV) and Rice grassy stunt virus (RGSV; Fig. 2). Like tenuiviruses, the ScTV L-protein was larger than L proteins of most phleboviruses (Toriyama et al., 1994) (Fig. 1). Like UUKV, the tenuiviruses have multipartite genomes, so only the genome segment containing the L protein was detected and is shown in Tables 1 and 2.

Detection of virus positive-strand RNA in J2-stage nematodes

To confirm virus replication in SCN, we next asked whether the predicted viral ORFs were actively transcribed. To this end, we assayed for the presence of viral RdRP


Table 4. Detection of SCN virus RdRP mRNAs by PSRT-QPCR in J2-stage nematodes

PGP, a SCN transcript homologous to the C. elegans P-glycoprotein family member PGP-3 (GenBank accession no. NP509901); ND, not detected.

        J2 TN10 (with U-primer) Ct*†    J2 TN10 (no U-primer) Ct     J2 TN10 (no RT) Ct*
Assay   Replicate 1    Replicate 2      Replicate 1   Replicate 2    Replicate 1   Replicate 2
PGP     17.92 (0.17)   19.18 (0.23)     ND            ND             39.46‡        31.20 (0.14)
ScNV    17.99 (0.53)   17.31 (0.04)     ND            ND             ND            ND
ScRV    18.72 (0.10)   18.67 (0.03)     ND            ND             ND            ND
ScPV    16.90 (0.05)   16.99 (0.18)     ND            ND             ND            ND
ScTV    17.06 (0.15)   18.16 (0.21)     ND            ND             ND            ND

*Threshold cycle (Ct) numbers are a mean of three replicates; the SD are shown in parentheses.
†Two biological replicates of SCN inbred strain TN10 samples were tested.
‡Only one of the three replicates produced a detectable signal.

mRNAs using a positive strand-specific reverse transcriptase quantitative PCR (PSRT-QPCR) assay. A highly expressed SCN gene, P-glycoprotein-3 (PGP-3), was used as a positive control. In this assay a universal primer sequence was added to a reverse primer that hybridizes to the viral mRNA encoding the RdRP. After primer hybridization, reverse transcriptase (RT) was used to synthesize cDNA using the mRNA as a template. Upon completion of the RT reaction the cDNA was detected via SYBR green-based real-time PCR using a virus-specific forward primer and the universal primer. Two negative controls were included: no universal tagged primer and no RT. The experiment was replicated using two different J2 RNA samples. In the PSRT-QPCR tests, all four viruses and the control gene in both RNA samples were detected (Table 4). The PSRT-QPCR data showed similar levels of viral signals between the two nematode RNA samples tested (Table 4). The control PGP-3 assay detected minor DNA contamination in the minus RT control, a common problem in RT-QPCR due to the high sensitivity of the assay, but the 4-log lower signal does not influence the experiment's interpretation. The RNA samples were extracted over a 1 year period, which indicates that the SCN virus infections are persistent in the inbred SCN population. The detection of positive strands for all SCN viruses indicates the viruses are replicating in J2-stage nematodes.

Detection of virus sequences in eggs and J2-stage nematodes

To confirm the SCN viruses were present in different nematode populations and developmental stages, TaqMan assays designed to detect viral RdRP were used to test RNA from SCN eggs and J2-stage nematodes via RT-QPCR. As a positive control, we also designed an RT-QPCR assay to detect the mRNA from PGP-3. Subsequent RT-QPCR analysis detected all four viruses in both egg and J2-stage nematodes at a level often higher than the PGP control. The RT-QPCR data showed variable levels of viral signals within and between the five SCN RNA samples tested (Table 5). The RNA samples were extracted from SCN over a period of 4 years, which indicated that the SCN viral infections are long lived and stable in the nematode populations.

These QPCR data support the Illumina data that suggested the putative SCN viral transcripts were very prevalent in the nematode transcriptome [Illumina sequence data with 883–3310-fold total coverage of the viral genomes (Table 1) compared with 225–304-fold (529 total) coverage of PGP-3]. However, the QPCR data are probably more accurate since the Illumina data were derived from normalized cDNA libraries and the absolute number of reads probably underestimates the coverage.

Viruses are not integrated into the nematode genome

Because sequences homologous to transcripts of RNA viruses in the order Mononegavirales have been detected integrated into host genomes and are sometimes actively transcribed (Belyi et al., 2010; Taylor et al., 2010), we assessed the possibility that the putative SCN viruses were integrated into the 100 Mb nematode genome (Opperman & Bird, 1998). SCN genomic sequences generated using a SOLiD sequencer were aligned to the viral sequences. In this experiment, a total of 2.7×10^8 50 bp reads of genomic DNA from SCN population TN10 (~100-fold coverage) were aligned to all four SCN viral genomes and to a section of the SCN genome. None of the SOLiD fragments showed significant matches to the SCN virus sequences, which represented nearly 37 000 bases in total. However, when the same SOLiD data were aligned to a much shorter ~5000 bp single-copy region of the SCN genome (Craig et al., 2008), 1594 SOLiD reads matched the template. A similar result was obtained when SOLiD data from SCN population TN20 were used (data not shown). These data indicate that the SCN viruses are not integrated into the SCN genome in either of the two nematode populations tested, but do not rule out the possibility that other virus sequences may be present in the SCN genome.

Table 5. Detection of SCN virus RdRP mRNAs by PSRT-QPCR in eggs and J2-stage nematodes

ND, Not detected.

Assay   TN10_egg Ct*†   J2 TN10_1 Ct*†   J2 TN10_2 Ct*†   TN16 Ct*†      TN20 Ct*†
PGP     25.69 (0.03)    23.56 (0.03)     21.56 (0.049)    24.23 (0.32)   22.50 (0.01)
ScNV    14.9 (0.04)     19.96 (0.02)     14.48 (0.12)     17.52 (0.12)   14.38 (0.17)
ScRV    22.73 (0.06)    26.51 (0.10)     19.64 (0.292)    26.29 (0.23)   23.47 (0.12)
ScPV    16.53 (0.03)    23.66 (0.19)     15.06 (0.162)    18.54 (0.07)   ND
ScTV    18.68 (0.01)    ND               17.52 (0.05)     23.47 (0.32)   17.03 (0.231)

*Threshold cycle (Ct) numbers are a mean of three replicates; the SD are shown in parentheses.
†SCN inbred populations are designated TN10_egg (RNA extracted 2009), TN10_1 (RNA extracted 2009), TN10_2 (RNA extracted 2010), TN16 (RNA extracted 2006) and TN20 (RNA extracted 2009).
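Because a lower Ct means more template, the Ct values in Tables 4 and 5 can be turned into rough fold differences. The sketch below assumes ideal amplification (a doubling per cycle, i.e. an efficiency factor of 2.0) and comparable thresholds between assays; neither assumption is stated in the study, and the function names are illustrative. Under these assumptions, the gap between the PGP signal with RT (Ct 19.18) and the minus-RT control (Ct 31.20) in Table 4 corresponds to roughly 3.6 log10 units, the same order of magnitude as the approximately 4-log difference described in the text.

```python
import math

# Sketch only: assumes perfect doubling per PCR cycle (efficiency = 2.0),
# which is an idealization and not a value reported in the study.


def fold_difference(ct_reference, ct_target, efficiency=2.0):
    """Abundance of the target relative to the reference from their Ct values."""
    return efficiency ** (ct_reference - ct_target)


def relative_to_control(cts, control="PGP"):
    """Fold abundance of each assay relative to the control assay."""
    reference = cts[control]
    return {assay: fold_difference(reference, ct)
            for assay, ct in cts.items() if assay != control}


# Mean Ct values for the TN10_egg sample, copied from Table 5.
TABLE5_TN10_EGG = {"PGP": 25.69, "ScNV": 14.9, "ScRV": 22.73,
                   "ScPV": 16.53, "ScTV": 18.68}

if __name__ == "__main__":
    for assay, fold in relative_to_control(TABLE5_TN10_EGG).items():
        print(f"{assay}: ~{fold:.0f}-fold relative to PGP")
    # Table 4, PGP replicate 2: minus-RT control (31.20) vs with RT (19.18).
    log_gap = math.log10(fold_difference(31.20, 19.18))
    print(f"PGP signal above minus-RT background: ~{log_gap:.1f} log10 units")
```

Real assays rarely reach exactly 100 % efficiency and the viral and PGP primer sets may differ, so these numbers are best read as indicative rankings, not absolute quantities.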

DISCUSSION

In the past, nematodes were thought to be poor hosts for viruses, primarily because few reports of nematode-infecting viruses existed in the literature. Here, the identification of two viral genome sequences and two L-polymerase-coding segments in SCN, and two in C. elegans and C. briggsae (Félix et al., 2011) suggests that at least some nematode species, and possibly others, are susceptible to virus infection. It is unlikely that the sequences of the viruses detected here were derived from contaminating organisms for several reasons. (i) The SCN virus genomes were assembled from different stage nematodes from either SCN egg or J2 cDNA libraries. (ii) They were detected in multiple inbred populations by RT-QPCR. (iii) The migratory J2-stage nematodes were decontaminated through a sterile sand column in the presence of SDS. This cleaning process physically abrades microbial contaminants from their surface (Craig et al., 2008). Therefore, all J2 SCN used in these sequencing projects were microbe-free on the surface. (iv) For both eggs and J2, often the most abundant sequences detected in the sequence analysis were those of the viruses. Indeed, their genomes are more abundant than most SCN mRNAs, thus the viruses are not likely to be minor contaminants of the SCN RNA preparations. Our PSRT-QPCR results also confirm SCN viral genome replication in the J2 stage and support the relatively high abundance of SCN viral mRNAs. (v) All viruses detected in SCN are eukaryotic viruses not found in prokaryote contaminants.

Our alignments of SCN genomic DNA to the SCN virus sequences showed that the viruses are not integrated into the SCN genome so they cannot be remnants of ancient viruses. Belyi et al. (2010) hypothesized that integrated virus sequences might provide a selective advantage against infection by highly lethal BDV-like viruses, via an RNA interference-based inactivation of the virus, which, because they replicate in the nucleus, may have had greater opportunities to integrate into host genomes. It is possible that the SCN-associated members of the order Mononegavirales are not highly lethal and/or do not replicate in the nucleus. For example, ScRV is most closely related to cytorhabdoviruses that replicate in the cytoplasm.

Nematodes and insects are in the same superphylum, Ecdysozoa (Telford et al., 2008), thus it is not surprising that the two would be infected by similar viruses. The observation that, for all four SCN viruses reported here, the closest viral relative is an insect-associated virus (MIDWV and UUKV infect ticks; NCMV and RSV replicate in leafhoppers) is consistent with the conclusion that the viral genome sequences are from nematode viruses (Falk & Tsai, 1998; Takahashi et al., 1982).

One issue that should be addressed is why our laboratory populations of SCN have so many viruses. One possible explanation may be that the SCN populations used were highly inbred. The high genetic uniformity and elevated nematode population density in the inbred populations may have allowed viruses of the nematode to increase in the populations. Overall, we do see the most inbred populations harbouring the most viruses. Nevertheless, we did not observe specific viruses in particular populations (Table 5). The observation that over time populations of nematodes acquired a virus suggests the SCN viruses are transferable between nematodes.

The mechanism of viral transfer between SCN populations is unknown, but the presence of all four viruses in eggs suggests vertical transmission occurs. Given the life cycle of a soil-borne endoparasitic nematode, it might make sense that transovarial movement of the virus would be an efficient means of transfer. Vertical transmission is not uncommon in insect viruses and has been observed in the rhabdovirus Sigma virus, which infects Drosophila (Longdon et al., 2010) and in the tenuiviruses, including RSV (Zeigler & Morales, 1990). Vertical viral transmission would be a very effective mode of transmission in SCN since most of the developmental stages are sedentary, with the exception being the pre-parasitic J2 and the adult male.

Horizontal transmission of SCN viruses could occur between the migratory J2 if they are in close proximity to each other. Unfortunately, it is not known how often SCN


J2 encounter each other in their soil environment. One possible point of interaction may be at the root surface, since it acts to aggregate the nematodes at the point of host invasion (Endo, 1971). In addition, aggregation behaviour has been demonstrated in many plant nematodes as a survival mechanism, thus this phenomenon may also serve to facilitate the spread of nematode viruses (Wang et al., 2009). The adult male nematode moves in the soil and mates with many female nematodes, which could also aid in the spread of nematode viruses to uninfected females in the SCN population.

The discovery of SCN viruses that are related to disease-causing agents suggests that the SCN viruses or derivatives thereof could be useful for the sustainable control of the soybean's most damaging pathogen. However, the implications of this discovery may be even broader. Nematodes form a large and diverse phylum, and if one widely distributed nematode species is infected by at least four new viruses, and if other nematode species have as many viruses, then collectively nematodes could have an astounding array of new and interesting viral pathogens.

Nematode viruses could be used as vectors for expressing or silencing genes, or to dissect the molecular mechanisms of host–pathogen interactions, for both parasitic and model nematode species. Further investigations of nematode viruses and their infection cycles are likely to have a significant impact on many aspects of nematode biology.

METHODS

Nematode populations. SCN populations TN10, TN16 and TN20 were grown by standard methods (Niblack et al., 1993). Eggs were harvested and purified as described in Niblack et al. (1993) and J2 were surface sterilized as described in Craig et al. (2008).

Construction of cDNA libraries. Total RNA was extracted from eggs and surface sterilized J2 (Craig et al., 2008). Construction of a normalized cDNA library and 454 pyrosequencing were carried out at the W. M. Keck Center for Comparative and Functional Genomics, Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign. For both 454 and Illumina sequencing, approximately 30 µg total RNA was used to select mRNA using the Oligotex mRNA mini kit (Qiagen). cDNAs were synthesized from mRNA using the Creator SMART cDNA synthesis kit (Clontech) with the following modifications: (i) the oligo-dT used for priming of the reverse transcription was modified (CDSIII-First: 5′-TAGAGACCGAGGCGGCCGACATGTTTTGTTTTTTTTTCTTTTTTTTTTVN-3′) to break the poly(A) run of the mRNA in the final cDNA; this modification has been found to overcome the high failure rate observed upon pyrosequencing in cDNAs with long homopolymer runs; and (ii) after amplification, the double-stranded cDNA was size selected on a 2 % agarose gel to eliminate fragments smaller than 400 bp.

Normalization of cDNA library. The cDNA library was normalized with the Trimmer Direct kit (Evrogen). In brief, 300 ng cDNA was incubated at 95 °C for 5 min followed by incubation at 68 °C for 4 h in the hybridization buffer included in the kit (50 mM HEPES, pH 7.5 and 0.5 M NaCl). After the incubation, the reaction was treated with 0.25 U duplex specific nuclease (DSN). The normalized cDNA was then amplified from 1 µl of DSN-treated cDNA by PCR using the CDSII-first and SMARTIV primers with the following conditions: 98 °C for 30 s, followed by 10 cycles with 98 °C for 10 s, 68 °C for 30 s and 72 °C for 30 s, with a final extension of 72 °C for 5 min. The resulting normalized cDNA was sequenced using Roche 454 Genome Sequencer FLX-Titanium (Roche) and Illumina Genome Analyser IIx (Illumina) following the manufacturer's protocols. The single Illumina sequencer run generated 155 204 800 reads (75 bp each) for the egg RNA sample and 112 226 970 reads (75 bp each) for the TN10 J2 RNA with the distance between pairs ranging from 39 to 372 bp. The 454 generated 5 788 850 reads (200–400 bp each) from SCN population TN10.

Nucleotide sequence analysis. Genomic DNA from SCN populations TN10 and TN20 was sequenced using the Applied Biosystems (ABI, Foster City, CA) SOLiD sequencer. Two complete machine runs were conducted for a TN20 paired-end library, generating 25 bp paired-end reads. One machine run was conducted for a population TN10 fragment library, which generated 50 bp reads. The sequencing was conducted by SeqWright DNA Technology Services following standard manufacturer's protocols. Alignment to viruses was conducted as described above using CLC Genomics Workbench software package (CLC bio).

The cDNA was assembled into contigs using CLC Genomics Workbench using the de novo assembly setting. The resulting contigs were compared to GenBank sequences using the BLASTX function in Workbench. The BLASTX results revealed similarity to viral sequences for four of the largest contigs. All contigs were exported in FASTA format and opened in the UltraEdit program (IDM Computer Solutions). End sequences of the four virus contigs were used to search for matches to the contigs using UltraEdit. Additional stretches of cDNA were added to the ends of the contigs until no further end sequences were detected in UltraEdit. The contigs were reassembled to their final size using the CLC Genomics Workbench and the Illumina paired-end reads were aligned to the contigs using the 'align to' function in Workbench. Consistent, overlapping paired-end alignments across the contig were used to verify the correctly assembled virus genomes. Coverage of Illumina reads at the ends of the full virus genome gradually lowered to one or two read coverage, indicating the end of the viral genome. Coverage for the virus genome was calculated by adding the total number of nucleotides in matching Illumina reads for each virus and then dividing by the length of the specific virus. Intergenic repeats in the virus genome were identified using the motif alignment tool MEME (Bailey et al., 2006).

The CLC Genomics Workbench was also used for phylogenetic analyses. The conserved regions of the viral RdRP used in the protein sequence alignments and phylogenetic trees were chosen as described in Longdon et al. (2010). The protein alignments and trees were performed using the neighbour-joining and UPGMA algorithms with 10 000 bootstrap replications in the CLC Genomics Workbench.

RT-QPCR for virus detection. Primers and probes for the TaqMan assays were synthesized by Applied Biosystems. RT-QPCR was conducted using the TaqMan EZ RT-PCR core reagents kit (Applied Biosystems) and thermal cycling was performed using standard conditions on an ABI 7900HT Sequence Detection System. The following primers and probes were used to detect the SCN viruses: ScNV-F-primer-5′-TGGCCACTTCCTGTGCTTCT-3′, ScNV-R-primer-5′-ACGAGCAGCGGATGAGTTTAA-3′, ScNV-probe-5′-FAM-ATACTCAAGCGTGAATTG-NFQMGB-3′; ScRV-F-primer-5′-GTCCCCCTGCCCCATTATC-3′, ScRV-R-primer-5′-TTGACAAGTGCGGGTTTGAG-3′, ScRV-probe-5′-FAM-TCAGTGACACTCCATGCT-NFQMGB-3′; ScPV-F-primer-5′-TGCGGTCAGTCGGTCATAAG-3′,

ScPV-R-primer-5′-CAGTGGGACGGCAAAATCTT-3′, ScPV-probe-5′-FAM-AAACTATCATAGGGCTGCCA-NFQMGB-3′; ScTV-F-primer-5′-CTTCCGGTGTAGCAGGGAGAT-3′, ScTV-R-primer-5′-AGCCCCAAGGACCTGGTTT-3′, ScTV-probe-5′-FAM-AGGGCTACCAAACCA-NFQMGB-3′; PGP-F-primer-5′-AGGGCGTCGTGTCGTTACTC-3′, PGP-R-primer-5′-GGCGACATTCCAACCAAAGT-3′, PGP-probe-5′-FAM-CGGAATTTCGGTGGCCT-NFQMGB-3′.

PSRT-QPCR for virus detection. Reverse transcription cDNA synthesis was performed using 160 ng SCN total RNA in a 10 µl reaction volume following the manufacturer's instructions for the Thermoscript RT-PCR System (Invitrogen) using the gene-specific cDNA synthesis protocol. The following SCN virus-specific primers, synthesized by Invitrogen, were designed to hybridize to the mRNA of the SCN viral RdRP-coding region (the universal primer sequence forms the 5′ end of each primer): ScNV-RU-5′-GGTCCTCCGCTGCCCTATGGTGGCCACTTCCTGTGCTTCT-3′, ScRV-RU-5′-GGTCCTCCGCTGCCCTATGGGTCCCCCTGCCCCATTATC-3′, ScPV-RU-5′-GGTCCTCCGCTGCCCTATGGCAGTGGGACGGCAAAATCTT-3′, ScTV-RU-5′-GGTCCTCCGCTGCCCTATGGCTTCCGGTGTAGCAGGGAGAT-3′, GPG-RU-5′-GGTCCTCCGCTGCCCTATGGGGCGACATTCCAACCAAAGT-3′. For each SCN virus a minus RU primer and minus RT control were included. Two different RNA samples were tested.

After the cDNA reaction was complete, the RNA was degraded using 0.5 U RNase H and 2.5 µg RNase A at 37 °C for 20 min. The synthesized cDNA was detected using real-time PCR. Each cDNA sample was tested in triplicate using 1 µl cDNA for each reaction. For each virus cDNA, SYBR green 2× master mix (Applied Biosystems) and the universal primer 5′-GGTCCTCCGCTGCCCTATGG-3′ were used with a virus-specific forward primer (see below). Thermal cycling was performed using standard conditions on an ABI 7900HT sequence detection system. The following forward primers synthesized by Invitrogen were used to detect the SCN viruses: ScNV-F-primer-5′-ACGAGCAGCGGATGAGTTTAA-3′, ScRV-F-primer-5′-TTGACAAGTGCGGGTTTGAG-3′, ScPV-F-primer-5′-TGCGGTCAGTCGGTCATAAG-3′, ScTV-F-primer-5′-AGCCCCAAGGACCTGGTTT-3′, PGP-F-primer-5′-AGGGCGTCGTGTCGTTACTC-3′.

ACKNOWLEDGEMENTS

We would like to thank the United Soybean Board and the North Central Soybean Research Panel for the financial support. We also thank Drs George Bruening and Joanna Shisler for their helpful comments on this manuscript.

REFERENCES

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215, 403–410.
Ansorge, W. J. (2009). Next-generation DNA sequencing techniques. New Biotechnol 25, 195–203.
Bailey, T. L., Williams, N., Misleh, C. & Li, W. W. (2006). MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res 34, W369–W373.
Belyi, V. A., Levine, A. J. & Skalka, A. M. (2010). Unexpected inheritance: multiple integrations of ancient bornavirus and ebolavirus/marburgvirus sequences in vertebrate genomes. PLoS Pathog 6, e1001030.
Brown, D. J., Robertson, W. M. & Trudgill, D. L. (1995). Transmission of viruses by plant nematodes. Annu Rev Phytopathol 33, 223–249.
Craig, J. P., Bekal, S., Hudson, M., Domier, L., Niblack, T. & Lambert, K. N. (2008). Analysis of a horizontally transferred pathway involved in vitamin B6 biosynthesis from the soybean cyst nematode Heterodera glycines. Mol Biol Evol 25, 2085–2098.
Cubitt, B., Oldstone, C. & de la Torre, J. C. (1994). Sequence and genome organization of Borna disease virus. J Virol 68, 1382–1396.
Culley, A. I., Lang, A. S. & Suttle, C. A. (2006). Metagenomic analysis of coastal RNA virus communities. Science 312, 1795–1798.
Edwards, R. A. & Rohwer, F. (2005). Viral metagenomics. Nat Rev Microbiol 3, 504–510.
Endo, B. Y. (1971). Nematode-induced syncytia (giant cells) host–parasite relationships of Heteroderidae. In Plant Parasitic Nematodes, pp. 91–117. Edited by B. M. Zuckerman, W. F. Mai & R. A. Rohde. New York: Academic Press.
Falk, B. W. & Tsai, J. H. (1998). Biology and molecular biology of viruses in the genus Tenuivirus. Annu Rev Phytopathol 36, 139–163.
Félix, M. A., Ashe, A., Piffaretti, J., Wu, G., Nuez, I., Bélicard, T., Jiang, Y., Zhao, G., Franz, C. J. & other authors (2011). Natural and experimental infection of Caenorhabditis nematodes by novel viruses related to nodaviruses. PLoS Biol 9, e1000586.
Flick, R., Elgh, F. & Pettersson, R. F. (2002). Mutational analysis of the Uukuniemi virus (Bunyaviridae family) promoter reveals two elements of functional importance. J Virol 76, 10849–10860.
Foor, W. E. (1972). Viruslike particles in a nematode. J Parasitol 58, 1065–1070.
Gray, S. M. & Banerjee, N. (1999). Mechanisms of arthropod transmission of plant and animal viruses. Microbiol Mol Biol Rev 63, 128–148.
Liu, W. H., Lin, Y. L., Wang, J. P., Liou, W., Hou, R. F., Wu, Y. C. & Liao, C. L. (2006). Restriction of vaccinia virus replication by a ced-3 and ced-4-dependent pathway in Caenorhabditis elegans. Proc Natl Acad Sci U S A 103, 4174–4179.
Loewenberg, J. R., Sullivan, T. & Schuster, M. L. (1959). A virus disease of Meloidogyne incognita incognita, the southern root knot nematode. Nature 184 (Suppl 24), 1896.
Longdon, B., Obbard, D. J. & Jiggins, F. M. (2010). Sigma viruses from three species of Drosophila form a major new clade in the rhabdovirus phylogeny. Proc Biol Sci 277, 35–44.
Lu, R., Maduro, M., Li, F., Li, H. W., Broitman-Maduro, G., Li, W. X. & Ding, S. W. (2005). Animal virus replication and RNAi-mediated antiviral silencing in Caenorhabditis elegans. Nature 436, 1040–1043.
Mihindukulasuriya, K. A., Nguyen, N. L., Wu, G., Huang, H. V., da Rosa, A. P., Popov, V. L., Tesh, R. B. & Wang, D. (2009). Nyamanini and midway viruses define a novel taxon of RNA viruses in the order Mononegavirales. J Virol 83, 5109–5116.
Neumann, G., Whitt, M. A. & Kawaoka, Y. (2002). A decade after the generation of a negative-sense RNA virus from cloned cDNA – what have we learned? J Gen Virol 83, 2635–2662.
Niblack, T. L., Heinz, R. D., Smith, G. S. & Donald, P. A. (1993). Distribution, density, and diversity of Heterodera glycines populations in Missouri. J Nematol 25, 880–886.
Opperman, C. H. & Bird, D. M. (1998). The soybean cyst nematode, Heterodera glycines: a genetic model system for the study of plant-parasitic nematodes. Curr Opin Plant Biol 1, 342–346.
Poinar, G. O., Jr & Hess, R. (1977). Virus-like particles in the nematode Romanomermis culicivorax (Mermithidae). Nature 266, 256–257.
Poinar, G. O., Jr, Hess, R. T. & Cole, A. (1980). Replication of an Iridovirus in a nematode (Mermithidae). Intervirology 14, 316–320.
Quan, P. L., Junglen, S., Tashmukhamedova, A., Conlan, S., Hutchison, S. K., Kurth, A., Ellerbrok, H., Egholm, M., Briese, T. & Leendertz, F. H. (2010). Moussa virus: a new member of the Rhabdoviridae family isolated from Culex decens mosquitoes in Côte d'Ivoire. Virus Res 147, 17–24.
Rassa, J. C., Wilson, G. M., Brewer, G. A. & Parks, G. D. (2000). Spacing constraints on reinitiation of paramyxovirus transcription: the gene end U tract acts as a spacer to separate gene end from gene start sites. Virology 274, 438–449.
Revill, P., Trinh, X., Dale, J. & Harding, R. (2005). Taro vein chlorosis virus: characterization and variability of a new nucleorhabdovirus. J Gen Virol 86, 491–499.
Riddle, D. L., Blumenthal, T., Meyer, B. J. & Priess, J. R. (1997). C. elegans II, 2nd edn. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.
Schubert, M., Keene, J. D., Herman, R. C. & Lazzarini, R. A. (1980). Site on the vesicular stomatitis virus genome specifying polyadenylation and the end of the L gene mRNA. J Virol 34, 550–559.
Shaham, S. (2006). Worming into the cell: viral reproduction in Caenorhabditis elegans. Proc Natl Acad Sci U S A 103, 3955–3956.
Takahashi, M., Yunker, C. E., Clifford, C. M., Nakano, W., Fujino, N., Tanifuji, K. & Thomas, L. A. (1982). Isolation and characterization of Midway virus: a new tick-borne virus related to Nyamanini. J Med Virol 10, 181–193.
Tanno, F., Nakatsu, A., Toriyama, S. & Kojima, M. (2000). Complete nucleotide sequence of Northern cereal mosaic virus and its genome organization. Arch Virol 145, 1373–1384.
Taylor, D. J., Leach, R. W. & Bruenn, J. (2010). Filoviruses are ancient and integrated into mammalian genomes. BMC Evol Biol 10, 193.
Telford, M. J., Bourlat, S. J., Economou, A., Papillon, D. & Rota-Stabelli, O. (2008). The evolution of the Ecdysozoa. Philos Trans R Soc Lond B Biol Sci 363, 1529–1537.
Toriyama, S., Takahashi, M., Sano, Y., Shimizu, T. & Ishihama, A. (1994). Nucleotide sequence of RNA 1, the largest genomic segment of rice stripe virus, the prototype of the tenuiviruses. J Gen Virol 75, 3569–3579.
Wang, C. L., Lower, S. & Williamson, V. M. (2009). Application of Pluronic gel to the study of root-knot nematode behaviour. Nematology 11, 453–464.
Wilson, E. O. (2003). The encyclopedia of life. Trends Ecol Evol 18, 77–80.
Wrather, J. A. & Koenning, S. R. (2006). Estimates of disease effects on soybean yields in the United States 2003 to 2005. J Nematol 38, 173–180.
Zeigler, R. S. & Morales, F. J. (1990). Genetic determination of replication of Rice hoja blanca virus within its planthopper vector Sogatodes oryzicola. Phytopathology 80, 559–566.
Zuckerman, B. M., Himmelho, S. & Kisiel, M. (1973). Virus-like particles in Dolichodorus heterocephalus. Nematologica 19, 117.
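
As a worked illustration of the coverage calculation described in the Methods (total nucleotides in matching Illumina reads divided by the length of the virus genome), the arithmetic might be sketched as follows. This is a minimal sketch, not the procedure actually run in CLC Genomics Workbench: the function name `mean_coverage` and the example read counts and genome length are hypothetical and are not data from this study.

```python
def mean_coverage(read_lengths, genome_length):
    """Average depth: total nucleotides in matching reads / genome length.

    read_lengths  -- iterable of lengths (nt) of the reads matching one virus
    genome_length -- length (nt) of that virus genome
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


# Hypothetical example (not study data): 2000 matching 75 bp reads
# aligned to a 10 000 nt genome.
example_reads = [75] * 2000
print(mean_coverage(example_reads, 10_000))  # 15.0
```

Because every Illumina read in this study was 75 bp, the same value could equally be obtained as (number of matching reads × 75) / genome length.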
