Homology-independent discovery of replicating pathogenic circular RNAs by deep sequencing and a new computational algorithm Qingfa Wua,b,1, Ying Wangb,1, Mengji Caob, Vitantonio Pantaleoc, Joszef Burgyanc, Wan-Xiang Lib, and Shou-Wei Dingb,2 aSchool of Life Sciences, University of Science and Technology of China, Hefei, 230027, China; bDepartment of Plant Pathology and Microbiology and Institute for Integrative Genome Biology, University of California, Riverside, CA 92521; and cIstituto di Virologia Vegetale, Consiglio Nazionale delle Ricerche, Torino, Italy Edited by George E. Bruening, University of California, Davis, CA, and approved January 24, 2012 (received for review October 28, 2011) A common challenge in pathogen discovery by deep sequencing Application of deep sequencing technologies has facilitated approaches is to recognize viral or subviral pathogens in samples discovery of viruses by purification- and culture-independent of diseased tissue that share no significant homology with a known approaches. In metagenomic approaches, viral sequences are pathogen. Here we report a homology-independent approach for enriched by partial purification of viral particles before deep se- discovering viroids, a distinct class of free circular RNA subviral quencing (6, 7). In contrast, enrichment of viral sequences is pathogens that encode no protein and are known to infect plants achieved by isolating the fraction of small RNAs 20–30 nt in only. Our approach involves analyzing the sequences of the total length for deep sequencing in virus discovery by deep sequencing small RNAs of the infected plants obtained by deep sequencing and assembly of total host small RNAs (vdSAR) (8, 9). Pro- with a unique computational algorithm, progressive filtering of duction of virus-derived small interfering RNAs (siRNAs) is a key overlapping small RNAs (PFOR). Viroid infection triggers production step in the RNA-based antiviral immunity of diverse eukaryotic of viroid-derived overlapping siRNAs that cover the entire genome hosts, in which these viral siRNAs guide specific viral RNA with high densities. PFOR retains viroid-specific siRNAs for genome clearance by RNA interference (RNAi) or RNA silencing (10– assembly by progressively eliminating nonoverlapping small RNAs 12). Arabidopsis thaliana plants produce 21-, 22-, and 24-nt size and those that overlap but cannot be assembled into a direct repeat classes of viral and endogenous siRNAs by Dicer-like 4 (DCL4), RNA, which is synthesized from circular or multimeric repeated-se- DCL2, and DCL3, respectively (10–12). In fruit flies and mos- quence templates during viroid replication. We show that viroids quitoes, viral siRNAs processed by Dicer-2 are predominantly 21 fi from the two known families are readily identi ed and their full- nt although recent studies also have detected virus-derived PIWI- length sequences assembled by PFOR from small RNAs sequenced interacting RNAs (piRNAs) that are 23–30 nt long (9, 10, 13). from infected plants. PFOR analysis of a grapevine library further Notably, virus-derived siRNAs and piRNAs overlap in sequence fi identi ed a viroid-like circular RNA 375 nt long that shared no sig- (14) so that it is possible to assemble them into longer fragments fi ni cant sequence homology with known molecules and encoded (contigs) of the invading viral genomes (8, 9) by genome assembly active hammerhead ribozymes in RNAs of both plus and minus algorithms developed specifically for short reads from next gen- polarities, which presumably self-cleave to release monomer from eration sequencing platforms such as Illumina (15). multimeric replicative intermediates. A potential application of the It should be pointed out that discovery of new viruses by both homology-independent approach for viroid discovery in plant and vdSAR and metagenomic approaches requires presence of a de- animal species where RNA replication triggers the biogenesis of tectable sequence homology with a known virus (6, 7, 9). How to siRNAs is discussed. recognize a novel pathogen in samples of diseased tissues that shares no statistically significant homology with a known pathogen he term “viroid” was first introduced in 1971 to describe represents a major challenge in pathogen discovery by deep se- Ta novel class of free RNA nonviral pathogens found in plants quencing approaches. (1–3). Viroids are single-stranded circular RNA molecules of In this study, we established a homology-independent approach 246–401 nt in length and do not encode any protein. Viroid RNAs for the discovery of the viroid pathogens by deep sequencing. display extensive intramolecular base pairing to give rod-like or Previous studies have shown that infection of plants with viroids quasirod-like conformations. Replication of viroids occurs via triggers abundant production of viroid-derived 21-, 22-, and 24-nt a rolling circle mechanism by host RNA polymerases to yield classes of small RNAs (sRNAs) in approximately equal plus and head-to-tail multiple-repeat replicative intermediates. Viroids minus strand ratios, which are likely products of DCL4, DCL2, identified to date infect only plants and belong to one of the two and DCL3, respectively (16–23). We found that these viroid families. Viroids in the Pospiviroidae such as potato spindle tuber sRNAs overlapped in sequence and covered the entire length of viroid (PSTVd) share a five-domain model including a central viroid RNAs with high densities. A unique computational algo- conserved region (CCR) and may all replicate in the nucleus. By rithm was developed to identify viroid-specific sRNAs sequenced contrast, viroids in the Avsunviroidae such as peach latent mosaic from a host organism and to assemble them into full-length viroid viroid (PLMVd) lack a CCR, encode a hammerhead ribozyme in RNA sequences. We show that viroids from both the Pospiviroidae both the positive and the negative strands that self-cleaves to re- lease monomer from the multimeric intermediates, and may all – replicate in the chloroplasts (1 3). Author contributions: Q.W., Y.W., W.-X.L., and S.-W.D. designed research; Q.W., Y.W., and Identification of new viroids requires purification and enrich- W.-X.L. performed research; V.P. and J.B. contributed new reagents; Q.W., Y.W., M.C., and ment of the naked viroid RNA by 2D gel electrophoresis before S.-W.D. analyzed data; and Q.W., Y.W., and S.-W.D. wrote the paper. cDNA synthesis and sequencing (4, 5). However, viroids generally The authors declare no conflict of interest. occur at low concentrations in the infected host, making viroid This article is a PNAS Direct Submission. discovery a challenging task for many plant pathology laborato- 1Q.W. and Y.W. contributed equally to this work. ries. This problem is probably one of the reasons why <40 viroids 2To whom correspondence should be addressed. E-mail: [email protected]. fi from two families have so far been identi ed although viroids This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. were first discovered >40 y ago (1–3). 1073/pnas.1117815109/-/DCSupplemental. 3938–3943 | PNAS | March 6, 2012 | vol. 109 | no. 10 www.pnas.org/cgi/doi/10.1073/pnas.1117815109 Downloaded by guest on September 28, 2021 and the Avsunviroidae families could be readily identified from the because PLMVd, GYSVd, and HSVd are 338, 363, and 293 nt in published small RNA libraries sequenced from the infected length, respectively. Importantly, we were not able to identify new plants. Analysis of a grapevine library by the algorithm led to the viroids from either library by similarity searches of the assembled identification of a viroid-like circular RNA 375 nt long that shared contigs to the known viroids. no significant sequence homology with any known viroid but encoded an active hammerhead ribozyme in both the plus and A Computational Algorithm for Viroid Discovery. We developed minus strands. It is likely that our approach will facilitate dis- a unique computational algorithm (Dataset S1) that classifies covery of unique viroids in plant and animal species in which RNA overlapping small RNAs in a library into two groups (Fig. 2). A replication triggers the Dicer-mediated biogenesis of overlap- terminal small RNA (TSR) overlaps at least one other small RNA ping siRNAs. in the pool (default k-mer = 17) at only one end to yield a contig longer than the TSR (represented by those numbered small Results RNAs in each of the small RNA assembles in Fig. 2, Lower). In Assembly of Viroid-Derived Small RNAs. To determine whether vi- contrast, an internal small RNA (ISR) overlaps at least one other roid-derived small RNAs could be assembled into contigs, we small RNA in the pool at both ends of the ISR to yield a longer analyzed two published small RNA libraries using Velvet and contig (represented by those between the numbered small RNAs Vcake programs, which represent graph- and overlapping-based in each of the small RNA assemblies in Fig. 2, Lower). The al- assembly approaches, respectively (24). The first library was from gorithm then removes nonoverlapping small RNAs and all of a peach tree infected with PLMVd and contained 7,862,905 reads those TSRs identified in the first step and reclassifies the small in total, of which 423,951 and 460,283 reads showed 100% match RNAs remaining in the pool into TSRs and ISRs. The algorithm, to the plus and minus strand RNA of PLMVd, respectively (23). referred to as progressive filtering of overlapping small RNAs The second was from tendril tissues of grapevine infected with (PFOR), is repeated iteratively until no TSR is found and all of both hop stunt viroid (HSVd) and grapevine yellow speckle viroid the small RNAs left in the pool are true ISRs. At the final step of (GYSVd) from the Pospiviroidae. The grapevine library contained PFOR, the remaining true ISRs from the small RNA library are a total of 4,701,135 reads, of which 95,980 and 66,933 were de- assembled into contigs.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages6 Page
-
File Size-