Sequence Analysis of the Xestia c-nigrum Granulovirus Genome
Virology 262, 277–297 (1999)
Article ID viro.1999.9894, available online at http://www.idealibrary.com

Tohru Hayakawa,*,1 Rinkei Ko,† Kazuhiro Okano,†,‡ Su-Il Seong,*,2 Chie Goto,*,3 and Susumu Maeda*,†,‡

*Department of Entomology, University of California, Davis, One Shields Avenue, Davis, California 95616; †Laboratory of Molecular Entomology and Baculovirology, The Institute of Physical and Chemical Research (RIKEN), Wako 351-0198, Japan; and ‡Core Research for Evolutional Science and Technology (CREST) Project, JST, Japan

Received May 10, 1999; returned to author for revision June 10, 1999; accepted July 8, 1999

The nucleotide sequence of the Xestia c-nigrum granulovirus (XcGV) genome was determined and found to comprise 178,733 bases with a G+C content of 40.7%. It contained 181 putative genes of 150 nucleotides or greater that showed minimal overlap. Eighty-four of these putative genes, which collectively accounted for 43% of the genome, are homologs of genes previously identified in the Autographa californica multinucleocapsid nucleopolyhedrovirus (AcMNPV) genome. These homologs showed on average 33% amino acid sequence identity to those from AcMNPV. Several genes reported to have major roles in AcMNPV biology including ie-2, gp64, and egt were not found in the XcGV genome. However, open reading frames with homology to DNA ligase, two DNA helicases (one similar to a yeast mitochondrial helicase and the other to a putative AcMNPV helicase), and four enhancins (virus enhancing factors) were found. In addition, several ORFs are repeated; there are 7 genes related to AcMNPV orf2, 4 genes related to AcMNPV orf145/150, and a number of repeated genes unique to XcGV.
Eight major repeated sequences (XcGV hrs) that are similar to sequences found in the Trichoplusia ni GV genome (TnGV) were found. © 1999 Academic Press

INTRODUCTION

Members of the Baculoviridae are characterized by rod-shaped, enveloped virions containing large (90–180 kbp) double-stranded, circular DNAs. The Baculoviridae is subdivided into two genera, nucleopolyhedrovirus (NPV) and granulovirus (GV). NPVs form large, polyhedral-shaped occlusion bodies (OB) within the nucleus of the infected cell that embed numerous virions. GVs, on the other hand, form smaller OBs and embed a single virion. During GV infection, the nuclear membrane appears to break down and virion occlusion occurs in both the nuclear and the cytoplasmic regions (Federici, 1997). NPVs from Lepidoptera replicate in many tissues in infected insects, but show a relatively narrow host range. In contrast, the tissue tropism of GVs varies with the GV type. For example, Cydia pomonella GV (CpGV) shows tissue tropism similar to that of NPVs, but Trichoplusia ni GV (TnGV) infects only the fat body tissue (Federici, 1997). Although there are major differences in morphology and biology between NPVs and GVs, little is known about the causes of these differences at the molecular level. Whereas the complete genome sequences have been reported for several NPVs (Ayres et al., 1994; Gomi et al., 1999; Ahrens et al., 1997; Kuzio et al., 1999), only very limited sequence information is available from GVs.

To investigate the gene content of a granulovirus, we have undertaken a program to sequence and characterize the genome of a GV pathogenic for spotted cutworm, Xestia c-nigrum (Lepidoptera: Noctuidae) (Goto et al., 1998). XcGV was originally isolated in Hokkaido, Japan, from a field-collected X. c-nigrum larva showing symptoms of granulovirus infection (Goto et al., 1985). X. c-nigrum is a major pest of many plants including celery, carrot, sugar beet, cotton, etc., and has been found throughout Europe (as far north as Finland), Asia (from India to Korea and Japan), North Africa, and Java (Hill, 1987).

In a previous investigation, XcGV was shown to have a genome size of about 179 kb (Goto et al., 1992). Sequence analysis of about 24% of the genome resulted in the identification of a number of open reading frames (ORFs) with homology to those from other baculoviruses (Goto et al., 1998). In this report, we describe the complete sequence and organization of the XcGV genome and compare it to sequence data from other baculoviruses, primarily AcMNPV and Orgyia pseudotsugata MNPV (OpMNPV).

1 To whom correspondence and reprint requests should be addressed at Graduate School of Science and Technology, Niigata University, Ikarashi, Niigata 950-2181, Japan. Fax: +81-25-262-7637.
2 Present address: College of Natural Sciences, The University of Suwon, Suwon 445-743, Korea.
3 Present address: National Agriculture Research Center, Tsukuba, Ibaraki 305-8666, Japan.

0042-6822/99 $30.00. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

RESULTS AND DISCUSSION

DNA sequence of the XcGV genome

To complete the genome sequence of XcGV we used λ and M13 phage libraries previously described (Goto et al., 1998). The XcGV genome was found to consist of 178,733 bp, which is about 45–50 kb larger than AcMNPV (133,894 bp) (Ayres et al., 1994), Bombyx mori NPV (BmNPV, 128,413 bp) (Gomi et al., 1999), or OpMNPV (131,990 bp) (Ahrens et al., 1997), but only about 18 kb larger than that of Lymantria dispar MNPV (LdMNPV, 161,046 bp) (Kuzio et al., 1999). In addition, it is similar to the estimated size of the TnGV genome (175.6 kbp) (Hashimoto et al., 1996). The XcGV genome has a G+C content of 40.7%, which is similar to that of AcMNPV (41%) and BmNPV (40%), but significantly lower than that of OpMNPV (55%) and LdMNPV (58%). Computer-assisted ORF searches detected 412 ORFs of 50 amino acids or larger in the XcGV genome. Of these, 231 ORFs overlapped significantly or were completely contained within other XcGV ORFs. The deduced protein sequences of these 231 ORFs also showed no significant homology to protein sequences in GenBank. The remaining 181 ORFs were thus selected for further detailed analysis. The location, orientation, size of the predicted amino acid sequences, and the positions of the repeated sequences are summarized in Fig. 1 and detailed in Table 1.

Repeated sequences

To identify repeated sequences, the XcGV sequence was compared with itself and its complementary strand by dot matrix analyses. These analyses were performed using a 20-bp moving window that would accept up to four mismatches (data not shown). Eight major repeated sequence regions (XcGV hrs1–8) and one short homologous region (XcGV hr5a) were identified (Figs. 1 and 2). All of the XcGV hrs were found within AT-rich regions and, except for hr5a, contained three to six direct imperfect repeats that are about 120 bp long. Although hr5a showed homology to other hrs, it did not contain multiple repeated sequences and, interestingly, was located within an ORF (discussed in detail below). The nucleotide sequences of the repeats were highly variable between each hr and even within the same hr (Fig. 2). Sequence alignment revealed that the XcGV hrs contained two highly conserved 10-bp core sequences [TTAAT(G/A)TCGA] that are located at roughly the same position (about 35 bp) in each repeat. The core sequences of hrs 1, 2, 4, 6, and 8 are in the same orientation, whereas those of hrs 3, 5, 5a, and 7 are in the opposite direction (Fig. 2). The biological function of these sequences remains to be determined. It is interesting to note, however, that all baculovirus genomes examined appear to [...] Theilmann and Stewart, 1992) and as origins of DNA replication in transient replication assays (Ahrens et al., 1995; Kool et al., 1995; Pearson et al., 1992). Homology searches identified some sequence similarity between the XcGV hrs and the TnGV internal repeat sequences (irs). The TnGV irs is composed of multiple overlapping imperfect inverted repeats of unknown function (Hashimoto et al., 1996). Sequence comparisons between the XcGV hrs and the TnGV irs revealed that the homology was derived mostly from AT-rich sequences and, furthermore, some of the XcGV hrs did not have inverted sequences as found in the TnGV irs. However, both sequences contained two highly conserved 10-bp-long core sequences separated by a similar distance (Fig. 2).

Comparison of ORFs between XcGV and NPVs

Eighty-four of the 181 putative genes of XcGV were homologs of AcMNPV ORFs (Table 1). Similarly, 76 of the 181 putative XcGV genes have OpMNPV homologs. Three genes found in XcGV and AcMNPV [XcGV ORF21 (AcMNPV ORF134; p94); orf67 (AcMNPV ORF105; he65); and orf147 (AcMNPV ORF112+113 homolog)] are not found in OpMNPV (Table 1). Seven homologs of AcMNPV ORF2 are present and are discussed in detail below. On average, 33% amino acid sequence identity was found between the XcGV and AcMNPV homologs (Table 1). The most conserved is the ubiquitin homolog (XcGV ORF52), which shows about 79% amino acid sequence identity (Id) with AcMNPV ORF35 and OpMNPV ORF25. Other highly conserved ORFs (more than 50% Id) are chitinase (XcGV ORF104, AcMNPV ORF126, OpMNPV ORF124); the superoxide dismutase homolog (XcGV ORF69, AcMNPV ORF31, OpMNPV ORF29); LEF-9 (XcGV ORF141, AcMNPV ORF62, OpMNPV ORF65); and granulin/polyhedrin (XcGV ORF1, AcMNPV ORF8, OpMNPV ORF3).

Differences between XcGV and Ac/OpMNPV functional gene groups

Homologs of genes involved in transient DNA replication and late gene expression. Using transient assays, considerable progress has been achieved in the identification and characterization of genes that are involved in DNA replication and late gene expression. Nineteen late expression factor or lef genes have been identified in the AcMNPV genome that are required for optimal transactivation of expression from the late vp39 and p6.9 promoters and the very late polyhedrin and p10 promoters.