
Virology 262, 277–297 (1999) Article ID viro.1999.9894, available online at http://www.idealibrary.com on


Sequence Analysis of the Xestia c-nigrum Granulovirus Genome

Tohru Hayakawa,*,1 Rinkei Ko,† Kazuhiro Okano,†,‡ Su-Il Seong,*,2 Chie Goto,*,3 and Susumu Maeda*,†,‡

*Department of Entomology, University of California, Davis, One Shields Avenue, Davis, California 95616; †Laboratory of Molecular Entomology and Baculovirology, The Institute of Physical and Chemical Research (RIKEN), Wako 351-0198, Japan; and ‡Core Research for Evolutional Science and Technology (CREST) Project, JST, Japan

Received May 10, 1999; returned to author for revision June 10, 1999; accepted July 8, 1999

The nucleotide sequence of the Xestia c-nigrum granulovirus (XcGV) genome was determined and found to comprise 178,733 bases with a G+C content of 40.7%. It contained 181 putative genes of 150 nucleotides or greater that showed minimal overlap. Eighty-four of these putative genes, which collectively accounted for 43% of the genome, are homologs of genes previously identified in the Autographa californica multinucleocapsid nucleopolyhedrovirus (AcMNPV) genome. These homologs showed on average 33% amino acid sequence identity to those from AcMNPV. Several genes reported to have major roles in AcMNPV biology including ie-2, gp64, and egt were not found in the XcGV genome. However, open reading frames with homology to DNA ligase, two DNA helicases (one similar to a yeast mitochondrial helicase and the other to a putative AcMNPV helicase), and four enhancins (virus enhancing factors) were found. In addition, several ORFs are repeated; there are 7 genes related to AcMNPV orf2, 4 genes related to AcMNPV orf145/150, and a number of repeated genes unique to XcGV. Eight major repeated sequences (XcGV hrs) that are similar to sequences found in the Trichoplusia ni GV genome (TnGV) were found. © 1999 Academic Press
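As a rough illustration of the gene-finding criterion summarized above (putative genes of 150 nucleotides or greater showing minimal overlap), the following Python sketch scans both strands of a sequence for ATG-initiated ORFs of a minimum length and then discards ORFs that lie entirely within a longer one. It is a minimal sketch and not the software used in this study; the function names (`find_orfs`, `drop_contained`), the restriction to ATG start codons, and the containment-only overlap rule are assumptions made for the example.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=50):
    """Find ATG-to-stop ORFs on both strands.

    Returns sorted (left, right, strand) tuples with 1-based inclusive
    coordinates on the forward strand; the stop codon is included.
    `min_codons` counts codons from the ATG up to, but excluding, the stop.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # first ATG seen since the last stop in this frame
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif codon in STOPS and start is not None:
                    if (i - start) // 3 >= min_codons:
                        end = i + 3  # exclusive end, stop codon included
                        if strand == "+":
                            orfs.append((start + 1, end, "+"))
                        else:
                            # map reverse-strand indices back to forward coordinates
                            orfs.append((n - end + 1, n - start, "-"))
                    start = None
    return sorted(orfs)


def drop_contained(orfs):
    """Crude overlap filter: drop ORFs lying entirely inside a longer ORF."""
    kept = []
    for a in orfs:
        inside = any(
            b != a and b[0] <= a[0] and a[1] <= b[1]
            and (b[1] - b[0]) > (a[1] - a[0])
            for b in orfs
        )
        if not inside:
            kept.append(a)
    return kept
```

With `min_codons=50` the scan corresponds to the 150-nucleotide lower bound quoted above. A real analysis would also need a rule for partial overlaps, which this sketch deliberately leaves out.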

INTRODUCTION

Members of the Baculoviridae are characterized by rod-shaped, enveloped virions containing large (90–180 kbp) double-stranded, circular DNAs. The Baculoviridae is subdivided into two genera, nucleopolyhedrovirus (NPV) and granulovirus (GV). NPVs form large, polyhedral-shaped occlusion bodies (OB) within the nucleus of the infected cell that embed numerous virions. GVs, on the other hand, form smaller OBs and embed a single virion. During GV infection, the nuclear membrane appears to break down and virion occlusion occurs in both the nuclear and the cytoplasmic regions (Federici, 1997). NPVs from Lepidoptera replicate in many tissues in infected insects, but show a relatively narrow host range. In contrast, the tissue tropism of GVs varies with the GV type. For example, Cydia pomonella GV (CpGV) shows tissue tropism similar to those of NPVs, but Trichoplusia ni GV (TnGV) infects only the fat body tissue (Federici, 1997).

Although there are major differences in morphology and biology between NPVs and GVs, little is known about the causes of these differences at the molecular level. Whereas the complete genome sequences have been reported for several NPVs (Ayres et al., 1994; Gomi et al., 1999; Ahrens et al., 1997; Kuzio et al., 1999), only very limited sequence information is available from GVs.

To investigate the gene content of a granulovirus, we have undertaken a program to sequence and characterize the genome of a GV pathogenic for spotted cutworm, Xestia c-nigrum (Lepidoptera: Noctuidae) (Goto et al., 1998). XcGV was originally isolated in Hokkaido, Japan, from a field-collected X. c-nigrum larva showing symptoms of granulovirus infection (Goto et al., 1985). X. c-nigrum is a major pest of many plants including celery, carrot, sugar beet, cotton, etc., and has been found throughout Europe (as far north as Finland), Asia (from India to Korea and Japan), North Africa, and Java (Hill, 1987).

In a previous investigation, XcGV was shown to have a genome size of about 179 kb (Goto et al., 1992). Sequence analysis of about 24% of the genome resulted in the identification of a number of open reading frames (ORFs) with homology to those from other baculoviruses (Goto et al., 1998). In this report, we describe the complete sequence and organization of the XcGV genome and compare it to sequence data from other baculoviruses, primarily AcMNPV and Orgyia pseudotsugata MNPV (OpMNPV).

1 To whom correspondence and reprint requests should be addressed at Graduate School of Science and Technology, Niigata University, Ikarashi, Niigata 950-2181, Japan. Fax: +81-25-262-7637.
2 Present address: College of Natural Sciences, The University of Suwon, Suwon 445-743, Korea.
3 Present address: National Agriculture Research Center, Tsukuba, Ibaraki 305-8666, Japan.

0042-6822/99 $30.00 Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

RESULTS AND DISCUSSION

DNA sequence of the XcGV genome

To complete the genome sequence of XcGV we used λ and M13 phage libraries previously described (Goto et al., 1998). The XcGV genome was found to consist of 178,733 bp, which is about 45–50 kb larger than AcMNPV (133,894 bp) (Ayres et al., 1994), Bombyx mori NPV (BmNPV, 128,413 bp) (Gomi et al., 1999), or OpMNPV (131,990 bp) (Ahrens et al., 1997), but only about 18 kb larger than that of Lymantria dispar MNPV (LdMNPV, 161,046 bp) (Kuzio et al., 1999). In addition, it is similar to the estimated size of the TnGV genome (175.6 kbp) (Hashimoto et al., 1996). The XcGV genome has a G+C content of 40.7%, which is similar to that of AcMNPV (41%) and BmNPV (40%), but significantly lower than that of OpMNPV (55%) and LdMNPV (58%). Computer-assisted ORF searches detected 412 ORFs of 50 amino acids or larger in the XcGV genome. Of these, 231 ORFs overlapped significantly or were completely contained within other XcGV ORFs. The deduced protein sequence of these 231 ORFs also showed no significant homology to protein sequences in GenBank. The remaining 181 ORFs were thus selected for further detailed analysis. The location, orientation, size of the predicted amino acid sequences, and the positions of the repeated sequences are summarized in Fig. 1 and detailed in Table 1.

Repeated sequences

To identify repeated sequences, the XcGV sequence was compared with itself and its complementary strand by dot matrix analyses. These analyses were performed using a 20-bp moving window that would accept up to four mismatches (data not shown). Eight major repeated sequence regions (XcGV hrs1–8) and one short homologous region (XcGV hr5a) were identified (Figs. 1 and 2). All of the XcGV hrs were found within AT-rich regions and contained three to six direct imperfect repeats that are about 120 bp long except for hr5a. Although hr5a showed homology to other hrs, it did not contain multiple repeated sequences and, interestingly, was located within an ORF (discussed in detail below). The nucleotide sequences of the repeats were highly variable between each hr and even within the same hr (Fig. 2). Sequence alignment revealed that the XcGV hrs contained two highly conserved 10-bp core sequences [TTAAT(G/A)TCGA] that are located at roughly the same position (about 35 bp) in each repeat. The core sequences of hrs 1, 2, 4, 6, and 8 are in the same orientation, whereas those of hrs 3, 5, 5a, and 7 are in the opposite direction (Fig. 2). The biological function of these sequences remains to be determined. It is interesting to note, however, that all baculovirus genomes examined appear to contain interspersed repeated sequences located throughout their genomes (Ayres et al., 1994; Ahrens et al., 1997; Cochran and Faulkner, 1983; Garcia-Maruniak et al., 1996; Kuzio et al., 1999; Majima et al., 1993; Theilmann and Stewart, 1992; Xie et al., 1995). These sequences have been shown to function as transactivators of RNA polymerase II mediated transcription in both AcMNPV and OpMNPV (Guarino and Summers, 1986; Theilmann and Stewart, 1992) and as origins of DNA replication in transient replication assays (Ahrens et al., 1995; Kool et al., 1995; Pearson et al., 1992). Homology searches identified some sequence similarity between the XcGV hrs and the TnGV internal repeat sequences (irs). The TnGV irs is composed of multiple overlapping imperfect inverted repeats of unknown function (Hashimoto et al., 1996). Sequence comparisons between the XcGV hrs and the TnGV irs revealed that the homology was derived mostly from AT-rich sequences and, furthermore, some of the XcGV hrs did not have inverted sequences as found in the TnGV irs. However, both sequences contained two highly conserved 10-bp-long core sequences separated by a similar distance (Fig. 2).

Comparison of ORFs between XcGV and NPVs

Eighty-four of the 181 putative genes of XcGV were homologs of AcMNPV ORFs (Table 1). Similarly, 76 of the 181 putative XcGV genes have OpMNPV homologs. Three genes found in XcGV and AcMNPV [XcGV ORF21 (AcMNPV ORF134; p94); orf67 (AcMNPV ORF105; he65); and orf147 (AcMNPV ORF112+113 homolog)] are not found in OpMNPV (Table 1). Seven homologs of AcMNPV ORF2 are present and are discussed in detail below. On average, 33% amino acid sequence identity was found between the XcGV and AcMNPV homologs (Table 1). The most conserved is the ubiquitin homolog (XcGV ORF52), which shows about 79% amino acid sequence identity (Id) with AcMNPV ORF35 and OpMNPV ORF25. Other highly conserved ORFs (more than 50% Id) are chitinase (XcGV ORF103, AcMNPV ORF126, OpMNPV ORF124); the superoxide dismutase homolog (XcGV ORF68, AcMNPV ORF31, OpMNPV ORF29); LEF-9 (XcGV ORF139, AcMNPV ORF62, OpMNPV ORF65); and granulin/polyhedrin (XcGV ORF1, AcMNPV ORF8, OpMNPV ORF3).

Differences between XcGV and Ac/OpMNPV functional gene groups

Homologs of genes involved in transient DNA replication and late gene expression. Using transient assays, considerable progress has been achieved in the identification and characterization of genes that are involved in DNA replication and late gene expression. Nineteen late expression factor or lef genes have been identified in the AcMNPV genome that are required for optimal transactivation of expression from the late vp39 and p6.9 promoters and the very late polyhedrin and p10 promoters. Nine lef genes (ie-1, ie-2, lef-1, -2, -3, lef-7, dnapol, p35, and p143) are required for optimal plasmid DNA replication in Sf-21 cells (Lu and Miller, 1995b; Kool et al., 1994), whereas lef-4, -5, -6, -8, -9, -10, -11, -12, 39k, and p47 are required in addition to the replication genes for optimal levels of late gene transcription (Todd et al., 1995, 1996; Lu and Miller, 1994; Morris et al., 1994; Passarelli et al., 1994; Li et al., 1993; Passarelli and Miller, 1993a,b,c,

FIG. 1. Organization of the XcGV genome. The circular XcGV genome is shown in a linear format beginning from the ATG of the granulin gene. Black arrows show the orientation and approximate length of ORFs of 50 amino acids or larger that show minimal overlap. Repeated sequence regions (hrs) are indicated by shaded boxes. The ORFs are numbered consecutively beginning with the granulin gene. Names assigned are referenced in Table 1 or are from Ayres et al. (1994) and Ahrens et al. (1997). The scale is in kilobase pairs (kb). The relative positions of BamHI and EcoRI sites within the genome are shown below the ORFs.

TABLE 1
XcGV ORFs^a

Xc ORF  Left  Dir  Right  Xc aa  Mr  Ac ORF  Ac aa  %ID (Range)  Op ORF  Op aa  %ID (Range)  prm  Name  References

1  1  ≫  747  248  29,202  8  245  53.8 (240)  3  245  54.2 (240)  L  granulin
2  795  ≪  1,490  231  26,340  9  543  22.0 (182)  2  473  24.1 (162)  L  1629 capsid  Russell et al. (1997)
3  1,423  ≫  2,331  302  36,019  10  272  31.5 (286)  1  274  29.6 (284)  L  pk
4  2,365  ≪  3,339  324  38,924  L
5  3,461  ≫  3,715  84  9,060  137  94  29.6 (71)  133  92  25.0 (76)  L  p10
6  3,577  ≪  3,750  57  6,107  —
7  3,750  ≪  4,313  187  21,960  —
8  4,303  ≫  4,563  86  10,303  L
9  4,570  ≪  6,024  484  56,419  147  582  13.8 (363)  145  560  14.8 (357)  —  ie-1
10  6,054  ≫  6,644  196  21,757  146  201  24.0 (171)  144  197  17.2 (198)  —
11  6,668  ≪  6,967  99  11,578  145/150  77/99  44.8 (67)/35.6 (87)  142  95  40.4 (94)  L
12  6,986  ≪  7,237  83  8,973  143  62  42.1 (57)  140  85  35.7 (70)  L  odv-e18  Braunagel et al. (1996)
13  7,241  ≪  8,602  453  52,690  142  477  29.5 (475)  139  484  32.9 (474)  L
14  8,674  ≪  9,363  229  25,765  —
15  9,382  ≪  10,443  353  38,604  148  376  36.6 (352)  146  374  37.5 (352)  L  odv-e56  Theilmann et al. (1996)
16  10,473  ≫  10,688  71  8,385  29  71  26.0 (50)  39  75  23.9 (46)  —  15R (CpGV)  Kang et al. (1997)
17  10,708  ≪  11,271  187  21,472  L  16L-1 (CpGV)  Kang et al. (1997)
18  11,325  ≪  11,786  153  17,253  L  16L-2 (CpGV)  Kang et al. (1997)
19  11,807  ≪  12,967  386  40,781  L  17R (CpGV)  Kang et al. (1997)
20  13,029  ≫  13,304  91  10,178  145/150  77/99  28.9 (45)/28.2 (85)  142  95  25.0 (80)  L
21  13,391  ≫  15,871  826  95,664  134  803  32.4 (834)  —  p94
22  16,038  ≪  17,516  492  56,921  L
23  17,583  ≪  18,635  350  40,201  —
24  18,707  ≪  19,081  124  13,899  L
25  20,518  ≫  21,867  449  51,279  —
26  22,392  ≫  23,504  370  42,911  —
27  23,528  ≫  25,327  599  69,862  23  690  19.9 (550)  21  627  18.0 (550)  —
28  25,465  ≫  26,436  323  36,725  —
29  26,694  ≪  27,206  170  19,984  L
30  27,223  ≪  27,609  128  14,494  —
31  27,542  ≪  27,799  85  9,940  L
32  27,829  ≫  28,416  195  22,221  115  204  34.6 (191)  115  205  38.1 (168)  —
33  28,423  ≫  28,716  97  10,768  L
34  28,742  ≫  29,083  113  13,448  L
35  29,085  ≫  29,654  189  22,158  6  210  22.6 (155)  6  204  21.6 (162)  —  lef-2
36  29,657  ≫  29,926  89  10,384  —  35Ra (CpGV)  Jehle et al. (1997)
37  29,940  ≪  30,467  175  20,722  —
38  30,514  ≪  30,849  111  12,828  —
39  30,976  ≪  31,425  149  17,633  E
40  31,526  ≪  32,935  469  53,806  —  metalloproteinase  Goto et al. (1998)
41  33,095  ≪  33,322  75  8,607  —
42  33,410  ≪  34,549  379  45,241  —
43  34,607  ≫  35,440  277  32,438  L  p13 (LsNPV)  Wang et al. (1995)
44  35,693  ≫  36,139  148  17,161  E
45  36,195  ≫  37,361  388  44,999  22  382  50.5 (378)  20  382  49.7 (378)  L
Repeat  37,366  37,981  hr1
46  37,990  ≪  38,235  81  9,697  L
47  38,259  ≫  38,921  220  25,166  L
48  38,957  ≫  41,476  839  95,297  L
49  41,278  ≪  41,823  181  19,753  —
50  41,886  ≪  42,704  272  31,789  106  61  42.9 (56)  107  256  46.2 (171)  L
51  42,749  ≫  42,910  53  6,070  110  56  28.6 (49)  111  57  26.5 (49)  —
52  42,950  ≪  43,183  77  8,676  35  77  78.9 (76)  25  93  79.2 (77)  L  ubiquitin
53  43,264  ≫  44,325  353  40,186  109  390  25.9 (398)  109  390  30.5 (394)  L
54  44,353  ≫  44,685  110  12,541  L
55  44,744  ≪  45,631  295  33,624  36  275  19.6 (199)  24  261  19.4 (227)  L  39k (pp31)
56  45,594  ≪  45,902  102  11,735  37  112  29.2 (89)  23  125  27.7 (83)  —  lef-11
Repeat  45,980  46,604  hr2
57  46,711  ≫  47,547  278  32,659  E, L
58  47,561  ≫  48,601  346  38,817  127  323  46.8 (316)  125  324  44.6 (323)  —  cathepsin
59  48,781  ≪  50,736  651  77,127  —
60  51,001  ≪  52,455  484  56,132  2  328  25.5 (216)  —  bro-a
61  52,744  ≪  54,111  455  52,812  —
62  54,238  ≪  54,873  211  24,784  —
63  55,246  ≪  55,476  76  8,878  —
64  55,616  ≪  57,685  689  80,083  E
65  57,740  ≪  59,182  480  58,116  L
66  59,293  ≪  59,949  218  25,128  —
67  60,174  ≪  61,880  568  66,450  105  553  37.9 (554)  —  he65
68  62,003  ≪  62,464  153  16,093  31  151  57.5 (146)  29  152  58.3 (144)  L  sod
69  62,520  ≫  63,377  285  32,966  —
70  63,382  ≪  63,552  56  6,569  L
Repeat  63,514  64,217  hr3
71  64,265  ≫  64,939  224  25,766  L
72  65,010  ≫  66,038  342  38,216  L
73  66,144  ≫  67,520  458  52,039  —
74  67,564  ≪  68,670  368  43,312  L
75  68,728  ≫  69,021  97  11,584  79  104  36.4 (88)  82  104  39.3 (89)  L  11.3-kDa E. coli
Repeat  69,071  69,821  hr4
76  71,067  ≫  71,888  273  31,682  2  328  25.7 (140)  —  bro-b
77  71,928  ≫  74,060  710  80,637  138  645  36.2 (694)  134  644  34.0 (694)  E, L  p74
78  74,444  ≫  75,628  394  46,155  40  401  41.4 (401)  45  399  40.6 (399)  —  p47
79  75,896  ≫  76,573  225  27,173  38  216  41.3 (189)  22  209  40.2 (189)  L
80  76,566  ≫  77,114  182  20,331  129  198  28.9 (90)  127  192  28.4 (102)  —  p24 capsid
81  77,141  ≪  77,695  184  21,166  L
82  77,696  ≪  78,412  238  27,886  14  266  32.5 (234)  13  243  34.0 (235)  —  lef-1
83  78,481  ≫  79,029  182  19,282  L
84  79,046  ≫  80,668  540  61,311  119  530  31.6 (544)  119  529  30.5 (548)  L
85  80,706  ≪  81,404  232  26,742  —
86  81,456  ≪  81,803  115  13,586  —
87  81,934  ≫  82,428  164  18,631  145/150  77/99  28.6 (42)/30.2 (96)  142  95  23.9 (71)  —
88  82,425  ≪  82,724  99  11,634  —
89  82,787  ≪  83,620  277  31,624  25  316  22.3 (188)  43  300  28.0 (182)  —  dbp  Mikhailov et al. (1998)
90  83,902  ≪  84,642  246  28,872  —
91  84,641  ≫  85,759  372  43,837  103  387  35.8 (380)  104  411  32.3 (402)  L
92  85,768  ≫  86,130  120  12,965  102  122  20.9 (115)  103  112  24.5 (102)  L
93  86,182  ≫  87,300  372  42,545  101  361  19.7 (346)  102  354  21.3 (334)  L
94  87,361  ≫  87,543  60  7,452  100  55  38.0 (50)  101  51  40.0 (45)  L  p6.9
95  87,584  ≪  88,321  245  28,126  99  265  39.2 (240)  100  263  42.2 (225)  —  lef-5
96  88,320  ≫  89,225  301  36,095  98  320  35.5 (307)  99  313  35.2 (301)  L
97  89,242  ≪  89,715  157  17,993  96  173  30.9 (165)  97  172  30.3 (165)  L
98  89,714  ≫  93,193  1159  136,080  95  1221  26.1 (1197)  96  1223  23.1 (1197)  —  helicase-1
99  93,269  ≪  93,931  220  24,231  94  228  35.4 (229)  95  229  36.7 (229)  L  odv-e25
100  94,088  ≪  94,456  122  14,258  93  161  33.1 (118)  94  159  32.2 (118)  L
101  94,552  ≫  95,307  251  30,133  92  259  36.0 (258)  93  282  36.3 (193)  L
102  95,313  ≪  95,576  87  10,043  60  87  28.6 (70)  63  90  32.9 (70)  L
103  95,968  ≫  97,752  594  66,311  126  551  61.6 (531)  124  550  60.4 (528)  —  chitinase
104  97,820  ≪  97,987  55  6,588  L
105  98,098  ≪  98,373  91  10,060  145/150  77/99  27.9 (61)/28.9 (83)  142  95  26.7 (90)  L
106  98,567  ≫  98,839  90  10,110  E
107  98,902  ≫  99,636  244  27,785  64  302  41.7 (254)  69  321  41.9 (258)  L  gp37
108  99,743  ≫  100,201  152  17,090  L
109  100,514  ≫  101,440  308  34,563  2  328  23.4 (290)  67  79  33.3 (72)  —  bro-c
110  101,504  ≪  102,847  447  51,702  90  464  31.2 (459)  91  457  29.9 (451)  —  lef-4
111  102,896  ≫  103,885  329  37,223  89  347  26.9 (316)  90  351  28.8 (316)  L  p39 capsid
112  103,961  ≫  104,827  288  33,913  144  290  28.5 (298)  141  297  23.8 (303)  —  odv-ec27  Goto et al. (1998)
113  105,163  ≪  106,284  373  44,546  L
Repeat  106,387  106,804  hr5
114  106,832  ≪  108,115  427  49,745  2  328  25.4 (173)  —  bro-d
115  108,081  ≫  109,274  397  46,302  —
116  109,376  ≫  109,747  123  14,577  L
117  109,788  ≪  110,333  181  21,345  E
118  110,420  ≪  112,645  741  82,911  83  847  20.6 (676)  86  819  20.7 (676)  L  p91  Russell and Rohrmann (1997)
Repeat  111,506  111,632  hr5a
119  112,611  ≫  113,096  161  18,167  82  180  23.7 (131)  85  155  21.8 (133)  E  tlp20  Raynes et al. (1994)
120  113,107  ≫  113,670  187  21,473  81  233  46.5 (187)  84  218  45.1 (182)  L
121  113,729  ≫  114,601  290  33,049  80  409  33.8 (299)  83  367  31.5 (248)  L  gp41
122  114,665  ≫  114,976  103  11,620  78  109  23.1 (91)  81  105  26.8 (97)  L
123  114,960  ≫  116,081  373  43,236  77  379  31.4 (366)  80  374  30.5 (367)  L  vlf-1
124  116,082  ≪  116,624  180  21,640  —
125  116,664  ≫  116,921  85  9,722  76  84  31.2 (80)  79  84  35.0 (80)  —
126  116,981  ≫  117,418  145  17,093  75  133  18.9 (132)  78  130  23.3 (129)  L
127  117,431  ≫  117,589  52  5,753  3  53  45.3 (53)  136  53  47.2 (53)  L  ct1
128  117,608  ≪  118,039  143  16,457  —
129  118,205  ≫  119,143  312  36,018  E
130  119,321  ≫  120,034  237  27,426  2  328  32.0 (97)  E  bro-e
131  120,298  ≫  121,626  442  51,354  2  328  22.6 (297)  —  bro-f
132  121,682  ≪  124,978  1098  126,923  65  984  31.7 (1005)  70  985  31.4 (969)  —  dna pol.  Goto et al. (1998)
133  124,977  ≫  126,962  661  76,565  —
134  127,032  ≪  128,087  351  40,947  L
135  128,056  ≫  128,418  120  14,347  68  192  30.4 (102)  73  131  36.5 (85)  —
136  128,566  ≫  129,081  171  19,979  —
137  129,146  ≫  130,003  285  33,200  27  286  25.2 (254)  35  268  26.8 (272)  —  iap
138  130,086  ≫  132,050  654  77,224  —
139  132,152  ≫  133,633  493  57,025  62  516  54.3 (494)  65  489  52.6 (492)  —  lef-9
140  133,678  ≫  134,121  147  16,888  61  214  35.5 (152)  64  208  34.0 (147)  L  fp
141  134,118  ≪  135,701  527  60,725  L  ligase  Shuman and Schwer (1995)
142  135,945  ≫  136,211  88  10,256  —
143  136,273  ≫  136,473  66  7,457  —
144  136,513  ≪  137,742  409  46,704  32  181  24.5 (98)  27  205  24.0 (75)  —  fgf
145  137,754  ≫  139,127  457  53,175  133  419  34.3 (370)  131  424  32.7 (364)  —  alk. exo.
146  139,195  ≫  140,562  455  52,329  —  helicase-2  Foury and Lahaye (1987)
147  140,657  ≫  141,586  309  37,050  112 + 113  87 + 169  41.8 (67) + 27.8 (176)  E
148  141,693  ≪  144,272  859  99,697  50  876  48.9 (875)  54  884  47.1 (871)  —  lef-8
149  144,332  ≪  146,338  668  75,259  46  704  43.3 (584)  50  682  42.5 (631)  L  odv-e66
Repeat  146,385  146,880  hr6
150  146,883  ≪  149,357  824  94,156  L  enhancin-1  Hashimoto et al. (1991)
151  149,385  ≪  150,089  234  25,785  —
152  150,103  ≪  152,706  867  98,874  L  enhancin-2  Hashimoto et al. (1991)
153  152,873  ≪  153,025  50  5,770  —
154  153,164  ≫  155,860  898  104,256  L  enhancin-3  Hashimoto et al. (1991)
155  155,896  ≫  157,071  391  44,712  L
156  157,147  ≫  157,344  65  7,423  —
157  157,316  ≫  157,495  59  6,759  —
158  157,551  ≪  158,003  150  17,391  —
159  158,165  ≪  159,391  408  46,358  2  328  18.9 (328)  67  79  46.8 (47)  E  bro-g
160  159,542  ≪  159,760  72  8,367  111  67  36.4 (55)  112  72  31.2 (64)  —
161  159,923  ≫  161,383  486  55,998  —
162  161,412  ≪  161,972  186  22,024  —
Repeat  162,062  162,670  hr7
163  162,881  ≪  166,249  1,122  132,295  —
164  166,430  ≪  166,591  53  6,241  —
165  166,590  ≫  166,946  118  12,958  L
166  166,978  ≪  169,548  856  97,799  L  enhancin-4  Hashimoto et al. (1991)
167  169,796  ≫  170,113  105  12,821  L
168  170,248  ≫  170,844  198  22,795  E
169  170,888  ≪  171,322  144  17,185  L
Repeat  171,344  171,968  hr8
170  171,972  ≪  172,163  63  7,295  —
171  172,150  ≫  172,569  139  16,062  53  139  35.6 (59)  56  146  31.2 (77)  —
172  172,573  ≪  173,709  378  43,629  L
173  173,732  ≪  173,935  67  7,860  L
174  173,913  ≫  174,125  70  7,599  53a  78  29.0 (69)  57  80  23.5 (68)  L  lef-10
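The prm column of Table 1 flags early (E) and late (L) promoter elements found within 120 nucleotides upstream of the ATG: E denotes a TATA box, TATA(A/T)A(A/T), with a CA(G/T)T mRNA start site 25–35 nucleotides downstream, and L denotes an (A/T/G)TAAG motif. A minimal Python sketch of such a scan is given below. It is an illustration only, not the procedure used to build the table; the function names and the choice to measure the 25–35 nucleotide spacing from the first base of the TATA box to the first base of CA(G/T)T are assumptions.

```python
import re

# Motifs as defined for the prm column; lookaheads allow overlapping hits.
TATA_BOX = re.compile(r"(?=TATA[AT]A[AT])")   # TATA(A/T)A(A/T)
CAP_SITE = re.compile(r"(?=CA[GT]T)")         # CA(G/T)T mRNA start site
LATE_MOTIF = re.compile(r"[ATG]TAAG")         # (A/T/G)TAAG


def has_early_element(region, min_gap=25, max_gap=35):
    """True if any TATA box has a CA(G/T)T start site 25-35 nt downstream.

    Assumption: spacing is counted from the first base of the TATA box
    to the first base of the CA(G/T)T motif.
    """
    cap_starts = [m.start() for m in CAP_SITE.finditer(region)]
    for tata in TATA_BOX.finditer(region):
        if any(min_gap <= c - tata.start() <= max_gap for c in cap_starts):
            return True
    return False


def classify_promoter(upstream, window=120):
    """Return promoter labels for the sequence immediately 5' of an ATG.

    `upstream` is read 5'->3' on the coding strand and ends just before
    the ATG; only the last `window` nucleotides are examined.
    """
    region = upstream.upper()[-window:]
    labels = []
    if has_early_element(region):
        labels.append("E")
    if LATE_MOTIF.search(region):
        labels.append("L")
    return labels
```

An empty list returned by `classify_promoter` corresponds to a dash in the prm column, and `["E", "L"]` to an entry such as that of ORF57.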

1994; Rapp et al., 1998). P35 functions to block apoptosis (Bump et al., 1995; Clem et al., 1991). LEF-4 appears to be an RNA capping enzyme (Gross and Shuman, 1998; Guarino et al., 1998a), and LEF-5 may function as a transcriptional elongation factor (Harwood et al., 1998). LEF-4, -8, -9, and P47 form a minimal RNA polymerase complex (Guarino et al., 1998b). Both LEF-8 and LEF-9 have small domains characteristic of RNA polymerases (Lu and Miller, 1994; Passarelli et al., 1994). An additional lef gene, hcf-1, that is required for optimal hr-dependent DNA replication and transient late gene expression in TN-368 cells has been identified (Lu and Miller, 1995a). In addition, very late gene expression appears to be modulated by vlf-1 (Todd et al., 1996).

TABLE 1 (Continued)

Xc ORF  Left  Dir  Right  Xc aa  Mr  Ac ORF  Ac aa  %ID (Range)  Op ORF  Op aa  %ID (Range)  prm  Name  References

175  174,004  ≫  174,975  323  37,445  54  365  31.3 (320)  58  378  28.4 (327)  L  vp1054  Olszewski and Miller (1997)
176  175,031  ≫  175,243  70  8,138  —
177  175,318  ≫  175,677  119  13,478  —
178  175,721  ≫  176,719  332  36,520  —
179  176,847  ≫  177,440  197  23,539  —
180  177,406  ≫  178,383  325  38,131  139  449  21.2 (208)  137  455  21.1 (209)  —  me53
181  178,383  ≫  178,715  110  12,585  L

a The first column shows the XcGV ORF number; Columns 2 and 4 show the left and right nucleotide coordinates; Column 3 shows the direction of the ORF; Column 5 shows the number of amino acids in the predicted sequence; Column 6 shows the molecular mass of the predicted ORF; Columns 7 and 10 show the ORF number of AcMNPV (Ayres et al., 1994) and OpMNPV (Ahrens et al., 1997) homologs of the XcGV ORF, respectively; Columns 8 and 11 show the number of amino acids in AcMNPV and OpMNPV homologs, respectively; Columns 9 and 12 show the percentage of amino acid sequence identity with the AcMNPV and OpMNPV homologs, respectively. The number in parentheses indicates the range of amino acid sequence showing homology. In these calculations, gaps that were inserted into the alignment were counted as negative; Column 13 shows the presence of early (E) and late (L) promoter elements within 120 nucleotides upstream from ATG. E indicates a TATA box [TATA(A/T)A(A/T)] with a CA(G/T)T mRNA start site sequence 25–35 nucleotides downstream. L indicates the presence of an (A/T/G)TAAG motif; Column 14 shows the ORF name in AcMNPV and/or OpMNPV; Column 15 shows the reference for the XcGV homolog. The homologs with no reference indicated are from Ayres et al. (1994) and Ahrens et al. (1997).

Computer-assisted homology searches of predicted amino acid sequences indicated that the XcGV genome contains homologs of 13 AcMNPV lef genes including ie-1, lef-1, -2, -4, -5, -8, -9, -10, -11, dnapol, dnahel, p47, and 39k, but lacked homologs of lef-3, -6, -7, -12, p35, ie-2, and hcf-1 (Table 2). On average, the homologs showed about 31% Id to AcMNPV and OpMNPV LEFs. The most conserved XcGV LEF homolog was LEF-9 (XcGV ORF139; 54%) followed by LEF-8 (ORF148; 49%), P47 (ORF78; 41%), and LEF-5 (ORF95; 39%) (Table 1). Six XcGV homologs of AcMNPV LEFs including IE-1, LEF-2, 39K, LEF-11, -10, and helicase showed relatively low homology (less than 30% Id). The XcGV IE-1 homolog showed only 13.8% Id over 363 aa to AcMNPV IE-1 (ORF147) and 14.8% Id over 357 aa to OpMNPV IE-1 (ORF145). However, a highly acidic profile characteristic of the transactivation domain of Ac/OpMNPV IE-1 (Kovacs et al., 1992) was observed at the N-terminal region of the XcGV IE-1 homolog (data not shown). Lu and Miller (1995a) have shown specificity and possibly tissue specificity for the requirement of AcMNPV lef-7, ie-2, p35, and hcf-1 in DNA replication. For example, lef-7, ie-2, and p35 are required for replication in Sf-21 cells, but not in TN-368 cells, whereas hcf-1 is required for replication in TN-368 cells, but not in Sf-21 cells (Lu and Miller, 1995a). Similarly, XcGV may not require these genes for its replication.

Although an AcMNPV p35 homolog was not found, members of another class of baculovirus apoptotic inhibitors, inhibitor of apoptosis (iap) gene (Crook et al., 1993) homologs, were found in the XcGV genome. AcMNPV and LdMNPV both have two iap homologs (AcMNPV: iap-1, ORF27; iap-2, ORF71; LdMNPV: iap-2, ORF79; iap-3, ORF139), OpMNPV has four iap homologs (iap-1, ORF41; iap-2, ORF74; iap-3, ORF35; iap-4, ORF106), and CpGV has one, although its genome has not been completely sequenced. OpMNPV iap-3 and the CpGV iap gene can fully substitute for AcMNPV p35 (Crook et al., 1993; Birnbaum et al., 1994). The biological functions of the other iap homologs (AcMNPV iaps, OpMNPV iap-1, 2, and 4) are unknown (Griffiths et al., 1999). Based on the amino acid sequence homology, the XcGV iap homolog

FIG. 2. Alignment of XcGV repeated sequences. The nucleotide sequences of XcGV hrs and TnGV irs (Hashimoto et al., 1996) are aligned. The numbers on the left indicate the genome positions. The conserved 10-bp core sequences [TTAAT(G/A)TCGA] are indicated by shaded boxes. The arrows indicate the direction of the core sequences.
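Locating the conserved core sequence TTAAT(G/A)TCGA in either orientation, as marked in Fig. 2, can be sketched with a short regular-expression search. The following Python example is an illustrative sketch rather than the alignment procedure used for the figure; the function name `find_core_sites` and the toy sequences in the usage note are hypothetical.

```python
import re

CORE = "TTAAT[GA]TCGA"      # TTAAT(G/A)TCGA, forward orientation
CORE_RC = "TCGA[CT]ATTAA"   # reverse complement of the core


def find_core_sites(seq):
    """Return sorted (position, orientation) pairs for every core match.

    Positions are 1-based and refer to the first base of the match on the
    given strand; "+" marks the core as written, "-" its reverse complement.
    """
    seq = seq.upper()
    hits = []
    for pattern, orientation in ((CORE, "+"), (CORE_RC, "-")):
        # a lookahead keeps overlapping matches from being skipped
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((m.start() + 1, orientation))
    return sorted(hits)
```

For example, `find_core_sites("CCTTAATGTCGACC")` reports one forward-orientation site at position 3, and `find_core_sites("GGTCGATATTAAGG")` reports one site at position 3 in the opposite orientation.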

(ORF137) was most closely related to OpMNPV iap-3 (26.8% Id over 272 aa) and the CpGV iap gene (26.5% Id over 260 aa).

TABLE 2
XcGV Homologs of AcMNPV lef Genes and Structural Protein Genes

AcMNPV lef gene homologs
Found [name (Xc/Ac/Op ORF)]: ie-1 (Xc9/Ac147/Op145); lef-2 (Xc35/Ac6/Op6); 39k (Xc55/Ac36/Op24); lef-11 (Xc56/Ac37/Op23); p47 (Xc78/Ac40/Op45); lef-1 (Xc82/Ac14/Op13); lef-5 (Xc95/Ac99/Op100); dnahel (Xc98/Ac95/Op96); lef-4 (Xc110/Ac90/Op91); dnapol (Xc132/Ac65/Op70); lef-9 (Xc139/Ac62/Op65); lef-8 (Xc148/Ac50/Op54); lef-10 (Xc174/Ac53a/Op57)
Not found [name (Ac/Op ORF)]: lef-6 (Ac28/Op40); lef-12 (Ac41/Op46); lef-3 (Ac67/Op72); hcf-1 (Ac70); lef-7 (Ac125/Op123); p35 (Ac135); ie-2 (Ac151/Op151)

Structural protein gene homologs
Found [name (Xc/Ac/Op ORF)]: polh (Xc1/Ac8/Op3); 1629 (Xc2/Ac9/Op2); pk (Xc3/Ac10/Op1); p10 (Xc5,[19,83]^a/Ac137/Op133); odv-e18 (Xc12/Ac143/Op140); odv-e56 (Xc15/Ac148/Op146); p74 (Xc77/Ac138/Op134); p24 (Xc80/Ac129/Op127); p6.9 (Xc94/Ac100/Op101); odv-e25 (Xc99/Ac94/Op95); vp39 (Xc111/Ac89/Op90); odv-ec27 (Xc112/Ac144/Op141); p91 (Xc118/Ac83/Op86); gp41 (Xc121/Ac80/Op83); odv-e66 (Xc149/Ac46/Op50); vp1054 (Xc175/Ac54/Op58)
Not found [name (Ac/Op ORF)]: pep/calyx (Ac131/Op129); p80/p87 (Ac104/Op105); ptp (Ac1/Op10); gp64 (Ac128/Op126)

a The numbers in brackets indicate the ORF number of the variants of the XcGV p10 homolog.

LEF-3 has been shown recently to be required for the transport of P143-helicase into the nucleus of transiently transfected cells (Wu and Carstens, 1998). Furthermore, it has been shown that LEF-3 is capable of interacting with P143-helicase (Wu and Carstens, 1998; Evans et al., 1999). Since a feature of GV infections is a major alteration or elimination of the nuclear membrane (Federici, 1997), it is possible that LEF-3 is not required for transport in GV infected cells because the nuclear membrane is highly porous or no longer exists. Although LEF-3 has also been implicated as a single-strand DNA-binding protein (Hang et al., 1995; Evans and Rohrmann, 1997) and is essential for transient DNA replication, it has also been demonstrated that another SSB is present in BmNPV-infected cells (Mikhailov et al., 1998). A homolog of this gene is present in the XcGV genome (XcGV ORF89), suggesting that it may play a general role in single-stranded DNA binding in baculovirus-infected cells.

Homologs of structural protein genes. The XcGV genome contained homologs of 16 structural protein genes of Ac/OpMNPV including polyhedrin, orf1629-capsid, protein kinase, p10, odv-e18, -e25, -ec27, -e56, -e66, p74, p24-capsid, p6.9, vp39-capsid, p91, gp41, and vp1054, but lacked homologs of polyhedron envelope protein (pep)/calyx gene, p80/p87-capsid, protein tyrosine phosphatase (ptp), and gp64 (Table 2). On average, these homologs showed about 33% Id to Ac/OpMNPV structural proteins. The most conserved homolog of an NPV structural protein was granulin (XcGV ORF1; 54%), followed by the ODV-E66 homolog (ORF149; 43%) and the ODV-E18 homolog (ORF12; 42%) (Table 1). Although GVs produce virions that are biochemically and structurally similar to those of NPVs, several homologs including ORF1629 capsid, P10, P24 capsid, P39 capsid, ODV-EC27, and P91 showed relatively low homologies (less than 30% Id to Ac/OpMNPV structural protein). Furthermore, deletions, insertions, and truncations were observed in the amino acid sequence of these structural protein homologs.

ORF1629 capsid is believed to be a structural protein of the mature nucleocapsids (Vialard and Richardson, 1993; Russell et al., 1997). Interestingly, although poorly conserved, it appears to be an essential component of the virion and is required for virus viability (Kitts and Possee, 1993). XcGV ORF2 showed some homology to ORF1629 capsid (22.0% Id over 182 aa to AcMNPV ORF9, 24.1% Id over 162 aa to OpMNPV ORF2). The homology was concentrated around conserved proline-rich regions, but the size of the predicted protein (231 aa) was less than half of those of NPVs (AcMNPV ORF9, 543 aa; OpMNPV ORF2, 473 aa). However, it should be noted that the homolog of the ORF1629 capsid in LdMNPV also shows only 24% Id (Kuzio et al., 1999).

The P24 capsid of OpMNPV (ORF127, 192 aa) is a protein associated with both ODV and BV (Wolgamot et al., 1993), and its homolog in the XcGV genome (ORF80, 182 aa) had a similar predicted molecular mass. However, the carboxyl-terminal half of the predicted protein did not have a counterpart in the AcMNPV (ORF129) or OpMNPV (ORF127) protein.

OpMNPV P91 is a virion-associated protein linked in some manner to the capsid (Russell and Rohrmann, 1997). The homolog in BmNPV (P95) has been reported to be able to stimulate gene expression from its own promoter and the cytoplasmic actin gene promoter of B. mori (Lu et al., 1998). XcGV ORF118 showed some homology to the N-terminal half of the P91 of AcMNPV (ORF83), OpMNPV (ORF86), and BmNPV (P95). However, XcGV ORF118 contains an insertion of about 0.7 kb near its center, and the homologous region is separated by this insertion (Fig. 3). Interestingly, although NPV hrs and XcGV hrs (hrs 1–8) are usually located within intergenic regions, the short homologous sequence hr5a was found in this insertion. It should be noted that all the homologs of AcMNPV orf83 that have been reported have an hr located immediately downstream (Kuzio et al., 1999). Therefore it is possible to explain the presence of hr5a within the XcGV ORF by a localized recombination during virus evolution or a frameshift resulting in hr5a becoming part of the reading frame.

FIG. 3. Alignment of XcGV ORF118 and AcMNPV ORF83 (p91). XcGV ORF118 and AcMNPV ORF83 (P91) (Ayres et al., 1994). Arrows indicate ORFs. The shaded portions of the arrows indicate the amino acid sequence of AcMNPV ORF83 and the related regions in XcGV ORF118. The insert region containing hr5a (black box) is shown within XcGV ORF118. Regions with homology are connected by dotted lines. Amino acid sequence identities of the related regions and their range are indicated by percentages and in parentheses, respectively. The scale is on the right.

XcGV ORFs (5, 19, and 83) showed some homology to P10, which has been implicated in polyhedrin morphogenesis (Williams et al., 1989; Gross et al., 1994) and cell lysis (van Oers et al., 1993). Of the XcGV P10 homologs, XcGV ORF5 (29% Id to AcMNPV P10) is most similar to baculovirus P10 proteins (Fig. 4A). The predicted molecular mass of XcGV ORF5 (84 aa) is similar to other P10 proteins and contains a conserved heptad-repeat-like sequence in its amino-terminal half, in which the first and fourth residues are hydrophobic amino acids, as has been found in other P10 proteins (Fig. 4A). XcGV ORF5 also is predicted to contain a proline-rich domain and a positively charged carboxyl-terminal domain (Fig. 4A). XcGV ORF19 shows 22% Id over 60 aa with AcMNPV P10. It also shows 57% Id to the CpGV 17R (Kang et al., 1997). Both XcGV ORF19 and CpGV 17R are predicted to contain a proline-rich domain followed by heptad-repeat-like sequence, and several positively charged residues are present at the carboxyl-terminus of XcGV ORF19 (Fig. 4B). XcGV ORF83 shows 35% Id over 66 aa with AcMNPV

FIG. 4. XcGV P10 homologs (XcGV ORF5, 19, and 83). (A) Alignment of XcGV ORF5 with AcMNPV P10 (Ayres et al., 1994) and OpMNPV P10 (Ahrens et al., 1997). (B) Alignment of XcGV ORF19 and CpGV 17R (Kang et al., 1997). (C) Amino acid sequence of XcGV ORF83. Heptad-repeat-like amino acids are indicated by white letters within black boxes. Proline-rich domains are indicated by white letters within dark-shaded boxes. Positively charged amino acids at carboxyl-terminal domain are indicated by black letters within shaded boxes. Dashes in the amino acid sequences indicate gaps. Amino acid number is indicated on the right.

P10 and contained a heptad-repeat-like profile throughout the entire predicted amino acid sequence (Fig. 4C).

The most striking difference between the XcGV and the Ac/OpMNPV structural proteins is the lack of a baculovirus gp64 gene homolog. GP64 is involved in viral entry through the endocytic pathway at acidic pH (Blissard and Wenz, 1992; Monsma and Blissard, 1995), is essential for the progression of the infection from the midgut epithelium to a systemic disease (Monsma et al., 1996), and is involved in virion budding (Oomens and Blissard, 1999). Although the entry mechanism of XcGV is not known, the absence of a gp64 gene homolog in the XcGV genome suggests that XcGV may have another class of proteins that allows viral entry. It was recently reported that LdMNPV also lacks a gp64 gene homolog (Kuzio et al., 1999). Computer analysis of the XcGV ORFs for potential membrane proteins revealed a single ORF (ORF27) that was predicted to have both signal and transmembrane domains (data not shown). The absence of a gp64 gene homolog in the LdMNPV and XcGV genomes may suggest that the Baculoviridae can be divided into subgroups with different proteins involved in budded virus entry and exit from infected cells.

Homologs of other NPV genes. XcGV ORF144, which was predicted to encode a protein of 409 aa, showed homology to a fibroblast growth factor (fgf) homolog of AcMNPV (ORF32) and OpMNPV (ORF27). Although the homology was limited (about 25% Id) to an internal region of AcMNPV ORF32 (over 98 aa) and OpMNPV ORF27 (over 75 aa), respectively, this region includes the most highly conserved domains found in FGFs of Drosophila, human, and AcMNPV (Sutherland et al., 1996). We also observed that XcGV ORF85 showed homology (23.1% Id over 160 aa) to XcGV ORF144 within this region; however, the homology to other baculovirus FGFs was significantly lower.

A homolog of XcGV ORF43, which was predicted to encode a protein of 277 amino acids, was not found in AcMNPV or OpMNPV. However, the N-terminal 113 amino acids of the predicted protein were 51.3% identical to P13 of Leucania separata NPV (LsNPV) (Wang et al., 1995). This region also showed 34% identity to the N-terminal region of glycogenin of rabbit. Glycogenin is a self-glucosylating protein that is involved in the initiation reactions of glycogen synthesis (Campbell and Cohen, 1989). However, Tyr-197, the first covalent attachment site

TABLE 3a Comparison of Selected XcGV ORFs with Cp, Cl, and TnGV Genes

XcGV        AcMNPV   CpGV                 ClGV                 TnGV
ORF   aa    ORF      aa    %Id (range)    aa    %Id (range)    aa    %Id (range)    Name        References

1     248   8        248   87.1 (248)     248   87.5 (248)     248   99.6 (248)     granulin    Crook et al. (1997); Jehle and Backhaus (1994a); Akiyoshi et al. (1985)
11    99    145/150  49    55.6 (45)      -     -              -     -              -           Kang et al. (1998)
15    353   148      355   56.5 (352)     -     -              -     -              odv-e56     Kang et al. (1997)
16    71    29       75    30.9 (55)      -     -              -     -              15R         Kang et al. (1997)
17    187   -        235   44.9 (225)     -     -              -     -              16L-1       Kang et al. (1997)
18    153   -        235   31.2 (109)     -     -              -     -              16L-2       Kang et al. (1997)
19    386   -        299   54.7 (277)     -     -              -     -              17R         Kang et al. (1997)
35    189   6        171   40.0 (165)     -     -              -     -              lef-2       Jehle et al. (1997)
36    89    -        82    38.2 (68)      -     -              -     -              35Ra        Jehle et al. (1997)
58    346   127      333   50.0 (326)     -     -              -     -              cathepsin   Kang et al. (1998)
93    372   101      -     -              129   50.4 (117)     -     -              -           Jehle and Backhaus (1994b)
94    60    100      -     -              58    55.8 (52)      -     -              p6.9        Jehle and Backhaus (1994b)
95    245   99       -     -              240   55.2 (221)     -     -              lef-5       Jehle and Backhaus (1994b)
96    301   98       -     -              343   45.6 (298)     -     -              -           Jehle and Backhaus (1994b)
97    157   96       -     -              162   49.4 (156)     157   92.4 (157)     -           Jehle and Backhaus (1994b); Bideshi et al. (1998)
98    1159  95       -     -              -     -              1158  89.0 (1158)    helicase-1  Bideshi et al. (1998)
99    220   94       -     -              -     -              219   96.8 (220)     odv-e25     Bideshi et al. (1998)
100   122   93       -     -              -     -              68    100 (27)       -           Bideshi et al. (1998)
103   594   126      594   62.2 (542)     -     -              -     -              chitinase   Kang et al. (1998)
137   285   27       275   26.5 (260)     -     -              -     -              iap         Kang et al. (1997)
180   325   139      303   31.9 (301)     303   31.4 (303)     -     -              me53        Crook et al. (1997); Jehle and Backhaus (1994a)
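The "%Id (range)" entries in Table 3a pair a percentage of identical residues with the length of the compared region. As a rough illustration only, and not the procedure of the BLAST or DNASIS/PROSIS programs used in the paper, percent identity over a pre-aligned pair of sequences could be computed as sketched below. The function name `percent_identity`, the choice to keep single-gap columns in the denominator, and the toy sequences are all assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (% identity, compared columns) for two pre-aligned sequences.

    Hypothetical helper: columns where both sequences carry a gap are
    skipped; columns with a gap in only one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns), len(columns)


# Toy alignment with made-up residues (not real granulin sequence).
pid, span = percent_identity("MKT-LLVAG", "MKSALLV-G")
print(f"{pid:.1f}% Id ({span})")
```

Different programs treat gaps differently when reporting identity, so values computed this way need not match the table exactly.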

of glucose, was not conserved in the predicted amino acid sequence of XcGV ORF43.

XcGV lacks an egt homolog

A homolog of the NPV ecdysteroid UDP-glucosyltransferase (egt) gene (AcMNPV ORF15, OpMNPV ORF14), which suppresses host molting by catalyzing the conjugation of ecdysteroids with UDP-glucose (O’Reilly and Miller, 1989), was not found in XcGV. The absence of the egt gene in the XcGV genome was also supported by the observation of molting and in some cases additional instars occurring in XcGV-infected larvae (Goto, unpublished data).

Comparison of XcGV ORFs to those from other GVs

To date, 29 putative genes have been described from CpGV, Cryptophlebia leucotreta GV (ClGV), and TnGV. Of these homologs, the XcGV and TnGV genes were the most closely related (Table 3a). XcGV ORFs 1 (granulin) and 97–100 showed from 92 to almost 100% Id to their TnGV counterparts (Table 3a). The XcGV and Cp/ClGV genes showed only slightly more homology than the XcGV and NPV gene homologs (Table 3a). These data indicate that GVs are a highly diverse group of viruses and appear to show as much divergence among one another as they do with NPVs. This finding suggests that the radiation of many GVs and NPVs may have occurred at about the same time as the division of the Baculoviridae into the GVs and NPVs.

Homologs of CpGV genes. The XcGV genome contained three CpGV homologs (ORF16L, 17R, and 35Ra) that are not found in the Ac, Op, or LdMNPV genomes (Table 3a). The functions of these CpGV genes are unknown; however, CpGV ORF17R has been shown to be similar to baculovirus P10 as described above. Two contiguous XcGV ORFs (ORF17 and 18) showed homology to CpGV ORF16L. XcGV ORF17 and ORF18 showed 44.9% Id (over 225 aa, through nearly the entire predicted protein) and 31.2% Id (within the N-terminal 109 aa), respectively. CpGV ORF13L, 17R, 35Rb, and 36L (Kang et al., 1997; Jehle et al., 1997) were not found in the XcGV genome.

Enhancins. Enhancins (synergistic factor or viral enhancing factor) are proteins that can dramatically enhance the oral infectivity of NPV (Tanada, 1985; Derksen and Granados, 1988; Goto, 1990). Genes encoding enhancins have been identified from several GVs including TnGV (Hashimoto et al., 1991), Pseudaletia unipuncta GV (PuGV), and Helicoverpa armigera GV (HaGV) (Roelvink et al., 1995). In addition, two enhancin genes have been found in the genome of the LdMNPV (Kuzio et al., 1999). The XcGV genome sequence revealed the presence of

TABLE 3b Relatedness of Baculovirus Enhancins^a

                    XcGV        XcGV        XcGV        XcGV                                            LdMNPV      LdMNPV
                    enhancin 1  enhancin 2  enhancin 3  enhancin 4  HaGV        PuGV        TnGV        enhancin 1  enhancin 2

XcGV enhancin 1     -           29.9 (676)  32.6 (525)  21.3 (520)  32.3 (524)  33.3 (492)  28.1 (562)  26.0 (516)  28.1 (501)
XcGV enhancin 2     -           -           32.6 (868)  27.7 (876)  33.1 (874)  34.0 (873)  33.7 (878)  32.0 (747)  28.2 (745)
XcGV enhancin 3     -           -           -           28.9 (887)  86.5 (902)  80.3 (902)  79.6 (902)  29.0 (776)  26.0 (745)
XcGV enhancin 4     -           -           -           -           28.1 (885)  29.5 (885)  29.4 (885)  26.3 (754)  23.5 (740)
HaGV                -           -           -           -           -           80.1 (906)  79.5 (906)  30.1 (760)  27.2 (784)
PuGV                -           -           -           -           -           -           98.3 (901)  30.2 (748)  26.2 (780)
TnGV                -           -           -           -           -           -           -           29.8 (779)  25.9 (780)
LdMNPV enhancin 1   -           -           -           -           -           -           -           -           30.3 (790)
LdMNPV enhancin 2   -           -           -           -           -           -           -           -           -

a The percentage amino acid sequence identity comparing one baculovirus enhancin with another is shown in each cell. The number in parentheses indicates the range of amino acid sequence showing homology. The predicted amino acid sequences of baculovirus enhancins are from HaGV, PuGV (Roelvink et al., 1995), TnGV (Hashimoto et al., 1991), and LdMNPV (Kuzio et al., 1999).

four enhancin genes (XcGV orf150, 152, 154, and 166) (Tables 1 and 3b). These putative enhancins were tentatively named enhancin-1 through -4. XcGV enhancin-3 showed the highest homology (about 80% Id) to the enhancins of TnGV (Hashimoto et al., 1991), HaGV, and PuGV (Roelvink et al., 1995). The other putative enhancins (XcGV enhancin-1, 2, and 4) showed 21–34% Id to one another and to enhancins of other viruses (Table 3b). The XcGV enhancins ranged in predicted size from 824 to 898 amino acids (Table 1). One or two consensus baculovirus late gene promoter motifs [(A/T/G)TAAG] were found immediately upstream of these four putative enhancin genes (Table 1), suggesting that they are expressed late in infection. It is not clear why XcGV maintains four putative enhancins. They may cooperate for the enhancement of NPV infection in some manner, and/or each XcGV putative enhancin may facilitate infection of different hosts, or they may have another role in the infection cycle.

XcGV genes with homologs in other organisms

Metalloproteinase. As described previously (Goto et al., 1998), the predicted amino acid sequence of XcGV ORF40 showed significant homology to a family of matrix metalloproteinases such as the 92-kDa collagenase IV (Wilhelm et al., 1989) within a consensus region containing a zinc-binding domain. Collagenase IV digests type IV collagen, which is a major component of the basement membrane (Liotta et al., 1980, 1981), and plays an important role in the degradation of the basement membrane during the first step of the metastatic cascade (Liotta et al., 1980, 1981; Salo et al., 1981; Brooks et al., 1996).

DNA ligase. XcGV ORF141 was predicted to encode a protein of 527 amino acids and showed homology to DNA ligases, especially to the ligases of LdMNPV (ORF22) (Pearson and Rohrmann, 1998) (23.0% Id over 487 aa) and poxviruses (about 28% Id over about 250 aa). Significant homologies were observed within the consensus regions (Shuman and Schwer, 1995) commonly found in a superfamily of covalent nucleotidyl transferases, which include DNA/RNA ligases and RNA capping enzymes (data not shown). Although the LdMNPV ligase gene was shown to encode an active ligase, it did not appear to stimulate transient DNA replication (Pearson and Rohrmann, 1998). The ligase encoded by poxviruses is nonessential for virus replication and is believed to be a member of the ligase III family. DNA ligase III functions in the repair of single-strand DNA breaks that arise either by the direct action of a DNA-damaging agent such as ionizing radiation or as a consequence of DNA repair enzymes excising lesions (Caldecott et al., 1994, 1996; Ljungquist et al., 1994; Thompson et al., 1982, 1990). Based on its homology with other known ligases, the putative ligase of XcGV (XcGV ORF141) may also be a member of the DNA ligase III group.

A second helicase homolog. The XcGV genome contains two DNA helicase homologs, ORF98 and 146. XcGV ORF98 was predicted to encode a protein of 1159 amino acids and was homologous (26.1% Id over 1197 aa and 23.1% Id over 1197 aa) to the putative DNA helicases of AcMNPV (ORF95) and OpMNPV (ORF96), respectively (Table 1). On the other hand, XcGV ORF146 was predicted to encode a protein of 455 amino acids. XcGV ORF146 showed significant homology (52.2% Id over 414

aa) to an LdMNPV helicase homolog (ORF50, helicase-2) (Kuzio et al., 1999) and to a yeast mitochondrial helicase, pif1 (Foury and Lahaye, 1987). Seven consensus DNA helicase motifs (Hodgman, 1988) are especially well conserved in XcGV ORF146 (data not shown). In this paper, XcGV ORF98 was tentatively named helicase-1 and XcGV ORF146 was named helicase-2. In a transient replication assay, the LdMNPV helicase-2 gene was not stimulatory and was unable to substitute for helicase-1, which is essential in this assay (Pearson and Rohrmann, 1998). Although the roles that the XcGV putative ligase and helicase-2 play in viral replication are unknown, based on their homology to genes from other organisms, they may be involved in DNA recombination or repair systems.

FIG. 5. Relationships of baculovirus repeated ORFs (bros). AcMNPV, BmNPV, and XcGV bro alignments were carried out using the program Macaw with guidance from Gap BLAST pairwise alignments. The peptide sequences are represented as rectangles with the expanded rectangles of the same color corresponding to blocks of high sequence similarity (P < 10^-8). The numbers on the right are the sizes of the ORFs in amino acids.

Repeated genes

In addition to the four copies of the enhancin gene described above (Table 3b) and three apparent variants of p10 genes (Fig. 4), there are also a number of other gene families present in XcGV. The XcGV genome contained seven ORFs that show homology to AcMNPV ORF2 (Table 1, Fig. 1). In the BmNPV genome (Gomi et al., 1999), there are five homologs of AcMNPV ORF2; in OpMNPV, two regions (OpMNPV ORF67/68 and ORF166) are related to AcMNPV ORF2 (Ahrens et al., 1997); and in AcMNPV, there is only one (Ayres et al., 1994). In addition, the LdMNPV genome contains 16 copies of variants of this gene (Kuzio et al., 1999). Although the function of these genes is unknown, it has been proposed to call them baculovirus repeated ORFs (bro) (Kuzio et al., 1999). Thus, the seven AcMNPV ORF2 homologs of XcGV were named as follows: XcGV ORF60, bro-a; ORF76, bro-b; ORF109, bro-c; ORF114, bro-d; ORF130, bro-e; ORF131, bro-f; and ORF159, bro-g. An analysis of the relatedness of bro genes from AcMNPV, BmNPV, and XcGV is shown in Fig. 5. The XcGV bro genes were roughly subdivided into two groups, i.e., bro genes showing homology to the N-terminal half of AcMNPV ORF2, including XcGV bro-a, -c, -e, -f, and -g, and bro genes showing homology to the middle or C-terminal part of AcMNPV ORF2, including XcGV bro-b and -d. The N-terminal regions of the bro genes belonging to the first group were related to genes

of unknown function from several viruses including ORF266 of bacteriophage BK5-T (Boyce et al., 1995), ORF5 of bacteriophage r1t (van Sinderen et al., 1996), ORFA of bacteriophage A2 (GenBank #Y12813), and HI1418 of Haemophilus influenzae Rd (Fleischmann et al., 1995). XcGV bro-c and -g, which were relatively homologous to BmNPV bro-a (BmNPV ORF22) and -c (BmNPV ORF81), shared some homology with OpMNPV ORF67 (XcGV bro-c, 33.3% Id over 67 aa; XcGV bro-g, 46.8% Id over 47 aa). However, the other XcGV bro genes did not show any significant similarity to the OpMNPV bro genes. Interestingly, the N-terminal halves of XcGV bro-a and -f showed about 90% nucleotide sequence identity to one another and some sequence similarity to the ihs sequence of TnGV (Hashimoto et al., 1996).

FIG. 6. Alignment of XcGV homologs of AcMNPV ORF145/150. Four XcGV ORFs (ORF11, 20, 87, and 105) related to AcMNPV ORF145 and 150 (Ayres et al., 1994) and OpMNPV ORF142 (Ahrens et al., 1997) are shown. Identical amino acids among more than four sequences are indicated within shaded boxes. The asterisks above the alignment indicate conserved cysteine residues. Dashes in the amino acid sequences indicate gaps. Amino acid number is indicated on the right.

AcMNPV ORF145 and ORF150 are predicted to encode proteins of 77 and 99 amino acids, respectively, which share 35.9% Id over 39 aa. In the OpMNPV genome, only one ORF (OpMNPV ORF142) that shares homology with AcMNPV ORF145/150 was found. In contrast, the XcGV genome contained four ORFs (XcGV ORF11, 20, 87, and 105), which showed 28–45% amino acid sequence identities to AcMNPV ORF145/150 (Table 1, Fig. 6). The sizes of the predicted proteins encoded by these XcGV ORFs were similar to those from NPVs except for XcGV ORF87. The XcGV proteins showed about 22–44% amino acid sequence identity to one another. Although their biological functions are unknown, relatively well-conserved cysteine residues were found in the sequence alignment of these putative genes (Fig. 6).

The predicted amino acid sequences of five XcGV ORFs (XcGV ORF22, 61, 73, 155, and 161) were about 20–43% identical to one another, but did not show any convincing homology to sequences in GenBank. The alignment of the predicted amino acid sequences of these ORFs showed relatively conserved cysteine residues at their N-terminal half (Fig. 7). XcGV ORF59 and 138 also shared some homology to each other (25.5% Id over 690 aa).

Genome organization

Genome content. The XcGV genome (about 179 kb) is about 45–50 kb larger than Ac/OpMNPV and BmNPV and 18 kb larger than LdMNPV. To understand the sources of the DNA contributing to this large size of the XcGV genome, the contributions of different categories of XcGV ORFs were calculated. The XcGV genome contained 73 nonrepeated ORFs that were homologs of AcMNPV genes. These ORFs account for about 68.5 kb or 38% of the genome (Fig. 8). Similarly, the homologs of ORFs found in baculoviruses other than AcMNPV and/or other organisms contribute about 6.5 kb or 4% of the genome. XcGV contains 78 nonrepeated ORFs that had no homologs in the database. These ORFs (unique genes) contribute about 52.5 kb or 29% of the genome (Fig. 8). Since very little sequence data are available for other GV genes, the percentage of unique genes of XcGV should decrease as XcGV homologs are identified in other GVs. The XcGV genome also contains a number of repeated genes, such as three variants of p10 genes, seven bro genes, four variants of AcMNPV ORF145/150, four enhancins, and two sets of XcGV repeated ORFs (one includes ORF22, 61, 73, 155, and 161, and another includes ORF59 and 138). These ORFs (repeated genes) contribute about 30.5 kb or 17% of the genome (Fig. 8). Thus, the unique genes (29%) and the repeated genes (17%) occupy a high percentage of the XcGV genome content, suggesting that the large size of the XcGV genome can be accounted for by duplication of genes and by unique genes that were acquired from other organisms or viruses or lost by the other baculoviruses.

The intergenic regions contribute about 21 kb or 12%

of the genome. About 25% of the intergenic regions (about 5 kb or 3% of the genome) are composed of hrs. The remaining intergenic regions comprise about 17 kb, 9% of the genome (Fig. 8). The hr5a that is homologous to XcGV hrs, but located within an ORF, was excluded from the XcGV hrs group. The sizes of the non-hrs intergenic regions of LdMNPV (about 17.5 kb), AcMNPV (about 8.3 kb), and OpMNPV (about 12.4 kb) were recently calculated by Kuzio et al. (1999). Among the XcGV and these NPV genomes, the average size of intergenic regions for one gene (non-hrs intergenic/the number of orfs) is relatively similar, indicating that the XcGV ORFs are also tightly packed with minimal intergenic regions as in other baculoviruses. These features may be common among members of the Baculoviridae.

FIG. 7. Alignment of XcGV ORFs 22, 61, 73, 155, and 161. Identical amino acids among more than three sequences are indicated within shaded boxes. The asterisks above the alignment indicate invariant cysteine residues. Dashes in the amino acid sequences indicate gaps. Amino acid number is indicated on the right.

Baculovirus late gene promoter motif. The pentamer motif (A/T/G)TAAG is known to act as both the promoter element and the mRNA start site of baculovirus late genes, whereas CTAAG is not believed to act as a promoter (Rankin et al., 1988; Rohrmann, 1986). To analyze the conservation and distribution of these sequences in the XcGV genome, the frequency of each variant of the

NTAAG motif was calculated and compared with its actual occurrence. In general, as observed in Ac, Ld, and OpMNPV (Kuzio et al., 1999), the frequency of the NTAAG motif was lower than would be predicted based on the genome size and G+C content (698, or 37% of the prediction) (Table 4). The occurrence of CTAAG was only 77, or 20% of the prediction, which is less than half the frequency of ATAAG (241, 43%) and GTAAG (183, 47%) (Table 4). This suggests that the CTAAG sequence is selected against.

The late gene promoter motifs [(A/T/G)TAAG] occur at relatively high frequency within sequences 120 bp upstream of ORFs. For example, the XcGV genome contains 181 ORFs. One hundred twenty base pairs upstream of each ORF comprise about 21.7 kb (about 6% of the whole genome sequence). We found 104 late gene promoter motifs (about 16% of the total late promoter motifs in the XcGV genome) in these upstream sequences. This is about three times more than would be predicted.

FIG. 8. The sources of XcGV genome content. The chart shows the relative contribution in genetic content from a variety of sources.

Gene arrangement. Comparisons of the arrangement of genes in a viral genome can be indicative of the stability of the genome and the possibility of coregulation of genes in close proximity. In our previous report (Goto et al., 1998), the relative locations of 29 partially sequenced XcGV genome regions were compared to those in the AcMNPV genome, and 23 of 29 AcMNPV gene homologs appeared to be conserved in position in the XcGV genome. However, the nucleotide sequence of the complete genome indicates that gene arrangement between the XcGV and the Ac/OpMNPV genomes was not well conserved. Similar observations were made when comparing XcGV with LdMNPV. This indicates that XcGV is evolutionarily and genetically distant from these other baculoviruses.

Four small regions showing relatively conserved gene arrangement with Ac/OpMNPV were found in the XcGV genome. These regions contained XcGV ORFs 1–3 (region 1), ORFs 9–13 (region 2), ORFs 91–101 (region 3), and ORFs 118–126 (region 4), which correspond to AcMNPV ORFs 8–9, ORFs 147–142, ORFs 103–92, and ORFs 83–75, respectively (Table 1). Although these regions showed conservation of gene arrangement, the deduced amino acid sequences of the ORFs within these regions did not show significantly higher homologies than the amino acid sequences of ORFs found in other regions of the genome. Furthermore, even the order of ORFs within these regions was not precisely conserved with Ac/OpMNPV. For example, in region 2, XcGV ORF112, the homolog of AcMNPV ORF144 (odv-ec27, OpMNPV ORF141), was translocated to a region that was about 96 kb away. Similarly, an AcMNPV ORF97 (OpMNPV ORF98) homolog was absent in region 3. In region 4, the AcMNPV ORF79 homolog, XcGV ORF75, appeared to be translocated to a region 45 kb upstream. In its place was found XcGV ORF124, an ORF with unknown function. The gene arrangement within all four of these regions was well conserved in the LdMNPV genome (Kuzio et al., 1999).

The complete sequence of the XcGV genome revealed many differences in the structure of hrs and gene composition compared to Ac/OpMNPV. Although the actual functions of most of the XcGV genes remain to be elucidated, it may be reasonable to explain many of the morphological and biological differences between GVs and NPVs by the differences in gene composition in the

TABLE 4

Frequency of Late Gene Promoter Motifs: Actual (and Predicted^a)

          ATAAG            TTAAG            GTAAG            CTAAG            Total (NTAAG)

XcGV      241 (562) 43%    197 (562) 35%    183 (386) 47%    77 (385) 20%     698 (1895) 37%
AcMNPV    132 (421) 31%    125 (421) 30%    85 (289) 29%     55 (289) 19%     397 (1420) 28%
LdMNPV    114 (190) 60%    62 (190) 33%     77 (256) 30%     30 (256) 12%     283 (892) 32%
OpMNPV    92 (185) 50%     106 (184) 58%    71 (226) 31%     33 (226) 15%     302 (821) 37%
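The comparison of actual and predicted motif counts in Table 4 can be illustrated with a short sketch. This is not the authors' program: the function names are hypothetical, the input below is a toy string rather than the XcGV sequence, and scanning both strands is an assumption. A simple analytic estimate, which assumes independent bases with A = T and G = C, is included alongside the shuffling approach; for a 178,733-base genome with 40.7% G+C it comes out close to the 562 predicted for ATAAG in Table 4.

```python
import random

MOTIFS = ("ATAAG", "TTAAG", "GTAAG", "CTAAG")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_motifs(seq):
    """Count each NTAAG variant on both strands (assumption: both are scanned)."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return {m: count_overlapping(seq, m) + count_overlapping(rc, m) for m in MOTIFS}


def shuffled_expectation(seq, n_shuffles=100, seed=0):
    """Mean motif counts over randomly shuffled copies of seq."""
    rng = random.Random(seed)
    bases = list(seq.upper())
    totals = dict.fromkeys(MOTIFS, 0)
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        for motif, n in count_motifs("".join(bases)).items():
            totals[motif] += n
    return {m: totals[m] / n_shuffles for m in MOTIFS}


def analytic_expectation(length, gc_fraction, motif):
    """Two-strand expected count, assuming independent bases with A=T and G=C."""
    p = {"A": (1 - gc_fraction) / 2, "T": (1 - gc_fraction) / 2,
         "G": gc_fraction / 2, "C": gc_fraction / 2}
    prob = 1.0
    for base in motif:
        prob *= p[base]
    return 2 * (length - len(motif) + 1) * prob


if __name__ == "__main__":
    toy = "ATAAG" * 50 + "C" * 50  # toy sequence, not viral data
    print("observed:", count_motifs(toy))
    print("shuffled:", shuffled_expectation(toy, n_shuffles=20))
    print("analytic ATAAG:", analytic_expectation(178_733, 0.407, "ATAAG"))
```

The default of 100 shuffles is only there to keep the example fast; the table footnote reports 8192 shuffled genome sequences.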

a Predicted NTAAG sequence was calculated from 8192 randomly shuffled genome sequences for XcGV as described by Kuzio et al. (1999). The values for Ac, Ld, and OpMNPV are from Kuzio et al. (1999). The predicted frequency is preceded by the actual frequency and followed by the percentage frequency.

genome. When additional GV genome sequence becomes available, comparison with the XcGV genome will contribute to a more detailed understanding of baculovirus evolution.

MATERIALS AND METHODS

Source and cloning of DNA

The construction of a Lambda FIX II and M13 phage library covering the entire XcGV genome has been previously described (Goto et al., 1998). DNA fragments for sequencing were, in general, derived from these Lambda FIX II phage clones. The DNA fragments were subcloned into pTZ19R (Pharmacia LKB) and propagated in Escherichia coli DH5α using standard methods (Sambrook et al., 1989). In a few regions, XcGV DNA fragments amplified by polymerase chain reaction (PCR) were also used as the template for the sequencing reaction, for example, the plasmid subclone containing the region around 161,800–162,900 nt of XcGV. This region contained one of the XcGV repeated sequences (irs7) and appeared to be very unstable when cloned into bacterial cells. To solve this problem, the region was amplified by PCR using a set of synthetic oligonucleotides (5′-TAACAAAACTACAAAACACACG-3′, 5′-TTTGATAACAAATATTTACCAC-3′) that flanked the repeated sequence. DNA was amplified using Pfu DNA polymerase (Stratagene) according to the manufacturer’s instructions. The amplified fragment was separated on a 0.7% TAE agarose gel and purified using a GeneClean kit (Bio 101, Inc).

DNA sequencing and analysis

In general, nucleotide sequencing was performed by the dideoxy chain termination method of Sanger et al. (1977) using a T7 DNA polymerase (USB, Amersham) and [α-35S]dATP (Amersham). A nucleotide analogue, dITP (USB, Amersham), was used in the reaction to resolve regions that had a high GC content. In some regions, cycle sequencing was also performed using a SequiTherm EXCEL DNA sequencing Kit with 7-deaza-dGTP (Epicentre Technologies) according to the manufacturer’s protocol. In other regions (ca. 60 kb), the sequence was determined using an ABI 377 automated DNA sequencer (Applied Biosystems, Inc.). The sequence reactions for the automated DNA sequencer were performed using a Taq DyeDeoxy Terminator Cycle Sequencing kit (Applied Biosystems, Inc.) according to the manufacturer’s protocol. A Peltier PTC-2000 Thermal Cycler (MJ Research) was used for this reaction. pUC primers and oligonucleotides were used as primers for sequencing.

Nucleotide sequencing analysis and homology searches were aided by the basic local alignment search tool (BLAST) network service (Altschul et al., 1990) at the National Center for Biotechnology Information and DNASIS/PROSIS programs (Hitachi America).

Nucleotide sequence accession number

The nucleotide sequence data reported in this paper will appear in the GSDB, DDBJ, EMBL, and NCBI nucleotide sequence databases under Accession No. AF162221.

ACKNOWLEDGMENTS

This paper is dedicated to the memory of Susumu Maeda. The authors thank George F. Rohrmann, John Kuzio, and Shizuo G. Kamita for assistance with sequence analysis and suggestions. This research was partially supported by USDA (9802852), Core Research for Evolutional Science and Technology (CREST) Project, JST, Japan and Special Postdoctoral Researchers Program, RIKEN.

REFERENCES

Ahrens, C. H., Pearson, M. N., and Rohrmann, G. F. (1995). Identification and characterization of a second putative origin of DNA replication in a baculovirus of Orgyia pseudotsugata. Virology 207, 572–576.
Ahrens, C. H., Russell, R. L., Funk, C. J., Evans, J. T., Harwood, S. H., and Rohrmann, G. F. (1997). The sequence of the Orgyia pseudotsugata multinucleocapsid nuclear polyhedrosis virus genome. Virology 229, 381–399.
Akiyoshi, D., Chakerian, R., Rohrmann, G. F., Nesson, M. H., and Beaudreau, G. S. (1985). Cloning and sequencing of the granulin gene from the Trichoplusia ni granulosis virus. Virology 141, 328–332.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Ayres, M. D., Howard, S. C., Kuzio, J., Lopez-Ferber, M., and Possee, R. D. (1994). The complete DNA sequence of Autographa californica nuclear polyhedrosis virus. Virology 202, 586–605.
Bideshi, D. K., Hice, R. H., Ge, B., and Federici, B. A. (1998). Molecular characterization and expression of the Trichoplusia ni granulovirus helicase gene. J. Gen. Virol. 79, 1309–1319.
Birnbaum, M. J., Clem, R. J., and Miller, L. K. (1994). An apoptosis-inhibiting gene from a nuclear polyhedrosis virus encoding a polypeptide with Cys/His sequence motifs. J. Virol. 68, 2521–2528.
Blissard, G. W., and Wenz, J. R. (1992). Baculovirus gp64 envelope glycoprotein is sufficient to mediate pH-dependent membrane fusion. J. Virol. 66, 6829–6835.
Boyce, J. D., Davidson, B. E., and Hillier, A. J. (1995). Sequence analysis of the Lactococcus lactis temperate bacteriophage BK5-T and demonstration that the phage DNA has cohesive ends. Appl. Environ. Microbiol. 61, 4089–4098.
Braunagel, S. C., He, H., Ramamurthy, P., and Summers, M. D. (1996). Transcription, translation, and cellular localization of three Autographa californica nuclear polyhedrosis virus structural proteins: ODV-E18, ODV-E35, and ODV-EC27. Virology 222, 100–114.
Brooks, P. C., Stromblad, S., Sanders, L. C., vonSchalscha, T. L., Aimes, R. T., Stetler-Stevenson, W. G., Quigley, J. P., and Cheresh, D. A. (1996). Localization of matrix metalloproteinase MMP-2 to the surface of invasive cells by interaction with integrin alpha v beta 3. Cell 85, 683–693.
Bump, N. J., Hackett, M., Hugunin, M., Seshagiri, S., Brady, K., Chen, P., Ferenz, C., Franklin, S., Ghayur, T., Li, P., Mankovich, J., Shi, L., Greenburg, A. H., Miller, L. K., and Wong, W. W. (1995). Inhibition of ICE family proteases by baculovirus antiapoptotic protein p35. Science 269, 1885–1888.
Caldecott, K. W., McKeown, C. K., Tucker, J. D., Ljungquist, S., and Thompson, L. H. (1994). An interaction between the mammalian DNA repair protein XRCC1 and DNA ligase III. Mol. Cell. Biol. 14, 68–76.
Caldecott, K. W., Aoufouchi, S., Johnson, P., and Shall, S. (1996). XRCC1 polypeptide interacts with DNA polymerase beta and possibly poly

(ADP-ribose) polymerase, and DNA ligase III is a novel molecular Analysis of their relative expression levels and role in polyhedron ‘nick-sensor’ in vitro. Nucleic Acids Res. 24, 4387–4394. structure. J. Gen. Virol. 75, 1115–1123. Campbell, D. G., and Cohen, P. (1989). The amino acid sequence of Gross, C. H., and Shuman, S. (1998). RNA 5Ј-triphosphatase, nucleotide rabbit skeletal muscle glycogenin. Eur. J. Biochem. 185, 119–125. triphosphatase, and guanylyltransferase activities of baculovirus Clem, R. J., Fechheimer, M., and Miller, L. K. (1991). Prevention of LEF-4 protein. J. Virol. 72, 10020–10028. apoptosis by a baculovirus gene during infection of insect cells. Guarino, L. A., and Summers, M. D. (1986). Functional mapping of a Science 254, 1388–1390. trans-activating gene required for expression of a baculovirus de- Cochran, M. A., and Faulkner, P. (1983). Location of homologous DNA layed-early gene. J. Virol. 57, 563–571. sequences interspersed at five regions in the baculovirus AcMNPV Guarino, L. A., Jin, J., and Dong, W. (1998a). Guanylyltransferase activity genome. J. Virol. 45, 961–970. of the LEF-4 subunit of baculovirus RNA polymerase. J. Virol. 72, Crook, N. E., Clem, R. J., and Miller, L. K. (1993). An apoptosis-inhibiting 10003–10010. baculovirus gene with a zinc finger-like motif. J. Virol. 67, 2168–2174. Guarino, L. A., Xu, B., Jin, J., and Dong, W. (1998b). A virus-encoded RNA Crook, N. E., James, J. D., Smith, I. R., and Winstanley, D. (1997). polymerase purified from baculovirus-infected cells. J. Virol. 72, Comprehensive physical map of the Cydia pomonella granulovirus 7985–7991. genome and sequence analysis of the granulin gene region. J. Gen. Hang, X., Dong, W., and Guarino, L. A. (1995). The lef-3 gene of Auto- Virol. 78, 965–974. grapha californica nuclear polyhedrosis virus encode a single- Derksen, A. C., and Granados, R. R. (1988). Alteration of a lepidopteran stranded DNA-binding protein. J. Virol. 69, 3924–3928. 
peritrophic membrane by baculoviruses and enhancement of viral Harwood, S. H., Li, L., Ho, P. S., Preston, A. K., and Rohrmann, G. F. infectivity. Virology 167, 242–250. (1998). AcMNPV late expression factor-5 interacts with itself and Evans, J. T., and Rohrmann, G. F. (1997). The baculovirus single- contains a zinc ribbon domain that is required for maximal late stranded DNA binding protein, LEF-3, forms a homotrimer in solution. transcription activity and is homologous to elongation factor TFIIS. J. Virol. 71, 3574–3579. Virology 250, 118–134. Evans, J. T., Rosenblatt, G. S., Leisy, D. J., and Rohrmann, G. F. (1999). Hashimoto, Y., Corsaro, B. G., and Granados, R. R. (1991). Location and Characterization of the interaction between the baculovirus ssDNA- nucleotide sequence of the gene encoding the viral enhancing factor binding protein (LEF-3) and putative helicase (P143). J. Gen. Virol. 80, of the Trichoplusia ni granulosis virus. J. Gen. Virol. 72, 2645–2651. 493–500. Hashimoto, Y., Hayashi, K., Okuno, Y., Hayakawa, T., Saimoto, A., Gra- Federici, B. A. (1997). Baculovirus pathogenesis. In “The Baculovirus” nados, R. R., and Matsumoto, T. (1996). Physical mapping and iden- (L. K. Miller, Ed), pp. 33–59. Plenum, New York. tification of interspersed homologous sequences in the Trichoplusia Fleischmann, R. D., Adams, M. D., White, O., Clayton, R. A., Kirkness, ni granulosis virus genome. J. Gen. Virol. 77, 555–563. E. F., Kerlavage, A. R., Bult, C. J., Tomb, J.-F., Dougherty, B. A., Merrick, Hill, D. S. (1987). “Agricultural Insect Pests of Temperate Regions and J. M., Mckenney, K., Sutton, G., Fitzhugh, W., Fields, C. A., Gocayne, Their Control,” p. 443. Cambridge Univ. Press, Cambridge, UK. J. D., Scott, J. D., Shirley, R., Liu, L.-I., Glodek, A., Kelley, J. M., Hodgman, T. C. (1988). A new superfamily of replicative proteins. Weidman, J. F., Phillips, C. A., Spriggs, T., Hedblom, E., Cotton, M. D., Nature 333, 22–23. Utterback, T. R., Hanna, M. C., Nguyen, D. 
T., Saudek, D. M., Brandon, Jehle, J. A., and Backhaus, H. (1994a). The granulin gene region of R. C., Fine, L. D., Fritchman, J. L., Fuhrmann, J. L., Geoghagen, Cryptophlebia leucotreta granulosis virus: Sequence analysis and N. S. M., Gnehm, C. L., Mcdonald, L. A., Small, K. V., Fraser, C. M., phylogenetic considerations. J. Gen. Virol. 75, 3667–3671. Smith, H. O., and Venter, J. C. (1995). Whole-genome random se- Jehle, J. A., and Backhaus, H. (1994b). Genome organization of the quencing and assembly of Haemophilus influenzae Rd. Science 269, DNA-binding protein gene region of Cryptophlebia leucotreta gran- 496–512. ulosis virus is closely related to that of nuclear polyhedrosis viruses. Foury, F., and Lahaye, A. (1987). Cloning and sequencing of the PIF J. Gen. Virol. 75, 1815–1820. gene involved in repair and recombination of yeast mitochondrial Jehle, J. A., van der Linden, I. F., Backhaus, H., and Vlak, J. M. (1997). DNA. EMBO J. 6, 1441–1449. Identification and sequence analysis of the integration site of trans- Garcia-Maruniak, A., Pavan, O. H., and Maruniak, J. E. (1996). A variable poson TCp3.2 in the genome of Cydia pomonella granulovirus. Virus region of Anticarsia gemmatalis nuclear polyhedrosis virus contains Res. 50, 151–157. tandemly repeated DNA sequences. Virus Res. 41, 123–132. Kang, W., Crook, N. E., Winstanley, D., and O’Reilly, D. R. (1997). Com- Gomi, S., Majima, K., and Maeda, S. (1999). Sequence analysis of the plete sequence and transposon mutagenesis of the BamHI J frag- genome of Bombyx mori nucleopolyhedrovirus. J. Gen. Virol. 80, ment of Cydia pomonella granulosis virus. Virus Genes 14, 131–136. 1323–1337. Kang, W., Tristem, M., Maeda, S., Crook, N. E., and O’Reilly, D. R. (1998). Goto, C. (1990). Enhancement of a nuclear polyhedrosis virus (NPV) Identification and characterization of the Cydia pomonella granulo- infection by a granulosis virus (GV) isolated from the spotted cut- virus cathepsin and chitinase genes. J. Gen. Virol. 
79, 2283–2292. worm, Xestia c-nigrum L. (Lepidoptera: Noctuidae). Appl. Entomol. Kitts, P. A., and Possee, R. D. (1993). A method for producing recombi- Zool. 25, 135–137. nant baculovirus expression vectors at high frequency. BioTech- Goto, C., Tsutsui, H., Honma, K., Iizuka, T., and Nakajima, T. (1985). niques 14, 810–817. Studies on nuclear polyhedrosis and granulosis virus of the spotted Kool, M., Ahrens, C. H., Goldbach, R. W., and Rohrmann, G. F. (1994). cutworm, Xestia c-nigrum L. Appl. Entomol. Zool. 29, 102–106. Identification of genes involved in DNA replication of the Autographa Goto, C., Minobe, Y., and Iizuka, T. (1992). Restriction endonuclease californica baculovirus. Proc. Natl. Acad. Sci. USA 91, 11212. analysis and mapping of the genomes of granulosis viruses isolated Kool, M., Ahrens, C. H., Vlak, J. M., and Rohrmann, G. F. (1995). Repli- from Xestia c-nigrum and five other noctuid species. J. Gen. Virol. 73, cation of baculovirus DNA. J. Gen. Virol. 76, 2103–2118. 1491–1497. Kovacs, G. R., Choi, J., Guarino, L. A., and Summers, M. D. (1992). Goto, C., Hayakawa, T., and Maeda, S. (1998). Genome organization of Functional dissection of the Autographa californica nuclear polyhe- Xestia c-nigrum granulovirus. Virus Genes 16, 199–210. drosis virus immediate-early 1 transcriptional regulatory protein. J. Vi- Griffiths, C. M., Barnett, A. L., Ayres, M. D., Windass, J., King, L. A., and rol. 66, 7429–7437. Possee, R. D. (1999). In vitro host range of Autographa californica Kuzio, J., Pearson, M. N., Harwood, S. H., Funk, C. J., Evans, J. T., nucleopolyhedrovirus recombinants lacking functional p35, iap1 or Slavicek, J. M., and Rohrmann, G. F. (1999). Sequence and analysis of iap2. J. Gen. Virol. 80, 1055–1066. the genome of a baculovirus pathogenic for Lymantria dispar. Virol- Gross, C. H., Russell, R. L., and Rohrmann, G. F. (1994). Orgyia pseudo- ogy 253, 17–34. tsugata baculovirus p10 and polyhedron envelope protein genes: Li, Y., Passarelli, A. 
L., and Miller, L. K. (1993). Identification, sequence, 296 HAYAKAWA ET AL.

and transcriptional mapping of lef-3, a baculovirus gene involved in passing the transcriptional start point are the major determinant for late and very late gene expression. J. Virol. 67, 5260–5268. baculovirus polyhedrin gene expression. Gene 70, 39–49. Liotta, L., Tryggvason, K., Garbisa, S., Hart, I., Folts, C., and Shafile, S. Rapp, J. C., Wilson, J. A., and Miller, L. M. (1998). Nineteen baculovirus (1980). Metastatic potential correlates with enzymatic degradation of open reading frames, including LEF-12, support late gene expres- basement membrane collagen. Nature 284, 67–68. sion. J. Virol. 72, 10197–10206. Liotta, L., Tryggvason, K., Garbisa, S., Rohey, P. G., and Abe, S. (1981). Raynes, D. A., Hartshorne, D. J., and Guerriero, V., Jr. (1994). Sequence Partial purification and characterization of a neutral protease which and expression of a baculovirus protein with antigenic similarity to cleaves type IV collagen. Biochemistry 20, 100–108. telokin. J. Gen. Virol. 75, 1807–1809. Ljungquist, S., Kenne, K., Olsson, L., and Sandstrom, M. (1994). Altered Roelvink, P. W., Corsaro, B. G., and Granados, R. R. (1995). Character- DNA ligase III activity in the CHO EM9 mutant. Mutat. Res. 314, ization of the Helicoverpa armigera and Pseudaletia unipuncta 177–186. granulovirus enhancin genes. J. Gen. Virol. 76, 2693–2705. Lu, A., and Miller, L. K. (1994). Identification of three late expression Rohrmann, G. F. (1986). Polyhedrin structure. J. Gen. Virol. 67, 1499– factor genes within the 33.8- to 43.4-map-unit region of Autographa 1513. californica nuclear polyhedrosis virus. J. Virol. 68, 6710–6718. Russell, R. L. Q., Funk, C. J., and Rohrmann, G. F. (1997). Association of Lu, A., and Miller, L. K. (1995a). Differential requirements for baculovirus a baculovirus-encoded protein with the capsid basal region. Virology late expression factor genes in two cell lines. J. Virol. 69, 6265–6272. 227, 142–152. Lu, A., and Miller, L. K. (1995b). 
The roles of eighteen baculovirus late Russell, R. L. Q., and Rohrmann, G. F. (1997). Characterization of P91, a expression factor genes in transcription and DNA replication. J. Virol. protein associated with virions of an Orgyia pseudotsugata baculo- 69, 975–982. virus. Virology 233, 210–223. Lu, M., Swevers, L., and Iatrou, K. (1998). The p95 gene of Bombyx mori Salo, T., Liotta, L., and Tryggvason, K. (1981). Purification and charac- nuclear polyhedrosis virus: Temporal expression and functional terization of a murin basement collagen-degrading enzyme secreted properties. J. Virol. 72, 4789–4797. by metastatic tumor cells. J. Biol. Chem. 258, 3058–3063. Majima, K., Kobara, R., and Maeda, S. (1993). Divergence and evolution Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). “Molecular Cloning: of homologous regions of Bombyx mori nuclear polyhedrosis virus. A Laboratory Manual,” Cold Spring Harbor Laboratory Press, Cold J. Virol. 67, 7513–7521. Spring Harbor, NY. Mikhailov, V. S., Mikhailova, A. L., Iwanaga, M., Gomi, S., and Maeda, S. Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with (1998). Bombyx mori nucleopolyhedrovirus encode a DNA-binding chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463– protein capable of destabilizing duplex DNA. J. Virol. 72, 3107–3116. 5467. Monsma, S. A., and Blissard, G. W. (1995). Identification of a membrane Shuman, S., and Schwer, B. (1995). RNA capping enzyme and DNA fusion domain and an oligomerization domain in the baculovirus ligase: A superfamily of covalent nucleotidyl transferases. Mol. Mi- GP64 envelope fusion protein. J. Virol. 69, 2583–2595. crobiol. 17, 405–410. Monsma, S. A., Oomens, A. G., and Blissard, G. W. (1996). The GP64 Sutherland, D., Samakovlis, C., and Krasnow, M. A. (1996). Branchless envelope fusion protein is an essential baculovirus protein required encodes a Drosophila FGF homolog that controls tracheal cell mi- for cell-to-cell transmission of infection. J. 
Virol. 70, 4607–4616. gration and the pattern of branching. Cell 87, 1091–1101. Morris, T. D., Todd, J. W., Fisher, B., and Miller, L. K. (1994). Identification Tanada, Y. (1985). A synopsis of studies on the synergistic property of of lef-7: A baculovirus gene affecting late gene expression. Virology an insect baculovirus: A tribute to Edward A. Stainhous. J. Invertebr. 200, 360–369. Pathol. 45, 125–138. Olszewski, J., and Miller, L. K. (1997). Identification and characterization Theilmann, D. A., and Stewart, S. (1992). Tandemly repeated sequence of a baculovirus structural protein, VP1054, required for nucleocapsid Ј formation. J. Virol. 71, 5040–5050. at the 3 end of the IE-2 gene of the baculovirus Orgyia pseudo- Oomens, A. G. P., and Blissard, G. W. (1999). Requirement for GP64 to tsugata multicapsid nuclear polyhedrosis virus is an enhancer ele- drive efficient budding of Autographa californica multicapsid nucle- ment. Virology 187, 97–106. opolyhedrovirus. Virology 254, 297–314. Theilmann, D. A., Chantler, J. K., Stewart, S., Flipsen, H. T., Vlak, J. M., O’Reilly, D. R., and Miller, L. K. (1989). A baculovirus blocks insect and Crook, N. E. (1996). Characterization of a highly conserved molting by producing ecdysteroid UDP-glucosyl transferase. Science baculovirus structural protein that is specific for occlusion-derived 245, 1110–1112. virions. Virology 218, 148–158. Passarelli, A. L., and Miller, L. K. (1993a). Identification of genes encod- Thompson, L. H., Brookman, K. W., Dillehay, L. E., Carrano, A. V., ing late expression factors located between 56.0 and 65.4 map units Mazrimas, J. A., Mooney, C. L., and Minkler, J. L. (1982). A CHO-cell of the Autographa californica nuclear polyhedrosis virus genome. strain having hypersensitivity to mutagens, a defect in strand break Virology 197, 704–714. repair, and an extraordinary baseline frequency of sister chromatid Passarelli, A. L., and Miller, L. K. (1993b). Identification and character- exchange. Mutat. 
Res. 95, 247–254. ization of lef-1, a baculovirus gene involved in late and very late gene Thompson, L. H., Brookman, K. W., Jones, N. J., Allen, S. A., and Carrano, expression. J. Virol. 67, 3481–3488. A. V. (1990). Molecular cloning of the human XRCC1 gene, which Passarelli, A. L., and Miller, L. K. (1993c). Three baculovirus genes corrects defective DNA strand-break repair and sister chromatid involved in late and very late gene expression: ie-1, ie-n, and lef-2. exchange. Mol. Cell. Biol. 10, 6160–6171. J. Virol. 67, 2149–2158. Todd, J. W., Passarelli, A. L., and Miller, L. K. (1995). Eighteen baculo- Passarelli, A. L., and Miller, L. K. (1994). Identification and transcrip- virus genes, including lef-11, p35, 39K, and p47, support late gene tional regulation of the baculovirus lef-6 gene. J. Virol. 68, 4458–4467. expression. J. Virol. 69, 968–974. Passarelli, A. L., Todd, J. W., and Miller, L. K. (1994). A baculovirus gene Todd, J. W., Passarelli, A. L., Lu, A., and Miller, L. K. (1996). Factors involved in late gene expression predicts a large polypeptide with a regulating baculovirus late and very late gene expression in tran- conserved motif of RNA polymerases. J. Virol. 68, 4673–4678. sient-expression assays. J. Virol. 70, 2307–2317. Pearson, M., Bjornson, R., Pearson, G., and Rohrmann, G. F. (1992). The van Oers, M. M., Flipsen, J. T., Reusken, C. B., Sliwinsky, E. L., Goldbach, Autographa californica baculovirus genome: Evidence for multiple R. W., and Vlak, J. M. (1993). Functional domains of the p10 protein of replication origins. Science 257, 1382–1384. Autographa californica nuclear polyhedrosis virus. J. Gen. Virol. 74, Pearson, M., and Rohrmann, G. F. (1998). Characterization of a bacu- 563–574. lovirus-encoded ATP-dependent DNA ligase. J. Virol. 72, 9142–9149. van Sinderen, D., Karsens, H., Kok, J., Terpstra, P., Ruiters, M. H., Rankin, C., Ooi, B. G., and Miller, L. K. (1988). Eight base pairs encom- Venema, G., and Nauta, A. (1996). 
Sequence analysis and molecular SEQUENCE ANALYSIS OF THE Xestia c-nigrum GRANULOVIRUS GENOME 297

characterization of the temperate lactococcal bacteriophage rlt. Mol. Williams, G. V., Rohel, D. Z., Kuzio, J., and Faulkner, P. (1989). A cyto- Microbiol. 19, 1343–1355. pathological investigation of Autographa californica nuclear polyhe- Vialard, J. E., and Richardson, C. D. (1993). The 1,629-nucleotide open drosis virus p10 gene function using insertion/deletion mutants. reading frame located downstream of the Autographa californica J. Gen. Virol. 70, 187–202. nuclear polyhedrosis virus polyhedrin gene encode a nucleocapsid- Wolgamot, G. M., Gross, C. H., Russell, R. L., and Rohrmann, G. F. associated phosphoprotein. J. Virol. 67, 5859–5866. (1993). Immunocytochemical characterization of p24, a baculovirus Wang, J. W., Qi, Y. P., Huang, Y. X., and Li, S. D. (1995). Nucleotide capsid-associated protein. J. Gen. Virol. 74, 103–107. sequence of a 1446 base pair SalI fragment and structure of a novel Wu, Y., and Carstens, E. B. (1998). Abaculovirus single-stranded DNA early gene of Leucania separata nuclear polyhedrosis virus. Arch. binding protein, LEF-3, mediates the nuclear localization of the pu- Virol. 140, 2283–2291. tative helicase P143. Virology 247, 32–40. Wilhelm, S. M., Collier, I. E., Marmer, B. L., Eisen, A. Z., Grant, G. A., Xie, W. D., Arif, B., Dobos, P., and Krell, P. J. (1995). Identification and Goldberg, G. I. (1989). SV40-transformed human lung fibroblasts secrete analysis of a putative origin of DNA replication in the Choristoneura a 92-kDa type IV collagenase which is identical to that secreted by fumiferana multinucleocapsid nuclear polyhedrosis virus genome. normal human macrophages. J. Biol. Chem. 264, 17213–17221. Virology 209, 409–419.