Available online at www.sciencedirect.com

R

Virology 313 (2003) 401–414 www.elsevier.com/locate/yviro

Comparative analysis of bacterial Bam35, infecting a gram-positive host, and PRD1, infecting gram-negative hosts, demonstrates a viral lineage૾

Janne J. Ravantti,a,b,1 Ausra Gaidelyte,b,c,1 Dennis H. Bamford,b and Jaana K.H. Bamfordb,*

a Department of Computer Science, P.O. Box 26, (Teollisuuskatu 23), 00014 University of Helsinki, Finland b Institute of Biotechnology and Department of Biosciences, Biocenter 2, P.O. Box 56 (Viikinkaari 5), 00014 University of Helsinki, Finland c Department of Biotechemistry and Biophysics, Vilnius University, LT-2009, Vilnius, Lithuania Received 18 November 2002; returned to author for revision 12 January 2003; accepted 22 March 2003

Abstract Extra- and intracellular viruses in the biosphere outnumber their cellular hosts by at least one order of magnitude. How is this enormous domain of viruses organized? Sampling of the virosphere has been scarce and focused on viruses infecting humans, cultivated plants, and animals as well as those infecting well-studied bacteria. It has been relatively easy to cluster closely related viruses based on their sequences. However, it has been impossible to establish long-range evolutionary relationships as sequence homology diminishes. Recent advances in the evaluation of architecture by high-resolution structural analysis and elucidation of viral functions have allowed new opportunities for establishment of possible long-range phylogenic relationships—virus lineages. Here, we use a genomic approach to investigate a proposed virus lineage formed by PRD1, infecting gram-negative bacteria, and human adenovirus. The new member of this proposed lineage, bacteriophage Bam35, is morphologically indistinguishable from PRD1. It infects gram-positive hosts that evolutionarily separated from gram-negative bacteria more than one billion years ago. For example, it can be inferred from structural analysis of the coat protein sequence that the fold is very similar to that of PRD1. This and other observations made here support the idea that a common early ancestor for Bam35, PRD1, and adenoviruses existed. © 2003 Elsevier Science (USA). All rights reserved.

Keywords: Bam35; PRD1; Adenovirus; Virus evolution

Introduction gens, and bacterial viruses () infecting a small number of commonly studied bacteria. This has led to It is evident that there are approximately 10-fold more the usage of host specificity, as well as phenotypic compar- viruses than cognate host cells in the biosphere (Bergh et al., isons, as the basis for virus classification. It is considered 1989; Wommack and Clowell, 2000). All cellular organ- that despite horizontal gene transfer there are constraints isms are subject to virus infection. How is this enormous that conserve key structures (and functions) comprising the domain of viruses organized? Our current sampling of the viral innate “self.” Viruses sharing recognizable similar virosphere has focused mostly on animal and plant patho- self-structures would form a lineage (Bamford et al., 2002a; Bamford, 2003). Although advances in computational methods have made ૾ Sequence data from this article have been deposited with the EMBL/ a tremendous impact on the analysis, annotation, and inter- GenBank Data Libraries under Accession No. AY257527. pretation of biological data, the methods tend to break down * Corresponding author. Department of Biosciences, Biocenter 2, P.O. when dealing with data that originate from a new or less ϩ Box 56 (Viikinkaari 5), 00014 University of Helsinki, Finland. Fax: 358- sampled branch of the tree of life. Sequence comparisons 9-191-59098. E-mail address: jaana.bamford@helsinki.fi (J.K.H. Bamford). are relevant only between viruses that are closely enough 1 These authors contributed equally to this work. related to allow similarities to be recognized and followed.

0042-6822/03/$ – see front matter © 2003 Elsevier Science (USA). All rights reserved. doi:10.1016/S0042-6822(03)00295-2 402 J.J. Ravantti et al. / Virology 313 (2003) 401–414

When deeper penetration into evolutionary time is neces- Bam35. The aim was to increase the sampling of the viro- sary, more extensive methods are required. An increase in sphere to find new members of the -adenovirus the number of high-resolution structures of viruses and viral lineage. Although not accurately defined, gram-positive and coat components allow for the comparison of morphologic gram-negative bacteria are considered to have diverged data at atomic-level resolution. Surprisingly, it has been more than one billion years ago (Maidak et al., 2001; http:// observed that viruses infecting hosts from different domains rdp.cme.msu.edu/html/). Therefore, it would be of interest of life share a number of unexpected similarities. The most to compare the genome sequences of PRD1 and Bam35 to illustrative examples are the similarities discovered between determine if any common elements exist. It also would be adenoviruses (Burnett, 1997) and phage PRD1, the proto- interesting to see if more recent horizontal gene transfer can type of Tectiviridae (Bamford and Ackermann, 2000). The be detected as proposed for other dsDNA phages (Hendrix similarities extend from genome replication to archi- et al., 2002; Pajunen et al., 2001; Tetart et al., 2001). tecture (T ϭ 25) (Stewart et al., 1991; Butcher et al., 1995), Information on Bam35 is based on a single previous and to the fold of the major coat protein (pseudo hexameric publication (Ackermann et al., 1978), whereas the PRD1 ␤ trimer with two eight-stranded -barrels in the monomer) system has been described in considerable detail (Bamford (Rydman et al., 2001; Benson et al., 1999, 2002; Belnap and et al., 1995). PRD1 has a linear 15-kb dsDNA genome with Steven, 2000). In addition, the vertex organization, with the 5Ј covalently linked terminal proteins. The genome is rep- pentamer and spike proteins, is similar (van Ostrum and licated by a protein-priming sliding-back mechanism, sim- Burnett, 1985; van Raaij et al., 1999; Bamford and Bam- ilar to that of the adenoviruses (Savilahti et al., 1991; Cal- ford, 2000; Caldentey et al., 2000; Sokolova et al., 2001). dentey et al., 1993). The viral particle has an outer diameter Thus far, this architecture has been elucidated only for these of 740 Å between opposite vertices (San Martin et al., two viruses. These observations led to the hypothesis that 2001). The outer-most protein capsid is organized on a viruses form evolutionary lineages dating to a common pseudo T ϭ 25 lattice with 240 copies of the trimeric coat ancestor that existed before the division into the three do- protein (Butcher et al., 1995). Underneath the protein coat mains of life (Benson et al., 1999; Hendrix, 1999; Bamford resides the viral membrane that is closely aligned with the et al., 2002a; Bamford, 2003). icosahedral shape of the capsid (San Martin et al., 2001, To support the concept of evolutionary viral lineages, the 2002). The membrane vesicle encloses the genome. The sample sizes evaluated must be increased. Structural anal- capsid is composed of approximately 20 protein species, ysis of the Paramecium bursaria chlorella virus (PBCV-1) two of which (P3 and P30) from the coat skeleton (Mindich that infects unicellular, eukaryotic, chlorella-like green al- et al., 1982b; Bamford et al., 1991; Luo et al., 1993; Ryd- gae (Van Etten et al., 1991; Van Etten and Meints, 1999; man et al., 2001). Three other proteins (P31, P5, and P2) Yan et al., 2000), demonstrated a similar architectural or- form the receptor-binding spike structures at the particle ganization to that of PRD1. The high-resolution structure of the coat protein has been solved very recently (Nandhagopal vertices (Rydman et al., 1999; Grahn et al., 1999; Bamford et al., 2003), revealing that the fold is very much the same and Bamford, 2000; Caldentey et al., 2000). The remaining as in the PRD1 coat protein. proteins are membrane-associated and are involved in ge- The family Tectiviridae (tectus, covered) includes bac- nome packaging and DNA delivery into the host cell teriophages having a membrane beneath the icosahedral (Mindich et al., 1982b; Stro¨msten et al., 2003; Grahn et al., protein shell. PRD1 infects a variety of gram-negative bac- 2002a,b). Recently, the entire 66 mDa PRD1 virion with the teria harboring P-, N-, or W-type conjugative plasmids inner membrane and genomic DNA was crystallized (Bam- (Olsen et al., 1974). PRD1 has five very closely related ford et al., 2002b), and determination of its high-resolution isolates: PR3 (Bradley and Rutherford, 1975), PR4 (Sta- structure is underway. It is the largest virus, and the only nisich, 1974), PR5 (Wong and Bryan, 1978), PR772 one with a biological membrane, that has yielded diffracting (Coetzee and Bekker, 1979), and L17 (isolated by N. See- crystals thus far. ley; Bamford and Ackermann, 2000). Although these As determined by negative-stain electron microscopy, phages have been isolated from different parts of the world, the Bam35 virion morphology (Ackermann et al., 1978) the genomic sequences are approximately 98% identical resembles that of PRD1. Here we describe methods used to (J.K.H. Bamford, unpublished results). In addition to obtain purified Bam35 particles in amounts that allow de- phages that infect gram-negative bacteria, three members tailed biophysical and morphological characterization and infecting gram-positive Bacillus species have been reported: comparison with PRD1. The purified particles were used to Bam35, infecting Bacillus thuringiensis (Ackermann et al., isolate and sequence the viral genome. Sequence analysis 1978; Bamford and Ackermann, 2000), AP50, infecting led to the conclusion that, despite the lack of sequence Bacillus anthracis (Nagy, 1974), and phiNS11, infecting similarity to PRD1, these viruses are related based on their (Bamford and Ackermann, 2000). The latter genomic organization and the characteristics of the pre- is not available anymore and the anthracis phage is not a dicted proteins. These similarities indicated that Bam35 and desired model system; therefore, we optimized the growth PRD1 may have had a common ancestor that existed before and purification methods and analyzed the genome of the separation of gram-negative and gram-positive bacteria J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 403 and that Bam35 most probably is a member of the same lineage as PRD1, adenoviruses, and PBCV-1.

Results

Optimization of the virus yield

The wild-type Bam35 from the d’Herelle Reference Cen- ter formed turbid plaques on the host B. thuringiensis HER1410. The titer of soft agar derived phage stocks in Luria–Bertani medium (LB) did not exceed 3 ϫ 109 PFU/ ml, making it difficult to obtain appropriate amounts of virus for further characterization. However, when the plates were carefully examined, approximately 5% of the plaques were found to be clear. Three clear plaques were picked and, after subsequent single plaque purifications, stocks were prepared. The plaque that gave the highest titer sloppy agar stock (3 ϫ 1011 PFU/ml) was selected, designated Fig. 2. m.o.i. dependent one-step growth curves of bacteriophage Bam35c Bam35c, and used in this study. on Bacillus thuringiensis HER1410. The cells were grown to a cell density of 2 ϫ 108 CFU/ml (150 Klett units) at 37°C and infected (time point 0) The stability test (virus in LB at 4°C, Fig. 1) indicated a using different m.o.i. values. After lysis the titers were determined (given reasonably rapid decay of infectivity. This was taken into in parentheses). m.o.i. 1 (closed circles, 3 ϫ 1010 PFU/ml), 10 (open account when the growth and purification procedures were circles, 4.5 ϫ 1010 PFU/ml), 15 (closed triangles, 6.7 ϫ 1010 PFU/ml), 20 designed. The host cells aggregated at high cell density (Ͼ4 (open triangles, 5.1 ϫ 1010 PFU/ml), noninfected cells (closed squares). ϫ 108 CFU/ml) and thus densities were not allowed to Turbidity (cell density) was measured using a Klett–Summerson colorim- eter (A ). exceed 3 ϫ 108 CFU/ml during the liquid growth and 540 titering experiments. The effect of cell density (between 1.2 ϫ 8 ϫ 8 10 and 3 10 CFU/ml) in one-step growth experi- initially released viruses were able to infect the uninfected ments was determined using a constant m.o.i. value of 10. cells, leading to complete but delayed lysis. Higher m.o.i. The best lysis and highest virus yield was obtained when the values (Ͼ20) caused premature lysis. The highest yield was ϫ 8 cells were infected at 2 10 CFU/ml. The effect of m.o.i. obtained with an effective m.o.i. of 15 using virus stocks was determined by keeping the cell density constant (2 ϫ 8 that were less than 24 h old, as demonstrated by the titer 10 CFU/ml) at the time of infection (Fig. 2). At low m.o.i. decay shown in Fig. 1. Under optimal conditions the lysis values (1) a dual growth curve was obtained, indicating that started about 45 min post infection (p.i.) and was complete in 60 min. The lysate titers did not exceed 7 ϫ 1010 PFU/ml. Thin-section electron microscopy, sampled throughout the one-step growth experiment (Fig. 3), revealed numerous empty particles tightly associated with the cell surface at 10 min p.i. Mature virus particles were centrally located within the cell at 40 min p.i. and cell lysis was occasionally captured at later time points.

Bam35c purification

The optimal growth conditions determined above were used to produce virus for purification. Due to virus decay, the growth and purification procedure used to obtain 1 ϫ virus was carried out in a single day. The normal polyeth- ylene glycol (PEG) precipitation (Yamamoto et al., 1970) precipitated extensive impurities. However, reducing the PEG concentration to 5% and reducing the centrifugation conditions (Sorvall SLA 3000, 8000 rpm, 10 min, 5°C) improved the purity. The rate zonal centrifugation (Fig. 4A) Fig. 1. Decay of Bam35c infectivity. Virus stocks in LB were sored at 4°C yielded two virus peaks with the same sedimentation be- and infectivity was periodically determined. havior as those of filled and empty PRD1 particles. SDS– 404 J.J. Ravantti et al. / Virology 313 (2003) 401Ð414

specific peak was observed. The density of Bam35c in sucrose (1.21 g/ml) and CsCl (1.28 g/ml) was determined using the equilibrium gradient data (Figs. 4B and C). We also tested iodixanol (OptiPrep) gradients, but no separation of flagella and virus was achieved (data not shown). The observation that the flagellar zone of the CsCl gradient showed aggregation led to experiments to examine the ef- fect of differential salt precipitation on flagellar aggregates. It was observed that flagella still precipitated quantitatively in 0.3 M ammonium sulfate, whereas only a small fraction of the virus was removed. Our current combined purifica- tion procedure includes PEG precipitation, rate-zonal su- crose gradient purification, ammonium sulfate precipitation of flagella, and isopycnic sucrose gradient centrifugation. The outcome of this purification scheme is illustrated in Table 1.

Fig. 3. Electron micrographs of thin-sectioned HER1410 cells infected Comparison of the morphology, proteins, and DNA of with Bam35c and harvested at 10, 40, or 45 min postinfection. At 10 min Bam35c to those of PRD1 postinfection (A) empty viral particles can be seen attached to the cell surface. Clusters of newly assembled viruses are seen centrally located in the cell interior (B). (C) Partially lysed cell with clearly visible viruses. Bar To illustrate the similarity of Bam35c and PRD1 parti- represents 250 nm in (A) and 500 nm in (B) and (C). cles, we prepared 1 ϫ purified material of both viruses. These purified particles were mixed 1:1 and subjected to negative-staining electron microscopy (Fig. 5). We could PAGE analysis indicated two major protein bands in both not distinguish these preparations. Both showed empty and empty and filled particle fractions. N-terminal amino acid filled particles and occasional empty tailed particles of the analysis of the lower band revealed the sequence, MRINT- same size as described previously for both of these viruses NINSM. A data bank search identified this sequence as (Ackerman et al., 1978; Grahn et al., 2002b). The Bam35c identical to flagellins from B. thuringiensis and B. anthracis preparation could be distinguished only by the presence of (Fig. 4A, arrowhead). flagellar contamination that is absent in the PRD1 prepara- Removal of the flagellar contamination using equilib- tion since a flagella-less host was used to produce particles. rium centrifugation in sucrose failed (Fig. 4B, arrowhead). The protein patterns of purified PRD1 and Bam35c par- However, the virus material was readily separated from the ticles analyzed via SDS–PAGE are shown in Fig. 6. If a flagella using isopycnic CsCl gradients (Fig. 4C). Unfortu- protein counterpart for PRD1 was found in Bam35c, it was nately in this high ionic strength environment the virus designated using the PRD1 nomenclature (P for protein dissociated in such a way that soluble (mostly coat protein) followed by an arabic number equal to the Roman numeral material was at the top of the gradient and aggregated for the gene). The rest of the proteins are classified as gene material was found at the bottom. Only a small virus- products (gp) and numbered according to their correspond-

Fig. 4. Purification of Bam35c. (A) An absorbance profile of a rate zonal centrifugation (5–20% w/v sucrose gradient). The peak positions of wild-type PRD1 empty (E) and filled (F) particles, in a similar centrifugation, are indicated by arrows. SDS–PAGE analysis of peak fractions 6 and 8 are also shown. (B) and (C) Refractive indices (open circles) and absorbance profiles (closed circles) of equilibrium centrifugations in sucrose and CsCl, respectively (see Materials and methods for details). SDS–PAGE profiles of relevant fractions are shown on the right. An arrow in (C) indicates the flagellar fraction. The position of flagellin in the gels is indicated by arrowheads. Fractions are numbered from top to bottom. J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 405

Table 1 Quantitation of the purification steps of bacteriophage Bam35c

Cell lysate PEG precipitate 1ϫ virus 1ϫ virus treated with 2ϫ virus

(NH4)2SO4 Recovery of infectivity (PFU/liter of lysate) 4.9 ϫ 1013 3.45 ϫ 1013 6.5 ϫ 1012 2.8 ϫ 1012 1.2 ϫ 1011 Recovery of infectivity (%) 100 70 13 6 0.3 Specific infectivity (PFU/mg) ——1.6 ϫ 1012 1.5 ϫ 1012 2.3 ϫ 1011

ing open reading frame (ORFs). For Bam35c, the most Comparison of the Bam35c and PRD1 genomic sequences abundant protein (gp18), which presumably corresponds to the major virion coat protein, has a mass of approximately Sequencing of the Bam35c genome revealed a linear, 40 kDa. The corresponding protein in PRD1 is the 43-kDa 14,935-bp-long dsDNA molecule with an average GC con- coat protein P3. The coat protein seems to have the highest tent of 39.7%. The closest reported virus in the NCBI mass among the Bam35c structural proteins, whereas in database in terms of the length is PRD1 with 14,927 bp and PRD1 the 60-kDa receptor-binding protein P2 migrates an average GC content of 48.1%. The longest inverted slower than the coat protein P3 and represents the upper- repeat for Bam35c was found to be the 74-bp-long repeat most band in the gel (Fig. 6). The second most abundant located at the genome termini. These Bam35c inverted protein in Bam35c (gp25) also has a counterpart in PRD1 terminal repeats (ITR) have 81% identity, whereas the 110- (22-kDa protein P11). The migration patterns of the minor bp-long ITRs of PRD1 have 100% identity (Fig. 8). The proteins between P3 and P11 also seem to be quite similar, whereas the gel pattern for the smaller proteins, correspond- ing to the membrane-associated proteins for PRD1 (Bam- ford and Bamford, 1991), differ considerably (Fig. 6). Se- quencing of the genome of Bam35c (see below) allowed us to search the predicted amino-terminal amino acid se- quences of separated Bam35c proteins (Fig. 6). Of the approximately 12 protein bands detected in SDS–PAGE, 8 yielded sequences that could be identified at the beginning of an open reading frame. Four proteins (marked with an asterisk in Fig. 6A) did not yield any signal. The isolated genomic from Bam35c and PRD1 were analyzed in an agarose gel. Isolation of the DNA from PRD1 virions without protease treatment leaves the co- valently linked terminal proteins attached and the DNA does not properly enter the agarose gel. After treatment with protease, the DNA migrates according to its molecular weight (ϳ15 kb). The DNA from Bam35c behaved as the PRD1 genome in this assay (Fig. 7). This indicated that Bam35c most probably has covalently linked terminal pro- teins at the ends of the genome and a similar length to the PRD1 genome.

Fig. 6. (A) Comparison of the protein composition of 2ϫ purified Bam35c and PRD1 particles (SDS–PAGE, 16% acrylamide). PRD1 proteins are indicated on the left. (B) Arrows on the right indicate Bam35c gene products (gp, for nomenclature, see the text) from which N-terminal amino acid sequence could be obtained. Bam35c proteins which are blocked and could not be subjected to N-terminal amino acid determination are marked Fig. 5. Negative staining electron micrograph of 1ϫ purified Bam35c and with an asterisk. When determined the Bam35c gene products are assigned PRD1 particles mixed in 1:1 ratio. The scale bar represents 200 nm. using the same nomenclature as is in use with PRD1. 406 J.J. Ravantti et al. / Virology 313 (2003) 401Ð414

PRD1, which has eight proteins with transmembrane helices (Table 2). The Bam35c ORF5 (2207 bp), located near the left end of the genome, is the only one with enough coding capacity for a DNA polymerase (see below). The length of the second longest ORF is 1070 bp. In PRD1 there are two genes of approximately 1700 bp: I and II coding for the DNA polymerase P1, and the receptor-binding protein P2, Fig. 7. Agarose gel (0.8%) of isolated genomic DNA of Bam35c and PRD1 without (Ϫ) and with (ϩ) protease treatment. respectively. This suggests that the ORF for the receptor- binding protein in Bam35c is much shorter than the analo- gous gene in PRD1. This result is in concordance with the ITR regions of PRD1 and Bam35c do not have any observ- SDS–PAGE pattern of the Bam35c virion structural pro- able sequence similarity to each other. The overall nucleo- teins with no apparent bands larger than the coat protein P3 tide identity across the entire genome of Bam35c and PRD1 (ϳ40 kDa; Fig. 6). is 42% using both local and global alignments. Although the PSI-Blast searches were carried out for each of the overall nucleotide identity is rather high, the actual align- Bam35c ORFs. As with PRD1, overall protein sequence ment shows that the regions of identity is very patchy. The identity is low between the predicted Bam35c protein se- longest similar nucleotide “patch” has a length of only 32 quences and protein sequences available in the current non- nucleotides. redundant SWISS-PROT database in the NCB data bank. Only five of the deduced proteins had hits with any rele- Comparison of the ORFs and genome organization of vance (Table 2). The highest similarity, which was to sev- Bam35c and PRD1 eral polymerases, was observed for ORF 5, further confirm- ing that it is the gene for the Bam35c DNA polymerase. The GeneMark program predicted 32 ORFs longer in Meanwhile, ORF 6 had sequence identity to the LexA-type size than 100 bp that might code for proteins from Bam35c. DNA-binding proteins. This type of regulatory protein has All predicted ORFs are in the same DNA strand (Fig. 9). not been characterized for PRD1. The product of ORF 14 Ϫ The genome is tightly packed having short noncoding re- had highest similarity (BLAST E-score: 1.0 ϫ 10 66)tothe gions and only a few overlapping ORFs. The PRD1 genome PRD1 DNA packaging ATPase P9 (gene IX; Bamford et al., is similarly packed with few overlapping genes. The clearest 1991). Due to the conserved nucleotide-binding, Walker A difference in PRD1 with respect to Bam35c was that the motif in the ORF 14 gene product, similarities with a few early genes (XII and XIX coding for the DNA-binding pro- other ATPases were also discovered. Interestingly, ORF 14 teins) are located in the opposite strand at the right end of and PRD1 gene IX are located in the corresponding regions, the genome (Fig. 9) (Bamford et al., 1991). The total num- in front of the coat protein gene (Fig. 9). The ORF 26 ber of genes and ORFs in the PRD1 genome is 34 (Bamford product was found to have homology to peptidoglycan- et al., 1991, 2002b), which is very close to the number of hydrolyzing enzymes from different organisms, mainly predicted ORFs for Bam35c. Many of the Bam35c ORFs from gram-positive bacteria. The corresponding gene (VII) are quite short, similar to the PRD1 genes (Fig. 9). The (Rydman and Bamford, 2000) of PRD1 is located after the shortest known genes shown to code for a protein in PRD1 coat protein gene in the region (OL3) coding for the virion are genes XX (coding for a 42-residues-long protein P20) structural proteins (Fig. 9). ORF 30 from Bam35 codes for and XXII (coding for a 47-residues-long protein P22). These another putative lytic enzyme that has homology to the are integral membrane proteins involved in the linking of endolysins and lysozymes of gram-positive bacteria but no the DNA packaging complex to the virion (Stro¨msten et al., homology to PRD1 genes. 2003). The TMHMM program predicted a possible trans- Almost all of the few similarities found using PSI-Blast membrane helix for 10 of the putative gene products of were to PRD1 proteins. We continued the analysis by doing a Bam35c. This fits well with the corresponding number in separate all-against-all comparison directly between Bam35c

Fig. 8. Inverted terminal repeats of Bam35c and PRD1. J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 407

Fig. 9. Genome organization of Bam35c (top) and PRD1 (middle). The genome of PRD1 is organized in five operons (bottom), two of which are transcribed early (OE) and three late (OL) during the infection cycle (Grahn et al., 1994). The operon OE1 at the 5Ј end codes for the genome replication functions (genes VIII and I) and the host receptor-binding protein (gene II). The next operon, OL1, contains the genes for the penton and spike proteins (XXXI and V, respectively), which are at the virus vertices. The middle operon, OL2, encodes proteins for the assembly and DNA packaging functions and the third late operon, OL3, codes for the structural proteins of the virion. The OE2 is transcribed in the direction opposite to the rest of the operons and the genes (XII and XIX) in that operon are indicated in black. The Bam35c ORFs are assigned with Arabic numerals, PRD1 genes with Roman numerals, and PRD1 ORFs with lowercase letters.

ORFs and PRD1 genes and ORFs at both the nucleotide and interesting hit. The sequence of the PBCV-1 coat protein the amino acid levels. The alignments were done to ensure that has some homology with the PRD1 coat protein structure, any possible, even low similarity, regions of interest were although with much a lower E-value (1.24) than Bam35c, detected. Still, none of the global or local alignments gave any arguing that the Bam35c coat protein-fold is even more significant matches beyond that already found in the PSI-Blast similar to the PRD1 coat protein than that of PBCV-1 analysis (ORFs 5, 14, and 26). However, when the product of (Nandhagopal et al., 2003). ORF 28 was aligned with PRD1 protein P5 using the Smith– Finally, the nucleotide frequency of Bam35c ORFs and Waterman local alignment method, a match was found in the PRD1 genes and ORFs was compared. Bam35c has a more region containing several serine and glycine residues corre- uneven nucleotide frequency than PRD1 (Fig. 10). Bam35c sponding to the spike protein P5 collagen-like motif (Bamford seems to favor clustering of some of the bases, while PRD1 and Bamford, 1990). This motif, with six collagen-type trip- uses them more evenly. This might be partly related to lets, divides the spike protein into two domains and has a role differences in the G-C content between the and to in host receptor recognition (Huiskonen et al., 2003). There is the fact that PRD1 genome is, due to its broad host-range, no collagen-type triplets in the Bam35c ORF 28 product, but devoid of the palindromic sequences for most restriction three triplets are found from the gene product encoded by the enzymes recognizing six bases (Bamford et al., 1991). Al- preceding gene (ORF 27), which might correspond, on the though the nucleotide usage differs, there seems to be an basis of its location, to the penton protein gene XXXI in PRD1. equal change of the G-C content in both viruses along the The sizes of the Bam35c ORF 28 product and PRD1 P5 sequence (Fig. 10). However, similarities in nucleotide protein are approximately the same (32 vs 34 kDa, respec- composition can be seen between some of the Bam35c and tively). PRD1 genes (Fig. 10). For example, the overlapping We extended our analyses even further to include the Bam35c ORFs 10 and 11 form a pattern similar to the 3D-PSSM method to test if any of the Bam35c gene prod- overlapping PRD1 genes VI and X (coding for proteins P6 ucts would match any known 3D protein structures. Only and P10, respectively) (Bamford et al., 1991). Comparison the amino acid sequences deduced from ORF 18 showed of the corresponding amino acid sequences also revealed significant homology with the X-ray structure of the PRD1 common features. PRD1 P6 is a minor capsid protein with major coat protein P3 (Benson et al., 2002; E-value a role in DNA packaging (Stro¨msten et al., 2003). The 0.00525, 95% confidence interval). This result confirms that sequence contains 17.4% glutamic acid residues and the ORF 18 codes for the coat protein of Bam35c (see above) protein is highly negatively charged (charge Ϫ25.1 at pH 7). and that the fold is probably similar to that of the PRD1 coat The product of Bam35c ORF10 has 27.6% glutamic acid protein even though there is no detectable nucleotide se- (and 15.8% of lysine) residues with the overall calculated quence level homology. Analysis of other viral coat protein charge being Ϫ17.1. PRD1 P10 is a membrane-bound scaf- sequences by the 3D-PSSM method resulted in another folding protein (Mindich et al., 1982a,b; Rydman et al., 408 J.J. Ravantti et al. / Virology 313 (2003) 401Ð414

Table 2 Summary of the analyses of the putative Bam35c genes

GeneMark ORFsa Startb Stop Length (bp) TMHMMc PSI-BLASTd 3D-PSSMe 1 358 534 176 2 550 1053 503 3 1125 1349 224 4 1408 2145 737 5 2129 4336 2207 Polymerases 6 4352 4552 200 LexA-type regulators 7 4566 4718 152 8 4693 4833 140 9 4952 5125 173 10 5138 5575 437 11 5389 6147 758 12 5993 6235 242 13 6204 6512 308 14 6493 7131 638 ATPases / PRD1 P9 15 7146 7286 140 1 16 7303 7443 140 1 17 7445 7699 254 18 7699 8769 1070 PRD1 coat protein 19 8811 9041 230 1 20 9045 9251 206 1 21 9331 9762 431 1 22 9759 9935 176 23 9932 10078 146 1 24 10090 10365 275 25 10369 10983 614 1 26 10983 11735 752 1 Lytic enzymes 27 11683 12195 512 1 28 12208 13122 914 29 13126 14007 881 30 14023 14820 797 Lytic enzymes 31 14171 14293 122 32 14509 14818 425 1

a Open reading frame over 100 bp that is predicted to code protein (Lukashin and Borodovsky, 1998). b Numbers refer to the Bam35c genome sequence. c Number of predicted transmembrane helices (Krogh et al., 2001). d Significant similarity to other protein sequences (Altschul et al., 1997). e Predicted similarity to known 3D structures (Kelley et al., 2000).

2001) with a pattern of glycine and serine (and sometimes ruses, and PBCV-1 led us ask whether gram-positive bac- arginine) residues grouped together at the amino-terminus teria harbor viruses possibly belonging to this lineage. of the protein. Likewise, the gene product of Bam35c Adenovirus and PBCV-1 infect animal and algal hosts, ORF11 has a similar type of pattern in the amino-terminus, respectively, which were separated less than a billion years although the sequence alignment similarity score was very ago (Carrol, 2001). On the other hand, gram-positive and weak (data not shown). The need for certain types of amino gram-negative bacteria are considered to be separated more acids to fulfill structural and functional requirements might than a billion years ago. This results in a near uniform conserve the nucleotide composition of Bam35c and PRD1 sampling of the evolutionary space in both bacterial and genes. The codons for glutamic acid and lysine residues are eukaryotic domains. Obviously, if all these viruses belong composed only of A or G nucleotides, which might result in to the same lineage, they must have had a common ancestor the lower abundance on average than T and C nucleotides in that existed before the bacterial and eukaryotic domains ORFs 10 and 11, and in genes VI and X, respectively (Fig. were separated. It appeared that there was only one avail- 10). It is likely that these two gene pairs are functional able rational candidate, Bam35, to be studied in more detail. counterparts. The most troublesome feature of this virus system is that the virions are fragile and decay rapidly, even when cultured Discussion in LB. The virions are also very sensitive to high salt concentrations, excluding the use of CsCl equilibrium cen- The motivation to more efficiently delineate the pro- trifugation. When the growth and rate zonal purification was posed lineage of viruses represented by PRD1, adenovi- carried out during a single day, specific infectivity of about J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 409

Fig. 10. Nucleotide frequencies per gene for Bam35c and PRD1 genomes. Each nucleotide frequency (number of actual occurrences divided by gene length) is plotted as a horizontal line from the beginning to the end of the ORF. If two nucleotides have the same frequency, they are plotted on top of each other with a dotted line. The black rectangles show where Bam35c ORFs 10 and 11 possibly have their counterparts in PRD1 genome (genes P6 and P10). See text for details.

1.6 ϫ 1012 PFU/mg was achieved (Table 1). When the frozen (Ϫ80°C) LB virus stocks retained sufficiently high protein composition of this virus material was examined, titers for prolonged storage (3 months tested). The stability almost half of the protein mass was due to contaminating of PRD1 virions, which is much higher than Bam35c, may flagella (Fig. 4A) with only minor amounts of other impu- reflect their ability to infect enterobacterial hosts and sur- rities. Taking this into account, the deduced specific activity vive in intestinal environments. of this virus material was about twice the calculated value. The negative-stain electron microscopic data of Bam35c Even this number is less than half of the corresponding (see also Ackerman et al., 1978) show features in common value (1 ϫ 1013 PFU/mg protein) (Walin et al., 1994) with PRD1. The similarities were extended when the virion determined for PRD1. Any further purification step led to protein patterns and genome lengths were compared. The lower specific infectivity values despite increased purity. length of the Bam35c genome is almost exactly the same as This is a reflection of the sensitivity of the Bam35c virion that observed for PRD1. Further, the capsid sizes are prac- and limits its usage to very fresh material. Fortunately, tically identical and Bam35c contains an internal membrane 410 J.J. Ravantti et al. / Virology 313 (2003) 401Ð414

Table 3 Matching of Bam35c gene products and PRD1 proteins and gene products (see text for details)

1) Putative gene products of Bam35c are numbered by arabic numerals according to the corresponding ORFs. For the PRD1 protein nomenclature see Bamford et al., 1991 and Bamford et al., 2002b. 2) The initial methionine is included. 3) Isoelectric point. 4) Number of predicted transmembrane helices. similar to PRD1, making the available volume for DNA are located at the termini of the genome. It is possible that packaging equal to that of PRD1. PRD1 DNA density inside the Bam35c counterparts for the two early DNA-binding the particle is high, close to crystalline DNA (Tuma et al., protein genes (XII and XIX) at the 3Ј end of the PRD1 1996). We have proposed that the energy created during genome have moved to the 5Ј end in the Bam35c genome PRD1 DNA packaging is used to drive the initial stages of (ORFs 2 and 3; Fig. 9). The putative DNA polymerase gene DNA delivery into the host cell (Grahn et al., 2002b). The (ORF 5) of Bam35c is preceded by ORF 4 that could be a similarities in genome length and capsid volume highlight candidate for the genome terminal protein according to the the importance of DNA packaging density and point to a gene order in PRD1 (Fig. 9; Table 3, dashed line). similar DNA delivery mechanism. In addition, both viruses In PRD1, the genes for the vertex complex proteins are have covalently linked terminal proteins in their genomes. transcribed from separate operons. The penton and the spike 3D-PSSM, PSI Blast, and alignment analyses could di- protein genes (XXXI and V, respectively) are located in the rectly link only four putative Bam35c proteins (of 32) to same operon (in this order), and the gene (II) for the recep- PRD1 proteins. However, by combining the computational tor-binding protein is transcribed from a different operon analysis and biological and biophysical data available for (Grahn et al., 1994). There does not seem to be a clear the PRD1 system, we were able to extend the comparison counterpart for the receptor-binding protein in Bam35c. (Table 3). It is obvious that the genome organization of This is not surprising, since the host is a gram-positive Bam35c follows that of PRD1. The early functions in PRD1 bacterium with a different cell surface than the host for J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 411

PRD1. In Bam35c, the putative operon for the vertex com- bly cannot be compromised without going extinct. For this plex proteins (ORFs 27 and 28) is located after the operon reason such elements as coat protein fold and interactions for the coat and membrane-associated proteins. This differs are very preserved. Only very conserved changes are al- from the location of the corresponding operon (OL1) in lowed to retain viability. As a consequence these structures PRD1 (Table 3). stay conserved, although sequences may diverge beyond In PRD1, the region coding for the proteins involved in recognition with current methods. In contrast, the structures virus assembly and DNA packaging likely has a counterpart and functions involved in the interactions with the host cell in Bam35c. Bam35c, however, has more ORFs in that evolve rapidly to assure continued parasitism in a changing position than PRD1. It was shown by nucleotide frequence host. comparison that ORFs 10 and 11 correspond to genes VI The above observations and, in particular, the discovery and X (see Results). ORF 6 had similarity to the LexA-type that the Bam35c coat protein topology most probably is of transcriptional proteins not observed in PRD1. The GC very similar to that of PRD1, strongly suggest that Bam35c content of ORF 9 is the lowest among Bam35c ORFs and PRD1 are related. Without data from other viruses (29.9%) and there is a noncoding DNA region (maybe a (adenovirus and PBCV-1) infecting eukaryotic hosts having possible promoter) upstream of the ORF, suggesting that it many common features with PRD1 (and Bam35c) the most might have been acquired recently as a so-called “moron” logical conclusion would be that the PRD1–Bam35c rela- (Hendrix et al., 2000). It also is possible that the counterpart tionship is due to lateral gene transfer. However, it is very of Bam35c ORF 17 upstream of the coat protein ORF18 is difficult to posit that a virus system could successfully jump absent in PRD1. There are three ORFs upstream of the coat from prokaryotes to eukaryotes. It also has been argued protein gene in Bam35c, whereas there are only two ORFs (Hendrix, 2002; Bamford et al., 2002a) that the probability in PRD1 (Fig. 9). In PRD1, gene XX codes for a small of viruses having so many similar features evolved inde- membrane protein involved in DNA packaging and the pendently is very low. Our conclusion is that Bam35c is yet function of ORF i is unknown. It can be postulated that ORF another member in the virus lineage originally proposed i arose as a result of gene duplication from gene XX since based on PRD1 and adenovirus comparisons (Benson et al., both genes code for 42-residues-long peptide and the se- 1999). It would be interesting to determine if there are quences are conserved. Remarkably, ORFs 15 and 16, additional viruses to be identified with similar structural and which have corresponding positions in the Bam35c genome, functional features infecting archaeal hosts, giving addi- code for 46-residues-long proteins. Both have a single pre- tional support for the presence of a virus lineage that ex- dicted transmembrane helix and equivalent isoelectric tends to hosts in all domains of cellular life. points (10.63 and 10.24, respectively; Table 3). However, the overall similarity (34.8%) is not very high. The possible relationship between Bam35c gp16 and PRD1 P20 is Materials and methods marked with a dashed line in Table 3. Sequencing of the amino-termini of the proteins corre- Bacteria and phages lating to the genes for structural proteins of Bam35c re- vealed the corresponding genes grouped together in the Wild-type Bam35 and its host B. thuringiensis HER1410 right half of the genome. This region corresponds to operon were obtained from the Felix d’Herelle Reference Center OL3 in PRD1, which codes for the coat protein P3 and other for Bacterial Viruses, Laval University, Quebec, Canada. structural proteins (Bamford et al., 1991; Grahn et al., Cells were grown in Luria–Bertani medium (Sambrook and 1994). The counterparts for the genes of the coat protein P3, Russel, 2001). Several rare Bam35 clear plaques were the membrane protein P11, and the transglycosylase P7 of picked and pure lines were obtained through three consec- PRD1 were established. It is tempting to suggest that the utive single plaque purification steps. The wt, producing counterpart to the small DNA packaging protein gene (XXII turpid plaques, and the clear plaque variant were used to in PRD1) downstream of the coat protein gene is ORF19 in make virus stocks by collecting the soft agar (using 3 ml of Bam35c based on its genome location and as its demonstra- LB per plate) from semiconfluent plates. The agar cultures tion as a structural protein (dashed line in Table 3). were incubated for2hat28°C with aeration; cell debris was The fact that the Bam35c DNA polymerase, coat protein, removed by centrifugation (Sorvall SLA rotor, 10,000 rpm, and proteins needed for the assembly and DNA packaging 15 min, 5°C). The titer of the supernatant was determined of the virus particle have the most easily recognizable coun- on HER1410 (plates grown overnight at 28°C). Due to the terparts in PRD1 is in agreement with the evolutionary thermosensitivity of the phage and the host, the soft agar conservation hypothesis suggested previously (Benson et was cooled to 48°C before addition to the host–virus mix- al., 1999; Bamford, 2003; Bamford et al., 2002a). Accord- ture. The virus stocks were stored at 4°C or frozen at Ϫ80°C ing to that hypothesis, the conserved viral structures and in the presence or absence of 20% glycerol. In the latter functions form an innate viral self that has evolved early and conditions about 90% of the infectivity was retained after changes slowly. It is obvious that horizontal gene flow storage with no fast decay. exists. However, vital viral functions such as capsid assem- Bacteriophage PRD1 was grown on flagella-less Salmo- 412 J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 nella enterica DS88 (Bamford and Bamford, 1990). PRD1 cation using an iodixanol (OptiPrep) gradient, 1ϫ purified particles were purified by rate-zonal centrifugation in su- virus material was layered onto linear 5–40% (w/w) gradi- crose gradients (Bamford and Bamford, 1991) followed by ents and centrifuged to equilibrium (Beckman SW41 rotor, ion-exchange chromatography using Sartorius D100 car- 35,000 rpm, 4 h, 10°C), and the light-scattering virus zones tridges (Sartorius) essentially as described by Walin et al. were collected (BioComp fraction collector). (1994). Electron microscopy Growth and purification of Bam35c For thin-section electron microscopy B. thuringiensis Growth of the phage in liquid LB medium (at 37°C) was HER1410 cells were grown at 37°C to a density of 2 ϫ 108 carried out by infecting aerated cultures of B. thuringiensis HER1410 at a cell density of 2 ϫ 108 CFU/ml using an CFU/ml and infected with a freshly made stock of Bam35c m.o.i. of about 15. The cultures were incubated at 37°C until using an m.o.i. of 17. Samples were taken at 10, 30, 40, and lysis occurred. The lysates were cleared by low-speed cen- 45 min postinfection and fixed with 3% glutaraldehyde. trifugation (Sorvall SLA3000 rotor, 8000 rpm, 30 min, 5°C) After 20 min fixation at room temperature, cells were col- and virus particles in the supernatant were precipitated by lected, washed twice with 10 mM potassium phosphate adding solid polyethylene glycol 6000 and NaCl (final con- buffer (pH 7.2), and prepared for transmission electron centrations 5% and 0.5 M, respectively) followed by mag- microscopy as previously described (Bamford and Mindich, ϫ netic stirring for 30 min at 4°C. Phage precipitates were 1980). For negative staining, the 1 purified virus speci- collected by low-speed centrifugation (Sorvall SLA3000 mens on carbon-coated grids were stained with 1% (w/v) rotor, 8000 rpm, 10 min, 5°C) and the pellet was resus- potassium phosphotungstate (pH 6.5). The micrographs pended in 10 mM potassium phosphate pH 7.2 (about 10 were taken with a JEOL 1200 EX electron microscope (at ml/L lysate) for at least2honicewith gentle shaking. For the EM unit, Institute of Biotechnology, University of Hel- rate zonal separation the resuspended pellet was loaded onto sinki) operating at 60 kV. linear 5–20% (w/v) sucrose gradients (in 10 mM potassium phosphate pH 7.2) and centrifuged (Sorvall AH629 or Beck- Analytical methods man SW41 rotors, 24,000 rpm, for1h10minat15°C). The gradients were fractionated or the light-scattering virus The protein concentrations were determined by the Brad- zones were collected and virus particles were recovered by ford (1976) assay using BSA as a standard, and the protein differential centrifugation (Sorvall T647.5, 32,000 rpm, 2 h composition was determined by SDS–PAGE (Olkkonen and 30 min, 5°C or Sorvall T865, 40,000 rpm, 40 min, 5°C). The Bamford, 1989). N-terminal amino acid sequences were pellets were dissolved in 10 mM potassium phosphate pH determined from protein bands that had been separated by ϫ 7.2 (about 0.8 ml/L lysate) overnight on ice (1 purified SDS–PAGE, transferred to a polyvinylidene difluoride virus). membrane, stained with Coomassie brilliant blue, and sub- The contaminating host flagella were precipitated by jected to Edman degradation using a Procise 494A protein adding ammonium sulfate (final concentration 0.3 M using sequencer (Perkin–Elmer/Applied Biosystems, Foster City, ϫ 2.5 M stock solution) to the 1 purified virus preparation CA) in the Protein Chemistry Laboratory, Institute of Bio- containing about 4 mg/ml of protein. After a 5-min incuba- technology, University of Helsinki. tion on ice the suspension was centrifuged twice (microcen- trifuge 13,000 rpm, 10 min, 4°C) to remove the precipitate. The resulting supernatant was designated as flagella-free Determination of the Bam35c genome sequence 1ϫ purified virus. This virus material was layered onto linear 20–70% (w/v) sucrose gradients made in 10 mM Bam35c genomic DNA was extracted from the purified potassium phosphate pH 7.2 and the virus particles were virus particles essentially as described for purification of centrifuged to equilibrium (Beckman SW41 rotor, 24,000 PRD1 genomic DNA (Bamford and Bamford, 1991). The rpm, 20 h, 15°C). The gradients were fractionated (Bio- genome was digested with DdeIorRsaI restriction en- Comp fraction collector) or the light-scattering virus zones zymes, end-filled with Klenow polymerase, and ligated into were collected, diluted 1:1 with the buffer, and pelleted as in a SmaI-digested pSU18 vector. The sequence at the ends of the case of rate zonal purified virus. The pellet was rinsed the inserts in the resulting clones (pJB400-pJB403 and with buffer and resuspended in 10 mM potassium phosphate pJB411) was determined using universal and reverse se- pH 7.2 (about 0.1–0.2 ml/L lysate) on ice overnight. quencing primers (Sequencing Laboratory, Institute of Bio- For equilibrium centrifugation in CsCl, 1ϫ purified virus technology, University of Helsinki). The rest of the genome material was loaded onto the top of 2.4 M CsCl solution (in sequence was analyzed by a primer-walking technique us- 10 mM phosphate pH 7.2) in Beckman SW41 tubes and ing 75 specific primers and the viral genome as template. centrifuged (22,000 rpm, 22 h, 20°C). The gradients were Both strands were sequenced. Sequence information was fractionated using BioComp fraction collector. For purifi- collected to obtain ϳ3.5 times coverage. J.J. Ravantti et al. / Virology 313 (2003) 401Ð414 413

Computational analyses Bamford, D.H., Mindich, L., 1980. Electron microscopy of cells infected with nonsense mutants of bacteriophage phi6. Virology 107, 222–228. Programs from EMBOSS package (Rice et al., 2000) Bamford, J.K.H., Bamford, D.H., 1990. Capsomer proteins of bacterio- phage PRD1, a bacterial virus with a membrane. Virology 177, 445– were used with locally made shell-scripts to calculate sev- 451. eral of the results. The GC-percentage and length were Bamford, J.K.H., Bamford, D.H., 1991. Large scale purification of mem- calculated using the program “infoseq.” Inverted DNA re- brane-containing bacteriophage PRD1 and its subviral particles. Virol- peats were identified using the “einverted” program. Local ogy 181, 348–352. and global alignments were calculated with the programs Bamford, J.K.H., Bamford, D.H., 2000. A new mutant class, made by targeted mutagenesis, of phage PRD1 reveals that protein P5 connects “water” and “needle”, respectively. The predicted genes the receptor binding protein to the vertex. J. Virol. 74, 7781–7786. were translated from ORFs to proteins with “transeq.” Sta- Bamford, J.K.H., Cockburn, J., Diprose, J., Grimes, J.M., Sutton, G., tistics for each ORF were calculated with the program Stuart, D.I., Bamford, D.H., 2002b. Diffraction quality crystals of “pepstats” and “freak.” PRD1, a 66 MDa dsDNA virus with an internal membrane. J. Struct. Several web-based services were used to further charac- Biol. 139, 103–112. terize the Bam35c genome. The putative ORFs were found Bamford, J.K.H., Ha¨nninen, A.-L., Pakula, T.M., Ojala, P.M., Kalkkinen, N., Frilander, M., Bamford, D.H., 1991. Genome organization of using GeneMark.hmm- and GeneMark -programs (Lukashin PRD1, a membrane-containing bacteriophage infecting E. coli.. Virol- and Borodovsky, 1998; http://opal.biology.gatech.edu/ ogy 183, 658–676. GeneMark/). Several different Markov models were used to Belnap, D.M., Steven, A.C., 2000. “Deja vu all over again:” the similar ensure that all reasonable ORFs were found. Possible homol- structures of bacteriophage PRD1 and adenovirus. Trends Microbiol. 8, ogies to known proteins were searched with PSI-BLAST 91–93. (Altschul et al., 1997; http://www.ncbi.nlm.nih.gov/BLAST/). Benson, S., Bamford, J.K.H., Bamford, D.H., Burnett, R.M., 1999. Viral evolution revealed by bacteriophage PRD1 and Human adenovirus coat PSI-BLAST was executed for each ORF until no new hits protein structures. Cell 98, 825–833. were reported. Prediction of transmembrane helices was Benson, S., Bamford, J.K.H., Bamford, D.H., Burnett, R.M., 2002. The done using TMHMM program (Krogh et al., 2001; http:// X-ray crystal structure of P3, the major coat protein of the lipid- www.cbs.dtu.dk/services/TMHMM/). Protein fold recogni- containing bacteriophage PRD1, at 1.65 ÅA resolution. Acta Crystal- tion was done using the 3D-PSSM method (Kelley et al., logr. D 58, 39–59. ϳ Bergh, O., Borsheim, K.K., Bratbak, G., Heldal, M., 1989. High abundance 2000; http://www.sbg.bio.ic.ac.uk/ 3dpssm/). of viruses found in aquatic environments. Nature 340, 467–468. Bradford, M.M., 1976. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye Acknowledgments binding. Anal. Biochem. 72, 248–254. Bradley, D.E., Rutherford, E.L., 1975. Basic characterization of a lipid- The skillful technical assistance of Sari Korhonen and containing bacteriophage specific for plasmids of the P, N, and W Violeta Manole is greatly acknowledged. This investigation compatibility groups. Cana. J. Microbiol. 21, 152–163. was supported by the Center of Excellence Program (2000– Burnett, R.M., 1997. The structure of adenovirus. in: Chiu, W., Burnett, R.M., Garcea, R.L. (Eds.), Structural Biology of Viruses. Oxford Uni- 2005) of the Academy of Finland (Grants 172904, 164298, versity Press, New York, pp. 209–238. and 172621). Butcher, S.J., Bamford, D.H., Fuller, S.D., 1995. DNA packaging orders the membrane of bacteriophage PRD1. EMBO J. 14, 6078–6086. Caldentey, J., Blanco, L., Bamford, D.H., Salas, M., 1993. In vitro repli- References cation of bacteriophage PRD1 DNA. Characterization of the protein- primed initiation site. Nucl. Acids Res. 21, 3725–3730. Ackermanns, H.-W., Roy, R., Martin, M., Murthy, M.R.V., Smirnoff, Caldentey, J., Tuma, R., Bamford, D.H., 2000. Assembly of bacteriophage W.A., 1978. Partial characterization of a cubic Bacillus phage. Can. J. PRD1 spike complex: role of the multidomain protein P5. Biochemis- Microbiol. 24, 986–993. try 39, 10566–10573. Altschul, S.F., Madden, T.L., Scha¨ffer, A.A., Zhang, J., Zhang, Z., Miller, Carrol, S.B., 2001. Chance and necessity: the evolution of morphological W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new complexity and diversity. Nature 409, 1102–1109. generation of protein database search programs. Nucleic Acids Res. 25, Coetzee, W.F., Bekker, P.J., 1979. Pilus-specific, lipid-containing bacte- 3389–3402. riophages PR4 and PR772: Comparison of physical characteristics of Bamford, D.H., 2003. Do viruses form lineages across different domains of genomes. J. Gen. Virol. 45, 195–200. life? Res. Microbiol., in press. Grahn, A.M., Bamford, J.K.H., O’Neill, M.C., Bamford, D.H., 1994. Bamford, D.H., Ackermann, H.-W., 2000. Family Tectiviridae.. in: van Functional organization of the bacteriophage PRD1 genome. J. Bacte- Regenmortel, M.H.V., Fauquet, C.M., Bishop, D.H.L., Carstens, E.B., riol. 176, 3062–3068. Estes, M.K., Lemon, S.M., Maniloff, J., Mayo, M.A., McGeoch, D.J., Grahn, A.M., Caldentey, J., Bamford, J.K.H., Bamford, D.H., 1999. Stable Pringle, C.R., Wickner, R.B. (Eds.), Virus Taxonomy. Classification packaging of phage PRD1 DNA requires adsorption protein P2, which and Nomenclature of viruses. Academic Press, San Diego, pp. 111– binds to the IncP plasmid-encoded conjugative transfer complex. J. 116. Bacteriol. 181, 6689–6696. Bamford, D.H., Burnett, R.M., Stuart, D.I., 2002a. Evolution of viral Grahn, A.M., Daugelavicius, R., Bamford, D.H., 2002a. Sequential model structure. Theor. Popul. Biol. 61, 461–470. of phage PRD1 DNA delivery: active involvement of the viral mem- Bamford, D.H., Caldentey, J., Bamford, J.K.H., 1995. PRD1: a broad host brane. Mol. Microbiol. 46, 1199–1209. range dsDNA tectivirus with an internal membrane. in: Maramorosch, Grahn, A.M., Daugelavicius, R., Bamford, D.H., 2002b. A small viral K., Shatkin, A.J., Murphy, F.A. (Eds.), Advances in Virus Research, membrane associated protein P32 is involved in bacteriophage PRD1 Vol. 45. Academic Press, San Diego, pp. 281–319. entry. J. Virol. 76, 4866–4872. 414 J.J. Ravantti et al. / Virology 313 (2003) 401Ð414

Hendrix, R.W., 1999. Evolution: the long evolutionary reach of viruses. Sambrook, J., Russell, D.W., 2000, third ed. Cold Spring Harbor Labora- Curr Biol. 9, R914–R917. tory Press, Cold Spring Harbor, NY. Hendrix, R.W., 2002. Bacteriophage: evolution of the majority. Theor. San Martin, C., Burnett, R.M., de Haas, F., Heinkel, R., Rutten, T, Fuller, Popul. Biol. 61, 471–480. S.D., Butcher, S.J., Bamford, D.H., 2001. Combined EM/X-ray imag- Hendrix, R.W., Lawrence, J.G., Hatfull, G.F., Casjens, S., 2000. The origins ing yields a quasi-atomic model of the adenovirus-related bacterio- and ongoing evolution of viruses. Trends Microbiol. 8, 504–508. phage PRD1 and shows key capsid and membrane interactions. Struc- Hendrix, R., Smith, M., Hatfull, G., 2002. Long distance relationship, ture 9, 917–930. temporal and phylogenic, among tailed phage. Res. Microbiol., in San Martin, C., Huiskonen, J., Bamford, J.K.H., Butcher, S.J., Fuller, S.D., press. Bamford, D.H., Burnett, R.M., 2002. Minor proteins, mobile arms and Huiskonen, J., Laakkonen, L., Toropainen, M., Sarvas, M., Bamford, D.H., membrane-capsid interactions in bacteriophage PRD1 assembly. Nat. Bamford, J.K.H., 2003. Probing the ability of the coat and vertex Struct. Biol. 9, 756–762. protein of the membrane-containing bacteriophage PRD1 to display a Savilahti, H, Caldentey, J., Lundstro¨m, K., Syva¨oja, J.E., Bamford, D.H., meningococcal epitope. Virology 310, 267–279. Kelley, L.A., MacCallum, R.M., Sternberg, M.J.E., 2000. Enhanced ge- 1991. Overexpression, purification, and characterization of Escherichia nome annotation using structural profiles in the program 3D-PSSM. J. coli bacteriophage PRD1 DNA polymerase. In vitro synthesis of full- Mol. Biol. 299, 499–520. length PRD1 DNA with purified proteins. J. Biol. Chem. 266, 18737– Krogh, B., Larsson, G., von, Heijne, Sonnhammer, E.L.L., 2001. Predict- 18744. ing transmembrane protein topology with a hidden Markov model: Sokolova, A., Malfois, M., Caldentey, J., Svergun, D.I., Koch, M.H., application to complete genomes. J. Mol. Biol. 305, 567–580. Bamford, D.H., Tuma, R., 2001. Solution structure of bacteriophage Lukashin, A., Borodovsky, M., 1998. GeneMark.hmm: new solutions for PRD1 vertex complex. J. Biol. Chem. 276, 46187–46195. gene finding. Nucleic Acids Res. 26, 1107–1115. Stanisich, V.A., 1974. The properties and host-range of male-specific Luo, C., Butcher, S., Bamford, D.H., 1993. Isolation of a phospholipid-free bacteriophages of Pseudomonas aeruginosa. J. Gen. Microbiol. 84, protein shell of bacteriophage PRD1, an Escherichia coli virus with an 332–342. internal membrane. Virology 194, 564–569. Stewart, P.L., Burnett, R.M., Cyrklaff, M., Fuller, S.D., 1991. Image Maidak, B.L., Cole, J.R., Lilburn, T.G., Parker Jr., C.T., Saxman, P.R., reconstruction reveals the complex molecular organization of adeno- Farris, R.J., Garrity, G.M., Olsen, G.J., Schmidt, T.M., Tiedje, J.M., virus. Cell 67, 145–154. 2001. The RDP-II (Ribosomal Database Project). Nucleic Acids Res. Stro¨msten, N.J., Bamford, D.H., Bamford, J.K.H., 2003. The unique vertex 29, 173–174. of bacterial virus PRD1 is connected to the viral internal membrane. Mindich, L., Bamford, D., Goldthwaite, C., Laverty, M., Mackenzie, G., J. Virol. 77, 6314–6321. 1982a. Isolation of nonsense mutants of lipid-containing bacteriophage Tetart, F., Desplats, C., Kutateladze, M., Monod, C., Ackermann, H.-F., PRD1. J. Virol. 44, 1013–1020. Krisch, H.M., 2001. Phylogeny of the major head and tail genes of the Mindich, L., Bamford, D., McGraw, T., Mackenzie, G., 1982b. Assembly wide-ranging T4-type bacteriophages. J. Bacteriol. 183, 358–366. of bacteriophage PRD1: particle formation with wild-type and mutant Tuma, R., Bamford, J.K.H., Bamford, D.H., Thomas Jr, G.J., 1996. Struc- viruses. J. Virol. 44, 1021–1030. Nagy, E., 1974. A highly specific phage attacking Bacillus anthracis strain ture, interactions and dynamics of PRD1 virus II. Organization of the STERNE. Acta Microbiol. Acad. Sci. Hung. 21, 257–263. viral membrane and DNA. J. Mol. Biol. 257, 102–115. Nandhagopal, N., Simpson, A.A., Gurnon, J.R., Yan, X., Baker, T.S., Van Etten, J., Lane, L., Meints, R., 1991. Viruses and viruslike particles of Graves, M.V., Van Etten, J.L., Rossmann, M.G., 2003. The structure eukaryotic algae. Microbiol. Rev. 55, 586–620. and evolution of the major capsid protein of a large, lipid-containing, Van Etten, J.L., Meints, R.H., 1999. Giant viruses infecting algae. Annu. DNA-virus. Proc. Natl. Acad. Sci. USA 99, 14758–14763. Rev. Microbiol. 53, 447–94. Olkkonen, V.M., Bamford, D.H., 1989. Quantitation of the adsorption and van Oostrum, J., Burnett, R.M., 1985. Molecular composition of the ade- penetration stages of bacteriophage phi6 infection. Virology 171, 229– novirus type 2 virion. J. Virol. 56, 439–448. 238. van Raaij, M.J., Mitraki, A., Lavigne, G., Cusack, S., 1999. A triple Olsen, R.H., Siak, J.-S., Gray, R.H., 1974. Characteristics of PRD1, a beta-spiral in the adenovirus fibre shaft reveals a new structural motif plasmid-dependent broad host range DNA bacteriophage. J. Virol. 14, for a fibrous protein. Nature 401, 935–938. 689–699. Walin, L., Tuma, R., Thomas Jr., G.J., Bamford, D.H., 1994. Purification Pajunen, M., Kiljunen, S.J., So¨derholm, M.E.-L., Skurnik, M., 2001. Com- of viruses and macromolecular assemblies for structural investigations plete genome sequence of the lytic bacteriohage phiYeO3 of Yersinia using a novel ion exchange method. Virology 201, 1–7. enterolitica serotype O:3. J. Bacteriol. 183, 1928–1937. Wommack, K.E., Clowell, R.R., 2000. Virioplankton: viruses in aquatic Rice, P., Longden, I., Bleasby, A., 2000. EMBOSS: The European Mo- ecosystems. Microbiol. Mol. Biol. Rev. 64, 69–114. lecular Biology Open Software Suite. Trends Genet 16, 276–277. Wong, F.H., Bryan, L.E., 1978. Characteristics of PR5, a lipid-containing Rydman, P.S., Bamford, D.H., 2000. Bacteriophage PRD1 DNA entry uses plasmid-dependent phage. Can. J. Microbiol. 24, 875–882. a viral membrane-accociated transclycosylase activity. Mol. Microbiol. Yamamoto, K.R., Alberts, B.M., Benzinger, R., Lawhorne, L., Treiber, G, 37, 356–363. 1970. Rapid bacteriophage sedimentation in the presence of polyeth- Rydman, P.S., Bamford, J.K.H., Bamford, D.H., 2001. A minor capsid protein P30 is essential for bacteriophage PRD1 capsid assembly. J. ylene glycol and its application to large-scale virus purification. Virol- Mol. Biol. 313, 785–795. ogy 40, 734–744. Rydman, P.S., Caldentey, J., Butcher, S.J., Fuller, S.D., Rutten, T., Bam- Yan, X., Olson, N.H., Van Etten, J.L., Bergoin, M., Rossmann, M.G., ford, D.H., 1999. Bacteriophage PRD1 contains a labile receptor bind- Baker, T.S., 2000. Structure and assembly of large lipid-containing ing structure at each vertex. J. Mol. Biol. 291, 575–587. dsDNA viruses. Na Struct. Biol. 7, 101–103.