View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Virology 444 (2013) 374–383

Contents lists available at ScienceDirect

Virology

journal homepage: www.elsevier.com/locate/yviro

Genomic characterization of six novel pumilus bacteriophages

Laura Lorenz a, Bridget Lins b, Jonathan Barrett a, Andrew Montgomery a, Stephanie Trapani a, Anne Schindler a, Gail E. Christie c, Steven G. Cresawn d, Louise Temple a,n

a Department of Integrated Science and Technology, James Madison University, Harrisonburg, VA 22807, USA b Department of Biochemistry and Molecular biology, College of Medicine, University of Florida, Gainesville, FL 32610, USA c Department of Microbiology and Immunology, Virginia Commonwealth University, Richmond, VA 23298, USA d Department of Biology, James Madison University, Harrisonburg, VA 22807, USA

article info abstract

Article history: Twenty-eight bacteriophages infecting the local host Bacillus pumilus BL-8 were isolated, purified, and Received 1 March 2013 characterized. Nine genomes were sequenced, of which six were annotated and are the first of this host Returned to author for revisions submitted to the public record. The 28 phages were divided into two groups by sequence and 7 April 2013 morphological similarity, yielding 27 cluster BpA phages and 1 cluster BpB phage, which is a BL-8 Accepted 4 July 2013 prophage. Most of the BpA phages have a host range restricted to distantly related strains, B. pumilus and Available online 30 July 2013 B. simplex,reflecting the complexities of Bacillus . Despite isolation over wide geographic and Keywords: temporal space, the six cluster BpA phages share most of their 23 functionally annotated protein features Bacillus pumilus and show a high degree of sequence similarity, which is unique among phages of the Bacillus genera. This Bacteriophage is the first report of B. pumilus phages since 1981. Genomics & 2013 Elsevier Inc. All rights reserved. Host range Bacillus taxonomy

Introduction 1999). B. pumilus spores have been shown to survive in extreme environments, even in spacecraft and space-related facilities after The genus Bacillus contains a diverse group of 165 bacterial strict sterilization techniques (La Duc et al., 2004). The only fully species that are generally Gram-positive, aerobic sporeformers sequenced and assembled B. pumilus genome, strain SAFR-032, (Logan and Halket, 2011). The wide range of phenotypic diversity was isolated from a spacecraft in an assembly facility and showed and ecological origin of species in this genus has complicated resistance up to three times the concentration of hydrogen Bacillus taxonomy; for example, the Bacillus cereus group, com- peroxide and 7.5 times the concentration of UV exposure than prised of the six species B. cereus, Bacillus thuringiensis, Bacillus related Bacillus species (Gioia et al., 2007). anthracis, Bacillus mycoides, Bacillus pseudomycoides, and Bacillus A number of bacteriophages (phages) in the genus Bacillus have weihenstephanensis, do not resolve by 16S rDNA sequence analysis been described. Bacteriophages are arguably the most abundant but show high phenotypic variability (Maughan and Van, 2011). organisms on Earth, with an estimated global population of 1031 Despite difficulties in classifying taxonomy, a number of specific phage particles, and the potential genetic diversity is nearly Bacillus species are used in industry, i.e. in molecule production limitless (Hatfull et al., 2010). Phages are utilized in applications and food fermentation. In addition, two Bacillus species infect as diverse as therapy for human bacterial infections, biocontrol of humans; B. cereus and B. anthracis are the causative agents of food- pathogens on produce or animal products, gene or vaccine borne illness and anthrax, respectively. A number of other Bacillus delivery systems, protein display systems, and bacterial typing species have intermittently been implicated in food-associated (Haq et al., 2012). Bacillus species phages show great diversity in illness, including Bacillus licheniformis, Bacillus subtilis, and Bacillus morphology, sequence length, sequence content, and host range, pumilus (From et al., 2007; Logan, 2012; Salkinoja-Salonen et al., reflecting the high variability among and within species in this genus. Sequenced phage records exist in the GenBank database for B. anthracis, B. subtilis, B. thuringiensis, Bacillus clarkii, and B. cereus, n Correspondence to: MSC 4415, 701 Carrier Drive, Harrisonburg, VA 22807, USA. including the extensively studied superfamily reference phages E-mail addresses: [email protected] (L. Lorenz), [email protected] SPO1 and φ29 (Klumpp et al., 2010; Meijer et al., 2001) 2010 and (B. Lins), [email protected] (J. Barrett), [email protected] B. anthracis typing phage Gamma (Abshire et al., 2005). Prior to (A. Montgomery), [email protected] (S. Trapani), [email protected] (A. Schindler), [email protected] (G.E. Christie), this study, at least 33 other B. pumilus phages were reported [email protected] (S.G. Cresawn), [email protected] (L. Temple). (Bramucci et al., 1977; Reanney and Teh, 1976; van Elsas and

0042-6822/$ - see front matter & 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.virol.2013.07.004 L. Lorenz et al. / Virology 444 (2013) 374–383 375

Penido, 1981), but none have been sequenced and the latest study Finn ranged between 49,259 and 50,161 bp in length and between was published in 1981 (van Elsas and Penido, 1981). 75 and 79 number of genes, and contain a terminal redundancy In this study, 28 B. pumilus phages were isolated, purified, and between 794 and 852 bp. These and other genometrics of the characterized. Nine genomes were sequenced, and the annotation Cluster BpA phages are described in Table 2. The accessioned of six of them is described here. This report is the first genomic Cluster BpA phages, Andromeda, Taylor, Eoghan, Gemini, Curly, analysis of B. pumilus phages and nearly doubles the number of and Finn, were annotated and their gene organization analyzed. B. pumilus phage isolates shown in the literature. Putative functions were described for 23 genes in each genome and they showed similar organizational structure to each other, schematically depicted in Fig. 2. Results The Cluster BpB phage, Pleiades, was uncovered as a separate contig in the Polaris (Cluster BpA phage) sequence assembly and Isolation and characterization of novel B. pumilus phages visualized as a podovirus by electron microscopy of lysates resulting from infection with Cluster BpA phages (Fig. 1). Pleiades Twenty-eight phages (Table 1) were isolated from various soil was discovered to exist as a prophage when Pleiades-specific samples over a period of four years using the locally isolated host primers produced amplicons in samples of B. pumilus BL-8 B. pumilus BL-8. Viral morphology of 21 of the phages was genomic DNA (Fig. 3). Pleiades appears to be uninducible by either visualized by negative stain transmission electron microscopy. UV or mitomicin-C; however, no sensitive strain for plaque The siphoviridae morphotype was observed for 20 of the imaged tests exists for Pleiades, hampering efforts at further analysis. phages and the podoviridae for one phage. Representative TEM Preliminary experiments with additional B. pumilus strains images are depicted in Fig. 1. Nine of the isolated phages were revealed a dependence of Cluster BpA phage infection on the sequenced. Six of these sequenced phages (Andromeda, Taylor, presence of a similar prophage in partially sequenced strain ATCC Eoghan, Gemini, Finn, Curly) have been annotated and accessioned 7061 (accession number ABRX00000000.1) (data not shown). in the GenBank database. The final three sequenced phages Pleiades-specific sequences have been found in varying ratios in (Pleiades, Polaris, and BC) are in varying stages of sequence all of the Cluster BpA phage sequence assemblies, suggesting there finishing and annotation. is some relationship between Cluster BpA phage infection and expression of the BL-8 prophage; these studies are ongoing.

Analysis of the sequenced phages Genome organization and annotation of the finished cluster Genome comparisons of the nine sequenced genomes yielded BpA phages two groups of genomically distinct phages: the Cluster BpA phages (Andromeda, Taylor, Eoghan, Gemini, Curly, Finn, BC, and Polaris) The six accessioned Cluster BpA phages showed a uniform and the Cluster BpB phage, Pleiades. The sequenced Cluster BpA structural organization consisting of three putative gene cassettes phages shared an average of 82% genome similarity to each other. for structural genes, lysis genes, and replication genes, based on The genomes of Andromeda, Taylor, Eoghan, Gemini, Curly, and the 23 functionally annotated features. All genomes began with a

Table 1 Isolation information of the 28 phages presented in this study ordered by isolation date. Shaded rows are phages that have been sequenced. All phages except for phages Taylor and Eoghan came from distinct soil samples.

Name Isolation date Isolation source GPS coordinates

Polarisa 2008 Greenhouse soil, JMU campus 38.439329, 78.872169 Pleiades 2008 Greenhouse soil, JMU campus 38.439329, 78.872169 BC a 2008 Greenhouse soil, JMU campus 38.439329, 78.872169 JG a 2008 Greenhouse soil, JMU campus 38.439329, 78.872169 Leonitasa 2008 Greenhouse soil, JMU campus 38.439329, 78.872169 Andromeda a 2009 Grassy area, Harrisonburg VA 38.4234667, 78.8846667 Gemini a 2009 Near a reservoir, JMU campus 38.4344258, 78.874342 Taylor a 2009 Compost pile, NJ 40.7455556, 74.4808333 Eoghan a 2009 Compost pile, NJ 40.7455556, 74.4808333 Aladdinb 2009 Fairly dry residential compost, NJ 40.7455556,74.4975000 Habu a 2009 Unknown Unknown Dextera 2009 Small hill surrounded by shrubs Unknown Maximillian a 2009 Edge of lake, JMU campus 38.423327, 78.881133 Strevson a 2009 Mulch near statue, JMU campus 38.4340500,78.8636890 DRNb 2009 Mulch under tree, JMU campus 38.433364, 78.861773 Valhallaa 2009 Mulch under tree, JMU campus 38.433364, 78.861773 SB1a 2009 Zion National Park, Utah Unknown SB2a 2009 Zion National Park, Utah Unknown TDC a 2009 Unknown Unknown LCDa 2009 Grassy area, Harrisonburg VA 38.4234667, 78.8846667 Cheesea 2009 Near horse barn, 4 miles NW of Harrisonburg, VA Unknown Davea 2009 Soil near pond 38.4340500,78.8636890 Samboa 2009 Unknown Unknown Tweety a 2009 Compost pile, NJ 40.7455556, 74.4808333 Kylea 2009 Base of pine tree, JMU campus 38.4340100,78.8679740 Sigmaa 2009 Compost pile, NJ 40.7455556, 74.4808333 Curly 2010 Compost pile, Harrisonburg VA 38.4817750, 78.8756778 Finn 2010 Under a tree, Harrisonburg VA 38.4444722, 078.9176806

a Infects B. pumilus BL-8 (original host) and B. simplex RWR1, does not infect B. cereus 113, B. megaterium 113, B. weihenstaphanensis LazoA1, or B. pseudomycoides 113. b Only infects B. pumilus BL-8 (original host), does not infect B. simplex RWR1, B. cereus 113, B. megaterium 113, B. weihenstaphanensis LazoA1, or B. pseudomycoides 113. 376 L. Lorenz et al. / Virology 444 (2013) 374–383

Fig. 1. Representative TEMs of observed phage particle morphology. Phages pictured are: A, Curly; B, Gemini; C, Eoghan; D, Taylor; E, Pleiades. Pleiades image was derived from an Andromeda infection that yielded a mixed burst of phage.

Table 2 Genometrics and morphological data of the nine sequenced phages. Shaded phages have completed sequencing and annotation and are available in Genbank. Unshaded phages are incomplete and the metrics shown are based on incomplete sequence assembly and auto-annotation. Phage particle dimensions are averages in cases where the phage was imaged more than once with the following sample sizes and standard deviations of head length, head width, and tail length measurements: Taylor, n¼2, SD71 nm,74nm,77 nm; Eoghan, n¼4, SD75 nm,73nm,74 nm; Curly, n¼6, SD73 nm,72 nm,79nm; Pleiades, n¼3, SD72nm,71nm.

Name Length Number of GC content Percentage Terminal repeat length Particle Head length Head width Tail length (bp) genes (%) coding (bp) morphology (nm) (nm) (nm)

Andromeda 49259 79 41.91 91.54 852 Siphoviridae 55 56 115 Gemini 49362 79 41.9 91.22 852 Siphoviridae 47 48 149 Taylor 49492 75 42.29 91.82 831 Siphoviridae 58 60 114 Eoghan 49558 76 42.22 91.74 794 Siphoviridae 56 54 125 Curly 49425 77 41.82 91.78 852 Siphoviridae 57 54 110 Finn 50161 77 41.69 90.19 849 Siphoviridae 45 41 153 Polaris 49096 83 42.13 Siphoviridae Pleiades 65676 114 47.57 Podoviridae 59 37 BC 48729 88 41.96 Siphoviridae highly conserved terminally redundant region, followed by seven morphogenesis gene, scaffold protein gene, phage virion morpho- highly variable small genes of unknown function and no signifi- genesis gene, tape measure protein gene, and tailspike protein cant similarity in the BLASTp non-redundant database. These were gene. After a gap of three unknown genes, the lysis cassette followed by a terminase gene, beginning the putative struct- followed, including an autolysin and a holin gene. The remaining ural cassette that also included the portal gene, phage head annotated genes encode proteins either directly or indirectly L. Lorenz et al. / Virology 444 (2013) 374–383 377

Fig. 2. Genome schematic of B. pumilus phage Curly, overlaid with annotated gene functions. Terminally redundant regions are shaded purple, structural cassette annotated genes in green, lysis cassette genes in orange, and replication cassette genes in blue. The six BpA cluster phage genomes share similar organization and features.

Table 4 PCR results of the seven unsequenced Bacillus phages that failed to amplify one or more of the seven conserved regions found in five of the sequenced BpA phages. Presence or absence of the PCR product are denoted by the (+) and () signs, respectively.

P1 P2 P3 P4 P5 P6 P7

Maximillian +++++ + Aladdin + + + Dave ++ ++++ Cheese ++ ++++ Valhalla ++ ++++ DRN ++ + + Leonitas +++ Fig. 3. Gel of PCR results of B. pumilus BL-8 genomic DNA amplified using primers designed from the Pleiades (A–C) and Polari (D–F) sequences. Lane A: Pleiades a/b primers; Lane B: Pleiades c/d primers; Labe C: Pleiades e/f primers; Lane D: Polaris a/b primers; Lane E: Polaris c/d primers; Lane F: Polaris e/f primers; Lane G: 1 kb Host range ladder; Lane H: no template with Pleiades a/b primers. The host range of 25 of the phages was compared between the known host, B. pumilus BL-8, and four different Bacillus species related to replication, including a helicase, primase, polymerase isolated locally: B. simplex RWR1, B. cereus-113, B. thuringiensis-113, subunits and factors, and dNTP regulating enzymes. and Bacillus megaterium-113. All but two phage isolates, Aladdin and DRN, infected strain RWR1, and none infected any other Bacillus strains tested (Table 1). Clustering of the unsequenced phages Comparison with other sequenced Bacillus phages Sequence similarity of the unsequenced phages was investi- gated using two approaches: PCR of regions conserved in the Andromeda, Taylor, Eoghan, Gemini, Curly, and Finn were sequenced phages and restriction digests of genomic DNA. PCR compared in the program Phamerator with 18 other previously primers were designed for seven conserved regions in five of the sequenced Bacillus phages: five B. anthracis phages (AP50, Fah, sequenced BpA phages (Andromeda, Taylor, Eoghan, Gemini, Cherry, WBeta, Gamma), four B. thuringiensis phages (Bam35c, Polaris) and the presence or absence of amplicons in 19 un- GIL16c, IEBH, 0305phi8–36), six B. subtilis phages (SPBc2, SPO1, sequenced phages was observed. Twelve of the unsequenced SPP1, Φ29, B103, phi105), one B. clarkii phage (BCJA1c), one phages, along with the five sequenced phages from which the B. cereus phage (TP21-L) and one Lactobacillus plantarum phage primers were designed, amplified in all seven conserved regions. (LP65). Only small regions of sequence similarity (Eo10–4) existed The remaining seven phages did not amplify in at least one of the between our phages and phages of other host genera (Supple- regions. Table 3 depicts the location of the primer sets and Table 4 mentary Fig. 1). In fact, sequence similarity of this degree was rare lists a summary of the PCR results of the seven phages that failed among all the Bacillus phages, except for those within our to amplify at least one of the conserved regions. All primer sets B. pumilus group and the four B. anthracis phages Cherry, Gamma, correctly amplified the anticipated sequence in the 5 sequenced WBeta, and Fah, which have been shown to be derivatives of a BpA phages and in the following unsequenced phages: Tweety, single B. anthracis prophage named W (Fouts et al., 2006; LCD, LEG, AP, SB1, SB2, TDC, SH, Laurel, JG, KC, Dexter. In addition, a Minakhin et al., 2005). Thirteen of the non-Bacillus pumilus phages restriction digest of the genomes of 24 phages using restriction with a gene annotated or matching a CDD domain indicating a enzyme EcoRI was performed. Gel analysis yielded 11 distinct terminase function were used for phylogenetic analysis against the banding patterns that are depicted in Fig. 4. terminase genes of our six annotated phage genomes by protein 378 L. Lorenz et al. / Virology 444 (2013) 374–383

Table 3 Primer set genome coordinates, location, and gene function within the phage Andromeda. Primers were used in PCR analysis of unsequenced phages.

Primer set Coordinates in Andromeda Location Function

P1 27210–27399 gp38 DNA primase P2 6355–6508 gp11 Head morphogenesis protein P3 13698–13889 gp24 No significant similarity by BLASTn P4 23193–23405 Before and in gp33 No significant similarity by BLASTn P5 31290–21527 In and between gp41 and gp42 DNA polymerase P6 43714–43905 In and between gp66 and gp67 Bacillus conserved unknown protein P7 47989–48236 In and between gp76 and gp77 No significant similarity by BLASTn

Fig. 4. Gel electrophoresis of B. pumilus phage genomic DNA digested with EcoRI. Lane A: Andromeda; Lane B: Taylor; Lane C: Gemini; Lane D: SB2; Lane E: Cheese; Lane F: Strevson; Lane G: SB1; Lane H: LCD; Lane I: Tweety; Lane J: Sigma; Lane K: Eoghan; Lane L: 1 kb Ladder. alignment with Clustal Omega and ClustalW and visualization in Njplot (Fig. 5).

Identification of proteins by mass spectrometry

Fifteen proteins were identified by 1D LC-MS/MS of Curly lysate and Curly phage particles (Table 5). Of these 15, eight had been assigned functional annotation from bioinformatic analysis: gp 32 (endolysin), gp10 (portal), gp9 (terminase), gp42 (DNA polymerase), gp25 (tape measure), gp55 (kinase), gp22 (major tail protein) and gp39 (primase). Although LC-MS/MS analysis did not identify gp16, I-TASSER bioinformatic analysis showed significant hits to the HK9 fold of capsid proteins (Pietila et al., 2013; Roy et al., 2010). Fig. 5. Phylogenetic relationship of terminase genes in 19 Bacillus phages. Phages are color coded according to their host bacterium. Pink is Lactobacillus planarum, Discussion cyan is B. subtilis, red is B. anthracis, green is B. thuringiensis, orange is B. clarkii, purple is B. cereus, and dark blue is B. pumilus. This report contains the first description of novel B. pumilus phages since 1981 and the first analysis of sequenced B. pumilus the other seven phages tested showed conservation in at least phages published. Of the phages tested, all but one displayed the three of the seven regions. By these measures, our population of siphoviridae morphotype and all but two displayed uniform host Cluster BpA phage appears to share significant similarity, though specificity. Sequenced phages were divided into Cluster BpA or restriction digest analysis shows they are not identical. These Cluster BpB based on sequence and morphological similarity. Cluster BpA phages also appear to be ubiquitous in the environ- Among the sequenced Cluster BpA phages, pairwise alignments ment, varying greatly in their geographic and temporal origin over produce an average of 82% nucleotide similarity. Twelve unse- four years and locations as far apart as 300 miles. quenced phages joined the six sequenced Cluster BpA phages due The sequenced Cluster BpA phages displayed a uniform geno- to sequence conservation by PCR in all seven tested regions, while mic structure, and all contained roughly the same 20 annotated L. Lorenz et al. / Virology 444 (2013) 374–383 379

Table 5 Proteins identified by mass spectrometry. Tryptic digests of gel fragments comprising molecular weights from 4–150 kD, inclusive, were analyzed, and 15 predicted proteins were identified.

Gene product # Predicted function of protein E-value for best peptide # of unique peptides Sequence of best peptide(s)

gp32 Endolysin 1.50E-04 2 R.NTAMSAILTESLFINNPADAK.L gp10 Portal 1.50E-04 3 K.IMESLLVPPSLLGLSRGQSGSYALSSNQFDIFMIHLESIQR.D gp29 Hypothetical 1.50E-04 1 K.IQPWVIFQMPDGGEIEFPLGIFYLSTPTR.N gp09 Terminase 1.50E-04 2 K.KKLLIVHMESAPNMDTDER.I gp42 DNA polymerase 1.50E-04 2 K.AIGLTDHGAMHGIPEFMNLAK.Q gp52 Hypothetical 1.50E-04 2 R.ELDLGEVSPRRTSSEHICLYGAPILMDYNVNTGELSVWSASK.T gp25 Tape measure 1.50E-04 1 MASNSLINIVVSATDMASAQIGKVGGALEGMASK.A gp07 Hypothetical 1.50E-04 1 VIVLYKQVYMKDVNVVALLNGEESYGHR.K gp13 Hypothetical 1.50E-04 1 K.VGLPVLLRYAK.N gp22 Major tail protein 1.90E-14 1 MASQSHGFDNTIVFGK.E gp55 Kinase 1.90E-14 1 R.VVGENVCDIHSK.S gp37 Hypothetical 3.60E-14 1 K.PERRDSPTWFR.E gp36 Hypothetical 0.14 1 K.GNFQDIK.Y gp69 Hypothetical 0.024 1 MLEVTSKQGR.A gp39 Primase 0.1 1 R.MMRKQDK.K

Table 6 Evidence of proteins encoded in terminally redundant regions for four other Bacillus phages. Accession numbers for proteins follow below.

Phage Terminal Terminally redundant proteins redundancy length

Bacillus clarkii phage 350 Hypothetical protein gp1a BCJA1c (39) Bacillus subtilis phage 13,185 Host shut off proteins gp38-40, transcription regulation proteins gp44, gp50 and gp51, cell division inhibitor gp56, SPO1 (40) and 23 hypothetical proteins between gp36.1 and gp60b Bacillus subtilis phage 1320 (circularly Terminase small subunit ORF1, terminase large subunit ORF2, hypothetical protein ORF 2.1 and its complement SPP1 (41) permuted) ORF2.2c Bacillus thurengiensis phage 6479 Rel-Spo-like synthetase/hydrolase gp102, and hypothetical proteins gp101, and gp 103-107d 0305phi8-36 (42)

a YP_164380.1 b gp36.1 YP_002300277.1, gp36.2 YP_002300278.1, gp36.3 YP_002300279.1, gp35.1 YP_002300280.1, gp37 YP_002300281.1, gp38 YP_002300282.1, gp39 YP_002300283.1, gp40 YP_002300284.1, gp41 YP_002300285.1, gp42 YP_002300286.1, gp43 YP_002300287.1, gp44 YP_002300288.1, gp45 YP_002300289.1, gp46 YP_002300290.1, gp47 YP_002300291.1, gp48 YP_002300292.1, gp49 YP_002300293.1, gp50 YP_002300294.1, gp51 YP_002300295.1, gp52 YP_002300296.1, gp52.2 YP_002300297.1, gp52.1 YP_002300298.1, gp53 YP_002300299.1, gp54 YP_002300300.1, gp55 YP_002300301.1, gp56 YP_002300302.1, gp57 YP_002300303.1, gp58 YP_002300304.1, gp59 YP_002300305.1, gp60 YP_002300306.1 c ORF1 NP_690652.1, ORF 2 NP_690654.1, ORF 2.1 NP_690655.1, ORF 2.2 NP_690656.1. d gp 102 YP_001429592.1, gp 101 YP_001429591.1, gp 103 YP_001429593.1, gp 104 YP_001429594.1, gp 105 YP_001429595.1, gp 106 YP_001429596.1, gp 107 YP_001429597.1 features. Though broad regions of functional similarity were morphogenesis gene gp11, a putative translational frameshift assigned to this organization, most of the predicted genes did between gp23 and 24, the tape measure gene gp25, and the not match significantly to the BLAST non-redundant database or tailspike protein gene gp 28. CDD-BATCH and BLASTp results matched only with genes encoding proteins of unknown function. linked gp11 to the domain family pfam04233 that includes the All sequenced Cluster BpA genomes began with a highly well-studied GP7 gene product from B. subtilis phage SPP1, a conserved terminally redundant region. Though not all terminal protein that has been implicated in head assembly (Becker et al., redundancies encode proteins, of the 18 Bacillus and Lactobacillus 1997), though lack of GP7 mutants inhibited further analysis. Gp24 phages utilized for comparative analysis in this paper, four Bacillus may be expressed as a C-terminal extension of gp23 by a transla- phages with terminally redundant ends contain coding potential: tional frameshift, implied by the lack of a good candidate RBS B. clarkii phage BCJA1c (Kropinski et al., 2005), B. subtilis phage upstream of gp24 and a “slippery” run of adenosine bases over- SPO1 (Stewart et al., 2009), B. subtilis phage SPP1 (Alonso et al., lapping the end of gp23 and the beginning of gp24, much like the 1997) and B. thuringiensis phage 0305phi8-36 (Thomas et al., translational frameshift observed in tail maturation proteins in E. 2007). Gene calls in these regions range from hypothetical coli phages λ and P2 (Christie et al., 2002; Levin et al., 1993; Xu proteins to critical enzymes such as terminase subunits (Table 6), et al., 2004). The tape measure gene, the length of which has been but the gene observed in the terminal redundancy in the Cluster shown to control tail length in tailed phages (Katsura and Hendrix, BpA phages did not show homology by BLASTp to any of these 1984), is nearly identical in this set of phages except for that of proteins, nor did HHPred analysis confidently identify any struc- Finn, whose tape measure gene is 175 bases longer. This roughly ture similarities to the June 1, 2013 PDB70 database. corresponds to Finn's longer observed tail length, though the Using phage Curly for reference, the putative structural cassette sample size for tail length measurements in this family of phages began with the terminase at gp9 and also included the portal are prohibitively small for statistical analysis. Finally, HHPred protein gene (gp10), phage head morphogenesis gene (gp11), analysis of the tailspike protein gp28 yielded many high prob- scaffold protein gene (gp14), major head protein (gp16), phage ability domains, including many fungal and bacterial sugar virion morphogenesis gene (gp19), major tail protein (gp22), tape degrading enzymes implicated in cell wall degradation and a measure protein gene (gp25), and tailspike protein gene (gp28). Of 100% probability match (E¼2 1026, p¼6.6 1031) between particular interest in this cassette are the phage head gp28 and the self-cleaving trimeric B. subtilis phage Φ29's tail 380 L. Lorenz et al. / Virology 444 (2013) 374–383 appendage protein (Φ29 gp12). Xiang et al. identified that the genus, in which species are difficult to resolve even with modern sequence of the tail appendage proteins from Φ29 shows similarity molecular methods. Hardy spores persist in the most sterile of to various polysaccharide binding enzymes due to its activity in environments, and two species are potentially lethal human patho- digesting the bacterial cell wall during irreversible attachment gens. Host range studies performed with 25 novel B. pumilus phages (Xiang et al., 2009). The strong similarity between the BpA showed all but two (Aladdin, DRN) cross-infected in B. simplex but not tailspike protein gp28, various polygalactoronases, and Φ29's tail one infected B. thuringiensis, B. cereus, B. megaterium, B. weihenstepha- appendage protein, may suggest that gp28 shares functional and nensis, or B. pseudomycoides (Table 1). mechanistic similarity to the well-described Φ29 gp12 in vivo. ThehostrangeresultshaveimplicationsforBacillus taxonomy. Mass spectrometry analysis provided identification of 15 pro- Modern taxonomy separates the seven species used for host range teins from a crude lysate of Curly. Six of the identified proteins fall testing into three distinct clades; B. pumilus in one, B. cereus sensu lato into the predicted structural gene region. (including B. cereus, B. thuringiensis, B. weihenstephanensis, and Many of the functionally annotated genes described here are B. pseudomyoides) in another, and B. simplex and B. megaterium in present in other phages, but their sequence and order are unique the last (Maughan and Van, 2011). Interestingly, all but two tested among phages of the same host genus. The scaffold protein of the phages cross-infected B. simplex but not B. megaterium,twoclosely structural cassette and holin protein of the lysis cassette belong to related bacterial species by 16S rDNA analysis. However, Maughen gene families that exhibit little sequence conservation, so BLASTp et al. showed that rearranging the phylogeny to minimize 11 pheno- or CDD searches yielded no significant hits. As a result, functions of typic changes along the timeline places these two species further the scaffold and holin genes were assigned instead by analyzing apart, with B. pumilus, our phage host, evolutionarily between them candidate genes′ protein secondary structure elements using the (Maughan and Van, 2011). This suggests that some specific phenotype program PSIPRED and looking for structural features as described shared between B. pumilus and B. simplex, two genetically distant by Morais (Morais et al., 2003) and Young (Young and Bläsi, 1995). species, is related to infection by these phages, possibly one of the The sequence and order of the holin and its accompanying phenotypes that Maughen identified in her analysis (Maughan and endolysin gene have an unusual genome placement conserved Van, 2011). Further studies with the two restricted host range phages, among the six sequenced B. pumilus phages. These genes are Aladdin and DRN, may elucidate the connection between these three usually found directly adjacent in other phage genomes, as in B. Bacillus species. subtilis phages Φ29, PZA, and SPO1 (Stewart et al., 2009; Young, Inability of these 25 phages to infect strains of B. cereus sensu lato 1992), but all phages described by this study have a small variable may indicate that these phages will not infect other B. cereus group region containing one to two genes of unknown function situated species including B. anthracis, but host range studies with this group between the two canonical lysis genes. Despite the high similarities could yield a potential bacterial typing system if these phages in sequence and genomic structure between them, the sequenced preferentially infect one species in the group. Interestingly, we have phage genomes varied in their length, %GC content, and number of recently isolated a new myovirus on B. pumilus BL-8 that can cross- ORFs, particularly in regions with smaller genes. Previous research infect B. thuringiensis. This is the first B. pumilus phage we have in Mycobacteriophages demonstrates that smaller genes are gen- isolated that is a myovirus and that can cross-infect B. thuringiensis erally in greater genetic flux, likely due to illegitimate recombina- (data not shown). Sequencing of this phage genome may reveal tion events that favor smaller independent movable units (Hatfull genomic differences that define phage host range among these Bacillus et al., 2010). Our data suggest this may be a more widespread species. phenomenon in phage of host genera other than Mycobacteria. In summary, the 28 novel Bacillus phages presented here To examine gene transfer among phages infecting the Bacillus provide insight into three main questions. First, the terminase genera, the terminase genes of 19 sequenced Bacillus phages, phylogenetic tree implies a slow rate of genetic transfer among including the novel six accessioned B. pumilus phage genomes, Bacillus phages except among B. pumilus and B. anthracis phages. were aligned and a phylogenetic tree produced (Fig. 5). Terminases Studies are ongoing to characterize the BpB cluster B. pumilus BL-8 have previously been used to assess phylogenetic relationships prophage, Pleiades, in order to explain the prophage–phage between phages (Hatfull et al., 2010) as they are one of the most interaction between it and the BpA phages we have observed. highly conserved gene families among phages (Casjens, 2005). Their interaction may also explain the observed high genetic Interestingly, categorization by initial host is not always preserved transfer rate among the BpA phages and how it relates to the in this analysis; only the B. pumilus and B. anthracis phages B. anthracis phages and their ancestor prophage W. Second, the remained grouped by this method, while the latter likely share restricted host range phages Aladdin and DRN and the unique high sequence similarity due to a shared prophage ancestor (Fouts B. pumilus/B. thuringiensis cross-infecting phage Glitter may help et al., 2006; Minakhin et al., 2005). The arrangement does not identify useful typing variations in difficult to resolve Bacillus seem to be grouped by similar secondary hosts, either, as the only species such as B. simplex, B. megaterium, and the uniquely known multi-species infecting phages in this group, Cherry, complicated B. cereus sensu lato group. Finally, the contribution Gamma, and IEBH, are not aligned close to B. cereus phage TP21, of sequenced B. pumilus phages to the literature provides insight which shares a secondary host species with them. These data into the wide range of Bacillus phage diversity and provides suggest that horizontal gene transfer is not prevalent in the methods of clustering and categorization for future investigators. previously published sample of Bacillus phages, even when infec- tion of a shared species is possible. This implies that the B. pumilus phages have either recently diverged (despite the wide geographic Materials and methods and temporal distance of the isolates) or have a uniquely high rate of horizontal gene transfer compared to other Bacillus phages. Isolation and identification of strains Most published Bacillus phages are limited to a single species and in some cases to only a few strains, though B. subtilis phage Φ29 has Brain heart infusion (BHI) broth and agar plates were used at shown the ability to infect up to four different Bacillus species (Meijer 37 1C for the routine cultivation of . Cultures were aerated et al., 2001). Phages with narrow host range can be utilized for in BHI overnight on a rotary shaker at 220 rpm. Plate lawns or bacterial typing purposes, whereas broad host range phage can be isolation streaks were grown on 1.5% agar BHI plates. Bacterial utilized for sterilization or for their potential ability to infect and strains were maintained on plates at 4 1C for routine use and in destroy animal pathogens. All of these are critical issues in the Bacillus 50% glycerol stocks at 70 1C for long-term storage. L. Lorenz et al. / Virology 444 (2013) 374–383 381

Isolation and identification of strains 10 min. For forward and reverse primers see Supplementary Table 1. Several Bacillus strains were isolated from soil on the campus of James Madison University (Harrisonburg, VA) and used in this study: Prophage identification B. pumilus BL-8, B. simplex RWR1, B. cereus-113, B. thuringiensis-113, B. megaterium-113, B. weihenstephanensis LazoA1 and B. pseudomy- The B. pumilus BL-8 prophage Pleiades was identified by PCR of coides-113. To isolate strains, approximately 1 ml of soil was suspended BL-8 genomic DNA with Pleiades-specific primers (Supplementary in 10 ml 1 phosphate buffered saline (PBS) and mixed. After the soil Table 2). PCR was performed in a Bio-Rad Thermal Cycler (Bio-Rad, settled by gravity, serial dilutions of the supernatant were performed Hercules, CA) under the following cycle: 3 min at 95 1C, then 95 1C using 1 PBS, which were then plated onto BHI plates and incubated for 30 s, 54 1C for 30 s, and 72 1C for 60 s for 30 cycles, and a final at 37 1C overnight. Individual colonies were restreaked to ensure 72 1C for 5 min. For forward and reverse primers see Supplemen- isolation. To identify strains, amplification of 16S rDNA genes from tary Table 2. the isolates were performed as described by Fierer et al. (Fierer and Jackson, 2006) and sequencing was performed by Elim Biopharma- Sequencing and genome finishing ceuticals (Hayward, CA). 16S rDNA sequence comparison and putative species identification was accomplished using BLAST analysis and the Phages were sequenced at Virginia Commonwealth University's Ribosomal Database Project web tool (Cole et al., 2009). Sequencing Center (Richmond, VA). Assembly was done in the program Newbler (454 Life Sciences, Branford, CT) and analyzed in Consed version 16.0 (Gordon et al., 1998). Regions of weak or Phage isolation missing reads for incomplete phage genomes were amplified using PCR, sequenced at Elim Biopharmaceuticals (Hayward, CA) and To obtain a population of B. pumilus specific particles, approxi- reassembled in Sequencher version 5.0 (Gene Codes Corporation, mately 1 ml of soil was enriched with B. pumilus BL-8 in 10 ml BHI Ann Arbor, MI). Terminally redundant regions identified by double was incubated overnight with shaking at 37 1C. The resulting read coverage and defined read ends in Consed and were manually mixture was centrifuged and the supernatant sterilized by either fitted into the sequenced genome after read assembly. the addition of chloroform or filtered through a 0.22 μm filter (Thermo Fischer Scientific, Waltham, MA). Phage were isolated Electron microscopy and purified by plaque purification. Briefly, liquid suspensions of B. pumilus BL-8 were infected with the filtered soil solution and High titer phage lysate was adhered to EM grids (Ted Pella, agar was added. Mixtures were plated and allowed to harden Redding, CA), washed with sterile H 0, and stained with 3.0% before incubation overnight at 37 1C. To purify individual phage, 2 uranyl acetate. Images were taken at Virginia Commonwealth plaques were picked and up to 8 sequential rounds of plaque University (Richmond, VA). Image analysis was done using ImageJ purification were performed until a single plaque morphology was (Schneider et al., 2012). assured. High-titer lysates (4109 pfu/ml) were obtained by flood- ing infected plates displaying ∼2 103–5 103 plaques with phage Genome annotation buffer (PB: 10 mM Tris, pH 7.5, 10 mM MgSO4, 68 mM NaCl, 1 mM CaCl ), incubating six hours to overnight at 4 1C, then collecting 2 Phage genomes were auto-annotated primarily using the and filtering the liquid. Glycerol stocks of phage isolates were statistical gene and RBS prediction program Genemark.hmm-P stored in 40% glycerol at 70 1C. version 2.8 (Lukashin and Borodovsky, 1998) using the B. pumilus model species, but also with Glimmer version 3.02 (Delcher et al., Phage characterization 1999), and tRNA-scan-SE version 1.21 (Schattner et al., 2005). Annotations were refined manually using Apollo version 1.11.6 To test host range or titer calculation, spot tests were per- (Lewis et al., 2002), Artemis version 14 (Rutherford et al., 2000), formed by dropping high-titer lysate or dilutions of high-titer and DNA Master (Lawrence Lab, University of Pittsburgh, http:// lysate onto plates containing a top agar consisting of 0.7% BHI-agar cobamide2.bio.pitt.edu/). Final gene calls were optimized by com- mixed with 400 μl of the target bacterial species at OD500∼1.0. To paring program predictions, as well as BLASTp results (Altschul extract phage DNA, a modification of the DNA clean-up protocol et al., 1990), and preferentially choosing or manually refining gene from Promega (2010) was used. Briefly, a high titer phage lysate calls for matches with existing genes, longer ORFs (minimum was incubated with a nuclease mixture to remove contaminating 100 bp in length), higher Shine-Dalgarno scores, minimizing inter- bacterial DNA (0.25 mg/ml each DNaseI and RNaseA, New England gene overlap or gap, and including all coding potential. Putative Biolabs, Ipswich, MA), then precipitation solution (30% polyethy- gene functions were assigned by corroborating BLASTp, CD-BATCH lene glycol, PEG8000, and NaCl to a final concentration of 3.3 M) Search (Marchler-Bauer et al., 2011), PSIPRED (McGuffin et al., was added before incubation at 4 1C overnight. Phage particles 2000), HHPred (Söding et al., 2005), and I-TASSER (Roy et al., 2010; were pelleted by centrifugation and resuspended in sterile dH20. Zhang, 2008) results. GenBank accession numbers are as follows: Two ml of DNA clean-up resin (ProMega, Madison, WI) was added Curly, KC330679; Eoghan, KC330680; Gemini, KC330681; Taylor, and mixed gently. This slurry was added to DNA purification KC330682; Finn, KC330683; Andromeda, KC330684. columns (Zymo, Irvine, CA) and spun at 13,000 g. The resin was washed with 80% isopropanol and the DNA was eluted in Comparative genomic analysis dH2O. DNA was quantified using a Nanodrop spectrophotometer (Thermo Fischer Scientific, Waltham, MA). To perform restriction Phage genomes were analyzed using the program Phamerator digests, phage DNA was incubated with EcoRI restriction endonu- (Cresawn et al., 2011). Comparisons were made between the six clease (New England Biolabs, Ipswich, MA) overnight at 37 1C, and B. pumilus phages and 18 other sequenced Bacillus phages (Gen- products were visualized on a 1% agarose gel. PCR for clustering of Bank accession numbers NC_011523, NC_007814, NC_007457, unsequenced phages was performed in a Bio-Rad Thermal Cycler NC_007734, NC_007458, NC_005258, NC_006945, NC_011167, (Bio-Rad, Hercules, CA) under the following cycle: 95 1C for 30 s, NC_009760, NC_011048, NC_011421, NC_001884, NC_004166, 55 1C for 30 s, and 72 1C for 30 s for 30 cycles, and a final 72 1C for NC_004165, NC_004167, NC_006557, NC_011645, NC_006565). 382 L. Lorenz et al. / Virology 444 (2013) 374–383

Multiple alignments were performed with Clustal Omega (Sievers Alonso, J.C., Lüder, G., Stiege, A.C., Chai, S., Weise, F., Trautner, T.A., 1997. The et al., 2011), phylogenetic trees were constructed using ClustalW2 complete nucleotide sequence and functional organization of Bacillus subtilis bacteriophage SPP1. Gene 204, 201–212. (Larkin et al., 2007) and phylogenetic trees were visualized using Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local Njplot (Perrière and Gouy, 1996). Genome maps were generated alignment search tool. J. Mol. Biol. 215, 403–410. using SnapGene Viewer (GSL Biotech, Chicago IL). Becker, B., de la Fuente, N., Gassel, M., Günther, D., Tavares, P., Lurz, R., Trautner, T.A., Alonso, J.C., 1997. Head morphogenesis genes of the Bacillus subtilis bacteriophage SPP1. J. Mol. Biol. 268, 822–839. Protein identification by mass spectrometry Bramucci, M.G., Keggins, K.M., Lovett, P.S., 1977. Bacteriophage conversion of spore-negative mutants to spore-positive in Bacillus pumilus. J. Virol. 22, – fi 194 202. Proteins were identi ed using 1D LC-MS/MS on tryptic digested Casjens, S.R., 2005. Comparative genomics and evolution of the tailed- protein gels of precipitated phage particles and high-titer lysate of bacteriophages. Curr. Opin. Microbiol. 8, 451–458. phage Curly. Phage particles from a high-titer lysate were pre- Christie, G.E., Temple, L.M., Bartlett, B.A., Goodwin, T.S., 2002. Programmed 1 translational frameshift in the bacteriophage P2 FETUD tail gene operon. cipitated overnight at 4 C with precipitate solution (30% poly- J. Bacteriol. 184, 6522–6531. ethylene glycol, PEG8000, and NaCl to a final concentration of Cole, J.R., Wang, Q., Cardenas, E., Fish, J., Chai, B., Farris, R.J., Kulam-Syed-Mohideen, A.S., McGarrell, D.M., Marsh, T., Garrity, G.M., Tiedje, J.M., 2009. The Ribosomal 3.3 M), centrifuged, and the pellet resuspended in dH2O. Both the Database Project: improved alignments and new tools for rRNA analysis. phage precipitate and crude Curly lysate were concentrated with Nucleic Acids Res. 37, D141–D145. 10 kD Microcon microconcentrators (Millipore, Billeria, MA) and Cresawn, S.G., Bogel, M., Day, N., Jacobs-Sera, D., Hendrix, R.W., Hatfull, G.F., 2011. run on a 4–16% Bis-Tris protein gel in MES running buffer (Novex Phamerator: a bioinformatic tool for comparative bacteriophage genomics. – by Life Technologies, Grand Island, NY). Peptides were destained, BMC Bioinform. 12, 395 409. Delcher, A.L., Harmon, D., Kasif, S., White, O., Salzberg, S.L., 1999. Improved digested, and extracted following the procedure by Shevchenko, microbial gene identification with GLIMMER. Nucleic Acids Res. 27, 4636–4641. et al. (Shevchenko et al., 2006). Briefly, four gel fragments Fierer, N., Jackson, R.B., 2006. The diversity and biogeography of soil bacterial representing all the proteins in a single gel lane were excised communities. In: Proceedings of the National Academy of Sciences of the United States of America. vol. 103, pp. 626–631. and destained using ammonium bicarbonate and acetonitrile. Gel Fouts, D.E., Rasko, D.A., Cer, R.Z., Jiang, L., Fedorova, N.B., Shvartsbeyn, A., pieces were saturated with trypsin buffer (13 ng/ul trypsin in Vamathevan, J.J., Tallon, L., Althoff, R., Arbogast, T.S., Fadrosh, D.W., Read, T.D., Gill, S.R., 2006. Sequencing Bacillus anthracis typing phages gamma and cherry 10 mM ammonium bicarbonate containing 10% acetonitrile by – 1 reveals a common ancestry. J. Bacteriol. 188, 3402 3408. volume) and incubated at 55 C for 30 min. Peptides were From, C., Hormazabal, V., Granum, P.E., 2007. Food poisoning associated with extracted using C18 ZipTips (Millipore, Billeria, MA) and the pumilacidin-producing Bacillus pumilus in rice. Int. J. Food Microbiol. 115, sequence derived from the mass spectra were compared to a 319–324. Gioia, J., Yerrapragada, S., Qin, X., Jiang, H., Igboeli, O.C., Muzny, D., Dugan-Rocha, S., database of Curly proteins by sequence query search in Mascot Ding, Y., Hawes, A., Liu, W., Perez, L., Kovar, C., Dinh, H., Lee, S., Nazareth, L., (Matrix Science, Boston, MA) (Perkins et al., 1999). Blyth, P., Holder, M., Buhay, C., Tirumalai, M.R., Liu, Y., Dasgupta, I., Bokhetache, L., Fujita, M., Karouia, F., Eswara Moorthy, P., Siefert, J., Uzman, A., Buzumbo, P., Verma, A., Zwiya, H., McWilliams, B.D., Olowu, A., Clinkenbeard, K.D., Newcombe, D., Golebiewski, L., Petrosino, J.F., Nicholson, W.L., Fox, G.E., Acknowledgments Venkateswaran, K., Highlander, S.K., Weinstock, G.M., 2007. Paradoxical DNA repair and peroxide resistance gene conservation in Bacillus pumilus SAFR-032. The authors wish to acknowledge Dr. Ronald Raab and PLoS ONE 2, e928. Gordon, D., Abajian, C., Green, P., 1998. Consed: a graphical tool for sequence Dr. Stephanie Stockwell of James Madison University for their finishing. Genome Res. 8, 195–202. involvement in the Viral Discovery program, Dr. Mark Radosevich Haq, I., Chaudhry, W., Akhtar, M., Andleeb, S., Qadri, I., 2012. Bacteriophages and of University of Tennessee, Knoxville, for funding sequencing their implications on future biotechnology: a review 9, 9. Hatfull, G.F., Jacobs-Sera, Deborah, Lawrence, J.G., Pope, W.H., Russell, D.A., Ching- efforts, Dan Russell and Graham Hatfull of the University of Chung, Ko, Weber, R.J., Patel, M.C., Germane, K.L., Edgar, R.H., Hoyte, N.N., Pittsburgh for sequencing, Virginia Commonwealth University for Bowman, C.A., Tantoco, A.T., Paladin, E.C., Myers, M.S., Smith, A.L., Grace, M.S., sequencing and electron microscopy, Dr. Dave Anderson of SRI Pham, T.T., O′Brien, Matthew B., Vogelsberger, A.M., Hryckowian, A.J., Wynalek, J.L., Donis-Keller, Helen, Bogel, M.W., Peebles, C.L., Cresawn, S.G., Hendrix, R.W., International for mass spectrometry, Howard Hughes Medical 2010. Comparative genomic analysis of 60 mycobacteriophage genomes: Institute Science Education Alliance for initial support of the Viral genome clustering, gene acquisition, and gene size. J. Mol. Biol. 397, 119–143. Discovery program and to James Madison University′s Department Katsura, I., Hendrix, R.W., 1984. Length determination in bacteriophage lambda tails. Cell 39, 691–698. of Integrated Science and Technology and Department of Biology Klumpp, J., Lavigne, R., Loessner, M., Ackermann, H., 2010. The SPO1-related for continuing support of the Viral Discovery program. The authors bacteriophages. Arch. Virol. 155, 1547–1561. also wish to acknowledge phage hunters Brendan Cowcer, Mary Kropinski, A., Hayward, M., Agnew, M.D., Jarrell, K., 2005. The genome of BCJA1c: a Kathryn Dickinson, Drew Loso, Robert Procida, Blake Richardson, bacteriophage active against the alkaliphilic bacterium, Bacillus clarkii. Extre- mophiles 9, 99–109. Madison Shinaberry, Nick Wright, Ali Blackman, Shannon Cham- La Duc, M.T., Kern, R., Venkateswaran, K., 2004. Microbial Monitoring of Spacecraft bers, Lesley Chambers, Karen Corbett, Terri Corbett, Leo Duke, and Associated Environments. Microb. Ecol. 47, 150–158. Lauren Gerson, Jessica Gyourko, Samantha Howell, Catherine Lopez, Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Matthew Mattson, Kent McQueen, Dave Motsek, Dorothy Nugent, Higgins, D.G., 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23, Laurel Owens, and Ashley Parra, all of whom contributed to the 2947–2948. isolation and characterization of the phages detailed in this work. Levin, M., Hendrix, R., Casjens, S., 1993. A programmed translational frameshift is required for the synthesis of a bacteriophage lambda tail assembly protein. J. Molec. Biol. 234, 124–139. Lewis, S.E., Searle, S.M.J., Harris, N., Gibson, M., Iyer, V., Richter, J., Wiel, C., Appendix A. Supporting information Bayraktaroglu, L., Birney, E., Crosby, M.A., Kaminker, J.S., Matthews, B.B., Prochnik, S.E., Smith, C.D., Tupy, J.L., Rubin, G.M., Misra, S., Mungall, C.J., Clamp, M.E., 2002. Apollo: a sequence annotation editor. Genome Biol. 3, 1–14. Supplementary data associated with this article can be found in Logan, N., Halket, G., 2011. In: Logan, N.A., Vos, P. (Eds.), Developments in the the online version at http://dx.doi.org/10.1016/j.virol.2013.07.004. Taxonomy of Aerobic, Endospore-Forming Bacteria. Springer, Berlin Heidelberg, pp. 1–29. Logan, N.A., 2012. Bacillus and relatives in foodborne illness. J. Appl. Microbiol. 112, 417–429. Lukashin, A.V., Borodovsky, M., 1998. GeneMark.hmm: new solutions for gene References finding. Nucleic Acids Res. 26, 1107–1115. Marchler-Bauer, A., Lu, S., Anderson, J.B., Chitsaz, F., Derbyshire, M.K., De Weese- Abshire, T.G., Brown, J.E., Ezzell, J.W., 2005. Production and validation of the use of Scott, C., Fong, J.H., Geer, L.Y., Geer, R.C., Gonzales, N.R., Gwadz, M., Hurwitz, D.I., gamma phage for identification of Bacillus anthracis. J. Clin. Microbiol. 43, Jackson, J.D., Ke, Z., Lanczycki, C.J., Lu, F., Marchler, G.H., Mullokandov, M., 4780–4788. Omelchenko, M.V., Robertson, C.L., Song, J.S., Thanki, N., Yamashita, R.A., Zhang, L. Lorenz et al. / Virology 444 (2013) 374–383 383

D., Zhang, N., Zheng, C., Bryant, S.H., 2011. CDD: a Conserved Domain Database Schattner, P., Brooks, A.N., Lowe, T.M., 2005. The tRNAscan-SE, snoscan and snoGPS for the functional annotation of proteins. Nucleic Acids Res. 39, D225–D229. web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 33, Maughan, H., Van, d.A., 2011. Bacillus taxonomy in the genomic era finds W686–W689. phenotypes to be essential though often misleading. Infect. Genet. Evol. 11, Schneider, C.A., Rasband, W.S., Eliceiri, K.W., 2012. NIH Image to ImageJ: 25 years of 789–797. image analysis. Nat. Methods 9, 671–675. McGuffin, L.J., Bryson, K., Jones, D.T., 2000. The PSIPRED protein structure prediction Shevchenko, A., Tomas, H., Havlis, J., Olsen, J.V., Mann, M., 2006. In-gel digestion for server. Bioinformatics 16, 404–405. mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, Meijer, W.J.J., Horcajadas, J.A., Salas, M., 2001. Φ29 family of phages. Microbiol. Mol. 2856–2860. Biol. Rev. 65, 261–287. Sievers, F., Wilm, A., Dineen, D., Gibson, T.J., Karplus, K., Li, W., Lopez, R., McWilliam, Minakhin, L., Semenova, E., Liu, J., Vasilov, A., Severinova, E., Gabisonia, T., Inman, R., H., Remmert, M., Soding, J., Thompson, J.D., Higgins, D.G., 2011. Fast, scalable Mushegian, A., Severinov, K., 2005. Genome sequence and gene expression of generation of high-quality protein multiple sequence alignments using Clustal Bacillus anthracis bacteriophage Fah. J. Mol. Biol. 354, 1–15. Omega. Mol. Syst. Biol. 7. Morais, M.C., Kanamarul, S., Badasso, M.O., Koti, J.S., Owen, B.A.L., McMurray, C.T., Söding, J., Biegert, A., Lupas, A.N., 2005. The HHpred interactive server for protein Anderson, D.L., Rossmann, M.G., 2003. Bacteriophage Φ29 scaffolding protein homology detection and structure prediction. Nucleic Acids Res. 33, W244–W248. gp7 before and after prohead assembly. Nat. Struct. Biol. 10, 572–576. Stewart, C.R., Casjens, S.R., Cresawn, S.G., Houtz, J.M., Smith, A.L., Ford, M.E., Peebles, Perkins, D.N., Pappin, D.J., Creasy, D.M., Cottrell, J.S., 1999. Probability-based protein C.L., Hatfull, G.F., Hendrix, R.W., Huang, W.M., Pedulla, M.L., 2009. The genome identification by searching sequence databases using mass spectrometry data. of Bacillus subtilis Bacteriophage SPO1. J. Mol. Biol. 388, 48–70. Electrophoresis 20, 3551–3567. Thomas, J.A., Hardies, S.C., Rolando, M., Hayes, S.J., Lieman, K., Carroll, C.A., Perrière, G., Gouy, M., 1996. WWW-query: an on-line retrieval system for biological Weintraub, S.T., Serwer, P., 2007. Complete genomic sequence and mass sequence banks. Biochimie, 364–369. spectrometric analysis of highly diverse, atypical Bacillus thuringiensis phage Pietila, M.K., Laurinmaki, P., Russell, D.A., Ko, C.C., Jacobs-Sera, D., Hendrix, R.W., 0305phi8-36. Virology 368, 405–421. Bamford, D.H., Butcher, S.J., 2013. Structure of the archaeal head-tailed virus HSTV- van Elsas, J.D., Penido, E.G.C., 1981. Isolation and characterization ofBacillus pumilus 1 completes the HK97 fold story. Proc. Natl. Acad. Sci. U.S.A. 110, 10604–10609. phages from Brazilian soil. Abl. Bakt. II. Abt. 136, 581–589. Promega, 2010. Wizard DNA Clean-Up System Technical Bulletin. 2008. Xiang, Y., Leiman, P.G., Li, L., Grimes, S., Anderson, D.L., Rossmann, M.G., 2009. Reanney, D.C., Teh, C.K., 1976. Mapping pathways of possible phage-mediated Crystallographic insights into the autocatalytic assembly mechanism of a genetic interchange among soil . Soil Biol. Biochem. 8, 305–311. bacteriophage tail spike. Mol. Cell 34, 375–386. Roy, A., Kucukural, A., Zhang, Y., 2010. I-TASSER: a unified platform for automated Xu, J., Hendrix, R.W., Duda, R.L., 2004. Conserved translational frameshift in dsDNA protein structure and function prediction. Nat. Protoc. 5, 725–738. bacteriophage tail assembly genes. Mol. Cell 16, 11–21. Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M., Barrell, B., Young, R., 1992. Bacteriophage lysis: mechanism and regulation. Microbiol. Rev. 56, 2000. Artemis: sequence visualization and annotation. Bioinformatics 16, 430–481. 944–945. Young, R., Bläsi, U., 1995. Holins: form and function in bacteriophage lysis. FEMS Salkinoja-Salonen, M.S., Vuorio, R., Andersson, M.A., Kampfer, P., Andersson, M.C., Microbiol. Rev. 17, 191–205. Honkanen-Buzalski, T., Scoging, A.C., 1999. Toxigenic strains of Bacillus licheni- Zhang, Y., 2008. I-TASSER server for protein 3D structure prediction. BMC Bioinfor- formis related to food poisoning. Appl. Environ. Microbiol. 65, 4637–4645. matics 9. (40–2105-9-40).