Hindawi Publishing Corporation Sequencing Volume 2012, Article ID 953609, 5 pages doi:10.1155/2012/953609

Research Article Identification of Prophages and Prophage Remnants within the Genome of Avibacterium paragallinarum Bacterium

Y. Roodt, R. R. Bragg, and J. Albertyn

Department of Microbial, Biochemical and Food Biotechnology, University of the Free State, Bloemfontein 9300, South Africa

Correspondence should be addressed to R. R. Bragg, [email protected]

Received 10 September 2012; Revised 21 November 2012; Accepted 28 November 2012

Academic Editor: Alexei Sorokin

Copyright © 2012 Y. Roodt et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Bacterial whole genome sequencing has delivered an abundance of prophage sequences as a by-product and the analysis of these sequences revealed ways in which phages have affected the genome of their host in various bacterial . The aim of this study was to identify the phage-related sequences in the draft assembly of the Avibacterium paragallinarum genome, the causative agent of infectious coryza in poultry. Whole genome assembly was not possible due to the presence of gaps and/or repeats existent on the ends of contigs. However, genome annotation revealed prophage and prophage remnant sequences present in this genome. From the results obtained, a complete Mu-like bacteriophage could be identified that was termed AvpmuC-2M. A complete sequence of HP2-like bacteriophage, named AvpC-2M-HP2, was also identified.

1. Introduction units can be imported from these sources and the transferred DNA can range from 1 kb to more than 100 kb in size and can Bacteriophages were first discovered in England, 1915, by encode complex surface structures or even entire metabolic Frederick W. Twort and independently thereof in 1917 pathways [4]. It has long been established that bacteriophages by Felix d’Herelle at the Pasteur Institute in Paris [1, 2]. contribute to the pathogenicity of their bacterial hosts as Bacteriophages are viruses which infect bacteria and can many genes have been shown to undergo transfer among be divided into two groups according to their means of bacteria through phages [4]. These genes can code for a interaction with the bacterial cell, namely, lytic (virulent) and diverse subset of virulence factors such as toxin, regulatory temperate (lysogenic) phages. Lytic phages proceed typically factors (which upregulate the expression of host virulence with replication the moment after infecting the host cell, genes), and enzymes, which can alter the bacterial virulence where large numbers of new viruses are released through the components [4]. lysis of the host cell. Temperate phages do not necessarily Bacterial whole genome sequencing has delivered an start replication immediately after infection, and depending abundance of prophage sequences [6]. Prophages can consti- on a number of conditions, these phages may integrate tute as much as between 10 and 20% of a bacterium’s genome their chromosome into the genome of the host cell thus though many of these prophages are cryptic and in a state remaining silent until induced [1]. Once a bacteriophage of mutational decay [7]. Analysis of these sequences revealed genome is integrated into the host cell genome it is referred numerous ways in which prophages shape the genome of to as a prophage. Prophages are the primary suspects in their host bacteria [6]. Prophages have the ability to affect the adaptation event of existing pathogens to new hosts or their host genome architecture by changing the content of the the emergence of new pathogens or epidemic clones [3]. genome, by modifying the organization of the genome, or by Bacteria have no sexual life cycle and the exchange of alleles altering the location and the order of genes [3, 8]. Temperate within a population is fulfilled via horizontal gene transfer, bacteriophages play an intricate role in the generation of and the source of this DNA may be phages, amongst others microbial diversity and in the evolution of bacterial genomes [4]. Unless restricted by the species barrier, entire functional through the mediation of rearrangement within the bacterial 2 Sequencing Phage F-like protein Phage F-like Phage I-family protein gp 48 gp 45 GemA-like protein GemA-like gp 46 gp 46 Gam-like protein Gam-like Tail collar domain collar Tail Major head protein Major Tape measure protein measure Tape Terminase Bacteriophage Mu P-protein Bacteriophage Mu Phage transcription C regulator Portal gp 27 Base plate J-like protein J-like Base plate Lysozyme T3 Lysozyme DNA circulation protein N-terminus protein circulation DNA Tail sheath protein Tail tube protein Tail gp 36 gp 37 gp 38 gp 41 Phage virion morphogenesis protein Transcriptional regulator Transcriptional Transcriptional regulator Transcriptional Transposase Transposase

2 847 bp 2 449 bp 27 107 bp Figure 1: Putative genome representation of AvpmuC-2M. The three contigs that were used in the construction of AvpmuC-2M are separated by //. Contig sizes are indicated beneath each contig in base pairs. chromosome and as a result they contribute significantly 2.2. DNA Extraction. Genomic DNA extraction was perfor- towards interstrain differences within the same bacterial med with the use of a QIAamp DNA Mini kit from Qiagen. species [3, 7, 9]. By comparison, two Streptococcus pyo- The manufacturer’s recommendations were meticulously genes strains belong to different M serotypes and they followed except for the elution procedure where genomic are unconnectedly associated with different diseases—but DNA was eluted in 2 × 80 μL 10 mM Tris-HCl, pH 8. when these strains were compared on DNA level the key ff di erences correlated to prophage sequences [3]. Another 2.3. Identification of Av. paragallinarum. Av. paragallinarum example can be observed within Campylobacter jejuni,where was identified by using a PCR protocol developed specifically ff interstrain di erences may be attributed towards intra- for this bacterium [17]. Additionally, the 16S ribosomal genomic inversions of Mu-like prophage DNA sequences DNA (rDNA) region was amplified in order to identify Av. [10]. Comparisons between colinear genomes showed that paragallinarum [18]. The PCR product obtained was excised gaps between the relevant genomes were attributed to the from the agarose gel, purified, and directly sequenced to presence of prophage sequences or as a result of rearranged avoid any possible foreign genomic DNA contamination. bacterial genomes [11]. In many cases prophages with lim- ited DNA sequence identity or duplicated prophages serve 2.4. Genome Assembly and Annotation. Genomic DNA sam- as anchoring points for homologous recombination [3]. In ples of Av. paragallinarum strain C-2 (Modesto) were sent to addition prophages can recombine with other prophages Agowa Genomics (now known as LGC Genomics), Germany, contributing to their mosaic structure [11]. for GS-FLX Titanium pyrosequencing. Sequences/reads Avibacterium paragallinarum is a pathogen targeting obtained from the pyrosequencing were assembled using chickens and is causing the disease infectious coryza [12, 13]. Newbler 2.0.01.14 from 454 Life Sciences Corporation with Phylogenetically this bacterium belongs to the an overlap minimum match identity of 90% and an overlap family and was formerly known as Haemophilus paragal- minimum match length of 40 bases. The sequences of linarum renamed in 2005 [14]. The family Pasteurellaceae obtained contigs were submitted to the J. Craig Venter is made up of closely related bacteria [14] and within this Institute (JCVI) as a pseudomolecule for annotation. The family, various bacteriophages have been reported in pipeline includes gene finding with Glimmer [19, 20], Haemophilus influenzae, Actinobacillus actinomycetemcomi- BLAST-Extend-Repraze (BER) searches [21, 22], Hidden tans, Pasteurella multocida, and Mannheimia haemolytica. Markov Models (HMM) searches [23], Trans Membrane Bacteriophage-like sequences have been reported in Histo- Hidden Markov Models (TMHMM) searches [24], SignalP philus somni [2, 15]. In this paper, we describe prophage and predictions [25], and automatic annotations from AutoAn- prophage remnants detected in Av. paragallinarum. notate. All of this information is stored in a MySQL database 2. Materials and Methods and associated files which was downloaded to our site. The manual annotation tool Manatee was downloaded from 2.1. Bacterial Strains and Growth Conditions. Av. paragallina- SourceForge (http://manatee.sourceforge.net/)andusedto rum strain C-2 (Modesto) was obtained from Onderstepoort manually review the output from the pipeline. pDraw32 Biological Products, Onderstepoort, South Africa. Strain from ACACLONE software was used to draw genetic maps Modesto is NAD+ dependent and was cultivated in TM/SN for prophage region. supplemented with 1% (v/v) chicken serum, 1% (v/v) NAD+, and 0.0005% (m/v) thiamine solution and cultivated under 2.5. Accession Numbers. AvpmuC-2M (accession no.: oxygen limiting conditions in a candle jar at 37◦Cfor16h JN627905); HP-2 like prophages contig A (JN627906), B [16]. (JN627907), and C (JN627908); lamboid prophage Sequencing 3

Phage Regulation Replication Recombination Structural lysis ninB Prophage antirepressor Prophage Recombination protein ninG protein Recombination Replication protein O protein Replication protein Recombination protein Recombination ninG protein Recombination Tape measure protein measure Tape Endolysin Replication protein O protein Replication P protein Replication protein fiber repeat Tail I assembly protein Tail protein component Tail Antitermination protein Q protein Antitermination Regulatory protein Rha family Regulatory protein Replication protein P protein Replication Tail protein Tail Recombination protein Bet protein Recombination Phage holin Addiction module killer protein Addiction Addiction module antidote protein module antidote Addiction P protein Replication Prophage antirepressor Prophage Antitermination protein Q protein Antitermination Minor tail protein L tail protein Minor Repressor CI Repressor Crossover junction Crossover Transcriptional regulator Transcriptional KilA-N domain KilA-N

814 bp 4 175 bp 4 875 bp 1 401 bp 647 bp 537 bp 2 147 bp 8 890 bp 1 595 bp 6 507 bp 914 bp Figure 2: Lamboid genetic representation. Various lambda genes have been mapped to the contigs from which they were identified and grouped according to associated functions [5]. Beginnings and ends of the contigs are indicated by //. Baseplate-like protein Baseplate-like Baseplate assembly protein V assembly protein Baseplate Phage late control protein D protein control Phage late Phage lysozyme GpUregion X Phage tail protein Phage tail-like protein Phage tail-like Tail fiber repeat protein fiber repeat Tail 5 806 bp (a) Baseplate assembly protein V assembly protein Baseplate Baseplate-ike protein Baseplate-ike Phage lysozyme Tape measure protein measure Tape Phage tail protein X Phage tail protein Tail fiber repeat protein fiber repeat Tail Phage late control protein D protein control Phage late GpU region 8 158 bp (b) old protein GpO old protein ff Orf 28 Orf 34 Orf 33 Small terminase subunit terminase Small Orf 30 Orf 31 tail fibers Tail tape measure protein tape measure Tail Tail sheath protein Tail Capsid sca Orf 35 Plasmid stabilization Portal protein Portal protein Phage head completion tube protein Tail Replication protein A protein Replication Integrase Integrase Phage N adenylyltransferase Phage lysozyme Antirepressor protein Antirepressor Major capsid protein Major Virion morphogenesis protein Virion Baseplate-like protein Baseplate-like Cl repressor protein Cl repressor 50 012 bp

(c)

Figure 3: HP2-like prophage genetic representation. The various contigs (a), (b), and (c) carrying HP2-like open reading frames are illus- trated within this diagram. A complete HP2-like prophage AvpC-2M-HP2 was identified within the genome of Av. paragallinarum Modesto represented by (c) in this diagram. 4 Sequencing fragments (accession nos.: JN627909-JN627919); prophage Thus the mapping of all of the prophage genes uncovered integrase genes (accession nos.: JN627920-JN627928). allowed us to suggest two prophages in Av. paragallinarum. One prophage, AvpmyC-2M, resembles a Mu-like prophage, 3. Results and Discussion whereas the other prophage, AvpC-2M-HP2, resembles the HP2 prophage of H. influenzae. The whole genome sequencing project of Av. paragallinarum yielded a total number of 160 743 reads/sequences through pyrosequencing chemistry. The total number of contigs Conflict of Interests assembled from these sequences was 300 with the largest con- The authors declare that no conflict of interests exists with tig comprising 78 465 bp and the average contig size longer any of the commercial identities mentioned within the paper. than 500 bp were 10 727 bp. A closed genome could not be assembled for Av. paragallinarum from the sequencing results obtained due to repeat sequences that existed on the end(s) Acknowledgment of contigs or as a result of sequence gaps between contigs. A pseudogenome molecule was assembled from the 300 contigs The authors would like to thank JCVI for providing the JCVI obtained and was sent to the JCVI for genome annotation. Annotation Service which provided them with automatic Annotation results indicated that a total of 7% or 141 open annotation data and the manual annotation tool Manatee. reading frames were assigned to phage protein functions. The 141 open reading frames were identified on 60 of the References 300 contigs where either one or a series of prophage genes were mapped. Nine prophage integrase genes were identified [1] W. O. K. Grabow, “Bacteriophages: update on application as and mapped to seven of the 60 contigs. Investigations into models for viruses in water,” Water SA, vol. 27, no. 2, pp. 251– open reading frames adjacent to the identified integrases 268, 2001. revealed no additional prophage-related genes, but rather [2] D. Nelson, “Phage : we agree to disagree,” Journal of genes encoding membrane proteins; metabolic proteins; Bacteriology, vol. 186, no. 21, pp. 7029–7031, 2004. ribosomal proteins; cell membrane function proteins, or [3] H. Brussow,¨ C. Canchaya, and W. D. Hardt, “Phages and the membrane export proteins. No significant homology was evolution of bacterial pathogens: from genomic rearrange- observed between the integrases. ments to lysogenic conversion,” Microbiology and Molecular Mu-like prophage genes were identified and mapped to Biology Reviews, vol. 68, no. 3, pp. 560–602, 2004. 11 different contigs. The 11 contigs were compared to the [4] P. L. Wagner and M. K. Waldor, “Bacteriophage control of genome map of Mu-like phages reported by [26]. Here we bacterial virulence,” Infection and Immunity,vol.70,no.8,pp. suggest a putative Mu-like prophage for Av. paragallinarum, 3985–3993, 2002. AvpmuC-2M (Figure 1). A large portion of the Mu gene [5] G. Resch, E. M. Kulik, F. S. Dietrich, and J. Meyer, “Complete (25) was identified on a single contig comprising 27, 107 bp. genomic nucleotide sequence of the temperate bacteriophage Two additional contigs were identified completing the map AaΦ23 of Actinobacillus actinomycetemcomitans,” Journal of of a Mu-like prophage for Av. paragallinarum. Bacteriology, vol. 186, no. 16, pp. 5523–5528, 2004. Forty-one lamboid genes were mapped to 11 different [6] H. Brussow¨ and R. W. Hendrix, “Phage genomics: small is bea- contigs (Figure 2). With the data available from the whole utiful,” Cell, vol. 108, no. 1, pp. 13–16, 2002. genome sequencing project it could not be concluded if the [7] K. V. Srividhya, V. Alaguraj, G. Poornima et al., “Identification lamboid prophage is complete or if it has undergone massive of prophages in bacterial genomes by dinucleotide relative loss of functional DNA resulting in bacteriophage genome abundance difference,” PLoS ONE, vol. 2, no. 11, Article ID fragmentation and ultimately leading to its disappearance e1193, 2007. [3]. [8] Y. Tan, K. Zhang, X. Rao et al., “Whole genome sequencing A complete lysogen of a temperate phage was identified of a novel temperate bacteriophage of P.aeruginosa: evidence on a single contig within the genome. The succession of of tRNA gene mediating integration of the phage genome into the host bacterial chromosome,” Cellular Microbiology, vol. 9, the genes identified is similar to that of H. influenzae HP2 no. 2, pp. 479–491, 2007. prophage. A nucleotide-to-nucleotide comparison revealed a weak similarity between the H. influenzae prophage HP2 and [9] C. Balding, S. A. Bromley, R. W. Pickup, and J. R. Saunders, “Diversity of phage integrases in Enterobacteriaceae: devel- the identified prophage. The identified prophage is therefore opment of markers for environmental analysis of temperate unique to Av. paragallinarum. We suggest calling it AvpC- phages,” Environmental Microbiology, vol. 7, no. 10, pp. 1558– 2M-HP2. 1567, 2005. Two additional contigs also contained HP2-like genes [10]A.E.Scott,A.R.Timms,P.L.Connerton,C.LocCarrillo,K. (Figure 3). Further investigations of similarity between the Adzfa Radzum, and I. F. Connerton, “Genome dynamics of identified contigs (A, B, and C of AcpC-2M-HP2) indicated Campylobacter jejuni in response to bacteriophage predation,” that the three contigs share the same genomic organization PLoS Pathogens, vol. 3, no. 8, p. e119, 2007. but they are not identical. [11] L. S. Frost, R. Leplae, A. O. Summers, and A. Toussaint, An abundance of other prophage genes have also been “Mobile genetic elements: the agents of open source evolu- identified but could not be assigned to any specific phage or tion,” Nature Reviews Microbiology, vol. 3, no. 9, pp. 722–732, phage family. 2005. Sequencing 5

[12] K. Kume, A. Sawata, and Y. Nakase, “Haemophilus infections in chickens. I. Characterization of Haemophilus paragallinar- um isolated from chickens affected with coryza,” The Japanese Journal of Veterinary Science, vol. 40, no. 1, pp. 65–73, 1978. [13] P.J. Zhang, M. Miao, H. Sun, Y. Gong, and P.J. Blackall, “Infec- tious coryza due to Haemophilus paragallinarum serovar B in China,” Australian Veterinary Journal, vol. 81, no. 1-2, pp. 96– 97, 2003. [14] P. J. Blackall, H. Christensen, T. Beckenham, L. L. Blackall, and M. Bisgaard, “Reclassification of Pasteurella gallinarum, [Haemophilus] paragallinarum, Pasteurella avium and Pas- teurella volantium as Avibacterium gallinarum gen. nov., comb. nov., Avibacterium paragallinarum comb. nov., Avibacterium avium comb. nov. and Avibacterium volantium comb. nov,” International Journal of Systematic and Evolutionary Microbi- ology, vol. 55, no. 1, pp. 353–362, 2005. [15] S. K. Highlander, S. Weissenberger, L. E. Alvarez, G. M. Wein- stock, and P. B. Berget, “Complete nucleotide sequence of a P2 family lysogenic bacteriophage, ϕsymbolMhaA1-PHL101, from Mannheimia haemolytica serotype A1,” Virology, vol. 350, no. 1, pp. 79–89, 2006. [16] P. J. Blackall and R. Yamamoto, “Infectious coryza,” in A Laboratory Manual for the Isolation and Identification of Avian Pathogens, H. G. Parchase, L. H. Arp, C. H. Domermuth, and J. E. Pearson, Eds., pp. 27–31, American Association of Avian Pathologists, Ames, Iowa, USA, 3rd edition, 1990. [17] X. Chen, J. K. Miflin, P. Zhang, and P. J. Blackall, “Develop- ment and application of DNA probes and PCR tests for Hae- mophilus paragallinarum,” Avian Diseases,vol.40,no.2,pp. 398–407, 1996. [18] A. Mendoza-Espinoza, Y. Koga, and A. I. Zavaleta, “Amplified 16S ribosomal DNA restriction analysis for identification of Avibacterium paragallinarum,” Avian Diseases, vol. 52, no. 1, pp. 54–58, 2008. [19] S. L. Salzberg, A. L. Deicher, S. Kasif, and O. White, “Micro- bial gene identification using interpolated Markov models,” Nucleic Acids Research, vol. 26, no. 2, pp. 544–548, 1998. [20] A. L. Delcher, D. Harmon, S. Kasif, O. White, and S. L. Salzberg, “Improved microbial gene identification with GLIMMER,” Nucleic Acids Research, vol. 27, no. 23, pp. 4636– 4641, 1999. [21] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lip- man, “Basic local alignment search tool,” JournalofMolecular Biology, vol. 215, no. 3, pp. 403–410, 1990. [22] T. F. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, vol. 147, no. 1, pp. 195–197, 1981. [23] S. R. Eddy, “Profile hidden Markov models,” Bioinformatics, vol. 14, no. 9, pp. 755–763, 1998. [24] A. Krogh, B. Larsson, G. Von Heijne, and E. L. L. Sonnham- mer, “Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes,” Journal of Molecular Biology, vol. 305, no. 3, pp. 567–580, 2001. [25] J. D. Bendtsen, H. Nielsen, G. Von Heijne, and S. Brunak, “Improved prediction of signal peptides: signalP 3.0,” Journal of Molecular Biology, vol. 340, no. 4, pp. 783–795, 2004. [26]G.J.Morgan,G.F.Hatfull,S.Casjens,andR.W.Hendrix, “Bacteriophage Mu genome sequence: analysis and compar- ison with Mu-like prophages in Haemophilus, Neisseria and Deinococcus,” Journal of Molecular Biology, vol. 317, no. 3, pp. 337–359, 2002. Journal of Nucleic Acids

International Journal of The Scientific Biochemistry Peptides World Journal Research International Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation ArchaeaHindawi Publishing Corporation http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013

Advances in Journal of Marine Biology Hindawi Publishing Corporation Hindawi Publishing Corporation Virologyhttp://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013

Submit your manuscripts at http://www.hindawi.com

Journal of Advances in Signal Transduction Bioinformatics Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013

BioMed Research International Journal of International Genomics

Enzyme Research International Journal of Stem Cells Evolutionary Biology International

Hindawi Publishing Corporation Hindawi Publishing Corporation Volume 2013

http://www.hindawi.com Volume 2013 Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013

ISRN ISRN ISRN ISRN ISRN Cell Biology Molecular Biology Zoology Biotechnology Microbiology Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation Hindawi Publishing Corporation http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013 http://www.hindawi.com Volume 2013