Research Collection

Doctoral Thesis

Genetic diversity of Streptococcus thermophilus phages and development of phage-resistant starters for the dairy industry

Author(s): Lucchini, Sacha

Publication Date: 1999

Permanent Link: https://doi.org/10.3929/ethz-a-003839329

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.

ETH Library Diss ETH Nr 13335

Genetic Diversity of Streptococcus thermophilics Phages and Development of Phage-Resistant Starters for the Dairy Industry

A dissettation submitted to the

SWISS FEDERAL INSTITUTE OF TECHNOLOGY ZURICH (ETHZ)

for the degree of

Doctor of Natural Sciences

Piesented by

SACHA LUCCHINI Dipl Biol ETH Bom March 13 1970 Citizen of Frasco (Tl)

Accepted on the recommendation of Prof Dr Michael Teuber, examiner Dr Harald Brussow co-examiner

Zurich 1999 Acknowledgments

This work was carried out in the Department of Bioscience, at the Nestlé

Research Centre in Vers-chez-les-Blanc. I thank Dr. Andrea Pfeiffer and Jean-

Richard Neeser for providing me with excellent working facilities. I would like to extend my thanks to Prof. Michael Teuber for supporting the project and having accepted me as a Ph.D. student. This study was supported by grants from the Swiss

National Science Foundation in the framework of the SPP Biotechnology module

(grant 5002-044545/1).

My first acknowledgment goes to my supervisor Dr. Harald Brüssow for having guided me throughout my thesis. I would also like to thank Frank Desiere for introducing me to the secret world of Nestlé research; Josette Sidoti, Fanny Scalfo and Deborah Moine for their technical support; Sophie Foley, Marie-Camille Zwahlen and Michelle Delley for useful discussions; and Anne Bruttin for technical advice and not allowing me to transform the lab into something else.

Last but not least, I am grateful to my friends in Lausanne for the good time I had with them. 1

Contents

1 Introduction 1

1.1 2

1.1.1 Classification of bacteriophages 2

1.1.2 Evolution of 3

1.1.3 The life cycle of viruses 4

1.1.3.1 T4 5

1.1.3.2 Bacteriophage lambda 6

1.1.4 Bacteriophages of lactic acid bacteria 8

1.2 Phage resistance mechanisms in LAB 10

1.2.1 Natural resistance mechanisms 10

1.2.1.1 Phage adsorption inhibition 10

1.2.1.2 Block of phage DNA injection 11

1.2.1.3 Abortive infection mechanisms 11

1.2.1.4 Restriction and modification 12

1.2.2 Novel phage resistance mechanisms 12

1.3 Outline of the thesis 13

1.3.1 Phage diversity 13

1.3.2 Phage resistant starter strains 14

2 Results 21

2.1 The genetic relationship between virulent and temperate 21 Streptococcus thermophilus bacteriophages: whole genome comparison of cos-site phages Sfi19 and Sfi21

2.2 The structural module in Streptococcus 35 thermophilus bacteriophage cx>Sfi11 shows a hierarchy of relatedness to Siphoviridae from a wide range of bacterial hosts

2.3 Comparative genomics of Streptococcus thermophilus 47 phage species supports a modular evolution theory

2.4 Similarly organized lysogeny modules in temperate 71 Siphoviridae from low GC content Gram-positive bacteria II

2.5 A short noncoding viral DNA element showing 91 characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus

2.6 Construction of food-grade Streptococcus thermophilus 103 starters with potent, broad range phage resistance phenotypes

3 Summary and conclusions 117 in

Abstract

To address the question of the diversity of Streptococcus thermophilus phages we sequenced three lytic phages The phage genomes underwent pairwise alignments by including two temperate and one further virulent S thermophilus phages to localize areas of difference Pairwise and multiple alignments of these sequences supports the theory of modular evolution of bacteriophages are clustered into functional units (modules), all S thermophilus phages have a similar genomic organization (DNA-packaging followed by head morphogenesis, tail morphogenesis host lysis lysogeny module, DNA replication and finally a region possibly involved in phage development control) Individual S thermophilus phages appear to result from recombination of exchangeable genetic elements drawn from a common gene pool Recombination is apparently the basis of differences in life style (lytic temperate), host range, DNA packaging and structural protein pattern The units of genetic exchange are either entire functional modules (several adjacent structural genes), part of a module (individual genes in the lysogeny module) or even parts of a gene corresponding to likely protein domains (host range determinant of the likely tail fibers, DNA binding domain of regulatory proteins) A detailed comparison of the DNA sequences from phages Sfi19 and Sfi18 indicates that point mutations may also play an important role in phage adaptation Very few base pair changes seem to be under the escape of phage Sfi19 from the inhibition mediated by the Sfi21 encoded origin of replication We examined the genetic relationship between S thermophilus phages and other phages focusing on modules that are under different evolutionary constraints, the structural gene cluster and the lysogeny module The structural gene cluster from lytic group II S thermophilus bacteriophage Sfi11 was compared to the corresponding region from other Siphoviridae The analysis revealed a hierarchy of relatedness and a common genomic organization We hypothesize that the morphogenesis module of B1 Siphoviridae shares a common ancestry Conservation of the genetic organization was also found when comparing the lysogeny modules of temperate Siphoviridae from the low GC content Gram positive bacteria The gene organization of the lysogeny module of this group of phages distinguishes it from the Siphoviridae infecting Gram-negative bacteria (e g coliphages X, 933W and P22) or high GC-content Gram-positive bacteria (mycobactenophage L5) In contrast, similarly organized lysogeny modules were found in the temperate P2~like Myovindae

To obtain S thermophilus strains with increased phage resistance, we tested a range of molecular approaches First it was shown that the phage Sfi21 replication origin when cloned in a high copy number plasmid provides protection against a high number of phages Few point mutations in this region (< 15-nt differences over a stretch of 302 nt) seem sufficient to allow phages to escape the inhibition However, substitution of the Sfi21 on with that of insensitive phages was shown to extend the range of inhibitory activity of the origin The second approach relied on the protection provided by the immunity functions of the Sfi21 prophage A targeted deletion in an essential tail gene of the prophage resulted in a lysogen, which proved to be unable to release infectious phage particles but maintained intact superinfection control Finally we demonstrated that the random inactivation of host genes followed by a selection for phage-resistant phages could lead to extremely phage-resistant strains IV

Résumé

Pour examiner la diversité des phages de Streptococcus thermophilus, trois phages lytiques ont été séquences. Afin de localiser des régions de différenciation, les séquences ainsi obtenues ont été comparées en y encluant deux phages tempérés et un autre phage lytique. L'alignement de ces séquences supporte la théorie d'une évolution modulaire des bacteriophages: les gènes sont regroupés en unités fonctionnelles (modules), tous les phages de S. thermophilus ont une organisation génomique similaire (l'ordre des modules est le suivant: gènes pour l'insertion du DNA phagique dans la tête, morphogenèse de la tête, morphogenèse de la queue, lyse de la cellule hôte, module de lysogénie, replication de l'ADN et une région probablement impliquée dans le contrôle du développement du phage). Pour certains modules plusieurs alleles existent. Les phages de S. thermophilus semblent donc être le résultât de la recombinaison d'éléments génétiques interchangeables puisés dans un pool génétique commun. Les recombinaisons sont apparemment à la base des différences dans le style de vie (virulent, tempéré), spectre d'hôte, mécanisme d'insertion du DNA dans la tête, protéines de structure. Les unités d'échange génétique peuvent être un module fonctionnel complet (gènes de structure), une partie d'un module (échange de gènes isolés dans le module de lysogénie), ou même une partie d'un gène (déterminants du spectre d'hôte des probables fibres de la queue du phage). Une comparaison détaillée des séquences des phages SU 19 et Sfi18 indique que les mutations ponctuelles pourraient aussi avoir un rôle important dans l'adaptation des phages. Un nombre très limité de changements dans la séquence d'ADN de Sfi19 semble être à la base de l'insensibilité de Sfi19 à l'inhibition médiée par l'origine de replication du phage Sfi21. Nous avons examiné les relations génétiques entre les phages de S. thermophilus et les phage infectants d'autres cellules hôtes. Nous nous sommes concentrés sur deux modules fonctionnels qui subissent différentes contraintes évolutionnaires: le module structural et le module de lysogénie. En premier, nous avons comparé le module structural du phage Sfi11 de S. thermophilus (groupe lytique II) avec la région correspondante d'autres Siphoviridae. L'analyse a révélé un gradient de similitudes et une organisation génomique similaire. Nous avons donc formulé l'hypothèse que le module structural des Siphoviridae appartenant au groupe B1 a une origine commune. Une conservation de l'organisation génomique a aussi été détectée pour les modules de lysogénie des Siphoviridae tempérés infectant des bactéries Gram+ positives à contenus en GC bas. L'organisation des gènes du module de lysogénie de ce groupe de phages les distingue des Siphoviridae isolés dans bactéries Grarrf (coliphages X, 933W et P22). Par contre, des modules de lysogénie similaires ont été trouvés chez les Myovidirae du groupe P2.

Pour obtenir des souches de S. thermophilus possédant une plus grande résistance aux phages, nous avons testé plusieurs approches moléculaires. Premièrement, il a été montré que l'origine de replication du phage Sfi21 protège contre un grand nombre de phage quand elle est clonée dans un plasmide à grand nombre de copies. Cependant, quelques mutations ponctuelles dans cette région ( < 15 mutations dans une région de 302 nucleotides) semblent être suffisantes pour permettre aux phages d'échapper à l'inhibition. La subsitution de l'origine de replication du phage Sfi21 avec la région correspondante de phages non soumis à l'inhibition permet d'étendre le spectre d'action de l'origine de replication. La seconde approche a été basée sur l'effet protectif du phage Sfi21 intégré dans le chromosome d'une cellule lysogène. La deletion ciblée d'un gène du prophage essentiel pour la morphogénése de la queue a produit une souche lysogène incapable de produire des phages infectieux. Mais la protection médiée par le prophage n'a pas été modifiée. Enfin, nous avons pu démontrer que l'inactivation aléatoire de certains gènes de la cellule hôte, suivie par la sélection de mutants résistants aux phages, a amené à l'isolement de souches extrêmement résistantes aux phages. 1

1 Introduction

Lactic acid bacteria (LAB) have been used for millennia for the fermentation of various food products (cereals, vegetables, meat and milk) The low pH generated by their metabolism and the action of other fermentation products inhibits the growth of many food spoilage bacteria In addition to providing an effective preservation, lactic acid bacteria determine the flavor and texture of the products If the growth of the starter is inhibited, fermentation is delayed or even abolished resulting in the complete loss of the product (64)

Bacteriophage infection is the main cause of starter failure during the industrial production of cheese and yogurt (83) This problem is increased in modern manufacturing where a limited number of defined starter cultures are used to transform large volumes of milk Industrial fermentation has opened new possibilities for phages to flourish The most susceptible target starter species are and Streptococcus thermophilus The dairy industry has used various practical methods to control the phage problem, regular disinfecting of equipment, use of closed vats, strain rotation and multiple phage-resistant strain starters (3,75)

In L lactis many natural anti-phage mechanisms have been found, mainly on conjugative piasmids (3) These plasmids can be transferred into phage-sensitive commercial strains to increase their resistance to phages (84) Recently, genetic engineering is used to design novel phage resistance mechanisms

In contrast to L lactis, very few natural phage resistance mechanisms were identified in S thermophilus (55) Plasmid-contaming strains are rare (65) and few have phage inhibitory effects (73 78) Spontaneous bactenophage-insensitive mutants (BIM) have provided a source of replacement strains fori lactis However, such mutants are usually slow acid producers and can also revert to phage sensitivity

(51) Moreover, attempts to obtain S thermophilus BIM were unsuccessful (15)

Thus, the design of phage resistant S thermophilus starter cultures has to rely mostly on genetic engineering As many of the novel phage resistance mechanisms are based on the knowledge of the phage biology, a better understanding of bacteriophage genetics is needed In addition, bacteriophages show an extremely high diversity Knowledge of the phage population involved should therefore assist the design of efficient anti-phage systems o

1.1 Bacteriophages

Bacterial viruses are small nucleoprotein complexes, where nucleic acid is enclosed in a protein coat. In some cases a lipid envelope is present. The genetic

material can be either DNA or RNA, single- or double- stranded. As they have no

metabolic capabilities, phages need to parasitize their cellular host for reproduction.

In principle, all cellular organisms are susceptible to infection.

Bacteriophages are probably the most abundant biological entities found in the environment. They are present in virtually all biotopes where bacteria are found.

Researchers have reported phage counts 5-25 times higher than bacterial counts in marine water (35). More than 4500 different phages have been found in about 130 bacterial genera until now, both from Eubacteria and Archaea (1).

1.1.1 Classification of bacteriophages

Morphology and nucleic acid type are the main criteria employed to classify observed phages. Based on these characteristics phages have been classified in 13 families (Table 1).

Morphotype Shape Nucleic acid Family Particulars

A1 to A3 Tailed dsDNA, linear Contractile tail

B1 to B3 Siphoviridae Long, noncontractile tail

C1 to C3 Podovmdae Short tail

D1 Polyhedral ssDNA, circular Miciovindae Conspicuous capsomers

D3 ssDNA, circular, S Coriticovindae Complex capsid, lipids

D4 dsDNA, linear Tectiviridae Double capsid, pseudotail, lipids

E1 ssRNA, linear Levivindae

E2 dsRNA, linear, seg Cystovmdae Envelope, lipids

F1 Filamentous ssDNA, circular Inovmdae a. Long filaments

F2 b Short rods

F3 dsDNA, linear Lipothnxvindae Envelope, lipids

G1 Pleomorphic dsDNA, circular, S Plasmavindae Envelope, lipids, no capsid G2 dsDNA, circular, S Fusellovindae Same, lemon-shaped

Table 1

Tailed phages account for about 96% of all observed phages. They are differentiated into 3 families, Myoviridae, Siphoviridae and (Table 1).

Each family is further differentiated by head morphology; small isometric, prolate (elongated) or large prolate (morphotypes 1-3, respectively). Tailed phages have

been found in all major prokaryotic groups. This ubiquity led to the hypothesis that tailed phages preceded the Eubacteria and Archaea differentiation. Apart from

having a tail, these phages have other common characteristics; they all have linear dsDNA genomes and similarly organized head morphogenesis gene clusters. On that basis, Ackermann (2) proposed to group the families Myoviridae, Siphoviridae and

Podoviridae in the order .

1.1.2 Evolution of viruses

Morphology and nucleic acid type are useful criteria for a rapid and easy classification of viruses, but care has to be taken when trying to attribute phylogenetic meaning to this subdivision. In fact, the comparison of bacteriophages by either heteroduplex analysis or sequencing has shown a mosaic structure in the pattern of similarities. When two phage strains from the same host species are compared, regions of high similarity are interspersed with regions of low or no sequence similarity. This is most likely the result of the exchange of genes or part of genes between phages. The comparison of the lambdoid phages led to the formulation of a theory of modular phage evolution (10): the product of evolution is not a given virus but a family (group) of interchangeable genetic elements (modules) each of which carries out a particular biological function. Evolution is perceived to act at the level of modules by selecting for a good execution of function and for optimal recombination events of functional genetic units. If the gene order is the same in all members of an interbreeding population, homologous or non-homologous recombination events could lead to increased diversity without giving rise to individuals that lack any essential function. Actually, phages seem to be very recombinogenic. Isolates from nature that are indistinguishable from the well-studied type isolates are rarely found (21). Viruses sharing the same gene pool can then differ widely in many characteristics, including morphology and host range, since these are aspects of individual modules. Good examples are bacteriophage X and

P22. These phages are morphologically distinct, X is a siphovirus P22 a podovirus.

Nevertheless, they have very similar genetic maps and can form many different viable hybrids (11;39).

Among bacteriophages four types of genetic exchange seem to occur (21): (1) homologous recombination between regions of very high sequence similarity, (2) recombination between regions of micro-homology such as promoters or terminators; 4

(3) illegitimate recombination; (4) site specific recombination such as the tail fiber

gene inversion system of coliphages Mu and P1.

1.1.3 The life cycle of viruses

The first step in a viral infection is the attachment of the virus particle to the host.

Each strain of bacteria has characteristic proteins, carbohydrates and lipopolysaccharides on its surface. Each of these molecules can act as a receptor for specific phages. Because of the specificity of the interactions, the diversity of the cell surface is one important factor in the limitation of the host range of bacteriophages.

After the attachment, the phage genome enters the cell. At this stage there are different strategies used by viruses to exploit the host for their own proliferation. (1) In lytic infection the cell is reprogrammed by the phage to produce the maximal amount of progeny virions, which are then released by the lysis of the cell. (2) In chronic infection progeny virions are released by extrusion without killing the cell. (3) In temperate infection a bacteriophage maintains a stable genetic relationship with the host. The nucleic acid of the phage becomes part of the genome of the cell, either as a plasmid or integrated in the host chromosome (prophage), and can be reproduced along with the host for many generations The prophage can be induced by events such as DNA damage or other physiological stress. This activation triggers then a lytic replication cycle.

Two well-characterized phages will be used as examples to illustrate the most representative phage groups' T4 as the prototype lytic phage and X as the best known temperate phage (7) *>

1.1.3.1 Bacteriophage T4

The best-characterized example of a lytic phage is the E.

coli bacteriophage T4 Morphologically it belongs to the

Myoviridae family It has an elongated head (78x111 nm),

which contains its genome (166 kb, dsDNA) and a long

contractile tail (113x20 nm) with six tail fibers attached to the

baseplate The study of the T4 life cycle has revealed the

strategies used to exploit the bacterial host.

The tail makes the first contact with the host, first

reversibly then irreversibly. The adsorption specificity of T4 is largely determined by gene 37, which encodes the 1026 aa residue distal tail fiber

The ~140 C-terminal residues of this protein contains determinants that can recognize either E coli B type lipopolysaccharides or OmpC protein

After the tail fibers and the tail tip have bound to the cell surface, the tail sheath contracts, the tail tube crosses the plasma membrane and the DNA is injected into the cytoplasm. Soon after DNA injection, the E coli RNA polymerase start the synthesis of phage mRNA This mRNA and all other early mRNAs (mRNA transcribed before phage DNA is made) encode the enzymes required to acquire the control over the cell Synthesis of host DNA, RNA and proteins is stopped, and the cell is forced to produce viral constituents Some early viral enzymes degrade host

DNA to nucleotides This halts gene expression and provides raw material for virus

DNA synthesis. Other early proteins and a few phage head proteins injected along with phage DNA modify the bacterial RNA polymerase. The polymerase then recognizes promoters on viral DNA rather than bacterial promoters Within 5 minutes, virus DNA synthesis starts The DNA of T4 has a unique feature, all of its cytosine residues are replaced by 5-hydroxymethylcytosine (HMC) Two viral proteins are employed in its synthesis This base does not normally occur inE coli HMC has the same hydrogen-bonding characteristics as normal cytosine, but it offers a reactive site where to attach one molecule of glucose This difference has important biological functions, firstly the glucose moieties prevent restriction of phage DNA by host cell endonucieases, secondly it blocks the synthesis of cellular DNA because HMC can be used only by the phage polymerase and not the cellular DNA polymerase

After DNA replication, late mRNA is produced It directs the synthesis of two kinds of proteins, (1) phage structural proteins and proteins helping phage assembly, (2) proteins involved in cell lysis and phage release All the proteins requested for phage assembly are synthesized simultaneously and used in independent assembly lines 6

The tail, tail fibers and heads are made in parallel, then assembled At the end of phage production the cell is lysed One gene directs the synthesis of a lysozyme that attacks the cell wall peptidoglycan Another phage protein damages the bacterial plasma membrane allowing the lysozyme to reach the cell wall At 37°C the lysis takes place 22 minutes after infection About 300 infectious particles are released in the medium

1.1.3.2 Bacteriophage Lambda

•«* The E coli bacteriophage 1 is a member of the

Siphoviridae family It has a small icosaedral head (60 nm) i | and a long non-contractile tail (150x17 nm) Its genome

consists in a single linear dsDNA molecule of 48 kb with

cohesive ends Bacteriophage X can start a normal lytic

infection with subsequent host lysis and release of phage

particles, or it can silence the expression of its lytic genes and 100 nm *~~ integrate the genome of its host as a prophage Once integrated, the prophage will be propagated along with the cells The first steps of the lytic and temperate pathways are identical First, the virus attaches reversibly to the host through the tip of the tail fiber and then the end of the tail binds irreversibly a component of the maltose permeation system The phage DNA can now enter the cell, probably by simple diffusion RNA transcription begins promptly (immediate early transcription) Starting from the promoter PL, the protein N is expressed, Cro is expressed from the PR promoter (Fig 1)

N is an antiterminator and allows the RNA polymerase to bypass the tL and tR1 terminators (delayed early transcription) Additional proteins are thus expressed, ell is one of the most important This protein is a key factor for the establishment of lysogeny It has two important functions First, it activates the P, promoter leading to the expression of the integrase The integrase is responsible for the integration of the

X genome into the host chromosome by site specific recombination Second, ell also activates the PRE (repression establishment) promoter allowing the expression of the

CI repressor CI will bind to two operators, 0| and 0R blocking both PL and PR promoters and activating PRM This will result in the silencing of all genes except for itself But, CI is in competition with Cro (control of repressor and other things) for the same operators Both repressors have different affinities for the binding sites of the 7

operators; CI binds preferentially to site 1, cro to site 3. Thus, while CI blocks PL, PR

and activates PRM, cro will block PRM.

At least three bacterial and one phage factor do determine the stability and

expression level of CIL According to the concentration of these factors at the moment

of the infection, there will be more or less CI. If not enough CI protein is produced,

cro concentration prevails, Cro will block PM and CI concentration will further

decrease. Now, N will be fully expressed allowing the transcription of all genes

downstream of PR, including Q. Q is a strong transcriptional activator, which will

permit the expression of both structural and lysis genes. At this stage, lytic

development will proceed unrestrained

In the case there is enough CI to block the transcription at PL and PR, X will

lysogenize the cell. Activation of PRM will provide enough CI to maintain the lysogeny

and prevent the superinfection of the lysogen by the same phage. The prophage can

be induced by environmental factors such as UV light or chemical mutagens that

damage host DNA. These damages cause the RecA protein to act as a protease and

cleave the CI repressor, inducing the transcription of the phage genes.

•rm < PR PL RE

att int xis red „l \ N rex d

I cro ^ ell OP tR2 Q > rR

Pl *RM prri 1.1 p R q. q,

Figure 1: Regulation of early X transcription. Transcription of early X genes is initiated at PL and PR, and is subject to repression by CI acting at 0L and Or. Dashed lines indicate transcripts. The expanded diagrams show the relationship between PR, PL and PRM with the three repressor binding sites. CI binds preferentially to the sites 1, thus blocking PL and PR. Cro binds preferentially to the sites 3, blocking PRM. Adapted from E. Birge, Bacterial and Bacteriophage genetics (7). 8

1.1.4 Bacteriophages of lactic acid bacteria

Phage infection is a permanent threat in the dairy industry; therefore many dairy

research institutes focused on the characterization of the phage population involved.

A detailed classification of the various isolates has been developed. All viruses

infecting LAB are tailed phages and contain linear dsDNA. Genome sizes generally

range between 18 and 55 kb. According to the classification scheme of Bradley (19)

phages of LABs are differentiated into three main classes; morphotype A (contractile

tails), B (long, noncontractile tails) and C (short noncontractiie tails) (48).

L. lactis phages have been divided into 10 species according to morphology and

DNA/DNA hybridization. Some of the best characterized phages are listed in Table 2.

Species 936, c2 and recently P335 are the predominant species (48). P335 is the

only species to comprise both temperate and lytic phages (20;82).

Host Phage strain Morphology Life style Relevant information available References (species) L. lactis r1t(P335) B1 temperate Totally sequenced, genetic (74,92) switch.

LC3(P335) B1 temperate Lysis cassette, integrase. (8;57) ski (936) B1 lytic Totally sequenced, transcription (22,23) map, lysis cassette and origin of replication. Tue 2009 B1 temperate Totally sequenced (5;90,91) (P335) (unpublished), lysin, holm, major structural proteins, putative repressor. BK5-T (BK5-T) B1 temperate Totally sequenced (12-14) (unpublished), integrase, putative repressor TP901-1 B1 temperate Totally sequenced (24;25;47;62) (P335) (unpublished), major structural proteins, integrase c2 (c2) B2 lytic Totally sequenced, structural (59;60,96) proteins, replication origin, transcription map Lb LL-H B1 lytic Totally sequenced Deletion (66-69) delbrueckii derivative of a temperate phage. mv4 B1 temperate attP, integrase and endolysin. (9;32) JCL1032 B3 Contains homologous DNA (34) segments with LL-H and mv4. Lb. gassen adh B1 temperate Totally sequenced (41;81) (unpublished) Integrase, lysis cassette

Lb. casei A2 B1 temperate Integrase, DNA packaging (4,37;38;53) proteins and genetic switch

Lb. g1e B1 temperate Totally sequenced, major (49,52;79) plantarum structural proteins and lysis cassette

Table 2: Examples of characterized phages of lactic acid bacteria 9

Lactobacillus phages generally belong to the Siphoviridae family and have small, isometric heads A few phages with a prolate head have been also described

Myoviridae have been isolated in Lb helveticus and Lb plantarum Phages of Lb delbrueckii, the best investigated group have been classified into four DNA homology groups, a, b c and d (85) Group a is the most prevalent and contains both temperate and lytic phages In general the DNA homology between lytic and temperate phages is much higher for phages of Lactobacillus that for lactococcal phages

A molecular characterization of several phages of lactic acid bacteria has been initiated Efforts are mainly focusing on the analysis of the genes involved in phage integration, life cycle control and host lysis since this can result in molecular tools such as integration vectors, inducible gene expression systems or controlled cell lysis The complete sequences of 6 L lactis and 3 Lactobacillus sp have been determined (Table 2)

Recently, the increasing economical importance of S thermophilus led an increased interest into their phages Many phage isolates have been obtained from

fermentation failures in the Growth Growth medium Milk M17 broth in Growth tempeiature 40°C dairy industry resulting Eclipse period 30 mm several collections of S Rise period 20 mm Burst size 140 200 thermophilus phages S Morphology Head 65 nm Tail 230 260 nm thermophilus phages have Tail fiber 50 400 nm been classified by host Taxonomy Order Caudovirales Family Siphoviridae Group B1 range, morphology

Genome ds DNA 31 45 kb serotype, DNA homology, Table 3: Characteristics of S thermophilus phages (16) structural protein pattern and DNA packaging mechanism (6 18 54 56 76 80) All phages examined until now have the same basic features (Table 3), they consist of a small isometric head and a long, noncontractile tail In some cases a long tail fiber attached to the base of the tail can be seen They are all classified as group B1 Siphoviridae The size of their dsDNA genome has been calculated to range between 31 and 45 kb from restriction digests DNA hybridization studies revealed that all phages belong to a single DNA homology group All S thermophilus bacteriophages shared at least one segment of their genome Host range, on the othei hand is very narrow The majority of the phages are strain specific

Analysis of the host range and serotype from a subset of phages isolated in

French factories during the last 30 years allowed the separation of the phages in two 10

groups, lytic group I and lytic group II phages (18) Both groups had clearly distinct host ranges Antibodies raised against a lytic group I phage could neutralize most lytic group I phages, but not lytic group II phages Conversely, antibodies raised against a lytic group II phage inhibited other lytic group II phages, but not lytic group I phages This was interpreted as a clear indication for differences in some structural genes A subsequent analysis based on the protein patterns and DNA analysis could split all S thermophilus phages in two groups (56), one group had two major proteins

(32 and 26 kDa), the other group had three major proteins (41, 25 and 13 kDa) The difference in the protein pattern correlated very well with the packaging mechanism used, all phages with two major proteins were cos-site phages, and all phages with three major proteins were pac-site phages Until now all lytic group I phages were cos-site phages, whereas lytic group II phages were members of the pac-site phages

Both temperate and virulent phages have been isolated, but temperate phages represent only about 1% of all isolates Dotplot tests showed extensive crosshybndisation between temperate and virulent phages (17)

1.2 Phage resistance mechanisms in LAB

1.2.1 Natural resistance mechanisms

Molecular analysis provided insight into natural phage defense systems in

Lactococcus lactis These bacteria developed a variety of mechanisms to cope with phage infections, which act at various point of the phage life cycle Major categories include (1) adsorption inhibition (2) block of phage DNA injection, (3) restriction/modification and (4) abortive infection mechanisms

1.2.1.1 Phage adsorption inhibition

The expression adsorption inhibition is used when the phage is unable to attach to the cell surface Most of the spontaneous phage resistant mutants selected by phage challenge (Bacteriophage /nsensitive Mutants or BIM) are modified in their adsorptive characteristics (50 58 63 93) The phenotype can result from many mechanisms, e g changes in the cell surface carbohydrate composition (40) masking of cell surface structures (86) or from alteration of specific phage protein receptors (72)

Interference with phage adsorption has also been linked with a number of lactococcal plasmids Lactococcal plasmids pSK112 (28 86) and pCI528 (61) encode 11

proteins for the production of Galactosyl-containig lipoteichoic acid, and galactose and rhamnose-containing polymers, respectively These substances possibly mask the phage receptor In both cases mild alkali treatment could remove this material restoring phage sensitivity

The major problem with adsorption inhibition is that phages are not removed from the media and can attack a sensitive cell present in the same environment

1.2.1.2 Block of phage DNA injection

Phage resistant strains have been isolated which allow normal phage adsorption, but which apparently do not permit phage DNA injection (89 95) pNP40 of L lactis

MG1614 permits phage c2 to adsorb to the cell surface (42) but no phage DNA could be detected into the cell after adsorption The resistance mechanisms were by¬ passed by electroporating phage DNA into the cell (43)

1.2.1.3 Abortive infection mechanisms

Abortive infection (Abi) is a phage resistance mechanism that acts after phage

DNA injection but is not restriction and modification Abi results in cell death before phage progeny is released thus limiting the propagation of new phage particles in the culture Cells harboring Abi-resistance mechanisms are generally killed either by the Abi protein itself or by the modifications caused by the infection In lactococci Abi mechanisms are usually encoded on plasmids (42) The phenotype of an abortive infection covers reduction in the efficiency of plaquing (EOP) ranging between 0 5 and < 109, smaller plaques due to smaller burst size and a decrease in the efficiency of formation of centers of infection (ECOI) (3)

A few mechanisms of action have been investigated One is the AbiA system encoded by pTR2030 (46) and pCI829 (26 27) It delays as other Abi systems, phage DNA replication It affects 3 lactococcal species (936 P335 and c2) A leucine zipper was identified in the AbiA protein Site-directed mutagenesis experiments demonstrated its essential role in conferring phage resistance (3)

Phages can be isolated that overcome Abi mechanisms Point mutations are the most common changes leading to Abi insensitivity (29) Transfers of bacterial genes into the phage genome leading to insensitivity have also been observed (70) 12

1.2.1.4 Restriction and modification

Many R/M systems have been identified in lactic acid bacteria, reviewed in (3)

' They operate at varying levels of efficiency (EOP of 10 - 106) The problem with this system is that it allows some phages to escape restriction resulting in modified progeny virions The modified phages can then infect a second host bearing an identical R/M system without being restricted (EOP=1 0) However, R/M systems can be extremely useful when combined with Abi defenses The restriction/modification system provides the first level of protection and promotes cell survival If the phage escapes restriction, Abi systems will kill the cell and prevent the development of the modified phage

Hill et al described a lactococcal phage being able to overcome the phage resistance mechanisms encoded by plasmid pTR2030 (45) This plasmid encodes for two different phage insensitivity mechanisms, a R/M system and the abortive infection system AbiA To escape the host restriction activity the bacteriophage acquired the functional domain from the type II A methylase encoded by the plasmid

Insensitivity to the second phage-resistance mechanism seems to be due to its phage origin of replication which apparently differs from those targeted by the AbiA system

Moineau et al (71) succeeded in protecting several industrial strains of S thermophilus from phage infection by the introduction of a lactococcal R/M system

1.2.2 Novel phage resistance mechanisms

In addition to natural resistance mechanisms, resistance mechanisms have been constructed in the laboratory that are not encountered in LAB (Per and antisense

RNA) Per is an acronym for Phage Encoded Resistance and relies on the phage inhibitory effect of phage DNA when cloned on a plasmid For both lactococcal and streptococcal phages, the presence of phage origin of replication (on) on a high copy number plasmid, can interfere with phage replication in an Abi-like manner

(33,44,77) Phage replication proteins are bound to the on on the high copy instead to the physiological on on the phage DNA (33)

On theoretical grounds antisense RNA appears a promising approach, but recent experiments had little success (94) Apparently, the choice of the target gene is critical It should be a crucial phage gene conserved in a wide range of phages and expressed at low enough levels to permit effective inhibition 13

Recently, a further protection system has been constructed for L. lactis. A bacteriophage inducible promoter was used to trigger the expression of a bacterial suicide system (31). The lethal genes consisted of the restriction cassette LlaIR deprived of its corresponding methylase. The EOP was lowered to 10"5. However, mutant phages developed that were less sensitive (EOP of 0.4). Changes were found in their transcription activator, which decreased expression of the suicide system

(30). Finally, the loss of the phage receptor in L. lactis C2 can completely block the infection by the c2 phage (36).

1.3 Outline of the thesis

1.3.1 Phage diversity

The first aim of my thesis was the definition of phage diversity by comparative genomics of S. thermophilus phages by sequencing the genomes of 3 lytic phages.

The phage genomes underwent pairwise alignments by including two temperate and one further virulent S. thermophilus phages to localize areas of difference (Table 4).

The phages were chosen to allow a pairwise alignment of phages differing in specific phenotypes: (A) Life style (Sfi19 vs. Sfi21; Sfi11 vs. O1205). (B) DNA packaging and structural proteins, (cos-phage Sfi19 vs. pac-phage Sfi11). (C) Host range (Sfi19 vs.

DT1; Sfi11 vs. 01205). (D) Sensitivity vs. resistance to Sfi21 prophage mediated superinfection exclusion mechanisms (Sfi 19 vs. Sfi18). The aim of these comparisons was to correlate phenotypic differences with genotypic differences.

Results are reported in chapters 2 to 4.

Phage Characteristics Reference __ _ Sfiî î Lytic This work. pac-phage Sensitive to phage-ori mediated resistance mechanisms Sfi18 Lytic This work, cos-phage Sensitive to phage-ori mediated resistance mechanisms Submitted to Sfi21-prophage control

Sfi19 Lytic This work. cos-phage Not sensitive to phage-on mediated resistance mechanisms Not submitted to Sfi21-prophage control

Sfi21 Temperate F. Desiere, Diss ETHZ Nr. cos-phage 13332, 1999. Sensitive to phage-ori mediated resistance mechanisms

DT1 Lytic (88) Cos-phage

O1205 Temperate (87) Pac-phage

Table 4: Sequenced S. thermophilus phages 14

The evolutionary relationship of S thermophilus phages to Siphoviridae from low

GC content Gram-positive bacteria is explored in chapters 3 and 5

1.3.2 Phage resistant starter strains

The second aim of my thesis was the establishment of phage-resistant S thermophilus starter strains Since plasmids and natural phage resistance mechanisms are rare in S thermophilus we turned our attention to phage DNA segments that inhibited phage infection of strains that contained these elements on high copy number plasmids We identified the phage replication origin as an inhibitory DNA element The results of these experiments are described in chapter 6

A second approach to establish phage resistant starters relied on the anti-phage properties of lysogénie starters The Sfi21 prophage encodes potent superinfection immunity genes The drawback is that lysogens continuously release phage particles in the media These phages could then infect sensitive strains Therefore, we created deletions in the prophage by targeted insertion mutagenesis to obtain lysogens, which retain superinfection immunity but do not produce infectious particle Finally, we hypothesized that S thermophilus phages need host genes tor their development and that it should be possible to obtain phage resistant strains of S thermophilus by random insertion mutagenesis of the host genome If these genes are nonessential for bacterial growth in milk, such insertion mutants can yield phage resistant starters

The results of the insertion mutagenesis experiments are reported in chapter 7

Reference List

1 Ackermann, H.W. 1996 Frequency of morphological phage descriptions in 1995 Arch Virol 141 209-218 2 Ackermann, H.W. 1998 Tailed bacteriophages the order cauclovirales Adv Virus Res 51 135-201

3 Allison, G.E. and T.R. Klaenhammer 1998 Phage resistance mechanisms in lactic acid bacteria Int Dairy J 8 207-226 4 Alvarez, M.A., M. Herrero, and J.E. Suarez 1998 The site-specific recombination of system the Lactobacillus species bacteriophage A2 integrates in gram- positive and gram-negative bacteria Virology 250 185 193 5 Arendt, E.K., C. Daly, G F. Fitzgerald, and M. van de Guchte 1994 Molecular characterization of lactococcal bacteriophage Tuc2009 and identification and analysis of genes encoding lysin a putative holm and two structural proteins Appl Environ Microbiol 60 1875-1883 15

6 Benbadis, L., M. Faelen, P. Slos, A. Fazel, and A. Mercenier 1990 Characterization and comparison of virulent bacteriophages of Streptococcus thermophilus isolated from yoghurt Biochimie 72-855-862 7 Birge, E.A. 1994 Bacterial and bacteriophage genetics Springer-Verlag, New York 8 Birkeland, N.K. 1994 Cloning molecular characterization, and expression of the genes encoding the lytic functions of lactococcal bacteriophage phi LC3 a dual lysis system of modular design Can J Microbiol 40 658-665 9 Boizet, B., M.Y. Lahbib, L. Dupont, P. Ritzenthaler, and M. Mata 1990 Cloning, expression and sequence analysis of an endolysin- encoding gene of Lactobacillus bulgancus bacteriophage mv1 Gene 94 61-67 10 Botstein, D. 1980 A theory of modular evolution for bacteriophages Ann N Y Acad Sei 354 484-490 11 Botstein, D. and I. Herskowitz 1974 Properties of hybrids between Salmonella phage P22 and cohphage lambda Nature 251 584-589 12 Boyce, J.D., B.E. Davidson, and A.J. Hillier 1995a Identification of prophage genes expressed in lysogens of the Lactococcus lactis bacteriophage BK5-T Appl Environ Microbiol 61 4099-4104 13 Boyce, J.D., B.E. Davidson, and A.J. Hillier 1995 Sequence analysis of the Lactococcus lactis temperate bacteriophage BK5-T and demostration that the phage DNA has cohesive ends Appl Environ Microbiol 61 4089-4098 14 Boyce, J.D., B.E. Davidson, and A.J. Hillier 1995b Spontaneous deletion mutants of the Lactococcus lactis temperate bacteriophage BK5-T and localization of the BK5- T attP site Appl Environ Microbiol 61 4105-4109 15 Brüssow, H. 1999 (UnPub) 16 Briissow, H. 1999 Phages of Streptococcus thermophilus, R Webster and A Granoff (eds ), Encyclopedia of virology Academic Press, London, England 17 Brüssow, H. and A. Bruttin 1995 Characterization of a temperate Streptococcus thermophilus bacteriophage and its genetic relationship with lytic phages Virology 212 632-640 18 Briissow, H., M. Fremont, A. Bruttin, J. Sidoti, A. Constable, and V. Fryder 1994 Detection and classification of Streptococcus thermophilus bacteriophages isolated from industrial milk fermentation Appl Environ Microbiol 60 4537- 4543 19 Bradley, D.E. 1969 Ultrastructure of bacteriophages and bactenocins BactenolRev 31 230-314 20 Braun, V.Jr., S. Hertwig, H. Neve, A. Geis, and ML Teuber 1989 Taxonomic differentiation of bacteriophages of Lactococcus lactis by electro microscopy, DNA-DNA hybridization and protein profiles J Gen Microbiol 135 2551- 2560 21 Casjens, S.R., G.F. Hatfull, and R.W. Hendrix 1992 Evolution of dsDNA tailed- bactenophage genomes Semin Virol 3 383-397 22 Chandry, P.S., B.E. Davidson, and A.J. Hillier 1994 Temporal transcription map of the Lactococcus lactis bacteriophage ski Microbiology 140 2251-2261 23 Chandry, P.S., S.C. Moore, J.D. Boyce, B.E. Davidson, and A.J. Hillier 1997 Analysis of the DNA sequence, gene expression, origin of replication and modular structure of the Lactococcus lactis lytic bacteriophage ski Mol Microbiol 26 49-64 24 Christiansen, B., L. Brjndsted, F.K. Vogensen, and K. Hammer 1996 A resolvase-like protein is required for the site-specific integration of the temperate lactococcal bacteriophage TP901-1 J Bactenol 178 5164-5173 25 Christiansen, B., M.G. Johnsen, E. Stenby, F.K. Vogensen, and K. Hammer 1994 Characterization of the lactococcal temperate phage TP901-1 and its site-specific integration J Bactenol 176 1069-1076 26 Coffey, A.G., G.F. Fitzgerald, and C. Daly 1989 Identification and characterization of a plasmid encoding abortive infection from Lactococcus lactis ssp lactis UC811 Neth Milk Dairy J 43 229-244 27 Coffey, A.G., G.F. Fitzgerald, and C. Daly 1991 Cloning and characterization of the determinant for abortive infection of bacteriophage from lactococcal Plasmid pCI829 J Gen Microbiol 137 1355-1362 16

28 De Vos, W.M., H.M. Underwood, and F.L. Davies 1984 Plasmid encoded bacteriophage resistance in Streptococcus cremons SK11 FEMS Microbiol Lett 96 135-142

29 Dinsmore, P.K. and T.R. Klaenhammer 1995 Bacteriophage resistance in Lactococcus Mol Biotechnol 4 297-314 30 Djordjevic, G.M. and T.R. Klaenhammer 1997 Bactenophage-tnggered defence systems phage adaptation and design improvement Appl Environ Microbiol 63 4370-4376 31 Djordjevic, G.M., D.J. O'Sullivan, S.A. Walker, M.A. Conkling, and T.R. Klaenhammer 1997 Tnggered-suicide system designed as a dfence against bacteriophages J Bactenol 179 6741-6748 32 Dupont, L., B.B. Boizet, M. Coddeville, F. Auvray, and P. Ritzenthaler 1995 Characterization of genetic elements required for site- specific integration of Lactobacillus delbrueckii subsp bulgancus bacteriophage mv4 and construction of an integration- proficient vector for Lactobacillus plantarum J Bactenol 177 586-595 33 Foley, S., S. Lucchini, M.-C. Zwahlen, and H. Brussow 1998 A short noncodmg viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus Virology 250 377- 387 34 Forsman, P. 1993 Characterization of a prolate-headed bacteriophage of Lactobacillus delbrueckii subsp lactis and its DNA homology with isometnc- headed phages Arch Virol 132 321-330 35 Fuhrman, J.E. 1999 Marine viruses and their biogeochemical and ecological effects Nature 399 541-548 36 Garbutt, K.C., J. Kraus, and B.L. Geller 1997 Bacteriophage resistance in Lactococcus lactis engineered by replacement of a gene for a bacteriophage receptor J Dairy Sei 80 1512-1519 37 Garcia, P., J.C. Alonso, and J.E. Suârez 1997 Molecular analysis of the cos region of the Lactobacillus casei bacteriophage A2 Gene product 3, gp3, specifically binds to its downstream cos region Mol Microbiol 23 505-514 38 Garcia, P., V. Ladero, J.C. Alonso, and J.E. Suârez 1999 Cooperative interaction of CI protein regulates lysogeny of Lactobacillus casei by bacteriophage A2 J Virol 73 3920-3929 39 Gemski, P., L.S. Baron, and N, Yamamoto 1972 Formation of hybrids between coliphage lambda and salmonella phage P22 with a Salmonella typhimunum hybrid sensitive to these phages Proc Natl Acad Sei U S A 69 3110-3114 40 Gopal, P.K. and V.L. Crow 1993 Characterisation of loosely associated material from the cell surface of Lactococcus lactis subsp cremons E8 and its phage- resistant variant strain 398 Appl Environ Microbiol 59 3177-3189 41 Henrich, B., B. Binishofer, and U. Blasi 1995 Primary structure and functional analysis of the lysis genes of Lactobacillus gasseri bacteriophage phi adh J Bactenol 177 723-732 42 Hill, C. 1993 Bacteriophage and bacteriophage resistance in lactic acid bacteria FEMS Microbiol Rev 12 87-108 43 Hill, C, P. Garvey, and G.F. Fitzgerald 1996 Bactenophage-host interactions and resisatnce mechaniss analysis of the conjugative bacteriophage resistance plasmid pNP40 Lait 76 67-79 44 Hill, C, L.A. Miller, and T.R. Klaenhammer 1990 Cloning, expression, and sequence determination of a bacteriophage fragment encoding bacteriophage resistance in Lactococcus lactis J Bactenol 172 6419-6426 45 Hill, C, L.A. Miller, and T.R. Klaenhammer 1991 In vivo genetic exchange of a functional domain from a type II A methylase between lactococcal plasmid pTR2030 and a virulent bacteriophage J Bactenol 173 4363-4370 46 Hill, C, D.A. Romero, D.S. Mckenney, K.R. Finer, and T.R. Klaenhammer 1989 Localisation, cloning and expression of genetic determinants for bacteriophage resistance (Hsp) from the conjugative plasmid pTR2030 Appl Environ Microbiol 55 1684-1689 17

47 Johnsen, M.G., K.F. Appel, P.L. Madsen, F.K. Vogensen, K. Hammer, and J. Arnau 1996 A genomic region of lactococcal temperate bacteriophage TP901-1 encoding major virion proteins Virology 218 306-315 48 Josephsen, J. and H. Neve 1998 Bacteriophages and lactic acid bacteria, p 385- 436 In S SalminenandA von Wright (eds ) Lactic acid bacteria Dekker, Basel 49 Kakikawa, M., M. Oki, H. Tadokoro, S. Nakamura, A. Taketo, and K. Kodaira 1996 Cloning and nucleotide sequence of the major capsid proteins of Lactobacillus bacteriophage phi gle Gene 175 157-165 50 King, W.R., E.B. Collins, and E.L. Barrett 1983 Frequencies of bactenophage- resistant and slow acid-producing variants of Streptococcus cremons Appl Environ Microbiol 45 1481-1485 51 Klaenhammer, T.R. 1987 Plasmid-directed mechanisms for bacteriophage dfence in lactic streptococci FEMS Microbiol Rev 46 313-325 52 Kodaira, K., M. Oki, M. Kakikawa, N. Watanabe, M. Hirakawa, K. Yamada, and A. Taketo 1997 Genome structure of the Lacfoöac///us temperate phage Fgle the whole genome sequence and the putative promoter/repressor system Gene 187 45-53 53 Ladero, V., P. Garcia, V. Bascaran, M. Herrero, M.A. Alvarez, and J.E. Suârez 1998 Identification of the repressor-encodmg gene of the Lactobacillus bacteriophage A2 J Bactenol 180 3474-3476 54 Larbi, D., L. Colurin, B. Ronselle, B. Decaris, and J.-M. Simonet 1990 Genetic and biochemical characterization of nine Streptococcus salivanus subsp thermophilus bacteriophages Lait 70 107-116 55 Larbi, D., B. Decaris, and J.-M. Simonet 1992 Different bacteriophage resistance mechanisms in Streptococcus salivanus subsp thermophilus J Dairy Res 59 349-357 56 Le Marrec, C, D. van Sinderen, L. Walsh, E. Stanley, E. Vlegels, S. Moineau, P. Heinze, G. Fitzgerald, and B. Fayard 1997 Two groups of bacteriophages infecting Streptococcus thermophilus can be distinguished on the basis of mode of packaging and genetic determinants for major structural proteins Appl Environ Microbiol 63 3246-3253 57 Lillehaug, D. and N.K. Birkeland 1993 Characterization of genetic elements required for site- specific integration of the temperate lactococcal bacteriophage phi LC3 and construction of integration-negative phi LC3 mutants J Bactenol 175 1745-1755 58 Limsowtin, G.K.Y. and B.E. Terzaghi 1976 Phage resistant mutants their selection and use in cheese factories New Zealand J Dairy Sei 11 251-256 59 Lubbers, M.W., K. Schofield, N.R. Waterfield, and K.M. Polzin 1998 Transcription analysis of the prolate-headed lactococcal bactenophage c2 J Bactenol 180 4487-4496 60 Lubbers.M.W., N.R. Waterfield, T.P.J. Beresford, R.W.F. Le Page, and A.W. Jarvis 1995 Sequencing and analysis of the prolate-headed lactococcal bacteriophage c2 genome and identification of the structural genes Appl Environ Microbiol 61 4348-4356 61 Lucey, M., C. Daly, and G.F. Fitzgerald 1992 Cell surface characteristics of Lacotcoccus lactis harbouring pCI528 a 46-kb plasmid encoding inhibition of bacteriophage adsorption J Gen Microbiol 138 2137-2143 62 Madsen, P.L. and K. Hammer 1998 Temporal transcription of the lactococcal temperate phage TP901-1 and DNA sequence of the early promoter region Microbiology 144 2203-2215 63 Marshall, R.J. and N.J. Berridje 1976 Selection and properties of phage-resistant starters for cheese making J Dairy Res 43 449-458 64 McKay, L.L. and K.A. Baldwin 1990 Applications for biotechnology present and future improvements in lactic acid bacteria FEMS Microbiol Rev 87 3-14 65 Wlercenier, A. 1990 Molecular genetics of Streptococcus thermophilus FEMS Microbiol Rev 7 61-77

66 Mikkonen, M. and T, Alatossava 1994 Characterization of the genome region encoding structural proteins of Lactobacillus delbrueckii subsp lactis bacteriophage LL-H Gene 151 53-59 18

67 M. Mikkonen, and T. Alatossava . 1995 A group I intron in the terminase gene of Lactobacillus delbrueckii subsp lactis phage LL-H Microbiology. 141.2183- 2190 68 Mikkonen, M., L. Dupont, T. Alatossava, and P. Ritzenthaler 1996 Defective site- specific integration elements are present in the genome of virulent bacteriophage LL-H of Lactobacillus delbrueckii Appl Environ Microbiol 62 1847-1851

69 Mikkonen, M., L. Räisänen, and T. Alatossava 1996 The early gene region completes the nucleotide sequence of Lactobacillus delbrueckii subsp lactis phage LL-H Gene 175 49-57 70 Moineau, S., S. Pandian, and T.R, Klaenhammer 1994 Evolution of a lytic bacteriophage via DNA acquisition from the Lactococcus lactis chromosome Appl Environ Microbiol 60 1832-1841 71 Moineau, S., S.A. Walker, B.J. Holler, E.R. Vedamuthu, and P.A. Vandenbergh 1995 Expression of a Lactococcus lactis phage resistance mechanism by Streptococcus thermophilus Appl Environ Microbiol 612461-2466 72 Monteville, M.R., B. Ardestani, and B.L. Geller 1994 Lactococcal bacteriophages require a host cell wall carbohydrate and a plasma membrane protein for adsorption and ejection of DNA Appl Environ Microbiol 60 3204-3211 73 Morelii, L Personal Communication 74 Nauta, A., D. van Sinderen, H. Karsens, E. Smit, G. Venema, and J. Kok 1996 Inducible gene expression mediated by a repressor-operator system isolated from Lactococcus lactis bacteriophage rit Mol Microbiol 19-1331-1341 75 Neve, H. 1996 Bacteriophage, p 157-189 In T M Cogan and J.-P Accolas (eds ), Dairy starter cultures VCH Publishers, New York 76 Neve, H., U. Krusch, and M. Teuber 1989 Classification of virulent bacteriophages of Streptococcus salivanus subsp thermophilus isolated from yoghurt and Swiss-type cheese Appl Microbiol Biotechnol 30 624-629 77 O'Sullivan, D.J., C. Hill, and T.R. Klaenhammer 1993 Effect of increasing the copy number of bacteriophage origins of replication, in trans on incoming-phage proliferation Appl Environ Microbiol 59 2449-2456 78 O'Sullivan, T., D. van Sinderen, and G. Fitzgerald 1999 Structural and functional analysis of pCI65st, a 6 5 kb plasmid from Streptococcus thermophilus NDI-6 Microbiology 145 127-134 79 Oki, M., M. Kakikawa, S. Nakamura, E.T, Yamamura, K. Watanabe, M. Sasamoto, A. Taketo, and K. Kodaira 1997 Functional and structural features of the holm HOL protein of the Lactobacillus plantarum phage phi gle analysis in Escherichia coll system Gene 197 137-145 80 Prévôts, F., P. Relano, M. Mata, and P. Ritzenthaler 1989 Close relationship of virulent bacteriophages of Streptococcus salivanus subsp thermophilus at both the protein and the DNA level J Gen Microbiol 135 3337-3344 81 Raya, R.R., C. Fremaux, G.L. De Antoni, and T.R. Klaenhammer 1992 Site- specific integration of the temperate bacteriophage phi adh into the Lactobacillus gassen chromosome and molecular characterization of the phage (attP) and bacterial (attB) attachment sites J Bactenol 174 5584- 5592 82 Relano, P., M. Mata, M. Bonneau, and P. Ritzenthaler 1987 Molecular characterization and comparison of 38 virulent and temperate bacteriophages of Streptococcus lactis J Gen Microbiol 133 3053-3063 83 Sanders, M.E. 1987 Bacteriophages of industrial importance, p 211-244 In S M Goyal, C P Gerba and G Bitton (eds ), Phage Ecology Wiley Interscience, NY 84 Sanders, M.E., P.K. Leonhard, W.D. Sing, and T.R. Klaenhammer 1986 Conjugal strategy for construction of fast acid producing, bactenophage-resistant lactic streptococci for use in dairy fermentations Appl Environ Microbiol 52 1001- 1007 85 Sechaud, L., P.J. Cluzel, M. Rousseau, A. Baumgartner, and J.P. Accolas 1988 Bacteriophages of lactobacilli Biochimie 70 401-410 86 Sijtsma, L., J.T.M. Wouters, and K.J. Hellingwerf 1990 Isolation and characterization of lipoteichoic acid, a cell envelope component involved in 19

preventing phage adsorption, from Lactococcus lactis subsp cremons J Bactenol 172 7130 87 Stanley, E., G.F. Fitzgerald, C. Le Marrec, B. Fayard, and D. van Sinderen 1997 Sequence analysis and characterization of phi 01205, a temperate bacteriophage infecting Streptococcus thermophilus CNRZ1205 Microbiology 143 3417-3429 88 Tremblay, D.M. and S. Moineau 1999 Complete genomic sequence of the lytic bacteriophage DT1 of Streptococcus thermophilus Virology 255 63-76 89 Valyasevi, R., V.E. Sandine, and B.L. Geller 1991 A membrane protein is required for bacteriophage c2 infection of Lactococcus lactis subsp lactis C2 J Bactenol 173 6095-6100 90 van de Guchte, M., C. Daly, G.F. Fitzgerald, and E.K. Arendt 1994 Identification of int and attP on the genome of lactococcal bacteriophage Tuc2009 and their use for site-specific plasmid integration in the chromosome of Tuc2009- resistant Lactococcus lactis MG1363 Appl Environ Microbiol 60 2324-2329 91 van de Guchte, M., C. Daly, G.F. Fitzgerald, and E.K. Arendt 1994 Identification of the putative repressor-encoding gene cl of the temperate lactococcal bacteriophage Tuc2009 Gene 144 93-95 92 van Sinderen, D., H. Karsens, J. Kok, P. Terpstra, M.H. Ruiters, G. Venema, and A. Nauta 1996 Sequence analysis and molecular characterization of the temperate lactococcal bacteriophage rit Mol Microbiol 19 1343-1355 93 Vlegels, P.A.P., W.C. Hazeleger, T.H. Heimerhorst, and J.T.M. Wouters 1988 Phage resistance of Streptococcus cremons due to low adsorption efficiency Neth Milk Dairy J 42 195-206 94 Walker, S.A. and T.R. Klaenhammer 1998 Molecular characterization of a phage- inducible middle promoter and its transcriptional activator from the lactococcal bacteriophage F31 J Bactenol 180 921-931 95 Watanabe, K., K. Ishibashi, Y. Nakashima, and T. Sakurai 1984 A phage resistant mutant of Lactobacillus casei which permits phage adsorption but not genome injection J Gen Microbiol 65 981-986 96 Waterfield, N.R., Lubbers.M.W., K.M. Polzin, R.W.F. Le Page, and A.W. Jarvis 1996 An origin of DNA replication from Lactococcus lactis bacteriophage c2 Appl Environ Microbiol 62 1452-1453 20 21

2. RESULTS

2.1 The genetic relationship between virulent and temperate Streptococcus thermophilus bacteriophages: whole genome comparison of cos-site phages Sfi19andSfi21

(Virology 260, 232-243 (1999)) ->-) ^z*

232-243 Virology 260, (1999) - _ |(| Article ID viro1999 9814 available online at http //www idealibrary com on I D I P^h

The Genetic Relationship between Virulent and Temperate Streptococcus thermophilus Bacteriophages Whole Genome Comparison of cos-Site Phages Sfi19 and Sfi21

Sacha Lucchini Frank Desiere and Hamid Brüssow1

Nestle Research Centre Nesteo I td Vers ch°z l°s B -vie CH 000 I aw,anno 2b Switzerland

Received December c9 1998 etu red to autho 'or /t/s o i Ma 1 I99J accepted May 1 / 1099

The virulent cos site Streptococcus thermophilus bacteriophage Sfi19 has a 37392 bp long genome consisting of 44 open

reading frames all encoded on the same DNA strand The genome of the temperate cos site S thermophilus phage Sfi21 is 3 3 kb longer (40 740 bp 53 oris) Both genomes are very similarly organized and differed mainly by gene deletion and DNA rearrangement events in the lysogeny module oene lephcement duplication, and deletion events in the DNA replication module and numerous point mutations The level of point mutations varied from <-l% (lysis and DNA replication modules) to ">15% (DNA packaging and head morphogenesis modules) A dotplot analysis showed nearly a straight line over the left 25 kb of their genomes Over the right genome half a more variable dotplot pattern was observed The entire lysogeny module from Sfi21 comprising 12 genes was replaced by 7 oris in Sfi19 six showed similarity with genes from temperate pac site S thermophilus phages None of the genes implicated in the establishment of the lysogénie state (integrase

superinfection immunity repressor) or remnants of it were conserved in Sfi19 while a Cro like repressor was detected Downstream of the highly conserved DNA replication module 11 and 13 oris were found in Sfi19 and

was only found in Sfi21 All other oris from this region which included a second putative phage lepressor were closely related between both phages Two noncoding regions of Sfi 19 showed sequence similarity to pST1, a small cryptic plasmid

Of S thermophilus O 1999 Academ c Press

INTRODUCTION lactic acid bacteria (lactic streptococci lactococci lac tobacilli) have recently become a focus of reseaich in Bacteriophages can be classified according to their food microbiology (Josephsen and Neve 1998) All S lifestyle into virulent and temperate phao.es Somewhat thermophilus phages isolated until now belong to a sin generalized temperate phages have the cho ce to mult! gle class of viruses isometric headed phages with a ply on their host cells leading to the lysis of the cell or to long noncontractile tail {Siphoviridae) containing —40 kb integrate the phage genome into the bacterial chromo of linear double stranded DNA that showed at least some to become a prophage Piophages are piCDaaatnd some cross hybridization (Brüssow et al 1994 Brussow passively by the replication machmeiy of the bacter al aid Bruttin 1995) Ecologically S thermophilus cell Virulent phages are unable to lysogenize then I ost phages are reasonably well characterized More than 200 dis cells and have to rely on productive nfecton cycles for t net S isolates have their propagation Because both lifestyles have ditferent thermophilus phage been de scribed several laboratories ^30 reguirements with respect to genome organization viru by representing yeais of lent and temperate phages from manv bacterial hosts systematic surveys including longitudinal ecological et al 1989 e g Fscheiichia coli share tew if any genetic relation surveys (Neve Prevots ef al 1989 Brussow ships This separation into two fundamentally different et al 1994 Bruttin ef 9/ 1997a Le Marrec ef a/ 1997) classes of bacteriophages seems to be less clear n Temperate and virulent phages have been detected in

lactic acid bacteria (Shimizu Kadota era/ 1983 Brussow this phage group and they cross hybridized to a variable and Bruttin 1995 Mikkonen et al 199b) extent (Brussow and Bruttin 1995) Possibly as an adap Our laboratory is interested in phages of Sf/ep^cc tve response to the competition by virulent phages the eus a low GO thermophilus content giam pos five ea„ temoerate S thermophilus phage Sf(21 has developed a tenum used in industrial milk fermentation (Memen or powerful double immunity system The Sfi21 immunity 1990) Because of their economical as nbib importance system cons sts of a phage repressor protecting the itors of industrial milk fermentation barter from ODhage& lysogen aga nst superinfection with the homologous temperate phage and its derivatives (holey ef al unpub I shed data) and 3 second protein protecting the lysogen 1 Jtorr the To whom reprint requests should ie ->1dressecl -sx 0011 1 7^j superinfection by majority of the virulent phages 8925 Email harald bruessow® dis nestlt et m (Brutto °t ai 1997b) However some virulent phages

0042 6822/99 $30 00 Oopynqht O 1999 byAcadanc Pross

All of in orn rese vei MP) rqhts toproducton Tny 21

COMPARISON OF cos SITE PHAGES Sfi19 AND Sfi21 233

were still able to multiply on the lysogen (Bruttin et al organization were detected in the lysogeny and DNA 1997b) Within this group of phages two types could be replication module (Fig 1 blue shading) differentiated One type is represented by phage S17 The similarity between both phages extended to the according to DNA-DNA hybridization experiments S17 nucleotide seguence level A dotplot analysis showed

DNA is only weakly related to Sfi21 DNA (Brussow ind nearly a stra ght line over the left 25 kb of then genomes

Bruttin 1995) The second type is represented by S* 19 (Fig 2) Ov°r the right genome half a more variable According to DNA-DNA hybridization experiments Sf 19 dotplot pattern was observed The entire lysogeny mod DNA is closely telated to Sfi21 DNA over the tail mot t le from Sfi2l consisting of 12 orfs (Bruttin ef al 1997b) phogenesis host lysis and DNA replication modules was leplaced by a sequence unrelated cluster of 7 orfs (Brussow and Bruttin 1995) this was confirmed by se in Sf110 In the rightmost part of the compared Dhage guence analysis (Desiere ef a/ 1998 Foley ef al 1998) genomes the dotplot revealed a DNA duplication in Sfi19 However, Sfi19 like the majority of the lytic group I and two nonahgnments (Fig 2) {cos site phages) S thermophilus phages failed to The dotplot analysis does not provide a guantitative cross hybridize with the DNA packaging head moipho analys s of the level of nucleotide seguence similarity genesis and lysogeny modules fiom Sfi°1 under str i between corresponding genome regions of the two gent hybridization conditions (Brussow and Biutt i phages ]o depict the differences quantitatively, a plot 1995) To further our understanding of the geiet c rel i s milarity inalysis was done (Fig 3) It demonstrated that tionship between temperate and virulent 6 thermof. hi us the level of sequence similarity varied over different phages we completed the seguencmg of Sfi19 and Sfi21 legions of the compared genomes Over the DNA pack and provide here their whole genome comparison After aging and head morphogenesis modules a relatively the recent comparison of the virulent D29 and temperate high degree of diversification was observed Both

L5 mycobactenophage genomes (Ford et al 1998) this phages differed by — 15% at the basepair level Over the is to our the second whole knowledge only genome tail morphogenesis module seguence variation ranged comparison of a genetically closely related virulent tern from «-.1% to "-25% substantial variation was seen even perate phage pair between diffère it regions of the same gene Except for one insertion the lysis genes were highly conserved this RESULTS was also the case for the DNA replication module The right end of the ohage genomes showed an alternating Overall comparison of the S thermophilus Sfi19 and pattern ot nonconservation and conservation of nucleo Sfi21 phage genomes tide seguence Throe regions of the Sf110 genome that The cos-site S Sf 21 temperate thermophilus phage have not been described previously will be analyzed in has a 40 740 bp long genome consisting of 53 oo°n detail below reading frames (orfs) longer than 50 codons Two o fs were of unusual length for prokaryotic genomes i tnat The DNA packaging and head morphogenesis they exceed 1200 codons All but four orfs started w to ai modules of Sfi19 ATG start codon (Fig 1A) Twenty five orfs were ore ceded by a standard nbosomal binding site for S mer The left end of the two phage genomes showed an mophilus (GAG) (Guedon ef al 1995) in approonate identical genetic organization However at the nucleo spacing to the initiation codon three isolated cases of tide sequence level important differences were ob potential translational coupling of two orfs were ob served The noncoding region around the cohesive ends served (Fig 1A) Factor independent terminator struc dffeied substantially between the two phages [64 tures were identified downstream of the major tail pro changes in 189 it positions (34%) 3 of 15 nucleotide tem gene and the lysin gene The phage DNA was pos t ons fror i the cohesive ends of Sfi21 were changed densely packed with orfs only six noncoding regions n Sfr 191 (Fig 4) Most of the direct and inverted repeats longer than 200 bp were detected the longest was 450 "fom S*i?1 (Desiem ef a 1999) were not conserved in bp in length (Fig 1A) Sf110 Notaoly the noncoding region from Sf110 showed The virulent cos site S thermophilus phage Sti19 lias hignly significant similarity to the 9 thermophilus plas a 37392 bp long genome 3 3 kb shorter than Sfi21 con mid pST1 (94% bp identity over two 50 bp long DNA 1 sisting of 44 open reading frames longer than 50 codons segments P =10 Janzen ef al 1992) whereas this (Fig 1B) Remarkably all orfs are encoded on the same was not the case for the noncoding region from <^Sfi21 DNA strand in Sf119 while in Sfi21 four lysogeny related (P - 0 015) (Fig 4) Previous biomformatic analysis has genes were encoded on the opposte strand Both ge suggested that orf 152 and ori 623 encode the small and nomes are very similarly organized with respect to mod large subunit terminase, respectively (Desiere ef al ular structure number and length of the orfs loc"toi o* 1999) The putative large subunit terminase was well and use of intergenic regions alternative start codons conserved between both phages (93% aa identity with and terminators differences in (Fig 1) Major genome evenly distributed aa changes) while the orf 152 gp 4*

A« öTf«fe i Head morphogenesis Head-ta il Tail Lysogeny Packaging Host lysis Replication Joining morphogenesis modale ~t r "TT "ir

141a 87a 140a 74b 287 8S61 17S 384 397 116 202 141b 74a 122 75 87b 233 271 106b «1 181114 235 132 152a 623 S3 221 106S123 117a 15S0 515 1276 670 117b 288 110 140b 3S3 203127 93 157 4431S1 504 143 57130170152b

Pi «ft i

ï> i rf I • \ i f ',A t c Ä c; !! *Ä c o o

10000 20000 30000 4ö008bp i

>

S £ "C CO oCO

^t K / - t • 132 162 «25 511 1291 S70 131 288 111 54 229 233443151 604 143 67 152 235 170 141 183 SS 157 271 106 61 166 S? 88 83 67 38 i J Lysogeny replacement module

FIG 1 prediction of open reading Barnes in tne cono et° genomes ot "he empe ate S tncrmophilus phage Sfi21 (A)and the /irulent 0 thermophilus phage ùfi19 'B; The ods are mar

COMPARISON OF cos-SITE PHAGES Sfi19 AND Sfi21 235

40,OOO

ego ) 0> -£ ° £ o

Sfi19 _ O 2 , OOO

— 0> ra o

.L O , O O O

Head-tail Tail Lysogeny Packaging Host lysis Replication joining morphogenesis module Head morphogenesis Sfi21

FIG. 2. Dotplot calculated for the DNA genome sequences of Sfi21 !\'axis) and Sfi 19 (y axis). The comoaiison window was 50 bp and the stringency

30 For an easier the of bp. orientation, gene maps Sfi21 a^t\ Sfi19 ne'e given at the x axis and they axis, respectively, together with the map position in basepairs as defined in Fig. 1. Regions of sequence differences were marked and annotated

showed only 84% aa identity and the changes were Sfi 19, located between orf 111 and orf 157 (Fig. 5). We concentrated in the C terminus of both proteins. The called this DNA segment a lysogeny-replacement mod¬ Sfi19 genes encoding the putative portal protein, pro¬ ule. Interestingly, the left half of the lysogeny replace¬ tease and head differed 13-15% at the major protein by ment module from Sfi 19 was identical to a DNA segment bp level from the corresponding genes of Sfi21 (Table 1). found —I kb downstream of the integrase gene in the The bp changes were evenly distributed and mainly temperate pac-site S. thermophilus phages TP-J34 (Neve found in the third base position of the codons. ef al., 1998). Only 2 bp differences were observed over this region: 1 bp change led to a different stop codon (orf The module in Sfi19 lysogeny replacement 183 vs. orf 103) and 1 bp indel (insertion/deletion) led to

the of two orfs in Sfi19 instead of one orf 83 in The sequence similarity between the left genome prediction halves of both phages ended after oil 111/110 (84% bp TP-I34 (Fig. 5). Over the central identity), located 1 orf downstream of the lysin gene, and part of the lysogeny leplacement resumed over orf 157 (98% bp identity) (Fig. 3), flanking module, we detected nucleotide sequence similarity with the highly conserved DNA replication module (Desiere ef DNA segments of the temperate pac-site S. thermophilus al., 1997). None ot the 12 genes from the lysogeny mod¬ phage O1205 (Stanley ef al., 1997). Several points are ule of Sfi21 74 to orf showed bo (orf 87) (or deduced aa) noteworthy. First, this Sfi 19 DNA segment showed se¬

with the seven orfs in sequence similarity predicted quence similarity with four noncontiguous DNA seg- 26

236 LUCCHINI, DESIERE, AND BRUSSOW

Head morphogenesis Lysogeny Haad-tail Tail module Packaging joining morphogenesis Host lysis Replication

Sfi21

n ü,ooo 4 0,000

Posit±on

FIG. 3. Plot-similarity analysis of the sequence differences hetwee" the genomes of Sfi21 and Sfi 19. The lower x axis gives the map position in basepairs as defined in Fig. 1. The upper x axis gives the gene mac of Sfi21 for orientation. The y axis gives the similarity score (1.0, 0.5 and 0.0 correspond to 100, 50, and 0% sequence identity) The window was 100 bo.

ments of 01205 located, respectively, 1.6 kb downstream oathogenicity island. The noncoding regions preceding and 1.5 kb upstream of the phage integrase gene (F;g, 5). and following orf 229 as well as the region preceding orf Second, some sequences were highly similar (e.g., the 3' 69 showed sequence similarity to noncoding regions in half from orf 88 showed 99% bp identity to orf 53 from O1205 (notably including part of the hypothetical genetic O1205), whereas others were at the borderline of nt switch region of 01205). sequence similarity (orf 69 showed 45% bp identity with orf 5 from O1205). Third, orf 69 gp showed similarity to The right genome ends of Sfi 19 and Sfi21 Cro-like repressors both from the temperate S. ther¬ The conserved mophilus phage O1205 (42% aa identity) and the temper¬ highly part of the DNA replication mod¬ ate Lactococcus lactis phage TP901-1 (Madsen and ules from S. thermophilus phages extended from orf 157 Hammer, 1998) (48% aa identity; Fig. 6). Fourth, orf 88 to orf 143 when a large panel of phage isolates was from Sfi 19 is apparently the product of a gene fusion investigated (Desiere ef al., 1997). Eleven orfs were event between two oris separated in O1205. found between orf 143 and the right cohesive end of the The right part of the lysogeny replacement module is Sfi 19 genome, whereas 13 orfs were detected over this occupied by orf 229, which had no counterpart in tem¬ region in the temperate cos-site phage Sfi21. Interest¬ perate S. thermophilus phages. The predicted orf con¬ ingly, a similarly organized genome region was found in tained a DNA sequence of low complexity: An 18-bp-long the pac-site temperate S. thermophilus phage 01205, sequence was repeated 18 times. The consensus se¬ consisting of 10 orfs between orf 143 and the pac-site quence was: CT(Al1/G7)TGGl5TA,lA.7CGCA.,XG.iGGT (Fig. 7). The similarity extended to the nucleotide se¬ (the subscripts give the number of positions identical to quence level: orf 106 to orf 51 were identical in Sfi19 and the specified nt, nt in bold were identical in all 18 re¬ Sfi21 and the corresponding orfs in 01205 differed by peats). The same sequence could also be read in a 11-14% bp changes. None of the predicted four proteins different reading frame (orf 129) or on the opposite strand showed a database match. Downstream of this region (orf 114), resulting in predicted proteins with GNAxVx, the three phages differed. The temperate cos-site phage or PxRYHR The GSxVTx, peptide repeats, latter showed Sfi2l showed two orfs, orf 61 and orf 130. Orf 130 gp to the srrA of similarity gene product Yersinia pseudotu¬ snared significant sequence similarity with gp 64 from berculosis P (score 105, =10"') located in a possiDle mycobacteriophage L5 (40% aa identity, P =10~25). Gene 27

COMPARISON OF cos-SITE PHAGES Sfi19 AND Sfi21 237

Sfi21, 1326 ttäaacägaataccgatctacgactttccgtttcatcatgggcaggtttaatagcccaca Sfil9, 1301 B@aaacagaa|accgaSctac(^ctttccgtë@ca|c@tgggcag!|tttaatagc|^CTACGgcTTTCC mir <-TrT

R5 R5 stop 175 Rl Sfi21, 138 6 AAAAAGAGAGACGTCGTTAAACACCTCTAAAAAGACCATAGCAAGCCTTTAGAACGTTTG Sfil9, 358 IaAAaIaGAGACGTJÎGTTAA?iCScCTCTAAAA|G|cclTAGCABG|HTTTA}lAJ8c|TfflG Rl ~R~T Rl stop 170 R2 IRL. Rl Sfi21, 144 6 TAGTATTTTTCCCCGTGCCAAATTTTAAAACTACCCCCGCCCCfÀTAATCTAAAAfÂGGGÂT7AGG0 SfilS, 417 T|GTAT^TT|CCCHS3&ÄTiTMAABT&CCC^CCCT!EEEEfflHBTAGG| "ÏÏZ"

cos-site R3 R2 R3 Sfi21, 1506 GAGCCdCCACAPGGTGTS rTCTCTTGTCGCACGCAACTTTTGAGGATTTTTAAAGGGTGT Sfil9, 477 GAGCcJc|aCA)a|gGTGtB rTCTJ^GTC|Ci|c|CAA|TTTTGAGGATTTTTAAAGGGTGT

Start 152 Sfi21, 1566 C-TATACCAAACGGAGAGGAGTAATGATGApTGfGTTAAGAATCCATACTTCAAGCAGAAT Sfil9, 537 cHTAB^BJAAAcBGA^GGAGBBShWTJB^TG3t|aa|aATCCATACTt||5aAGCa]JJ}aAT Start 152

IR2 Sfi21, 162 5 tcggggcgcttaccaaecgaccctcc Sfil9, 596 tcggggcgBttaccaSBBIgaccctcc

FIG. 4. Comparison of the cos-region from Sfi2l and SfM9 phages. Tne stop codon from orf 175 and the alternative start codons from orf 152 are boxed. The 15 nucleotide 3' overhang from the cohesive ends is marked bv a large box; the region of twofold hyphenated rotational symmetry in Sfi2l

is marked by a smaller box. Direct repeats (R) are marked by lines, inveied repeats (IR) by arrows. Differences between Sfi21 and Sfi 19 are shaded. The Sfi 19 regions showing 94% bp identity with S. thermophilus plasmid pST1 (Janzen et ai, 1992) are bracketed.

64 is found downstream of a putative DNA replication After this region of diversification, the three phages module from phage L5 (Hatfull and Sarkis, 1993). The were closely related over the next three orfs. The orf 19

virulent cos-site phage Sfi 19 showed at this position a gpfrom 01205 showed similarity to orf 192 gp from Sfi 19 gene replacement by orf 67. Orf 67 gp resembled the N over the N- and C-terminal third, while the central part of terminus of orf 22 gp from the L. lactis phage rit (P orf 19 gp showed similarity to orf 19 gp from L lactis = 10~9). Orf 22 was located downstream of the putative phage r1t {P =10~4) (Fig. 7), The corresponding Sfi21 r1t DNA replication module (van Sinderen ef al., 1995). gene differed by only 6-bp changes. A similar mosaic-like Intriguingly, orf 67 was followed by a nearly perfect re¬ pattern of relatedness was observed for the next orf (orf peat of the Sfi19 origin of replication located downstream 170, 166, 20, Fig. 7). Over the C-terminal halves, the three of the putative primase gene (Foley ef al., 1998) (Fig. 8). predicted phage proteins were nearly identical and

TABLE 1

Comparison of the DNA Packaging and Head Morphogenesis Modules in the cos-Site S. thermophilus Phages Sfi21 and Sfi19

Orf Sfi21 Orf SH19 bp a a Ks Ka Comments

132 132 14 14 0.37 0.08

175 170 20 1B 0.71 0 11 D: 3

" 152 152 23 5 - 93 0.12

623 623 14 7 0 25 0.14 terminase

59 59 15 12 0 72 0.07 384 386 13 10 0.53 0.05 D: 6/portal protein 221 221 15 1b 0 60 0.06 ClpP protease 397 397 14 9 0.52 0.05 major head protein

Note. the in Orf, corresponding oils the two cos-site phages a ne indicated by then codon length; the orfs are enumerated according to their position on the genome (see 1), cf baseoai'' diftcerces between the map Fig. bp, percentage aligned DNA sequences from Sfi21 and Sfi19. aa, percentage of amino acid differences between the fic-n aligned protein seque-ces Sfi 19 and Sf! 11. Ks, number of substitutions per synonymous site. Ka, number of substitutions site. D per nonsynonymous Comments, deletions with the number of nucleotides affected, arbitrarily defined by reference to the genome of Sfi21 phage/attribution of genes. 28

238 LUCCHINI, DESIERE, AND BRUSSOW

54 55 56 O1205

Sfi19

TPJ-34

FIG. 5. Comparison of the lysogeny replacement module in the \rule~t cos-site S. thermophilus phage Sfi19 with the lysogeny modules from the temperate pac-site S. thermophilus phages O1205 and TPJ-34. The predicted oaen leading frames are annotated with their length in codon numbers and in the case of O1205 the numbering system descriced in Stanley e: a' (1997). Genes whose functions were identified by bioinformatic analysis are given with their genetic symbols. Regions of greater than 70% bo ide"t:v are connected by shading.

showed significant similarity to the arpR gene product of (Kodaira ef al., 1997). arpR and arpU belong to the same Enterococcus hirae (P =10~"12, 38% aa identity). Over the bacterial operon implicated in muraminidase export (Del N-terminal halves, the Sfi19 and Sfi21 proteins, but not Mar Lleo ef al., 1995), Database searches identified in the 01205 protein, demonstrated significant similarity this operon a further gene which encoded a protein with with possible transcriptional regulators from Bacillus matches to several bacteriophages from Lactococcus subtilis and Methanococcus janaschii as well as repres¬ lactis. Due to its many links to phage genes, the arp sor proteins from a number of phages infecting a large operon might be part of a prophage from E. hirae. range of bacterial hosts {Streptococcus, Lactococcus,

Lactobacillus, The cov¬ Bacillus, Escherichia). similarity DISCUSSION ered the helix-turn-helix DNA binding region of the well- characterized coliphage 434 repressor (Fig. 9A). The next The possible temperate origin of new virulent phages orf (orfs 98, 114, 21, Fig. 7) differed between the three isolated in the dairy industry has been a subject of phages only by 24-43 point mutations. speculation for years. This question is not of purely The following genome position was again variable. academic interest because successful phage control SfÎ21 showed an orf 152 without database link, this orf programs in industrial milk fermentation depend on was replaced by an unrelated orf 22 in O1205 that also knowledge of the source of new phages. In earlier re¬ lacked database matches. No gene was found at this ports, when only lactococcal phages were reasonably position in Sfi 19. Very similar proteins without database well investigated, it has been suggested that temperate match were predicted for the next gene in all three dairy phages do not contribute significantly to the emer¬ phages (orf 235, 236, Fig. 7). The SfÎ21 protein was more gence of new virulent phages, because virulent and closely related to the O1205 protein (9% aa differences) temperate Lactococcus phages are not significantly ho¬ than to the Sfi 19 protein (19% aa differences). Down¬ mologous (Jarvis, 1984a,b; Relano ef al., 1987). The stream of orf 235 two related genes were found in the present and previous works from our laboratory showed two cos-site phages (Table 1). Orf 175 showed similarity that this conclusion can not be generalized to other lactic to cos-site associated genes from a Lactobacillus and acid bacteria. In lactic streptococci, virulent and temper¬ Staphylococcus phage (Desiere ef al., 1999). The pac-site ate phages showed close genetic relationships (Desiere phage O1205 showed at the corresponding genome oo- ef al., 1998; Lucchini ef al., 1998). In addition, we de¬ orf 24 sition whose gene product showed significant scribed how the temperate phage Sfi21 gave rise to a similarity to the ArpLJ protein from Enterococcus hirae virulent phage by a single (Bruttin and Brüssow, 1996), (Stanley et al., 1997); ArpU showed significant similarity integrase-mediated (Bruttin ef al., 1997c) deletion event with orf 31 gp from Lactobacillus phage g1e, Orf 31 is in the lysogeny module. Similar deletion events occurred located downstream of the g1e DNA replication module under selective laboratory conditions in coliphage A

Sfil9-orf69 TP901-l-orf72

O1205-orf5

Sfil9-orf69

TP901-l-orf72 KVH 72 O1205-orf5 JKVQDIEHN 7 9

FIG. 6. Alignment of orf 69 gp from Sfi19 with the Cro-Ike leoiessc c :erss from tne pac-site temperate Lactococcus lactis phage TP901-1 and S. thermophilus 01205. Amino acid positions shared by two phage cc: ^s are shaded. 29

COMPARISON OF cos SITF PHAGES Sfi19 AND Sfi21 239

Mycobaetenophage LS E hirae gene 64 DNA binding domain arpR of several phage repressors

DNA prlmase Sfl21 )cos

Sfi19

01205

r1t ort19

f FIG 7 Alignment of the gene maps from ht rq g°nome e ds t i, oh u stages Sfi?1 Sfi19 and O1205 The predicted open reading

frames are annotated with their length in codcns The end ct hen 0 er o e^ s at ie rght ard is given by the cos u pac site Reg ons of ~--86% bp sequence identity are connected by siad ng The shadea wedges ia

(Davis and Parkinson 1971) In S thermophilus th s insertions deletions duplications and substitutions of deletion process is not limited to specific laboratory DNA as well as a large number of point mutations mainly selection because it was also observed in the v rUent over the structuial gene cluster Because we do not

field isolate S3 (Bruttin and Brussow, 1996) Despite all possess a molecular clock for phage genomes we can similarity in genome organization, Sfi 19 is not a simple not approximate what time interval separates the struc deletion derivative of Sfi21 as S3 A detailed comparison tural gene clusters from Sfi 19 and Sfi21 Notably ecolog

of both genomes showed that they were punctuated by ical surveys of 9 thermophilus phages covering an ob

iz_irr> oril —-CAGCAGTAGTGGTTYFGG^TACTCAGTAACCGAAGGrrACTGC-AACAAGTAACCGTA 57

on2 AAGAGCA—A"TGATTATGGTTACTCAGTAACCGAAGGrT —TGCCAAGAAGTAACCGTA 5 6

r w 4 ._ oril AAACCCATTGAAAATAAAGGGTTTCGGTTGCTTTAGTTACTTAGTTACTACTÏTTAAATA 117 ori2 AAACCCATTGAAAATAAAGGGTTTCGGTTGCTTTAGTTACTTAGTTACTACTTTTAAATA 116

oril TATTTATAATAAA1AAATAAATAAA rAlATATATATAGAGAGAGCCATCAAAAAAAGTAg 177 ori2 TATTTATAATAAATAAATAAATAAATATATATAlATAGAGAGAGCCATCAAAAAAAGTAG 17 6

********************************************^\**^***********

oril WM^élMtf-lfaMtlJIJTGGCCGGAAAr.CTTGATATArAAdGGrTTGTAGTGGTTAr.GAG 237 ori2 rAACTAAGTAACTAAAGTGGCCGGAAACCTTGATATATAAGGGTTTGTAGTGGITACTAG 236

I************************-********************************* * *

oril TAAAAGTAACTGTTACTl^t»iWrfrtiTdll>iW*fe.tMaiGAGAAAA.AAATGGAAATTCAATACT 297

ori2 TAAAAGTAACTGTTACTGTAATCGAGTAACAÄAAGGAGAAÄAAAAT&GAAATTCAATACT 296

************************************************************

oril TAGA&ArT

ori2 IAGAGATT

********

FIG 8 Alignment of the putative origin ot ntaqe rer I ca from M 9 f nd d »«- earn of erf 04 {or 1) and downstream cf orf 67 {on2) DNA identified in oril are indicted d f+e e d I "»s t i r repeat by tyica ~~ie ba" ifd a rows located he regions of significant sequence similarity (90 and 87% bp identity) with the S thermo)> s olasmid i°"1 a = <- 1912 30

240 LUCCHINI, DESIERE, AND BRUSSOW

A

vslLlVLKKiHBTSK:i:SRREA: LDTTEND B.subtJUs mycqHrcj iaIssyggySaeskk |ltdd TucZ009 MVIEQIHKYV(Jp|DYiK5r| potdstrmnvss1 V.cho]erae MFS|KIRCÏlvF.RCflNiEl DSETN TPJ-34 ---MDLNKQRG«IES| FGDNSSNN Lacr.ob. MASSCIGtjp|t:§ i-israryshl|nerne pSEKSHK -T rQQSlïCllNÏKTKRP-RF INGTSDSNVPFVG

hi

B

Sfi21-gpl70 LDTTENDSlTtdHAKtNEg [ISHj 'kgnfkiekmky' IRDVFLKPNDFDlPiMA 120 arpR MDF,|l|rR-VEcl SKDr acsskcmlkt: VAASLARK 1-|HG 47

Sfi2t-gpl70 BS 1 j pGVlrODHB 3TR 170 arpR i:|ak|ndiJlyB KlVYGvi SSSDLEEAE 103

FIG. 9. Orf 170 gp from Sfi21 has two distinct database matches. (A) Argument of the N-terminal part of orf 170 gp with the indicated phage repressor proteins. Sfi21, S. thermophilus phage; B. subtiiis, celluiar)oo/7 gens (phage related transcriptional regulator); Tuc2009, Lactococcus lactis phage; V. cholerae, cellular CTX phage; TPJ-34, S. thermophilus peace, _ac:cb., Lactobacillus phage g1e; E. coli, coliphage 434, (B) Alignment of the C-terminal part of orf 170 gp with the ArpR protein from Enterococcus 'viae.

servation period of several years in a highly dynamic strand as the rest of the phage genome. Only the latter industrial environment did not identify an accumulât on genes from the two pac-site phages contributed to the of point mutations (Bruttin et al., 1997a). Despite a dis¬ genome of the virulent S. thermophilus phage. tinctively different genome organization between phages Close genetic relationships between temperate and from low and high GC content gram-positive bacteria, tne virulent phages were also described in Lactobacillus processes that differentiate the virulent/temperate myco- delbrueckii. An alignment of the temperate phage mv4 bacteriophage pair (Ford etal., 1998) were fundamentally with the virulent phage LL-H (Vasala ef al., 1993; similar to those that differentiate the streptococcal phage Mikkonen ef al., 1996) suggested DNA deletion events pair. However, a marked difference was seen with re¬ over the lysogeny module followed by complex DNA spect to the lysogeny-related genes. In the virulent my- rearrangements as the origin of the virulent phage. In cobacteriophage D29, a 3.6-kb deletion removed the contrast to streptococcal phages, the virulent phage repressor gene that accounted for the inability of D29 to LL-H still possessed a truncated integrase gene and two form lysogens. An apparently functional integrase gene sites homologous to the mv4 attP site (Mikkonen ef al., was still found on the genome of the virulent phage (Ford 1996). In lactic Streptococcus phages, the separation of ef al., 1998). In fact, when the L5 repressor was provided virulent and temperate phages lines seems therefore to from an extrachromosomally plasmid, the Virulent" be more evolved because no remnants of essential ly¬ phage D29 could lysogenize its host. In the two lactic sogeny genes were detected. The changes in the repli¬ streptococcal phages, the virulent phage could concep¬ cation module could possibly indicate a further adapta¬ be derived from the tually only closest temperate phage tion to the virulent lifestyle. However, we need seguence a deletion of the entire Sfi21-like by precise lysogeny data from further virulent phages and biochemical exper¬ module combined with a complicated recombination iments concerning the role of the second origin to sub¬ process with at least two distinct pac-site temperate S. stantiate these suggestions. thermophilus phages. Two observations are noteworthy. Classically, dairy microbiologists have perceived the First, alignments of the lysogeny modules from different genetic conflict between the phage and the host lactic temperate S. thermophilus phages suggested that this acid bacterium as the major driving force for the evolu¬ in contrast to other region has, genome regions from this tion of the phage genomes. There are good arguments phage group, been shaped by numerous recombination for this view in the field of lactococcal phages (Garvey ef et Lucchini processes (Neve al., 1998; ef al., 1999). Sec¬ al., 1995). Lactococci possess numerous plasmids con¬ ond, the lysogeny module can be divided into two parts: taining phage resistance functions (Allison and Klaen¬ a of group essential lysogeny genes encoded on the hammer, 1998), e.g., restriction systems (Garvey et al., strand opposite (integrase, superinfection immunity, re¬ 1995). Notably, many lactococcal phages showed a con¬ and a of pressor) group genes downstream and up¬ spicuous avoidance of restriction sites m their genomes stream ot the core lysogeny genes that are on tne same (Moineau er al., 1993) and phages have been isolated 11

COMPARISON OF cos SITE PHAGES Sfi19 AND Sfi21 241

that had acguired bacterial genes to undermine cellular viruses to extrachromosomal elements Plasmids share resistance functions based on restriction systems important properties with temperate phages and indeed (Moineau ef al 1994) Little evidence for this type of can be generated by some phages Thus phage P1 phage host conflict has been documented for S ther which showed stronq sequence similarity with the Sfi?1 mophilus (Larbi et al 1992) Plasmids ate conspicuously antirepressor (Bruttin and Brussow 1996) lysoqenizes few in S thermophilus and few have shown an anti as a olasmid Also phage A can multiply as a plasmid phage function (0 Sullivan ef al 1999 L Morelli oei after mutation or larqe deletions Integrative plasmids sonal communication) We documented even a poss ble resemble temperate phaqe genomes in their ability to indication for coadaptation of the bactenal and onage integrate into the bactenal chromosome It was pro genome the bacterial attachment site [attB] containea a oosed that evolution proceeded from simple plasmids DNA sequence that complemented the C terminus of me which direct only their own replication to more complex phage integrase that would otherwise have been o\\-, Plasmids (F and R factors) then to those bactenocms rupted by the prophage integration process (Brutt n ef al that showed the structure of complex phaqe tails and 1997c) The genome comparison of Sfi21 and S 19 suq anally to complete phages Do we have m S thermophi gested that the genetic conflict between temperate ano us with plasmid oST1 and phage Sfi 19 representatives virulent S thermophilus phages might be a furthei f i rt tor both ends ot this evolutionary sequence'? Does S the major driving force for phage genome evolutot n thermophilus contain intermediate stages of this hypo this group of bacteria Once again we need additcnal thetcal scenario that could make this sequence of temperate/virulent S thermophilus phage genome com events more likely? pansons and ecological phage surveys outside of the dairy environment to substantiate this hypothesis METHODS Virologist have been intrigued by the questions where Phages, strains, and media do viruses belong in the biological v\orld? What are their and closest relatives? At what did their origins point The phages were propagated on their appropriate S evolution start to from that of the el° diverge genetic fhermophilus hosts in lactose M17 broth as described ments now found in cells'? Of all living things viruses are previously (Brussow and Bruttin 1995 Bruttin and Brus least to have a because likely monophyletic origin they sow 1996) E coli strain JM 101 (Stratagene) was grown in the of anoints o4 always replicate presence large n LB broth oi on LB broth solidified with 1 5% (w/v) agar nonviral nucleic acids which can become ncorpont°d Ampicillin IPTG and Xgal (all from Sigma) were used at into viral The bioinformatic of the genome analysis ti t concentrations of 100 //.g/ml 1 mM and 0 002% (w/v) genomes from 9 thermophilus phages has not ident4" ea respectively a single case where a bactenal gene and not anct1 m phage gene was the closest relative of a S them och l s DNA techniques phage gene This conclusion also applies to the CloP Phage purification and DNA extraction were done as like putative protease gene fiom Sfi21 CIdP protease-, descr bed previously (Brussow and Bruttin 1995 Bruttin are common constituents of prokaryotic and eukaryot c and Brussow 1996) Plasmid DNA was isolated using cells and have not been described in any viral system Qiagen midi plasmid isolation columns Restriction en However phylogentie analysis of the Sfi21 CIpP like pro zymes were obtained from Boehrmger Mannheim and tein did not demonstrate a close relationship with bac used according to the suppliers instructions tenal CIpP proteins (Desiere ef al 1999) In another

case a plasmid and not a phage gene turned out tob1 Sequencing the nearest database match Sfi21 orf 504 gp showed

strong similarity to a putative DNA pnrnase from tne DNA sequencing was started with universal forward cryptic plasmid pWS58 of Lactobacillus delbrueck i (DQ and reverse primers on pUC19 or pNZ124 shotgun siere et al 199/) In addition we detected sequence clones and cont nued with synthetic oligonucleotide (18 similarity between the putative minus origin from the mer) primers (Microsynth Switzerland) Both strands of cryptic S thermophilus plasmid pSF1 and the phage the cloned DNA were sequenced by the Sanger method origin located directly downstream of orf 504 (Des ere ef of dideoxy mediated chain termination using the fmol al 1997) Intnguingly the noncoding DNA surrounding DNA Sequencing System of Promega (Madison Wl) The the cos site of Sfi 19 but not of Sf 21 showed strong sequencing primers weie end labeled usinq [y-33P]ATP similarity to a closely adjacent (80 bp distance) no icod according to the manufacturers protocol The thermal ing region of the same plasmid pSl 1 is a small c rypt c cycler (Perron Elmer) was Drogrammed at 30 cycles of

plasmid that encodes only its lephcase (Janzen er ai 95 C foi 30 s 50°C lo\ 30 s and 72°C for 1 mm 1992) The sequence similarity between S thetmepi Its In add tion pUC19 clones of Sau3A digested phage phages and plasmids is also interesting in v eyv ot clas S i21 DNA were sequenced using the Amersham Lab sical hypotheses that linked the evolution of bactenal station sequencing kit based on Thermo Sequenase 32

242 LUCCHINI DESIERE AND BRUSSOW

labeled primer cycle sequencing with 7-deaza-dGTP coccus thermophilus bacteriophage infections in a cheese factory Appl Environ Microbiol 63, 3144-3150 (RPN2437) Sequencing was done on a Licor 6000L au Bruttin, A Desiere f Lucchini S Foley S and Brussow H (1997b) tomated sequencer with fluorescence-labeled universal Characterization of the lysogeny DNA module from the temperate reverse and forward pUC19 primers Sf/eotococcus thermophilus bacteriophage Sfi21 Virology 233, 136- 148

PCR Bat in A Foley S and Boissov H (1997c) The site specific intégra tion sys em cf tie t->mpeie e Streptocoi eus thermophilus bacteno PCR was used to which were not ob span regions Dhige c6Sfi21 V o ay 23/ 1 8-158 tamed through tandom cloning PCR oroducts were gen Corpet e (ijb8) Vultoleseqie cc al gnment with hierarch cal dis erated using the synthetic oligonucleotide pair designed tenia NucecAcds Res 16 10831-10890 Ja\ s R V\ aid Pa Kinsoi J S (1371) Deletion mutants of bacte io accoiding to the established Sfi21 DNA sequence pun pi -> J lambda III Physical structure of att J Mol Biol 56 403 479 tied DNA and phage Super Faq Polymerase (Stehelin N Del ar Ifco M Fontana R and Solioz M (1995) Identification of a PCR were Basel, Switzerland) products purified using = je (aioU) controlling muraminidase 2 export in Enterococcus the QIAquick-spin PCR Purification Kit I i I Bac end 177 5912-5917

Des e e F Lucchini S and Brussow H (1998) Evolution of Strepto Sequence analysis coccus thermopl I is bactenophage genomes by modular ex ci3! 3es followed by poir t mutations and small deletions and inser The Genetics Computer Group sequence aielyss fms Virology 241 345 356 "^es ere V Luce hni S and Brussow H package (University of Wisconsin) was used to asse Tbl» (1999) Compaiative sequence il\sis cf the DNA packaging head and tail morphogenesis mod and analyze the sequences Nueleotd° (nt) ano ore lies tne e ripera te coi site Streptococcus thernopl us bacteno dieted amino acid (aa) sequences were compared w *h Liad^Sf21 iology260 744 253 those in the databases Release 109 EMBL ^ (GenBank, Je-, en Li echini S Bruttin A Zwahlen M C and Brussow H (Abridged) Release 56, PIR Protein Release 57 SWISS (199 7) A highly conserved DNA replication module from Streptococ PROT, Release 36 PROSITE Release 15 0) using FastA c 9 theimoph lus phages is similar in sequence and topology to a module fiom Lactococcus lactis phages Virology 234 372-382 (Lipman and Pearson 1985), and BLAST (Altschul ef a roley S Lucchini S 7wahlen M C and Brussow H (1998) A short 1990) programs Sequence alignments were performed loncodinq viral DN^ element showing characteristics of a replica using the CLUSTALW 174 method (Thompson ef a tion origin confers bactenophage lesistance to Stteptococcus ther the and th^ SIM 1994), Multalign program (Corpet 1988) mophilus Vnology 250 377 387 alignment tool (Huang and Miller 1991) o d VI E Sarkis G J Bélanger A L Hendnx R and Hatfull G F Genome structure of D29 The complete Sfi 19 and Sfi21 genome sequences (1998) mycobactenophage Implications for phtge evolution J Mol Biol 279 143-164 were deposited in the GenBank database unapt tl e Garvey P van Sindt en D Twomey D P Hill C and Fitzgerald G F Accession No AF115102 and AF115103 (1995) Molecular genetics of bactenophage and natural phage de fence systems in the genus Lactococcus Int Dairy J 8 905-947 ACKNOWLEDGMENTS Guedon G Bouiaon F Pebay M Roussel Y Colmin C Simonet J M and Decans 3 (1995) Characterization and distribution of two

Special thanks go to D van Smderen who shared l^du I ai m insertion sequences IS1191 and iso IS981 in Streptococcus ther with us that demonstrated that \ n ne c ^ s = sequence similarity mophilus Does i terqenic transfer of insertion sequences occur in

S thermophilus phaaes is a widespread prooe ty of s / e tV lactic acid bacteiia co cultures? Mol Microbiol 16 69-78

We thank the Swiss National Science cor Ida c. fc Plasmids it Hatfull G FandSarks fa J (1993) DNA sequence structure and gene financial of F Desiere and S Lucchini in the f a levc vcf -> support expression of mycobac enophage 15 A phage system for mycobac Module Biotechnology (Grant 5002 044545/1) and S Fole\ for c t I tenal genetics Mil M crobiol 7 395-405 of the reading manusenpt Huang X and Mille U (1991) A time efficient linear space local similarity algorithm Adv Aop Math 12 337-357

?en T Kleinschmidt J Neve H and Geis A REFERENCES (199?) Sequencing and charactenzatio i cf ob M a cryptic plasmid from Streptococcus FL M \1irobol L"t> 95 175-180 Allison G F and Klaenhamme T R (1998) Phage esista ice neci thermophilus cvs A Oiff^ en latior of lactc anism<; in lactic acid bac er a Int Di i\ J 8 e-0/ °6 (1984a) °treptococcal phaoes into

= ma es DNA Altschul S F Gish W Miller W Vyt rs E W 1 du D 1c>9n ie spec by DNA homology App Env ion VI crobiol 47 i g4Q Basic local ahgim°rt search tool J Mol B i 215 4 1 1 A Brussow H and Bruttin A (1995) Cha acte zatoi fa ei" Jarvs (198 rb) DNA DNA Homology be'ween lactic streptococci and

h°n temne ^ e aid Envron Streptococcus thermophilus bactenophage md s ge e c relc c lyf chages Apnl M crob oi 47 1031- ship with lytic phages Virology 212 632 6^0 1033

Brussow H Fremont M Bruttin A Side i j Canst pje A a"d Josepisei J end Neve H (1998) Bacteriophages and lactic acid

Fryder V (1994) Detection and classifica+ioi of Sf eDfococc is tl " bacteria In L act c At id Bacteria Miciobiolocy and cunctional As

mophilus bacteuophages isolated from mdust lal i" Ik fe me etc pects (S Sa minen and A V Wright Eds) p 386 436 Marcel Appl Environ Microbiol 60 4537 4S43 Dekker New York

Bruttin A and Brussow H (19961 Sitesiecft sooi ° leous dci° s kodaira K I Oki M Kakikawa VI Watanabe \ Huakawa M

in three genome regions of a temoerate St ep o occ jst e n u Y Tilda K fTakeo A (1997) Geioie s ructure of the lactoba 219 phage Virology 96-104 Il s enofc ate ohage <7>g1e The whole genome sequence and the Bruttin A Desiere F dAmico N Guerin ) P Side i J ut R c tve p omoter recressor system Gene 187 45-53 Lucchini S and Brussow H (1997a) Molecuh ecoloryc1 s l3 di D D^ca is B and Simonet J M (199?) Different bacteriophage > a

COMPARISON OF cos SITE PHAGES Sfi 19 AND Sfi21 243

resistance mechanisms in Streptococcus salivanus subsp ther Neve H Zenz K I Desiere, F Koch A Heller K J and Brussow H J Res mophilus Dairy 59, 349-357 (1998) Comparison of the lysogeny modules from the temperate D Lipman J and Pearson W R (1985) Rapid and sensitive protein Streptococcus thermophilus bacteriophages TP J34 and Sfi21 Impli searches Science 1435-1441 similarity 227, cations foi the modular theory of phage evolution Virology 241, r Le Marrec C van Sinderen D Walsh L E Stanley Vlegeis 51-7? Moineau S Heinze P Fitzgerald G and Fayard B (1997) 1v o i Sullivan T van Smderen D and Fitzgerald G (1999) Structural and groups of bacteriophages infecting Streptococci s theimophi'us ca i fuictioi al analysis of pCI65st a 6 5 kb plasmid from Streptococcus be distinguished on the basis of mode cf packaging and canot)r Wem ophilus NDI 6 Microbiology 145, 127-134 determinants for major structural proteins Appl Environ M u^, Pievc i P Relano M Mata and Ritzenthaler P (1989) Close 63,3246-3253 teiationsniD of vi ulent bacteriophages of Streptococcus ssltvan is Lucchini S Desiere F and Brussow H (1998) The stttt i al cone ^uosp thermophilic at boji the protein and the DNA level J Gen module in Streptococcus thermophilus bacteriophage Still shews a M cro^ol 135 3337-3344 hierarchy of relatedness to Siphovindae fiom a wide ra c^ of Dt-c Ré lai o P Mata Bonneau M and Ritzenthaler P Mo tenal hosts Virology 246, 63-73 M, (1987) Desiere F and Biussow H The leoihr characterization and comparison of 38 virulent and fem Lucchini, S, (Submitted) lysoge -, modules from temperate Siphoviridae of low GC gram positive bac Dei ate bacteriophages of Streptococcus lactis J Gen Microbiol tena share a common genetic organization Virology 133 3053-3063 Madsen, P L and Hamnei K (1998) Tempt ral transciiption of I Shimizu Kadota M Sakurai F and Tsuohida N (1983) Prophage and DNA lactococcal temperate phaqe TP901-1 sequence of the ongm of a virulent phage aooeanng on fermentations of Lactobacil 144 2203-2215 early promoter region Microbiology 1rs casei S I Appl Em iron ¥icobiol 45, 669-674 Mercenier, A (1990) Molecular geretics of S'ieofococcus therms ' Stanley E Fitzgerald u F 1 e Marrec C Fayard R and van Smderen lus FEMS Microbiol Rev 87, 61-78 D (199/) Sequei e anakbis and charade ization of O1205 a tern ~ Mikkonen M Dupont L Ala*ossavQ a id Ritzenthaler P (1Q » > perate bacterid h ige infecting Streptococci s thermophilus Defective site soecific ir tegntion elements are present in tJt2 -^ CNRZ1205 Muobro^ji 143,3417-3429 nome of virulent bactenophage LL H of Lactobacillus delb °c< Tnompson J D H ogms D G and Gibson TJ (1994) CLUSTAL W Appl Environ Microbiol 62, 1847-1851 improving the sensi ivity of progressive multiple sequence alignment Moineau S Pandian S and Klaenhammer T R (1993) Rest ic ic through sequence weghing position specific gao penalties and modification systems and restriction erdonucleases are mere eff°" weight matrix choice Nucleic Acids Res 22,4673-4680 tive on lactococcal bacteriophages that have emerged recently r => van Smderen D Kaisens H Kok J Terpstra P Ruiters M H dairy industry Appl Environ Microbiol 59, 197-202 Venema G and Nauta A Moineau S Pandian S and Klaenhammer T R (1994) Evoluioitf^ (1996) Sequence analysis and molecular characterization of the temoerate lactococcal ri t Mol lytic bactenophage via DNA acquisition from the Lactococcus 5 bacteriophage Microbiol chromosome Appl Envron M crobiol 60, 1832-1841 19, 1343 Urö Neve H Krusch U Teuber M (1989) Classification of virulei t bec Vasala A Dupont I Rai nann VI Ritzenthaler P and Alatossava T

tenophages of Streotococcus sain irrus subsp thermophilus isoh ea (1993) Molecular comoartsoi of the structural proteins encoding from yoghurt and Swiss-type cheese Apol M ciob oi Botet I 30 qene clusters of tv\o tela ed Lactobaci lus deloiuecku bacteno 624-629 ohiqes J Vuol 67 3061 3068 34 2.2 The structural gene module in Streptococcus thermophilus bacteriophage OSfi11 shows a hierarchy of relatedness to Siphoviridae from a wide range of bacterial hosts

(Virology 246, 63-73(1998)) 36

VIROLOGY 246, 63-73 (1998)

ARTICLF no VY989190

The Structural Gene Module in Streptococcus thermophilus Bacteriophage

a Hierarchy of Relatedness to Siphoviridae from a Wide Range of Bacterial Hosts

Sacha Lucchini Frank Desiere and Harald Rrussow'

Nestle Research Centei Nestecitd \°rs cbe? les Blan C H 1000 Lausanne P6 Switzerland

Received March ) 9QS ecceoted Ao il lb I99S

The structural gene cluster and the lysis module from lytic group II Streptococcus thermophilus bacteriophage (/>Sfi11 was compared to the corresponding region from other Siphoviridae The analysis revealed a hierarchy of relatedness r/TSfil 1 differed from the temperate S thermophilus bacteriophage «-,601205 by about 10% at the nucleotide level The majority of the changes were point mutations, mainly at the third base position Only a single gene (orf 695) differed substantially between the two phages Over the putative minor tail and lysis genes, Sfi11 and the lytic group I S thermophilus <-ASfi19 shared regions with variable degrees of similarity Orf 1291 from <#Sfil9 was replaced by four genes in

and orf 695) showed a complicated pattern of similarity and nonsimilanty compared with Sfi19 The predicted orf 695 gp resembles the receptor-recognizing protein of Teven cohphages in its organization, but not its sequence No sequence similarity was detected between <7Sfi11 and Sfi11 and Siphoviridae from gram-positive hosts and correlated with the evolutionary distance between the bacteria! hosts Our data are compatible with the hypothesis that the

structural gene operon from Siphoviridae of the low G + C group of gram positive bacteria is derived from a common

ancestor » 1998 Academic Pre«

INTRODUCTION 0 05% at the DNA level In fact, the range of diversity covered by a single bacterial species is greater than that One of the seminal conceptual advances in contem¬ between different biological genera, e g man and chim porary biology was the populational definition of the panzee differ by about 2% at the DNA level Neverthe¬ biological species In this concept a species is defined less the chromosome of a given bacterium constitutes a as a group of individuals belonging to a closed nter- coadapted complex that aoparently evolved separately breeding population whose genes can be considered a despite the existence of pathways for gene transfer with common pool The essence of speciation lies 11 the other bacteria Iheiefoie m practical terms the evolu reproductive isolation of members of a given spûc es

" tionary definition of a bacterial species is commonly The notion of a gene pool makes sense only the accepted The species concept in is a debated members of that pool reeombine at a significant täte virology issue 1996 Ackermann and DuBow The under natural conditions For the gene pool to be closed (Murphy, 1987) oresent universal of virus defined ar¬ gene flow between members of the species and other system taxonomy hierachical levels of order sources should ideally be 0 or at least be small com¬ bitrarily (-virales), family (-viri- pared to intraspecific recombination (Camobell 1988) dae) genus species and strain The taxonomy of animal The species concept is less well rooted in prokaryotes viruses is lelatively developed (Murphy, 1996), as dem¬ onstrated the There are fundamental problems with the species con by following example oider Mononegavi- cept in bacteria since accessory DNA elements like rales family Paramyxoviudae, genus , spe¬ plasmids, transposons, and phages are disseminated cies measles virus stiam Schwarz Most families of among bacteria that aie very distantly related taxonom- viruses have distinct morphology, genome structure, and

ically (Ochman and Lawrence, 1996) In addition a bac¬ strategies of replication I he virus family is being recog terial species like Escherichia coli covers strains that nized as a taxon uniting viruses with a common, even if

might vary by about 5% at the sequence level (Milkman distant phylogeny Consequently the relatedness for ex¬ 1996, Whittam, 1996) Biologically defined soec es cover ample between Paramyxoviudae was studied by Phylo¬

much less sequence diversity eg it is estmatea that genese tree analyses based on nucleic acid or amino different individuals of Homo sapiens mignt differ by acid sequences (Griff-n and Bellini, 1996) In contrast, the relatonship between different families within the Mono-

negaviiales (e g Rhabdovindae and Paramyxoviudae) is

seen at the To whom correspondence aid reprin reques*s sicild only genomic organization level and not any dressed Email Harald BruessowTrGHtSM "4ESTRD C i lonqei at th° sequence level (Strauss ef ai, 1996) Ihe

0042 6822'98$25 00

Copyr ght o U )h> bv Academ c Pros11;

All rghts of reoroducton in any form reserved 37

64 LUCCHINI, DESIERE, AND BRUSSOW molecular taxonomy of bacterial viruses is much less morphogenesis module The relatedness extended to the developed The International Committee on Taxonomy of sequence level for bacteriophages from the low GC group Viruses classified bacterial viruses on the basis of two of gram-positive bacteria The implications of these findings criteria at the first level the genome type (DNA or RNA for the understanding of Siphoviridae evolution are dis¬ double-stranded or single-stranded) and then at the level cussed in the framework of the species concept of phage morphology In the group of banter oDhages with double-stranded DNA genomes eight morphological RESULTS types are currently distinguished (Fauquet, 1997) Nu¬ merically by far the most prominent are three groups r/5Sf111 and r/>Sfi19 are clearly distinct phage types phages with contractile tails (Myoviridae, prototype T^ We tested the host range of our prototype lytic grouo I phage), long and noncontractile tails (Siphoviridae pro¬ phage <-/>Sfi19, and our prototype lytic group II phage, totype phage lambda), and short tails (Podovmdae pro¬ f/>STi11 (Btussowefa/, 1994a) on a total of 226 distincts totype T7 phage) Approximately half of the about 3000 the mophilus strains Not a single strain was lysed by known bacteriophages belong to the group Siphoviridae both phages Furthermore, 130 S thermophilus phages (Ackermann and DuBow, 1987) Much less is known +ioti another phage collection (H Neve, Kiel/Germany) about the phylogenetic relationships of bactenal viruses were tested on our lytic group I and II indicator cells. Not than those of animal viruses Compaiative data are a single phage infected both indicator cells Apparently, mainly available for phages belonging to restricted !\te group I and II phages oiffei in a fundamental way, phage groups like lambdoid phages (Botstein, 1980 result ng in two nonoverlapping infection patterns Campbell and Botstein, 1983) oi T-even phages (Monoo Puither results confirmed the difference between ef al, 1997) In fact, many bactenal virologists see the r/tSf 11 and r/»Sfi19 First, the two phages differed in the populational definition of a virus species as clearly inap¬ i°utml zation of their infectivity by hyperimmune sera plicable to bacteriophages Some virologists imagine defining two clearly separated serotypes (Biussow et al, that homologous recombination within a phage popula¬ 1994a) Second, the phages differed in then polypeptide tion is so frequent that the basic unis of selection are not composition when CsCI gradient-purified phage parti¬ the individual phage particles but rather the segments of cles were compared by SDS -Polyacrylamide gel electro¬ their genomes that are interchangeable by recombina¬ phoresis (data not shown) Tnird, the two phages differed tion (Casjens etal, 1992) Even more radical, others see m tail morphology tails from <-/>Sfi11 were thinner than illegitimate recombination as such a dominant force that tails from ^Sfi19 and the striation of the tail was less phage genomes can best be regarded as mosaics of evident in r/SSfiH In addition, amole numbeis of tail fibers genes from various nonphage sources (Campbell 1988) were detected in r/>Sfi11 (Fig 1B), while we never ob¬ To address these questions, we have started compara¬ served tail fibers in preparations of <7>Sfi19 (Fig 1A) The tive phage genome seguencing projects in our laboratory latter observation should be interpreted with caution Our laboratory is interested in bacteriophages of Sfeoto- since S thermophilus phages are unstable in CsCI gra coccus thermophilus, a gram-positive lactic acid oacter urn dients, as exemDhfied by the isolation of <7JSfi11 heads used extensively in industrial milk fermentation (Mercen ei lacking tails (Fig 1C) oi loss of phage particles during 1990) These studies were motivated by the inoustra! aim of purification (Fayard etal, 1993) Interestingly, the tailless developing phage-resistant bactenal startet cultures r/>Sfi11 particles lacked, in addition to some minor pro¬ Therefore, we have to understand the natuial vanab'l ty of teins the 27-kDa major protein (data not shown), sug¬ S thermophilus phages Previously we have class fied gesting that this protein is likely to be a tail protein these phages by different taxonomic cntena leading to the definition of different lytic groups (Brussow ef al 1994a DNA sequence of c/>Sfi11 Brussow and Bruttin, 1995) Two membeis of lytic group I were analyzed in sequencing projects (Brussow ef c5/ A 24-kb DNA segment from <$3fi11 was sequenced, cor- 1994b, Bruttin etal, 1997b, Desiere ef al 1997, 1998) Here resoonding to about 60 % of the genome When only ATG we describe the genome sequence of a tepresentative Mic start codons were accepted, 33 open reading frames (orf) group II phage and compare it, first, with more or less longer than 60 aa were detected Twenty-three orfs re¬ related Siphoviridae from the same host species S V ei- mained when those located within or opposite to orfs that mophilus (Le Marrec ef al, 1997, Stanley ef al, 1997) then showed similarity to entries from the database were sub¬ with Siphoviridae from a related host Lactococcus lactis tracted (Fig 2) With one exception, all orfs were located on ef van Smderen cf al (Chandry al, 1997, 1996, Johnsen ef the same stiand The overall genetic structure of this region and al, 1996, Boyce etal, 1995a), finally with Siphovindae was very dense The start and stop codons of three groups from hosts showing decreasing phylogenetic relatedness of genes overlapped, indicating potential translational cou¬

{Bacillus subtilis, Mycobacteria Streptomyces Escherichia pling (Fig 2) Translational coupling (Draper, 1996) is a with S coir) thermophilus (Becker ef al, I997, Anne ef a1, posttranscription mechanism that helps ensure balanced Hatful and This 1990, Sarkis, 1993) approach revealed a oroduction of polypeptides that function as part of a multi- common genome organization ot these pnages over the comoonent complex 38

EVOLUTION OF SIPHOVIRIDAE 65

Electron of S. micrographs thermophilus bacteriophages

If entries from S, thermophilus phages were excluded, 001205 is temperate) and in host range (001205 was 20 of the 23 orfs showed similarity to predicted proteins unable to multiply on any of our S. thermophilus strains from the database. Without exception the similarities including a number of lytic group II strains). However, were to proteins from bacteriophages. Most similarities except for one gene (orf 695, see below, Comparison were to from proteins phages infecting taxonornically with 0Sfi19) 0Sfi11 showed very similar genetic organi¬ related bacteria such as Streptococcus pneumoniae, L zation to 001205. The genetic similarity between the two lactis, and B. subtilis. However, two proteins showed phages extended to the nucleotide level (Table 2). Over¬ similarities to bacteriophages from distantly related all, an average base pair change rate of about 10% was gram-positive bacteria, Streptomyces venezuelae and calculated. The majority of the changes were point mu¬ tuberculosis Mycobacterium (Fig. 2). tations, mainly at the third base position, but an 8.1% Several showed adjacent <^Sfi11 genes similarity to a average aa change rate was still observed for the pre¬ cluster from gene the L lactis phage TP 901-1 (Table 1), dicted gene products. Over the putative lysis cassette, the in the size and the Interestingly, similarity topological 0Sfi11 was more closely related to lytic group I phage organization of these two phage gene clusters extended to 0Sfi19 (Desiere ef al., 1998) than to 001205 (Table 2, Fig. no from genes showing sequence similarity, e.g., mhp r/>TP 4), An interesting case is the unattributed gene preced¬ 901-1 for a coding 348-aa-long protein and orf 348 from ing the holin genes, where the proteins predicted for the 0Sfi11 (see below, Conservation of genome organization). two temperate phages differed from those predicted for of the Upstream lysis cassette, bioinformatic analysis the two virulent phages by a 14-aa internal deletion (see revealed genes coding for likely phage structural proteins. Fig. 5 in Desiere ef al., 1998). These genes are commonly clustered in phage genomes. The molecular weights of the six minor structural proteins Comparison with 0Sfi19 estimated from PAGE {Fig. 3) showed a reasonable match Figure 4 shows an alignment of the partial 0Sfi11 to proteins predicted for all but one of the larger orfs from gene map with the corresponding region of lytic group the v6Sfi11 DNA segment analyzed (Fig, 2). i S. thermophilus phage 0Sfi19 (Desiere ef al., 1998). The two phages showed if Comparison with 001205 comparable genetic maps one postulates a fragmentation of orf 1291 from 0Sfi19 S. thermophilus 001205 (Stanley ef al., 1997) and into separate orfs in 0Sfi11 (orf 1000, 373, 57, and 695), differ in is a virulent 0Sfi11 lifestyle (0Sfi11 phage, while which is supported by the sequence similarity data 39

66 LUCCHINI DESIERE, AND BRUSSOW

Streptococcus thermophilus

001205 MPS MPLMPM 58kDa 155kDa 58kDa 112kDa 78kDa 75kDa (60kDa) <60kDa) (70kDa) (120kDd) (82kDa) (74kDa) y v ^ 129 119 53 104128 117 141 502 284 193 348 113 114 168 105 1510 512 1000 373 57 695 669 149 87 288

2 1 3 3 3 3 2 13-3

A 5000 15000 20000 10000 24000

head protein 1 OVWB minor tail protein tail Streotomvces ILS orfl orf2 holm holm portal minor head protein lysin Venezuelas 1 *EJ-1 1 protein protein *SPP1 Mytobac'aium r1t *TP901 1 crlt iIBK5T

Lactococcus lactii

FIG 2 Prediction of open reading frames in the 24 kb f icme tef A3fill The orfsw^re maiked above the arrows with their length in aa the reading frame is indicated below the arrow Tilled ar cws indica'e orfs pvHad 1 1 sfandd>d nbosc nal binding site Overlaps of orfs are marked by a filled circle and overlaps of start and stop codo is are ma ked by fiilea n ces Tie rule piovides he nucleotide scale Similarities to proteins fiom the database are indicated by shadings Wheie the shading does 1 of c\e e > -He Sfi11 gene only paitof the gene showed similarity to the database entry The boxes at Die bottom of the figure identify the similarities ere ~ao e t for the similarity with

^ the high molecular weight minor proteins are suggested in the top lines c t i -> fiqire he gene products are indicated by their calculated mass in kilodaltons the obseived masses of the iarger minor proteins are given r a'erheses References Streptococcus pneumoniae 1 (Sheehan ef al 1996) lactocu s he s c/ßK^T {Boyce et al 1995) 0TP9O1 1 (Johnsen etal 1996) <^r1t

< (van Smderen of al 1996) and ç6blL67 (Sehr tiler et al 1994) Bacillus s 1 i

(Fig 4) In the left half of the aligned maps the two S1 larity to a phaqe from a different bactenal genus Over thermophilus phages showed no sequence similarity the right part of the maps a variable degree of se¬ while each of these phages showed sequence simi quence similarity was observed between the two

phages Identity exceeding 97 % at the aa level (98 1%

bp identity) was found between the genes of the pu¬ TABLE 1 tative lysis cassette (orf 131 to 289) The transition

Similarity of the Indicated Gene Products from TP901 1

Sfi11 orf r/>1P901 1 f/>TP901 1 orf Ida c \ -160 (length in aa) gene (length in a >) il g ed c-n r |L- -120

113 t1 110 2fa/b8 1 125 "82 104 b3 103 33/102 or 2/4 1 14 a1 112 32/109 10" J 70 -60 128 X 129 37/126 10-

168 mti 1t9 64 1o7 10"

117 s2 -11 22 8b 0ÛC21

105 02 ro 29 ^2 0 -44 1310 c4 >20 3 "33

° Note The fust column gives the length of tf rfSfi11 l f 1 i° Lf encoded amino auds The orfs were lis ed acco di o o i° i d roi the "> coire¬ -2/ sponding C/3TP901 1 gene in the terminology cf Jihnson e» / ( 996) mtp means major tail protein The lengths of the orf n 1 umbei ot encoded amino acids are given in *he third lolumi The fou fh cok ma gives the number of identical aa for the two protens comcared ^Hie indicated of length computer aligned ai (i a 1 ot sop icable s ce -p FIG 3 SrfiHuml oolypeprfdes in CsCI density giadient purified alignment was over several noncontiguous segments! Tieff iciiu^ <6Sfi19 ( ight lanef The g°l was stained with Coomassie brilliant blue the v^ gives probability derived from B1 ASFP sco e fci c it n ire a nci ei i lar Türkeis ere fiom 993/83 (lef+ lane Brussow et al chance by 133?) Molea iai masses aie given in kilodaltons 40

EVOLUTION OF SIPHOVIRIDAE 67

TABLE 2

Comparison of the Indicated <£Sfi11 Gene and Gene Product with the Corresponding Element from 001205 and Analysis of the Differences

Distribution of bp differences Gaps Deletion/insertion

502 502 10.4 7.2 H 2 284 297 10.2 8 5 H 2 0:18,21

193 196 5.9 5 2 H 1 _

119 119 16.2 11.0 G3' 2 —

348 348 7.0 4.0 H 0 _

113 113 17.4 16.0 H 1 _

104 104 5 I 8.0 C:3' 0 —

114 114 0.0 0

128 128 3.9 5.0 G3' 0 —

168 168 52 2.0 H 0 _

117 117 1.1 2.0

105 105 98 3 0 H 1 —

1510 1517 7.7 7 5 Om iddle 14 D:21

512 512 8.1 55 H 2 ._

1000 1006 75 5b H 1 D:18

373 373 6.9 695 843 High High C Many Mosaic

669 669 9 9 80 G3' 8 — 149 117 131 13 7 C3' 0 I:3G

141 141 52 5 0 C:5' 0 —

80 80 21.6 22 5 H 1 —

281 281 19.5 17.8 H 9 ._

Note. ORFSfi11 genome (Fig. 3). bp differences: The percentage of basepair differences between the corresponding orfs of the two pnages. aa differences: the percentage of amino acid differences between the two proteins. Distribution: The distribution of the basepair differences in the compared orfs was classified as homogeneous (H) or clustered (C); the location of the clustered base pair differences at the 3' or 5' end or the middle of the orf is given. Deletion: deletions (D) and insertions (I) observed in the alignment of the corresponding orfs. The numbers give the length of the DI in oaseoairs deduced from the SIM alignment. If more than one number is given, multiple D/l were observed. The distinction of D/l is arbitmniy based c~ the r/>Sfi11 sequence.

zone from very high to moderately high sequence to orf 1 gp from S. pneumoniae phage Dp-1 (Fig. 4). In similarity coincided with the start codon of orf 131 in contrast, the short (20 to 70 aa), interspersed segments 0Sfi19. A complex pattern of similarity was observed 3, 5, and 9 from the 0Sfi11 protein demonstrated >80% between orf 1291 gp from 0Sfi19 and its complements identity with the 0Sfi19 protein, while the 001205 protein in 0Sfi11. Two adjacent N-terminal segments of the orf was only distantly related. Segments 3 and 5 demon¬ 1291 gp showed similarity to the N- and C-terminal strated strong similarities to L lactis phage c2 and BK5-T parts, respectively, of the orf 1000 gp (Fig. 4). These proteins. Finally, the 130- to 150-aa-long segments 4 and two were shared with L. lactis BK5-T segments phage 8 showed less than 15% aa identity between the three S. (Desiere etal., 1998). In addition, part of the interven¬ thermophilus phages, while the 0Sfi19 protein showed in orf 1000 showed with a ing segment gp similarity moderately high sequence identity (38%) to the tail tip minor tail from S. protein pneumoniae phage Cp-1 protein from the lactococcal phage blL67 (orf 35 gp). A (36% aa P = 10"~11), the C-terminus identity, Finally, number of gaps were introduced by the alignment pro¬ from orf 1000 showed with the gp significant similarity gram (e.g., segment 2 which is lacking in 0Sfi11). Many C-terminus from the orf 373 gp (34% aa over 90 identity of the gaps in the aa alignment reflected gaps in the aa, P - 10~e). same positions of the nucleotide alignment, possibly An even more complicated pattern fro"1 the emerged indicating deletion/insertion processes. Interestingly, a of orf 695 with the comparison 0Sfi11 gp corresponding spontaneous deletion covering segments 2, 3, 4, and 5 proteins from 0Sfi19 and 001205. The triple alignment was observed during serial passage of phage 0Sfi21 in allowed the demarcation of 10 Over the cen¬ segments. our laboratory (Fig. 4; Desiere et al., 1998). tral 150-aa segment 7, all three proteins showed 90% aa identity. This region contained collagen-like GXY repeats Conservation of genome organization which were also found in the lactococcal phage BK5-T Over the (Fig. 4). 170-aa C-terminal segment 10, the Next we did an alignment of the genetic maps from was 98% identical to the 0Sfi11 protein «601205 protein, phages which showed significant (P ss 10-5) sequence while the was 50% identical. This 0Sfi19 protein only part similarity with

LUCCHINI DESIERE AND BRUSSOW

—I 1 1 1 1 i' 5000 10000 15000 20000bp

Leuconostoc oenus Cp-1 (DEJ-1 $Dp-1 K' H A l E F C orf620 orf150orf360 orf 410 orfl orf2 holm holm lysin orf1904 orf21 orfZ orf7 i" ');'.>{»,')>' )vi >E3a) tai

104 140 203 117 670 131141 87 iaa *Sfi19

OSfi11 113 114 108117105 373 57 695 669 13114180 281

gP26 orf19 orf21 orf2 orf? tail orf 410 orf1904 tail orf35 orfl orf2 holm hoiin t.liyLi^a 'i^Eii n^m. n^ Li^i—yi n ^ protein protein lysin OL-5 d b3 a1 x mtp a2 c2 b4' «DBK5-T OCp-1 c6bIL67 *Dp-1 Cp-1 (DEJ-1 *Dp-1 Lactococcus lactis OTP901-1 orf1904 d>BK5-T

FIG 4 of the from S Alignment partial gen maps thermopl u a t oshages c/>Sfi19 (op ! ne) and Sfi19 (top line) and <^Sf 11 (bottom line) are indicated by shadings in the central paitof the figure The degrees cf ^ s m la t\ are g adedf i ^lack to light grey The pe centeqes express the percentage of aa identity

<* between the depicted regions The st aqeef t,r e ages c o ,y c°> s= e give the aa identt\ to i° gene prodt us nred cted foi orf 131 to

289 respectively Zones connected dvii z I ss pc v g no significant « a i ar y at i° aa ^egu0ice level (P ^ 0 01)

for the point orientation of the maps since it shewed tnree genes upstieam from this refeience gene phage

significant sequence similarity with qos from three Lac lambda possesses a gene coding for the major tail pro tococcus one Bacillus phages phage and one Mycob )c tein Five qenes upstream from the major tail gene tecum phage Throe genes upstream from this refer°nte phaqe lambda shows the major head gene This constel

we found a gene gene coding for a major structunl lation is similar to 0sk1 but one gene shorter than in the (putative or proven tail) protein (exception 0 rlt v\ i^re other Siphoviridae from the gram positive bacteria The the distance is five genes) Six genes upstream o£ tie difference might be due to a possible displacement of tail a further structuial putative major gene major q°ne gene Wfrom its morphogenetic context lambda gps U Z was localized in all of the phages coding for the dl tit ve Fll and W interact to join the head to the tail (Georgo

or head rfs<1 where s proven protein (exception the d douIos ef a/ 1983 Katsuta 1983) Ihiee of these pto tance was five differs from the genes) 0Sfi11 ot^i te ns are clustered between the major head and major phages by having two adjacent major structural qenes at tail encoding qenes while qene W is with the prehead this of this structural position Directly upstream gene assembly qenes The interspersed character of gene W had a that showed 0Sfi11 gene significant similarity to =s was previously noted from comparisons of the gene map 0r1t gene that preceded the major «6r1t structurai qene of two hmbdoid phages {A and P22 Cppler ef al 1991 at the left end of Finally the partial Sft11 map we but see also Smith and Fe ss 1993)

identified two genes that showed strong and weak ^ m ilanty respectively with two adjacent genes from Bicil DISCUSSION lus phage SPP1 encoding a portal and a oroheaa prote n

Over the structural cluster io ° gene sequence similar Over tl morphogenesis module our analysis re was detected between ity coliphage lambda a Siphovit v-ealed sti k ig oaiallels m the genome structure of Si dae from a gram negative bacterium and Siphoviridae PTCvrche I ^mbda virologists have already observed from bacteria However if gram positive Phage lambda that tne t rdc r of iction of the gene products during H is with orf 1510 from o6Sf 11 a gene aligned str k "ah» Jiaqe lam ben assembly is similar to the arrangement of similar order was observed gene in phage hmocia As n tne qenes on tne hmbda genome It has been argued the lactococcal, streptococcal and bac liar S phov i oae tnat the order ot acton is based on the stiuctural inter 42

FV0LUT10N OF SIPH0V1RIDAF 69

v^ ^ ^

<«r ^ jF ^ cr ^V ^\f^

minor tail MHP MTP scaffold protein

16 17 ^a 202122 24 11 12 13 14 16 19 23 26 26 16, ,19 23, <ï>L-5 qcrji jiiiz^i^^ci^Ol) L_ff^q

structural proteins

<î>sk1

MHP MTP ^protem1 terL 35

27 28 29 30 31 32 33 34 36 37 38 39 40 41 Or1t :fei^r~^^

b2 b3 b1 mhp cl a1 x mtp a2c2 b4" <£>TP901 -1

MSP portal

protein Y v~ m 114 y 502 284 193 113 348 63 113 1281S8 117106 1510 <ï>Sfi11 MET! Y»J\pYArJui r

minor head proteins

minor head i protein scaffold | protein MHp | I portal MHP MTPV protein i y y *V 16 161 17,3 \17 3 10 11 12 13 14 16 17171 17.5, (DSPP1 irWy BMfy1 Jy-f~-f—r—|PfWl-~-y

17.2

FIG 5 Comparison of the organization cf the head and tail asse nbly ce es n ccliohage A (4cd In e) Mvcobactemim phage L 5 (second line) L lactis phages ski (third line) r1t (fourth line) TP901 1 (fifth line) 6 the'-nophil js oia ^e S 11 (six'h I ^e) a d B subtilis phage SPP1 (bottom line) The genomes ate aligned at the right by a region coding for a putative minor 'ail p c+or ( ed) she ,v g s cie"sesi nia iyove afi liste 1 Siphovndae fiom gram posi'ive bacteria Genes coding for the putative or proven maw tail piotein (blue) majoi he-\1 pro en hi d^1 scafo'l n" on em (bl~ <) oottal c c'en (dark gieen) and small subunfterminases (light oink) and large subunittermmasefdaik pink) we e coded The Ie e'on e i rev is pmocnoiai th° leigti cf 'tie predicted open leading frnne The individual orfs were identified by the numbering system used n the oipjral oubl ca it i iic*e e^e-, / kits ohage s<1 (Cherdy ef al 1997) for all others see Fig 3) For phaqe lambda the functional attribution ot the genes is it dca'ed Zoi es ot !iai bl e I x regions showing signi'icint sequence similarity (P < 10~5) 4 S

70 LUCCHINI DESIERF AND BRUSSOW

action between proteins involved in adjacent steps of the sesses four perfect tandem repeats containing the col assembly pathway (Katsura 1983) In addition it was lagen like motifs (Boyce etal 1995a) which also suffered argued that the conservation of the gene order in lamb spontaneous deletions (Boyce ef al 1995b) Further oh doid phages is of clear evolutionary advantage since t goglycine repeats were found in the conserved seg minimizes the number of unproductive interactions that ments of orf 695 gp The corresponding DNA repeats arise by recombination between phages having Dart ally could lead to deletions by slippage of the DNA polymer homologous chromosomes (Casjens and Hendnx 1974) ase oi to DNA exchanges or DNA expansion by unequal However this hypothesis can explain the conseivat on of crossover events Interestingly a structure prediction foi the gene order only within closely related phage groups the adsorption protein from phage T4 suggested a num such as the E coli lambdoid phages where such hybr ds ber of hvpervanablo looos mediating the receptot recoq can and do form It cannot be a present day force r° n t on held together oy conserved oligoqlycine stietches sponsible foi maintaining the gene order between I ano (Henning and Hashemolhosseini 1994) The 9 ther the Salmonella phage P22 since there is not enoigi mophilus phages miqht thus resemble coliphages in similarity in seguence over the morphogenesis operoi wh ch the evolution of tail fiber genes apparently occurs (Epplerefa/ 1991) This argument is even more ev dent ly recombinational reshuffling (Haggard Ljungquist et when Siphoviridae from gram postve and gran neqa il 1992) tive bacterial hosts are compared Alternatively the coo We analyzed the sim hnty of S thermophilus phages served gene order could be a rel c from an anceit to other phages at the sequence level and in an evolu common ancestor or an undefined selection force t t tonary context The sequence comparisons tevealed a favors this particular order hierarchy of relatedness At the first level are relation Ihe similarity of the lambda structural gene map wti ships between S thermophilus phages belonging to the that of L lactis phage ski and R subtilis phage SPP1 same lytic group they showed a nearly identical qene was observed recently by Chandry ef al (1997) and map and differed at the DNA sequence level by about Becker ef al (1997) respectively On the basis of our 10% (Desiere ef al 1998) At the second level of related multiple comparisons of qene maps we propose that the less are relationships between S thermophilus phages gene map from phage lambda can be used to pred ct oelongmg to different lytic groups They showed over the tentative gene functions in uncharactenzed Siphoviridae whole genome a comparable gene order but at the For an industrial laboratory deal ng with dairy phages sequence level they are patchy demonstrating regions of the genetic basis of the phage host range is of obv oi s no and very high (""- 99/o) DNA sequence dentity (Brus practical importance Therefore we tested the prec ct ve sow et al 1994b) At fn third level of relatedness are 6 power of this method by searching for the streptococcal iiermoplvlus 0Sfi21 a id the L lactis 0BK5 T (Boyce et phage complement to the tail tip protein of phage lambda al 1995a) The qene order is very similar and many responsible for mediating phage adsorption In phage genes showed high sequence similarity (up to about 63% lambda this protein is encoded by gene J wh ch s h° Qt the aa level) It should be noted that this is the level of last gene of the morphogenesis operon (Katsura 1983) relatedness between two lambdoid coliphages (Smith Our bioinformatic analysis localized the end of the stan and Feiss 1993) The similarity was not restricted to the tural gene cluster in 0Sfi11 at orf 695 or 669 In fact t rf morphogenesis operon but was also found over the

695 gp fulfils several requirements for a Phage aasoip lysogeny module (Bruttin ef al 1997a) At a similar level tion protein First it differed among phages that show are the relationships between S thermophilus 0Sfi11 distinct host ranges (0Sfi11/O12O5 Sf 11 Sfi 19) wh le t and several Lactococcus phages (0TP9O1 1 r1t 7 9) oi was nearly identical among phages snow ng an exten S thermophilus 0Sfi21 with 1 euconostoc rfLIO (Desiere sively ovetlappinq host range (0Sfi19 Sfi21) Second orf etal 1998) High sequence similarity (up to 56% atthe aa 695 resembles the gp anti-ieceptoi of onage T4 (prote a level) was detected over several adjacent genes At the 38) with its alternating stretches of high low aid no fourth level are relationships between B subt/lis 0SPP1 homology in multiple alignments wth related phages and c6Sf 11 Adjacent genes fiom two different modules (Montag ef al, 1987) In addition when a nun ber of (structural genes DNA packaging Becker et al 1997) lambdoid phages were compared by heteroauplex anal showed aa similarity (^ 30% identity) At the same level J showed also conserved and ysis gene varabh s q aie the relationships with S pneumoniae phages over ments ei al (Highton 1990) Third the DNA ieq oi cov the lysis module At the fifth level are relationships be ered by orf 695 is a recombnational hotspot in stmpto tween S tnermoohiius phages and Siphoviridae from coccal phages A large spontaneous delet on was taxonom cally more distant bacterial hosts like Lactoba observed several times in the orf 695 ortholoque from ci lus (Dps ere ef al 1998) Streptomyc es and Mycobac and 0Sfi21 (Bruttin Brussow 1996) Th s delet oi v\as t^rnm Sequence smhrty was found for individual flanked a 53 3 by nearly perfect bp recent (Des ere ef qenes (-- 30% aa identity) but no longer over adjacent 1998) contaminq collagen like GXY repeats In add ton qenes h nally we have the distant relationship between like were also found in orf 1904 from 9 collagen repeats gp thermophilus phages and phaqe lambda from gram lactococcal BK5T this phage Interestingly protein oos negative bacteria Over the structural gene operon both 44

EVOLUTION OF SIPHOVIRIDAE 71

phages showed a relatively conserved gene order but no view of the possibility of horizontal gene transfer be¬ sequence similarity tween different phage systems, the species barrier can¬ The observation of this series of graded related¬ not be absolute even between clearly distinct phage ness is notable since it is the hallmark of any biolog¬ systems Between lytic group I and II S thermophilus ical system undergoing evolutionary changes The phages we observed very high DNA seguence identity m fact that the degrees of lelatedness are correlated the DNA replication module (>99%, Brussow ef al approximately with the evolutionary distance between 1994b Desiere ef al 1997) and the lysis cassette There the bactenal hosts (S thermophilus-^Lactococc js—> is no process which can explain this high sequence Leuconostoc-^Bacillus-^Lactobacillus (all low GO conservation except recent DNA exchanges Sequence group of gram-positive bacteria)-^Mycobacteiium—> similarity of >90% was observed over appioximately a Streptomyces (both high GC group of gram positive quarter of the genomes from lytic group I and II phages bac\er\a)->Eschenchia (gram-negative bacterium)) is (Desieie ef al, manuscript in preparation) For us it is not intriguing It will be important to confirm this correla¬ meaningful to sepaiate these phages into two species tion from the perspective of other Siphoviridae so In contrast 0BK5Tand S thermophilus 0Sfi21 differed lated from an evolutionary distant host This is cut oy at least 35% at the aa level As long as no lactococcal rently not possible since only relatively few phage or streptococcal phages are described that narrow this sequences of Siphoviridae are in tne database and gap we anticipate that these phage groups share a their bacterial hosts do not represent the evolutionary common ancestor and are not currently exchanging DNA diversity of eubactena Our data are compatible w th and this do not belong to the same phage species On the hypothesis that the morphogenesis opeion from the othet end of taxonomical hierarchy one might lift the Siphoviridae of the low G + C grouD of qram-pos tive family Siphoviridae to order level (Siphovirales) and re¬ bacteria is derived from a common ancestor In con¬ serve the family level for example, for Siphoviridae from trast, the sequence similarity between Siphoviridae of the low GC group of gram-positive bacteria and genus the high and low G + C group of gram-positive bac¬ level to phages which are as similar as lactococcal teria (which was limited to individual genes) is more SBKb-1 and streptococcal 0Sfi21 However any defini¬ likely to represent horizontal gene transfer in a rela¬ tion of higher taxonomic groups in Siphoviridae depends tively distant past than common ancestry The similar¬ on the postulated model of phage evolution ity between the morphogenesis opérons from phage lambda and Siphoviridae from giom-positive bacteria MATERIALS AND METHODS allows two interpretations convergent evolution (then Phages, strains, and media a very astonishing one) oi splitting of the two lines in a distant which obscured all aa similar relatively past The phages were propagated on then appropriate S ities Current data do not constrain the time scale of thermophilus hosts in lactose M17 broth as described the is latter process It possible that all extant Sipho¬ previously (Bruttin and Brussow, 1996, Brussow and Brut¬ viridae from a common ancestor diverged very rapidly tin 1995) E coli strain JM 101 (Stratagene) was grown m The high rate of bp observed between co¬ changes I B broth or on LB broth solidified with 1 5% (w/v) agar is indirect evidence for a oT co¬ liphages rapid pace Ampicillin IPTG and X-qal (all from Sigma) were used at evolution Therefore the of a co- liphage impiession concentrations of 100 jiiq /ml, 1 mM and 0 002% (w/v), evolution of the phages with the host bacteria does not respectively necessarily implicate descent from a phage ancestor which already existed when the bacterial genera sollt DNA techniques apart We suspect that the ancestor of Siphoviridae is Phage purification and DNA extraction were done as much younger than the separation of gram-positi\e descr bed oreviously (Bruttin and Brussow, 1996, Brus¬ and gram-negative bacteria A putative ancestor sow and Bruttin 1995 Bruttin ef al 1997a) Plasmid DNA phage could have invaded different bactenal genera n was isolated using Qiaqen midi-plasmid isolation col¬ an evolutionary not too distant past and the S oho/ n umns Restriction enzymes were obtained from Boehr- dae split into distinct lines due to separations o" tne inger Mannheim and used according to the suppliers gene pools affected by more or less tight host range instructions barriers between bacterial species or geneia What does the comparât ve sequencing approach tell Protein techniques us about the definition of a phage species? Since lytic group I and II S thermophilus phages have clearly dis¬ Phage particles were concentrated by PEG précipita tinct structural genes we might define phages troog a tion and purified by two rounds of CsCI density gradient single lytic group as a phage species However any centnfuqation (3 h at 40 000 ipm using a Beckman biologically meaningful species definition shot Id take SW55 5 rotor) on a 5 step preformed CsCI gradient (nD -= ot material as exchange genetic an nclusion cr tenon 14 1 372 1 3698 1 368? 1 367) The phage bands were and lack of this as an exclusion exchange cntei on In recovered with a Pasteui pipette diluted m phage buffer 45

72 LUCCHINI, DESIERE, AND BRUSSOW

(Brüssow and Bruttin, 1995) and concentrated by high ACKNOWLEDGMENTS speed centrifugation (1 h, 40,000 rpm, SW55.5 rotor, We thank V. Fryder for electron microscopy, Anne Bruttin for her Beckman). The purified phage particles were then dena¬ contributions in the initial phase of the work, Fanny Scalfo for valuable tured for 2 min at 100°C SDS buffer using gel-loading technical assistance, Sophie Foley for critical reading of the manu¬ with /?-mercaptoethanol. SDS-PAGE was done on 8 % script, and the Swiss National Science Foundation for financial support of Sacha Lucchini and Frank Desiere in the framework of its Biotech¬ acrylamid slab gels which were subsequently stained nology Module (Grant 5002-044545/1). with Coomassie brilliant blue (Bio-Rad).

REFERENCES Sequencing

Ackemiann, H.-W., and DuBow, VI. S. (1987). Bacteriophage taxonomy. DNA sequencing was started with universal forward in "Viruses of Prokaryotes" (H.-W. Ackermann and M. S, DuBow, Eds.), and reverse on or primers pUC19 pNZ124 shotgun Vol, I, Chap, 2, pp, 13-28. CRC Press, Baca Raton, Florida. clones and continued with synthetic oligonucleotide (18- Anne, J., van Mellaert, L, Decock, B., van Damme, J., van Aerschoot, A., mer) primers (Microsynth, Switzerland). Both strands of rie-dewijn, P., and Eyssen, H. (1990). Further biological and molec¬ ular characterization of actinoohage VWB. J. Gen. Microbiol. 136, the cloned DNA were sequenced by the Sanger method 1365-1372. of chain termination the fmol dideoxy-mediated using A'tscnul, S. F., Gish, W„ Miller, W„ Myers, E, W, and Lipman, D. J. (1990). DNA Seguencing System of Promega (Madison, Wl). ~he Basic local alignment search tool. J. Mol. Biol. 215, 103-410. sequencing primers were end-labeled using [a>33PlATP Becxan B., de la Fuente, N., Gassei, M., Günther, D., Tavares, P., Lurz, T. Tangier, T. A., and Alonso, J. C. (1997). Head according to the manufacturer's protocol. The thermal morphogenesis ge-es cf the Bacillus subtilis bacteriophage SPP1. J. Mol. Biol. 268, cycler (Perkin-Elmer) was programmed at 30 cycles of 322-839. 95°C for 30 50°C for 30 and 72°C for 1 min, s, s, Botstein, D. (1980). The theory of modular evolution in bacteriophages. In addition, pUC19 clones of Sau3A-digested phage Ann. N. Y. Acad. Sei. 354, 484-491. J. B. and Sfi21 DNA were sequenced using the Amersham Lab- Boyce, D., Davidson, F., Hillier, A. J. (1995b). Spontaneous deletion mutants of the Lactococcus lactis temperate bacteriophage station sequencing kit based on Thermo Sequenase- BK5-T and localization of the BK5-T attP site. Appl. Environ. Microbiol. labeled with 7-deaza-dGT- primer cycle sequencing 61, 4105-4109, P(RPN2437). Sequencing was done on a Licor 6000L Boyce, J. D., Davidson, B, E., and Hillier, A. J. (1995a). Sequence analysis automated sequencer with fluorescence-labeled univer¬ of the lactococcus (actis temperate bacteriophage BKB-T and dem¬ onstration thsat the DNA has cohesive ends. Environ. sal reverse and forward pUC19 primers. phage Appl. Microbiol. 61, 4089-4098.

Bi;ssovv, H., and Bruttin, A, (1995). Characterization of a temperate PCR Streptococcus thernoDhilus bacteriophage and its genetic relation¬ ship with lytic phages. Virology 212, 632-640. PCR was used to which were ob¬ span regions, not Brüssow, H., Fremont. M., Bruttin, A., Sidoti, J., Constable, A., and tained through random cloning. PCR products were gen¬ Fryder, V. (1994a). Detection and classification of Streptococcus erated using the synthetic oligonucleotide pair designed thermophilus bacteriophages isolated from industrial miik fermenta¬ tion. Appl. Environ. Microbiol. 60, 4537-4543. according to the established o6Sfi21 DNA sequence, pu¬ Brüssow, H., Probst, A., Fremont, M., and Sidoti, J. (1994b). Distinct rified phage DNA, and Super Taq polymerase (Stehel/n, Streptococcus thermophilus bacteriophages share an extremely Basel, Switzerland). PCR products were purified using conserved DNA fragment. Virology 200, 854-857. the QIAquick-spin PCR purification kit. Bruttin, A., and Brüssow, H. (1996). Site-specific spontaneous deletions in three genome regions of a temperate Streptococcus thermophilus phage. Virology 219, 96-104. Sequence analysis 3't.ttin, A., Desiere, F., d'Amico, N,, Guerin, J. P., Sidoti, J., Huni, B., Lucchini, S., and Brüssow, H. (1997a). Molecular ecology of Strepto¬ The Genetics Computer Group sequence analysis coccus thermophilus bacteriophage infections in a cheese factory. package (University of Wisconsin) was used to assemble Appl Environ. Microbiol. 63, 3144-3150. and analyze the sequences. Nucleotide (nt) and pre¬ Bruttin, A., Desiere. F,, Lucchini, S., Foley, S., and Brüssow, H, (1997b). Cha'acterlzation cf the lysogeny DNA module from the dicted amino acid (aa) sequences were compared to temperate Streptococcus thermophilus bacteriophage Sfi21. Virology 233, those in the databases (GenBank, release 102; EMBL 136-148 release 51; release SWISS- (abridged), PIR-protein, 53; CaniDbell, A. (1988). Phage evolution and speciation, In 'The Bacterio¬ PR release OT, 34; PROSITE, release 13.0) using the phages" (R. Calendar, Ed.), Chap. 1, pp, 1-14, Plenum, New Yoik, FastA (Lipman and Pearson, 1985) and BLAST (Altschul Camnceil, A, and Botstein, D, (1983), Evolution of the lambdoid phages. In "Lamoda II" (R, Hendrix, J. W. Roberts, F. W. Stahl, and R. A. et al., 1990) programs. Sequence alignments were per¬ Wetsberg, Eds.), Dp.365-380. Cold Spring Harbor Laboratory Press, formed using the CLUSTALW 1,6 method (Thompson et Cold Spring Harbor, NY. the al., 1994), Multalign program (Corpet, 1988), http:// Casjers, S., Hatfull, G., and Hendrix, R. (1992). Evolution of dsDNA www.toulouse.inra.fr/multalin.html, and the SIM align¬ taiied-Dacteriophage genomes. Semin. Virol. 3, 383-397. ment tool (Huang and Miller, 1991, http7/expasy. Casjens, S,, and Hendrix, R. (1974). Comments on the arrangement of the memnogenetic of lambda. J. Mol. Biol. hcuge.ch/sprot/sim-nucl.html), genes bacteriophage 90,

The Sfi11 sequence was deposited in the GenBank C-'a-dry, P S., Voore, S. C, Boyce, J. D., Davidson, B. t., and Hillier, A. J. database under Accession No. AF057033, !1P97). Analysis of the DNA seguence, gene expression, origin of 46

EVOLUTION OF SIPHOVIRIDAE 73

replication and modular structure of the Lactococcus lactis lytic determinants for major structural proteins. Appl. Environ. Microbiol. bacteriophage ski. Mol. Microbiol. 26, 49-64. 63, 3246-3253. Corpet, F. (1988). Multiple seguence alignment with hierarchical clus¬ Lipman, D. J., and Pearson, W. R. (1985). Rapid and sensitive protein tering. Nucleic. Acids. Res. 16, 10881-10890. similarity searches. Science 227, 1435-1441. Desiere, E, Lucchini, S., Brüssow, H. (1998). Evolution of Streptococcus Lopez, R., Garcia, f. L, Garcia, E., Ronda, C, and Garcia, P. (1992). thermophilus bacteriophage genomes by modular exchanges fol¬ Structural analysis and biological significance of the cell wall lytic lowed by point mutations and small insertions, Virology 241, 61-72, enzymes of Streptococcus pneumoniae and its bacteriophage. FEMS Desiere, E, Lucchini, S., Bruttin, A., Zwahlen, fvl.-O, and Brüssow, H. Microbiol. Lett. 79, 439-447.

(1997), A highly conserved DNA replication module from Streptococ¬ Madin, A, C, I opez, R., and Garcia, P. (1996). Analysis of the complete

cus thermophilus phages is similar in sequence and tocology to a nucleotide sequence and functional organization of the genome of module from Lactococcus lactis phages. Virology 234, 372-382, Streptococcus pneumoniae bacteriophage Cp-1. J. Virol. 70, 3678- Draper, D, E, (1996), Translational Initiation. In "Escherichia co'i 0"d 3687 Salmonella: Cellular and Molecular Biology" (F. C, \'eidhardt Ed.). Mercenier, A. (1990). Molecular genetics of Streptococcus thermophi¬ Chap. 59, pp. 902-908. ASM Press, Washington, DC. lus. FEMS Microbiol. Rev. 87, 61-78.

Eppler, K„ Wyckoff, E„ Goates, J., Parr, R„ and Casjens, S. (1991). Milkman, R. (1996). Recombinational exchange among clonal popula¬ Nucleotide sequence of the bactenophage P22 genes required for tions. In "Escherichia coll and Salmonella: Cellular and Molecular

DNA packaging. Virology 183, 519-538. Biology" (F. C. Meidhardt, Ed,), Chap. I45, pp, 2663-2684, ASM Press, Fauquet, C. M. (1997). International Committee on Taxonomy of Viruses, Washington, DC. Chart available from [email protected]. Monod, O, Repoila, E, Kutateiadze, M., Tetart, F., and Krisch, H. M.

Fayard, B., Haeflinger, M., and Accolas, J.-P. (1993), Irte'act'cs cf (1997), The genome of the pseudo T-even bacteriophages, a diverse temperate bacteriophages of Streptococcus salivanus suoso. :he-- grciiD that resembles T4. J. Mol. Biol. 267, 237-249. mophilus with lysogénie indicators affect pnage DMA restnction Vontag, D., Riede, I., =schbach, M.-L, Degen, M., and Henning, U, patterns and host ranges. J. Dairy Res. 60, 385-399. (1987). Receptor-recognizing proteins of T-even tyne bacteriophages.

Georgopoulos, C, Tilly, K., and Casjens, S. (1983). Lambdoid ohage Constant and hypervariable regions and an unusual case of evolu¬ head assembly. In "Lambda II" (R. W. Hendrix, J. VV. Roberts, F IV, tion. J. Mol. Biol. 196, 165-174. Stahl, and R. A. Weisberg, Eds.), Cold Spring Harooi Laboratory Murphy, F. A. (1996). Virus taxonomy. In "Fields Virology" (B. Fields, Ed.), Press, Cold Spring Harbor NY. Chap. 2, pp. 15-58. Lippincott-Raven, Philadelphia. Griffin, D. E., and Bellini, W, J. Measles virus. (1996). In "Fields Virology" Ochman, H., and Lawrence, J. G. (1996). Phylogenetics and the ame¬ B. Fields, (Ed.), Chap. 43, pp. 1267-1312. Lippincott-Raven, Phi'adel- lioration of bacterial genomes. In Escherichia coli and Salmonella:

phia. Cellular and Molecular Biology" (F. C. Neidhardt, Ed.), Chap. 142, pp. Guédon, G., Bourgoin, E, Pébay, M,, Roussel, Y, Colmin, C, Sinoret, 2627-2637. ASM Press, Washington, DC. J. M., and Decaris, B. (1995). Characterization and distribution of two Schonler, O, Ehrlich, S. D., and Chopin, M. C. (1994). Sequence and insertion sequences, IS 1191 and iso-IS981, in Streptococcus :he-- organization of the lactococcal prolate-headed biL67 phage genome. mophilus: Does intergenic transfer of insertion sequences ceci/ r Microbiology. 140, 3061-3069. iactic acid bacteria co-cultutes? Mol. Microbiol 16, 69-78. Sheehan, M. M„ Garcia, I. L, Lopez, R„ and Garcia, P. (1996). Analysis Haggard-Ljungquist, E,, Hailing, O, and Calendar, R, (1992), DNA se¬ of the catalytic domain of the lysin of the lactococcal bacteriophage quences of the tail fibre genes of bacteriophage P2- Evidente fcr Tt,o2009 by chimeric gene assembling. FEMS Microbiol. Lett. 140, horizontal transfer of tail fibre genes among unrelated oacter;c- 23-28. phages. J. Bacteriol. 174, 1462-1477. Smith, M. P., and Feiss, M. (1993). Sequence analysis of the phage 21 Hatful, G. R, and Sarkis, G. J. (1993). DNA sequence, structure and geno genes for prohead assembly and head completion. Gene 126, 1-7. expression of mycobacteriophage L5: a phage system for mycobac¬ Stanley, E., Fitzgerald, G,, Le Marrec, O, Fayard, B., and van Sinderen, terial genetics. Mol. Microbiol. 7, 395-405. D. (1997). Sequence analysis and characterization of

2.3 Comparative genomics of Streptococcus thermophiius phage species supports a modular evolution theory

(Journal of Virology 73, 8647-8656 (1999)) 48

Comparative Genomics of Streptococcus thermophilus Phage Species Supports a Modular Evolution Theory (Journal of Virology 73 8647-8656 (1999))

Sacha Lucchini, Frank Desiere and Harald Brussow

Nestle Research Centre Nestec Ltd Vers-chez-les-Blanc CH-1000 Lausanne 26 Switzerland

Abstract

The comparative analysis of five completely sequenced Streptococcus thermophilus bacteriophage genomes demonstrated that their diversification was achieved by a combination of DNA recombination events and an accumulation of point mutations The five phages covered lytic and temperate phages of both pac-site and cos-site phages from three distinct geographical areas The units of genetic exchange were either large and comprised the entire morphogenesis gene cluster excluding the putative tail fiber genes or small thereby consisting of one or maximally two genes or even segments of a gene Many mdels were flanked by DNA repeats Differences in a single putative tail fiber gene correlated with the host range of the phages The predicted tail fiber protein consisted of highly conserved domains containing conspicuous glycine repeats interspersed with highly variable domains As in the T-even coliphage adhesins the glycine-containing domains were recombinational hotspots Downstream of a highly conserved DNA replication region, all lytic phages showed a short duplication in three isolates the origin of replication was repeated Conceptionally the lytic phages could be derived from the temperate phages by deletion and multiple rearrangement events in the lysogeny module giving rise to occasional selfish phages that defy the superinfection control systems of the corresponding temperate phages

Introduction

The most important mechanism for producing new types of RNA viruses is mutation, mostly in the form of point mutation For RNA viruses in which the genome is present in several discrete genome segments, new viruses can also be created by reassortment of genome segments This strategy provides a rapid method for the production of viruses with totally new potentials and is thought to be the basis for the dramatic antigenic shifts in influenza virus (48) Recombination, in which a single polynucleotide strand contains sequences which have originated from two parental types, is only infrequently utilized among the RNA viruses However examples have been documented for picomaviruses

(43), coronaviruses (33), (24, 47), artenvirus (28) and in certain plant viruses (13) The two major forces acting upon DNA virus genomes to generate diversity are mutation and recombination The role of recombination for the generation of diversity has been documented for several DNA viruses (polyomaviruses, adenoviruses, reviewed in 41) Some DNA viruses exert control over recombination by encoding proteins that enhance recombination between viral genomes and thus speed up their own evolution 49

This strategy is most prominent in bacterial DNA viruses (reviewed in 41)

Recombination between related viruses might produce a genome containing a trans¬ acting gene product trying unsuccessfully to interact with a c/s-acting element derived from another virus To avoid these difficulties, viral genomes are commonly organized with c/s-actmg elements located near the genes encoding the proteins that bind them

Genes which encode proteins that interact are also frequently located next to each other

Experimental data from lambdoid phages led to the formulation of the modular theory of phage evolution by Botstein (2) In fact lambdoid phages are related to each other in a mosaic fashion, implying an evolutionary history with significant amounts of horizontal exchange of genetic material (14) The relationship between lambdoid phages has mainly been developed by heteroduplex mapping techniques (for a more recent reference 26) Now that an increasing number of phage genomes have been sequenced, it is possible to ask about the similarity between different phage genomes at the nucleotide level (25) These comparisons should be especially revealing when they are done at the level of whole genome comparisons and between related phages differing in important aspects of their phenotypes (lytic/ temperate life style, polypeptide composition, immunity groups, host range, DNA packaging mechanism, geographical and ecological origin) Only two such comparisons have been reported The comparison between the lytic and temperate mycobactenophages D29 and L5 (order Caudovirales, family Siphoviridae, genus "L5-like viruses"(34)), respectively, showed that their genomes differed by a large number of insertions, deletions and substitutions of genes

(21) The comparison of the lytic and temperate cos-site containing Streptococcus thermophilus phages Sfi 19 and Sfi21, respectively, demonstrated that their genomes were similarly organized and differed in addition to numerous point mutations by gene deletion/ insertion, duplication and DNA rearrangement events (32) However, comparisons between more than two related phage genomes are necessary to decipher the processes that shaped the genomes of a given phage species Streptococcus thermophilus phages (5,7) could be a suitable object for such a study Due to their industrial importance in milk fermentation (36), many S thermophilus phages have been isolated covering a substantial geographical diversity (7 and references therein) A longitudinal factory survey has documented the ecological dynamics of phage infection

(11) All phages characterized until now belong to the same morphological class

(Siphoviridae, B1 morphotype) and the same DNA homology group (8,9), but they could 50

be split into two groups cos-site and pac-site phages. The two DNA packaging mechanisms correlated with two distinct structural polypeptide patterns (30) and in yogurt phage isolates with host range and serotype (8) Four complete S thermophilus phage genomes have been reported the temperate pac-site phage O1205 (40), the temperate cos-site phage Sfi21 (32) and two lytic cos-site phages, the French yogurt isolate Sfi19 (17,32) and the Canadian cheese isolate DT1 (44) Here we report the complete sequence of the lytic pac-site phage Sfi 11 (31) and half of the genome sequence of the lytic cos-site phage Sfi18 This creates now the largest sequence database set for a group of closely related phages infecting the same host species The phages were chosen to cover differences in life styles (lytic / temperate), DNA packaging mechanisms and structural polypeptide pattern (pac-site, cos-site), host ranges, serotype, susceptibility to Sfi21 prophage control ecological environment and geographical origin We first used pairwise comparisons of phages differing in defined phenotypes in order to establish associations between phenotype and genotype of the phages We then used multiple alignments to decipher basic principles of genome diversification in S thermophilus phages

Materials and Methods

Phages, strains and media. The phages were propagated on their appropriate S thermophilus hosts in lactose M17 broth as described previously (6,10) £ coli strain J M 101 (Stratagene) was grown in LB broth or on LB broth solidified with 1 5% (w/v) agar Ampicillin, IPTG and Xgal (all from Sigma) were used at concentrations of 100jj.g /ml, 1 mM and 0 002% (w/v), respectively Pertinent features of the compared phages are given in Table 1

DNA techniques. Phage purification and DNA extraction were done as described previously (6,10) Plasmid DNA was isolated using Qiagen midi-plasmid isolation columns Restriction enzymes were obtained from Boehnnger Mannheim and used according to the supplier's instructions

Sequencing. DNA sequencing was started with universal forward and reverse primers on pUC19 or pNZ124 shotgun clones and continued with synthetic oligonucleotide (18- mer) primers (Microsynth Switzerland) Both strands of the cloned DNA were sequenced by the Sanger method of dideoxy-mediated chain termination using the fmol DNA Sequencing System of Promega (Madison Wl) The sequencing primers were end-labelled using [ y-33P]ATP according to the manufacturer's protocol The thermal cycler (Perkin Elmer) was programmed at 30 cycles of 95° C for 30 sec, 50°C for 30 sec, and 72°C for 1 mm 51

In addition, pUC19 clones of Sat/3A-digested phage Sfi11 DNA were sequenced using the Amersham Labstation sequencing kit based on Thermo Sequenase-labelled primer cycle sequencing with 7-deaza-dGTP(RPN2437) Sequencing was done on a Licor 6000L automated sequencer with fluorescence-labelled universal reverse and forward pUC19 primers

PCR. PCR was used to span regions, which were not obtained through random cloning PCR products were generated using the synthetic oligonucleotide pair designed according to the established Sfi 11 DNA sequence, purified phage DNA and Super Taq Polymerase (Stehelin, Basel, Switzerland) PCR products were purified using the QIAquick-spin PCR Purification Kit

Sequence analysis. The Genetics Computer Group sequence analysis package (University of Wisconsin) was used to assemble and analyze the sequences Nucleotide (nt) and predicted amino acid (aa) sequences were compared to those in the databases (GenBank, Release 109, EMBL (Abridged), Release 56, PIR-Protein, Release 57, SWISS-PROT, Release 36, PROSITE Release 15 0) using FastA(29) and BLAST (1) programs Sequence alignments were performed using the CLUSTALW 1 74 method (45), the Multalign program (15) and the SIM alignment tool (27) The complete Sfi11 and Sfi 18 genome sequences were deposited in the GenBank database under the accession number AF158600 and AF158601

Results

Comparison of the Sfi11 and 01205 phage genomes

Phage Sfi 11 (31) was chosen for complete genome sequencing since until now no complete genome sequence was available for lytic pac-site S thermophilus phages

Phage Sfi 11 has a 39,807 bp genome consisting of 52 orfs longer than 50 codons (Fig

1) The bioinformatic analysis suggested a modular structure of the Sfi 11 genome which closely resembled that of four other sequenced S thermophilus phage genomes

Phages Sfi11 and O1205 differ in life style, host range and geographical origin, while they belong to the same DNA packaging group and share similar structural protein pattern and a similar ecological origin (Table 1) Major differences between both genomes should therefore identify candidate genes involved in life style decision and host range determination genomes should therefore identify candidate genes involved in life style decision and host range determination With the exception of the lysogeny module and two differences in the rightmost part of their genomes, a one-by-one correspondence of the predicted gene map was observed (Fig 1) This similarity extended to the nucleotide sequence level Head morphogenesis Head-tail Tail Lysogeny Packaging Host lysis Replication joining morphogenesis module

30 33 36 49 56 1619 iW 30 32 35 38 48 51 53 55 3 5 7 1518 21 23

' 1 25 26 27 28 29 31 34 37 39 40 41 42 "tO T f ^O 46 47 50 52 54 57 2 4 6 1011 12 13 14 17 20 22 A. O1205 ^Éj^J^I«^ R RR US R MAL A

on * * >v_ | B.Sfi11 68172411 502 284193 348 104 168105 1510 512 1000 373 695 669 149 2 111 7169157 443151 50414357 178 120 137 119 53114 117 57 141 183 14540 233 271 10651 167 236 113 128 83 8966 93

ro

„on B- C.Sfi19~ R R 132 152 623 59 221397 153203 1626 515 1291 670 131 28811147 69 229 233443151 504 14357 192 235 170 386 104 123117 141 18354 40 157 271 10651 166 116 87 88 8967 98 Lysogeny replacement module

Figure 1. Prediction of open reading frames m the complete genomes of the temperate pac-site S thermophilus phage O1205 (top),the virulent pac-site S thermophilus phage Sfi11 (center)and the virulent cos-site S thermophilus phage Sfi19 (bottom) For a better reference to previous publicationsthe orfs from O1205 were numbered as in Stanleyet al (40j,while the Sfi11 and Sfi19 orfs were marked with their lengthin codon numbers Probable gene functions identified by bioinformatic analysis or biologicalexperiments were noted The phage genomes were divided into functional units according to previous bioinformatic and comparative evolutionaryanalysis(32) Genes belonging to a same unit have the same color Grey fillingindicates lack of information about the function of the orf Orfs preceded by a potentialRBS (23) are marked with an R inside the arrow Orfs startingwith an unconventional initiationcodon are indicated with an asterisk and possiblerho independent terminators are indicated with a hairpin Overlap of start and stop codon is indicated with a triangleAreas of blue shading connect regions of major sequence difference between the compared phage genomes The genome maps were oriented to maximize alignments,the cos-site and pac-site, respectively,are thus not at the very ends of the maps 53

The dot plot analysis (Fig. 2A) showed a nearly uninterrupted straight line over the

putative DNA packaging, morphogenesis and lysis modules. Two gaps were found at the

position of orf 695 (Sfi11 numbering) encoding a likely tail fiber protein possibly involved

in host range determination (31).

Phage Life Cycle DNA Origin Ecology Host Sero¬ Immunity Ref packaging range type

"" 01205" Temp. pac Greece Yoghurt Non- n.t. n.a. '"40

LGI.Il

Sfi 11 Lytic pac France Yoghurt LGII 2 n a. 31

Sfi21 Temp. cos France Yoghurt LGl 1 + 12,16-19,

32

Sfi 19 Lytic cos France Milk LGI 1 - 32

Sfi 18 Lytic cos France Yoghurt LG I 1 + this report

DT1 Lytic cos Canada Cheese n t n t n.t. 44

Table 1: Characteristics of the investigated Streptococcus thermophilus phages. Notes: For the definition of host range/ lytic group (LG) and serotype see (8,31), n t. not tested. Immunity: The indicated phage is inhibited in its growth (+) on a lysogénie cell containing the Sfi21 prophage or it is not inhibited on the lysogen (-), n a non applicable, the 01205 lysogen shows no immunity to superinfection The Sfi21/ Sfi 19 comparison has been reported previously (32)

Over the genome region covering the lysogeny module the dotplot indicated a

combination of DNA rearrangements and deletion processes between both phages.

Over the DNA replication modules the dotplot showed again a straight line. This similarity

continued with one interruption until the right end of the two compared phage genomes.

The interruption was an indel of an orf 66 followed by a partial duplication of the origin of

replication in Sfi11. Over the DNA packaging and morphogenesis module both phages

differed on the average by only 10 % at the nucleotide level, while over the lysis module

the average nucleotide difference was about 20 %. The DNA replication module was

nearly sequence identical, while the rightmost DNA segment showed on the average a

greater than 20 % nucleotide difference. 54

r\

-duplicated origin of replication-

Replication

I Lysogeny §§§ - lysogeny replacement- replacement!. :Sr Host Iysis| 3|C_-

putative antireceptor : Sfi11 "LiT origin of replication Tail morphogenesis lysogeny module

Head-tail joining It Head morphogenesis

F^ckaging

ItafiftiilrtÄ'

A ^ 4 ^ O' ^ « 'Cf '^ 'lb \ V\ % x

Figure 2: Dotplot analysis. (A) Dotplot calculated for the DNA genome sequences of the virulent pac-site phage Sfi 11 (y-axis) and the temperate pac-site phage 01205 (x-axis). The comparison window was 50 bp and the stringency 30 bp. For an easier orientation the color-coded gene maps of Sfii 1 and O1205 were given at the y-axis and the x-axis, respectively, together with the map position in bp as defined in Fig. 1. Regions of sequence differences were marked and annotated. (B) Dotplot calculated for the DNA genome sequences of the virulent pac-site phage Sfi 11 (y- axis) and the virulent cos-site phage Sfi 19 (x-axis). (C) Dotplot calculated for the DNA genome sequences of the virulent cos-site phage Sfi19 (y-axis) and the virulent cos-site phage DT1 (x- axis).

Comparison of the Sfi11 and Sfi19 phage genomes

Phages Sfi 11 and Sfi 19 differ in the DNA packaging mechanism and structural protein pattern, host range, serotype and ecological origin. In contrast, the two phages share the same life style and geographical origin (Table 1). Both phages showed a similar genome organization (Fig. 1). However, with one exception no nucleotide sequence similarity was seen over the left 18 kb of the aligned genomes in dotplot analysis (Fig. 2B) and the deduced proteins lacked sequence similarity. This region 55

D

duplicated origin of replication — k A Replication

orf229 Lysogeny deletion replacement c Host lysis o « o t putative Q. o Sfi11 © antireceptor a

i r O c Tail orf71-orf145 5) Ü morphogenesis orf512^ fused to orf88 o X! a>

(8 Ü "5. 3 orf515 o Head tail joining

Head morphogenesis

Packaging ? T TT MmOi )M v\\

covers the likely DNA packaging head and tail morphogenesis modules Over orf 695

(Sfi 11 numbering) regions of nucleotide sequence similarity alternated with regions of non-similarity followed by a nearly uninterrupted dotplot line Only three insertion / deletion events were observed over the right halves of the two genomes Two indels were localized in the lysogeny replacement module and one near the right genome end

Comparison of the Sfi19 and DT1 phage genomes

Phages Sfi19 and DT1 differ in host range and geographical and ecological origin

The two phages share life style DNA packaging mechanism and polypeptide pattern

(Table 1) Major differences between both genomes should therefore identify candidate genes for host range determinants and polymorphisms that became established in 56

duplicated origin of replication

Replication

Lysogeny Lysogeny replacement- replacement

Host lysis I

Sfi19 orf1291 'minor tail protein'

Tail morphogenesis orf1626

Head-tail joining

Head morphogenesis

Packaging

WIM \\> xs X XX %. X x % > DT1 Figure 2-contmued geographically and ecologically separated phage lineages A dotplot analysis revealed a straight line over the structural gene cluster that was interrupted by substitutions and insertion/ deletion of gene segments in the three largest orfs from Sfi19 (orf 1626, 1291 and 670, Fig 2C) All three genes could therefore play a role in host range determination The lysis gene from DT1 was interrupted by an intron (20) The lysogeny replacement module from DT1 is the result of a distinct, but related DNA rearrangement and deletion process Multiple small DNA substitutions and indels punctuated the dotplot over the right genome end of both phages possibly indicating genome polymorphisms

Comparison of the Sfi19 and Sfi18 phage genomes

Phages Sfi18 and Sfi19 are related with respect to DNA packaging and structural gene pattern, geographical origin, host range and life style (Table 1) They differ, 57

however, in their behavior towards prophage Sfi21 Phage Sfi18 was unable to multiply on a cell containing the chromosomally integrated Sfi21 prophage, the cloned orf 203 or the origin of replication from Sfi21 on a high copy number plasmid, while the Sfi19 multiplication was unabated (12 19) Major differences between both phages could thus identify adaptations of SFi19 to escape from prophage control PCR using Sfi 19 primers revealed identical products for Sfi 19 and Sfi18 over the structural gene cluster, while one difference was seen over the functional gene cluster (data not shown) We limited the sequencing to the right genome half from Sfi 18 since we judged it unlikely that modifications in the structural gene cluster of Sfi 19 are responsible for the escape from prophage control Excluding the indel in orf 88 from the lysogeny replacement region

(see Fig 4) and a 40 bp DNA substitution within the second origin, the functional gene clusters from Sfi19 and Sfi18 differed by only 145 bp changes resulting in 34 aa changes The aa changes were unevenly distributed orf 111 gp alone showed 13 aa changes and the orf 166 gp demonstrated a cluster of 5 aa changes A third of all bp changes were in two non-coding regions resembling origins of replication When a group of 20 phages with known susceptibility to prophage control were investigated by PCR neither the indel in orf 88 nor the substitution within the second putative origin correlated with escape from prophage Sfi21 control (data not shown), while the sequence of the origin correlated with this phenotype (19)

Recombination hot spots

After the pairwise alignments of two phage genomes, we aligned all available S thermophilus phage genomes This multiple alignment revealed a subdivision into conserved and variable regions Conserved regions were represented by the DNA packaging, head and tail gene region (two alleles, excluding the tail fiber genes), the lysis cassette (1 allele, DT1 showed a gene substitution for the first holm gene) and the

DNA replication region (1 allele) Three variable regions were identified the likely tail fiber genes, the lysogeny/ replacement region and the right genome end All three regions demonstrate the importance of recombination processes for the diversification of

S thermophilus phage genomes 58

or£923-DTl GMTYDMVASEIJÎQPTTLTISAWVDNKEVASEEVTFVNVSDGK - -- 500 orf 1291-sf 119 GMTYDMVASDITEPTrLTVAAvv\DIDKEVASEEVTFFNVSDGKHGEKGDIGPKGADGRTQY 512 Oif843-1205 MKIITRNDITLTNVYDGTDAPTVFVKSYTYSAGSRAEIKLTGPNAFEQTVYSRGH 5S orf695-sfill MKIITRNDITLTNVNDGAD 19

"*|—|

orf921-DTl - -QGPKGDCjGIPGPKG "514 orfl291-sfil9 SLIRGNDGKQGPQGERGLTGPQGPKGDRGERGLQGPRGDQGIPGPKGEDGKTHYTBIAYA 83 6 orf843-1205 GYSVAKMGEOGPKGDR GEQGLQGPRGEQG1PGPKGADGRTQYTHIAYA 382 orf695-sfill SLIRGNDGKQGPQGER GPQGLQGPKGDQGXPGPKGADGRTQYTHÏAYA 273

orf921 DTI - -- ADGKT 519 orf129 L-sf119 DTVSGSGFSQTDVNKPYIGMYQÛFNEVBSNNPQDYRWTKWK-GSBGRÛGIPGK&GADGRÏ 895 orf84 ^-1205 DAISGSGFSQTDVSKPYIGMYQDFNFVDSNNPQDYRWSKWK-GSDGKDGVPGKAGADGRT 44] or£ 6 9 5 - s f111 JDTVSGSGPSQTDVNKPYIGMYQDFNAIDSQNPQDYRWTKWK- GSDGKDGIPGÏC&GAPGRT 3 3 2 orf921-DTl PYVHFAYADSADGQKGFSLTQTGSKRYIjGVYTDF'NQAGSTNPADYTWSDTAGSVSVGGDN 579 orfl291-sfi!9 t»YVMFAYADSADGRDGFSl,TQTGSKRYIGVLTNFFKEDSTNPADYTWNDTAGSISVGGRN 955 oif 843 -1205 PWBFAYADSÄDGRTGFSLXQNGRKRYLGVLTNFlKKrjSTNPSDYSWKIDTÄGSV^VGGEN 501 orf695-sfill P'YVHFAYAÖSÄOGRNGFSLTQTGRKRYLGVLTNYFKEDSTNPEEY'rWNDTAGSISVGARKI 3 92 orf921-DT1 LITNSSFPKNLDNWGYWEAGSPNENLHIATHGF-YYNGTRTLFRLDCDNSYGVPAASRRP 63 8 orf1291-sf119 ÜVKTN--QGITNWDWTMSNGDKSVEEVNIDGIRAVKLTKGTKTANTGGNYIQYQGLLRK 1013 orf843-1205 LIRNSAFPKNLDGWGHWEAPQPNSNLSISSHDF-YYNDSKPLFLLNSSSS-GVPAATLRF 559 orf 6 95- sf 111 LLKGSKGP FKPDRKPTNFDNYV- YYENETSVYLEQNK KY 430 orf921-DTl PVKROTDYSLNIQMFATENIKGVNIYFLGRKSNETNEMFTKAVNIKHLEKSPSTTDVVKF 6 98 orf12 91-sf119 LIRPNTQYTLSFDVKPSVDVSFSATLIRG NYQAELTDTVLMNKALANQWTKVSCVLT 10 7 0 orf843-12 05 PVKRNTtoYSFNIQTFATGNIKGVlIYFLGRKANEKDKTFTKWNVKTINGSPSVTQAVKW 619 orf695~sfill IISAKTDGNFSDSHNPYADSDNWLMLFG--- ~DG ÏLQTISDSNTGÏTGTQF 478

orf9 21-DTl HFTFNSGECDEGFIRVDNYGATYGT-SSLFFTELDCYEGTTDRSWQASPEDLKDEIDTKA 757 orf12 91-sf119 SKETLSGDLNQ-WYLAGMPTTNG--NWLIIKNIKLEEGNIPTPWTPAIEDIQEE10SKA 112 7 oif843-12 05 HLTFNSGDCDEGFIRVDNNGTTDGKTSMLFFGELDCYEGTTDRAWQASPKDLQEEIDTKA 679 orf 6 95-sf 111 TWTHPTO---TYHLRVN TYHKTATKSVWKVKIEDQNIKTDWTPAIBDIQEEIDTKÄ 531 orf921-DTl DNALTQAQtiNKLSEINSVMKAELEAKASLATINQWIKAYQCiFVNANSADRAQAQKALADA 817 orfl291-sfil9 DDVLTQAQtNKliKEISSIVKAELAAKASLDTLDQWKQAYQDFVNANNANRAQAEKDLADA 118 7 orf 843 - 12 05 DAAMTOAQIMRLNEEDSILKAEIDARAKADIVNNWIKDYNDFLELSERERVKÄSQDIiKNA 73 9 orf 6 95-sf ill ÖAÄMTQAQIiNRIjNEEDSIFKAEIAARÄKADIVNNWIKDYNDFLEVSERERVKAEQOIiKNA 5 91 orf921-DTI SARWKLENNLNDMSERWNFIDNYMAASNDGLVIGKK0NSSSIVFNPNGRISMFSAGNEV 877 orf 12 91-sf 119 SARVTKLENDLNDMSERWNF1DSYMTASMEGLWGKTDNSSSMLFSPNGRÏSMFSAGHEV 124 7 orf843-12 05 QQRLNAINKOl,KEMTERWNFI0TYMSASNBGFVIGKQbGSKSIMLNADGGlTFF$AGRST 799 orf 6 95-sf ill QQRLNAINKDIiKEMTBRWNFlDTYMSASNEGFVlGKQDGSKSIMLNADGGITFFS&GRST 651 or£921-DTl MYISRGVIHIENGIFSKTIQIGRFREEQDFINPDRNVIRYAGGI 921 orf1291-sf119 MYISQGVIHIENGIFSKTIQIGRYREEQDIINPDRNVIRYVGGS 1291 orf843-1205 MT1SEGKTEMNSGWKKSLKVGRYIEQPYHVDPDIWIRYVGGE 843 orf695-sfill MTISEGKTEMNNGWKKSLKVGRYIEQPYHIDPDINV1RYVGEE 695

Figure 3. Multiple alignment of a putative tail fiber protein from S thermophilus phages showing distinct host ranges The proteins are identified with their corresponding codon length. Gaps (-) were introduced for maximal alignment Aa positions that differ between the corresponding proteins from phages Sfi 19 and Sfi21 are underlined and the location of a spontaneous deletion in the Sfi21 protein is marked by broken arrows Aa positions which were identical in at least three proteins were shaded 59

The likely tail fiber genes

Multiple alignment of the largest predicted protein from the three cos-site phages

Sfi19, Sfi21 and DT1 revealed a very conserved 500 aa N-terminal segment, followed by a variable region, then a highly conserved central region of 75 aa that was flanked by a perfect 12 aa repeat (YKHNKKFKKFVD). After a second variable region a third conserved region was observed. Weak similarity was detected with the protein encoded by the topologically corresponding orf 1510 in Sfi 11 (23 % over 500 aa, P=10"10).

An even more complicated pattern was seen for the second largest proteins. Over the N-terminal 900 aa the corresponding proteins from the twocos-site phages Sfi19 and

DT1 were nearly identical, except for an indel in DT1 which started and ended in a collagen-like repeat. The borders corresponded relatively precisely to a spontaneous deletion in Sfi21 (Fig. 3). Sf121 differed from Sfi 19 by a few aa replacements and an indel of a small collagen-like repeat. DT1 phage showed a second indel flanked at both sides by collagen-like repeats (Fig. 3) suggesting the collagen-like repeats as recombination hotspots. This conserved region was followed by a highly variable region between Sfi 19 and DT1 and ended in a further highly conserved region.

Notably, at a topologically corresponding genome position pac-site phages encoded a protein that shared the highly conserved collagen-like repeat region with cos-site phages. This repeat region was followed by a region which did not only differ between cos- and pac-site phages, but also between the two pac-site phages. The variable region was followed by a C-terminal domain that was highly conserved between the corresponding proteins from both pac-site phages and moderately conserved between the proteins from pac-and cos-site phages (Fig. 3).

The lysogeny replacement region

Temperate and lytic S. thermophilus phages showed the same overall genome organization. The lysogeny module was flanked on one side by the lysis module and on the other side by the DNA replication module. All lytic phages showed at the position of the lysogeny module a replacement module. A multiple alignment of the gene maps demonstrated that the replacement modules were derived from lysogeny modules by a combination of insertion/ deletion and DNA rearrangement processes (Fig. 4).

Interestingly, the three lytic phage isolates from France showed a replacement module that differed only by three insertion / deletion events: One event was an in frame fusion 60

Lysogeny module Replication

O1205 orbl

» T" |i »-* -*—* ^ »'' y 111 183 83 71 1« 6940 157 233

lyqin

111 183 47 71 145 69 40 229 54

atytke

111 183 4? »40 229 *—W 34

cro-ike

" iegnse

22 121 SI 218 OT S 46

Figure 4. Comparison of the lysogeny replacement module from the virulent pac-site phage Sfi11 with that of the virulent cos-site phages Sfi 19 Sfi 18 and DT1 The predicted orfs were annotated with their codon lengths The one bp indel was indicated by an arrow, the position of a second indel is marked by a filled arrowhead The third indel corresponds to orf 229 flanked by 68 bp direct repeats (open triangle) For comparison the genome maps of the lysogeny modules from the pac-site S thermophilus phages 01205 (top) and TPJ-34 (bottom) were given Regions of sequence similarity between the phages O1205 and Sfi11 on one side and DT1 and TPJ-34 on the other side are connected by dark shading Regions of sequence diversity between the four virulent phages are connected light shading

of orf 71 and orf 145 from Sfi 11 and Sfi 18 to yield orf 88 in phage Sfi19 (Fig 4) This event is likely the consequence of DNA polymerase slippage since the putative deletion site was precisely flanked by a heptanucleotide repeat GATGATT and only one repeat was retained in orf 88 from phage Sfi 19 (filled triangle in Fig 4) A second event comprised the entire orf 229 in Sfi 18 and Sfi19 (Fig 4) This orf is flanked by a perfect 68 bp repeat situated 16 bp upstream and 2 bp downstream, respectively, of the start and stop codon of orf 229 Sfi11 showed at the corresponding position only one 68 bp sequence suggesting again DNA polymerase slippage or homologous recombination as the cause for this polymorphism Over orf 111 to orf 71 the temperate phage TPJ-34 differed from the lytic phages Sf111, Sfi18 and Sfi19 by one bp change resulting in a premature stop codon in TPJ-34 (asterisk in Fig 4) and one bp insertion resulting in different orf predictions for Sfi 18 and Sfi19 phages when compared to phage Sfi 11

(arrow in Fig 4)

Notably, the Canadian virulent phage isolate DT1 showed a distinct DNA 61

Mvcobacteriophage oL-S gene64 £ h

DNA-primase

( on . Sfi2i . >n / r=r;r=r: , yz ' 504' 143 106 89 575161 130 181 170 114 152 235

L. la clis r1t

~ -_ "" Sfi 19 - I ori I __ ) =+-R-ori" __ '} _,' ZZ- ' 504' 143 106 89 5751 67 192 166 98 235

" ~ Sfi18 I ^m~ori" r ~i f )ori ; ~^ , \ 504' 143 106 89 5751 66 192 166 99 235

— ~ Sfi 11 r ) ori i ') 'yzzzt^Zder ffl-ori" f« '—^ msma^zz: 504' 143 106 89 5751 66 178 167 99 120 236

' ~ ' O1205 I } ori I ZT^Z1' /7"; ZZs z" ^(^lJ 504' 143 106 78 57 50 156 181 101 146 236

dti N ori I 504' 143 107 82 51 70 165 100 235

1 kb Figure 5, Alignment of the gene maps from the right genome ends of the indicated S. thermophilus phages. The predicted open reading frames are annotated with their codon lengths. Insertions / deletions are connected by light gray shading. Sequence diversity greater than 20 % at the aa level are indicated by different shadings of the arrows representing the orfs.

rearrangement event over this region (Fig. 4) demonstrating that the majority of the orfs from this region are non-essential genes as already indicated by the high number of polymorphisms in the French isolates. Only one gene is conserved between all four virulent phages (orf 183) and all possess a cro-like gene. However, the DT-1 phage shared a highly related cro-like gene with TPJ-34, while the French lytic phages shared a cro-like gene that was distantly related to 01205. Previously, we had attributed orf 157 to the highly conserved DNA replication module from S. thermophilus phages (16). DT-1 showed a distinct orf 104 at this position.

The right genome end

Beyond the lysogeny or the replacement module all six investigated S. thermophilus phages showed a common DNA segment which covered the putative DNA replication module and extended until orf 236, three orfs upstream of the respective cos- orpac-site.

The left part of this common genome segment was highly conserved between the investigated phages, while the right part was punctuated by numerous insertion/ deletions of entire genes (e.g. orf 57, 146 in DT-1), substitution of genes (e.g. orf 152 in

Sfi21 and orf 146 in 01205) or a segment of a gene (e g. orf 156 in O1205) (Fig. 5). The 62

region downstream of orf 51 is a recombination hotspot The three lytic phages isolated in France showed at this position a replacement of orf 61 and orf 130 (Sfi21) by a distinct orf 67 followed by a duplicated origin of phage replication The lytic Canadian isolate DT-

1 showed at this position a distinct duplication, while 01205 lacked supplementary DNA between orf 50 and orf 156 (Fig 5)

Discussion

The comparative sequence analysis revealed that genetic diversity is created in S thermophilus phages by two distinct processes One process of obvious importance is genetic recombination Recombination is apparently at the basis of differences in life cycle, host range, DNA packaging and structural protein pattern The second process creating diversity is the accumulation of point mutations The degree of bp sequence diversity differed substantially over the various phage genome segments Practically no sequence variability was found over the replication module, while up to 30 % bp differences were detected over the morphogenesis genes Interestingly, morphogenesis genes from Lactococcus lactis phages showed up to 50 % bp identity with phage Sfi21

(18) raising the possibility that the accumulation of point mutations is an important driving force of phage evolution In comparison with phage Sfi18 relatively few bp changes could free the lytic S thermophilus phage Sfi 19 from the superinfection exclusion by the Sfi21 prophage The analysis of competitive interactions among RNA phages within the framework of the game theory demonstrated that defection (selfishness) evolved, despite the greater fitness pay-off that would result if all phage players were to cooperate (46)

Phage Sfi 19 could thus be a selfish phage player which might explain the relative rarity of lysogeny in S thermophilus

Genetic recombination apparently plays a comparable role in S thermophilus phages to that of the much better investigated (although less intensively sequenced) lambdoid coliphages (14) Lambdoid coliphages showed a similar division into a structural and a functional gene cluster As in S thermophilus phages, only two different structural gene clusters were identified in lambdoid phages Apparently due to the multiple interactions of the structural proteins, modular exchanges are not allowed within the morphogenesis gene cluster The similarity between both phage systems is not surprising since the structural gene map from S thermophilus phages could be aligned with that from 63

lambdoid coliphages (18) In both phage systems, modular exchange reactions were limited to the putative tail fiber genes and the functional gene cluster InS thermophilus phages the units of genetic exchange over the functional gene cluster are small and comprise single genes or even segments of genes Similar observations were made for mycobactenophages (21) and there is also increasing evidence that units of genetic exchange in lambdoid phages can be as small as segments of an individual gene

(26,38)

As seen previously from the comparison of temperate S thermophilus phages (39), the lysogeny region is a recombination hotspot All four virulent phages showed a lysogeny replacement module which may be derived from temperate phages by two processes the deletion of genes essential for the establishment of the lysogénie state

(integrase, repressor gene) and the rearrangement of DNA regions flanking the lysogeny module in temperate pac-site S thermophilus phages A common denominator was the conservation of a cro-like repressor Such a close similarity between virulent and temperate phages has also been described for phages from other gram-positive bacteria, such as Lactobacillus (37) and Mycobacterium (21)

The putative tail fiber genes are a further recombination hotspot Multiple alignment of these proteins revealed a pattern of conservation and non-conservation reminiscent of the mechanism with which T-even phages create host range diversity in their adhesins

(42) Further similarities with T-even phage adhesins were glycine-nch segments in the conserved protein regions Tetart et al (42) described oligoglycine stretches m T4, while

Desiere et al (17) reported collagen-like glycine-X-Y repeats m S thermophilus phages

We speculate that these elements play a role in the recombinational reshuffling of the putative phage anti-receptors Similar collagen-like repeats were observed in

Lactococcus phage BK5-T (3) During serial passage in the laboratory BK5-T and Sfi21 phages showed spontaneous deletions which originated in the collagen-like repeats

(4,10) The conserved C-terminus of the putative S thermophilus phage anti-receptor was also detected in a corresponding protein from the S pneumoniae phage Dp-1 (17)

Since this protein segment gave a strong coiled coil prediction, it could represent an important structural determinant for the assembly of a fibrous protein

The prominent role of recombination in the diversification of bacteriophage genomes 64

has until now prevented the assessment of phylogenetic relationships between the different phage groups In phage modules, such as the morphogenesis gene cluster, which do not allow exchange reaction, we have observed evidence for the role of point mutations in the evolution of temperate Siphoviridae from low GC content Gram-positive bacteria With an increasing database of phage sequences it might therefore become possible to delineate evolutionary relationships at least for selected phage modules The prominent role of recombination has also complicated all taxonomical approaches to phages The International Committee for the Taxonomy of Viruses (ICTV) defined a virus species "as a polythetic class of viruses that constitutes a replicating lineage and occupies a particular ecological niche (35) According to this definition all S thermophilus phages investigated in this report should belong to the same phage species since they share extensive DNA homology indicative of a replicative lineage and they infect a single bacterial species from a defined dairy environment In our opinion this definition has the advantage of biological plausibility The ICTV definition of a virus genus is vague « a virus genus is a group of species sharing certain common characters »

(35) Maniloff and Ackermann (34) have delineated a taxonomy of bacterial viruses where they have tentatively defined six viral genera in the family Siphoviridae Two criteria were used to define the tailed virus genera properties related to DNA replication and packaging, and specific features such as the ability to establish a temperate infection or bacterial host According to this proposition it could be argued that cos-site and pac-site S thermophilus phages represent distinct and new phage genera of

Siphoviridae (if the criteria from Maniloff and Ackermannn (34) are somewhat stretched, cos-site S thermophilus phages could be grouped with the genus "A,-like phages") The attribution of S thermophilus phages into two genera seems counterintuitive to us

According to the concept of Maniloff and Ackermann (34) many more new Siphoviridae genera will be created as more Siphoviridae from undennvestigated bacterial groups will be characterized We remind that less than 1 percent of the bacteria from the natural environment can be cultivated and that viruses are the most common biological agents for example in the sea (22) The ICTV rules state that it is not obligatory to use all levels of the taxonomical hierarchy (35) We propose to renounce the attribution ofSiphovmdae genera for the moment until we dispose of a reasonable sequence database for bacterial viruses In view of the current ease of sequence acquisition a coordinated international effort to obtain whole genome sequences for available phages from all major classes of 65

bacteria is warranted

Acknowledgements

We thank the Swiss National Science Foundation for the financial support of S

Lucchini and F Desiere in the framework of its Biotechnology Module (Grant 5002-

044545/1) and S Foley for critical reading of the manuscript

References

1 Altschul, S.F., W. Gish, W. Miller, E.W. Myers, and D.J. Lipman. 1990 Basic local alignment search tool J Mol Biol 215 403-410

2 Botstein, D. 1980 A theory of modular evolution for bacteriophages Ann N Y Acad Sei 354,484-491

3 Boyce, J.D, B.E. Davidson, and Hillier, A.J. 1995 Sequence analysis of the Lactococcus lactis temperate bacteriophage BK5-T and demonstration that the phage DNA has cohesive ends Appl Environ Microbiol 61,4089-4098

4 Boyce, J.D, B.E. Davidson, and Hillier, A.J. 1995 Spontaneous deletion mutants of the Lactococcus lactis temperate bacteriophage BK5-T and localization of the BK5-TatfPsite Appl Environ Microbiol 61,4105-4109

5 Brüssow, H. 1999 Phages of Streptococcus thermophilus In Encyclopedia of Virology R Webster and A Granoff (editors) Second Edition, Academic Press, London

6 Brüssow, H., and A. Bruttin. 1995. Characterization of a temperate Streptococcus thermophilus bacteriophage and its genetic relationship with lytic phages Virology 212, 632-640

7 Brüssow, H., A. Bruttin, F. Desiere, S. Lucchini, and S. Foley. 1998 Molecular ecology and evolution of Streptococcus thermophilus bactenophages-a review Virus Genes 16, 95-109

8 Brüssow, H., M. Fremont, A. Bruttin, J. Sidoti, A. Constable, and Fryder, V. 1994 Detection and classification of Streptococcus thermophilus bacteriophages isolated from industrial milk fermentation Appl Environ Microbiol 60, 4537-4543

9 Brüssow, H., A. Probst, M. Fremont, and J. Sidoti. 1994 Distinct Streptococcus thermophilus bacteriophages share an extremely conserved DNA fragment Virology 200, 854-857 66

10 Bruttin, A., and H. Brüssow. 1996 Site-specific spontaneous deletions m three genome regions of a temperate Streptococcus thermophilus phage Virology 219, 96-104

11 Bruttin, A., F. Desiere, N. d'Amico, J.P. Guérin, J. Sidoti, B. Huni, S. Lucchini, and H. Brüssow. 1997 Molecular ecology of Streptococcus thermophilus bacteriophage infections in a cheese factory Appl Environ Microbiol 63,3144- 3150

12 Bruttin, A., F. Desiere, S. Lucchini, S. Foley, and H. Brüssow. 1997 Characterization of the lysogeny DNA module from the temperate Streptococcus thermophilus bacteriophage phi Sfi21 Virology 233, 136-148

13 Bujarski, J.J., and P. Kaesberg. 1986 Genetic recombination between RNA components of a multipartite plant virus Nature 321 528-531

14 Casjens, S., G. Hatful, and R. Hendrix. 1992 Evolution of dsDNA tailed bacteriophage genomes Semin Virol 3 383-397

15 Corpet, F. 1988 Multiple sequence alignment with hierarchical clustering Nucleic Acids Res 16 10881-10890 ,

16 Desiere, F., S. Lucchini, A. Bruttin, M.-C. Zwahlen, and H. Brüssow. 1997 A highly conserved DNA replication module from Streptococcus thermophilus phages is similar in sequence and topology to a module from Lactococcus lactis phages Virology 234, 372-382

17 Desiere, F., S. Lucchini, and H. Brüssow. 1998 Evolution of Streptococcus thermophilus bacteriophage genomes by modular exchanges followed by point mutations and small deletions and insertions Virology 241, 345-356

18 Desiere, F., S. Lucchini, and H. Brüssow. 1999 Comparative sequence analysis of the DNA packaging, head and tail morphogenesis modules in the temperate cos-site Streptococcus thermophilus bacteriophage <|>Sfi21 Virology 260, 244-253

19 Foley, S., S. Lucchini, M.-C. Zwahlen, and H. Brussow. 1998 A short noncoding viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus Virology 250, 377-387

20 Foley, S., A. Bruttin, and H. Brussow. 1999 Widespread distribution of a group I intron and its three deletion derivatives in the lysin gene of Streptococcus thermophilus bacteriophages J Virol in press

21 Ford, M.E., GJ. Sarkis, A.E. Belanger, R. Hendrix, and G.F. Hatful. 1998 Genome structure of mycobactenophage D29 Implications for phage evolution J Mol Biol 279, 143-164

22 Fuhrman, J.A. 1999 Marine viruses and their biogeochemical and ecological effects Nature 399, 541-548 67

23 Guédon, G., F. Bourgoin, M. Pébay, Y. Roussel, C. Colmin, J.M. Simonet, and B. Decaris. 1995 Characterization and distribution of two insertion sequences, IS1191 and IS0-IS981, in Streptococcus thermophilus Does intergenic transfer of insertion sequences occur in lactic acid bacteria co-cultures ? Mol Microbiol 16, 69-78

24 Hahn, CS., S. Lustig, E.G. Strauss, and J.H. Strauss. 1988 Western equine encephalitis virus is a recombinant virus Proc Natl Acad Sei USA 85, 5997-6001

25 Hendrix, R., M.C.M. Smith, R.N. Burns, M.E. Ford, and G.F. Hatfull. 1999 Evolutionary relationships among diverse bacteriophages and prophages All the world's a phage Proc Natl Acad Sei USA 96 2192-2197

26 Highton, P.J., Y. Chang, and R.J. Myers. 1990 Evidence for the exchange of segments between genomes during the evolution of lambdoid bacteriophages Mol Microbiol 4, 1329-1340

27 Huang, X., and W. Miller. 1991. A time-efficient, linear-space local similarity algorithm Advances in Applied Mathematics 12, 337-357

28 Li, K., Z. Chen, and P. Plagemann. 1999 High-frequency homologous genetic recombination of an artenvirus, lactate dehydrogenase-elevating virus, in mice and evolution of neuropathogenic variants Virology 258, 73-83

29 Lipman, D.J. and W.R. Pearson. 1985 Rapid and sensitive protein similarity searches Science 227, 1435-1441

30 Le Marrec, C, D. van Sinderen, L. Walsh, E. Stanley, E. Vlegels, S. Moineau, P. Heinze, G. Fitzgerald, and B. Fayard. 1997 Two groups of bacteriophages infecting Streptococcus thermophilus can be distinguished on the basis of mode of packaging and genetic determinants for major structural proteins Appl Environ Microbiol 63, 3246-3253

31 Lucchini, S., F., Desiere, and H. Brussow. 1998 The structural gene module in Streptococcus thermophilus bacteriophage phi Sfi 11 shows a hierarchy of relatedness to Siphoviridae from a wide range of bacterial hosts Virology 246, 63- 73

32 Lucchini, S., F. Desiere, and H. Brussow. 1999 The genetic relationship between virulent and temperate Streptococcus thermophilus bacteriophages Whole genome comparison of cos-site phages Sfi 19 and Sfi21 Virology 263 427-435

33 Makino, S., J.G. Keck, S.T. Stohiman, and M.M.C. Lai. 1986 High-frequency RNA recombination of murine coronaviruses J Virol 57, 729-737

34 Maniloff, J., and H.-W. Ackermann. 1998 Taxonomy of bacterial viruses establishment of tailed virus genera and the order Caudovirales Arch Virol 143, 2051-2063 68

35. Mayo, M.A. 1996 Recent revisions of the rules of and nomenclature Arch Virol 141,2479-2484

36 Mercenier, A. 1990 Molecular genetics of Streptococcus thermophilus FEMS Microbiol Rev 87,61-78

37 Mikkonen, M., L. Dupont, T. Alatossava, and P. Ritzenthaler. 1996 Defective site- specific integration elements are present in the genome of virulent bacteriophage LL- H of Lactobacillus delbrueckii Appl Environ Microbiol 62, 1847-1851

38 Moore, D.D., K.J. Denniston, and F.R. Blattner. 1981 Sequence organization of the origins of DNA replication in lambdoid coliphages Gene 14, 91-101

39 Neve, H., K.I. Zenz, F. Desiere, A. Koch, K.J. Heller, and H. Brüssow. 1998 Comparison of the lysogeny modules from the temperate Streptococcus thermophilus bacteriophages TP-J34 and Sfi21 Implications for the modular theory of phage evolution Virology 241, 61-72

40 Stanley, E., G.F, Fitzgerald, M.C. LeMarrec, B. Fayard, and D. van Sinderen. 1997 Sequence analysis and characterization of phi 01205, a temperate bactenophage infecting Streptococcus thermophilus CNRZ1205 Microbiology 143, 3417-3429

41 Strauss, E.G., J.H. Strauss, and A.J. Levine. 1996 Virus evolution In "Virology" N D M and P M Eds 3rd ed 153-171 Raven (B Fields, Knipe, Howley ), , pp Press, New York

42 Tétart, F., C. Desplats, and H. Krisch. 1998 Genome plasticity in the distal tail fiber locus of the T-even bacteriophage recombination between conserved motifs swaps adhesin specificity J Mol Biol 282 543-556

43 Tolskaya, E.A., L.A. Romanov, M.S. Kolesnikova, and V.l. Agol. 1983 Intertypic recombination in poliovirus genetic and biochemical studies Virology 124, 121-132

44 Tremblay, D., and S. Moineau. 1999 Complete genomic sequence of the lytic bacteriophage DT1 of Streptococcus thermophilus Virology 255, 63-76

45 Thompson, J.D., D.G. Higgins, and T.J. Gibson. 1994 CLUSTAL W improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice Nucleic Acids Res 22, 4673-4680

46 Turner, P.E., and L. Chao. 1999 Prisoners dilemma in an RNA virus Nature 398, 441-443

47 Weaver, S.C., W. Kang, Y. Shirako, T. Rùmenapf, E.G. Strauss, and J.H. Strauss. 1997 Recombinational history and molecular evolution of Western equine encephalomyelitis complex J Virol 71 613-623 69

48. Webster, R.G., W.G. Laver, G.M. Air, and G.C. Schild. 1982. Molecular mechanisms of variation in influenza viruses. Nature 296, 115-121. 70 71

2.4 Similarly organized lysogeny modules in temperate Siphoviridae from low GC content Gram-positive bacteria

(Virology 263, 427-435 (1999)) 72

Similarly Organized Lysogeny Modules in Temperate Siphoviridae from low GC content Gram-Positive Bacteria (Virology 263 427-435(1999)) Sacha Lucchini, Frank Desiere and Harald Brussow

Nestlé Research Centre, Nestec Ltd Vets-chez-les-Blanc CH-1000 Lausanne 26, Switzerland

Abstract

Temperate Siphoviridae from an evolutionary related branch of low GC content Gram-positive bacteria share a common genetic organization of lysogeny-related genes and the predicted proteins are linked by many sequence similarities Their compact lysogeny modules (mtegrase/1-2 orfs [phage exclusion'? and metalloproteinase motif proteins]/cl-like repressor/ cro-like repressor/ anti-repressor [optional]) differ clearly from that of X-like and L5-hke viruses, the two currently established genera of temperate Siphoviridae, while they resemble those of the P2-like genus of Myoviridae In all known temperate Siphovindae from low GC content Gram-positive bacteria the lysogeny module is flanked by the lysis module and the DNA replication module This modular organization is again distinct from that of the known genera of temperate Siphovindae On the basis of comparative sequence analysis we propose a new genus of Siphoviridae «Sfi21-hke» phages With a larger database of phage sequences it might be possible to establish a genomics-based phage taxonomy and to retrace the evolutionary history of selected phage modules or individual phage genes The anti-repressor of Sfi21-hke phages has an unusual wide-spread distribution since proteins with high aa similarity (40 %) were not only found in phages from Gram-negative bacteria, but also in insect viruses

Introduction

Principally all cellular organisms are susceptible to viral infection, often by more than one type of virus A given type of virus usually has a restricted host range, often a single species In theory this implies that viruses are probably the most diverse creatures on

Earth Viruses might also be the most abundant biological objects typically exceeding the bacterial count in uncultured environmental samples (Fuhrman, 1999) About 4500 bacterial viruses have been examined in the electron microscope Three quarters are from just 12 bacterial genera (Ackermann, 1996) If one considers that only a tiny fraction of bacteria can presently be cultivated, it becomes clear that we have a very incomplete picture of the diversity of bacterial viruses

The International Committee on Taxonomy of Viruses (ICTV) has developed a taxonomic system for bacterial viruses Notably, 96 per cent of all described phages are tailed phages (new order Caudovirales) Three families are distinguished in this order

Myovnidae (viruses with contractile tails), Siphovindae (viruses with long, noncontractile tails) and Podovindae (viruses with short noncontractile tails) At present six genera of phages have tentatively been defined in Myoviridae (e g the genus of "P2-hke viruses"), six in Siphovindae (e g the genus of A-like viruses") and three in Podovindae (e g the genus of "P22-like viruses") The genera were defined based on properties involving viral DNA replication, packaging and some specific features (e g genome size temperate/ lytic life style, Maniloff and Ackermann, 1998)

A contentious issue is whether these taxonomic units delineate natural phage groups

(Ackermann, 1999) The Podovirus P22 ( a Salmonella phage), for example, showed a strikingly similar genome organization to the Siphovirus X (an E coll phage), both phages were considered members of the lambdoid phage group (Casjens et al, 1992) and indeed viable hybrid phages between P22 and X have been described (Gemski et al, 1972) The attribution of phages into different phage families on the sole basis of a distinct morphology seems therefore biologically implausible

Comparative sequence analysis should theoretically shed some light on the phylogenetic relationships between phages However, two facts limit the use of this straightforward method On the one hand many phages encode recombination proteins and are thus very recombinogenic To account for this fact the modular theory of phage evolution (Botstein, 1980) states that the units of evolution are not individual phage genomes, but groups of functionally related genes (modules) which can relatively freely be exchanged between phages differing otherwise in many respects On the other hand, only few phage genomes have been sequenced and even less comparative phage genomics studies have been conducted (Hendrix, 1999)

Previously, we conducted comparative and evolutionary analyses for Streptococcus

et 1998 Lucchini et al a thermophilus phages (Desiere al, 1999, , 1998, 1999a, b), group of phages of economical importance to the dairy industry (Brussow et al, 1998)

Notably, over the morphogenesis module the degree of sequence similarity between S thermophilus phages and phage sequences from the database correlated with the evolutionary distance of their bacterial hosts The Lactococcus lactis phage BK5-T

(Boyce et al, 1995) showed still nucleotide sequence similarity with the S thermophilus phage Sfi21 (Desiere et al, 1999) Phages from more distantly related low GC content

Gram-positive bacteria like Bacillus and Lactobacillus showed only sequence similarity at the ammo acid level with S thermophilus phages (Lucchini et al 1998, Desiere et al, 74

1999) Finally, over the morphogenesis genes S thermophilus phage Sfi21 showed a very similar gene map as lambdoid coliphages, but amino acid sequence similarity was no longer detected The observation of such a hierarchy of relatedness is the hallmark of any biological system undergoing evolutionary changes

Have Siphovindae an evolutionary history that can partially be deduced by comparative sequencing? To address this question we extended our comparative analysis to another phage module from Siphovindae that is under different functional contraints than the morphogenesis genes The structural proteins show intensive protein-protein interactions during morphogenesis In contrast, phage proteins implicated in the control of the virulent / lysogénie lifestyle decision (integrase excisionase, CI and

Cro repressor) interact competitively with target DNA segments while showing relatively few protein-protein interactions Our database analysis of the lysogeny modules revealed again a close relationship between temperate Siphovindae infecting low GC content

Gram-positive bacteria with respect to overall genetic organization and sequence similarity Their lysogeny modules differed clearly from that of other genera of temperate

Siphovindae, while similarly organized lysogeny modules were found in P2-like

Myoviridae We discuss the implications of these observations for a sequence-based theory of bacteriophage evolution and phage taxonomy

RESULTS

Comparable lysogeny modules are found in Siphoviridae from low GC content

Gram-positive bacteria

Probably the best characterized lysogeny module from low GC content Gram-positive phages is that of S thermophilus phage Sfi21 (Fig 1A and Table 1) From the database we retrieved DNA sequences from temperate Siphovindae and localized their putative lysogeny-related genes Bioinformatic analysis revealed similarly organized lysogeny modules in Siphovindae from five evolutionary related bacterial genera Streptococcus

(n-3) Lactococcus (n=3), Lactobacillus (n=3) Staphylococcus (n=1) and Bacillus (n=1) 75

gleist ! f "—V 'l

adhli [ s

Sfi21

TP-J34

01205

TP901 1

BK5T

r1t

A2

S pyogenes^

105,

PVL

B HP1^s K1394i

S24

C LS l ' ) ^ l| . \ i) ». i t IS 1111!

X 1 kb

jgg integrase i mpm d4l

cIHke/ 1 u lknow i c o- ke "1 mranty antirepressor

IMIL JT i J D 1 kb

5 6

Ht t-ttt—rrr

BK5-T

63 79 266

Figure 1. Comparative genetic organization of the lysogeny related genes in selected temperate phages A) Siphovindae infecting divers genera of low GC Gram positive bacteria Lactobacillus (g1e, adh, A2) Streptococcus (Sfi21 TP-J34 O1205 prophage 296) Lactococcus (TP901-1 BK5-T r1t) Bacillus (phi-105) and Staphylococcus (PVL) B) P2-hke genus of Myoviridae infecting the Gram negative bacteria Vibrio cholerae (K139) and Hemophilus influenzae (S2, HP1 ) C) For comparison the organization of lysogeny related genes is given for the L5-like genus and the X-hke genus of Siphovindae infecting Mycobacterium tuberculosis (a high GC content gram-positive bacterium) and Escherichia coil (a gram-negative bacterium) respectively The genomes are aligned at the left by their integrase genes Gene functions attributed by bioinformatic analysis or biological experiment are color-coded as indicated in the figure legend (mpm metalloproteinase motif (Jongeneel et al 1989)) The length of the arrow is proportional to the length of the predicted open reading frame Note the change in scale in Fig 1C (reduced) D) Alignment of the putative lysogeny modules and the adjacent region in L lactis r1t and BK5-T The predicted open reading frames are color-coded (see Fig 1C) and are annotated as in their original publication (see table 2 for the references) The colored bars give the percentage of nt sequence identity between the two compared sequences as defined in the color code at the top of the figure The figure was created with the SIM Alignment tool (Huang and Miller 1991) and LALNview (Duret et al 1996) 76

(Fig. 1A). The compact organization of lysogeny-related genes in this group of phages differed clearly from the genetic organization of lysogeny-related genes in

Siphoviridae from high GC content gram-positive bacteria (L5-like genus) and from

Gram-negative bacteria (X-like genus). In the£ coli phage X fourteen genes separate the integrase/ excisionase genes from the immunity/ genetic switch region and in the mycobacteriophage L5 thirty-five orfs separated the integrase gene from the repressor gene and no genetic switch region was observed (Fig. 1C).

Orf Likely gene function Evidence Comments Ref. 359 integrase Bioinformatics, Integration vector Bruttin et al., 1997a,b cloning and expression 203 Superinfection cloning and Phage resistance Bruttin et al., 1997b exclusion expression 122 ? Metalloproteinase motif Neve etal., 1998

127 Cl-like repressor Bioinformatics, Phage resistance, genetic Bruttin etal., 1997 b, cloning and switch binding, H-T-H motif Foley et al., expression unpublished

75 Cro-like repressor Bioinformatics H-T-H motif Bruttin et al., 1997 b 287 Anti-repressor Bioinformatics H-T-H motif Bruttin and Brüssow, 1996 ; Neve etal., 1998

93 ? weak excisionase motif Boccard et al., 1989

Table 1: The genes from the lysogeny module of S thermophilus phage Sfi21.

Temperate phages are also found in other groups of tailed phages (P1-like, P2-like,

Mu-like, (bH-like genera of Myoviridae and P22-like genus of Podoviridae). Interestingly, the organization of the lysogeny genes of P2-like Myoviridae which infect diverse Gram- negative bacteria is very similar to that of temperate Siphoviridae from low GC content

Gram-positive bacteria (Fig. 1B, Table 3 provides a compilation of the phages investigated in the current report). For example, the Vibrio cholerae phage K139 showed the gene order : attP- int- glo (superinfection exclusion)-orf 2-cl-cox- (a combination of a cro-like transcriptional and a x/s-like recombinational switch; Esposito and Scocca,

1994)-c// (Nesper et al., 1999; Fig. 1B). In comparison, S. thermophilus phage Sfi21 showed the gene order : attP-int-imm (superinfection exclusion)-mpm (metalloproteinase motif protein)-c/-cro-an/(antirepressor) (Fig. 1A). 77

Modular exchanges in the lysogeny modules of lactococcal phages

Apart from the three temperate S. thermophilus phages (Neve et al., 1998), only the lysogeny modules from the two Lactococcus lactis phages r1t and BK5-T could be aligned at the nucleotide level (Fig. 1D). As previously demonstrated forS. thermophilus phages (Neve et al., 1998), the characteristic feature of this alignment was the alternation of regions with high DNA sequence identity with DNA segments showing only low or no sequence identity (Fig. 1D). The int gene, part of the repressor gene and a possible ant gene homologue (see below) were highly conserved at the DNA level.

Except for the repressor gene, the transition zones of DNA conservation to non- conservation coincided precisely with gene borders (Fig. 1D and 2). Over the right border of the lysogeny module, the phage r1t DNA can formally be regarded as a product of recombination between phages BK5-T and TP901-1 and at least one further phage (Fig. 2). Three possible crossover points were identified in a 2 kb stretch of phage

Mt DNA. The very high DNA sequence similarity indicated that BK5-T and TP901-1 are very near to the authentic phages which contributed DNA to r1t. 78

The lysogeny modules are located between the lysis and the DNA replication modules

For nine out of the twelve Siphovindae presented in Fig. 1A sequence information was provided for the regions flanking the lysogeny modules. Downstream of the integrase gene, separated by zero to six orfs a lysin gene was identified by the different authors (Fig 3, see Table 3 for the original references). On the other border, the lysogeny module was flanked by a likely DNA replication module. One to eight orfs downstream of the cro-like repressor gene the different researchers identified genes which predicted proteins that showed bioinformatic links to DNA replication functions

(Fig. 3). The genes encoded a putative helicase (g1e), a putative replication initiation and

DnaC-like protein (rit), a putative single stranded DNA binding protein (TP901-1) or proteins with similarity to predicted proteins from the DNA replication modules of phage

Sfi21 (Desiere et al., 1997, Foley et al, 1998) or phage Mt (van Sinderen et al., 1996)

(BK5-T, PVL),

l« ^ \ - oi205 rtjuaMf s-* i J. j> —i i ^rrV L Tn'ccmi

Sfi2i i l>1 ' <~ ' -J ^~ ~^_ iv M:è4«»s»è 0< — i 'i UL^Lcnrrp

' ' K 31e >}) tv) i yf„ v\ „^ tzzo ^jld\ i 'm_,i t

— adh lU^KHHH^ I,' ZL i

1j< BK5-T iimmmmmm zzzzi , _^ c __\_ » n Vrttn

~ - rit i ~i l uJBiawe f v^ »);, tV ; nTjLrrrr)

TP901-1 1- ^r— -1A: y, i r'lV-V- \J) | " ^ ' t»in-genes_

v- s pvl •^HiiwunH^ •> _i —i— —x i N[s^ ) f K{ '• i |Yj rcq

~ il v ' phi-105 hjVübe^ '['M N ),\i in rmrmmJ>

mg^^ hosl îysispn-1 rspication no information ^ sequence unknown lysogeny ^

Figure 3. The lysogeny module of Streptococcus (O1205, Sfi21), Lactobacillus (g1e, adh), Lactococcus (BK5-T, rit, TP901-1), Bacillus (phi-105) and Staphylococcus (PVL) phages are flanked by the lysis and the DNA replication module The putative lysogeny related genes are shown with open arrows Genes attributed by bioinformatic analysis to the lysis module (H holm, L. lysin) are shown with black arrows, genes attributed to likely DNA replication functions (see text) are shown as striped arrows Genes which cannot be attributed to a functional module are shown with gray arrows I integrase, M metalloproteinase motif protein, R repressor, C Cro-like repressor, A antirepressor 79

Sequence similarities between the lysogeny modules from Siphoviridae of low GC content Gram-positive bacteria

At the protein level many sequence similarities linked the lysogeny modules from

Siphovindae infecting low GC content Gram-positive bacteria. Where significant matches

(P values < 1Q"3) to the database were found, the best matches were always to proteins from other members of this phage group.

All but one of the phage integrases were evolutionary related. A phylogentic tree was previously presented (Bruttin et al., 1997a) The integrase from phage TP901-1

(Christiansen et al., 1996) belongs to a different class of enzymes (resolvases), its closest relative was found in the Bacillus cereus phage TP21 and phage Mu.

Proteins with the metalloproteinase motif were detected in Siphoviridae from

Streptococcus, Lactococcus, Bacillus and Lactobacillus (data not shown). The sequence similarities were not limited to this motif, but were found over the entire protein sequence.

The sequence similarity defined two subtypes of Cl-iike repressor proteins (Table 2).

One subtype was represented by the phage TPJ-34 repressor, the other by the phage

TP901-1 repressor The repressor from phage 01205 linked both subgroups, as well as two outlyers (Table 2). A multiple alignment was possible over the entire length of the three repressor proteins

adh Sfi21 TPJ34 AZ 01205 TP901 BK5-T r1t 296 PVL g1e 105 dh fi21 6

PJ34 7 22 2 12 3 1205 * 5 9

P901 22 K5-T 8

1t 4 109

** 96 4 4 5 18 VL 20 32 8 11 17 1e 4 05 ***

Table 2: Sequence relationships between the putative repressor proteins of the indictated Siphoviridae from low GC content Gram-positive bacteria Note See Table 2 for a key to the phage strains The figures are the negative decadic logarithms of the P values from Blast

* ** *** searches (e g 6 means P=106) 25 identical out of 102 aa positions, P=0 07, 31 identical out of 129 aa positions 80

The Cro-like repressors tended to be more diverse with relatively distinct Sfi21,

01205 and BK5-T phage proteins (data not shown) Downstream of the cro gene, a putative antirepressor gene (Neve et al 1998) was detected in three streptococcal phages (Sfi21, TP-J34, 296) and the staphylococcal phage PVL At a corresponding position a distinct gene was detected in two Lactococcus phages (BK5-T, r1t) and one

Lactobacillus phage (LL-H), while a hybrid gene was seen m Lactobacillus phage A2 that linked both subgroups In fact, the A2 protein shared its N-terminal half with the BK5-T protein and its C-terminal half with the TP-J34 anti-repressor (Fig. 4).

fiu-HH422 wmÊmmÊmmÊmmÊSÊmKÊmm 11 n n i m i i 11111 m i

~ 296 wmimmÊmKmÊmfmtamÊmÊL —______rrz

~ tpj-34 wmKtmKiimamÊmÊaaaam. : —i

BK5-T/r1t r77y7zZZ7777Z7////7'/'/ZZZ7WÊ

N15 /Z/ZZZZZZZZZZZZZ77777Ä>^^

fIu-HI1418 7/Z7Z/ZZZZZZZZZZZZ777 I I 11 LI I I II I I I I11TT

" " — H-19B »l*!?«, -T : -\

"~ —' SPBc2 r rSKïii J -«'Sf"«,. '

—— pi mmÊmaÊÊÊÊÊÊÊÊÊMÊÊÊÊÊÊÊÊmm^mmÊÊÊÊm ——i

Figure 4. Comparison of the putative antirepressor genes detected in phages from Streptococcus (TP-J34), Staphylococcus (PVL) Escherichia (coliphages P1 N15 H-19B), Lactococcus (BK5-T) Lactobacillus (A2 a stop codon was changed) or prophages (Haemophilus influenzae flu B subtilis SPBc2 Streptococcus pyogenes 296) Gene products are depicted by boxes and drawn to scale Domains having amino acid similarity are shown by identical shading within the boxes

This observation suggests a two-domain structure for these proteins and the possibility of recombination between these protein domains Proteins which shared either the N- or the C-terminal halves of these hypothetical anti-repressors were detected in phages and prophages from Gram-negative bacteria (coliphages P1, N15, H-19B,

Haemophilus influenzae prophage flu) (Fig 4) Overall these phage proteins defined four distinct C-terminal domains (TPJ-34-like and BK5-T-like [linked by the B subtilis prophage SPBc2 domain] d)flu-like N15-hke domain) and four clearly distinct N-terminal domains (TPJ-34-like BK5-T-like H-19B-like, P1-like) Notably, the evolutionary 81

distribution of one domain was even wider : proteins with sequence similarity to the N- terminal half of the BK5-T protein were also found in several different families of eukaryotic invertebrate viruses ( : orf 2 protein from Autographa californlca nuclear polyhedrosis virus, P=10"7; Poxvihdae : MSV194 protein from Melanoplus sanguinipes entomopoxvirus, P=10'22; Ihdoviridae : 011L protein from Chilo iridescent virus, P=10"4). The proteins belong to a new uncharacterized gene family of insect viruses (Afonso et al., 1999). The entomopoxvirus protein shared an astonishing 40 % aa identity with the N-terminal domain of the BK5-T protein (Fig. 5).

In comparison the degree of aa identity between proteins from temperate

Siphoviridae infecting different genera of low GC content Gram-positive bacteria did not exceed 33 % for the integrases, 40 % for the CI and Cro proteins and 56 % for the

Ant proteins.

BK5-T |ne|qn{|n 'N~LfflTiSIND Pox MDffiDNLI N-KKIHIAIYEj" A2 |ne|qh KG-RQJ^TJgvgJD! Hae mgklsilskrkytmknqiqfst JHäiV^iDPKGL Bac maevefaqdmnavhvatndvankpsqvkmge )GE - DVF|LRYffiL|DPVK N15 jgKAjgSVj |QESH|l|vggjJGG-D|

BK5-T IggKNFRl - - -ISYraRESRITggSgvgsVT ISEPGLY 7R Pox ÏEWKDTNIS Is|yedlinr|gilpsltynekntiy|iseHgly 8 3 A2 gSKPANj aF§GVTKLMÏpG|jê|K|S|DFV 75 Hae ÏTNSRK iQDgCKQ GGVTKRYTPgjKS7 gjMEMT FiNHiN" 92 Bac iKgQDAKRJ iriSsdSkykstfehgeirshlasnalakqS BpLYLHP--HTVlHtKEB 117 NI 5 JIANPSB RgJLDH] -EKLTLGLTEAQKLgRMAREVN- 79

BK5-T SL|S|EP|QDpYJ EVI.IÄDTLINIATQL 13 0 Pox TLTT|FQNLLTLKEQE 13 8 A2 -DM |n|deItdIk KAIYIgDFIINLATQL 13 0 Hae -RK |e|ep3EaB| QLQPQQLALPEPEKKFSFEFTEY 14 7 Bac |YgVELQA|L: PAIAEgSILR ISO NIB T lgLRCRDAVKQGTT|wR|RKgjj' FVEPEPKNAGE|lDWRQKEELR 13 9

Figure 5. Multiple alignment of the N-terminal half of the antirepressor proteins of Siphoviridae from Lactococcus lactis phage BK5-T, Lactobacillus casei phage A2, E. coli phage N15 prophage flu from Haemophilus influenzae (for all see Fig. 4) with two insect viruses (Baculovirus from Lymantria dispar [orf 161 gp] and Poxvirus from Melanoplus sanguinipes [MSV194 gp]). Amino acid positions which are identical in at least three proteins were shaded. 82

Comparison with the lysogeny module from P2-like Myoviridae

The proteins encoded by the lysogeny modules of the P2-like genus of Myoviridae were also linked by many sequence similarities Haemophilus influenzae phages HP1 and S2 are highly related Their lysogeny modules could be aligned resulting in regions with high DNA sequence identity interspersed with DNA segments showing only low or no sequence identity (Skowronek and Baranowski 1997) Their Int and CI proteins showed also significant similarity to£ coli phage P186 and V cholerae phage K139, two further Myoviridae of the P2 group The latter two phages shared a second type of Cox and CM protein.

The gene upstream of the integrase showed no similarity between the four P2-like phages The same is also true for temperate Siphovindae of low GC content Gram- positive bacteria Interestingly, the S2 phage protein showed detectable aa similarity with the corresponding protein from r1t (P=104) and in both cases a hydrophobic N-terminus was predicted Recently, Nesper et al (1999) have revealed a structural similarity between the genes upstream of the integrase in the Myoviridae K139 and the

TPJ-34 This encodes both in the K139 et al Siphovindae gene Myovirus (Nesper , 1999) and in the Siphovirus Sfi21 a superinfection exclusion function (Bruttin et al, 1997a) In addition, all integrases of the P2 phage group shared weak sequence similarity (about

25% aa identity over the C-terminal halves P=105) with the integrases of Siphovindae from low GC content Gram-positive bacteria

Discussion

Temperate Siphovindae from low GC-content Gram-positive bacteria share not only related morphogenesis modules, but as demonstrated in the current communication also related lysogeny modules In fact all currently known temperate Siphovindae from this evolutionary related group of bacteria showed an identical overall genome organization with the following modular organization DNA packaging -> head morphogenesis -> tail morphogenesis -> tail fiber morphogenesis -> lysis -> lysogeny -> DNA replication and an unattnbuted, possibly regulatory module (Desiere et al, manuscript in preparation)

This common overall organization and the peculiar organization of the lysogeny-related genes differentiates this group of temperate Siphovindae from the two currently 83

established genera of temperate Siphovindae (the A,-hke genus and the L5-hke genus,

Maniloff and Ackermann, 1998) Taxonomically one could propose a new genus

(« Sfi21-like phages ») to the current ICTV scheme of Siphovindae

The aim of taxonomy is the introduction of order into the biological diversity according to rational criteria The ultimate aim is that this taxonomical order reflects the natural phylogenetic relationships between the biological objects Bacteriophages present a dilemma with respect to this goal The modular theory of phage evolution predicts exchanges of functional gene segments between phages that differ otherwise in many respects (Botstein 1980) On the basis of the remarkable recombinogenic properties of phages, it has been doubted that phylogenetic relationships can be established for phages It is thus interesting to note that temperate Siphovindae from a group of evolutionary related bacteria are extensively linked by sequence relationships over two modules that are under different evolutionary constraints Actually, a preliminary analysis of further genome regions (lysis, DNA replication modules) demonstrated additional sequence relationships between temperate Siphovindae from low GC content Gram- positive bacteria Apparently, this group of phages shares at least to a certain extent a common evolutionary history It will now be important to extend the comparative analysis of phage genomes to other groups of bacterial viruses Do temperate Siphovindae from high GC content Gram-positive bacteria or from Gram-negative bacteria also constitute groups of phages with a common genetic organization suggestive of shared evolutionary histories?

Interestingly, distinct evolutionary affinities were revealed for the different modules from temperate Siphovindae infecting low GC content Gram-positive bacteria Over the morphogenesis module their closest relatives were A.-like Siphovindae, while over the lysogeny module their closest relatives were P2-like Myoviridae As in the case of the genetic similarities between the Siphovirus ) and the Podovirus P22, the relationship between Sfi21-like Siphovindae and the P2-hke Myoviridae blurrs the taxonomical distinction of phage families distinguished primarily on morphological grounds

A further contentious issue of phage genomics is the relationship of bacterial viruses to other biological entities A previous analysis of gene sequences from theE coli phage

T4 revealed relationships to both bacteria and eukaryotes (Bernstein and Bernstein,

1989) What can be deduced from our comparative sequence analysis of S 84

thermophilus phage genomes'? With very few exceptions (e g repressors) S

thermophilus phage genes demonstrated no sequence similarities with bacterial genes

In striking contrast clear links were observed between S thermophilus phages and

plasmids of the same bacterial species Previously we reported that two non-coding

phage regions (cos-site and origin of replication) showed sequence similarity to a cryptic

S thermophilus plasmid (Lucchini et al 1999a) A recently sequenced S thermophilus plasmid showed another link its putative primase showed 28 % aa identity over its entire length with a putative S thermophilus phage primase (P=1048, accession number

AJ242479 1) In contrast to T4, no S thermophilus phage genes showed closer links to eukaryotic than to prokaryotic genes Apparently S thermophilus phages derive their genes from other phage genomes and perhaps S thermophilus plasmids Interestingly, some weak, but significant links were also observed between S thermophilus phages and eukaryotic viruses e g a virus infecting the green algae Chlorella (Desiere et al,

1997) In the current report we describe a highly significant link between the hypothetical anti-repressor of staphylococcal, lactococcal and lactobacillar phages and at least two morphologically distinct classes of insect viruses The current database of phage algal and invertebrate viruses is too small to assess whether these links between bacteriophages and eukaryotic viruses are chance observations or reflect actual genetic relationships (common ancestry ancient modular gene transfers)

We have identified only about 35 complete phage genome sequences in the database (excluding prophage sequences from bacterial genome projects) This number is clearly insufficient to decide whether the evolutionary history of individual phage modules or even entire phage genomes can be deduced from comparative sequence analysis With the current ease of DNA sequence acquistion an international consortium interested in bacteriophage evolution should direct efforts to obtain phage sequences from a set of phages representing a wide evolutionary distribution of cultivatable bacteria Especially revealing could be random sequencing of DNA extracted from uncultivated tailed phages purified from environmental sources like ocean water and freshwater 85

Material and Methods

All phage DNA sequences analyzed in the present report were retrieved from the database Table 3 provide a list of the phage identifications, database accession numbers and references Nucleotide and predicted amino acid sequences were compared to those in the databases (GenBank Release 102, SWISS-PROT, Release 34 using FastA (Lipman and Pearson, 1985) and BLAST (Altschul et al, 1990) programs

Sequence alignments were performed using the CLUSTALW 1 6 method (Thompson et al, 1994), the Multalign program (Corpet, 1988) and the SIM alignment tool (Duret et al,

1996, Huang and Miller, 1991)

Phage Host (Phage Family) Ace no Reference Sfi21 X95646 Lucchini etal 1999 Streptococcus thermophilus (S) , TPJ-34 AF020798 Neve etal 1998 Streptococcus thermophilus (S) , 1205 Streptococcus thermophilus (S) U88974 Stanley etal, 1997

296 Streptococcus pyogenes (S)* Unfinished Genome Project TP901-1 Lactococcus lactis (S) Y14232 Madsen and Hammer, 1998 BK5-T Lactococcus lactis (S) L44593 Boyce etal, 1995 r1t Lactococcus lactis (S) U38905 Van Smderen et al, 1996 adh Lactobacillus gassen (S) Z97974 Engeletal, 1998 A2 Lactobacillus casei (S) Y12813 Alvarez etal, 1998

gie Lactobacillus spec (S) X98106 Kodaira etal, 1997

PVL aureus AB009866 Kanekoetal 1998 Staphylococcus (S) , 105 Bacillus subtilis (S) AB016282 unpublished L5 Mycobacterium smegmatis (S) Z18946 Hatfull and Sarkis, 1993 K139 Vibrio cholerae (M) AF125163 Nesper etal, 1999 S2 Haemophilus influenzae (M) Z71576 Skowronek and Baranowski, 1997 HP1 Haemophilus influenzae (M) U06847 Espositoetal, 1994

Table 3: List of phages used for the comparative genome analysis Notes Phage family (S), * from bacterial Ferretti Siphovindae , (M) Myoviridae prophage sequence genome project (J University of Oklahoma) # Incomplete phage sequence

Reference List

Ackermann, H W (1996) Frequency of morphological phage descriptions in 1995 86

Arch.Virol. 141,209-218.

Ackermann, H W (1999) Tailed bactenophages the order Caudovirales. Adv Virus Res 51, 135-201

C L E R Z E G F and The Afonso, , Tulman, , Lu, , Oma, , Kutish, , Rock, D.L(1999) genome of Melanoplus sanguinipes entomopoxvirus J Virol 73, 533-552

S F W W E W and D J Basic local Altschul, , Gish, , Miller, , Myers , Lipman, (1990) alignment search tool d Mol Biol 215, 403-410

M A M and J E The recombination Alvarez, , Herrero, , Suarez, (1998) site-specific system of the Lactobacillus species bacteriophage A2 integrates in gram-positive and gram-negative bacteria Virology 250 185-193

H and C T4 with Bernstein, , Bernstein, (1989) Bacteriophage genetic homologies bacteria and eucaryotes d Bactenol 171,2265-2270

F T J L A and M The Boccard, , Smokvina, , Pernodet, , Fnedmann, , Guerineau, (1989) integrated conjugative plasmid pSAM2 of Streptomyces ambofaciens is related to temperate bacteriophages EMBO d 8, 973-980

Botstein, D (1980) A theory of modular evolution for bacteriophages Ann N Y Acad Sei 354:484-90

J D B E and A J of the Boyce, , Davidson, , Hillier, (1995) Sequence analysis Lactococcus lactis temperate bacteriophage BK5- T and demonstration that the phage DNA has cohesive ends Appl Environ Microbiol 61, 4089-4098

Bruttin, A and Brussow, H (1996) Site-specific spontaneous deletions in three genome regions of a temperate Streptococcus thermophilus phage Virology 219, 96-104

A F S S and H Characterization Bruttin, , Desiere, , Lucchini, , Foley, , Brussow, (1997) of the lysogeny DNA module from the temperate Streptococcus thermophilus bacteriophage phi Sfi21 Virology 233, 136-148

A S and H The of the Bruttin, , Foley, , Brussow, (1997) site-specific integration system temperate Streptococcus thermophilus bacteriophage phiSfi21 Virology 237, 148-158

H A F S and S Molecular Brussow, , Bruttin, , Desiere, , Lucchini, , Foley, (1998) ecology and evolution of Streptococcus thermophilus bactenophages-a review Virus Genes 16, 95-109

S G F and R Evolution of dsDNA tailed- Casjens, , Hatfull, , Hendrix, (1992) bacteriophage genomes Sem Virol 3, 383-397

B L F K and K A Christiansen, , Brondsted, , Vogensen, , Hammer, (1996) resolvase-like protein is required for the site-specific integration of the temperate lactococcal bacteriophage TP901-1 d Bactenol 178, 5164-5173 87

Corpet, F (1988) Multiple sequence alignment with hierarchical clustering Nucleic Acids Res 16, 10881-10890

F F and H of the Desiere, , Lucchini, , Brussow, (1999) Comparative sequence analysis DNA packaging, head and tail morphogenesis modules in the temperate cos-site Streptococcus thermophilus bacteriophage Sfi21 Virology 260, 244-253

F S A M C and H A Desiere, , Lucchini, , Bruttin, , Zwahlen, , Brussow, (1997) highly conserved DNA replication module from Streptococcus thermophilus phages is similar in sequence and topology to a module from Lactococcus lactis phages Virology 234, 372-382

F S and Brussow H Evolution of Desiere, , Lucchini, , (1998) Streptococcus thermophilus bacteriophage genomes by modular exchanges followed by point mutations and small deletions and insertions Virology 241, 345-356

L E and Perrière G LALNVIEW a viewer for Duret, , Gasteiger, , (1996) graphical pairwise sequence alignments Comput Appl Biosci 12, 507-510

G E J and B Structure of a Engel, , Altermann, , Klein, , Henrich, (1998) genome region of the Lactobacillus gassen temperate phage phi adh covering a repressor gene and cognate promoters Gene 210 67-70

Esposito, D and Scocca, J J (1994) Identification of an HP1 phage protein required for site- specific excision Mol Microbiol 13, 685-695

S S M C and H A short viral Foley, , Lucchini, , Zwahlen, , Brussow, (1998) noncoding DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus Virology 250, 377-387

Fuhrman, J A (1999) Marine viruses and their biogeochemical and ecological effects Nature 399, 541-548

P J L S and Yamamoto N Formation of between Gemski, , Baron, , (1972) hybrids coliphage lambda and Salmonella phage P22 with a Salmonella typhimunum hybrid sensitive to these phages Proc Natl Acad Sei U S A 69,3110-3114

Hatfull, G F and Sarkis, G J (1993) DNA sequence, structure and gene expression of mycobacteriophage L5 a phage system for mycobacterial genetics Mol Microbiol 7, 395-405

R W M C R N Ford M E and G F Hendrix, , Smith, , Burns, Hatfull, (1999) Evolutionary relationships among diverse bacteriophages and prophages all the world's a phage Proc Natl Acad Sei U S A 96 2192-2197

X Q R C and W A for Huang, , Hardison, , Miller, (1990) space-efficient algorithm local similarities Comput Appl Biosci 6,373-381

C V J and A A identifies Jongeneel, , Bouvier, , Bairoch, (1989) unique signature a family of zinc-dependent metallopeptidases FEBS Lett 242, 211-214 88

T S and T nucleotide Kaneko, J., Kimura, , Nanta, , Tomita, (1998). Complete sequence and molecular characterization of the temperate staphylococcal bacteriophage PVL carrying Panton- Valentine leukocidin genes Gene 215, 57-67

K I M M N M K and Kodaira, , Oki, , Kakikawa, , Watanabe, , Hirakawa, , Yamada, , Taketo, A (1997) Genome structure of the Lactobacillus temperate phage phi g1e- the whole genome sequence and the putative promoter/repressor system Gene 187, 45-53

Lipman, D J and Pearson, W R (1985) Rapid and sensitive protein similarity searches Science 227, 1435-1441

S F and H The structural module in Lucchini, , Desiere, , Brüssow, (1998) gene Streptococcus thermophilus bacteriophage phi Sfi11 shows a hierarchy of relatedness to Siphoviridae from a wide range of bacterial hosts Virology 246, 63-73

F and H The between Lucchini, S., Desiere, , Brussow, (1999a) genetic relationship virulent and temperate Streptococcus thermophilus bacteriophages Whole genome comparison of cos-site phages Sfi 19 and Sfi21 Virology 260, 232-243

S F and H of Lucchini, , Desiere, , Brüssow, (1999b) Comparative genomics Streptococcus thermophilus phage species supports a modular evolution theory d Virol 73, 8647-8656

Madsen, P L and Hammer, K (1998) Temporal transcription of the lactococcal temperate phage TP901-1 and DNA sequence of the early promoter region Microbiology 144, 2203-2215

Maniloff, J and Ackermann, H W (1998) Taxonomy of bacterial viruses establishment of tailed virus genera and the order Caudovirales Arch Virol 143, 2051-2063

J J M and J Characterization of the Nesper, , Blass, , Fountoulakis, , Reidl, (1999) major control region of Vibrio cholerae bacteriophage K139 immunity, exclusion, and integration d Bactenol 181, 2902-2913

H K I F A K J and H Neve, , Zenz, , Desiere, , Koch, , Heller, , Brussow, (1998). Comparison of the lysogeny modules from the temperate Streptococcus thermophilus bacteriophages TP-J34 and Sfi21 Implications for the modular theory of phage evolution Virology 241, 61-72

Skowronek, K and Baranowski, S (1997) The relationship between HP1 and S2 bacteriophages of Haemophilus influenzae Gene 196, 139-144

E G F Le C B and van D Stanley, , Fitzgerald, , Marrec, , Fayard, , Smderen, (1997) Sequence analysis and characterization of phi 01205, a temperate bacteriophage infecting Streptococcus thermophilus CNRZ1205 Microbiology 143, 3417-3429

J D D G and Gibson T J of Thompson, , Higgms, , (1994) Improved sensitivity profile 89

searches through the use of sequence weights and gap excision. Comput. Appl. Biosci. 10, 19-29. van Sinderen, D., Karsens, H., Kok, J., Terpstra, P., Ruiters, M.H., Venema, G., and Nauta, A.(1996). Sequence analysis and molecular characterization of the temperate lactococcal bacteriophage rit. Mol.Microbiol. 19, 1343-1355. 90 91

2.5 A short noncoding viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophiius

(Virology 250, 377-387 (1998)) 92

VIROLOGY 250, 377-387 (1998) article no. VY989387

A Short Noncoding Viral DNA Element Showing Characteristics of a Replication Origin Confers Bacteriophage Resistance to Streptococcus thermophilus

Sophie Foley, Sacha Lucchini, Marie-Camille Zwahlen, and Harald Brüssow1

Nestlé Research Centre, Nestoc Ltd, Vers-chez-les-Blanc, CH-1000 Lausanne 26, Switzerland

Received May 21 1998; returned lo author for revision August 4, 1998; accepted August 19, 1998

A 302-bp noncoding DNA fragment from the DNA replication module of phage Sfi21 was shown to protect the Streptococcus thermophilus strain Sfi1 from infection by 17 of 25 phages. The phage-inhibitory DNA possesses two determinants, each of which individually mediated phage resistance. The phage-inhibitory activity was copy number dependent and operates by blocking the accumulation of phage DNA, Furthermore, when cloned on a plasmid, the 0Sfi21 DNA acts as an origin of replication driven by phage infection. Protein or proteins in the Sfi21 -infected cells were shown to interact with this phage-inhibitory DNA fragment, forming a retarded protein-DNA complex in gel retardation assays. A model in which phage proteins interact with the inhibitory DNA such that they are no longer available for phage propagation can be used to explain the observed bacteriophage resistance. Genome analysis of Sfi19, a phage that is insensitive to the inhibitory activity of the o6Sfi21 -derived DNA, led to the characterisation of a variant putative phage replication origin that differed in 14 of 302 nucleotides from that of <-/>Sfi21. The variant origin was cloned and exhibited an inhibitory activity toward phages that were insensitive to the

INTRODUCTION resistance. The protective DNA included a 500-bp re¬ gion rich in secondary structure, which represented After virus adsorption and DNA injection, DNA rep¬ the phage origin of replication (ori). lication is a critical step in the virus life cycle and is Our laboratory is interested in bacteriophages of frequently targeted In strategies developed to block Streptococcus thermophilus, a thermophilic lactic acid viral infection. Bacteria possess several native de¬ bacterium. Little information is available concerning an- fence mechanisms that protect against bacteriophage tiphage mechanisms in this species (Larbi et ai, 1992). infection. During the past decade, molecular ap¬ However, Moineau et al. (1995) successfully protected proaches have provided a number of genetically de¬ several industrial S. thermophilus strains from phage fined resistance mechanisms directed against meso- infection by the introduction of an R/M determinant from philic lactococcal phages (reviewed by Garvey et al.. a lactococcal plasmid, The fact that few phage defence 1995b). Many of the resistance mechanisms are based mechanisms have been reported or developed for S, on defence strategies native to Lactococcus and are theimophilus reflects not only the fact that the molecular frequently encoded by plasmids, the majority of which characterisation of S, thermophilus and its has are self-transmissible. The resistance mechanisms phages started ielatively recently but also that, in contrast to characterised to date can be grouped into four main Lactococcus, plasmids are conspicuously scarce in S. categories based on their mode of action: adsorption thermophilus (Mercenier ef al., 1990). inhibition, DNA penetration blocking, restriction/mod¬ The amount of informa¬ ification (R/M), and abortive infection (Abi), A number increasing genome sequence tion available for 5. thermophilus phages should facili¬ of the Abi systems characterised from the lactic acid tate the development of bacteriophage resistance mech¬ bacteria Lactococcus operate either directly or indi¬ anisms for the of industrial starter cultures. rectly by blocking phage DNA replication (e.g., abiA, protection of the S, Hill ef al., 1991; ab/'F, Garvey et al., 1995a), However, Sequence analysis thermophilus 50 hybridisation experiments, ule is in >70% of S. DNA in the development of an abortive type ot phage present thermophilus phages (Brüs¬ sow ct al, 1994), DNA sequence analysis of this region from cföfiZ1 identified several open reading frames (orfs) 1 for To whom reprint requests shojlcl bo addressed Fjx OOP putatively coding enzymes, including primase and 8925. E-mail: Harald BruessowS'chlsnr nestrd ch. helicose, implicated in the initiation of phage DNA rep-

0042-6822/98 $25.00 Copynght ' ' 1993 by Acddcnrc Pfcss

All rights of i coproduction in any form icsoivcd. 93

378 FOI FY FT AL

i com. FloKI Xbaï llmffll EcoRl Ihndlïi EcoKl TtoRI Lysogeny module < - -eo" 6000 nt

primase' il

pM723c

Bçfll Urol Pw/Il \fa>I /eoRI r-ii ',"'RI Awl 1071 1 0 00 400 600 sou

ORF SO ORF 143

om n pSI23d

PS*23e

pSt23f

pS123g(302bp)

1 CAGCAGTAGTGGTTATGGTTACTC^-AACCGCAGSTTATTeCMCAASBA££S.TAAAAC ** 61 ^1 62 CJCAT|TGAAAATAAAGGGTTTCGGTTGCTTTAGTTACTTAGTTACTACTTTTAAATATATTT < -- p. j* 123 ATA/\ATAAATAAATAAATAAATAMTA"^ATAT^GAGAGAGACTTAAAAAMCGTGTMCTA Ball 184 AGTMCTAMCffGGTrlÂlGAAACrTTGATATAT^GGGGTTTGCGG tin 245 TAACTGTTACTGTAArCGAGTMCAAAAGG.AGAAAAAA^GAAATTCAATACTTAGA

FIG 1 (A) Schematic lepiosen ation of the previously defined s NA en i itoi nodule (Desiere et al 1997) ind the subc loning of the i phage mhibitoiy clone pM/?3t | VIZ' It encodes only the 81 \ t i n) iu Is of o f 14 ? Si/e of DNA inseits is indicated in biarkets (B) Nucleotide sequence of the 302 bp ph ice nhibitoi y element presen rn V'n oi -> J and andern lepeats ire indicated by ai rows Direct repeats arc underlined or shaded The st Ht codons ol oif 61 and oif 143 ne i lac l a s he £ 11 les notion site

lication An aiea of particulai interest was the 21! nude RESULTS AND DISCUSSION otide (nt) noncoding legion located between oif 504 an 1 Identification of resistance determinant orf 143 which revealed an 80% dentity to the pufa ivo bacteriophage stranded of of the S ther single origin (sso) leplication The clone plW23c consisting of an 824 bp FcoRI ef al This to mophilus plasmid pSTI (Jan/en 1992) ft igment ol repeat sequences spec piopagat nq on Sfr1 17 ate sensitive to the presence of 231 is the of <4Sfi2l DNA bp noncoding sequence origin pMZ23c as demonstrated by a dramatic reduction in the replication efficiency of plaquing (COP) In fact for all 17 phages no In the communication we on the c losei present repot plaques could be detected T his is equivalent to a ">6 log examination of the ar d assoc ite it intragenic region reduction in phage titles which compares favourably with a biological activity that of bactei onhage ipsb with tie 4 log tedut ion observed for the phage encoded tance This study demonstrates that when present oi i resistance fper) mechanism derived fiom a lactococcal plasmid this legion can ict as an origin ot DNA enl^i ph ige (OSull vin ct al 1993) When intiocluced in the S tion Genome analysis of a phaqe That w is not i hib ted met mophilus strain Sfi9 pMZ°3c also protects against by this DNA element led to die c ho ictersation of i nfett on by the lyt c qioup II phages c/jSfil I Stl2 variant putative phaqe on ind o6St2 (data not shown) 94

VIRAL DNA ELEMENI CONFERS RESISTANCE TO S thermophilus 379

TABLE 1 oppositely oriented orf 143 and orf 50 (Fig 1A) The three subclones of orf 50 and orf 61 Titration of S thermophilus Phages on Sfd Transformed pSF23d (deletion ), pSI~23e with the Indicated Plasmids (deletion of orf 50) and pSF23f (deletion of orf 143) protected Sfi1 from infection by (/3Sft21 as determined by PN7124-" pMZ23c pSE23q the inability to detect plaques (Table 2) Therefore all Phage (pfu/ml) (pfu/ml) (Lfu/mi) th ec oifs present in pM?23c could be deleted without less ol the lesistance that the Sfi2l 17X10 <10?" — phenotype suggesting S89 1 /\ 10 <102 obseivod resrstant e is not mediated by a phage protein

] \ io8 <102 - S69 The inteigenie region between the divergently on S3 2 ^X10 •-10? -l ented oil 61 and oif 143 was then tested A 302 bp DNA Sfi2 3 X10"3 <10? a the c/JSfi21 sum SfiJJ 75X108 -jo' -1 fragment covering seguence showing St3 26X108 •-10' r^ lanty to the piPative pSF1 s so of replication was PCR

St44A 32 X 10° -10 ^ amplified and cloned in pNZ124 generating the con

X108 •-1Û2 ^ S19 I stiuct pSI 23g (Fig I A) I he presence of pSF23g in S S96 6 x10r -in A thermophilus Sfi I protected the bactena fiom phage in St44 6 x IO8 -10 - fee ton and was just as effective as the clone St42 8 X10 -I - patent nMZ2 3c in terms of cduUion in the FOP and of M4 31 48X10' -P - lange

o M4 15 ')x I0r -H phoqes inhibited (fable 1) I lorn these observations we

- St40 2 X10Û I0 1 cl can c ncludo that the observed bactenophage resis Sfi18 6 X 108 --K) <^ tance n pMZ23c is mediated by a noncoding DNA se St?0 3 X108 •-10 0 Furthermore because the se Sfi 19 28X10" I x 10a 1 1 1 quence noncoding es in and in St25 ?j X 108 > x 10 3 X qaenc pSF23g pM723c are opposite orien St33 3 x 10s I x !0C 7 \ 1 tations with lespect to each other the inhibitory effect ot St28 9 x 10s I x IO7 8 x 1 the 302 bp noncoding seguence must be independent of Sil/ () X108 6 X 10' 5 X 10 Tanscription originating in the vector St159 I X10° r X I01 2 X n both of the clones and in S17 1 X107 > x n7 4 x 10 interestingly pSF2dd pSF23f St30 3 xl0q i x n 3 x r which all of the DNA to the left and right respectively of the Ball site is deleted mediate resistance against sensitive vector contiol 'Phage <Ä369 and I his 0 collages r/>Sfi21 89 (Fable 2) implies <10' indicates tint no plaques wore dctccfcd on a lawn of bac e hot contains m fact two elements Io nal growth pMZ23c genetic

n cl not determined v^ateci on either side of the Ball site each of which independently is capable of protecting the host cell from bacteriophage infection A c loset examination of the nu To define fuither the DNA sequence responsible fn leotide sequence of oSF2:>q highlighted a number of mediating the observed bactenophage tesistime the sequences repeated to the left and right of the Ball site EcoRl insert fragment of pM72 3c was subJoned (Fig (Fig IB) Itc hould howevei be no^ed that the ptotection 1A) and the various subclones were tested to then provided by pSF23g is much stronger than that of ability to piotecttho strain Sfi 1 fiom phage infection The pSr23d oi pSF23f in terms of the ieduction in EOP insert fragment of pM723c contains three orfs orf 61 the and/ot the range of phages affected (Fable 2) These

TABLF 2

Analysis of pMZ23c Subclones for Bacteriophage Resistance

pNZ124 I SF23c lSPOj pSr23e dSI 23f Phage (pfu/ml) ipfu mil J 1 i mfi (pfu/ml) (pfu/ml)

Sfi21 0 7 x 107 -i -.10 <102 <102 S89 1 7 x 10" -JO2 <107 •slO2 S69 00 x to7 "10 nd J <102

SI 23 X 10' -1 <-10 nd 8 X 104

ST3 26x 10 •-P 88 > 107 nd G X IO7 i SR4A 37 x IO8 - 1 23 x 10 nd 1 4 x IO3

S96 0 x 10° -P 2 x n 1 10'

Note All ol the abo\e pi ismids were teste Imni M <-l0' indicates (licit no plaques \ eie defec 1 n i

'n d not dclci i med 95

380 FOLEY ET AL.

accumulated 40 min after in¬ differences probably reflect different binding reguire- amount of DNA phage reduced in with ments of the corresponding phage proteins. fection was Sfi1(pSF24) compared Sfi1(pNZ124) (Fig. 2A). resistance Effect of copy number on bacteriophage phenotype Plasmid replication driven by bacteriophage infection

vector for 5, ther¬ pSF28 is a nonreplicative integration To examine the effect, if any, of phage infection on the based on the functions of the DNA mophilus integration phage replication of a plasmid carrying the 0Sfi21 -derived as a selectable 0Sfi21 and the car gene from pC194 fragment, Sfil containing the low-copy-number clone To examine whether the marker (Bruttin et al., 1997b), pSF24 was chosen. A significant increase in the pSF24 824-nt from is of inter¬ of 0SH21 fragment pMZ23c capable cojoy number was observed when the quantity pSF24 when as a fering with phage infection present single DNA present in uninfected Sfil (pSF24) was compared an Xbal-Pstl copy in the bacterial cell, it was cloned as with that of 0Sfi21 -infected cells, 40 min after infection in Lac- fragment in pSF28. This cloning was performed (Fig, 3), Although 0Sfi21 infection apparently boosted MG1363 because the tococcus lactis ssp, lactis phage pSF24 plasmid DNA replication, infection with the heter- in is toxic in Escherichia integrase gene present pSF28 oioaous phage S69 had no observable effect on pSF24 coli. S. Sfil was subsequently transformed thermophilus copy number (Fig, 3). This correlates with the plaque with the construct, pSF20, and the integrants resulting assay results that indicated the high-copy-number clone obtained were confirmed PCR The bacterio¬ by analysis. pMZ23c had an inhibitory effect on a large number of of this was tested phage sensitivity integrant Sfi1::pSF20 phages, including 0Sfi21 and 0S69, whereas the low- For a 100-fold reduction in the by plaque assays. 0Sfi21, copy-number clone pSF24 was effective against only EOP was observed together witli a reduced plaque size. ûiSfJ21. no reduction in the EOP or size was However, plaque The simultaneous monitoring of phage (Fig, 2) and observed for the two and Sfi21 only (i.e., no encoded proteins. The experimental data therefore sug¬ plaques observed), This clearly indicates that to be ef¬ gest a working hypothesis wherein this noncoding fective against a larger range of phages, the 0Sfi21- 0S1121 -derived DNA serves as a binding site(s) for pliage derived DNA fragment must be provided at a high copy proteins which, when multiple copies are present, are number. capable of titrating essential phage proteins such that they are no longer available for phage DNA replication. Effect of bacteriophage resistance determinant on Bacteriophages have adopted a plethora of replication phage DNA replication strategies that differ mainly in the method of priming and The relative amount of 0Sfi21 DNA was determined the dependence on host- and phage-encoded proteins. throughout the phage lytic cycle by comparing the total Among the phages of gram-positive bacteria, only the DNA isolated from phage-infected cells harvested at Bacillus phages 0SPP1 (Pedré ef al., 1994) and 029 different times after infection. Using DNA probes specific (wh'ch uses protein-primed replication; Salas etal., 1995) for the phage or the vector pNZ124, it was possible to hove been extensively characterised in terms of their simultaneously monitor both phage and plasmid DNA, replication functions. In contrast, although considerable The high-copy-number clone pMZ23c had a dramatic sequence data exist for phages of lactic acid bacteria, effect on (/»Sfi21 as demonstrated by the amount of replication origins have been identified only for the lac¬ DNA accumulated in the strain Sfil (pNZ124) compared tococcal phages Sfi21 DNA was etal., 1993), The putative ori of the lactococcal 031 has detectable 20 min afler infection, increasing to a maxi¬ been functionally identified, although no sequence data mum at 40 min and followed by a reduction at 60 and 80 are available (O'Sullivan ef ai, 1993), As yet, no informa¬ min (Fig. 2B). For Sfil containing the pMZ23c clone, tion is available concerning the general mechanisms of 0DNA was detected 20 mm after infection, but in con¬ replication or the phage and host proteins required, All of trast to Sfi1(pNZ124), no further accumulation of (/»DNA the putative origins of lactic acid bacteria phages are was observed (Fig, 2B), A similar pattern was observed rich m direct repeat sequences, but there is no apparent for not When Sfi1(pSF23g) (data shown), DNA was correlation in the size, relative location, or seguence of monitored in Sfil (pSF24), a pattern similar to the phage- the repeated elements between lactococcal phages and sensitive control Sfil(pNZ124) was observed, but the those of S. thermophilus. In this report, we describe a 96

VIRAL DNA FLEMENT CONFERS RFSISTANCb TO S thermophilus 381

A

m 1 2a 2b 2c 2d 2c 2f 3 4a 4b 4c 4d 4e 4f 5 6a 6b 6c 6d 6e 6f 7

23 1 —

94 —

6 6 —

44 —

23 _ 20 —

B

m 1 2a 2b 2c 2d 2e 21 6a 6b 6c 6d 6e 6f

^ ir ^ \

23 1 «*% 94 6 6 ^A«¥| 44 ^tt «Ht M* IM »—»

2 3 , 20

FIG 2 (A) Agarose gel olecùopho esis of tot il DNA isola ed it a at s t me points during phage <-/JSfi21 infection The cultures tested were Sfil containing pNZ124 (lanes 6a f) pMZ2 5c (lanes 2a f) and pSF? 1 I n s ta f) i f lepresent total DNA isolated bofoio phage infection immcdntely after infection and 20 40 60 and 80 mm aftei infection respec tlv Lait 1 conta ns Sfi21 DNA I anes 3 5 and 7 conta n pMZ23c pSF24 md pNZ124 plasmid DNA respectively All DNA samples weie d ce cd v th Xba\ fht size of ihc À HindlW DNA moleculai weight maiker (lane m) is indicated in kilobascs (B) Flybi idisation of the s impies nd c i ed inidA isigl ilx led I and cfofi21 DNA as probes

sequence and functional symmetry for the

0Sfi2l derived DNA described above shows 80 o identity piepared fiom 0Sfi2l infected and uninfected Sfil cells lo the putative cmqlo stranded origin of the uyj ic S f4Sfi21 mection resulted in a DNA fragment of lowei thermophilus plasmid pSTI (Janzen et al 1992) t may mob i'y n Polyacrylamide gels (Fig 4) The specificity of be to examine whether i interesting phage inhibtoiy the protein-DNA tntei action was demonstrated by c om could be with the activity associated pSFI sso Fot this pet tion experiments in which the piesence of a 500 fold purpose the sso tegion of pSTI was PCR amphf ed and excess of the uni ibeled DNA fragment abolished the cloned in pNZ124 The presence of the cloned putative retard itron (data not shown) Interestingly when the ssofrom pST1 in S thermophilus Sfil did not piotectthe same exper iment was tepeated using crude cell extracts 97

382 FOIEY FT AL

À B

m la 1b le ld le 11 2 m la 1b le ld le 11 2

- 21 1 - ^ssssP? 23 i

0 j_ Vlî£ii# 94-

66 " "* 66-

44- ^M^Ü^ 4 i_

\ - 2 2 1 • 20 2ir

O D eronm O D (on i «

*•--< 1)

in la 1b le ld le 11 2 ta lble ld le 11 2

23 1 -

0 4 -

— (, (, 1MI 66-

44- 44 -

2 J- 2 3

~ 20 2 0

O D Olli ii

FIG 3 Hybridisation of Xba\ diges ed total DNA isolated ftom Sfil S> >\) the it phace infection (A) and infected with S69 (C) and <7>Sft49 (D) Ihc probe consisted of labeled À and pNZ12l DNA 1a-ll ie| re em to al DNA isolated at vaiious time points before phage infection immediately af er infection and 20 10 60 and 80 mm after infection ie pee vcly Lane 2 contains Xho\ digested pSF24 plasmid DNA Ihesizeof the A HiniWW DNA moleeill n \ eight r iikci (I me m) is indicated in k la ases Vie isui ments obtained of ODr0Olm at vaiious imc poinds aie also indicated

prepared fiom 0SF19 infected Sfil no retarded protein- sitivity ot 0SF19 to the 0Sfi21 derived DNA element DNA complex was observed (Fig 4) This obsetvation is 0Sfil9 revealed an identical organisation of orfs and a significant because the 0SF21 derived DNA has no n high degree of sequence similarity to 0Sfi21 (Table 3) An effect on infection hibitory by 0Sfil9 interesting observation was the neai absence of silent po nt mutations (Table 3) of the and DNA Sequence comparison 0Sfi19 0Sfi21 A sequence oliqnment of the inhibitory 0SF21 DNA modules replication element with the conesponding intergenic region of 0Sti19 evealed an overall nucleotide of 95% Although pMZ23c is effective against a range or identity (t e 14 nt differences over a stretch of 302 nt 5) phages eight of the phages tested in this study ine lud Fig When this was extended to addr ing 0Sfi19 are insensitive to the 0Sfi21 derived DNA comparison eight tronal a cleai division in which (Table 1) Comparative sequencing of a 1/ kb eg on of phages emerged sensitive oi insensitive to the denved 0SF21 and 0Sfi19 encompassing the lytic cassette and phages 0SF21 DNA a that of the structural gene clustei revealed tint these two yielded sequence resembling 0SF21 and phages are similarly organised (Desiere ef ?/ I99b) 0Sfil9 lespechvely (Fig 5) Exc lusion of the polymoi Therefore 4 was of interest to extend tins c ompat ison »o phisms with n the 0Sfi2l group identified nine nude the region encompassinq the DNA epl c itron module ir do different e-, between the 0Sfr21 and 0SF19 type (previously defined toi 0>Sfi21 by Desieie it >/ P9~h seqi eines An extension of this sequence comparison because it may facilitate the identification of 'he qena c fo a bioader range of phaqes may load to the identi determinant or determinants lesponsble foi the nsen fie ation ot die critical nucleotides determining the 98

VIRAL DNA LLLMEN1 CONFERS RLSIS1ANCL 10 S thermophilus 383

3 0Sl / and 0ST3O, which are insensitive to the 0Sfi21 and 0Sfi19 origin-mediated phage resistance, do not pos¬ sess the highly conseived 2 2 kb FroRl DNA fragment present in ^70% of S theimophilus phages (data not shown Btussowntd/ 1994) Titadoxieally the 0Sfi 19 denved fragment in pSF19l potects Sfil fiom infect on by 0Sfi21 and the other phages inhibited by the 0SO21 DNA element, wheieas in contrast the 0Sfi21-derived fragment does not protect against 0SF19 Apparently, a 0SF21 protein is capable of inter acting with both types of origin, whereas the eguiv aient 0SF19 protein is origin specific In phages, DNA sequences targeted by proteins are frequently found drrectly in the vicinity of the gene coding for the protein in question The genes preceding oif 504 in the DNA eplit at'on module are unlikely to code for the origin- bind ng protein because all of the am.no acid teplace meats n 0Sfil9 piotems (compared with 0Sfr2l) weie FIG 4 Gel reiaidation assay using 'he 1?P labeled 302 bp phac ^ llso detected in proteins predicted for 0Sfi 11 oi 0Sfi18 inhibiloiy DNA fragment from Sfi19 (laie cToble 3) are susceptible to the 0SH21 3) Lane 1 is a control lane in which a < itrdo cell extiac t piepared from denved inhibitory activity Interestingly, the orf 504 gp of uninfected Sfil colls was used c6Sfi?l and <7JSfi19 diffei in only ono amino acid position .n which a C-terminal lysine residue is substituted by a glutamic acid residue in 0Sti19 The difference between of the interactions at the specificity piotem-DNA a basic and acidic residue may have consequences for phage putative lephcation origin protein folding To explain the complex inhibitory activity The diffeiences between the putative could origins ot the diffeient ongins will requne a sequence compai leflect the moleculai basis for the ot insensitivity 0Sfil9 isoii (both nucleotide and amino acids) of a broader to the To test this, the DNA 0SF21 origin noncoding ange of phages together with the elucidation of the region of 0Sfil9 located clownstieam of orf 504 was spec ific DNA-protein interaction(s) and the identification PCR and cloned in in the amplified pNZ124, losulting of the components involved construct pSF 191 Sfil(pSF191) was found to be resistant to 0SF19 infection Sfil(pSFIOl) was then tested against MATERIALS AND METHODS the remaining seven phages which were insensitive to Bacterial strains, bacteriophages, and media the resistance mechanism piovided by piW2 3c and was found to be resistant (i e no plaques) to all but tvwo The F coli strain XL 1 Blue was piopagatecl in LB broth phages 0S17 and 0ST3O It is interesting to note that O' on I B broth solidified with 1 5% (w

TABLF 3

Comparison of the Predicted Gene Products within the DNA Replication Modules of <-/>Sfi21 and Sfi19

Size of (aa) No of it i No Of clJ (nil No of aa changes orf of oil c gp équivaloir ch-n^v-, t p i changes helAcen specific to

137 13/ ?(/) 0 233 233 KD 0 443 443 0 0 151 151 6 (6) 8(8) 0 271 271 4'5) 2(3) 0 504" 504" m 0 V 143 143 0 01 0 0

a ' d <6SfP3 n Older oo identify the ammo acid (aa) replacements specific to the ^6Sli21/(^Sfi10 ( ompanson

"Several stait codons aie foi f,io i ative n ^i c e lneoi« < •> possible j j i he i o> s j nb nbastr 10 binding site \Rl3S) w [JUG at position ,L 3758 giving an orf oil 504 as was selected toi A1 a, -, inie < a Cl

"-Compai!1 on of Sfi21 and rfSfilO od CI op e ened a L>s a CI i nnsi e i F a ) 99

384 FOLEY ET AL

Lys/Glu stop 504 1—1 r 1—1 S£i21 GAAAACGAGG iCGAAGAAAGC JAGCAGTAGTG iGTTATGGTTA CTCAGTAACC iGCAGGTTATT .GCAACAAGTA

Sfi2 GAAAACGAGG iCGAAGAAAGC JAGCAGTAGTG iGTTATGGTTA CTCAGTAACC 13CAGGTTATT iGCAACAAGTA

St2 GAAAACGAGG 'CGAAGAAAGC 1AGCAGTAGTG iGTTATGGTTA CTCAGTAACC IGCAGGTTATT iGCAACAAGTA

Basl9 GAAAACGAGG .CGAAGAAAGC 1AGCAGTAGTG iGTTATGGTTA CTCAGTAACC (GCAGGTTATT .GCAACAAGTA

Sfil8 GAAAACGAGG

' Still GAAAACGAGG •CGAAGAAAGC \AGCAGTAGTG .CTTATGGTTA CTCAGTAACC (GCAGGTTATT GCAACAAGTA

Sfi 19 GAAAACGAGG (CGGAGAAAGC ;AGCAGTAGTG !ST'IATGG'i'lA ,CTCAGTAACC (3AAGGTTACT i3CAACÄAGTA

. ( 1 St 17 GAAAACGAGG

i ( iGCAACAAGTA St 2 b GAAAACGAC-G ICGGAGAAAGC AGCAGTAGTG .STTAT>TA CTCAGTAACC 3AAGGTTÄCT St33- GAAAACGAGG

S£i21 ACCGTAAAAC CCATTGAAAA TAAAGGC-TTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT Sfi2 ACCGTAAAAC CCATTGAAAA TAAAGGGTTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT St2 ACCGTAAAAC CCATTGAAAA TAAAGGGTTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT Basl9 ACCGTAAAAC CCATTGAAAA TAAAGGGTIÏ CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT Sfil8 ACCGTAAAAC CCATTGAAAA TAAAGGGTTT CGGTTGCTTT AGTTATTTÄG TTACTACTTT TATATATATT Still ACCGTAAAAC CCATTGAAAA TAAACGCTTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT Sfll9 ACCGTAAAAC CCATTGAAAA 1AAACGGTTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT

" Stl7 ACCGTaAAAC CCATTGAAAA TAAAGGoTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT St25 ACCGTAAAAC CCAT'l &AAAA TAAAGGGTTT CGGTTCCTTT AGTTACTTAG TTACTACTTT TAAATATATT St 3 3 ACCGTAAAAC CCATTGAAAA TAAAGGGTTT CGGTTGCTTT AGTTACTTAG TTACTACTTT TAAATATATT

Sfi21 TATAAATAAA TAAATAAATA AATAAATATA TATAGAGAGA GAC.TTAAÄA AÄACGTGTAA CTAAGTAACT Sfi2 TATAAATAAA I'AAATAAATA AATAAATÄTA TATAGAGAGA GAC.TTAAÄA AAACGTGTAA CTAAGTAACT St2 TATAAATAAA TAAATAAATA AATAAATATA TATAGAGAGA GAC.TTAAÄA AÄACGTGTAA CTAAGTAACT Bas 19 TATAAATAAA TAAATAAATA AATAAATATA TATAGAGAGA GAC.TTAAÄA AAACGTGTAA CTAAGTAACT Stil8 TAT. AATAAA TAAATAAATA AATATATATÄ TATAGAGAGA GCCATCAAAA AÄACGTGTAA CTAAGTAACT Sfill TATAAATAAA TAAATAAATA AATAAATATA TATAGAGAGA GAC. TCAAÄA AAACGTGTAA CTAAGTAACT AATATATATÄ TATAGAGAGA GCCATCAAAA AAAGTAGTAA CTAAGTAACT Sfil9 TAT . AATAAA TAiAATAAATA Stf TAT.AATAAA TAAATAAATA AATATATATÄ TATAGAGAGA GCCATCAAAA AAAGTAGTAA CTAAGTAACT St2 5 TAT.AATAAA TAAATAAATA AATATATATÄ TATAGAGAGA GCCATCAAAA AAAGTAGTAA CTAAGTAACT St33 TAT.AATAAA TAAATAAATA AATATATATÄ TATAGAGAGA GCCATCAAAA AAAGTAGTAA CTAAGTAACT

Sf i21 ÄAAGTGGCCA GAAACCT.GA TaTAT.AAGGG GTTTGCGGTG GTTACGAGTA AAA&TAACTG TTACTGTÄAT

Sfi2 AAAGTGGCCÄ GAAÄCCTTGA TATATAAGGG GTTTGCGGTG GTTACGAGTA AAAGTAACTG TTÄCTGTAAT St2 AAAGTGGCCA GAAACCTTGA TATArAAGGG GTTTGCGGTG GTTACGAGTA AÄAGTAACTG TTACTGTAAT Bas 19 AAAGTGGCCÄ GAAACATTCA TATATAAGGG GTTTGCGGTG GTTACGAGTA AAAGTAACTG TTÄCTGTAAT Sfi 18 AAAGTGGCCA GAAACCTTGA TATATAAGGG GTTTGCGGTG GTTACGAGTA AAAGTAACTG TTÄCTGTAAT Sfill AAAGTGGCCA GAAACCTTGA TATATAAGGG GTTTGCGGTG GTTACGAGTA AAAGTAACTG TTACTGTAAT Sfil9 AAAGTGGCCG GAAACCTTGA TATATAA.GG GTTTGTAGTG GTTACGAGTA AAAGTAACTG TTACTGTAAT Stl7 AAAGTGGCCG GAAACCTTGA TA"AT&A GG GTTTGTAGTG GTTACGAGTA AAAGTAACTG TTACTGTAAT St25 AAAGTGGCCG GAAACCTTGA TATATAA.GG GTTTGTAGTG GTTACGAGTA AAAGTAACTG TTACTGTAAT

St 3 3 AAAGTGGCCG GAAACCTTGA TATATAA.GG GTTTGTAGTG GTTACGAGTA AAAGTAACTG TTACTGTAAT

Start 143 <—1 I 1 Sli21 CGAGTAACAA AÄGGAGAAAA AAATGGAÄAT TCAATACTTÄGA Sfi? CGAGTAACAA ÄAGGAGAAAA St2 CGAGTAACAA AAGGAGÄAAA AAATGGÄ.,

Bas 19 CGAGTAACAA AÄGGAGAAAA AAATG

SfilS CGAGTAACAA AÄGGAGAAAA AÄATG . , Sfill CGAGTAACAA AÄGGAGAAAA AAATGGAÄAT TCAATACTTÄGA Sfil9 CGAGTAACAA AÄGGAGAAAA AAATGGAÄAT TCAATACTTÄGA Stl7 CGAGTAACAA AÄGGAGAAAA AAATGGAÄAT TCAATACTTÄGA St2 5 CGAGTAACAA AÄGGAGAAAA AÄATG

St3 3 CGAGTAACAA AÄGGAGAAAA AAATG... .

FIG 5 Nucleotide seguence alignment of 'he phage nh hi oi\ six ier p derived horn SV>\ and o&St119 and companson with me equivalent sequence fiom additional S thcrmochilw, phoqps Seeacaee dilfcienco t-ct veen Co CC021 and <3Cfi19 wpe scgucnccs are in bold The single amino acid difference (Lys/Glu) in orf 504 gp is indicaooj rhe icgion coirosponcCc, o lie 302 bp pCigo inhibitoiv fiagment is indicated bv airows

ef al., 1989) at 37°C under agitation S theimophilus requ.red at a final concentration ot 3, 5, and 20 /eg/ml, strain Sfi 1 and transformants thereof were routinely sub- respectively, for 5 thermophilus, Lactococcus, and E. cultured at 42°C in either LM17 (M17 supplemented with coli 0 5% lactose) (leizaghi and Sandine, 19/5) oi Belhke- The S thermophiius phages used in this study were (Ellikei plus 1% beef extiact) media lactococcus lactis obtained fiom the Nestle phaqe collection The phages MG1363 was propagated in GM17 (M17 supplemented weie piopagated on thou approptiate 5 thermophilus with 0 5% glucose) Chicot amphemool was used when stiain in LM1/ broth as described previously (Brussow 100

VIRAL DNA ELEMENT CONFERS RESISTANCE TO S. thermophilus 385

and Bruttin, 1995; Bruttin and Brussow, 1996). Phage copy-number shuttle vector pNZ121 (de Vos and Simons, enumeration was achieved by plaque assay using LM17 1994), and the S. thermophilus integration vector pSF28 agar supplemented with 0.25% (w/v) glycine and 0.1% (Bruttin etal., 1997b). (w/v) skim milk, to enhance plaque formation, and MRS pMZ23c consists of an 824-bp fcoRI fragment from top agar. the 0Sfi2l DNA replication module cloned in pNZ124, The pMZ23c subclones pSF23d, pSF23e, and pSF23f DNA techniques were constructed by deleting the Pvull-Mscl (iso- chizomer of Ball), PwjlGßsr1107lr and A/fecl-Fc/l36ll frag¬ Phage purification, DNA extraction and purification, ments, respectively, from pMZ23c, agarose gel electrophoresis, Southern blot hybridisa¬ pSF23g consists of a 302-bp EcoRl-Bglll PCR frag¬ tion, and DNA labelling were done as described pre ment cloned in the high-copy-number vector pNZ124, viously (Briissow and Bruttin, 1995; Bruttin ef al., The PCR fragment was generated using pMZ23c as the 1997a; Bruttin and Brüssow, 1996), General DNA tech¬ template DNA and primers (5'-GCG AAT TCA GCA GTA niques were performed as described by Sambrook ef GTG GTT ATG G-3' and b'-GGA GAT CTA AGT ATT GAA al. (1989). The Qiaprep plasmid kit (Qiagen) and the TTT CC-3') containing FcoRI or Bglll restriction sites Jetstar Plasmid Maxi-kit (Genomod) were used for the (undet lined). rapid isolation cof plasmid DNA from E, coli. Restriction The low-copy-number clone pSF24 was constructed enzymes and T4 DNA ligase were obtained from Boeh- by ligoting the 876-bp Bglll-Xhol fragment from pMZ23c ringer-Mannheim and used according to the suppliers to Sg/ll-Sa/l-digested pNZ121 DNA For chromosomal instructions. E. coli was electrotransformed as out¬ the Xbal-Pstl from lined in the BioRad instruction manual Lactococcus integration, 834-bp fragment pMZ23c was cloned in the eguivalent sites of the S. thermophilus and S. thermophilus were electrotransformed using integration vector pSF28, the con¬ the procedures described by Holo and Nes (1989) and thereby generating struct pSF20. Slos etal. (1991), respectively. pSF 191 consisted of a 352-bp FcoRI-ßg/ll-digested PCR cloned in the sites cot the PCR fragment equivalent high- copy-number vector pNZ124, The PCR fragment was DNA were in a Perkin-Elmer ther¬ samples amplified generated using 0Sfi19 DNA as a template and primers mal cycler programmed for 30 cycles, each consisting (5'-GCG AATTCG GGA CGA AAA CGA GGC GG-3' and of 94°C tor 30"s, 55°C for 30 s, and 72°C for 1 m,n b'-GCA GAT CTC ATT AGG TTC GTG TTC TTG-3') con¬ were to the Synthetic primers designed according taining FcoRI or Bglll sites (underlined) to facilitate clon¬ established 0SH21 and 0Sfil9 DNA sequences and ing. used together with the relevant DNA template and Tao

polymerase Fermentas, PCR products were gel-pur - Analysis of intracellular phage DNA fied using Ultrafree-MC Centrifugal Filter Units (Mdh- pore) and according to the manufacturer's instruc¬ Bacterial cultures were grown at 40°C in Belliker

tions, broth to an of 0,2. was added to a final OD600 nm CaCI2 concentration of 10 mM, followed by the addition of DNA sequencing and analysis the relevant bacteriophage at a multiplicity of infec¬ tion (moi) of >2. Samples (I ml) were removed at DNA was sequenced on both strands by the Sanger regular intervals: before infection, immediately after method of dideoxy-mediated chain termination using infection, and at 20, 40, 60, and 80 min after infection. the fmollM DNA Sequencing System of Promego (Mad¬ The samples weie rapidly centrifugea, and the pel¬ ison, Wl), The sequencing primers were end-labeled lets were frozen by immersion in ultracold cthanol with [y-33P]ATP according to the manufacturer's in¬ K-50 C), After thawing, the pellets were resus- structions. The thermal cycler (Perkin-Elmet) was pended in 1 ml of lysis buffer (6.7% sucrose, 50 mM programmed at 30 cycles of 95°C foi 30 s, 50"C for Trrs, 1 mM FDTA, pH 8 0) containing 10 mg/ml ly¬ 30 s, and 72°C for 1 mm. The sequence obtained was sozyme and incubated for 1 h on ice. SDS was added analysed as previously described by Dosiere et al. (125 /.d of 10% stock solution) together with proteinase (1998), The relevant accession numbers for 0Sfi2l and K (50 /il of 20 mg/ml stock) and the lysates were 0Sfi19 sequences are AF004379 and AF077306, re¬ incubated af 65°C for 30 min. RNase A (50 /tg/ml) spectively. was subsequently added, and incubation was contin¬ ued af 65°C foi an additional 30 mm, Two phenol- Construction of plasmids chloroform extractions were performed on the result¬ The cloning vectors used in this study were as follows1 ing lysates, followed by an ethanol precipitation, The the E. high-copy-number co/Hactococcal-streptococca! DNA pellet obtained was washed in 70% ethanol shuttle vector pNZ124 (Platteeuw et ni, 1994), the low and finally resuspended in 50 fil cof distilled water, of 101

386 FOLEY ET AL

of the bacteno which 25 /tl was used for restriction analysis Re tion system temperate Streptococcus thermophilus phage 0Sfi21 Viroloqy 237, 148-158 stricted DNA samples were electrophoresed, and by Chandry P S Moore S Boyce J D Davidson B E and Hillici A J hybridisation of Southern blots the amount of phage (1997) Analysis of the DNA seguence gene expression origin of DNA was assessed at various time in points during replica1 ion and modular stiuclure of the Lactococcus lactis lytic fection baceiiophago ski Mol Microbiol 26,49-64 Desieie F Lucchini S and Biussow H (1998) Evolution of St/epio coccus theimophilus bactenophage genomes by modular ex Gel retardation assay changes followed by point mutations and small deletions and tnser An overnight culture of S thermophilus Sfil was lions Viioloqy 241 345-356 Desieie F Lucchini S Biutlin A Zwahlen M C and Brussow H in Belliker broth and to an inoculated (2%) grown ODl0n v (1997) A highly conserved DNA replication module fiom Slrepto of 0 2 CaCI^ was then added (final concenti ation coccus theimophilus phages is similar in seguenc e and topology 10 and the culture was in 50 ml vol mM), aliquoted 'o a i odule from Lactococcus lactis phages Vuoloqy 234 3/2 umes The cultures were subsequently infected vi h 382 the relevant phaqe at an m o i of 2 and incubated at De Vos W M and Simons G F M (1994) Gene cloning and exprès sion sys'oms m lactococci //) Genetics and Biotechnology of Lactic 40°C for 30 min The cells were harvested by cent! f Acid Bacteria (M 1 Gasson and W M de Vos Eds; pp 52-105 ugation and washed and the pellets resuspendecl n Blat kie Acadei ic and Piofcssional UK 750 of Tris buffer mM 8 the crude ceM /^.l (10 pH 4°C) Foley S Bron S Venema G Daly C and I itzgcrsld G F (1996) al extracts were prepared as described by Foley ef Molec ilar analysis of the i cplication oi igin of the Lactococcus lactis (1996) Plasmid pCI305 Plasm d 36 125-141 P G F and Hill C (1995a) and DNA A 302 bp Bqlll diqested PCR generated DNA frag Gtivey Fit/geiald Cloning seguence analysis ol two abortive infection phage resistance deter ment corresponding to the phage DNA nseit in mmnts fiom the lactococcal plasmid pNP40 Appl Lnvnon Micro labeled with the Klenow cof pSF23g was end fragment bol 61 432l-4o28 in the DNA polymerase I (New England Biolabs) ap Gaivey P van Sindeien D Iwomey D P Hill C and Fitzgerald propiiato buffer and in the presence of }0 juCi of G F (1995b) Moleculai genetics of bactenophage and natural phage defence systems in the genus Lactococcus Int Dairy J 5, [a 32P]ATP The labeled DNA was phenol extracted 905-947 and ethanol piecipitated The binding conditions used Hill C Massey J and Klaenhammer 1 R (1991) Rapid method to were as described ef al (1996) and mcluded by Foley chaiactenze lactococcal bactenophage genomes Appl Lnvnon Mi 11 /tl of crude cell extiact (-5/ig cof total protein) Aftei c obiol 57 283-288 a 10 mm incubation at room tempeiature 5 /tl ol 50" H h C VI11 er L A and Klaenhammer 1 R (1990) Cloning expression and lenco deteuninalion ol a glycerol was added to each reaction before immediate reg baclcnophagc fiagment encoding bacteriophage resistance in Lactococcus lactis J Bactenol 172, loading on a 4% nondenaturing polyaciylatmde gel 6119 6426 2 6% erol was containing glyc Electrophoresis per Holo H and Nes 1 F (1989) High frequency tiansformation by formed at room for 4 h at 10 V/cm The temperature gel elec lopoiation of Lactococcus lactis ssp cremoit> grown in gly was vacuum dried onto 3MM Whatman pape and cine in osmotically stabilised media Appl lnvnon Miciobiol 55, M autoradiographed for 36 h 1119-31 tanzen I Kleinschmidt J Neve II and Geis A (1992) Segucncing and charactensation of pSTI a cryptic plasmid fiom Stieptococcus ACKNOWLEDGMENTS theimophilus rDMS Microbiol Lett 95,1/5-180 laibi D Dccans B and Simonet I M (1992) Different bacteriophage We thank Josette Sidoti for invaluable technical asbictance and he resistance mechanisms m Streptococcus salivanus subsp ther Swiss National Science Foundation for financial support of Sophie mophilus 1 Dairy Res 59 149-157 Foley and Sacha Lucchini in thefiameworkof i s Biotechnology Vocl C Mcrccnicr A (1990) Moleculai genetics of Stieptococcus thoimophi (Grant 5002-044343/1) We thank Beat Molle f and Divid Cidmoto fci us FFMSMiuooiol Rev 8/ 61-78 contribution^ in the initial ol the woik important phase Moineau S Walket S A Hollei B J Vedamuthu t R andVanden

beieh P A (1995) Expression of a I aclococcus lactis phage resis REFERENCES tance mechanism by Stieptococcus thermophilus Aopl Envnon Miciobiol 61 2461-2466

Brussow H and Bruttin A (1995) C hanctensa ion of a or peiait 0 Sullivan D J Hill C and Klacnhammci 1 R (1993) Effect of Stieptococcus thermophilus bactenophage and Cs oenetic relation increasing the copy number of bacteriophage origins of replication ship with lytic phages Viroloqy 212 632 640 in trans on incoming phage proliferation Appl Envnon Microbiol Biussow H Piobst A Fremont M and Sidoh I (1094) Distnct 59 2449-2456 Streptococcus theimophilus bacteriophages share an extiemely Pcclie X Weise F Chai S Luder G and Alonso 1 C (1994) Analysis conserved DNA fragment Virology 200 834 837 of eis and inns acting elements reguired foi the initiation of DNA Bruttin A and Brussow H (1996) Site specific spontaneous delet ons replica ion in the bacillus subtilis bacteriophage SPP1 / Mol Biol in three genome regions of a tempei a e Streptococcus tl ei nooi i us 236 1,1 1310 phage Vu ology 219 96-104 PhUxeuw C Simons G and de Vos W M (1994) Use of the Esch Bruttin A Desiere F Lucchini s Fole> s and Bi issow H (1(C7a) cm Ina coi ß glucuronidase [gusÄj gent as a reportei gene for Characteuzation of the lysogeny DNA i odule l^m Su omp'iate analyzing p onoters in lactic acid bacteria Appl Lnvnon Microbiol Streptococcus theimophilus bacteriophage e/CfiC Vncloci), 233 60 587-593 136-148 Salas M Fieirc R Soengas M S Esteban J A Mendez I Bravo Bruttin A Foley S andBiussovv H (1997b) Fhe site specific inicgia A Soriano M Blasco M A Lazaro J M Blanco L Gutierrez C J 02

VIRAL DNA ELEMENT CONFERS RESISTANCE TO S thermophilus 387

and HermobO J M (1995) Protein nucleic interactions in bacteno D (1997) Seguence analysis and characterization of ^601205 a phage 29 DNA replication FEMS Microbiol Rev 17, 73 82 lemperale bactenophage infecting Streptococcus thermophilus Sambrook J Fntsch E F and Maniatis T (1989) Molecular Cloning CNRZ1205 Miciobiol 143, 3417-3429 A Laboiatoiy Manual 2nd ed Cold Spring Flarboi Press Cold leizaghi B E and Sandine W E (1975) Improved media for lactic Spring Harboi New York streptococci and their bacteriophages Appl Microbiol 29 807- Slos P Bouiguin J C Lemoine Y and Merceniei A (1991) Isohlonand 313 characterisation of chromosomal promote i s of St/ep'ococ i^aia i Waterfield N R Lubbers M W Polzin K M Le Page R W F and subsp theimophilus Appl Environ Microbiol 57 H33-133"1 Jaivis A W (199C) An origin of DNA replication from Lactococcus Stanley E Fitzgeiald G F leMairec C layaid B and aiCnJee actis bactenophage c2 Appl Envnon Microbiol 02 1452-1453 103

2.6 Construction of food-grade Streptococcus thermophilus starters with potent, broad range phage resistance phenotypes 104

Construction of Food-Grade Streptococcus thermophilus Starters with Potent, Broad Range Phage Resistance Phenotypes

Sacha Lucchini and Harald Brussow

Nestle Research Centre Nestec Ltd Veis-Chez-Les-Blanc CH-1000 Lausanne 26 Switzerland

Introduction

S thermophilus is a thermophilic starter used in the production of various dairy products such as yogurt, Italian- and Swiss-type cheese Bacteriophages represent the main cause of slow fermentation or even complete starter failure To cope with the phage problem, methods currently used are rigorous sanitation of the dairy plant, use of multiple strain starters in rotation of cultures and closed production lines (12)

In Lactococcus lactis, the major mesophihc starter of the dairy industry, natural phage-resistance mechanisms are abundant and usually encoded on conjugative plasmids (1) These have been used to construct food-grade phage resistance mechanisms into important industrial lactococcal starters (22)

In S thermophilus little is known about natural phage resistance mechanisms

(14,19) This is partially the consequence that plasmids are scarce in S thermophilus

(17) Spontaneous phage-insensitive mutants (BIM) can be selected by phage challenge This approach has been applied to S thermophilus However, such mutants are usually slow acid producers and can also revert to phage sensitivity (4)

Thus, it was decided to exploit molecular methods to create phage-resistant strains

One approach based on the random cloning of phage DNA segments on a plasmid has been successful One fragment was shown to interfere with phage DNA replication and to provide effective protection against a broad range of phages (10)

To increase the range of strategies available to obtain phage-resistant strains, we tested two further approaches First, we targeted bacterial host genes for inactivation

The rationale was that we postulated the presence of bacterial genes essential to phage development, but dispensable for bacterial growth in milk In fact, bacteriophages depend on host factors in many steps of their life cycle, e g adsorption, DNA injection, replication and morphogenesis Disruption of one of these factors will lead to a phage-resistant cell Having no knowledge of the S thermophilus genome we opted for a random gene inactivation approach relying on the temperature sensitive plasmid pG+host9ISSf (16) The method is self-selective 105

after phage challenge nonresistant cells will be eliminated and mutants leading to decreased bacterial fitness will be outnumbered.

The second approach relied on the protection provided by the Sfi21 prophage (8).

Temperate phages usually code for superinfection immunity genes, which are quite effective in protecting the lysogen from superinfection by both temperate and lytic phages. In the case of phage Sfi21 the superinfection control is apparently mediated by two distinct genetic elements: orf 203, a superinfection immunity gene and orf

127, the phage repressor. The drawback is that Sfi21 lysogens continuously release infectious phage particles in the media, which would contaminate the factory and possibly prevent a later introduction of valuable starters susceptible to phage Sfi21. It has also been shown that temperate phages can be the source of lytic derivatives

(3;7;23). Therefore, we decided to create targeted deletions in the Sfi21 prophage inserted in the lysogénie starter Sfi1cl6 to obtain a lysogen that would retain superinfection control, but has lost its capacity to produce infectious virions.

Here we demonstrate that both approaches provide powerful phage-resistant food-grade starters for industrial milk fermentation.

Materials and methods

Strains, media, plasmids, and culture conditions. TheE. coli strain 101 was propagated in LB broth or LB broth solidified with 1.5% (WA/) agar at 37°C. Liquid cultures were grown under agitation (240 rpm). S. thermophilus strains Sfil, its lysogénie derivative Sfi1cl6 (containing the Sfi21 prophage) and their transformants were cultivated at 42°C either in M17 supplemented with 0.5% lactose (LM17), Belliker media or MSK. Erythromycin was used when required at a final concentration of 2 and 150 jLig/ml for S. thermophilus and E.coli, respectively. The phages used in this study were obtained from the Nestlé collection and propagated on their appropriate S. thermophilus strain in LM17 broth as described previously (5;7). Phage enumeration was achieved as described in (10). For random insertion mutagenesis and directed mutagenesis, plasmid pG+host9ISSf and pG+host9 have been used (16), respectively.

DNA techniques. Phage purification, DNA extraction and purification, agarose gel electrophoresis, Southern blot hybridization, and DNA labeling were executed as described previously (5;7;8). The Qiaprep plasmid kit (Qiagen) was used for the rapid isolation of plasmid DNA from £. coli. Restriction enzymes and T4 ligase were purchased from Boehringer-Mannheim and used according to the supplier's instructions. E. coll was electrotransformated as outlined in the BioRad instruction manual. S. thermophilus was electroporated using the procedure described by Slos et al. (25). The analysis of intracellular phage DNA, PCR and DNA sequencing have been performed as described previously (10),

Sequence analysis. The Genetics Computer Group (University of Wisconsin) sequence analysis package was used to assemble and analyze the sequences. Nucleotide and predicted amino acid sequences were compares to those in the 106

databases (GenBank, release 109, EMBL, release 56, PIR-Protein, release 57, SWISS-PROT, release 36, and PROSITE, release 15 0) with FastA(15) and BLAST programs (2). Prediction of transmembrane domains was performed using the TMpred program (11)

Construction of plasmid for site directed integration. To create a deletion in the Sfi21 prophage, the thermosensitive plasmid pG+host9AB1560, a derivative of pG+host9 has been created Two fragments of approximately 500 bp, A and B, were chosen in the orf1560 on the prophage sequence at a distance of 2 4 kb so that a deletion of the same size would be created by homologous recombination. Fragment A was generated by PCR using phage Sfi21 DNA as a template and primers (5'-AAC TGCAGT CTC AGC TCA AAG GGA C-3' AND 5'-GGA ATT CTA GCC GTG ATG TTT TTG-3') containing Pst\ and EcoRI restriction sites (underlined) Fragment B was generated by PCR using primers (5'-GGA ATT CGA CGC AAT TAA AGA CCC-3' AND 5'-CCA TCG ATC TGC TTC CAA AAT CTC G-3') containing EcoRI and C/al restriction sites Both clones were then cloned into pG+host9, so to be adjacent, generating the construct pG+host9AB1560 The construct was first generated intoE coli 101, then transformed into S thermophilus Sfi1cl6

Transposition of pG+host9lSSf and pG+host9AB1560 in theS. thermophilus chromosome. S thermophilus Sfil and Sfi1cl6 containing pG+host9ISS7 and pG+host9AB1560, respectively, were grown overnight in LM17 medium supplemented with 2 jig/ml of erythromycin The saturated cultures were diluted 100- fold in LM17 medium containing 1 jug/ml of erythromycin and incubated 2 h at 30°C. The cultures were then shifted to 42°C to eliminate free plasmids and grown to saturation The frequency of integration per cell was estimated as the ration of the number of EmR cells at 42°C to the number of viable cells at 30°C Integration of the Plasmids was checked by Southern blot hybridization To excise the transposed vectors, serial passages have been performed in LM17 broth without antibiotic

Selection of phage-resistant mutants. The culture containing the original population of Sfil pG+host9ISSf integrants was diluted 100-fold in LM17 supplemented with 2 ug/ml of erythromycin and challenged with lytic phage Sfi 19 at a MOI of 5 The culture was then grown to saturation The experiments were considered unsuccessful when no growth was observed after 48h

Phage adsorption test. S theimophilus cultures were grown in LM17 until an OD6oo of 0 6 was reached Then phages were added at a M O I of 1 and the cultures incubated at room temperature Probes were taken immediately after phage addition and after 30 mm These probes were then filtered to remove the bacterial cells from the cultures Phages left were then enumerated The adsorption was calculated as phage counts at time 30 divided by the phage counts at time zero 107

Results

Random mutagenesis of S. thermophilus Sfil. Starter strain Sfil is our best

indicator cell. It is susceptible to about 25 of the 100 phages from our collection. This allows testing mutant starters against a broad range of phages.

Transposition of the plasmid could be achieved with a very high integration frequency, about 50% (see materials and methods). Integrants were randomly chosen on agar-plates and grown in liquid cultures. Their total DNA was then extracted to check by Southern blot hybridization the integration of the plasmid into the chromosome. Hybridization was performed on EcoRI digested chromosomal

DNA using labeled pG+host9ISSf as probe. All integrants showed signals corresponding to distinct chromosomal fragments (data not shown), confirming the integration of the plasmid and the randomness of the transposition.

Selection of phage-resistant mutants. Integrants were challenged with the lytic phage Sfi19 (M.O.I.=5) to select for phage-resistant mutants. In presence of phage

Sfi19 the Sfi1::pG+host9ISS"/ culture showed a delayed growth, an OD600of 0.85 was only reached after 8 to 12 h of incubation in comparison with 3 h for the transformants in the absence of challenge phage. In contrast, no growth was observed for the parental Sfil starter after phage challenge (OD60o < 0.02 after 48 h).

In fact, in LM17 medium we consistently failed to obtain phage resistant mutants of starter Sfil after challenge with numerous phages.

1 2 3 4 5 6 M

*•*' ««* Plasmid .u-ù«-* **•"* ^* tandem repeats -» --~ •«

Figure 1. Southern blot of total DNA of phage resistant integrants. Lanes 1-5: phage resistant Sfi1::pG+host9ISSl Lane 6. Sfil M DNA marker (X-DNA x Hindlll). The blot was probed with radioactively marked pG+host9ISS7' plasmid and X-DNA. Note: this figure is the combination of two independent blots (lane 1 has been added).

Individual phage-resistant colonies were isolated; their total DNA was extracted, and digested with EcoRI, and Southern blots were performed using the plasmid DNA 108

as probe. Three different hybridization patterns were identified, corresponding to the

Sfi1-R7 (lane 3), Sfi1-R24 (lane 1) and Sfi1-R71 (lanes 2,4,5) insertion mutants (Fig. 1)-

Characterization of the phage-resistant mutants. The 3 mutants were tested for phage resistance against 15 different S. thermophilus phages. In no case, phage plaques were observed suggesting in some cases efficiency of plating of < 10"7 (Table 1).

SfrCR7~"""> Phage (pfu/ml) (pfu/ml) (pfu/ml) (pfu/ml)

Sfi21 6X 10s <102 <102 <102

S69 72X 103 <102 <102 <102

SfiSJ 2X108 <102 <102 <102

St44A 1 X108 <102 <102 <102

S19 2X1Û8 <102 <102 <102

S96 4X10' <102 <102 <102

St44 1 X 10s <102 <102 <102

St40 2X 105 <102 <102 <102

Sfi 18 2 7X 107 <102 <102 <102

Sfi 19 1 1 X 109 <102 <102 <102

St25 3X108 <102 <102 <102

St17 1 X108 <102 <102 <102

S17 6X108 <102 <102 <102

H 4 5X 103 <102 <102 <102

F 1X104 <102 <102 <102

«.««^.w««^«.^«^«w^^^^^^^w^^^wvv. ^.^ , *««.**..<„* ^ _ ^^w^Mv^*.M«<^^^^^«W.-««*««, •.«, ^».«^.w.«««« w«-«,.

To have some indications at what stage the phage development is blocked, phage-adsorption and phage DNA-replication tests were performed. The adsorption test was performed using Sfi 19 on Sfi 1 and the three mutants. After 30', 90% of the input phages adsorbed to Sfil, 94% to Sfi1-R7, 91% to Sfi1-R24 and 87% to Sfi1-

R71. This indicates that the mutation did not affect phage adsorption.

DNA replication of phage Sfi19 in Sfil and the mutants was analyzed by determining the relative amount of intracellular phage DNA at different times during phage infection. In Sfil phage DNA was readily detected 20 min after infection (Fig.

2). The amount of DNA increased to a maximum after 40 min. Then a decrease was observed after 60 and 80 min, probably because of phage induced cell lysis. Phage

DNA replication in the Sfi1-R24 mutant was delayed and decreased. 109

Sfil Sfi1-R7 Sfi1-R24 Sfi1-R71

M 0 20 40 60 80 0 20 40 60 80 0 20 40 60 80 0 20 40 60 80 Wl

S Figure 2: Detection of intracellular phage DNA. The ^P numbers correspond to the time in minutes after which samples have been taken (see Materials and methods) The two lines marked with M correspond to the size marker (1 kb DNA Ladder, GibcoBRL)

Phage DNA was detected only after 40 mm reaching a maximum at 60 min. In the case of Sfi1-R7 and Sfi1-R71 no phage replication was observed. A very low and invariable level of phage DNA was detected by hybridization. This could represent the injected DNA of the infecting input phages or DNA from uninjected, but adsorbed input phages. The growth of the mutants was compared to that of the wild type strain in milk This was done by measuring changes in impedance of the culture media

(Rapid Automated Bacterial Impedance Technique, Don Whitley Systems) This system measures indirectly the transformation of a weak electrical conductor like the polar, but uncharged lactose into the electrically charged lactic acid during bacterial growth The resulting curve (time vs conductivity) can be correlated with both bactenal growth and acidification of the culture. All three mutants did not show significant differences in growth (and acidification) to the parental strain Sfil (Fig. 3).

43

.3,

III

Ü I Figure 3. Growth curves of phage- resistant S thermophilus strains compared to the parental strain V Sfil 2 Sfi1-R7 3 Sfi1-R71 4 Sfi1-R24 Time [h] 110

Determination of the site of integration. The fragments adjacent to the insertion

point were obtained by plasmid rescue as described by Maguin et al. (16) for the

Sfi1-R7 and Sfi1-R24 mutants. The approach was unsuccessful for Sfi1-R71. The

regions flanking the pG+host9ISS7 insertion point were sequenced for the two

mutants (Fig. 4).

In Sfi1-R7 the plasmid integrated in the last 4 amino acids of a putative

chorismate mutase chain A gene (PheA), orf 90. This site of integration could also function as promoter region for the downstream orf 394. Orf 394 gene product shows

similarities (26% identity) to a hypothetical conserved protein of Methanococcus jannashii (Accession number MJ0305). Orf 394 gp may be an integral protein since 9

strong transmembrane helices were predicted. Orf 115 gp encoded by a gene

located directly downstream of orf 394, showed significant similarity to the ribosomal

protein L19 from a number of bacteria (Haemophilus influenzae, Salmonella

typhimurium, Serratia marcescens, Synechocystis sp.). A gene almost identical to a

tRNAArg gene (anticodon CCU) follows this gene from E. coli and interestingly the

phage attachment site attB for prophage integration. Upstream of orf 90 a gene we

identified orf 486 (see Fig. 4), which encodes a further possible membrane protein.

The deduced protein for this orf shows significant similarity (50% over 427 aa) to a

hypothetical Cl-channel-like protein from E, coli (Accession number P37019).

0 1000 2000 3000 4000bp

PheA attB Cl-channel-like TM-protein L19 Sfi1-R7 486 90J 394 115 |V

Insertion tRNAAr9

Oxi do-

red uctase hsdR Sfi1-R24 A 82 104 79 67 740 61 269 71

Insertion

Figure 4: Sites of pG+host9ISSf insertion in Sfi1-R7 and Sfi1-R24. Open arrows indicate the predicted open reading frames (orf) The numbers indicate the number of codons for each orf. When possible, a tentative function has been attributed to the predicted gene products. Ill

In Sfi1-R24 the pG+host9ISSf insertion point was in a likely oxidoreductase

involved in fatty acid biosynthesis (orf 269). The predicted protein showed 42%

amino acid identity with an E. coli 3-oxoacyl-[acyl-carrier protein] reductase.

Interestingly, one fragment of an R-subunit of a type I restriction enzyme is located

downstream orf 269. No database matches were found for the orfs upstream of orf

269.

Plasmid excision. To obtain food-grade starter strains it is necessary to remove

the erythromycin resistance gene present in the transformants. ISS? undergoes

replicative transposition which leads to an integrated plasmid vector flanked by two

IS elements. Therefore, homologous recombination between this two copies of the IS

element will entirely remove the plasmid and leave only a single copy of the IS

element. To favor recombination, serial passages of the transformants were done in

absence of antibiotic selection. Sfi1-R24 lost the EmR phenotype after 29 passages

(now Sfi1-R24e), Sfi1-R71 after 40 (Sfi1-R71e). Excision was easier with Sfi1-R24,

because of the lack of tandem repeats (Fig. 1). Sfi1-R7 Ems (Sfi1-R7e) could be

obtained after 60 passages. Then, phage resistance was checked for the Ems

mutants. In 2 cases (Sfi1-R24 and Sfi1-R71) the loss of the plasmid resulted in full

reversion to phage sensitivity while Sfi1-R7e did retain its full phage resistance

phenotype. This derivative lacked the plasmid sequence except for one ISSf

sequence. Sequencing of the flanking sequences of the remaining ISSf showed in

addition a 69 bp deletion which started at the IS transposition site and removed 3'~

end of orf 90 and the first 15 codons of orf 394 (Fig. 5). Since two independently

obtained Sfi1-R7e mutants showed the same deletion it is likely that the deletion has

already occurred in the parental Sfi1-R7 mutant.

\S.S1 Stop orf90 Start orf394

2101 b ' -CGTACTTATCAAGCAAGTAAGCTAGAAGAAAAGTAG3CTCCTAGGATG. VAAGTAAGACAA

3'-GCATGAATAGTTCGTTCATTCGATCTTCTTTTCATCCGAGGATCCTACTTTCATTCTGTT

2161 5 ' -TTAAGTGATAGTAAAGGTCTTTCTTACTTACAT7TAGTAATGTTAAGTCTTTATGCAATT

3' -AATTCACTATCATTTCCAGAAAGAATGAATG7AAATCATTACAATTCAGAAATACGTTAA

Figure 5: Deletion in Sfi1-R7e The sequence deleted in Sfi1-R7e is in bold. The arrow indicates the pG+host9ISSf integration site.

Deletion of orf 1560 from the Sfi21 prophage. Integration of pG+host9AB1560

into the prophage was readily obtained (see materials and methods). Integration in the right site was checked by PCR. More than 50% of the integrants were shown to 112

give a signal corresponding to the predicted size for homologous recombination

events. Plasmid excision was obtained after 20 serial passages. Because the second

homologous recombination can lead both to the wild type or the deletion mutant,

PCR was used to test the Ems clones. As expected about 50% of the clones were

deletion derivatives (Sfi1cl6A1560).

Characterization of Sfï1cl6Àl560. To test if we inactivated an essential gene of the phage we induced the prophage of both Sficl6 and Sfi1cl6A1560 with 2u.g/ml of

Mytomicin C. Sficl6 released 105 pfu/ml, Sfi1cl6D1560 released no detectable

infectious particles. Subsequently we checked whether the superinfection exclusion

phenotype was preserved in the derivative lysogen

Phage Sfil Sfi1cl6 Sfi1cl6A1560 (pfu/ml) (pfu/ml) (pfu/ml) Sfi 18 7 4X"Ï0r~ ^jTj^ < 10'

S96 9 8X106 <102 <• 102

Sfi21 8.3 X107 <102 <102

Sfi 19 1 6X108 8 8X105 1 7X106

ST25 1 1X108 1 6X107 2 2X107

ST17 6 5X108 1 2X 104 1 3X 104

Table 2

In fact, plaque assays demonstrated an identical phage exclusion phenotype in both with respect to suppression of phage infectivity expressed as efficiency of plaquing as well as in the range of different phages which were inhibited (Table 2).

Growth curves of the lysogen and its deletion derivative were identical (Fig. 6).

«

& /

o

Figure 6. Growth curve of S thermophilus Sfi1cl6A1560 (2)

* 0^ - compared to Sfi1cl6 (1). 113

Discussion

Two independent strategies led to increased phage resistance of S thermophilus starter strains The first strategy relied on random insertion of an integrative plasmid into the chromosome of S thermophilus Thereby, we obtained highly phage resistant (e o p < 106) mutants

The broad range of anti-phage activity of the Sfi1-R7 mutation suggests that the

15 phages, which were tested, have similar requirements for host factors, i e orf 90 and orf 394 gps At the current level of analysis we can not exclude effects of the IS element and/or the adjacent deletion on the transcription of adjacent bacterial genes

Phage structural proteins interact with bacterial structures at least for phage adsorption and DNA injection The apparent lack of intracellular phage DNA in the

Sfi1-R7 and Sfi1-R71 mutants may be an indication of interference with phage DNA injection In phages from Gram-positive bacteria initial phage adsorption and phage

DNA injection seem to follow a two step process (13,20,24,26,27) First, phages adsorb reversibly to carbohydrate components of the cell wall, then appears to follow an irreversible interaction with the plasma membrane and ejection of DNA For bacteriophage c2 it has been shown that a membrane protein, PIP, was responsible for irreversible phage adsorption to the host (18) Phage c2 was unable to infect cells with defective PIP proteins Whether PIP is also responsible for phage DNA injection or other structures are needed remains an open question For example, bacteriophage X requires a host protein (Pel) in addition to the LamB receptor for an effective DNA injection Pel is apparently not involved in phage adsorption since X binds tightly to E coli pet strains (21) It is thus possible that in the case of the Sfi1-

R7 and Sfi1-R71 mutants, proteins involved in phage irreversible adsorption and/or

DNA injection have been modified In fact, the sequence information available for

Sfi1-R7 indicates that the insertion of pG+host9ISS*/ may have blocked the expression of a putative integral membrane protein

The phage resistance phenotype of Sfi1-R24 resembles that of an abortive infection mechanism The phage DNA can enter the cell and phage DNA replication takes place, but it is delayed and diminished We suspect therefore that the insertion event affects the expression of a host factor involve in a phage multiplication step preceding DNA replication Blockage of this step apparently affects DNA replication 114

negatively In the case of Sfi1-R24, the oxidoreductase is not the host factor whose

inactivation causes the delay in DNA replication since its interruption by the IS

element after plasmid excision fails to provide phage inhibition Apparently, the transcription of adjacent genes is disturbed by the integrated plasmid A gene

encoding a subunit of a restriction endonuclease was the only bioinformatic hit in the targeted region However, restriction and modification mechanisms are excluded

since no traces of phage DNA degradation are detected in the Southern blot (Fig 2)

Direct interference with phage DNA replication is unlikely since the inhibited phages

have apparently different DNA replication modules (6) and are thus likely to depend

on different host replication functions

The Sfi21 prophage could be inactivated by targeted food-grade phage gene

inactivation No release of infectious particles by the Sfi1cl6A1560 lysogen could be

detected after prophage induction Importantly, the inactivated prophage retained its

protective effect against superinfection with a broad range of temperate and lytic S

thermophilus phages Consequently, inactivated lysogens can now be considered as

valuable starters Their protective power exceeds that obtained with starter strains

containing phage-inhibitory plasmids For example, the superinfection immunity gene

of Sfi21 (orf 203) was cloned in a high copy number plasmid and tested for protection

against phages (8) It was quantitatively and qualitatively less complete than that

mediated by the prophage In addition, the plasmid was not any longer phage

inhibitory when integrated as a single copy into the bacterial chromosome Therefore,

the use of orf 203 has to rely on its presence on a high copy number plasmid, with

the problem of high metabolic costs and plasmid instability In contrast, the integrated

prophage seems to be of low metabolic cost to the cell since lysogénie and non-

lysogenic S thermophilus starters have identical growth properties (unpublished

results) Inactivated prophages can be used to increase the phage resistance of

other strains than Sfil since a number of valuable industrial S thermophilus starters

can be lysogenized with phage Sfi21 (9)

Reference List

1 Allison, G.E. and T.R. Klaenhammer 1998 Phage resistance mechanisms in lactic acid bacteria Int Dairy J 8 207-226 2 Altschul, S.F., W. Gish, W. Miller, E.W. Myers, and DJ. Lipman 1990 Basic local alignment search tool J Mol Biol 215 403-410 3 Boyce, J.D., B.E. Davidson, and A.J. Hillier 1995 Spontaneous deletion mutants of the Lactococcus lactis temperate bactenophage BK5-T and localization of the BK5-T attP site Appl Environ Microbiol 61 4105-4109 115

4. Brüssow, H. 1999 (UnPub) 5. Brüssow, H. and A. Bruttin 1995 Characterization of a temperate Streptococcus thermophilus bacteriophage and its genetic relationship with lytic phages Virology 212 632-640 6. Brüssow, H., A. Probst, M. Fremont, and J. Sidoti 1994 Distinct Streptococcus thermophilus bacteriophages share an extremely conserved DNA fragment Virology 200 854-857 7 Bruttin, A. and H. Brüssow 1996 Site-specific spontaneous deletions in three genome regions of a temperate Streptococcus thermophilus phage Virology 219 96-104. 8 Bruttin, A., F. Desiere, S. Lucchini, S. Foley, and H. Brüssow 1997 Characterization of the lysogeny DNA module from the temperate Streptococcus thermophilus bactenophage phi Sfi21 Virology 233 136-148 9 Bruttin, A., S. Foley, and H. Brüssow 1997 The site-specific integration system of the temperate Streptococcus thermophilus bacteriophage phiSfi21 Virology 237 148-158 10. Foley, S., S. Lucchini, M.-C. Zwahlen, and H. Brüssow 1998 A short noncoding viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophiius Virology 250 377- 387

11 Hofmann, K. and W. Stoffel 1993 TMbase - A database of membrane spanning proteins segments Biol Chem 347 166-166 12 Josephsen, J. and H. Neve 1998 Bacteriophages and lactic acid bacteria, p 385- 436 In S Salminen and A von Wright (eds ), Lactic acid bacteria Dekker, Basel

13 Keogh, B.P. and G. Pettingill 1983 Adsorption of bacteriophage eb7 on Streptococcus cremons EB7 Appl Environ Microbiol 45 1946-1948 14 Larbi, D., B. Decaris, and J.-M. Simonet 1992 Different bacteriophage resistance mechanisms in Streptococcus salivanus subsp thermophiius J Dairy Res 59-349-357 15. Lipman, D.J. and W.R. Pearson 1985 Rapid and sensitive protein similarity searches Science 227 1435-1441 16. Maguin, E., H. Prévost, S.D. Ehrlich, and A. Gruss 1996 Efficient insertional mutagenesis in lactococci and other gram-positive bacteria J Bactenol 178 931-935 17. Mercenier, A. 1990 Molecular genetics of Streptococcus thermophilus FEMS Microbiol Rev 7 61-77 18 Monteville, M.R., B. Ardestani, and B.L. Geller 1994 Lactococcal bacteriophages require a host cell wall carbohydrate and a plasma membrane protein for adsorption and ejection of DNA Appl Environ Microbiol 60 3204-3211 19 O'Sullivan, T., D. van Sinderen, and G. Fitzgerald 1999 Structural and functional analysis of pCI65st, a 6 5 kb plasmid from Streptococcus thermophilus NDI-6 Microbiology 145 127-134 20 Oram, J.D. and B. Reiter 1968 The adsorption of phage to group N streptococci The specificity of adsorption and the location of phage receptor substances in the cell wall and plasma membrane fractions J Gen Virol 3 103-119 21 Roessner, C.A. and G.M, Ihler 1987 Sequence of amino acids in lamB responsible for spontaneous ejection of bacteriophage lambda DNA J Mol Biol 195 963- 966 22 Sanders, M.E., P.K. Leonhard, W.D. Sing, and T.R. Klaenhammer 1986 Conjugal strategy for construction of fast acid producing, bactenophage-resistant lactic streptococci for use in dairy fermentations Appl Environ Microbiol 52 1001- 1007

23 Shimizu-Kadota, M.S.T. and N. Tsushida 1983 Prophage origin of a virulen phage appearing on fermentations of Lactobacillus casei S-1 Appl Environ Microbiol 45 669-674 24 Sijtsma, L., A. Sterkenburg, and J.T.M. Wouters 1988 Properties of the cell walls of Lactococcus lacits subsp cremons SK110 and its bactenophage-sensitive variant SK112 Appl Environ Microbiol 54 2808-2811 116

25. Slos, P., J.-C. Bourquin, Y. Lemoine, and A. Mercenier. 1991. Isolation and characterization of chromosomal promoters of Streptococcus salivarius subsp. thermophilus. Appl.Environ.Microbiol. 57:1333-1339. 26. Valyasevi, R., V.E. Sandine, and B.L. Geller. 1994. Lactococcus lactis subsp. lactis C2 bacteriophage ski involves rhamnose and glucose moieties in the cell wall. J.Dairy Sei. 77:1-6, 27. Valyasevi, R., W.E. Sandine, and B.L. Geller. 1990. The bacteriophage kh receptor of Lactococcus lactis subsp. cremoris KH is the rhamnose of the extracellular wall polysaccharide. Appl. Environ.Microbiol. 56:1882-1889. 117

3 Summary and conclusions

S. thermophilus phage diversity. To address the question of phage evolution

and diversity S thermophilus phages were initially compared by Southern blot and dot blot (3,4) All S thermophilus phages, both temperate and lytic, were highly

related and belonged to a single DNA-hybndization group However, regions of high sequence similarity were interspersed with unrelated DNA segments An extremely conserved 2 2 kb DNA region (>99% identity), probably involved in phage DNA replication, was shared by phages differing in many characteristics (4) This was interpreted as the result of recent genetic exchange within this group of phages

Evidences of genetic exchange between related phages were previously observed for the lambdoid phages, too (7) A sequencing project was then started to determine the complete DNA sequence of several S thermophilus phages covering differences in life styles (lytic and temperate), DNA packaging mechanism (cos- and pac-site phages) and structural protein pattern (2 major structural proteins vs 3), host range, serotype (1 and 2), susceptibility to prophage control (homo- or hetero-immune to phage Sfi21) and inhibition by phage Sfi21 encoded replication origin

The complete sequences of five S theimophilus phages are now available (Sfil 1,

Sfi19, Sfi21 (11), DT1 (38) and 01205 (36)) In addition, the right half of the genome of phage Sfi18 was sequenced Pairwise and multiple alignments of these sequences supports the theory of modular evolution of bacteriophages (1) genes are clustered into functional units (modules) all S thermophilus phages have a similar genomic organization (DNA-packaging followed by head morphogenesis, tail morphogenesis, host lysis, lysogeny module, DNA replication and finally a region possibly involved in phage development control) When phage DNA sequences were aligned, sharp transitions between regions of high sequence similarity and regions of low or no relatedness were observed Individual S thermophilus phages appear to result from recombination of exchangeable genetic elements drawn from a common gene pool

This gene pool is apparently restricted since no evidence of recent horizontal gene transfer of host genes or genes from phages infecting other bacterial species have been found (defined by >70% nucleotide sequence similarity) An exception was provided by the high similarities (>80%) detected between the phage Sfi21 replication origin (12) and the noncoding region surrounding the cos-site of Sfi 19 with the cryptic S thermophilus plasmid pST1

The units of genetic exchange were either entire functional modules (several adjacent structural genes), part of a module (individual genes in the lysogeny module) or even parts of a gene corresponding to likely protein domains (host range 118

determinant of the likely tail fibers, DNA binding domain of regulatory proteins). For

each unit of genetic exchange, multiple alleles exist. In the lambdoid phages the frequency of the occurrence of the different alleles indicated that the total number in the population might be small (20). This seems also to be the case among the

S.thermophilus phage population. For example, two different structural gene clusters were found; one corresponds to the cos-phages and one to the pac-phages. No evidence of a third allele could be found, all phages analyzed by Southern blot could

be classified into one of these two groups (23;24). Also only two different DNA

replication modules have been detected (39). The diversity within each allele was

limited to point mutations (resulting in 15% sequence diversity) and small indels.

The genes involved in DNA replication or phage morphogenesis and DNA packaging are apparently exchanged as multigenic sequences. These proteins build a network of interactions with each other and the exchange of a single protein between unrelated alleles will probably result in nonproductive protein-protein interactions. The situation is different for the likely tail fiber genes. This region is clearly a recombination hot spot, Horizontal gene transfer events may be facilitated by the postulated elongated conformation of these proteins. In fibrous proteins individual protein segments make less contacts with other parts than in globular proteins allowing for more genetic flexibility.

The amino-end of the likely tail fibers are well conserved within the pac- or the cos-phages (>90%), whereas the carboxy-terminal part is the most diversified. This is similar to the situation in the tail fiber genes from X and T4. In these two phages the highly conserved amino-end is thought to bind to the body of the virions, the more variable carboxy-end determines the host range (9). Transfer of segments of tail fiber genes seems to occur between cos- and pac-phages. This is reminiscent of what has been described for coliphages (18), where likely horizontal gene transfer of tail fiber genes occurs between phages having otherwise unrelated structural proteins. It has been postulated that the exchange of host range determinants is a mechanism used by phages to extend their host range.

A second recombination hot spot is the lysogeny module (30). In the comparison of three temperate S. thermophilus phages distinct multiple sharp transitions zones from highly similar to moderately similar or different DNA sequences were observed, suggesting multiple exchange events. The unit of genetic exchange could be as small as a single gene or even a segment of a gene. This remarkably recombination freedom could be rationalized by the characteristics of the encoded functions. 119

Proteins encoding integrative functions, superinfection immunity and transcription

control do not depend on protein-protein interactions as structural virion proteins.

Therefore, these functional units may be exchanged freely. Such exchanges within

the lysogeny module have also been observed for lactococcal phages (25), and P2~

like phages from Haemophilus influenzae (34). DNA/DNA-hybridization experiments

showed that temperate and lytic phages of S. thermophilus are related (3). This has

also been observed for L. lactis (5;13;31) and Lactobacillus (28;32;37) phages.

Comparison of the entire genome of the temperate cos-site phage Sfi21 with the lytic

cos-site phage Sfi19 confirmed their relationship. Both phages have a similar genomic organization and share extensive DNA sequence identity (>85%) along their entire genome except for the lysogeny region. The corresponding region in phage

Sfi19 is much shorter (2.6 kb vs. 5.7 kb) and lacks sequence relatedness, both at the

DNA and amino acid level.

Comparison of the Sfi19 lysogeny replacement module revealed a sequence similarity with the lysogeny modules of further temperate S. thermophilus phages

(01205, TP-J34). It is therefore likely that virulents, thermophilus phages are the result of deletion events and gene rearrangements in temperate phages. In fact, a lytic derivative of the temperate phage Sfi21 could be readily isolated in the laboratory (6). A very similar virulent phage was also isolated from the field. All these observations raise the possibility that temperate and lytic phages of S. thermophilus ate related to each other by simple gene insertion or deletion events. Sfil 9-like lysogeny replacement modules were also found in Sfi11 and Sfi18. Their sequences were highly conserved (>95%). Many indels and frame shifts were detected, suggesting that several of the genes in that region are nonessential. Interestingly, all sequenced lytic phages conserved a cro-like gene.

Most of the sequenced S. thermophilus phages have related host lysis modules consisting of two holins followed by a lysin. These proteins are well conserved with aa similarities ranging from 78 to 99%. However, in phage DT1 (38) the first holin gene is missing. It has been shown that overexpression of only one of the holins is needed to have a lethal effect on E, coli. Further evidences that each gene of the host lysis module can be exchanged independently from the others have been reported by Sheehan et al. (33).

All the sequenced S. thermophilus phages have a very similar right genome half.

The region directly downstream the lysogeny or lysogeny replacement module is involved in DNA replication. This is the most conserved part of the S. thermophilus 120

phage genomes (4;12). The region adjacent to the cos-site may have regulatory

functions. Many deletions, insertions and gene replacements punctuated this region.

A detailed comparison of the DNA sequences from phages Sfi19 and Sfi18

indicates that point mutations may also play an important role in phage adaptation.

Very few base pair changes seem to be under the escape of phage Sfi19 from the

inhibition mediated by the Sfi21 encoded origin of replication.

Relationship of S. thermophilus phages to Siphoviridae from the Gram-

positive bacteria. Classification of bacteriophages has always been a controversial

issue. Since phages seem to exchange genetic material so frequently, phylogenetic

relationships are expected to be blurred. The taxonomical criteria used currently to classify phages can not easily be used to establish phylogenetic relationships. We are convinced that any meaningful classification of phages has ultimately to be sequence based. Because of the frequent lateral gene transfers between phages, their evolutionary relationship has to be examined at the level of individual modules

(1). Recently, data are accumulating indicating that lateral gene transfer events play also a prominent role in the evolution of bacterial genomes (14). For example, since

E. coli diverged from the Salmonella lineage at about 100 million years ago, 18% new genes have been introduced into the E. coli genome (22).

To test whether it is possible to trace the evolutionary history of phages, we examined the genetic relationship between S. thermophilus phages and other phages focusing on modules that are under different evolutionary constraints, the structural gene cluster and the lysogeny module.

By comparing the structural genes of the pac-phage Sfi 11 to the databases, we found multiple links to B1 -Siphoviridae (small isometric head and long noncontractile tail) from Gram-positive bacteria. The phage sequence comparison revealed a hierarchy of relatedness, which correlated approximately with the evolutionary distance between their bacterial hosts to S. thermophilus: the more distant the hosts, the less similar were the phage structural genes. The observation of such a graded relatedness in phages is important, since it is the hallmark of a system undergoing evolutionary changes. Our data also indicate lack of horizontal gene transfer events between phages infecting distinct bacterial species. The highest similarity between a

S, thermophilus phage and a phage infecting a different host species was found comparing the structural gene modules of Sfi21 and that of the lactococcal phage

BK5-T. The comparison showed detectable DNA sequence similarity (50-60 %) along most of the structural gene module (11). This high identity may indicate that lateral 121

gene transfers between these phages may have occurred. However, DNA/DNA

homology studies have shown that phages from these bacteria are genetically distinct (21) and probably, no more in genetic contact.

Alignment of the genetic maps of Bl-Siphovindae from Gram-positive bacteria

revealed a striking conservation of gene order. This was true for both pac-site and

cos-site B1 -Siphoviridae from Gram-positive bacteria and extended even to temperate Bl-Siphoviridae from Gram-negative bacteria like coliphage X and an archaevirus (11) On the basis of these observations we hypothesize that the morphogenesis module of B1-Siphovindae shares a common ancestry. The conservation was not restricted to the gene order, but was seen in protein characteristics such as size and isoelectric point (10). A similar gene order may, at least in theory, favor the productive recombination between phages(8). Whether this fact is a selective force to maintain gene order is questionable for phages that are not any longer exchanging genetic material This particular gene order could have been conserved because a correlation between gene arrangement and order of action of the gene products might be of selective advantage (8) The conservation of the gene order is even more striking when considering the fact that no sequence similarity can be detected between the structural proteins of X and those of the B1-Siphovihdae from the Gram-positive bacteria. It is envisageable that these structural gene clusters separated in such a distant past that the protein sequences diverged beyond recognition.

At the moment the amount of available sequences is a limiting factor for the determination of sequence relationships between evolutionary distant phages Only about 35 complete sequences of phages are available and most of them come from low-GC content Gram-positive bacteria (n=19), followed by y-proteobactena (n=9) and high-GC content Gram-positive bacteria (n=4) The sequencing of phages from a broader phylogenetic range of hosts should give us the possibility to find more links between phages and thus, will provide the data for a sequence based theory of phage evolution (19) Recently, sequencing of the Streptomyces phage C31 showed that the head assembly proteins were conserved in phages coming from distantly related hosts; Streptomyces phage C31, coliphage HK97, staphylococcal phage

PVL, two Rhodobacter capsulatus prophages and two Mycobacterium tuberculosis prophages (35). Similar links were demonstrated for head assembly proteins fromS. thermophilus Sfi21 (11) 122

Alignment of the lysogeny modules of all currently sequenced temperate

Siphoviridae from the low GC-content Gram-positive demonstrated related lysogeny modules. These modules showed not only a similar overall gene organization, but were also linked by many sequence similarities The gene organization of the lysogeny module of this group of phages distinguishes it from the Siphovindae infecting Gram-negative bacteria (e g coliphages X, 933W and P22) or high GC- content Gram-positive bacteria (mycobacteriophage L5) In contrast, similarly organized lysogeny modules were found in the temperate P2-!ike Myoviridae (29)

Such a relationship is not in accordance with the current phage taxonomy Other phage genome comparisons have revealed similar inconsistencies The Podovirus

P22 and the Siphovirus X have a comparable genome map and still can form viable hybrids in the laboratory (2,17), but were classified into different phage families on the basis of morphological criteria The dilemma of current phage taxonomy is thus too evident In fact, a non-sequence-based taxonomy may miss many evolutionary links between phages

At the moment, two genera of temperate Siphovindae have been established

(27) the A.-like and L5-like genera Interestingly, all the currently sequenced temperate phages from the low-GC Gram-positive bacteria showed an identical overall genome organization (11,25) that differentiates them from these two established genera Therefore, we suggest a new genus the Sfi21-like phage genus

Because of their relatedness, we would also include lytic S thermophilus phages in this genus, even if the difference in life style, according to the currently proposed taxonomy, would separate them in a different genus

Phage resistance. Sequencing of several S thermophilus phage genomes combined with large surveys of phage genomes by DNA/DNA hybridization provided insight into the genomic diversity of this phage group There are regions of extreme diversity interspersed with highly conserved genome segments (4) The later regions are important in view to obtain strains that are resistant to a broad range of phages

The phage resistance encoded by the Sfi21 origin of replication (16) is a good example of an anti-phage mechanism providing protection against a large number of phages In fact, the Sfi21-like replication module is widespread and highly conserved among phages from our collection (3,4) It has been shown that phages, e g Sfi19, having a Sfi21-like replication module could still escape the inhibition Point mutations probably altered the specificity of some replication protein for the on

Substitution of the Sfi21 replication origin with that of Sfi19 was shown to extend the 121

range of inhibitory activity of the origin (16) However, this strategy has one important

limitation to be efficient the replication origin has to be cloned in a high copy number

plasmid At the moment, food-grade vectors are not yet available forS thermophilus

Attempts to obtain a food-grade vector based on selection exerted by key metabolic functions have encountered difficulties (15) Therefore, we decided to test two

methods, which do not rely in the maintenance of a plasmid in the phage-resistant strains

First, we targeted bacterial genes for inactivation by hypothesizing the presence of bacterial genes essential to the phage replication cycle Should these genes be dispensable for bacterial growth in milk, their inactivation would lead to stable phage resistant cells We used a temperature-sensitive integrative plasmid

(pG+host9ISS7)(26) to randomly inactivate bacterial genes After plasmid insertion we challenged the culture with phages to select for phage-resistant mutants Using this approach we obtained three distinct Sfil phage-resistant insertion mutants The growth of the starter cells was unaffected by the mutation The phage protection level was very high (e o p < 106) and each insertion mutant was protected against a wide range of phages, suggesting that the phages tested have similar cellular requirements In two cases, phage development was likely blocked at the DNA- injection level The third mutant showed a delayed and lowered phage-DNA replication Excision of the integrative plasmid was then induced to obtain food-grade strains in which only one copy of the ISS? element was left One of the three mutants retained its phage resistance phenotype after the excision

The second approach relied on the protection provided by the immunity functions of the Sfi21 prophage The drawback of lysogénie starters is that they continuously release infectious phages into the factory environment Because of this problem, the use of lysogens is avoided in the dairy industry We decided to create a targeted deletion in an essential tail gene of the prophage to obtain a lysogen that would retain superinfection immunity, but has lost its capacity to produce infectious virions

We obtained the desired deletion derivative of the lysogen, which proved to be unable to release infectious phage particles (< 100 pfu/ml) but maintained intact superinfection control

We showed that both methods provide powerful phage resistant and food-grade starters for industrial milk fermentation Both approaches can be applied to a number of industrial starter strains The field is now ready for the development of phage resistant S thermophilus starters at a pilot plant level Should they meet the technological requirements a large-scale application of these starters could be envisioned in the dairy industry 124

Reference List

1 Botstein, D. 1980 A theory of modular evolution for bacteriophages Ann N Y Acad Sei 354 484-490 2 Botstein, D. and I. Herskowitz 1974 Properties of hybrids between Salmonella phage P22 and coliphage lambda Nature 251 584-589 3 Brüssow, H. and A. Bruttin 1995 Characterization of a temperate Streptococcus thermophilus bacteriophage and its genetic relationship with lytic phages Virology 212 632-640 4 Brüssow, H., A. Probst, M. Fremont, and J. Sidoti 1994 Distinct Streptococcus thermophilus bacteriophages share an extremely conserved DNA fragment Virology 200 854-857 5 Braun, V.Jr., S. Hertwig, H. Neve, A. Geis, and M. Teuber 1989 Taxonomic differentiation of bacteriophages of Lactococcus lactis by electro microscopy, DNA-DNA hybridization, and protein profiles J Gen Microbiol 135 2551-2560 6 Bruttin, A. and H. Brussow 1996 Site-specific spontaneous deletions in three genome regions of a temperate Streptococcus thermophilus phage Virology 219 96-104 7 Casjens, S.R., G.F. Hatfull, and R.W. Hendrix 1992 Evolution of dsDNA tailed- bactenophage genomes Semin Virol 3 383-397 8 Casjens, S.R. and R.W. Hendrix 1974 Comments on the arrangement of the morphogenetic genes of bacteriophage lambda J Mol Biol 90 20-23 9 Casjens, S.R. and R.W. Hendrix 1988 p 15-91 In R Calendar (ed ), The Bacteriophages Plenum Press New York 10 Chandry, P.S., S.C. Moore, J.D. Boyce, B.E. Davidson, and A.J. Hillier 1997 Analysis of the DNA sequence gene expression, origin of replication and modular structure of the Lactococcus lactis lytic bacteriophage ski Mol Microbiol 26 49-64 11 Desiere, F 1999 Diss ETHZ Nr 13332 12 Desiere, F., S. Lucchini, A. Bruttin, M.-C. Zwahlen, and H. Brüssow 1997 A highly conserved DNA replication module from Streptococcus thermophilus phages is similar in sequence and topology to a module from Lactococcus lactis phages Virology 234 372-382 13 Dinsmore, P.K. and T.R. Klaenhammer 1997 Molecular characterization of a genomic region in a Lactococcus bacteriophage that is involved in its sensitivity to the phage defense mechanism AbiA J Bactenol 179 2949-2957 14 Doolittle, W.F. 1999 Phylogenetic classification and the universal tree Science 284 2124-2128 15 Foley, S. (UnPub) 16 Foley, S., S. Lucchini, M.-C. Zwahlen, and H. Brussow 1998 A short noncoding viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus Virology 250 377- 387 17 Gemski, P., L.S. Baron, and N. Yamamoto 1972 Formation of hybrids between coliphage lambda and salmonella phage P22 with a Salmonella typhimunum hybrid sensitive to these phages Proc Natl Acad Sei U S A 69 3110-3114 18 HaggÄrd, L.E., C. Hailing, and R. Calendar 1992 DNA sequences of the tail fiber genes of bacteriophage P2 evidence for horizontal transfer of tail fiber genes among unrelated bacteriophages J Bactenol 174 1462-1477 19 Hendrix, R.W., M.C. Smith, R.N. Burns, M.E. Ford, and G.F. Hatfull 1999 Evolutionary relationships among diverse bacteriophages and prophages all the world's a phage Proc Natl Acad Sei U S A 96 2192-2197 20 Highton, P.J., Y. Chang, and R.J. Myers 1990 Evidence for the exchange of segments between genomes during the evolution of lambdoid bacteriophages Mol Microbiol 4 1329-1340 12*5

21 Jarvis, A.W. 1989 Bacteriophages of lactic acid bacteria J Dairy Sei 72 3406-3428 22. Lawrence, J.G. and H. Ochman 1998 Molecular archaeology of the Escherichia coli genome Proc Natl Acad Sei U S A 95 9413-9417 23 Le Marrec, C, D. van Sinderen, L. Walsh, E. Stanley, E. Vlegels, S. Moineau, P. Heinze, G. Fitzgerald, and B. Fayard 1997 Two groups of bacteriophages infecting Streptococcus thermophilus can be distinguished on the basis of mode of packaging and genetic determinants for major structural proteins Appl Environ Microbiol 63 3246-3253 24 Lucchini, S. (UnPub) 25 Lucchini, S., F. Desiere, and H. Brüssow 1999 Similarly organized lysogeny modules in temperate Siphovindae from low GC content Gram-positive bacteria Virology 263 427-435 26 Maguin, E., H. Prévost, S.D. Ehrlich, and A. Gruss 1996 Efficient insertional mutagenesis in lactococci and other gram-positive bacteria J Bactenol 178 931-935 27 Maniloff, J. and H.W. Ackermann 1998 Taxonomy of bacterial viruses establishment of tailed virus genera and the order Caudovirales [news] Arch Virol 143 2051- 2063 28 Mikkonen, M., L. Dupont, T. Alatossava, and P. Ritzenthaler 1996 Defective site- specific integration elements are present in the genome of virulent bacteriophage LL-H of Lactobacillus delbrueckii Appl Environ Microbiol 62 1847-1851 29 Nesper, J., J. Blass, M. Fountoulakis, and J. Reidl 1999 Characterization of the major control region of Vibrio cholerae bacteriophage K139 immunity, exclusion, and integration J Bactenol 1812902-2913 30 Neve, H., K.I. Zenz, F. Desiere, A. Koch, K.J. Heller, and H. Brüssow 1998 Comparison of the lysogeny modules from the temperate Streptococcus thermophilus bacteriophages TP-J34 and Sfi21 implications for the modular theory of phage evolution Virology 241 61-72 31 Relano, P., M. Mata, M. Bonneau, and P. Ritzenthaler 1987 Molecular characterization and compandon of 38 virulent and temperate bacteriophages of Streptococcus lactis J Gen Microbiol 133 3053-3063 32 Sechaud, L., P.J. Cluzel, M. Rousseau, A. Baumgartner, and J.P. Accolas 1988 Bacteriophages of lactobacilli Biochimie 70 401-410 33 Sheehan, M.M., E. Stanley, G.F. Fitzgerald, and D. van Sinderen 1999 Identification and characterization of a lysis module present in a large proportion of bacteriophages infecting Streptococcus thermophilus Appl Environ Microbiol 65 569-577 34 Skowronek, K. and S. Baranowski 1997 The relationship between HP1 and S2 bacteriophages of Haemophilus influenzae Gene 196 139-144 35 Smith, M.C., R.N. Burns, S.E. Wilson, and M.A. Gregory 1999 The complete genome sequence of the Streptomyces temperate phage straight phiC31 evolutionary relationships to other viruses Nucleic Acids Res 27 2145-2155 36 Stanley, E., G.F. Fitzgerald, C. Le Marrec, B. Fayard, and D. van Sinderen 1997 Sequence analysis and characterization of phi O1205, a temperate bacteriophage infecting Streptococcus thermophilus CNRZ1205 Microbiology 143 3417-3429 37 Trautwetter, A., P. Ritzenthaler, T. Alatossava, and G.M. Mata 1986 Physical and genetic characterization of the genome of Lactobacillus lactis bacteriophage LL-H J Virol 59 551-555

38 Tremblay, D.M. and S. Moineau 1999 Complete genomic sequence of the lytic bactenophage DT1 of Streptococcus thermophilus Virology 255 63-76 39 van Sinderen, D Personal Communication Curriculum vitae

Born March 13, 1970 in Neuilly sur Seine, France.

1976-1981 Primary education in Melide, Tl.

1981-1985 Secondary education in Barbengo, Tl.

1985-1990 Lyceum in Lugano, Tl.

1990-1996 Study of Biochemistry (Abteilung XAe) at the Swiss Federal Institute of Technology in Zürich, ZH.

1996-1999 Scientific collaborator at the group of food microbiology at the Nestlé Research Centre in Vers-chez-les-Blanc, VD. Ph. D. Thesis. Publication list

1 Bruttin, A., F. Desiere, N. dAmico, J.P. Guérin, J. Sidoti, B. Huni, S. Lucchini, and H. Briissow 1997 Molecular ecology of Streptococcus thermophilus bacteriophage infections in a cheese factory Appl Environ Microbiol 63 3144-3150

2 Bruttin, A., F. Desiere, S. Lucchini, S. Foley, and H. Brüssow 1997 Characterization of the lysogeny DNA module from the temperate Streptococcus thermophilus bacteriophage phi Sfi21 Virology 233 1 SO¬ US

3 Desiere, F., S. Lucchini, A. Bruttin, M.C. Zwahlen, and H. Brüssow 1997 A highly conserved DNA replication module from Streptococcus thermophilus phages is similar in sequence and topology to a module from Lactococcus lactis phages Virology 234 372-382

4 Brüssow, H., A. Bruttin, F. Desiere, S. Lucchini, and S. Foley 1998 Molecular ecology and evolution of Streptococcus thermophilus bactenophages-a review Virus Genes 16 95-109

9 Desiere, F., S. Lucchini, and H. Brüssow 1998 Evolution of Streptococcus thermophilus bacteriophage genomes by modular exchanges followed by point mutations and small deletions and insertions Virology 241 345-356

6 Lucchini, S., F. Desiere, and H. Brüssow 1998 The structural gene module in Streptococcus thermophilus bacteriophage phi Sfil 1 shows a hierarchy of relatedness to Siphoviridae from a wide range of bacterial hosts Virology 246 63-73

7 Foley, S., S. Lucchini, M.-C. Zwahlen, and H. Brüssow 1998 A short noncoding viral DNA element showing characteristics of a replication origin confers bacteriophage resistance to Streptococcus thermophilus Virology 250 377-387

8 Lucchini, S., F. Desiere, and H. Brüssow 1999 The genetic relationship between virulent and temperate Streptococcus thermophilus bacteriophages whole genome comparison of cos-site phages Sfi 19 and Sfi21 Virology 260 232-243

9 Desiere, F., S. Lucchini, and H. Brüssow 1999 Comparative sequence analysis of the DNA packaging head and tail morphogenesis modules in the temperate cos-site Streptococcus thermophilus bacteriophage Sfi21 Virology 260 244-253

10 Lucchini, S., F. Desiere, and H. Brüssow 1999 Comparative genomics of Streptococcus thermophilus phage species supports a modular evolution theory J Virol 73 8647-8656

11. Lucchini, S., F. Desiere, and H. Brüssow 1999 Similarly organized lysogeny modules in temperate Siphovindae from low GC content Gram-positive bacteria Virology 263 427-435 12. Lucchini, S. and H. Brüssow. 1999. Phage Resistant Bacteria. Patent Application B.N. 9920431.5.

13. van Beilen, J. B., S. Lucchini, M. Röthlisberger and B. Witholt. 1999. Organization of the Pseudomonas putida (oleovorans) GPo1 and P. putida P1 alkane degradation genes. J. Bact. (submitted)